Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights

Ghadban, N., Hameed, H., Abbasi, Q. H., Imran, M., Cooper, J. and Le Kernec, J. (2025) Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights. [Data Collection]

Not all data is available to download from this page. Usually this is because the dataset is too large or it is restricted in some way.

Collection description

The study uses a multimodal dataset comprising synchronized audio, video, and radar recordings collected to evaluate the effectiveness of audio–visual–radar (AVR) fusion in speech-recognition tasks. The dataset contains 800 labeled samples, produced by four fluent but non-native English speakers aged 28–40, each repeating ten phonetically challenging English words twenty times under controlled recording conditions.
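The sample count follows from the recording design: 4 speakers × 10 words × 20 repetitions = 800. As a minimal sketch, the expected samples could be enumerated as below. The counts come from the description above; the integer identifiers and the function name are hypothetical and do not reflect the dataset's actual file naming or labels.

```python
from itertools import product

# Counts taken from the collection description.
N_SPEAKERS = 4
N_WORDS = 10
N_REPETITIONS = 20


def build_index():
    """Return one (speaker, word, repetition) tuple per expected sample.

    The integer identifiers are placeholders, not the dataset's real labels.
    """
    return list(product(range(N_SPEAKERS), range(N_WORDS), range(N_REPETITIONS)))


if __name__ == "__main__":
    index = build_index()
    print(len(index))  # 800
```

An index like this can be compared against the files actually received to spot missing or duplicated recordings.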

Audio was recorded at 44.1 kHz/16-bit, video at 1080p/30 fps, and radar data were acquired using the XeThru X4M03 IR-UWB sensor positioned approximately 2 meters from the speaker to capture speech-related micro-Doppler articulatory signatures. The three modalities were automatically segmented and aligned using pretrained machine-learning models to ensure consistent synchronization across streams.
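The stated rates give an exact ratio between the audio and video streams: 44,100 samples per second ÷ 30 frames per second = 1,470 audio samples per video frame. The sketch below maps a video frame index to its nominal audio sample range. It assumes both streams start at a common time zero and run at exactly their nominal rates; the dataset itself was aligned with pretrained machine-learning models, so this is only an illustration of the arithmetic, and the function name is hypothetical. The radar frame rate is not stated in the description, so radar is left out.

```python
# Nominal rates from the collection description.
AUDIO_RATE_HZ = 44_100
VIDEO_FPS = 30

# 44100 / 30 divides exactly, so integer arithmetic is safe here.
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // VIDEO_FPS


def audio_span_for_frame(frame_index: int) -> tuple[int, int]:
    """Return the half-open [start, end) audio sample range for a video frame.

    Assumes audio and video share a common start time (an assumption,
    not something the description guarantees).
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    start = frame_index * SAMPLES_PER_FRAME
    return start, start + SAMPLES_PER_FRAME


if __name__ == "__main__":
    print(SAMPLES_PER_FRAME)          # 1470
    print(audio_span_for_frame(30))   # (44100, 45570)
```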

The purpose of this dataset is to serve as a resource for studying how radar sensing can enhance speech-recognition robustness under noise, occlusion, and other adverse conditions.

All audio–visual recordings contain identifiable human data and will therefore be shared only under controlled access to protect participant privacy.

College / School: College of Science and Engineering > School of Engineering
College of Science and Engineering > School of Engineering > Biomedical Engineering
College of Science and Engineering > School of Engineering > Electronics and Nanoscale Engineering
College of Science and Engineering > School of Engineering > Systems Power and Energy
Date Deposited: 29 Jun 2026 08:28
URI: https://researchdata.gla.ac.uk/id/eprint/2109

Available Files

Read me


Ghadban, N., Hameed, H., Abbasi, Q. H., Imran, M., Cooper, J. and Le Kernec, J. (2025) Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights

University of Glasgow

DOI: 10.5525/gla.researchdata.2109

Retrieved: 2026-07-02
