Code for Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights

Ghadban, N., Hameed, H., Abbasi, Q. H., Imran, M., Cooper, J. and Le Kernec, J. (2025) Code for Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights. [Data Collection]

Collection description

The study uses a multimodal dataset comprising synchronized audio, video, and radar recordings collected to evaluate the effectiveness of audio–visual–radar (AVR) fusion in speech-recognition tasks. The dataset contains 800 labeled samples, produced by four fluent but non-native English speakers aged 28–40, each repeating ten phonetically challenging English words twenty times under controlled recording conditions.
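The sample count follows from the recording design: 4 speakers × 10 words × 20 repetitions = 800. As a minimal sketch of how a sample manifest with that shape could be enumerated, the Python below uses placeholder identifiers. The speaker IDs, word labels, and the `build_manifest` helper are assumptions for illustration only; this record does not list the actual words or any file-naming scheme.

```python
from itertools import product

# Hypothetical identifiers: the record does not publish speaker IDs or the word list.
SPEAKERS = [f"speaker_{i:02d}" for i in range(1, 5)]   # four speakers
WORDS = [f"word_{i:02d}" for i in range(1, 11)]        # ten words
REPETITIONS = range(1, 21)                             # twenty repetitions per word


def build_manifest():
    """Return one (speaker, word, repetition) tuple per labeled sample."""
    return list(product(SPEAKERS, WORDS, REPETITIONS))


manifest = build_manifest()
print(len(manifest))  # 800
```

A manifest of this kind makes it straightforward to check that every speaker/word/repetition combination is present once a copy of the data has been obtained.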

Audio was recorded at 44.1 kHz/16-bit, video at 1080p/30 fps, and radar data were acquired using the XeThru X4M03 IR-UWB sensor positioned approximately 2 meters from the speaker to capture speech-related micro-Doppler articulatory signatures. The three modalities were automatically segmented and aligned using pretrained machine-learning models to ensure consistent synchronization across streams.
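To illustrate what stream alignment involves at the stated rates, the sketch below maps a video frame index to the corresponding range of audio samples. It assumes an exact 30 fps video clock and that audio and video share a common start time; neither is confirmed by this record, and the authors' own alignment relies on pretrained models rather than this arithmetic. The function names are hypothetical, and radar is omitted because its frame rate is not stated here.

```python
AUDIO_RATE_HZ = 44_100  # from the record: 44.1 kHz audio
VIDEO_FPS = 30          # from the record: 30 fps video (assumed exact, not 29.97)


def samples_per_frame(audio_rate_hz=AUDIO_RATE_HZ, video_fps=VIDEO_FPS):
    """Number of audio samples spanned by one video frame."""
    samples, remainder = divmod(audio_rate_hz, video_fps)
    if remainder:
        raise ValueError("audio rate is not an integer multiple of the frame rate")
    return samples


def frame_to_sample_range(frame_index):
    """Return the half-open [start, stop) audio sample range for a video frame.

    Assumes both streams start at the same instant (an illustrative assumption).
    """
    n = samples_per_frame()
    start = frame_index * n
    return start, start + n


print(samples_per_frame())        # 1470
print(frame_to_sample_range(30))  # (44100, 45570)
```

At these rates each video frame spans 1470 audio samples, so frame 30 begins exactly one second into the audio stream.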

The purpose of this dataset is to serve as a resource for studying how radar sensing can enhance speech-recognition robustness under noise, occlusion, and other adverse conditions.

All audio–visual recordings contain identifiable human data and will therefore be shared only under controlled access to protect participant privacy.

College / School: College of Science and Engineering > School of Engineering
College of Science and Engineering > School of Engineering > Biomedical Engineering
College of Science and Engineering > School of Engineering > Electronics and Nanoscale Engineering
College of Science and Engineering > School of Engineering > Systems Power and Energy
Date Deposited: 25 Jun 2026 14:45
URI: https://researchdata.gla.ac.uk/id/eprint/2111

Available Files

Data

Visible to: Anyone
File size: 5 kB
License: CC BY 4.0

Read me

Visible to: Anyone
File size: 21 kB
License: CC BY 4.0


Ghadban, N., Hameed, H., Abbasi, Q. H., Imran, M., Cooper, J. and Le Kernec, J. (2025) Code for Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights.

University of Glasgow

DOI: 10.5525/gla.researchdata.2111

Retrieved: 2026-06-27
