Enlighten Research Data

Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights

Ghadban, N., Hameed, H., Abbasi, Q. H., Imran, M., Cooper, J. and Le Kernec, J. (2025) Synergising Multi-Modal Data: Unveiling Audio-Visual and Radar Insights. [Data Collection]

Datacite DOI: 10.5525/gla.researchdata.2109

Related Enlighten Publications

Artificial Intelligence and Radar-Enhanced Audio-Visual Speech Recognition for the Next Generation of Communication Technologies
Advancing Speech Recognition for the Hearing Impaired: Key Insights from the Language and Technology Workshop
Unlocking the Future: AI and Radar-Enhanced Audio-Visual Speech Recognition
Noise Reduction in Audio Visual Radar Speech Recognition System
Radar-enhanced multimodal speech recognition under occlusion and noise: methodology and experimental analysis

Not all data is available to download from this page. Usually this is because the dataset is too large or it is restricted in some way.

Collection description

The study uses a multimodal dataset comprising synchronized audio, video, and radar recordings collected to evaluate the effectiveness of audio–visual–radar (AVR) fusion in speech-recognition tasks. The dataset contains 800 labeled samples, produced by four fluent but non-native English speakers aged 28–40, each repeating ten phonetically challenging English words twenty times under controlled recording conditions.
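The sample count follows from the recording design: 4 speakers × 10 words × 20 repetitions = 800. As a minimal sketch, the composition can be enumerated as a manifest. The speaker and word labels below are hypothetical placeholders, since the description does not list the actual identifiers or vocabulary, and the record layout is an assumption rather than the dataset's own file structure.

```python
from itertools import product

# Hypothetical labels: the collection description does not give speaker IDs
# or the ten words, so placeholders are used here.
SPEAKERS = [f"speaker{i:02d}" for i in range(1, 5)]   # 4 speakers
WORDS = [f"word{i:02d}" for i in range(1, 11)]        # 10 words
REPETITIONS = range(1, 21)                            # 20 repetitions per word


def build_manifest():
    """Return one record per (speaker, word, repetition) combination."""
    return [
        {"speaker": s, "word": w, "repetition": r}
        for s, w, r in product(SPEAKERS, WORDS, REPETITIONS)
    ]


manifest = build_manifest()
print(len(manifest))  # 800
```

A manifest of this kind makes it easy to check that every speaker contributes the same number of samples per word before splitting the data for training and evaluation.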

Audio was recorded at 44.1 kHz/16-bit, video at 1080p/30 fps, and radar data were acquired using the XeThru X4M03 IR-UWB sensor positioned approximately 2 meters from the speaker to capture speech-related micro-Doppler articulatory signatures. The three modalities were automatically segmented and aligned using pretrained machine-learning models to ensure consistent synchronization across streams.
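The stated rates give a simple relationship between the audio and video streams: 44,100 samples per second divided by 30 frames per second is exactly 1,470 audio samples per video frame. The sketch below illustrates that arithmetic only. It assumes both streams start at the same instant, which is an idealisation; the dataset itself was aligned with pretrained machine-learning models, and this is not that procedure. Radar is omitted because the description does not state the radar frame rate. The function name is hypothetical.

```python
AUDIO_RATE_HZ = 44_100   # from the collection description
VIDEO_FPS = 30           # from the collection description

# 44_100 / 30 divides evenly, so integer arithmetic is exact here.
SAMPLES_PER_FRAME = AUDIO_RATE_HZ // VIDEO_FPS


def audio_span_for_frame(frame_index):
    """Return the half-open [start, end) audio sample range for a video frame.

    Assumes audio and video share a common start time (an assumption,
    not something the description guarantees).
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    start = frame_index * SAMPLES_PER_FRAME
    return start, start + SAMPLES_PER_FRAME


print(SAMPLES_PER_FRAME)          # 1470
print(audio_span_for_frame(30))   # (44100, 45570)
```

Frame 30 begins exactly one second into the recording, so its audio span starts at sample 44,100.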

The purpose of this dataset is to serve as a resource for studying how radar sensing can enhance speech-recognition robustness under noise, occlusion, and other adverse conditions.

All audio–visual recordings contain identifiable human data and will therefore be shared only under controlled access to protect participant privacy.

Funding: