Abbasi, Q. , Ge, Y., Chong, T., Haobo, L., Zikang, C., Wang, J., Wenda, L., Cooper, J. , Chetty, K., Faccio, D. and Imran, M. (2023) A comprehensive multimodal dataset for contactless lip reading and acoustic analysis. [Data Collection]
Collection description
Privacy-preserving, small-scale motion detection has attracted increasing research interest in remote sensing for speech recognition. These new modalities aim to enhance and restore speech information from speakers using multiple types of data. This dataset contains 7.5 GHz Channel Impulse Response (CIR) data from ultra-wideband (UWB) radar, 77 GHz frequency-modulated continuous-wave (FMCW) data from millimetre-wave (mmWave) radar, and laser data. In addition, a depth camera is used to record each subject's lip landmarks and voice. Approximately 6 hours of annotated speech profiles are provided, collected from 20 participants speaking 5 vowels, 15 words and 16 sentences. The dataset has been validated and holds potential for research on lip reading and multimodal speech recognition.
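The corpus composition stated above can be summarised with a little arithmetic. The sketch below is illustrative only: the variable names are not taken from the dataset's actual file layout, which is not described in this record.

```python
# Corpus composition as stated in the collection description.
# (Variable names are illustrative; the dataset's real file layout is unspecified.)
participants = 20
vowels, words, sentences = 5, 15, 16

# Distinct spoken items per participant.
utterance_classes = vowels + words + sentences

# Minimum number of participant-item pairs across the corpus,
# assuming each participant speaks every item at least once.
total_recordings = participants * utterance_classes

print(utterance_classes)  # 36
print(total_recordings)   # 720
```

This gives 36 distinct items per speaker and at least 720 participant-item pairs, each captured simultaneously across the UWB, mmWave, laser, and depth-camera modalities.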
| Field | Value |
|---|---|
| Funding: | |
| College / School: | College of Science and Engineering > School of Engineering |
| Date Deposited: | 09 Nov 2023 11:52 |
| Statement on legal, ethical and access issues: | This is open data and is available under a Creative Commons Licence (CC BY) |
| URI: | https://researchdata.gla.ac.uk/id/eprint/1408 |
Available Files
There are no files for this dataset available to download.