An introduction to audio-visual speech recognition
Posted on April 30, 2007 - Filed Under audio-visual, research, speech
This is from an introduction to my latest paper, and I thought it might be useful to put up here. Feel free to leave any comments on this below.
Audio-visual Speech Recognition
Automatic speech recognition is a mature area of research, and one that is increasingly a part of our day-to-day lives. While many systems that [...]
Audio-visual speech and the McGurk effect
Posted on April 23, 2007 - Filed Under audio-visual, research, speech
It may not be immediately obvious to most, but speech is fundamentally a multimodal interaction. (Multimodal is the fancy-pants way of saying that the interaction occurs through more than one mode or channel of communication – audio, visual, gestural, etc.)
While we can communicate very well with audio alone, such as during a telephone call, our [...]
VidTIMIT dataset freely available online
Posted on September 20, 2006 - Filed Under audio-visual, research, speech | 2 Comments
Conrad Sanderson has released the VidTIMIT audio-visual speech dataset so that it is freely available online.
The dataset comprises video and corresponding audio recordings of 43 people reciting short sentences. It can be useful for research on topics such as automatic lip reading, multi-view face recognition, multi-modal speech recognition and person identification.
Link.