An introduction to audio-visual speech recognition

Posted on April 30, 2007 - Filed Under audio-visual, research, speech | Comments Off

This is taken from the introduction to my latest paper, and I thought it might be useful to put it up here. Feel free to leave comments below.
Audio-visual Speech Recognition
Automatic speech recognition is a very mature area of research, and one that is increasingly becoming part of our day-to-day lives. While many systems that [...]


Audio-visual speech and the McGurk effect

Posted on April 23, 2007 - Filed Under audio-visual, research, speech | Comments Off

It may not be immediately obvious to most, but speech is fundamentally a multimodal interaction. (Multimodal is the fancy-pants way of saying that the interaction occurs through more than one mode or channel of communication: audio, visual, gestural, etc.)
While we can communicate very well with audio alone, such as during a telephone call, our [...]


VidTIMIT dataset freely available online

Posted on September 20, 2006 - Filed Under audio-visual, research, speech | 2 Comments

Conrad Sanderson has released the VidTIMIT audio-visual speech dataset so that it is freely available online.
The dataset comprises video and corresponding audio recordings of 43 people reciting short sentences. It can be useful for research on topics such as automatic lip reading, multi-view face recognition, multi-modal speech recognition and person identification.
Link.

