Interspeech and AVSP 2007
Posted on October 12, 2007 - Filed Under biometrics, conference, research, speech | Leave a Comment
I recently attended two speech-related conferences in Europe. It seems I like my international conferences in twos. The first was Interspeech 2007 in Antwerp, Belgium, and the second was the International Conference on Auditory-Visual Speech Processing (AVSP) 2007 near Hilvarenbeek in the Netherlands. Both were good experiences and will be [...]
An introduction to audio-visual speech recognition
Posted on April 30, 2007 - Filed Under audio-visual, research, speech | Leave a Comment
This is from an introduction to my latest paper, and I thought it might be useful to put up here. Feel free to leave any comments on this below.
Audio-visual Speech Recognition
Automatic speech recognition is a very mature area of research, and one that is increasingly becoming involved in our day-to-day lives. While many systems that [...]
Audio-visual speech and the McGurk effect
Posted on April 23, 2007 - Filed Under audio-visual, research, speech | Leave a Comment
It may not be immediately obvious to most, but speech is fundamentally a multimodal interaction. (Multimodal is the fancy-pants way of saying that the interaction occurs through more than one mode or channel of communication: audio, visual, gestural, etc.)
While we can communicate very well with audio alone, such as during a telephone call, our [...]
Code: Create Prototype HTK HMM
Posted on November 16, 2006 - Filed Under code, howto, htk, speech | 3 Comments
One of the things that can be annoying about using HTK is creating the prototype HMM files. I think the first thing any researcher probably does is write a quick little program to generate a prototype from a number of parameters. So here’s my version to save you — the person who found this page [...]
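As a rough illustration of the kind of generator described above, here is a minimal Python sketch. It is not the author's program: the function name `make_proto`, its parameters, and the 0.6/0.4 starting transition values are assumptions chosen for illustration. It emits a single-Gaussian, diagonal-covariance, left-to-right prototype in HTK's text HMM definition format, with zero means and unit variances as placeholders for later re-estimation.

```python
def make_proto(name="proto", vec_size=39, parm_kind="MFCC_0_D_A", num_states=5):
    """Return the text of an HTK prototype HMM definition.

    num_states counts the non-emitting entry and exit states, so a
    value of 5 gives three emitting states (numbered 2 to 4).
    All parameter names and defaults here are illustrative assumptions.
    """
    if num_states < 3:
        raise ValueError("need at least one emitting state (num_states >= 3)")
    if vec_size < 1:
        raise ValueError("vec_size must be positive")

    zeros = " ".join(["0.0"] * vec_size)  # placeholder means
    ones = " ".join(["1.0"] * vec_size)   # placeholder variances

    lines = [
        f"~o <VecSize> {vec_size} <{parm_kind}>",
        f'~h "{name}"',
        "<BeginHMM>",
        f"<NumStates> {num_states}",
    ]

    # Emitting states are numbered 2 .. num_states - 1.
    for state in range(2, num_states):
        lines += [
            f"<State> {state}",
            f"<Mean> {vec_size}",
            zeros,
            f"<Variance> {vec_size}",
            ones,
        ]

    # Left-to-right transition matrix with no skips.
    lines.append(f"<TransP> {num_states}")
    for row in range(num_states):
        probs = [0.0] * num_states
        if row == 0:
            probs[1] = 1.0            # entry state always moves to state 2
        elif row < num_states - 1:
            probs[row] = 0.6          # self-loop (arbitrary starting value)
            probs[row + 1] = 0.4      # advance to the next state
        # The exit state's row stays all zeros.
        lines.append(" ".join(f"{p:.1f}" for p in probs))

    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_proto(), end="")
```

Redirecting the output to a file (for example `python make_proto.py > proto`) gives something that could be passed to a tool such as HCompV for flat-start initialisation, assuming the vector size and parameter kind match your feature files.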
Audio-visual speaker verification using continuous fused HMMs
Posted on October 24, 2006 - Filed Under fhmm, publications, research, speech | Leave a Comment
Dean, David and Sridharan, Sridha and Wark, Tim (2006) Audio-visual speaker verification using continuous fused HMMs. In Proceedings HCSNet Workshop on the Use of Vision in HCI, Canberra, Australia.
This paper examines audio-visual speaker verification using a novel adaptation of fused hidden Markov models, in comparison to output fusion of individual classifiers in the audio [...]
An examination of audio-visual fused HMMs for speaker recognition
Posted on October 24, 2006 - Filed Under fhmm, publications, research, speech | Leave a Comment
Dean, David and Wark, Tim and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In Proceedings Second Workshop on Multimodal User Authentication, Toulouse, France.
Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the [...]
Comparing Audio and Visual Information for Speech Processing
Posted on October 24, 2006 - Filed Under publications, research, speech | Leave a Comment
Dean, David and Lucey, Patrick and Sridharan, Sridha and Wark, Tim (2005) Comparing Audio and Visual Information for Speech Processing. In Proceedings The Eighth International Symposium on Signal Processing and Its Applications, pp. 58-61, Sydney, Australia.
This paper examines the utility of audio-visual speech for the two related tasks of speech and speaker recognition. A [...]
Audio-visual speaker identification using the CUAVE database
Posted on October 24, 2006 - Filed Under publications, research, speech | Leave a Comment
Dean, David and Lucey, Patrick and Sridharan, Sridha (2005) Audio-visual speaker identification using the CUAVE database. In Vatikiotis-Bateson, Eric and Burnham, Denis and Fels, Sidney, Eds. Proceedings Auditory-Visual Speech Processing 2005, British Columbia, Canada.
The freely available nature of the CUAVE database allows it to provide a valuable platform to form benchmarks and compare research. This [...]
Silent Speech Recognition
Posted on September 20, 2006 - Filed Under research, speech, subvocal | Leave a Comment
Talk to the hand:
Speech recognition is accomplished by electrodes on the face that detect electromyography in the underlying muscles. Currently, the five vowel sounds in Japanese language (a, i, u, e, o) can already be recognized by this technique, and consonants are in the works.
I found this from Richard Sprague at Microsoft who points out [...]
VidTIMIT dataset freely available online
Posted on September 20, 2006 - Filed Under audio-visual, research, speech | 2 Comments
Conrad Sanderson has released the VidTIMIT audio-visual speech dataset so that it is freely available online.
The dataset comprises video and corresponding audio recordings of 43 people reciting short sentences. It can be useful for research on topics such as automatic lip reading, multi-view face recognition, multi-modal speech recognition and person identification.
Link.