An examination of audio-visual fused HMMs for speaker recognition

Posted on October 24, 2006

Dean, David and Wark, Tim and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In Proceedings Second Workshop on Multimodal User Authentication, Toulouse, France.

Fused hidden Markov models (FHMMs) have been shown to work well for audio-visual speaker recognition, but only in an output decision-fusion configuration that combines the audio- and video-biased versions of the FHMM structure. This paper examines the performance of the audio- and video-biased versions independently and shows that the audio-biased version is considerably more capable for speaker recognition. Additionally, it shows that, by taking advantage of the temporal relationship between the acoustic and visual data, the audio-biased FHMM provides better performance at lower processing cost than the best-performing output decision fusion of regular HMMs.
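For readers unfamiliar with the term, output decision fusion combines the scores of separately trained audio and video classifiers after each has scored the utterance. The sketch below shows one common form, a weighted sum of per-speaker log-likelihoods. It is an illustration only: the abstract does not say which fusion rule the paper uses, and the function names, the `audio_weight` parameter, and the scores are all made up for this example.

```python
def fuse_scores(audio_scores, video_scores, audio_weight=0.5):
    """Weighted-sum fusion of per-speaker log-likelihoods (hypothetical helper).

    audio_scores / video_scores map a speaker label to the log-likelihood
    produced by that modality's classifier. audio_weight is an assumed
    tuning parameter in [0, 1]; the video weight is its complement.
    """
    if audio_scores.keys() != video_scores.keys():
        raise ValueError("both modalities must score the same speakers")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    video_weight = 1.0 - audio_weight
    return {
        speaker: audio_weight * audio_scores[speaker]
        + video_weight * video_scores[speaker]
        for speaker in audio_scores
    }


def identify(fused_scores):
    """Return the speaker label with the highest fused score."""
    return max(fused_scores, key=fused_scores.get)


# Made-up scores for two enrolled speakers.
audio = {"alice": -120.0, "bob": -118.0}
video = {"alice": -60.0, "bob": -75.0}

fused = fuse_scores(audio, video, audio_weight=0.5)
best = identify(fused)
```

With these made-up numbers the audio scores alone slightly favour "bob", while the fused scores favour "alice". In this scheme each modality needs its own model and its own decoding pass, which is the processing cost the abstract compares against.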

[ link | paper (pdf) | slides (ppt) ]

» Filed Under fhmm, publications, research, speech
