<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>the blog of david dean &#187; publications</title>
	<atom:link href="http://www.davidbdean.com/category/publications/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.davidbdean.com</link>
	<description>currently not blogging much at all</description>
	<lastBuildDate>Sat, 21 Jun 2008 15:30:40 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Audio-visual speaker verification using continuous fused HMMs</title>
		<link>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-verification-using-continuous-fused-hmms/</link>
		<comments>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-verification-using-continuous-fused-hmms/#comments</comments>
		<pubDate>Tue, 24 Oct 2006 03:06:57 +0000</pubDate>
		<dc:creator>David Dean</dc:creator>
				<category><![CDATA[fhmm]]></category>
		<category><![CDATA[publications]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[speech]]></category>

		<guid isPermaLink="false">http://www.davidbdean.com/2006/10/24/audio-visual-speaker-verification-using-continuous-fused-hmms/</guid>
		<description><![CDATA[Dean, David and Sridharan, Sridha and Wark, Tim (2006) Audio-visual speaker verification using continuous fused HMMs. In Proceedings HCSNet Workshop on the Use of Vision in HCI, Canberra, Australia.
This paper examines audio-visual speaker verification using a novel adaptation of fused hidden Markov models, in comparison to output fusion of individual classifiers in the audio [...]]]></description>
			<content:encoded><![CDATA[<p>Dean, David and Sridharan, Sridha and Wark, Tim (2006) Audio-visual speaker verification using continuous fused HMMs. In <em>Proceedings HCSNet Workshop on the Use of Vision in HCI</em>, Canberra, Australia.</p>
<blockquote><p>This paper examines audio-visual speaker verification using a novel adaptation of fused hidden Markov models, in comparison to output fusion of individual classifiers in the audio and video modalities. A comparison of both hidden Markov model (HMM) and Gaussian mixture model (GMM) classifiers in both modalities under output fusion shows that the choice of audio classifier is more important than video. Although temporal information allows an HMM to out-perform a GMM individually in video, this temporal information does not carry through to output fusion with an audio classifier, where the difference between the two video classifiers is minor. An adaptation of fused hidden Markov models, designed to be more robust to within-speaker variation, is used to show that the temporal relationship between video observations and audio states can be harnessed to reduce errors in audio-visual speaker verification when compared to output fusion.</p></blockquote>
<p>[ <a href="http://eprints.qut.edu.au/archive/00005271/">link</a> | <a href="http://eprints.qut.edu.au/archive/00005271/01/5271.pdf">paper (pdf)</a> | <a href="http://eprints.qut.edu.au/archive/00005390/02/Audio-Visual_Speaker_Verification_using_Continuous_Fused_HMMs.ppt">slides (ppt)</a> ]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-verification-using-continuous-fused-hmms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>An examination of audio-visual fused HMMs for speaker recognition</title>
		<link>http://www.davidbdean.com/2006/10/24/an-examination-of-audio-visual-fused-hmms-for-speaker-recognition/</link>
		<comments>http://www.davidbdean.com/2006/10/24/an-examination-of-audio-visual-fused-hmms-for-speaker-recognition/#comments</comments>
		<pubDate>Tue, 24 Oct 2006 03:02:49 +0000</pubDate>
		<dc:creator>David Dean</dc:creator>
				<category><![CDATA[fhmm]]></category>
		<category><![CDATA[publications]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[speech]]></category>

		<guid isPermaLink="false">http://www.davidbdean.com/2006/10/24/an-examination-of-audio-visual-fused-hmms-for-speaker-recognition/</guid>
		<description><![CDATA[Dean, David and Wark, Tim and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In Proceedings Second Workshop on Multimodal User Authentication, Toulouse, France.
Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the [...]]]></description>
			<content:encoded><![CDATA[<p>Dean, David and Wark, Tim and Sridharan, Sridha (2006) An examination of audio-visual fused HMMs for speaker recognition. In <em>Proceedings Second Workshop on Multimodal User Authentication</em>, Toulouse, France.</p>
<blockquote><p>Fused hidden Markov models (FHMMs) have been shown to work well for the task of audio-visual speaker recognition, but only in an output decision-fusion configuration of both the audio- and video-biased versions of the FHMM structure. This paper looks at the performance of the audio- and video-biased versions independently, and shows that the audio-biased version is considerably more capable for speaker recognition. Additionally, this paper shows that by taking advantage of the temporal relationship between the acoustic and visual data, the audio-biased FHMM provides better performance at less processing cost than the best-performing output decision-fusion of regular HMMs.</p></blockquote>
<p>[ <a href="http://eprints.qut.edu.au/archive/00005343/">link</a> | <a href="http://eprints.qut.edu.au/archive/00005343/01/5269.pdf">paper (pdf)</a> | <a href="http://eprints.qut.edu.au/archive/00005343/02/An_Examination_of_Audio-Visual_Fused_HMMs_for_Speaker_Recognition.ppt">slides (ppt)</a> ]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidbdean.com/2006/10/24/an-examination-of-audio-visual-fused-hmms-for-speaker-recognition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comparing Audio and Visual Information for Speech Processing</title>
		<link>http://www.davidbdean.com/2006/10/24/comparing-audio-and-visual-information-for-speech-processing/</link>
		<comments>http://www.davidbdean.com/2006/10/24/comparing-audio-and-visual-information-for-speech-processing/#comments</comments>
		<pubDate>Tue, 24 Oct 2006 02:59:54 +0000</pubDate>
		<dc:creator>David Dean</dc:creator>
				<category><![CDATA[publications]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[speech]]></category>

		<guid isPermaLink="false">http://www.davidbdean.com/2006/10/24/comparing-audio-and-visual-information-for-speech-processing/</guid>
		<description><![CDATA[Dean, David and Lucey, Patrick and Sridharan, Sridha and Wark, Tim (2005) Comparing Audio and Visual Information for Speech Processing. In Proceedings The Eighth International Symposium on Signal Processing and Its Applications, pages 58-61, Sydney, Australia.
This paper examines the utility of audio-visual speech for the two related tasks of speech and speaker recognition. A [...]]]></description>
			<content:encoded><![CDATA[<p>Dean, David and Lucey, Patrick and Sridharan, Sridha and Wark, Tim (2005) Comparing Audio and Visual Information for Speech Processing. In <em>Proceedings The Eighth International Symposium on Signal Processing and Its Applications</em>, pages 58-61, Sydney, Australia.</p>
<blockquote><p>This paper examines the utility of audio-visual speech for the two related tasks of speech and speaker recognition. A study of the confusion that exists between speaker and speech elements was performed to show that principal component analysis (PCA) based visual speech is considerably better for the task of speaker recognition than for speech. Decision fusion speech and speaker recognition engines were also tested under various levels of acoustic degradation to find that the optimal fusion configuration for speaker recognition was substantially different than that for speech. These results highlight the problem of employing similar visual features for both speech and speaker recognition.</p></blockquote>
<p>[ <a href="http://eprints.qut.edu.au/archive/00005342/">link</a> | <a href="http://eprints.qut.edu.au/archive/00005342/01/4693.pdf">paper (pdf)</a> | <a href="http://eprints.qut.edu.au/archive/00005342/02/Comparing_Audio_and_Visual_Information_for_Speech_Processing.ppt">slides (ppt)</a> ]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidbdean.com/2006/10/24/comparing-audio-and-visual-information-for-speech-processing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Audio-visual speaker identification using the CUAVE database</title>
		<link>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-identification-using-the-cuave-database/</link>
		<comments>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-identification-using-the-cuave-database/#comments</comments>
		<pubDate>Tue, 24 Oct 2006 02:56:28 +0000</pubDate>
		<dc:creator>David Dean</dc:creator>
				<category><![CDATA[publications]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[speech]]></category>

		<guid isPermaLink="false">http://www.davidbdean.com/2006/10/24/audio-visual-speaker-identification-using-the-cuave-database/</guid>
		<description><![CDATA[Dean, David and Lucey, Patrick and Sridharan, Sridha (2005) Audio-visual speaker identification using the CUAVE database. In Vatikiotis-Bateson, Eric and Burnham, Denis and Fels, Sidney, Eds. Proceedings Auditory-Visual Speech Processing 2005, British Columbia, Canada.
The freely available nature of the CUAVE database allows it to provide a valuable platform to form benchmarks and compare research. This [...]]]></description>
			<content:encoded><![CDATA[<p>Dean, David and Lucey, Patrick and Sridharan, Sridha (2005) Audio-visual speaker identification using the CUAVE database. In Vatikiotis-Bateson, Eric and Burnham, Denis and Fels, Sidney, Eds. <em>Proceedings Auditory-Visual Speech Processing 2005</em>, British Columbia, Canada.</p>
<blockquote><p>The freely available nature of the CUAVE database allows it to provide a valuable platform to form benchmarks and compare research. This paper shows that the CUAVE database can successfully be used to test speaker identification systems, with performance comparable to existing systems implemented on other databases. Additionally, this research shows that the optimal configuration for decision-fusion of an audio-visual speaker identification system relies heavily on the video modality in all but clean speech conditions.</p></blockquote>
<p>[ <a href="http://eprints.qut.edu.au/archive/00005341/">link</a> | <a href="http://eprints.qut.edu.au/archive/00005341/01/4860.pdf">paper (pdf)</a> | <a href="http://eprints.qut.edu.au/archive/00005341/02/AVSP_Poster.ppt">poster (ppt)</a> ]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidbdean.com/2006/10/24/audio-visual-speaker-identification-using-the-cuave-database/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
