Audio-visual speech and the McGurk effect
Posted on April 23, 2007
It may not be immediately obvious to most, but speech is fundamentally a multimodal interaction. (Multimodal is the fancy-pants way of saying that the interaction occurs through more than one mode or channel of communication - audio, visual, gestural, etc.)
While we can communicate very well with audio alone, such as during a telephone call, our brains make use of many visual cues when we talk face-to-face. Gestures and facial expressions are the obvious ones, but it may come as a surprise to learn that the actual motion of the lips also plays a very important part in the comprehension of human speech.
A useful demonstration of the impact of the visual modality on speech is the McGurk effect, first published by McGurk and MacDonald in 1976. Rather than explain it in too much detail right now, go watch the video below from an episode of the Hackszine video podcast.
The basic and original McGurk effect was demonstrated by dubbing a video of a person saying ‘gah’ with audio of them saying ‘bah’. If you watch the dubbed video, they appear to be saying ‘dah’, but the audio alone clearly says ‘bah’. This shows that even though you may not realise it, the visual lip movements are having an effect on your perception of speech. The Hackszine video extends the McGurk effect to cover bad dubbing in general, but I would only apply the term to cases where the bad dubbing makes the person appear to say something that is in neither the video nor the dubbed audio.
Finally, this (I think) Japanese talk show appears to be very interested in the McGurk effect. It makes for fairly amusing watching.
More information:
- McGurk Effect at Wikipedia
- Hearing with your eyes: The McGurk Effect
- McGurk, Harry, and MacDonald, John (1976), “Hearing lips and seeing voices,” Nature, Vol. 264(5588), pp. 746–748
» Filed Under audio-visual, research, speech