Cantoche avatars Henri and Jacky chosen to animate AI vs. AI chatbot dialogue
Living Actor™ Presenter avatars were chosen by Cornell University’s Creative Machines Lab to animate a chatbot dialogue that went viral on YouTube by August 31st. As of today, the YouTube video has had over 1.9 million views.
Henri and Jacky avatars
A chatbot is a computer program that uses artificial intelligence (AI) to simulate a human interlocutor. Creative Machines Lab students Igor Labutov and Jason Yosinski wanted to see what would happen if they connected two laptops to a chatbot and let it have a conversation with itself. They added text-to-speech capability and animated the AI vs. AI conversation with the Living Actor™ Presenter avatars Henri and Jacky. The viral video has been covered by news media in several countries (U.S., France, Germany, China, Poland, etc.) and by outlets including IEEE Spectrum, CNN, Gizmodo, National Public Radio, BoingBoing, The Sun, The Wall Street Journal, and more.
The YouTube video demonstrates Cornell’s submission for the 2011 Loebner Prize Competition in Artificial Intelligence, which takes place on October 19th. The students believe they are the first to animate an AI vs. AI conversation.
How did they do it? According to Cornell University’s Creative Machines Lab, they chose a publicly available chatbot with a large database of instances of human dialogue, used a text-to-speech synthesizer to create an audio file of the chatbot dialogue, and finally used the Living Actor™ Presenter AI speech-to-animation engine and avatars to bring the computer-generated chatbot dialogue to life by synchronizing full-body gestures, lips, and facial expressions to the audio stream.
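The first stage of that pipeline, letting a chatbot converse with itself, can be sketched in a few lines of Python. Everything below (the canned responses, the function names) is a hypothetical stand-in for illustration, not the actual chatbot or APIs the lab used:

```python
# Illustrative sketch of the dialogue-generation stage: two ends of a
# conversation driven by the same chatbot. The canned replies and
# function names are hypothetical stand-ins, not a real chatbot API.

def chatbot_reply(message):
    """Stand-in for a chatbot lookup: returns a canned response."""
    canned = {
        "Hello there.": "Hiya! How are you?",
        "Hiya! How are you?": "Great, you?",
    }
    return canned.get(message, "I don't understand.")

def run_dialogue(opening, turns):
    """Feed each reply back in as the next prompt, as the students did
    by wiring the chatbot's output to its own input."""
    transcript = [opening]
    message = opening
    for _ in range(turns):
        message = chatbot_reply(message)
        transcript.append(message)
    return transcript

transcript = run_dialogue("Hello there.", 2)
# transcript now holds the alternating turns of the self-conversation;
# in the real pipeline this text would go to the text-to-speech stage.
```

In the actual project, each turn of the resulting transcript was synthesized to audio and then animated.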
How does Cantoche do it? The expert synchronization of the avatar behaviors with the dialogue was accomplished with the Living Actor™ speech-to-animation engine, which automatically animates avatars from a human voice. The speech-to-animation engine is a proprietary audio-analysis technology, independent of the linguistic content, that analyzes the audio signal (recorded or streamed) and identifies sound volume, pace of speech, prosody, emotion, and other parameters. These features are compared with the characteristics associated with the avatar’s animation data, and the best-matching behaviors and actions are automatically generated, using several artificial intelligence techniques, to visualize the speech with full-body avatars. The entire process can run offline (as for videos) or in real time.
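As a rough illustration of the idea only (the proprietary engine’s details are not public, so the features, behavior names, and matching rule below are all assumptions), an audio window can be reduced to simple features such as RMS volume and pace, then matched against a library of animation behaviors:

```python
import math

# Minimal sketch of feature-based behavior selection. This is NOT
# Cantoche's engine: the features (RMS volume, a pace score), the
# behavior library, and the nearest-profile matching rule are all
# hypothetical, chosen only to illustrate the concept described above.

def rms_volume(samples):
    """Root-mean-square amplitude of an audio window."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def pick_behavior(volume, pace, behaviors):
    """Pick the behavior whose (volume, pace) profile is closest
    to the measured features (simple squared-distance match)."""
    return min(
        behaviors,
        key=lambda b: (b["volume"] - volume) ** 2 + (b["pace"] - pace) ** 2,
    )["name"]

# Hypothetical behavior library: each gesture is tagged with the
# audio profile it suits.
BEHAVIORS = [
    {"name": "calm_nod", "volume": 0.2, "pace": 1.0},
    {"name": "emphatic_gesture", "volume": 0.8, "pace": 3.0},
]

# A loud synthetic signal: 0.1 s of a 440 Hz sine at 8 kHz sampling.
loud_fast = [0.8 * math.sin(2 * math.pi * 440 * t / 8000) for t in range(800)]
volume = rms_volume(loud_fast)
print(pick_behavior(volume, 3.0, BEHAVIORS))  # prints "emphatic_gesture"
```

The real engine works on many more parameters (prosody, emotion) and drives continuous full-body animation rather than picking discrete gestures, but the feature-matching principle is the same one the paragraph above describes.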
Speech-to-animation is an innovative, protected technology
Try an experiment of your own: when you go to YouTube to view the viral video, first just listen to the dialogue without looking at the screen. Then replay the video and, this time, watch the avatars animate the conversation. Not only will the dialogue make more sense to you, but you will also see firsthand how synchronizing the avatars’ behaviors with speech adds a distinctly human quality that helps us identify the interlocutors’ intent as well as create our own meaning. Is the avatar conversation ironic, funny, familiar, nonsensical, or philosophical? Are they having an argument? Your answer depends on your interpretation. Let me know: what does their conversation mean to you?