Imagine this: you are standing in front of me and I say a well-known celebrity's name, say, Robert Downey Jr. What just happened in your mind? You pictured him, and that picture differs from person to person. I have seen him in the Iron Man suit, so that is how I picture him; maybe you pictured him in a casual look.
This is roughly how a new AI algorithm works: it generates a face by listening to a voice. Named Speech2Face, it mimics this human-like association, and scientists trained it on millions of educational videos featuring more than 100,000 different speakers.
To generate a human face, the Speech2Face algorithm was trained to recognize different facial features, such as the mouth, the nose, and so on. It was then further trained to generate a face from the voices of different people. I know, it sounds confusing.
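To make the idea less confusing, here is a minimal sketch of the voice-to-face pipeline described above. This is not the real Speech2Face model: the actual system uses deep neural networks with learned weights, while here a random projection stands in for the learned voice-encoder and face-decoder, purely to show the shape of the data flowing through.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_voice(audio, dim=512):
    """Map a 1-D audio clip to a fixed-size voice embedding.
    A real encoder would run a neural network over a spectrogram;
    here we just average windowed energies, for illustration only."""
    windows = np.array_split(audio, dim)
    energies = np.array([np.mean(w ** 2) for w in windows])
    return energies / (np.linalg.norm(energies) + 1e-8)

def decode_face(embedding, face_dim=4096):
    """Map the voice embedding to a flattened 64x64 'face' vector.
    The projection matrix is random here; in the real system it is
    learned so the output matches the speaker's facial features."""
    W = rng.standard_normal((face_dim, embedding.size))
    return W @ embedding

audio = rng.standard_normal(16000)       # one second of fake 16 kHz audio
face = decode_face(encode_voice(audio))  # 64 * 64 = 4096 values
print(face.shape)
```

The point of the two-stage design is that the voice is first compressed into a compact embedding, and only that embedding, not the raw audio, is used to predict facial features.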
Right now, Speech2Face is far from perfect and produces mixed results: the faces it generates don't always closely match the voice. For example, during testing, scientists reported that when the AI listened to an audio clip of an Asian man speaking Chinese, it produced the image of an Asian man's face; but after listening to the same man speaking English, it generated the face of a white man.
“As such, the model will only produce average-looking faces. It will not produce images of specific individuals,” the scientists said.