A few decades ago, artificial intelligence was just a concept people dreamed about; now it surrounds us. It is embedded in our gadgets, it works as our personal assistant, and in some cases it outperforms human beings. One of these cases is lip-reading. According to studies cited by the MIT Technology Review, artificial intelligence is better at lip-reading than humans are. This comes as a surprise, given how difficult lip-reading is. To be honest, most of us could not read lips even if our lives depended on it. Machines, on the other hand, seem to have a natural gift for it.
The studies found that machines are not only capable of reading the words our lips form, but can also discern speech from silent video clips. LipNet, a new artificial intelligence system developed by a team from the University of Oxford’s Department of Computer Science, correctly identified 93.4 percent of the words in a series of videos. Watching the same material, human volunteers correctly identified only 52.3 percent. LipNet’s results were obtained on a data set known as GRID, which consists of well-lit videos of people facing the camera and reading three-second sentences.
Another team from the same university conducted a separate study on a different data set. This time, the team used more than 100,000 video clips from BBC television. Naturally, these are not as well lit as the GRID videos, and the speakers do not always face the camera. Even so, the artificial intelligence system developed by this team recognized 46.8 percent of all words. In the same scenario, human lip-readers correctly identified only 12.4 percent.
To some of us, this may not seem like a major development. However, the ability of artificial intelligence systems to read lips this accurately can come in handy in many situations. For instance, speech-to-text systems that use this kind of artificial intelligence will be able to do a good job even in less than perfect circumstances, such as noisy environments. Lip-reading artificial intelligence can also prove extremely helpful when people are in distress, and on many other occasions as well.