A research team from the University of Oxford’s Department of Computer Science has developed new lip-reading software, LipNet, which the team claims is the most accurate of its kind to date by a wide margin.
The software is detailed in a paper reporting that LipNet beats the previous best lip-reading accuracy by 13.8 percentage points, reaching 93.4%. The previous best software, at 79.6%, was already far ahead of human lip-readers, who averaged 52.3% accuracy on the same test.
According to the paper, “All existing [lip-reading approaches] perform only word classification, not sentence-level sequence prediction…. To the best of our knowledge, LipNet is the first lip-reading model to operate at sentence-level.”
In other words, the software became more effective as it moved closer to how the human brain processes this type of visual data. Rather than treating each word in a video of a speaker as a distinct entity, its deep-learning model predicts words within the larger context of the sentence for greater understanding.
Although the software has not yet been tested beyond this baseline benchmark and still needs further development, this heightened level of accuracy opens up a whole new world of possibilities. For people who depend heavily on sign language and, to a lesser degree, lip-reading, communication can be extremely challenging.