The goal of the Superhuman speech recognition project is to develop a recognition system that meets or exceeds human performance across the full spectrum of noise, channel, and speaker characteristics encountered in the real world. In contrast, current speech recognition systems require extensive tuning to reach acceptable performance in any particular domain. A system that gives optimal performance for, say, transcribing voicemail messages will not be optimal for handling account inquiries at a bank, and may actually fail badly. Similarly, a system that recognizes speech spoken into a hand-held telephone will not produce good results when a person talks into a speakerphone from across the room. The tuning required to achieve good performance on a particular task is expensive and inconvenient, and it hampers the widespread acceptance of the technology. By developing a system that works "out-of-the-box" as well as a person does, this project will enable the truly pervasive use of speech recognition technology.

To reach this level of performance, the Superhuman recognition project sets precisely defined numerical goals each year. These goals focus on reducing the word error rate (WER) across a wide variety of speech sources, including call-center conversations, meetings, voicemail messages, and telephone conversations. In one of our most difficult challenges, we are transcribing the oral histories of Holocaust survivors from the Shoah Visual History Foundation (see the MALACH link on the following page). To date, the Superhuman project has exceeded WER-reduction targets of 25% and 30%, with a compound rate of improvement of 30.1%.
Last updated: 2/4/03