

Audio Visual Speech Technology Group

Humans use a variety of senses to recognize people and understand their communications. Recently we have begun exploring the use of visual information to improve the performance of audio-based technologies such as speech recognition, speaker recognition, speech event detection, and speaker change detection. This work is a collaboration, led by Chalapathy Neti, among the Human Language Technologies Group (HLT), the Computer Vision Group, and the India Research Lab (ISRC).

The applications for this work include, but are not limited to:

  • Accurate audio transcription for efficient search and retrieval of multimedia content
  • Improved human/computer interfaces that use multiple modes for robust recognition of human activity (speech, gesture, etc.) in realistic environments such as automobiles and public information kiosks, where background noise is a serious problem for recognition technologies based only on acoustics

The navigation bar contains more detailed information about the different areas we are exploring.

 
Related Websites:

  • Johns Hopkins Audio Visual Speech Recognition (Workshop 2000)
  • Audio-Visual Speech Processing (AVSP99)
  • 1999 International Workshop on Multimedia Signal Processing (MMSP99)
  • Inter-Agency Workshop on Smart Computing Environments
  • IEEE International Conference on Multimedia and Expo (ICME2000)