State-of-the-art expressive Text-to-Speech (TTS) technology for delivering information and interacting with enterprise customers, as well as for education and entertainment purposes. In addition to actively participating in the development of the TTS service on the IBM Watson Developer Cloud, we’re working on other facets of TTS. Other current research topics include customizing the TTS voice, either by applying online voice transformation with user-controlled parameters or by learning the characteristics and speaking style of a target speaker from uploaded samples and automatically generating an appropriate voice transformation. We are also working on controlling the expression, emphasis, and emotion of synthesized speech while preserving high quality.
Advanced multimodal biometrics technology for mobile multi-factor authentication solutions. Our focus is on voice, face, and video authentication with biometric fusion, as well as vocal, visual, and audiovisual liveness detection to protect against spoofing/replay attacks.
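As a rough illustration of how biometric fusion and liveness detection can work together, the sketch below gates on liveness first and then fuses per-modality match scores with a weighted sum. The text above does not say which fusion method is used, so everything here is an assumption for illustration: the `ModalityResult` type, the `authenticate` function, the weights, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModalityResult:
    """Hypothetical output of one biometric matcher (e.g. voice or face)."""
    score: float  # match score in [0, 1]; higher means more likely the genuine user
    live: bool    # result of the liveness check for this modality


def authenticate(voice: ModalityResult,
                 face: ModalityResult,
                 voice_weight: float = 0.5,
                 threshold: float = 0.7) -> bool:
    """Accept only if both modalities pass liveness and the fused score is high enough.

    Score-level fusion via a weighted sum is one common choice; it is assumed
    here, not taken from the source.
    """
    if not 0.0 <= voice_weight <= 1.0:
        raise ValueError("voice_weight must be in [0, 1]")
    # Liveness acts as a hard gate: a replayed recording or photo is rejected
    # regardless of how well it matches.
    if not (voice.live and face.live):
        return False
    fused = voice_weight * voice.score + (1.0 - voice_weight) * face.score
    return fused >= threshold
```

Treating liveness as a hard gate rather than as one more score to average means a strong match on a replayed sample can never outweigh a failed liveness check.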
Speech-based Emotion Recognition
Technologies and solutions for speech-based emotion recognition that enable analytics of spoken data as well as affect-aware human-computer interaction. Speech-based emotion recognition combines predictions from both the verbal content (the textual transcript) and the non-verbal content (direct analysis of the speech signal).
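One simple way to combine the verbal and non-verbal predictions is late fusion: each model outputs a probability per emotion label, and the two distributions are averaged with a weight. The sketch below assumes this approach purely for illustration; the source does not state how the predictions are combined, and the function name, label set, and default weight are hypothetical.

```python
from typing import Dict


def fuse_emotion_predictions(text_probs: Dict[str, float],
                             audio_probs: Dict[str, float],
                             text_weight: float = 0.5) -> Dict[str, float]:
    """Weighted late fusion of two per-label probability distributions.

    text_probs:  probabilities from a transcript-based (verbal) model
    audio_probs: probabilities from a signal-based (non-verbal) model
    Both are assumed inputs; no particular model is implied.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    if set(text_probs) != set(audio_probs):
        raise ValueError("both models must score the same emotion labels")

    fused = {
        label: text_weight * text_probs[label]
        + (1.0 - text_weight) * audio_probs[label]
        for label in text_probs
    }
    # Renormalize so the result is a valid distribution even if the
    # inputs do not sum exactly to 1.
    total = sum(fused.values())
    if total <= 0:
        raise ValueError("fused probabilities must have a positive sum")
    return {label: value / total for label, value in fused.items()}


# Example: the words sound mostly neutral, but the voice sounds angry.
text_probs = {"neutral": 0.6, "angry": 0.3, "happy": 0.1}
audio_probs = {"neutral": 0.2, "angry": 0.7, "happy": 0.1}
fused = fuse_emotion_predictions(text_probs, audio_probs)
top_label = max(fused, key=fused.get)
```

In this example the acoustic evidence tips the fused prediction toward "angry" even though the transcript alone favors "neutral", which is the kind of case where combining both channels helps.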