The Speech Technologies group focuses on voice and multimodal technologies and their use for advanced services and applications. We create technology components, solutions, frameworks, and services that enhance the experience and capabilities offered to mobile users and enterprises. The group's expertise covers a wide spectrum of technologies for expressive speech synthesis, speech-based emotion detection, and multimodal biometrics.
Currently, the group's activities focus on three areas:
- Text-to-Speech - We develop state-of-the-art expressive text-to-speech technology for delivering information and interacting with enterprise customers. The technology is currently offered as a service in the IBM Watson Developer Cloud.
- Multimodal Biometrics - We develop advanced multimodal biometrics technology and services for mobile multi-factor authentication solutions as well as for the identification of people by robots. Our focus is mainly on the voice and face modalities.
- Affective Computing - We develop technologies and solutions for speech-based emotion detection as well as affective speech synthesis. These technologies can be used for analytics of spoken data as well as for affect-aware human-computer interaction.