Audio Visual Speech Technologies

Audio Visual Speaker Change Detection

 

Speaker change detection provides valuable information for speaker identification and serves as metadata for search and retrieval of multimedia content. Audio-only speaker change detection often lacks robustness because training and test conditions can differ, for example through changes in the acoustic channel or in background noise. This research focuses on exploiting visual speaker and scene change information to overcome these limitations of audio-based speaker change detection.

Key component technologies:

  • Visual scene/speaker change detection
  • Audio-based speaker change detection
  • Fusion techniques
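
As a rough illustration of how the three components listed above might fit together, the sketch below combines per-frame change scores from an audio detector and a visual detector by a weighted sum (a simple late-fusion scheme) and then picks local peaks above a threshold. This is only an assumed example, not the method used in this research: the function names, the weight, the threshold, and the score values are all hypothetical, and both score streams are assumed to be normalized to [0, 1] and aligned to the same frame rate.

```python
# Minimal late-fusion sketch (hypothetical; not the project's actual method).
# Assumes both detectors emit one change score per frame, scaled to [0, 1].

def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Weighted sum of audio and visual change scores, frame by frame."""
    if len(audio_scores) != len(visual_scores):
        raise ValueError("score streams must have the same number of frames")
    visual_weight = 1.0 - audio_weight
    return [audio_weight * a + visual_weight * v
            for a, v in zip(audio_scores, visual_scores)]


def detect_changes(scores, threshold=0.5):
    """Return frame indices that are local peaks at or above the threshold."""
    changes = []
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if score >= threshold and score > left and score >= right:
            changes.append(i)
    return changes


# Made-up scores: frame 2 is supported by both modalities, while frame 5
# is an audio-only spike (for example, a burst of background noise).
audio = [0.1, 0.2, 0.9, 0.3, 0.1, 0.6, 0.1]
visual = [0.0, 0.1, 0.8, 0.2, 0.0, 0.1, 0.0]

audio_only_changes = detect_changes(fuse_scores(audio, visual, audio_weight=1.0))
fused_changes = detect_changes(fuse_scores(audio, visual, audio_weight=0.5))
print(audio_only_changes, fused_changes)
```

In this toy input the audio-only detector reports a change at the noisy frame as well, whereas the fused score keeps only the frame that the visual stream also supports. Other fusion strategies (for example, feature-level fusion or a learned combiner) are equally plausible; the weighted sum is chosen here purely for brevity.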

Paper: