Our team works in some of the most exciting areas of AI multimedia, computer vision, and speech technologies. We’re teaching computers to understand video, augmenting reality to guide field technicians when operations get complex, helping computers recognize people, detect sentiment, and speak with emotion, and enriching video with metadata extracted from it.

The Video AI Technologies group focuses on algorithms and platforms for effective interaction with multimedia content. Domain areas include live and VOD (Video on Demand) streaming and management of multimedia content with rich metadata (derived from video analytics or geographical information).

Technological areas:
Streaming technology and standards, video compression, video analytics, deep learning, anomaly detection, geo-spatial processing and visualization.

Solutions domains:

  • Situational awareness
  • Multimedia-enabled search and retrieval systems


Udi Barzelay, Manager, Video AI Technologies, IBM Research - Haifa




In the area of video analytics, our group conducts research and develops novel computer vision algorithms, including ones based on deep learning and other machine learning tools, for problems such as scene text detection and recognition in natural videos and images, video segmentation, visual recognition and scene understanding, and object tracking. A special focus of our research is the effective analysis of video captured by moving platforms, including in combination with geographical information (GIS), for applications such as dashboard and body-worn camera analytics.

Video Enrichment/Retrieval/Summarization


Using cognitive computing to discover insights from videos

Video Analytics


  • Multimodal video semantic scene detection and segmentation
  • Video text detection and recognition
  • Image enhancement and tracking