
MPEG-7 authoring system



Overview

This system adds indexes that describe the content of a video and generates the metadata used when a video digest is produced. Because fully automatic, computer-based image processing cannot reliably add all of the index locations needed to understand the video content, the system supports manually adding two kinds of indexes:
  • Scene-based indexing
  • Event-based indexing
The indexes are described using MPEG-7 and used as metadata when the video digests are generated.
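
As a concrete illustration, the fragment below is a minimal sketch of how one manually added index entry might be expressed as an MPEG-7-style video-segment description. It is not the actual output of this authoring tool: the element names are simplified from the MPEG-7 description schemes, and the helper function and example values are assumptions.

import xml.etree.ElementTree as ET

def segment_to_mpeg7(title, start, duration, keywords):
    """Build a simplified VideoSegment description with a title, keywords, and timing."""
    seg = ET.Element("VideoSegment")
    annotation = ET.SubElement(seg, "TextAnnotation")
    ET.SubElement(annotation, "FreeTextAnnotation").text = title
    for kw in keywords:
        keyword_block = ET.SubElement(annotation, "KeywordAnnotation")
        ET.SubElement(keyword_block, "Keyword").text = kw
    media_time = ET.SubElement(seg, "MediaTime")
    ET.SubElement(media_time, "MediaTimePoint").text = start     # illustrative time format
    ET.SubElement(media_time, "MediaDuration").text = duration   # illustrative duration format
    return ET.tostring(seg, encoding="unicode")

print(segment_to_mpeg7("Opening headlines", "T00:00:00", "PT0M30S", ["news", "headlines"]))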

Scene-based indexing

Scene-based indexing is suitable for video content with a stable scene structure, such as news programs, documentaries, and movies, and allows semantic indexes to be attached to each scene. The system automatically divides the video into scene segments, so titles and comments can be added to each scene as metadata. Captions, sounds, and keywords can also be used for indexing.

(Figure: Scene-based indexing)
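
As an illustration only, the following sketch shows one way the scene-based metadata described above could be organized; the class and field names are assumptions, not the system's actual data model.

from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneIndex:
    start_sec: float            # scene start, from automatic segmentation
    end_sec: float              # scene end
    title: str = ""             # manually entered title
    comment: str = ""           # manually entered comment
    keywords: List[str] = field(default_factory=list)  # captions, sounds, or keywords used for indexing

# Example: two scenes of a news program annotated after automatic segmentation.
scenes = [
    SceneIndex(0.0, 30.0, title="Opening headlines", keywords=["news", "headlines"]),
    SceneIndex(30.0, 150.0, title="Election report", comment="includes an interview segment"),
]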

Event-based indexing

For sports videos, indexes associated with the individual events in the video are more useful for understanding the content; this is what distinguishes event-based indexing from scene-based indexing. Event-based indexing records the kind of each event and the time at which it occurs as metadata. For example, in a soccer video, when a trigger event occurs, additional information such as the names of the team and the player involved can be added to the index. There are two kinds of trigger events: single trigger events and multi-trigger events. A single trigger event creates an index entry consisting of one trigger event, while a multi-trigger event records information about several related trigger events, for example a soccer play involving cooperation among several players. For actual indexing, assigning each trigger event to a key on the keyboard reduces the indexing time: with the current system, event-based indexing takes only about 1.5 times as long as the actual video. Our next target is real-time indexing.

(Figure: Event-based indexing)
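
The sketch below illustrates the idea of key-driven event indexing, including a multi-trigger event that groups several related trigger events; the key bindings, event kinds, and data fields are hypothetical, not taken from the actual tool.

from dataclasses import dataclass, field
from typing import List

@dataclass
class TriggerEvent:
    kind: str                   # e.g. "pass", "shot", "goal"
    time_sec: float             # when the event occurs in the video
    team: str = ""              # additional information attached to the index
    player: str = ""

@dataclass
class MultiTriggerEvent:
    label: str                                            # e.g. "counter-attack"
    triggers: List[TriggerEvent] = field(default_factory=list)

# Hypothetical key bindings used to cut indexing time.
KEY_BINDINGS = {"p": "pass", "s": "shot", "g": "goal", "f": "foul"}

def on_key(key: str, time_sec: float, team: str = "", player: str = "") -> TriggerEvent:
    """Translate one keystroke into a trigger-event index entry."""
    return TriggerEvent(KEY_BINDINGS[key], time_sec, team, player)

# A multi-trigger event describing a play involving cooperation among several players.
play = MultiTriggerEvent("counter-attack", [
    on_key("p", 1234.2, team="Home", player="No. 7"),
    on_key("s", 1238.8, team="Home", player="No. 9"),
])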

MPEG-7 metadata

The event index associated with a video is modeled as a continuous importance function over time. For scene-based indexing, the index uses discrete weights, so the length of the generated video digest can only change in steps. For event-based indexing, the index weights take continuous values, so a digest of arbitrary length can be generated. A threshold on the importance is set according to the digest length requested by the user, and digests of various lengths can then be produced. In this authoring system, these event indexes and weights are described using MPEG-7.

(Figure: Digest length)
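
As a rough illustration of how a requested digest length determines a threshold on the index weights, the sketch below selects the most important indexed events until the requested length is filled. The fixed clip length and the selection rule are assumptions for the example, not the system's actual algorithm.

from typing import List, Tuple

def select_digest(events: List[Tuple[float, float]],    # (time_sec, weight) pairs
                  requested_sec: float,
                  clip_sec: float = 10.0) -> List[float]:
    """Return the event times whose clips fill the requested digest length."""
    selected: List[float] = []
    for time_sec, _weight in sorted(events, key=lambda e: e[1], reverse=True):
        if len(selected) * clip_sec >= requested_sec:
            break                       # requested length reached; stop lowering the threshold
        selected.append(time_sec)
    return sorted(selected)             # play the chosen clips in video order

# A 20-second digest keeps only the two highest-weighted events.
digest = select_digest([(120.0, 0.9), (340.0, 0.4), (610.0, 0.7)], requested_sec=20.0)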

