Video Semantic Summarization Systems
The VideoAnnEx annotation tool assists authors in annotating video sequences with MPEG-7 metadata. Each shot in the video sequence can be annotated with static scene descriptions, key object descriptions, event descriptions, and other lexicon sets. The annotations are associated with each video shot and stored as MPEG-7 descriptions in an output XML file. VideoAnnEx can also open MPEG-7 files to display the annotations for the corresponding video sequence, and it allows customized lexicons to be created, saved, downloaded, and updated.
VideoAnnEx takes an MPEG video sequence as its required input. The tool also requires a corresponding shot segmentation file, in which the input video is divided into smaller units called video shots by detecting scene cuts, dissolves, and fades. The shot file can be loaded from other sources or generated when the input video is first opened. After VideoAnnEx performs shot detection on a video, the shot file can be saved in MPEG-7 schema for later use. Alternatively, the shot file can be generated by the IBM CueVideo Shot Detection Toolkit.
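To make the idea of per-shot annotations stored in an XML file more concrete, here is a minimal Python sketch. It is not the MPEG-7 schema and not the format VideoAnnEx writes: the element names (`VideoAnnotations`, `Shot`, `StaticScene`, `KeyObject`, `Event`, `Keyword`), the `ShotAnnotation` class, and the frame-range attributes are all hypothetical, chosen only to mirror the lexicon categories described above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShotAnnotation:
    """Annotations for one video shot (hypothetical structure)."""
    shot_id: int
    start_frame: int
    end_frame: int
    static_scenes: List[str] = field(default_factory=list)
    key_objects: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)


def shots_to_xml(video_name: str, shots: List[ShotAnnotation]) -> str:
    """Serialize shot annotations to a simplified, non-MPEG-7 XML string."""
    root = ET.Element("VideoAnnotations", {"video": video_name})
    for shot in shots:
        node = ET.SubElement(root, "Shot", {
            "id": str(shot.shot_id),
            "start": str(shot.start_frame),
            "end": str(shot.end_frame),
        })
        # One child element per selected description, grouped by category.
        categories = (
            ("StaticScene", shot.static_scenes),
            ("KeyObject", shot.key_objects),
            ("Event", shot.events),
            ("Keyword", shot.keywords),
        )
        for tag, labels in categories:
            for label in labels:
                ET.SubElement(node, tag).text = label
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = ShotAnnotation(1, 0, 120,
                             static_scenes=["Outdoors"],
                             key_objects=["Rocket"])
    print(shots_to_xml("example.mpg", [example]))
```

A real MPEG-7 description uses the standard's own description schemes and namespaces, so this sketch should be read only as an illustration of the shot-to-annotation mapping.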
Overview of Graphical User Interface
The VideoAnnEx annotation tool is divided into four graphical sections, as illustrated in Figure 1. In the upper right-hand corner is the Video Playback window with shot information. In the upper left-hand corner is the Shot Annotation module with a key frame image display. Along the bottom is the Views Panel, which offers two different previews of the annotations. A fourth component, not shown in Figure 1, is the Region Annotation pop-up window for specifying annotated regions. Together, these four sections provide the interactivity that assists authors in using the annotation tool.
The Video Playback window in the upper right-hand corner displays the opened MPEG video sequence, as shown in Figure 2. The four playback buttons directly below the video display window include:
The Shot Annotation module in the upper left-hand corner displays the defined annotation descriptions and the key frame window, as depicted in Figure 3. As the video plays in the Video Playback window, a key frame image of the current shot is displayed in the Key Frame window. The key frame is a representative image of the video shot segment and thus offers an instantaneous recap of the whole shot. Consequently, the key frame can give the author immediate assistance in annotating the shot. The annotation lexicon is also displayed in the Shot Annotation module. There are three types of lexicon, as follows:
The Views Panel on the bottom displays two different previews of representative images of the video. They are:
The Shots in the Video view shows the key frame of every shot as representative images over the entire video, as illustrated in Figure 5. Below each shot's key frame are the annotated descriptions, if they have already been provided. The author can peruse the entire video sequence in this view and examine the annotated and unannotated shots. The <Prev> and <Next> buttons scroll the view panel horizontally, following the temporal order of the video shots. The author can also double-click on any of the representative images in the panel. This action selects the corresponding shot, so that (1) the shot is displayed in the Video Playback window, (2) its key frame is displayed in the Key Frame window, and (3) its checked descriptions are shown on the Shot Annotation panels. In this preview mode, if the author clicks the <OK> button on the Shot Annotation window, the video runs through the rest of the current shot in <FFF> mode and then plays the next shot in normal playback mode.
The Region Annotation pop-up window shown in Figure 6 allows the author to associate a rectangular region with a labeled text annotation. After the text annotations are identified on the Shot Annotation window, each description can be associated with a corresponding region on the selected key frame of that shot. When the author finishes checking the text annotations and clicks the <OK> button, the Region Annotation window appears. On the left side of the Region Annotation window is a column of descriptions listed under <Annotation List>. On the right side is the selected key frame for this shot, along with any rectangular regions. Each description on the <Annotation List> has at most one corresponding region on the key frame.
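The one-or-no-region rule can be pictured as a mapping from each description to an optional rectangle. The following Python sketch illustrates that data structure only. The names `Rect` and `RegionAnnotations`, and the pixel-coordinate convention, are assumptions made for illustration and do not describe how VideoAnnEx stores regions internally.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle on the key frame (assumed pixel coordinates)."""
    x: int
    y: int
    width: int
    height: int


class RegionAnnotations:
    """Maps each checked description to at most one rectangular region."""

    def __init__(self, descriptions: Iterable[str]) -> None:
        # Every description starts without a region.
        self._regions: Dict[str, Optional[Rect]] = {d: None for d in descriptions}

    def set_region(self, description: str, rect: Rect) -> None:
        if description not in self._regions:
            raise KeyError(f"unknown description: {description}")
        if rect.width <= 0 or rect.height <= 0:
            raise ValueError("region must have positive width and height")
        # Assigning again replaces the earlier region, keeping at most one.
        self._regions[description] = rect

    def clear_region(self, description: str) -> None:
        if description not in self._regions:
            raise KeyError(f"unknown description: {description}")
        self._regions[description] = None

    def region_for(self, description: str) -> Optional[Rect]:
        return self._regions[description]

    def unassigned(self) -> List[str]:
        """Descriptions that currently have no region."""
        return [d for d, r in self._regions.items() if r is None]
```

Keying the mapping by description, rather than keeping a free list of rectangles, is what enforces the at-most-one-region constraint in this sketch.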
The descriptions under the <Annotation List> may be presented in one of four colors:
Download Software
Download the IBM VideoAnnEx annotation tool from the IBM alphaWorks web site:
1. Open an MPEG video for annotation.
2. After an MPEG video is opened, the annotation lexicon appears in the Shot Annotation panel.
3. Play the video sequence in the Video Playback window by selecting the <Play>, <FF>, <FFF>, or <Stop> buttons.
4. The video pauses at the end of the current shot and waits for the author to enter the annotations.
5. For the current video shot,
Each shot should have at least one selection from <Static Scenes> and one from <Key Objects>. Annotations for temporal features and actions can be selected from <Events>. The author can also enter other descriptions in the <Keywords> textbox. Multiple keywords can be entered, as long as they are separated by commas.
7. When the author finishes annotating a shot, click the <OK> button on the Shot Annotation module.
8. View the annotations by switching to the Shots in the Video Views Panel.
9. Save the annotations for this video.
An important step in using the VideoAnnEx annotation tool is to study the annotation lexicon. The lexicon is divided into three categories, as displayed in the Shot Annotation module. When annotating a shot, keep in mind that the shot takes place in some scene, so we suggest annotating the static scene descriptions first. Next, turn your attention to the key subjects in the scene and identify them with key object descriptions. Finally, observe the actions performed by these objects and label them with event descriptions. Some vocabulary is not available in the lexicon. Use the keywords box to annotate such additional descriptions. Keywords may include proper nouns, titles, captions, and other remarks. At the end of this section, we have compiled a list of Keyword Vocabulary, some sample Keyword Images, and a list of Annotation Tips.
After the text annotations for a shot are specified, the regions corresponding to these descriptions are also recorded. The following guidelines for identifying the regions of interest are suggestions only and were written with respect to our goal of training video retrieval models. The guidelines are divided into three parts, corresponding to the three lexicon categories: static scenes, key objects, and events. Here is the summary:
"When in doubt, do not annotate."
Here is a listing of the keywords used in our vocabulary for the TREC Video Retrieval Benchmark. A corresponding set of Keyword Images is listed in the appendix for your reference.
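The selection rules from the annotation steps (at least one <Static Scenes> and one <Key Objects> selection per shot, and comma-separated <Keywords>) are simple enough to check mechanically. Below is a minimal Python sketch of such a check. The function names `parse_keywords` and `missing_categories` are hypothetical and are not part of VideoAnnEx.

```python
from typing import List, Sequence


def parse_keywords(text: str) -> List[str]:
    """Split a comma-separated keywords string, dropping empty entries."""
    return [kw.strip() for kw in text.split(",") if kw.strip()]


def missing_categories(static_scenes: Sequence[str],
                       key_objects: Sequence[str]) -> List[str]:
    """Return the required lexicon categories that have no selection yet."""
    missing = []
    if not static_scenes:
        missing.append("Static Scenes")
    if not key_objects:
        missing.append("Key Objects")
    return missing


if __name__ == "__main__":
    # Illustrative input only; the keyword text is made up for this example.
    print(parse_keywords("launch pad, countdown, "))
    print(missing_categories(["Outdoors"], []))
```

Events are left out of the required check because the steps above describe them as optional selections.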
Annotation Tips
1. On the Key Objects panel, selecting Rocket implies Transportation. Do not click on both, since that only stores redundant information in the database.
2. Although many categories are not actually disjoint, for our purposes we assume that they are. An example of this is that a
3. Do not mix up descriptions that belong to different categories. Anything you label under Static Scene describes the background, and you must pick the one background that best describes the sequence. Clicking on Man-Made under Static Scene means that the background is man-made, not the key object in the foreground. Thus, Man-Made static scenes include Road, Cityscape, and other outdoor places that are not naturally occurring. Note that although Man-Made scenery may be outdoors, you would click on Man-Made, not on Outdoors.
4. In the Events category,
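The redundancy described in the first tip (Rocket already implies Transportation) can be detected automatically if the lexicon is treated as a hierarchy. The Python sketch below assumes a child-to-parent table. The `PARENT` mapping contains only the relationships mentioned in the tips and is otherwise hypothetical, since the full lexicon hierarchy is not given here.

```python
from typing import Dict, List, Sequence

# Child -> parent relationships taken from the tips; the rest of the
# lexicon hierarchy is not known and is deliberately left out.
PARENT: Dict[str, str] = {
    "Rocket": "Transportation",
    "Road": "Man-Made",
    "Cityscape": "Man-Made",
}


def ancestors(label: str, parent: Dict[str, str] = PARENT) -> List[str]:
    """Return every label implied by `label`, nearest ancestor first."""
    found: List[str] = []
    while label in parent:
        label = parent[label]
        if label in found:  # guard against accidental cycles in the table
            break
        found.append(label)
    return found


def drop_implied(selected: Sequence[str],
                 parent: Dict[str, str] = PARENT) -> List[str]:
    """Remove selections that are already implied by a more specific one."""
    implied = set()
    for label in selected:
        implied.update(ancestors(label, parent))
    return [label for label in selected if label not in implied]


if __name__ == "__main__":
    print(drop_implied(["Transportation", "Rocket"]))
```

A general label that is selected on its own is kept. It is dropped only when a more specific descendant is selected alongside it.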
The main menu functions in the IBM VideoAnnEx annotation tool handle file I/O. There are a total of 8 menu functions under the File menu, defined as follows:
Advanced Features
The IBM VideoAnnEx annotation tool provides advanced users with additional functions to refine the annotation process. These features are itemized below by task.
Double-click on the shot image in this Views Panel that you would like to go to. The new current shot will be played back in the Video Playback window. The corresponding key frame will be displayed on the Key Frame window of the Shot Annotation module.
Go to the Frames in the Shot Views Panel to display all the representative I-frames in the current shot. Double-click on the image in this Views Panel that you would like to designate as the new key frame. The new key frame will be displayed on the Key Frame window of the Shot Annotation module.
Designate a different shots list. <File> <Load Shot List> Specify a new shots filename.
Go to the shot in which you want to modify the annotation. [see above] The annotated descriptions will appear below each shot key frame image in this Views Panel. The corresponding descriptions will appear in the annotation lexicon of the Shot Annotation module with check marks. Click on the existing check marks of the corresponding boxes for those annotations that you would like to delete. Click on the corresponding boxes for those annotations that you would like to add. Click <OK> when done modifying this shot.
Load shots list. Use default shot filename or specify a different one. [see above] Open annotation descriptions. <File> <Load MPEG-7 XML> Specify XML filename. Go to the Shots in the Video Views Panel to display all the shot key frames in the video. The annotated descriptions will appear below each shot key frame image in this Views Panel. The corresponding descriptions will appear in the annotation lexicon of the Shot Annotation module with check marks.
In this section, we illustrate how to start using the IBM VideoAnnEx annotation tool to generate an MPEG-7 XML description file. Topics covered include using the basic features of the tool to display the video content, annotate the video sequence, save the annotations, and review the annotations.