VideoAnnEx Annotation Tool

User Manual

 

The IBM VideoAnnEx Annotation Tool user manual is divided into two sections.  The first section, Graphical User Interface, explains the annotation tool interfaces and defines the various display modules.  The second section, User's Guide, describes the annotation functions and includes an annotation example using the tool.  For a crash course, skip to the Annotate Video section of this manual.

 

Table of Contents

   Graphical User Interface
           Video Playback
           Shot Annotation
           Views Panel
                 Frames in the Shot
                 Shots in the Video
           Region Annotation
   User's Guide
           Annotate Video
           Annotation Tips
           Functions and Features
                 File I/O Menu
                 Tool Menu
                 Lexicon Menu
           Annotation Example


Graphical User Interface

    The VideoAnnEx annotation tool is divided into four graphical sections, as illustrated in Figure 1.  In the upper right-hand corner of the tool is the Video Playback window with shot information.  In the upper left-hand corner is the Shot Annotation module with a key frame image display.  On the bottom portion of the tool is the Views Panel, which offers two different previews of the annotation.  A fourth component, not shown in Figure 1, is the Region Annotation pop-up window for specifying annotated regions.  Together, these four sections provide the interactivity that assists authors in using the annotation tool.

Figure 1: IBM VideoAnnEx Annotation Tool divided into four regions: (1) Video Playback, (2) Shot Annotation, (3) Views Panel, and (4) Region Annotation (not shown).

    The Video Playback window in the upper right-hand corner displays the opened MPEG video sequence, as shown in Figure 2.  The four playback buttons directly below the video display window are:

  • Play - Play the video in normal real-time mode.
  • FF - Play the video in fast forward mode [display I- and P-frames].
  • FFF - Play the video in super fast forward [display only I-frames].
  • Stop - Pause the video at the current frame.
As the video is played back in the display window, the current shot information is given as well.  This shot information includes the current shot number, the shot start frame, and the shot end frame.  Note that shot numbering starts at 0, so the first shot is shot 0.
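
The three playback modes differ only in which MPEG frame types are displayed.  The following Python sketch illustrates that selection rule.  It assumes each frame is tagged with its type (I, P, or B) and that normal playback shows every frame; the names PLAYBACK_FRAME_TYPES and frames_to_display are hypothetical and are not part of the tool.

```python
# Frame types shown in each playback mode.  This illustrates the rule
# described above; it is not the tool's actual implementation.
PLAYBACK_FRAME_TYPES = {
    "Play": {"I", "P", "B"},  # assumption: normal playback shows every frame
    "FF": {"I", "P"},         # fast forward: I- and P-frames
    "FFF": {"I"},             # super fast forward: I-frames only
}


def frames_to_display(frame_types, mode):
    """Return the indices of the frames that would be shown in `mode`."""
    allowed = PLAYBACK_FRAME_TYPES[mode]
    return [i for i, t in enumerate(frame_types) if t in allowed]


# A hypothetical 12-frame group of pictures.
gop = list("IBBPBBPBBPBB")
print(frames_to_display(gop, "FF"))   # [0, 3, 6, 9]
print(frames_to_display(gop, "FFF"))  # [0]
```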

Figure 2: Video Playback of the IBM VideoAnnEx Annotation Tool.

    The Shot Annotation module in the upper left-hand corner displays the defined annotation descriptions and the key frame window, as depicted in Figure 3.  As the video plays in the Video Playback window, a key frame image of the current shot is displayed in the Key Frame window.  The key frame is a representative image of the video shot segment and thus offers an instantaneous recap of the whole video shot.  Consequently, the key frame may give the author immediate assistance in annotating the shot descriptions.  The annotation lexicon is also displayed in the Shot Annotation module.  There are three types of lexicon, as follows:

  • Events - List the action events that can be used to annotate the shots.
  • Static Scenes - List the background static scenes that can be used to annotate the shots.
  • Key Objects - List the significant objects that are present in the shots.
In each of the three lexicons, the descriptions are organized in a hierarchical tree structure.  These annotation descriptions have corresponding check boxes for the author to select.  Furthermore, there is a Keywords box for customized annotations.  Once the check boxes have been selected and the keywords typed, the author clicks the <OK> button to advance to the next shot.
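
As an illustration of the hierarchical lexicon with check boxes, here is a minimal Python sketch.  It is not the tool's internal data model: the LexiconNode class, the checked_labels helper, and the intermediate labels "Animal" and "Transportation" are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LexiconNode:
    """One label in a lexicon tree, together with its check-box state."""
    label: str
    checked: bool = False
    children: List["LexiconNode"] = field(default_factory=list)


def checked_labels(node, path=()):
    """Yield the full path of every checked label at or below `node`."""
    here = path + (node.label,)
    if node.checked:
        yield " / ".join(here)
    for child in node.children:
        yield from checked_labels(child, here)


# Hypothetical fragment of a Key Objects lexicon.
key_objects = LexiconNode("Key Objects", children=[
    LexiconNode("Animal", children=[LexiconNode("Deer", checked=True)]),
    LexiconNode("Transportation", children=[
        LexiconNode("Airplane", checked=True),
        LexiconNode("Boat"),
    ]),
])

print(list(checked_labels(key_objects)))
# ['Key Objects / Animal / Deer', 'Key Objects / Transportation / Airplane']
```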

Figure 3: Shot Annotation of the IBM VideoAnnEx Annotation Tool.

    The Views Panel on the bottom displays two different previews of representative images of the video.  They are:

The Frames in the Shot view shows all the I-frames of the current shot as representative images, as shown in Figure 4.  A maximum of 18 images can be displayed in this view.  This gives the author instantaneous temporal insight into the video shot without having to play back the shot over time.  The <Prev> and <Next> buttons refresh the view panel to show the frames of the previous and next shots in the video sequence.  The author can also double-click on any of the representative images in the panel.  This action designates the selected image as the new key frame for this shot, and the image is then displayed in the Key Frame window.  In this preview mode, if the author clicks the <OK> button in the Shot Annotation window, the video stops playback of the current shot and advances to play the next shot.

Figure 4: Frames in the Shot of the Views Panel in the IBM VideoAnnEx Annotation Tool.

The Shots in the Video view shows the key frame of each shot as representative images over the entire video, as illustrated in Figure 5.  Below each shot's key frame are the annotated descriptions, if they have already been provided.  The author can peruse the entire video sequence in this view and examine the annotated and unannotated shots.  The <Prev> and <Next> buttons scroll the view panel horizontally, following the temporal ordering of the video shots.  The author can also double-click on any of the representative images in the panel.  This action selects the corresponding shot, resulting in (1) the appropriate shot being displayed in the Video Playback window, (2) the corresponding key frame being displayed in the Key Frame window, and (3) the corresponding descriptions being checked in the Shot Annotation panels.  In this preview mode, if the author clicks the <OK> button in the Shot Annotation window, the video plays back the current shot in FFF mode and then advances to play the next shot in normal playback mode.

Figure 5: Shots in the Video of the Views Panel in the IBM VideoAnnEx Annotation Tool.

    The Region Annotation pop-up window shown in Figure 6 allows the author to associate a rectangular region with a labeled text annotation.  After the text annotations are identified in the Shot Annotation window, each description can be associated with a corresponding region on the selected key frame of that shot.  When the author finishes checking the text annotations and clicks the <OK> button, the Region Annotation window appears.  On the left side of the Region Annotation window is a column of descriptions listed under <Annotation List>.  On the right side is the selected key frame for this shot, along with any rectangular regions.  Each description in the <Annotation List> has at most one corresponding region on the key frame.

Figure 6: Region Annotation of the IBM VideoAnnEx Annotation Tool.

The descriptions under the <Annotation List> may be presented in one of four colors:

  • Black - the corresponding description has not been region annotated.
  • Blue - the corresponding description is currently selected.
  • Gray - the corresponding description has been labeled with a rectangular region.
  • Red - the corresponding description has no applicable region (i.e., the author clicked <N/A>).
The regions on the Key Frame image may be presented in one of two colors:
  • Blue - the region is associated with a description that is not currently selected (i.e., a description shown in Gray).
  • White - the region is associated with the currently selected description (i.e., the description shown in Blue).
When the Region Annotation window pops up, the first description in the <Annotation List> is selected and highlighted in Blue, while the other descriptions are colored Black.  The system then waits for the author to mark the region of the image where the description appears by clicking and dragging a rectangular bounding box around the area of interest.  As soon as the region is designated for one description, the system advances to the next description on the list.  If there is no applicable region on the key frame image, click the <N/A> button, and the corresponding description will appear in Red.  At any time, the author can click any description in the <Annotation List> to make it the current selection.  The description text then appears in Blue and the corresponding region, if any, appears in White.  This also allows the author to modify the region of any description at any time.  For rules regarding region annotation, please refer to the Annotation Guidelines.
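
The color rules and the advance-to-next behavior can be summarized in a short Python sketch.  It is a reading of the rules above rather than the tool's code.  The function names are hypothetical, and two points are assumptions: the current selection takes priority over the other colors, and nothing is selected once the end of the list is reached.

```python
def list_color(description, current, regions, not_applicable):
    """Color of a description in the <Annotation List>."""
    if description == current:
        return "Blue"   # currently selected (assumed to override other colors)
    if description in not_applicable:
        return "Red"    # the author clicked <N/A>
    if description in regions:
        return "Gray"   # already labeled with a rectangular region
    return "Black"      # not yet region annotated


def region_color(description, current):
    """Color of a description's rectangle on the key frame."""
    return "White" if description == current else "Blue"


def next_description(descriptions, current):
    """Advance to the next description; None at the end (an assumption)."""
    i = descriptions.index(current) + 1
    return descriptions[i] if i < len(descriptions) else None


# Hypothetical shot: a region was just drawn for "Sky".
descriptions = ["Sky", "Airplane", "Rocket Launch"]
regions = {"Sky": (0, 0, 320, 80)}  # hypothetical (x, y, width, height)
not_applicable = set()
current = next_description(descriptions, "Sky")
print([list_color(d, current, regions, not_applicable) for d in descriptions])
# ['Gray', 'Blue', 'Black']
```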



User's Guide
Annotate Video

1.    Open an MPEG video for annotation.
        > File    > Open
        Select the location of the MPEG-1 or MPEG-2 video file.

2.    After the MPEG video is opened, the annotation tool searches for the corresponding shot segmentation information.
       First, the tool checks for the corresponding MPEG-7 XML file (myfile.mp7.xml) in the same directory. If this file is found, the tool automatically loads it.
       Second, the tool checks for the corresponding video shots file (myfile.sht.xml) in the same directory.  If this file is found, the tool automatically loads it.
       Finally, if the tool does not find any shot information, it generates the information by performing shot boundary detection.
       This process executes in the background, and the detected shots are numbered in the lower right-hand corner of the annotation tool.

3.    After the MPEG video is opened and the shot information is loaded, the tool loads the annotation lexicon.
       First, the tool checks for the corresponding lexicon file (myfile.lex.xml) in the same directory.  If this file is found, the tool automatically loads it.
       Otherwise, the tool automatically loads the default lexicon set (VideoAnnEx_default.lex.xml).
       The lexicon labels appear in the Shot Annotation panel.

4.    Associated with each MPEG video is a corresponding random frame access file (myfile.frp).
       If this file is not found in the same directory as the video, the frame access file is automatically generated and saved.

5.    Play the video sequence on the Video Playback window by selecting the <Play>, <FF>, <FFF>, or <Stop> buttons.

6.    The video will pause playing at the end of the current shot, waiting for the author to enter the annotations.

7.    For the current video shot, perform steps 8 and 9.

8.    Identify the annotation for a shot by selecting the check boxes on the Shot Annotation module.
        Each shot should have at least one selection from the <Static Scenes> and from the <Key Objects>.
        Annotations for temporal features and actions can be selected from <Events>.
        Furthermore, the author can specify other descriptions in the <Keywords> text box.
        Multiple entries can be entered for <Keywords>, as long as they are separated by commas.

9.    When finished annotating a shot, click the <OK> button on the Shot Annotation module
        to advance to the next shot.

10.    View the annotations by switching to the Shots in the Video Views Panel.

11.    Save the annotations for this video.
        > File    > Save MPEG-7 XML
        Specify the location and filename.
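
The companion-file lookup described in steps 2 through 4 can be sketched in Python as follows.  This is an illustration only: the function names are hypothetical, "myfile" is assumed to be the video file name without its extension, and the location of the default lexicon file is not stated in this manual, so a bare file name is used.

```python
from pathlib import Path

DEFAULT_LEXICON = "VideoAnnEx_default.lex.xml"  # location is an assumption


def sidecar(video, suffix):
    """Companion file next to the video, e.g. myfile.mpg -> myfile.sht.xml."""
    video = Path(video)
    return video.with_name(video.stem + suffix)


def resolve_companions(video):
    """Decide which companion files would be loaded or generated."""
    plan = {}
    # Step 2: MPEG-7 XML first, then the shots file, otherwise detect shots.
    for kind, suffix in (("mpeg7", ".mp7.xml"), ("shots", ".sht.xml")):
        path = sidecar(video, suffix)
        if path.is_file():
            plan["shots"] = (kind, path)
            break
    else:
        plan["shots"] = ("detect", None)
    # Step 3: the per-video lexicon if present, otherwise the default set.
    lexicon = sidecar(video, ".lex.xml")
    plan["lexicon"] = lexicon if lexicon.is_file() else Path(DEFAULT_LEXICON)
    # Step 4: the frame access file is generated when it is missing.
    frp = sidecar(video, ".frp")
    plan["frp"] = ("load" if frp.is_file() else "generate", frp)
    return plan
```

For example, calling resolve_companions("videos/myfile.mpg") in a directory that contains only the video would report that shots must be detected, that the default lexicon is used, and that the frame access file must be generated.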


Annotation Tips

An important step in using the VideoAnnEx annotation tool is to study the annotation lexicon.  The lexicon is divided into three categories, as displayed in the Shot Annotation module.  When annotating a shot, keep in mind that the shot occurs in some scene, so we suggest annotating the static scene descriptions first.  Next, focus on the key subjects in the scene and identify them with key object descriptions.  Finally, observe the actions performed by these objects and label them with event descriptions.  Some vocabulary may not be available in the lexicon; use the keywords box to annotate such additional descriptions.  Keywords may include proper nouns, titles, captions, and other remarks.
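
Because multiple keywords are typed into one box and separated by commas, a consumer of the annotations has to split the entry back into individual keywords.  The Python sketch below shows one way to do that.  Trimming surrounding spaces and dropping empty entries are assumptions, not documented behavior of the tool, and parse_keywords is a hypothetical name.

```python
def parse_keywords(text):
    """Split a comma-separated <Keywords> entry into individual keywords."""
    return [k.strip() for k in text.split(",") if k.strip()]


print(parse_keywords("Apollo 11, NASA , lift-off,"))
# ['Apollo 11', 'NASA', 'lift-off']
```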

After the text annotations for a shot are specified, the regions corresponding to these descriptions are also recorded.  Here are the guidelines for identifying the regions of interest.  Note that these guidelines are suggestions only and were written with our goal of training video retrieval models in mind.  The guidelines are divided into three parts, corresponding to the three lexicon categories: static scenes, key objects, and events.  In summary:

  • For a static scene annotation, we can inscribe the bounding box within the region of interest, so as to capture the corresponding color and texture features.  [e.g., clouds, water, greenery, desert]
  • For a key object annotation, we can circumscribe the bounding box around the region of interest, so as to capture the corresponding shape, edge, and dominant color.  [e.g., airplane, deer, flag, person, logo]
  • For an event annotation, we do not need to specify any region for the bounding box, since the key object(s) that performed these actions are already annotated.  [e.g., rocket launch, boat sailing, person speaking]
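
The difference between an inscribed and a circumscribed bounding box can be made concrete with a small Python sketch.  In the tool, the author draws the box by hand; the binary mask below is a hypothetical stand-in for the region of interest, and both function names are illustrative.

```python
def circumscribed_box(mask):
    """Tightest box (top, left, bottom, right) enclosing every region pixel."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def is_inscribed(mask, box):
    """True if every pixel inside `box` belongs to the region."""
    top, left, bottom, right = box
    return all(mask[r][c]
               for r in range(top, bottom + 1)
               for c in range(left, right + 1))


# Hypothetical diamond-shaped region of interest (1 = inside the region).
mask = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
]

outer = circumscribed_box(mask)
print(outer)                             # (0, 0, 3, 4)
print(is_inscribed(mask, outer))         # False: the corners are background
print(is_inscribed(mask, (1, 1, 3, 3)))  # True: the box lies inside the region
```

A circumscribed box keeps the whole outline of a key object at the cost of including some background, while an inscribed box contains only the scene's own color and texture.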

Functions and Features


File I/O Menu

The main menu functions in the IBM VideoAnnEx Annotation Tool are file I/O.  There are a total of nine menu functions under the File menu, defined as follows:
 
  •     Open - Open an MPEG-1 or MPEG-2 video file and corresponding video shots file.  If a FRP frame random-access file exists in the same directory, this file is loaded as well; otherwise the FRP will be generated automatically.
  •     Save MPEG-7 XML - Save the video annotation as an MPEG-7 XML file.
  •     Load MPEG-7 XML - Load the video annotation from a specified MPEG-7 XML file.
  •     Save Shot List - Save the new shots list.  The original video shots file may be modified to include a different key frame for any shot.
  •     Load Shot List - Load an existing shots list, instead of the default one loaded under the <Open> menu.
  •     Save Shot Frames - Save all the frames in the current shot as individual JPEG images under the current directory.
  •     Save Shot I-Frames - Save all the I-frames in the current shot as individual JPEG images under the current directory.
  •     Save All Key Frames - Save all the key frames in the entire video as individual JPEG images under the current directory.
  •     Exit - Exit from the VideoAnnEx annotation tool.

 


Tool Menu

The second menu function is associated with the annotation tool mode.  There are 3 menu functions under the Tool menu, defined as follows:
 
  •     Annotation Learning - Selecting this mode allows the tool to assist the annotator in finding similar shots and labeling them with the same descriptions.
  •     Region Annotation - Allow region annotation of the key frame and associates each description label with a corresponding region.
  •     Horizontal Stretch 2:1 - Display the video frame in a stretched mode by doubling the width.
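
As an illustration of a 2:1 horizontal stretch, the Python sketch below doubles the width of a frame by repeating each pixel.  Pixel replication is an assumption; this manual does not say how the tool resamples the frame, and stretch_2_to_1 is a hypothetical name.

```python
def stretch_2_to_1(frame):
    """Double the width of a frame by repeating every pixel in each row."""
    return [[pixel for pixel in row for _ in range(2)] for row in frame]


# Hypothetical 3x2 frame; the values stand in for pixels.
frame = [[1, 2, 3],
         [4, 5, 6]]
print(stretch_2_to_1(frame))
# [[1, 1, 2, 2, 3, 3], [4, 4, 5, 5, 6, 6]]
```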

 


Lexicon Menu

The third menu function is associated with the annotation lexicon.  There are seven menu functions under the Lexicon menu, defined as follows:
 
  •     Load Lexicon - Load a specific lexicon set.  This allows the annotator to use the appropriate lexicons for different applications.
  •     Save Lexicon - Save the current customized lexicon, which can be created or modified by the annotation tool.
  •     New Lexicon - Start with an empty lexicon and allow the annotator to define new lexical terms and hierarchies.
  •     New Sibling Label - Create a new lexical entry that is a sibling of the active lexicon label.
  •     New Child Label - Create a new lexical entry that is a child of the active lexicon label.
  •     New Parent Label - Create a new lexical entry that is the parent of the active lexicon label.
  •     Delete Label - Delete the active lexicon label.

Each lexicon entry can also be modified by right-clicking on the label.
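
The four label-editing functions correspond to ordinary tree operations.  The Python sketch below shows one possible reading of them.  It is not the tool's implementation: the Label class is hypothetical, New Parent Label is assumed to insert the new entry between the active label and its old parent, and Delete Label is assumed to remove the label together with its children.

```python
class Label:
    """A lexicon label that knows its parent and its children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def new_child(self, name):
        return Label(name, parent=self)

    def new_sibling(self, name):
        # Shares the active label's parent (a root label has no parent).
        return Label(name, parent=self.parent)

    def new_parent(self, name):
        """Insert a new label between this label and its current parent."""
        old_parent = self.parent
        new = Label(name)
        new.parent = old_parent
        if old_parent is not None:
            i = old_parent.children.index(self)
            old_parent.children[i] = new
        self.parent = new
        new.children.append(self)
        return new

    def delete(self):
        """Detach this label (and, by assumption, its whole subtree)."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

    def outline(self, depth=0):
        """Return the subtree as indented lines, one label per line."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# Hypothetical editing session on a Key Objects lexicon.
root = Label("Key Objects")
deer = root.new_child("Deer")
airplane = deer.new_sibling("Airplane")
deer.new_parent("Animal")
print("\n".join(root.outline()))
```

The session prints Key Objects at the top level, Animal and Airplane beneath it, and Deer beneath Animal.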




Annotation Example

In this section, we illustrate how to start using the IBM VideoAnnEx Annotation Tool to generate an MPEG-7 XML description file.  The topics covered include using the basic features of the tool to display the video content, annotate the video sequence, save the annotations, and review the annotations.
 
 

Open a Video Sequence

On Menu, <File>  <Open>
Specify the video filename.

Play the Video Content

On Video Playback, <Play> or <FF> or <FFF>
Pause by clicking <Stop>

View all Frames in the Shot

On Views Panel, <Frames in the Shot>

View all Shots in the Video

On Views Panel, <Shots in the Video>

Study the Annotation Lexicons

On Shot Annotation, scroll up and down the <Events>, <Static Scenes>, and <Key Objects> panels.  Note the hierarchical structures of the annotation lexicons.

Annotate the Shot

On Shot Annotation, click the boxes next to the appropriate annotations that describe the video shot.
Also, type additional descriptions in the <Keywords> box.  When finished with the shot annotation, click the <OK> button.

Check the Shot Annotations

On Views Panel, go to <Shots in the Video>.
The annotations are listed under the key images of each shot.

Save the Annotations

On Menu, <File>  <Save MPEG-7 XML>.
Specify the output XML filename.
 

Load the Annotations

First, the video sequence must be opened.
On Menu, <File>  <Load MPEG-7 XML>.
Specify the XML filename. 

Select New Key Frame for a Shot

Go to the shot whose key frame is to be modified.
On Views Panel, select <Frames in the Shot>.
Double-click on the desired image to designate it as the new key frame for this shot.  The new key frame will be displayed in the <Key Frame> window of the Shot Annotation module.

Modify the Shot Annotations

Go to the shot whose annotations are to be modified.
On Views Panel, go to <Shots in the Video>.
Double-click on the key image of that shot.
The key image will become highlighted.
The corresponding annotations will be displayed in the Shot Annotation window with marked check boxes.
Modify the annotation by clicking the check boxes.
 
