PeopleVision

3D Multi-people Tracking

The 3D tracker uses wide-baseline stereo to derive the 3D positions of objects. At every frame, we measure the color distance between all possible pairings of tracks from the two views, using the Bhattacharyya distance between the normalized color histograms of the tracks. For each pair we also measure the triangulation error, defined as the shortest 3D distance between the rays passing through the centroids of the appearance models in the two views. The triangulation error is computed from the camera calibration data.

To establish correspondence, we assign each track in the view with fewer tracks to the track in the other view that minimizes the color distance. This process can assign multiple tracks from one view to the same track in the other; we use the triangulation error to eliminate such multiple assignments. The triangulation error of the final correspondence is then thresholded to eliminate spurious matches, which can occur when an object is visible in only one of the two views.

Once a correspondence is available at a given frame, we establish a match between the existing set of 3D tracks and the 3D objects present in the current frame. We match the component 2D track identifiers of each 3D track against the component 2D track identifiers of the current set of objects. The system also allows partial matches, which keeps a 3D track continuous even when one of its 2D tracks fails.

The 3D tracker thus generates 3D position tracks of the centroid of each moving object in the scene. It also has access to the 2D shape and color models from the two views that make up the track. A demo of the 3D tracker is shown. All demo videos are in MPEG1 format. (video 5.8MB)
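The per-frame correspondence step described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the function names, the array layout, and the sqrt(1 - BC) form of the Bhattacharyya distance are assumptions made for illustration (other forms, such as -ln BC, are also in common use), and the rays are treated as infinite lines whose origins and directions would come from the camera calibration. The description does not say how conflicting assignments are resolved beyond using the triangulation error, so the sketch simply keeps the candidate with the smallest error.

```python
import numpy as np

def bhattacharyya_distance(h1, h2):
    """Color distance between two histograms (normalized here to sum to 1).
    Assumed form: sqrt(1 - BC), where BC is the Bhattacharyya coefficient."""
    p = np.asarray(h1, dtype=float)
    q = np.asarray(h2, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    bc = np.sum(np.sqrt(p * q))
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def triangulate_rays(o1, d1, o2, d2, eps=1e-12):
    """Shortest distance between two 3D lines (origin o, direction d) and the
    midpoint of the shortest connecting segment, used as the 3D position."""
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if denom < eps:              # (near-)parallel lines: fix t1, project onto line 2
        t1, t2 = 0.0, e / c
    else:
        t1 = (b * e - c * d) / denom
        t2 = (a * e - b * d) / denom
    p1, p2 = o1 + t1 * d1, o2 + t2 * d2
    return float(np.linalg.norm(p1 - p2)), (p1 + p2) / 2.0

def match_views(color_dist, tri_err, max_tri_err):
    """Rows index tracks in the view with fewer tracks, columns the other view.
    Each row picks its minimum color distance; when several rows pick the same
    column, the one with the smallest triangulation error is kept (assumed
    tie-break). Matches above max_tri_err are discarded."""
    color_dist = np.asarray(color_dist, dtype=float)
    tri_err = np.asarray(tri_err, dtype=float)
    best = {}                    # column index -> row index
    for i, j in enumerate(np.argmin(color_dist, axis=1)):
        j = int(j)
        if j not in best or tri_err[i, j] < tri_err[best[j], j]:
            best[j] = i
    return sorted((i, j) for j, i in best.items() if tri_err[i, j] <= max_tri_err)
```

In this sketch, a track from the smaller view that loses a conflict is left unmatched for the current frame; the partial-match rule in the 3D track association step is what would keep its 3D track alive.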
 

Picture of 3D track

Other Research Areas: