[ IBM Research ]


 Everywhere Displays: Technology
Diagram of the ED-projector
Correction of oblique projection distortion
Detection of user interaction
Prototypes
Related publications

For more information, contact Claudio Pinhanez

 
Diagram of the ED-Projector
 

diagram of ED-projector

 
Correction of Oblique Projection Distortion
 
To correct the distortions caused by oblique projection and by the shape of the projection surface (if it is not flat), the image to be projected must be inversely distorted prior to projection. In general, this distortion is non-linear and computationally expensive to correct, since it involves selectively compressing and expanding different parts of the original image. We have developed a simple scheme that uses standard computer graphics hardware (present now in most computers) to speed up this process. Our method relies on the fact that, geometrically speaking, cameras and projectors with the same focal length are identical. Therefore, to project an image obliquely without distortions it is sufficient to simulate the inverse process (i.e., viewing with a camera) in a virtual 3D computer graphics world.

relation projector-camera

As shown here, we texture-map the image to be displayed onto a virtual 3D computer graphics surface that is identical (up to a scale factor) to the real surface. If the position and attitude of this surface relative to the 3D virtual camera are identical (up to a scale factor) to those of the real surface relative to the projector, and if the virtual camera has the same focal length as the projector, then the view from the 3D virtual camera corresponds exactly to the "view" of the projector (if the projector were a camera). Since projectors do the inverse of viewing, i.e., they project light, the result is a projection free of distortions. In practice we use a standard computer graphics board to render the virtual camera's view of the virtual surface and send the computed view to the projector. If the position and attitude of the virtual surface are correct, the projection of this view compensates for the distortion caused by oblique projection or by the shape of the surface. Of course, a different calibration of the virtual 3D surface must be used for each surface in the environment onto which images are projected.
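
The "up to a scale factor" condition can be checked with a minimal pinhole-camera sketch. This is not the rendering pipeline described above, only an illustration of the geometry; the function names, the surface coordinates, and the focal length are assumptions made for the example.

```python
def project(point, focal_length):
    """Pinhole projection of a 3D point (x, y, z), given in camera coordinates with z > 0."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def scale(point, s):
    """Uniformly scale a 3D point about the camera center."""
    return tuple(s * c for c in point)


# Corners of a hypothetical tilted planar surface, in camera coordinates.
surface = [(-1.0, -1.0, 4.0), (1.0, -1.0, 6.0), (1.0, 1.0, 6.0), (-1.0, 1.0, 4.0)]
f = 2.0  # assumed focal length, shared by the virtual camera and the projector

view = [project(p, f) for p in surface]
# The same scene shrunk 100 times, as a virtual model might be.
view_scaled = [project(scale(p, 0.01), f) for p in surface]
```

Both views contain the same image points, which is why the virtual surface only needs to match the real one in shape and relative pose, not in absolute size.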

[Figure, four panels. Top-left: undistorted pattern. Top-right: warped pattern. Bottom-left: undistorted pattern projected. Bottom-right: warped pattern projected.]

In a typical situation of oblique projection, the pattern shown in the top-left is projected without any correction, resulting in the bottom-left image. After calibration of the virtual 3D surface and camera parameters, the projection of the rendered image (top-right) creates a projection free of distortion (bottom-right). So far we have experimented only with projecting on planar surfaces. The calibration parameters of the virtual 3D surface are determined manually by projecting the pattern shown in Fig. 5 and interactively adjusting the scale, rotation, and position of the virtual surface in the 3D world, as well as the "lens angle" of the 3D virtual camera. Another simple technique to correct for distortion on planar surfaces is to distort the texture to be projected by a homography. In this case, calibration is obtained by interactively grabbing each corner of the projected pattern with the mouse and moving it to the desired location on the surface. However, unlike the previous approach, homographies work only for planar surfaces.
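
For the homography variant, four corner correspondences are enough to determine the warp. The following is a minimal sketch of that standard computation, not the code used in the prototype; the function names and the corner coordinates are assumptions for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """3x3 homography (bottom-right entry fixed to 1) mapping each src corner to its dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(h, point):
    """Map a 2D point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical calibration: texture corners and where the user dragged them.
texture_corners = [(0, 0), (640, 0), (640, 480), (0, 480)]
dragged_corners = [(30, 20), (600, 0), (640, 480), (0, 440)]
H = homography_from_corners(texture_corners, dragged_corners)
```

Applying `apply_homography(H, p)` to every texture coordinate `p` gives the pre-distorted position; in practice graphics hardware would perform this warp as a textured quad.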

For more details, see this IBM technical report.

 
Detection of User Interaction
 

A pan/tilt camera mounted near the projector is used to observe the user's hands as they interact with the projected display. We have implemented just two simple interactions so far: fingertip tracking and isolated button "presses". The prospects for more sophisticated interactions are good.

Segmentation and tracking are challenging in this environment because the user's hand moves through the projected display and therefore changes appearance from moment to moment. Common techniques such as color-based segmentation or the use of edge information are rendered almost useless. Background subtraction does not give reliable results because the projected image often completely overwhelms the inherent surface color.
Since we are primarily interested in following the hand as it moves, we make use of the thresholded frame-to-frame difference.  Moving objects create distinctive shapes in this data that are relatively easy to identify.
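
A thresholded frame-to-frame difference can be sketched in a few lines. The function name, the list-of-rows image representation, and the threshold value are assumptions for illustration, not details taken from the system.

```python
def motion_mask(previous, current, threshold):
    """Binary mask: 1 where a pixel changed by more than `threshold` between two frames."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


# Two tiny grayscale frames (rows of intensities); only three pixels change noticeably.
previous_frame = [[10, 10, 10, 10],
                  [10, 10, 10, 10],
                  [10, 10, 10, 10]]
current_frame = [[10, 12, 10, 10],
                 [10, 90, 95, 10],
                 [10, 10, 80, 10]]

mask = motion_mask(previous_frame, current_frame, threshold=30)
```

Small changes, such as the 2-level change in the first row, fall below the threshold and are ignored.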

selecting a color
movement image
pushing a button
movement image

Slight jitter in the projected image often creates a faint outline in the difference image, but this generally does not present a problem. When it does, it is easily reduced with light morphological filtering.

To identify the fingertip in this data, a fingertip template is convolved over a search region to identify fingertip candidates.  The locations that match well are screened with domain specific heuristics to reach a final decision.

Although the difference image is binary, the fingertip template is grayscale. The gray value indicates the importance of finding "on" image pixels at that point. Since the fingertip may appear either as an outline or as a filled-in shape, the outline of the fingertip is more important than the center. To distinguish the fingertip from knuckles and other noise, it is important that pixels outside the fingertip be "off". If pixels are "on" in that region, they strongly reduce the match score.
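
One way to realize such a weighted template is sketched below. Encoding the "must be off" region as negative weights is an assumption of this sketch, as are the tiny template, the weight values, and the function names; the text does not specify how the penalty is computed.

```python
# Hypothetical 4x5 template for a finger pointing up:
# outline weight 3, interior weight 1, outside weight -2 (penalizes "on" pixels).
TEMPLATE = [[-2, -2, -2, -2, -2],
            [-2,  3,  3,  3, -2],
            [-2,  3,  1,  3, -2],
            [-2,  3,  1,  3, -2]]


def match_score(mask, template, top, left):
    """Weighted sum of binary mask pixels under the template placed at (top, left)."""
    return sum(w * mask[top + r][left + c]
               for r, row in enumerate(template)
               for c, w in enumerate(row))


def candidates(mask, template, region, min_score):
    """All placements inside region (top, left, bottom, right) scoring at least min_score."""
    t_rows, t_cols = len(template), len(template[0])
    top, left, bottom, right = region
    hits = []
    for r in range(top, bottom - t_rows + 1):
        for c in range(left, right - t_cols + 1):
            score = match_score(mask, template, r, c)
            if score >= min_score:
                hits.append((score, r, c))
    return hits


# A filled finger and an outline-only finger in a 6x7 difference image.
filled = [[0, 0, 0, 0, 0, 0, 0]] + [[0, 0, 1, 1, 1, 0, 0] for _ in range(5)]
outline = [[0, 0, 0, 0, 0, 0, 0], [0, 0, 1, 1, 1, 0, 0]] + [[0, 0, 1, 0, 1, 0, 0] for _ in range(4)]

filled_hits = candidates(filled, TEMPLATE, (0, 0, 6, 7), min_score=20)
outline_hits = candidates(outline, TEMPLATE, (0, 0, 6, 7), min_score=20)
```

Because most of the weight sits on the outline, both the filled and the outline-only finger pass the same threshold at the same location, while shifted placements lose score to the negative border.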

fingertip template

During calibration, the interaction area within the image is identified. During operation, a search region within the interaction area is determined from the time and location of the last fingertip detection and the maximum speed of hand motion.
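
A minimal sketch of such a search region follows, assuming a square window whose half-size is the distance the hand could have travelled since the last detection. The function name, units, and numeric values are hypothetical.

```python
def search_region(last_xy, elapsed_s, max_speed_px_s, interaction_area):
    """Window (left, top, right, bottom) the fingertip can have reached, clipped to the interaction area."""
    x, y = last_xy
    reach = max_speed_px_s * elapsed_s  # farthest the hand can have moved, in pixels
    left, top, right, bottom = interaction_area
    return (max(left, x - reach), max(top, y - reach),
            min(right, x + reach), min(bottom, y + reach))


area = (0, 0, 320, 240)  # assumed interaction area found during calibration
region = search_region((100, 60), elapsed_s=0.125, max_speed_px_s=400, interaction_area=area)
clipped = search_region((20, 30), elapsed_s=0.125, max_speed_px_s=400, interaction_area=area)
```

The longer the fingertip has gone undetected, the larger the window becomes, which is consistent with the tracking rate depending on the size of the search region.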

Before matching, the template is scaled and rotated to match the expected orientation of the user's hand.  The regions that match the template sufficiently well are examined to ensure that neither too many nor too few pixels are "on" within that region.  Of the surviving hypotheses, the hypothesis furthest from the user is taken as the true fingertip location.  If no location matches sufficiently well, we assume the user's hand is either motionless or not within the image.
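
The screening and selection step could look like the sketch below. Measuring "furthest from the user" as the largest projection onto a known pointing direction is an assumption of this sketch, as are the candidate format and the pixel-count limits.

```python
def pick_fingertip(hypotheses, pointing_dir, min_on, max_on):
    """Choose the plausible hypothesis furthest along pointing_dir, or None if none survive.

    hypotheses: list of (x, y, on_pixel_count) for regions that matched the template.
    pointing_dir: unit vector pointing from the user into the display, in image coordinates.
    """
    dx, dy = pointing_dir
    plausible = [h for h in hypotheses if min_on <= h[2] <= max_on]
    if not plausible:
        return None  # hand is assumed motionless or outside the image
    x, y, _ = max(plausible, key=lambda h: h[0] * dx + h[1] * dy)
    return (x, y)


# Hypothetical matches; the user stands below the image and reaches upward (negative y).
matches = [(100, 200, 40), (110, 150, 45), (105, 120, 400), (90, 180, 5)]
fingertip = pick_fingertip(matches, pointing_dir=(0, -1), min_on=20, max_on=100)
```

Here the match at y = 120 is rejected for having too many "on" pixels and the one at y = 180 for having too few, so the fingertip is taken to be at (110, 150).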

The following images show the search region and the winning fingertip hypothesis overlaid on the difference image.  The image on the right also shows the active region of the button.
region and fingertip
region and fingertip

Tracking is reliable and stable in nearly any environment. The tracking rate varies between 5 and 30 frames per second on a 500 MHz workstation for a 320x240 image, depending on the size of the search region and the size of the fingertip template (as determined by the expected size of the user's hand).

To use the tracked fingertip as a pointer, the image location of the fingertip is warped back to screen coordinates.  To determine a button press, the trajectory of the fingertip over the last period of time is examined.  If the fingertip disappears for several frames, the point in the recent trajectory which is furthest from the user is examined.  If it lies within a button which has not been pressed within a debounce interval, that button is "pressed". 
This simple algorithm can detect several types of button press: when the user enters the button and stops; when the user exits the button after being stopped (say, because the entry event was missed for some reason); or when the user reaches out to touch a button and pulls back in one motion. Importantly, it does not register a button press when the user flies through a button on the way to or from another location. The algorithm does not work when the user presses several buttons without retracting their hand. It can also fail when the user "flies" their finger around in the image before pressing a button. For interactions where the user is asked to press one button at a time, it works surprisingly well. Performance holds up well as the tracking frame rate drops, though the increased delay before a press is registered can be disturbing to the user.
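
The press logic can be sketched as a small state machine. The class name, the number of missing frames, the debounce interval, the rectangular button format, and the choice to keep the whole trajectory since the last disappearance are all assumptions of this sketch rather than details of the actual implementation.

```python
class PressDetector:
    def __init__(self, buttons, pointing_dir, missing_frames=3, debounce_s=1.0):
        self.buttons = buttons            # name -> (left, top, right, bottom), screen coordinates
        self.pointing_dir = pointing_dir  # unit vector from the user into the display
        self.missing_frames = missing_frames
        self.debounce_s = debounce_s
        self.trajectory = []              # fingertip positions since the last disappearance
        self.missing = 0                  # consecutive frames without a fingertip
        self.last_press = {}              # name -> time of the most recent press

    def update(self, fingertip, now):
        """Feed one frame: fingertip is (x, y) or None. Returns a pressed button name or None."""
        if fingertip is not None:
            self.trajectory.append(fingertip)
            self.missing = 0
            return None
        self.missing += 1
        if self.missing != self.missing_frames or not self.trajectory:
            return None
        # The fingertip has just disappeared: examine the point furthest from the user.
        dx, dy = self.pointing_dir
        x, y = max(self.trajectory, key=lambda p: p[0] * dx + p[1] * dy)
        self.trajectory = []
        for name, (left, top, right, bottom) in self.buttons.items():
            recently = now - self.last_press.get(name, float("-inf")) < self.debounce_s
            if left <= x <= right and top <= y <= bottom and not recently:
                self.last_press[name] = now
                return name
        return None


detector = PressDetector({"ok": (100, 40, 140, 80)}, pointing_dir=(0, -1))
reach_and_stop = [(120, 200), (120, 120), (120, 60), None, None, None]
events = [detector.update(tip, t * 0.1) for t, tip in enumerate(reach_and_stop)]
```

In this sketch a finger that passes through the button and continues further away yields a furthest point outside the button, so no press is registered, which mirrors the fly-through behavior described above.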

This approach relies on several assumptions:
 - the hand comes from within about 30 degrees of a known direction
 - the fingertip is the furthest point from the user
 - the user's hand is the only moving object in the search region

For each surface where the display is to be projected, a calibration step is required to determine the following information:
 - The expected size and orientation of the hand
 - The boundaries of the projected image
 - The location of buttons or active regions
We obtain this in a few steps: sizing and rotating a hand icon to match the image of the user's hand, clicking on the corners of the projected "screen", and drawing shapes over the buttons.
The calibration information is saved for each location, so that it can be reloaded as needed when the display moves or the projected data changes.

calibration screen
 
Prototypes
 
ed prototype 1 ed prototype 2

Prototype 1 (photo: 8/11/00)

  • brightness: 1200 lumens
  • pan: 200° / tilt: 50°

Prototype 2 (photo: 7/11/01)

  • brightness: 3000 lumens
  • pan: 230° / tilt: 50°

Last update: 10/10/02
