Figure 1: Two images of a test sequence, estimated optical flow and the corresponding color-code.
Motion estimation is an important part of image analysis. Estimated motion vectors in image sequences can be used for motion detection, tracking, identification, segmentation, and 3d reconstruction. Moreover, motion vectors can serve as input for motion analysis, e.g. the detection of abnormal behavior.
Differential methods based on the so-called optical flow are among the most accurate methods for motion estimation. Optical flow methods rest on the assumption that pixel values change between images only because of motion. Their main drawback is the complexity of the algorithms, which may lead to long processing times. Figure 1 shows two images of the Middlebury test sequence “DogDance” and the estimated optical flow. The flow is color-coded according to the color code in the image on the right, i.e. dark red means large motion to the right and yellow means motion towards the bottom. It can be seen that the girl moves to the right while the dog moves to the lower left.
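The brightness-constancy assumption behind differential methods can be illustrated with a minimal Lucas-Kanade-style least-squares estimate. The sketch below is a toy NumPy illustration of the principle, not one of the IOSB methods: it solves the linearized constancy equation Ix·u + Iy·v + It = 0 over a whole image patch (function name and window choice are assumptions for illustration).

```python
import numpy as np

def lucas_kanade_flow(I1, I2):
    """Illustrative sketch: estimate a single (u, v) flow vector for a
    patch by solving the brightness-constancy equation
    Ix*u + Iy*v + It = 0 in a least-squares sense."""
    # Spatial gradients (finite differences) and temporal difference
    Iy, Ix = np.gradient(I1.astype(float))
    It = I2.astype(float) - I1.astype(float)
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    # Least-squares solution of A [u, v]^T = b
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic test: a smooth sinusoidal pattern shifted one pixel to the right
x = np.arange(32)
I1 = np.sin(2 * np.pi * x / 16)[None, :] * np.ones((32, 1))
I2 = np.sin(2 * np.pi * (x - 1) / 16)[None, :] * np.ones((32, 1))
u, v = lucas_kanade_flow(I1, I2)  # u close to 1, v close to 0
```

Real estimators apply this per pixel over small windows and embed it in coarse-to-fine schemes to handle large displacements.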
Current GPUs (Graphics Processing Units) execute parallel algorithms much faster than standard CPUs. GPUs are therefore commonly used in image processing to achieve real-time performance for complex algorithms.
Several methods for motion estimation based on optical flow have been developed at Fraunhofer IOSB. Some allow real-time processing of large images on GPUs, whereas others provide more accurate and reliable estimates.
Figure 2: Input image of a car sequence, corresponding optical flow and segmented foreground objects.
Movement in image sequences provides valuable information about the structure of the scene, the movement of the camera, and the properties of objects. Estimating the displacement field between two successive frames yields the so-called optical flow. The optical flow can be used to segment foreground from background motion, which in turn can play a role in road traffic accident prevention. Figure 2 shows, on the left, an image of a car sequence; in the middle, the motion field computed in real time; and on the right, the motion field corrected for the vehicle's own motion. The foreground objects (a person and the two cars) are clearly visible.
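The idea of correcting for the camera's own motion can be sketched as follows. This is an illustrative NumPy toy under strong assumptions (the median flow vector is taken as the ego-motion and a fixed residual threshold is used; function name and threshold are hypothetical), not the actual pipeline behind Figure 2:

```python
import numpy as np

def segment_foreground(flow, thresh=1.0):
    """Illustrative sketch: treat the median flow vector as the camera's
    ego-motion, subtract it, and mark pixels whose residual motion
    magnitude exceeds `thresh` as foreground."""
    ego = np.median(flow.reshape(-1, 2), axis=0)   # dominant (background) motion
    residual = flow - ego                          # ego-motion compensated field
    magnitude = np.linalg.norm(residual, axis=-1)
    return magnitude > thresh

# Synthetic flow field: background moves (2, 0); a small block moves (5, 0)
flow = np.tile(np.array([2.0, 0.0]), (40, 40, 1))
flow[10:15, 10:15] = [5.0, 0.0]
mask = segment_foreground(flow)  # True only inside the moving block
```

In practice the ego-motion is a full parametric model (e.g. a homography) rather than a single translation, but the subtract-and-threshold structure is the same.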
On the basis of displacement vectors in the image it is possible, for example, to stabilize shaky video recordings, generate image sequence carpets, or achieve higher image resolutions.
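The stabilization application can be sketched in a few lines: accumulate the estimated global displacement between successive frames into a camera trajectory, smooth it, and shift each frame by the difference. This is a hypothetical minimal sketch (the moving-average smoothing and the function name are assumptions), not a description of a specific IOSB implementation:

```python
import numpy as np

def stabilize_trajectory(shifts):
    """Illustrative sketch: given global displacement vectors between
    successive frames, accumulate them into a camera trajectory,
    smooth it with a moving average, and return the per-frame
    correction shift to apply."""
    trajectory = np.cumsum(shifts, axis=0)   # camera path over time
    kernel = np.ones(3) / 3.0                # simple moving-average smoothing
    smoothed = np.array([np.convolve(trajectory[:, i], kernel, mode="same")
                         for i in range(trajectory.shape[1])]).T
    return smoothed - trajectory             # shift each frame by this

# Constant camera motion between frames: interior corrections are zero
shifts = np.ones((6, 2))
corr = stabilize_trajectory(shifts)
```

For an intentionally moving camera, a stronger smoother (or a fitted path) would be used so that deliberate motion is kept and only the jitter is removed.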
Since detected objects rarely remain in one place, especially in street or surveillance scenarios, information about object movement is decisive for assessing a situation. The tracking of individual objects is a major research focus of the VID department.
In some cases the resolution and quality of images are not sufficient for further analysis, e.g. for identifying a person at a large distance. Using optical flow estimates, several images of a sequence or from different cameras can be fused into a higher resolution image. Figure 3 shows a series of low resolution images in the top row. The lower left corner depicts a higher resolution image based only on the first image. On the right, the higher resolution image obtained by combining several images of the sequence, based on the corresponding optical flow fields, is shown. Compared to the left image, the contrast is improved and the edges are more pronounced.
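The core of such fusion can be illustrated with a shift-and-add toy example: align each frame to the reference using its (here integer) flow, then average, which suppresses noise while keeping edges in place. This sketch assumes pure integer translations and ignores the upsampling step of true super-resolution; all names are hypothetical:

```python
import numpy as np

def fuse_images(images, flows):
    """Illustrative shift-and-add sketch: align each frame to the first
    one using its integer flow shift (dy, dx), then average."""
    acc = images[0].astype(float).copy()
    for img, (dy, dx) in zip(images[1:], flows):
        # Undo the motion; np.roll wraps around, fine for this toy case
        aligned = np.roll(np.roll(img.astype(float), -dy, axis=0), -dx, axis=1)
        acc += aligned
    return acc / len(images)

# Synthetic: the same pattern shifted, plus noise; fusion reduces the noise
rng = np.random.default_rng(0)
base = np.zeros((16, 16)); base[8, :] = 10.0   # a bright horizontal line
frames = [base + rng.normal(0, 1, base.shape)]
shifts = []
for dy in (1, 2, 3):
    frames.append(np.roll(base, dy, axis=0) + rng.normal(0, 1, base.shape))
    shifts.append((dy, 0))
fused = fuse_images(frames, shifts)  # lower noise than any single frame
```

Averaging four aligned frames roughly halves the noise standard deviation; with subpixel flows and interpolation onto a finer grid, the same principle yields a genuinely higher-resolution result.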
Motion fields in videos may also be used to detect “abnormal” motion, e.g. a fight. To this end, “normal” motion patterns are learned first. The observed motion fields are then analyzed and compared to the learned patterns in order to detect “abnormal” behavior. We use an “abnormality” index to quantify how “abnormal” a motion field is. Figure 4 shows two input images of an “abnormal” activity. The red point in the top right corner indicates that the motion has been rated as “abnormal”. The third image shows the computed optical flow, and the image on the right shows the “Abnormality” index, where “0” denotes normal and “1” abnormal motion.
Figure 4: Two input images of an “abnormal” sequence, the corresponding motion field and the “Abnormality”-index.
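One simple way to build such a learn-and-compare index can be sketched as follows. This is a hypothetical toy (direction histograms and a half-L1 distance, which lies in [0, 1]), not the actual IOSB abnormality measure:

```python
import numpy as np

def orientation_histogram(flow, bins=8):
    """Magnitude-weighted histogram of flow directions."""
    angles = np.arctan2(flow[..., 1], flow[..., 0]).ravel()
    mags = np.linalg.norm(flow, axis=-1).ravel()
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi),
                           weights=mags)
    s = hist.sum()
    return hist / s if s > 0 else hist

def abnormality_index(flow, normal_hist):
    """Illustrative index in [0, 1]: half the L1 distance between the
    observed direction histogram and the learned 'normal' one."""
    return 0.5 * np.abs(orientation_histogram(flow) - normal_hist).sum()

# Learn "normal" motion (everyone moves right), then score new fields
normal = np.tile(np.array([1.0, 0.0]), (8, 8, 1))
learned = orientation_histogram(normal)
idx_normal = abnormality_index(normal, learned)                      # 0: normal
idx_abnormal = abnormality_index(-normal, learned)                   # 1: abnormal
```

Opposite motion scores 1 and motion matching the learned pattern scores 0, mirroring the “0” = normal, “1” = abnormal scale described above.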
Using images from more than one camera, motion estimates can also be used for 3d reconstruction and 3d motion estimation. Figure 5 shows the input image of a rotating arm. On the right, the 3d reconstruction of this stereo setup is shown together with the corresponding 3d motion estimates.
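For a rectified stereo pair, the underlying geometry can be sketched in a few lines: horizontal disparity gives depth, and triangulating the same tracked point at two time steps gives a 3d motion vector. The focal length, baseline, and function names below are assumed values for illustration, not calibration data from the setup in Figure 5:

```python
import numpy as np

def triangulate(x_left, x_right, y, f=500.0, B=0.1):
    """Illustrative sketch for a rectified stereo pair: recover a 3d
    point from the horizontal disparity d = x_left - x_right, with
    assumed focal length f (pixels) and baseline B (metres)."""
    d = x_left - x_right
    Z = f * B / d          # depth from disparity
    X = x_left * Z / f
    Y = y * Z / f
    return np.array([X, Y, Z])

# 3d motion: triangulate the same tracked point at two time steps.
# The pixel correspondences over time come from the optical flow.
p_t0 = triangulate(100.0, 90.0, 20.0)   # disparity 10 px -> Z = 5 m
p_t1 = triangulate(110.0, 98.0, 20.0)   # disparity 12 px
motion = p_t1 - p_t0                    # 3d displacement vector
```

The optical flow supplies the temporal correspondences, while stereo matching supplies the left-right correspondences; combining both yields the dense 3d motion estimates shown in Figure 5.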