Sunday, December 11, 2016

bookmarks: Obtaining Depth Information from Stereo Images

This paper gives an overview of the main processing steps for depth perception
with a stereo camera.
Depth perception from stereo vision is based on the triangulation principle. We
use two cameras with projective optics and arrange them side by side, such that
their fields of view overlap at the desired object distance. By taking a picture with each camera we capture the scene from two different viewpoints.

For each surface point visible in both images, there are two rays in 3D space connecting the surface point with each camera's centre of projection. In order to obtain the 3D position of the captured scene, we mainly need to accomplish two tasks: First, we need to identify where each surface point that is visible in the left image is located in the right image. Second, the exact camera geometry must be known to compute the ray intersection point for associated pixels of the left and right camera. As we assume the cameras are firmly attached to each other, the geometry only needs to be computed once, during the calibration process.

So, the offset between a point's position in the left image and its position in the right image can be used to calculate the point's distance, as sketched below.
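
For a rectified pinhole stereo pair this relation is simply depth = focal length × baseline / disparity. A minimal numeric sketch (the focal length, baseline and disparity values below are made-up assumptions, not numbers from the paper):

```python
# Triangulation for a rectified stereo pair (pinhole model).
# All numbers are illustrative assumptions, not values from the paper.
f = 700.0   # focal length in pixels
B = 0.10    # baseline between the cameras in metres
d = 14.0    # measured disparity in pixels

Z = f * B / d   # depth along the optical axis: 700 * 0.10 / 14 = 5.0 m
print(Z)        # -> 5.0
```

Note how depth is inversely proportional to disparity: nearby objects produce large offsets between the two images, distant objects small ones.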

The calibration process is mainly used to compute the distortion of the lens (camera optics). We can remove the image distortions by inversely applying the distortion learned during calibration; the resulting undistorted images have straight epipolar lines, depicted in Figure 2 (middle).
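
As a rough OpenCV sketch of this undistortion step, assuming the camera matrix and distortion coefficients have already been estimated with cv2.calibrateCamera() (the matrix values and file names below are placeholders, not real calibration results):

```python
import cv2
import numpy as np

# Placeholder intrinsics and distortion coefficients, as they would be
# returned by a prior cv2.calibrateCamera() run on checkerboard images.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.array([-0.25, 0.12, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

img = cv2.imread("left.png")                   # hypothetical input image
undistorted = cv2.undistort(img, K, dist)      # inverse-apply the lens distortion
cv2.imwrite("left_undistorted.png", undistorted)
```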
Rectification applies an additional perspective transformation to the images, so that the epipolar lines are aligned with the image scanlines; the resulting images are shown in Figure 2 (bottom).
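
A minimal OpenCV sketch of rectification, assuming made-up calibration results (identical intrinsics for both cameras, no distortion, and a pure 0.10 m horizontal baseline):

```python
import cv2
import numpy as np

# Placeholder calibration results for a simple side-by-side stereo rig.
size = (640, 480)
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
d = np.zeros(5)                  # assume no lens distortion here
R = np.eye(3)                    # rotation of right camera relative to left
T = np.array([-0.10, 0.0, 0.0])  # translation (the stereo baseline)

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K, d, K, d, size, R, T)

# Remap both images so that epipolar lines coincide with image scanlines.
map1x, map1y = cv2.initUndistortRectifyMap(K, d, R1, P1, size, cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K, d, R2, P2, size, cv2.CV_32FC1)
rect_left = cv2.remap(cv2.imread("left.png"), map1x, map1y, cv2.INTER_LINEAR)
rect_right = cv2.remap(cv2.imread("right.png"), map2x, map2y, cv2.INTER_LINEAR)
```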
After calibration and rectification, the search for a left-image point is restricted to a single scanline in the right image, so it can be found easily. This step is called stereo matching: for each pixel in the left image, we search along the same scanline in the right image for the pixel which captured the same object point.
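
A minimal matching sketch using OpenCV's semi-global block matcher (one of several possible matchers; the parameters and file names below are just assumptions):

```python
import cv2
import numpy as np

# Rectified input pair (file names are placeholders).
left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; numDisparities must be a multiple of 16.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)

# compute() returns fixed-point disparities scaled by 16.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0
```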

After stereo matching, we have found the offset of every pixel; we call this the disparity map.
Finally, we make a reprojection: we can again use the camera geometry obtained during calibration to convert the pixel-based disparity values into actual metric X, Y and Z coordinates for every pixel. This conversion is called reprojection. Conceptually, we simply intersect the two rays of each associated left and right image pixel.
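
Continuing the sketches above, OpenCV can perform this reprojection with the Q matrix returned by cv2.stereoRectify():

```python
import cv2

# Q is the 4x4 disparity-to-depth matrix from cv2.stereoRectify() above,
# and 'disparity' is the float disparity map from the matching step.
points = cv2.reprojectImageTo3D(disparity, Q)  # per-pixel (X, Y, Z)
depth = points[:, :, 2]                        # metric Z for every pixel
```

Pixels with no valid match (e.g. occluded regions) carry meaningless disparities, so they should be masked out before using the resulting point cloud.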
