Shape from Motion Project: 3D Modelling by Analogic Video Input Data for the Reconstruction of Archaeological Sites

Calori, L.; Forte, M.; Guidazzoli, A.; Fraticelli, F.; Simani, Silvio

One of the fundamental problems in early computer vision is the measurement of optical flow. In many cases, when a scene is observed by a camera there is motion, created either by the movement of the camera or by the independent movement of objects in the scene. In both cases the goal is to assign a 3D velocity vector to each visible point. In general, it is impossible to infer the 3D velocity map from a single view; instead, most motion estimation algorithms calculate the projection of the velocity map onto the imaging surface. A large number of different algorithms have been developed to solve this problem, which has received much attention because of its many applications: tasks such as passive scene interpretation, image segmentation, surface structure reconstruction, inference of egomotion, and active navigation all use optical flow as input. Until now, most motion estimation algorithms have considered optical flow with displacements of only a few pixels per frame. This limits their application to slower motions and fails to seriously address the issue of motion blur. Moreover, they work on images that are assumed to be taken with an infinitely small exposure time, more or less in a "stop and shoot" approach, which limits real-time applications. Most of these algorithms also work on a series of images, calculating the displacement of every pixel from image to image and ignoring any information about motion that exists within each single image. In this work we approach the problem of visual motion estimation from a different point of view. We have developed an algorithm that interprets the motion blur to estimate the optical flow field from a single image. The algorithm works in the spatial frequency domain, using all the information that can be gathered from a patch of the image.
The first step is to obtain the logarithm of the spectrum by applying the Fast Fourier Transform (FFT) to the image patch. The result shows an ellipse centered on the origin, oriented perpendicular to the velocity vector. To extract this orientation we apply a steerable second Gaussian derivative filter. To compute the magnitude of the velocity vector, which is a scalar, we need only a 1D signal. Noise patterns make it difficult to estimate this magnitude directly from the 2D log spectrum. We therefore collapse the 2D log spectrum along the orientation of the velocity vector, reducing it from two dimensions to one. The next step is to take the cepstrum by applying the FFT to this one-dimensional signal. The cepstrum is useful for extracting the magnitude of the velocity vector because the central ellipse in the 2D spectrum is echoed along the direction of the velocity vector with a period determined by the magnitude of the velocity vector. This shows up in the cepstrum as a negative peak at the value of the magnitude of the velocity vector. The computational complexity of the algorithm is bounded by the FFT, which is O(n log n), where n is the number of pixels in the image.
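The magnitude-estimation steps described above (log spectrum, collapse to 1D, cepstrum, negative peak) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that are not in the abstract: the motion is purely horizontal, so the steerable-filter orientation step is skipped; the blur is a synthetic box kernel applied circularly to a random texture; and the function names `blur_horizontal` and `estimate_blur_length` are hypothetical.

```python
import numpy as np


def blur_horizontal(image, length):
    """Simulate horizontal motion blur (assumption: uniform box kernel,
    applied circularly along each row via the FFT)."""
    n = image.shape[1]
    kernel = np.zeros(n)
    kernel[:length] = 1.0 / length
    spectrum = np.fft.fft(image, axis=1) * np.fft.fft(kernel)
    return np.real(np.fft.ifft(spectrum, axis=1))


def estimate_blur_length(patch, eps=1e-8):
    """Estimate the blur length in pixels, assuming horizontal motion."""
    # Step 1: logarithm of the magnitude spectrum of the patch.
    log_spectrum = np.log(np.abs(np.fft.fft2(patch)) + eps)
    # Step 2: collapse 2D -> 1D by averaging over vertical frequency,
    # leaving a profile indexed by frequency along the motion axis.
    profile = log_spectrum.mean(axis=0)
    # Step 3: cepstrum of the 1D profile (real part of its FFT).
    cepstrum = np.real(np.fft.fft(profile))
    # Step 4: the most negative value, excluding index 0 and the
    # mirrored upper half, marks the blur length.
    half = patch.shape[1] // 2
    return int(np.argmin(cepstrum[1:half])) + 1


rng = np.random.default_rng(0)
sharp = rng.standard_normal((128, 128))   # synthetic texture, not real data
blurred = blur_horizontal(sharp, 7)
print(estimate_blur_length(blurred))
```

For motion in an arbitrary direction, the orientation would first have to be estimated (the abstract uses a steerable second Gaussian derivative filter) and the log spectrum collapsed along that direction instead of along a fixed image axis. Real patches would also need care with windowing and noise, which this synthetic example avoids.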

SFERA Archivio dei prodotti della Ricerca dell'Università di Ferrara