Computational vision is concerned with the automatic processing of image and video data for scene reconstruction, object recognition, navigation, and activity detection. It will become increasingly important as the sources of visual data multiply through the growing use of consumer electronics (digital cameras, camcorders, etc.), desktop and surveillance cameras, and Internet databases. The objective of this course is to provide a concise treatment of some fundamental problems in computational vision, focusing on a core set of problems where efficient and robust algorithms can be applied. Students will be required to implement several algorithms using real datasets. In addition to providing practical approaches and algorithms, this course will provide the foundation required to pursue research in computational vision.
It is advisable to have some exposure to numerical computation and some basic
programming experience. All programming will be done in Matlab.
Three hours of lectures plus a one-hour tutorial per week.
All required material will be provided in the lectures. The following books
provide reference material and will be put on reserve in the library:
E. Trucco and A. Verri, Introductory Techniques for 3D Computer Vision, Prentice-Hall, 1998.
B.K.P. Horn, Robot Vision, MIT Press, 1986.
K.R. Castleman, Digital Image Processing, Prentice-Hall, 1996.
Problems. Levels of visual processing. Common approaches. Applications.
Optics. Cameras and imaging systems. Shading and reflectance.
Convolution. Low and high-pass filters. Fourier theory. Spatiotemporal filters and wavelets.
Iterative techniques for image alignment. Application to image compositing.
Edge detection. Corner detection.
Least squares fitting. Robust estimation. Mixture models.
Derivation of image flow field from 3D motion. Estimation of optical flow. Mixture models of optical flow.
Scene reconstruction from image flow field: factorization method, direct methods.
Baseline stereo. Depth reconstruction by triangulation. Maximum flow formulation of the stereo problem. Epipolar geometry.
View-based methods: principal component analysis, factor analysis. Recognition by linear combinations of views. Model-based approaches.
A selection of topics including: feature grouping for object recognition; recognition of objects based on function; perception of scene dynamics; event recognition.