Department of Computer Science
COSC 5P73: Computer Vision
Winter 2017 Instructor: Vlad Wojcik
Assignment 2: Due Date: 17 March 2017, 2 PM
In aerial photography we frequently take a series of overlapping photographs of land with specialized cameras, like the Linhof camera shown in the top corner. These photographs are later "stitched" together to create a large, composite image of a chosen geographical area. The ultimate goal is to create a topographical map of that area by suitable interpretation of the image. Both image stitching and interpretation are frequently done manually (or, rather, visually). In this assignment we will attempt to automate the stitching process.
You are given two sets of simulated aerial images. Each set consists of four images of a certain artwork, in no particular order, as a GIF archive. The colour image originals of the first set are clickable here, for your convenience and reference.
All these images were calibrated as follows:
- The original artwork consists of a silver line and glittering dinosaur stickers on a dark foamboard too big to fit in a single field of view of the camera.
- Four partially overlapping images were taken under the same conditions:
- A single macro lens of fixed focal length was used,
- Illumination consisted of a shadowless macro flash,
- The artwork was put on a table,
- The camera was mounted on a tripod pointing strictly down, in order to minimize distortion and to avoid the keystoning effect,
- The immobilized tripod with the camera was straddling the table with the artwork,
- For each photograph the artwork was slid along a T-square ruler attached to the table, in order to assure its parallel motion only.
Using your knowledge acquired during lectures, write a program that finds the largest common subregion in every pair of these images and stitches the images together so as to create one final image to be displayed on the screen or printed. Your challenge here is: you are to pretend to be blind. You must therefore neither interact with nor interactively modify the partial results of your program. Your program should simply read in the four images, figure out how to stitch them, and display or print the resulting image.
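To make the task concrete, here is a minimal sketch of the overlap search and stitch for the simplest case: grayscale images as NumPy arrays, translation along one axis only (consistent with the T-square setup), and noise-free data. The function names and the exhaustive-search strategy are mine, not prescribed by the assignment.

```python
import numpy as np

def dissimilarity(a, b):
    """Mean absolute brightness difference over an overlap region."""
    return np.abs(a.astype(int) - b.astype(int)).mean()

def best_horizontal_overlap(left, right, min_overlap=2):
    """Try every candidate overlap width; keep the one whose
    shared strips match best (lowest dissimilarity)."""
    best_ov, best_d = None, float("inf")
    limit = min(left.shape[1], right.shape[1])
    for ov in range(min_overlap, limit + 1):
        d = dissimilarity(left[:, left.shape[1] - ov:], right[:, :ov])
        if d < best_d:
            best_ov, best_d = ov, d
    return best_ov, best_d

def stitch_horizontal(left, right, overlap):
    """Concatenate the two images, averaging the shared strip."""
    w = left.shape[1]
    shared = (left[:, w - overlap:].astype(int)
              + right[:, :overlap].astype(int)) // 2
    return np.hstack([left[:, :w - overlap],
                      shared.astype(left.dtype),
                      right[:, overlap:]])

# Demo: split a synthetic scene into two overlapping halves,
# then recover the overlap width automatically.
rng = np.random.default_rng(0)
scene = rng.integers(0, 256, size=(4, 10), dtype=np.uint8)
left, right = scene[:, :7], scene[:, 4:]   # 3 columns of overlap
ov, d = best_horizontal_overlap(left, right)
```

With four images and two axes of motion, the same search would run over both horizontal and vertical offsets, and over every pair of images.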
Notes and Hints:
Given our mathematical and programming background (see especially pseudocode (5.3) on page 14 of lecture notes) this problem would be trivially easy if not for the pesky noise in the images. Observe that the pseudocode in question, although theoretically correct, is particularly vulnerable to the noise.
Consider two patterns A and B that are identical, except that the image of pattern B contains one extra bright noise pixel, distant from the pixels forming the pattern. The pseudocode, unaware that the offending pixel should be ignored, will take it into account and report a dissimilarity D(A, B) > 0 instead of the correct D(A, B) = 0.
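The notes' pseudocode (5.3) is not reproduced here, so as an illustrative stand-in assume a max-based (Hausdorff-style) dissimilarity over bright-pixel coordinates; the effect described above is easy to reproduce:

```python
def directed(P, Q):
    """Largest distance from any point of P to its nearest point of Q."""
    return max(min(((px - qx) ** 2 + (py - qy) ** 2) ** 0.5
                   for qx, qy in Q)
               for px, py in P)

def D(A, B):
    """Symmetric, max-based dissimilarity: worst-case point mismatch."""
    return max(directed(A, B), directed(B, A))

A = {(0, 0), (0, 1), (1, 0)}       # the clean pattern
B = A | {(50, 50)}                 # same pattern plus one noise pixel
clean = D(A, A)                    # 0.0: identical patterns
noisy = D(A, B)                    # large: the single outlier dominates
```

One distant pixel is enough to swing the score from a perfect 0 to roughly 70, even though the patterns are otherwise identical.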
Is there a way of reliably removing such noise? If we had an intelligent filter capable of totally removing noise from images, that would be equivalent to having upfront a miraculous solution to our problem (and to all visual pattern recognition problems). However, that filter would have to have full understanding of the patterns it sees. We do not have such a filter.
It is possible to remove some of the noise, perhaps even most of it, using a dumb filter, provided that the signal-to-noise ratio in the photos is sufficiently high.
The brightness of pixels in our GIF images is in the range of 0 (pitch black) to 255 (blazing white). Suppose we knew that the noise in our images generates pixels of brightness in the range between 0 and 10. If we rendered a new image in which all pixels in that range were black, and all other pixels were dimmed by 10 units, then the new image would be darker, but noise-free.
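This dumb filter is a one-liner on a NumPy array: subtracting the noise ceiling and clipping at zero blackens everything in the 0..10 range and dims the rest by the same amount. The function name and the example row are mine.

```python
import numpy as np

def dumb_filter(img, noise_ceiling=10):
    """Subtract the noise ceiling and clip at zero: pixels in the
    0..ceiling range become black, all brighter pixels are dimmed
    by the same amount."""
    dimmed = img.astype(int) - noise_ceiling
    return np.clip(dimmed, 0, 255).astype(np.uint8)

row = np.array([[0, 5, 10, 11, 200, 255]], dtype=np.uint8)
filtered = dumb_filter(row)   # [[0, 0, 0, 1, 190, 245]]
```

Note that the cast to `int` before subtracting matters: subtracting directly on `uint8` would wrap around instead of going negative.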
Such methods are used and work well for patterns with a high signal/noise ratio, but they are unable to remove all noise (some original noise pixels are too bright). People keep imagers cool to reduce thermal imaging noise (see the image of a telescope imager at right); some keep imagers in liquid nitrogen, or put them on space telescopes to keep them extra cool.
Whenever possible, multiple images of the same pattern are taken and then averaged on a pixel-by-pixel basis. (Most film scanners use this feature.) The central limit theorem, used in statistics, tells us that averaging in this way cancels out zero-mean noise, removing the noise bias from the images.
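A quick simulation shows the effect; the scene, the noise model (zero-mean Gaussian with standard deviation 20), and the exposure count of 64 are all my own illustrative choices. Averaging 64 exposures should shrink the noise roughly by a factor of sqrt(64) = 8.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.full((8, 8), 128.0)                       # the noiseless pattern
shots = [truth + rng.normal(0.0, 20.0, truth.shape)  # 64 noisy exposures
         for _ in range(64)]
avg = np.asarray(shots).mean(axis=0)                 # pixel-by-pixel average

one_shot_err = np.abs(shots[0] - truth).mean()       # roughly 16
averaged_err = np.abs(avg - truth).mean()            # roughly 2
```

The averaged image is close to the truth while any single exposure is not, which is exactly why film scanners and astronomers stack frames when they can.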
However, in this assignment we have only one copy of each image. What can we do then? Look again at the pseudocode (5.3) on page 14 of the Notes. Suppose we recorded the probability distribution of the variable dis (treating it as a random variable) and considered two patterns aligned not at the maximal distance, but at a distance greater than, say, 90% of the observed values of dis ... Would that be a good way to get rid of noise pixel outliers?
After all, if we align two pieces in a traditional puzzle, or insert a spare part into some mechanism (these are pattern matching exercises too!), we focus our attention on two, three, or so key points of the puzzle piece. Once these points fall into their places, all other points of the piece fall into their respective places too. That is the nature of puzzle arrangement. Let's try this approach, perhaps for the values 90%, 80% and 70%. Let us see how these images can be stitched, after all.
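The percentile idea can be sketched by replacing the maximum in the Hausdorff-style score with a quantile, so the worst few distances (the likely noise outliers) are simply ignored. This is one plausible reading of the hint, not the notes' exact pseudocode; the function name and test data are mine.

```python
import numpy as np

def robust_D(A, B, keep=0.90):
    """Score an alignment by the `keep`-quantile of point distances
    instead of the maximum, so the worst (1 - keep) fraction of
    distances -- likely noise outliers -- is ignored."""
    def directed(P, Q):
        dists = [min(((px - qx) ** 2 + (py - qy) ** 2) ** 0.5
                     for qx, qy in Q)
                 for px, py in P]
        return float(np.quantile(dists, keep))
    return max(directed(A, B), directed(B, A))

pattern = [(x, y) for x in range(5) for y in range(5)]  # 25 clean points
noisy = pattern + [(100, 100)]                          # one far outlier
```

At `keep=0.90` the outlier falls in the discarded tail and the two patterns score a perfect 0; at `keep=1.0` the function degenerates back to the max-based score and the outlier dominates again.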
You will be asked to demonstrate your programs to your prof. and submit all printouts in an envelope during demonstration.
Revised: Friday, 10-Feb-2017 11:11 PM
Copyright © 2017 Vlad WOJCIK