Tuesday, March 12, 2013

Panoramic Image stitching using SIFT and Projective transforms

Aim

The aim of this project is to build a panoramic image stitching algorithm that can successfully stitch two images together without any occlusions or artifacts.

This project requires the VL_SIFT library from VLFeat, which is free to use and graciously provided by Andrea Vedaldi & team.

What you need:

1. A functional C/C++ compiler installed (installing Visual Studio 2008 should do the trick)
2. VL_SIFT library
3. A functional RANSAC algorithm (not to worry, I'll help you make one if you don't have one!)
4. A basic understanding of the SIFT algorithm
5. A basic understanding of the homogeneous coordinate system.


Advantages:


1. Fast, really fast - mainly due to the excellent SIFT implementation by Andrea & team. Computation time for ~600 x 600 images is 3.2056 seconds (which includes displaying the figure window, etc. - it could easily be made real-time)
2. Rotation and scale invariant, which makes SIFT an excellent descriptor for estimating transform parameters (it's a pretty good tool for figuring out how the images are related to each other).

Basic Flow Diagram

[Flow diagram image]
How to install the VL library?

a. Copy the extracted folder contents into your current working directory (where your main file is).
b. Add this line to your main M-file:
run VLFEATROOT/toolbox/vl_setup

Okay! So I'm ready to start, what next?

1. Take two images and import them into MATLAB.

2. Compute their SIFT vectors (command: vl_sift)
The vl_sift function takes the edge and peak threshold limits as parameters (to read about what they are, check out David Lowe's paper on SIFT - it's a great bedtime read for geeks). To see how to pass the parameters, place your cursor on vl_sift in your editor and press Ctrl + D.
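Here's a minimal sketch of steps 1 and 2 (the file names and the exact threshold values are my own placeholder choices):

run VLFEATROOT/toolbox/vl_setup            % set up VLFeat, as described above

im1 = imread('left.jpg');                  % placeholder file names
im2 = imread('right.jpg');

% vl_sift expects single-precision grayscale images
I1 = im2single(rgb2gray(im1));
I2 = im2single(rgb2gray(im2));

% f is 4 x N (x; y; scale; orientation), d holds the 128-D descriptors
[f1, d1] = vl_sift(I1, 'PeakThresh', 0, 'EdgeThresh', 10);
[f2, d2] = vl_sift(I2, 'PeakThresh', 0, 'EdgeThresh', 10);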
3. Match the SIFT vectors (command: vl_ubcmatch)
SIFT vectors are matched using the Euclidean distance between them: if Euclid_dist2 / Euclid_dist1 >= THRESH (where Euclid_dist1 is the distance to the closest descriptor in the other image and Euclid_dist2 the distance to the second closest), the pair forming Euclid_dist1 is considered a positive match. Many other variations of this logic are available, but this will do just fine for our project. The Euclidean distance is also known as the 'score' of the match.
For this project, a THRESH between 1.2 and 1.5 is acceptable, but you can experiment with various values > 1 to see how the results vary. THRESH cannot go below 1, for the obvious reason that Euclid_dist2 >= Euclid_dist1 by definition.

We must normalize the scores (divide each by the sum of all the scores) to obtain a probability distribution.

Matches should look something like this:
31 32 20 55 22 21 23 24
16 18 41 44 3 42 46 5
which means that the 31st SIFT feature from the first image has been mapped to the 16th SIFT feature from the second image, and so on.
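A minimal matching sketch (the THRESH value is my pick from the range above):

THRESH = 1.5;
[matches, scores] = vl_ubcmatch(d1, d2, THRESH);
% matches(1,k) indexes a feature in image 1 and matches(2,k) its partner
% in image 2; scores(k) is the (squared) Euclidean distance of the pair
probs = scores / sum(scores);              % normalized scores, as above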

4. Remove / refine multiple matches
Many SIFT vectors generate multiple matches. We must either eliminate them or choose the match BEST suited for our purpose: the one with the least score. The problem with choosing such a match arises when the least-score match is an incorrect one (this is quite possible because, remember, we are ESTIMATING how the second image has been transformed - due to this transformation, certain incorrect matches that merely look similar can occur).
I decided to eliminate them, as sketched below.
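Since I went with elimination, here is a minimal sketch of it (variable names are mine): any image-2 feature claimed by more than one match is dropped entirely.

idx2 = matches(2, :);
[u, ~, ic] = unique(idx2);
dup = u(accumarray(ic(:), 1) > 1);         % image-2 features matched more than once
keep = ~ismember(idx2, dup);
matches = matches(:, keep);
scores  = scores(keep);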

5. Picking the guides
All matches whose score lies in the lowest 10% (for smaller images - approx. < 750 x 750) or lowest 1% (for larger images - approx. > 1200 x 1200) are considered the 'guides'. These guides tell us which part of image 1 maps to which part of image 2.
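A sketch of the guide selection (the fraction follows the rule of thumb above; requiring at least 4 pairs is my addition, since a projective transform needs 4 point correspondences):

frac = 0.10;                                    % 10% for smaller images, 1% for larger
[~, order] = sort(scores, 'ascend');            % lowest scores = best matches
nGuides = max(4, round(frac * numel(scores)));
guides  = matches(:, order(1:nGuides));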

[Example images: a very popular image used by David Lowe, and the RIT sentinel statue]
6. RANSAC & Transformation
Random Sample Consensus - sounds big, but don't worry, it's not that big!

The flow diagram gives a very simplified version of the RANSAC algorithm. It's an excellent and effective technique for robustly fitting a model when the data contains outliers. A sketch of the loop follows the steps below.

a. Randomly select M points.

b. Generate a model fit (using command: maketform).

c. Check the model by applying the transform to the remaining points. If a transformed point lands close to the location given by our SIFT match (within LIM), it is counted as an inlier (a positive) and inlier_counter is increased; if its distance is > LIM, the outlier counter is increased. If inlier_counter > 85% of the original N points, the model is considered a good fit and the transformation is used.

d. If not, update the value of LIM based on the inlier/outlier ratio found, update the best model, and redo the procedure above.

e. Using the best fit model, generate the transformation matrix (command: maketform, cp2tform)


f. Transform the second image using the generated transformation matrix (command: imtransform)

g. Ready for the stitch
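Here is a minimal sketch of that loop, assuming f1, f2, and guides from the earlier steps (the iteration count, the LIM value, and using a fixed LIM instead of the adaptive update from step d are my simplifications):

pts1 = f1(1:2, guides(1,:))';              % [x y] locations in image 1
pts2 = f2(1:2, guides(2,:))';              % corresponding locations in image 2
N = size(pts1, 1);
LIM = 3;                                   % inlier distance threshold, in pixels
bestInliers = 0;
for iter = 1:1000
    perm = randperm(N); sel = perm(1:4);   % 4 pairs define a projective transform
    try
        T = cp2tform(pts2(sel,:), pts1(sel,:), 'projective');
    catch
        continue;                          % degenerate sample, draw again
    end
    [xT, yT] = tformfwd(T, pts2(:,1), pts2(:,2));
    d = hypot(xT - pts1(:,1), yT - pts1(:,2));
    nIn = sum(d < LIM);
    if nIn > bestInliers
        bestInliers = nIn;
        bestT = T;
        if nIn > 0.85 * N, break; end      % good fit found, stop early
    end
end

% Warp image 2 into image 1's coordinate frame (steps e/f)
[im2T, xdata, ydata] = imtransform(im2, bestT);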





7. The Stitch
Probably the hardest part of this project (it gets your brain cells burning nice and bright). The concept of a mask should be properly understood - in MATLAB terms, a mask marks a region of an image/data set that is to be processed without affecting the regions outside the mask perimeter. The transformed image 2 is converted into a mask and 'overlapped' with image 1 at its appropriate location. How do we find this location? Where should the overlap begin?

How to find the location?
The function imtransform gives us back the points where the TRANSFORMED IMAGE 2 will lie relative to image 1: the xdata and ydata vectors contain the top-left and bottom-right corners of the transformed image 2. So this means we have to overlap starting from these points. Overlapping can begin once the final image is ready to accept the two images onto itself, so make a zero matrix of the appropriate size (you have the starting locations and the ending locations of the rectangle).
So, steps:

1. Make a mask 1 for image 1 (ones matrix the same size as image 1)
2. Make a mask 2 for image 2 (ones matrix the same size as image 2)
3. Transform image 2 and mask 2, get the starting locations
4. Fit mask 1 so that it can accommodate the starting locations (fitting here means zero padding)
5. Find the net size required by adding the two widths of mask 1 and mask 2, and checking the maximum height.
6. Place the masks
7. Get the locations belonging to mask 1 and mask 2. It should look something like this:

The significance of these masks will become apparent soon enough; just keep in mind what they should look like.



8. Overlap these two masks with the transformed images 1 and 2 and add the images, as sketched below.
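Here is a minimal sketch of the stitch, assuming im1, im2, and bestT from the previous steps. It is a simplified variant of the mask recipe above: instead of explicitly padding mask 1, it pastes image 1 onto the zero canvas first and then overwrites with the warped image 2 wherever the warped mask is set.

mask2 = ones(size(im2,1), size(im2,2));
[im2T, xd, yd] = imtransform(im2, bestT);  % warped image 2
mask2T = imtransform(mask2, bestT);        % warped mask 2

% Offsets that shift everything so the canvas's top-left pixel is (1,1)
offX = round(1 - min(1, xd(1)));
offY = round(1 - min(1, yd(1)));

% Row/column ranges of the warped image 2 on the canvas
r2 = offY + round(yd(1)) - 1 + (1:size(im2T,1));
c2 = offX + round(xd(1)) - 1 + (1:size(im2T,2));

% Zero canvas large enough to hold both images
H = max(offY + size(im1,1), r2(end));
W = max(offX + size(im1,2), c2(end));
pano = zeros(H, W, size(im1,3));

% Paste image 1
pano(offY + (1:size(im1,1)), offX + (1:size(im1,2)), :) = im2double(im1);

% Paste the warped image 2 where its mask is valid
sub = pano(r2, c2, :);
w   = im2double(im2T);
m   = repmat(mask2T > 0.5, [1 1 size(w,3)]);
sub(m) = w(m);
pano(r2, c2, :) = sub;

imshow(pano)

Note that rounding the coordinates here is exactly what causes the visible seams mentioned below.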

Final Outputs

Well, this is what it looks like (pretty good, eh?)

P.S.: SIFT is awesome

P.P.S.: We still observe lines that mark out the two different images used. This is because I used the round function on the image coordinates during stitching; if you have the time, use bilinear or bicubic interpolation to reduce this effect.
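One knob to try: imtransform itself accepts an interpolation method (the seams here come mainly from the rounding in the stitch step, though). Using bestT from the RANSAC sketch:

[im2T, xd, yd] = imtransform(im2, bestT, 'bicubic');   % default is 'bilinear'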

For further additions and polish, look into the paper by David Lowe; I have provided the link at the start of this page.

Disadvantages:
1. This panoramic stitching algorithm (like pretty much all stitching algorithms) works only for images taken from a 'distance', meaning the content of the picture looks as if it has been flattened out onto a chart.

This algorithm would fail if, let's say, a person were standing very close to the camera and the different pictures placed him differently - in some he would appear to be standing on the left, in others on the right.

2. Even though RANSAC is VERY effective, the algorithm can take occasional potshots when N_outliers > N_inliers. This can be rectified using appropriate guides as described in section 5. That makes our algorithm dependent on a thresholding factor, which is bad - BUT this thresholding factor can be generalized based on image size. I'm gonna work on that soon!

If you liked my project, do let me know!
