cs180: proj4
[Auto]Stitching Photo Mosaics
Project 4A: Image Warping and Mosaicing
Part 1: Shoot the Pictures
I used my phone and aimed for 40%–70% overlap between shots, rotating the phone only about the camera's lens axis. I locked all my camera settings to give myself the best chance of a clean blend, and I avoided moving subjects (people, cars, etc.) in my photos.
Note: I took a lot of photos, processed all of them, and then picked the ones with the best results.
Part 2: Recover Homographies
I based my computeH function on the concepts and derivations from lecture. A homography matrix H transforms one set of input points into a corresponding second set. Below is the system of equations I solved (using least squares) to find H.
Note: To avoid an underdetermined system, I need at least 4 pairs of known keypoints (as discussed in lecture).

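One standard way to write that system (assuming the bottom-right entry of H is fixed to 1, leaving eight unknowns a through h): each correspondence (x, y) → (x′, y′) contributes two rows, so four or more pairs give the solvable system below.

$$
\begin{bmatrix}
x & y & 1 & 0 & 0 & 0 & -x\,x' & -y\,x' \\
0 & 0 & 0 & x & y & 1 & -x\,y' & -y\,y'
\end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix}
=
\begin{bmatrix} x' \\ y' \end{bmatrix}
$$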
We can now construct the homography matrix H from those solved parameters, using the definition below!

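Under the same scale assumption (bottom-right entry fixed to 1), the solved parameters assemble into H, and a point is warped by multiplying in homogeneous coordinates and dividing by w:

$$
H = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix},
\qquad
\begin{bmatrix} w\,x' \\ w\,y' \\ w \end{bmatrix}
= H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}.
$$

A minimal sketch of how computeH could stack those equations and solve them with least squares (the (N, 2) point-array signature is my assumption, not necessarily the project's exact interface):

```python
import numpy as np

def computeH(pts_src, pts_dst):
    """Fit a homography H (with H[2, 2] = 1) mapping pts_src -> pts_dst.
    Both inputs are assumed to be (N, 2) arrays of (x, y) points, N >= 4."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts_src, pts_dst):
        # Two rows of the linear system per correspondence.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b += [xp, yp]
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```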
Part 3: Warp the Images
Having now estimated the homography matrix H, I took the four corners of img1 and applied H to re-project them into img2's frame. Similar to proj3, I used scipy.interpolate.RegularGridInterpolator (it's quicker than griddata) to compute the interpolated values of img1's pixels, which then set the corresponding pixels of the resulting image. I implemented an inverse warp for a smoother re-projection.
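A minimal sketch of that inverse warp, assuming H maps img1 coordinates into the output canvas's coordinates (the warp_image name and signature are illustrative, not the project's exact code):

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def warp_image(img, H, out_shape):
    """Inverse-warp img with homography H onto an out_shape = (rows, cols) canvas:
    every output pixel is pulled back through H^{-1} and sampled from img."""
    rows, cols = out_shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # homogeneous (x, y, 1)
    src = np.linalg.inv(H) @ coords
    src_x, src_y = src[0] / src[2], src[1] / src[2]                # divide out w
    sample_pts = np.stack([src_y, src_x], axis=-1)                 # interpolator wants (row, col)
    out = np.zeros((rows * cols, img.shape[2]))
    for ch in range(img.shape[2]):
        interp = RegularGridInterpolator(
            (np.arange(img.shape[0]), np.arange(img.shape[1])), img[..., ch],
            bounds_error=False, fill_value=0)                      # 0 outside the source image
        out[:, ch] = interp(sample_pts)
    return out.reshape(rows, cols, -1)
```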
Part 4: Image Rectification
With my functions from parts 2 and 3, I performed “rectification” on two images. To find my correspondence points, I used a previous student’s labeling tool (which I had also used in proj3: Face Morphing). I also aggressively cropped my images to reduce runtime (most of my raw images were about 2500x4000 px).
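As an illustration of the rectification step, here is a hypothetical usage sketch built on the computeH and warp_image sketches above: the src corners are made-up click locations on the planar region, and dst is the axis-aligned rectangle they should map to.

```python
import numpy as np
from skimage import io

img = io.imread("soda606.jpg") / 255.0  # planar-surface photo (Example 1)

# Hypothetical correspondence points for illustration, in (x, y) pixel order:
# the four clicked corners of the planar region and the target rectangle.
src = np.array([[412, 310], [955, 285], [980, 840], [430, 870]], dtype=float)
dst = np.array([[0, 0], [600, 0], [600, 800], [0, 800]], dtype=float)

H = computeH(src, dst)                      # least-squares homography (Part 2)
rectified = warp_image(img, H, (800, 600))  # inverse warp (Part 3)
```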
- Example 1:
soda606.jpg
- Example 2:
painting-cropped.jpg
Part 5: Blend into a Mosaic
I kept one image unwarped so that I could stitch the warped version of the other image onto it. I padded the unwarped image and blended the overlapping regions of the two images with the Laplacian pyramid blending technique. My images were all very large, so I resized them for easier use.
I had originally tried a simple linear-gradient mask to smooth the overlapping region, but this still left a visible seam in the resulting photo.
Note: I trimmed the excess black to display the cleanest result.
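A minimal sketch of the idea behind that blend, assuming both images have already been warped and padded onto the same canvas and mask is 1 where the first image should dominate (the project used a full Laplacian pyramid; a single low/high split and the sigma value are simplifications for illustration):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_level_blend(im1, im2, mask, sigma=20):
    """Two-band blend sketch: soft mask for low frequencies, sharp mask for high.
    im1/im2 are float images on the same canvas; mask is 1.0 where im1 wins."""
    mask = mask[..., None] if mask.ndim == 2 else mask
    blurred_mask = gaussian_filter(mask, sigma=(sigma, sigma, 0))
    low1, low2 = [gaussian_filter(im, sigma=(sigma, sigma, 0)) for im in (im1, im2)]
    high1, high2 = im1 - low1, im2 - low2
    # Low frequencies get the soft (blurred) mask, high frequencies the sharp one.
    low = blurred_mask * low1 + (1 - blurred_mask) * low2
    high = mask * high1 + (1 - mask) * high2
    return np.clip(low + high, 0, 1)
```

Blending each frequency band with a differently softened mask is the basic reason pyramid-style blending hides the seam that a single linear-gradient mask leaves behind.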
- Example 1:
anchor_mosaic.jpg

- Example 2:
vlsb_mosaic.jpg

- Example 3:
view_mosaic.jpg

Reflection
I see how one of the goals of proj4a is to understand just how helpful 4b’s auto-stitching will be. Manually selecting correspondences between both images was frustrating at times as it’s prone to slight errors. I had a lot of fun “rectifying” different images — I’ve seen it in various apps and never knew how it was done!
Project 4B: Feature Matching for Autostitching
A huge component of this project involved reading Multi-Image Matching using Multi-Scale Oriented Patches by M. Brown, R. Szeliski and S. Winder!
Part 1: Harris Interest Point Detector
With the provided harris.py, I used get_harris_corners to extract all prominent corners in each image. I discarded any extracted points near the image's border.
Note: I resized my images for easier use (I scaled each to have a width of 1000px).
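A small sketch of that border filtering, assuming get_harris_corners returns corner coordinates as a 2 x N array of (row, col) positions (check the provided harris.py for its exact interface):

```python
import numpy as np
# from harris import get_harris_corners  # provided starter code

def discard_near_border(coords, im_shape, margin=20):
    """Drop interest points that fall within `margin` pixels of the image border.
    coords is assumed to be a 2 x N array of (row, col) positions."""
    r, c = coords
    keep = ((r >= margin) & (r < im_shape[0] - margin) &
            (c >= margin) & (c < im_shape[1] - margin))
    return coords[:, keep]
```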
Part 2: Adaptive Non-Maximal Suppression (ANMS)
Now having the Harris points for each image, I filtered some out before applying ANMS to keep the point count manageable (some images had over 14,000 points!). I then implemented the ANMS algorithm, which chooses the top 500 interest points while keeping them fairly evenly spread out across the image. To decide which Harris interest points to keep, I used the suppression-radius formula below:

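The suppression radius from the Brown, Szeliski, and Winder paper: each point's radius is its distance to the nearest point that is sufficiently stronger, and the 500 points with the largest radii are kept (the paper uses c_robust = 0.9).

$$
r_i = \min_{j} \, \lVert \mathbf{x}_i - \mathbf{x}_j \rVert
\quad \text{s.t.} \quad f(\mathbf{x}_i) < c_{\text{robust}} \, f(\mathbf{x}_j),
$$

where f(x) is the Harris corner strength at x.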
Here are my ANMS results (500 points each) for anchor_a.jpg and anchor_b.jpg:
Part 3: Feature Descriptor Extraction (MOPS)
Now with the ANMS points, I extracted a descriptor for each one using Multi-Scale Oriented Patches (MOPS). To describe each small region, I sampled a 40x40 patch centered on the ANMS point. From there, I downsampled (by a factor of 5) and blurred the patch down to an 8x8 patch, using skimage.transform.rescale with anti_aliasing=True to perform the Gaussian blur.
I then normalized each 8x8 patch with bias-gain normalization on each of the 3 color channels, forming a vector that describes each feature.
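A sketch of this descriptor extraction for one point, assuming axis-aligned patches and a color image (the function name and exact rescale arguments are my assumptions):

```python
import numpy as np
from skimage.transform import rescale

def extract_descriptor(im, r, c):
    """MOPS-style descriptor sketch: 40x40 patch around (row r, col c),
    anti-aliased 5x downsample to 8x8, then bias/gain normalization per channel."""
    patch = im[r - 20:r + 20, c - 20:c + 20]                            # 40 x 40 x 3
    small = rescale(patch, 1 / 5, anti_aliasing=True, channel_axis=-1)  # 8 x 8 x 3
    # Bias/gain normalization: zero mean and unit standard deviation per color channel.
    mean = small.mean(axis=(0, 1), keepdims=True)
    std = small.std(axis=(0, 1), keepdims=True) + 1e-8
    return ((small - mean) / std).ravel()
```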
Here are some example 8x8 patches extracted from anchor_a.jpg (first row) and anchor_b.jpg (second row):
Here is a side-by-side of a 40x40 patch with its 8x8 feature descriptor for anchor_a.jpg:
Part 4: Feature Matching
Now having both the feature points and their descriptors, I needed to find the correspondence points. For every patch in img1, I found its most similar patch in img2 using the L2 distance. To filter out points with no match, I used Lowe's trick: I computed one_nearest_neighbor_dist / two_nearest_neighbor_dist and only kept matches whose Lowe's ratio is below a threshold value (I used 0.8). The result is an array whose first column contains the feature index in img1 and whose second column contains the matching feature index in img2. Here are the matched corners for the anchor set:

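A sketch of that matching step with Lowe's ratio test (the array shapes and the function name are my assumptions):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.8):
    """Lowe-ratio matching sketch: desc1 (N1, D) and desc2 (N2, D) descriptor
    arrays; returns an (M, 2) array of index pairs (index in desc1, index in desc2)."""
    # Pairwise squared L2 distances between every descriptor pair.
    d2 = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    order = np.argsort(d2, axis=1)
    nn1, nn2 = order[:, 0], order[:, 1]
    dist1 = d2[np.arange(len(desc1)), nn1]
    dist2 = d2[np.arange(len(desc1)), nn2]
    # Keep a match only if the best neighbor is clearly better than the second best.
    keep = np.sqrt(dist1) / np.sqrt(dist2) < ratio
    return np.stack([np.nonzero(keep)[0], nn1[keep]], axis=1)
```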
Part 5: Random Sample Consensus (RANSAC)
As seen above, there are some outliers. My computeH function from Part A uses least squares, which is highly sensitive to outliers, so I implemented Random Sample Consensus (RANSAC) to remove the incorrect correspondences. For 4-point RANSAC, I chose 4 random, unique pairs of points to compute a candidate homography matrix. After 5000 iterations, I returned the homography that produced the largest number of inliers, and that homography is what I used in the next part. Here are the finalized correspondence points for the anchor set:

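A sketch of that loop, reusing the computeH sketch from Part A (the eps pixel threshold and the exact return values are my assumptions):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=5000, eps=2.0):
    """4-point RANSAC sketch: pts1/pts2 are (N, 2) matched (x, y) points.
    Returns the homography with the most inliers plus the inlier mask."""
    rng = np.random.default_rng(0)
    pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))]).T   # 3 x N homogeneous points
    best_H, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), size=4, replace=False)  # 4 unique pairs
        H = computeH(pts1[idx], pts2[idx])
        proj = H @ pts1_h
        proj = (proj[:2] / proj[2]).T                       # divide out w -> (N, 2)
        inliers = np.linalg.norm(proj - pts2, axis=1) < eps # within eps pixels
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```

A common refinement (not claimed here as this write-up's approach) is to refit computeH on all of the final inliers before using the homography downstream.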
Auto-Stitched Mosaics
Now that I had the correspondence points, I used Part A's warping and blending processes to stitch the pairs of images together. Here are the manual (Part A) and automatic (Part B) results side by side.
Note: To the human eye, the mosaics formed with manual versus automatic correspondence points are essentially the same!
Reflection & Bloopers
This was by far my favorite project we’ve done! The research paper was definitely dense, but it leads you to appreciate the derivations and algorithms developed to do what an iPhone camera does so “easily” when creating a panorama. One of the coolest things I learned from this project was Lowe’s trick, which helps our feature matching algorithm find the best point pair matches.
As always, you run into hiccups along the way. Here are some bloopers I saved from Project 4A and 4B:

top left image: I accidentally plotted the points with size s=200
top right image: hiccups while trying to feature match (I was also using a threshold of 0.9)
bottom middle image: I needed to double-check my dimensions for rectification