CS280A Project4 by Xintian Liu

Part A.2: Recover Homographies

Before warping images into alignment, we need to recover the parameters of the transformation between each pair of images. The transformation is a homography: p’=Hp, where H is a 3x3 matrix with 8 degrees of freedom.

$$ \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = H \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$

Where $ H $ is:

$$ H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} $$

Given a point $ (x, y) $ in the first image and its corresponding point $ (x', y') $ in the second image, the equation can be rearranged to:

$$ h_{11}x + h_{12}y + h_{13} - h_{31}x \cdot x' - h_{32}y \cdot x' = x' $$ $$ h_{21}x + h_{22}y + h_{23} - h_{31}x \cdot y' - h_{32}y \cdot y' = y' $$

To solve for $ H $, computeH constructs a matrix equation $ A \cdot H = b $ and uses least squares method to find the solution that minimizes the sum of the squared residuals, where:

$$ A = \begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -x \cdot x' & -y \cdot x' \\ 0 & 0 & 0 & x & y & 1 & -x \cdot y' & -y \cdot y' \end{bmatrix} $$

The vector $ b $ is:

$$ b = \begin{bmatrix} x' \\ y' \end{bmatrix} $$

Part A.3: Warp the Images

After computing $ H $ from corresponding points from two images, I warp the second image to the first image with forward warping and fill the unmapped pixels using nearest-neighbor interpolation via warpImage_rec(im, H, polygon = True) The details are:

Original Image Corner Detection and Transformation: The corners of the original image are identified and projected using the homography matrix $ H $. This defines the boundaries(corners) of the warped image. The max and min coordinates of the transformed corners are used to determine the size of the output image.
Forward Mapping and Interpolation: The original image coordinates are transformed using the homography matrix. A mask is created to identify valid (in-bounds) coordinates for mapping. The griddata function with the 'nearest' method is used to interpolate pixel values from the original image to the warped image grid, ensuring that unmapped pixels are filled accurately.
Polygon Masking: If a polygon mask is applied, the shifted corners of the warped image are used to define the polygon region. The mask ensures that only the valid region (defined by the warped polygon) is preserved in the final image, while other areas are set to zero.

Two examples of rectified images are shown below:

Description of the image — Figure : Rectified Image: Calendar

Part A.4: Blend the Images into a Mosaic

The goal of this section is to blend two images smoothly by warping the second image into the coordinate of the first image and using a distance transform mask for a gradual transition in the overlapping area of first and second images.

Distance Transform Mask create_distance_transform_mask: This functions computes the ditance of each pixel from the nearest edge, then normalizes the distance to the range of 0 and 1.
Blend Two Images BlendTwoImages_Distance: This functions mainly involves: (1) Warp the second image; (2)Create a binary mask to identify overlap area; (3) Distance transform for alpha blending; (4) Combine the first and second images.

Blended view1 & view2 — Figure : Blended apartment view1 and view2 (Unwarped: view1)

Blended view3 & view4 — Figure : Blended apartment view3 and view4 (Unwarped: view3)

Blended view1234 — Figure : Blended apartment view1-4 (Unwarped: view1)

Blended lab12 — Figure : Blended lab1 and lab2

Part B.1: Detecting Corner Features

Note: For implementation of part B, I realize the images with more features are better candidates for mosaic. So I've taken more pictures for image processing in this part.

Building corner detection — Figure: Buildings.

Harris Interest Point Detector

We start with directing using harris corner detection given in harris.py. In practice, I find out min_distance=5 and edge_discard=20 gives resonable number of harris corners in a short time period.

Examples of Harris corners detection are shown below.

Adaptive Non-Maximal Suppression(ANMS)

To minimize redundant feature corner points, I implement ANMS algorithon[1] as outlined in the function apply_ANMS. The array radii is initialized as infinite values to store the mininum suppression radius for each point.

The minimum suppresion radius $ r_i $ is given as:

$$ r_i = \min_j \, | \mathbf{x}_i - \mathbf{x}_j |, \quad \text{s.t. } f(\mathbf{x}_i) < c_{\text{robust}} f(\mathbf{x}_j), \quad \mathbf{x}_j \in \mathcal{I} $$

For each corner point, the function iteratively computes $ r_i $. A point i of considers another point j only if f[i] < c_robust * f[j]. This ensures the weak points are suppressed if they are near a strong point. The Eculidean distance between points i and j is calculated and stored as the suppresion radius if it is smaller than the current value.

Examples of detected Harris corners with ANMS are shown below. In these cases, c_robust = 0.9 and the number of selected points is 500.

[1]Brown, Matthew, Richard Szeliski, and Simon Winder. "Multi-image matching using multi-scale oriented patches." 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Vol. 1. IEEE, 2005.

Corner detection ANMS — Figure: Detected corners with ANMS.

Part B.2: Extracting a Feature Descriptor for Each Feature Point

The descriptor function generates descriptors for each selected corners from the previous parts by extracting the local patches (40x40) around it and downscale it to 8x8.

It is noted that Gaussian blurring is used here to reduce the noise. Then skimage.transform.resize with anti_aliasing=True is used. To make the descriptor invariant to intensity, the resized patch is normalized.

Part B.3: Matching Feature Descriptors between Two Images

The feature_matching function identifies corresponding features between two sets of the descriptors extracted from images by comparing their similarity using Euclidean distance.

To ensure reliable matches, the function applies Lowe's ratio test. If the distance to the closest match is significantly smaller (defined by threshold) than the distance to the second-closest match, the pair of descriptors is considered as a good match.

Output here is a list of matched index pairs representing reliably matched feature points.

Matched features — Figure : View of matched feature descriptors between two images.(Left: Building1, Right: Building2)

However, as shown above, some line pairs indicate incorrect matches that should not be paired. It can be fixed in next section when finding homography.

Part B.4: A Robust Method (RANSAC) to Compute a Homography

From lecture note, RANSAC for estimating homography follows the key steps in RANSAC loop (num_iteration = 1000): 1. Select four feature pairs (at random); 2. Compute homography $ H $ (exact); 3. Compute $ inliers $ where dist(pi', Hpi) < $ \epsilon $; 4. Keep largest set of inliers; 5. Re-compute least sqaures H estimate on all of the inliers.