Augmented Reality & Image Mosaics

An implementation of augmented reality video overlay and image panorama stitching, built on SIFT feature detection, RANSAC-based planar homography estimation, and inverse warping with bilinear interpolation.

📋 Overview

This project implements two major computer vision applications:

Part 1: Augmented Reality with Planar Homographies

Real-time AR video overlay that replaces a book cover in a video with custom AR content using:

Planar Tracking: Detect and track a book cover across video frames
Homography-based Warping: cv2.warpPerspective for AR content projection
Mask-based Compositing: Polygon masking for seamless overlay
Video Processing: Frame-by-frame processing with audio preservation

Part 2: Image Mosaics & Panorama Stitching

Create wide panoramic images from overlapping photographs using:

Feature Detection & Matching: SIFT keypoint detection with Lowe's ratio test
Homography Estimation: Direct Linear Transform (DLT) using SVD
Robust Estimation: RANSAC for outlier rejection
Image Warping: Inverse warping with bilinear interpolation
Multi-Image Stitching: Sequential stitching for 3+ images

🎯 Features

Part 1: Augmented Reality Application

✅ Real-time book cover detection and tracking
✅ SIFT-based feature matching between book cover and video frames
✅ Frame-by-frame homography computation with RANSAC
✅ Perspective warping using cv2.warpPerspective
✅ Polygon-based masking for seamless AR overlay
✅ Aspect ratio-aware cropping and resizing
✅ Video generation with original audio preservation
✅ Support for different AR source videos

Part 2: Image Mosaics & Panoramas

✅ SIFT-based feature detection and matching
✅ Custom DLT homography estimation (no OpenCV homography functions)
✅ RANSAC implementation for robust homography computation
✅ Bilinear interpolation for sub-pixel accuracy
✅ Backward warping to avoid holes in output
✅ Homography verification with visual point mapping
✅ Support for 2-image and 3-image panoramas

Implementation Highlights

No built-in homography functions (Part 2): All homography computation done from scratch using SVD
Efficient warping: Inverse warping ensures every output pixel has a value
Quality interpolation: Bilinear interpolation for smooth results
Robust matching: Lowe's ratio test (0.75) + RANSAC (5-pixel threshold)
Complete AR pipeline: Full video processing, from frame loading to final encoding with the original audio

🚀 Getting Started

Prerequisites

pip install opencv-python numpy matplotlib moviepy tqdm

Usage

Part 1: Augmented Reality

jupyter notebook augmented_reality.ipynb

Run all cells to:

Load book cover image and video frames
Compute homographies for each frame
Overlay AR content onto the book
Generate final video with audio

Part 2: Image Mosaics

jupyter notebook img_mosaics.ipynb

Alternatively, open the notebooks in VS Code with the Jupyter extension and run them interactively.

📊 Results

Part 1: Augmented Reality

Successfully created AR video with:

Input: Book cover image (cv_cover.jpg) + tracking video (book.mov)
AR Source: Custom video content (ar_source.mov)
Output: Seamless AR overlay video with synchronized audio
Processing: 300+ frames with real-time book tracking

Part 2: Image Mosaics

The implementation successfully stitches:

2-image panoramas: pano_image1.jpg + pano_image2.jpg
Test datasets: Multiple test image pairs (test2, test3, test5)
3-image panoramas: Shanghai skyline series, test6 series

🔧 Technical Details

Part 1: AR Pipeline Architecture

Book Cover Image + Video Frames
    ↓
SIFT Feature Detection & Matching
    ↓
RANSAC Homography Estimation (per frame)
    ↓
Book Corner Detection & Mapping
    ↓
AR Frame Cropping & Aspect Ratio Adjustment
    ↓
Perspective Warping (cv2.warpPerspective)
    ↓
Polygon Mask Creation
    ↓
AR Overlay Compositing
    ↓
Video Encoding + Audio Synchronization
    ↓
Final AR Video Output

Part 2: Panorama Pipeline Architecture

Input Images
    ↓
SIFT Feature Detection
    ↓
Feature Matching (BFMatcher + Lowe's Ratio Test)
    ↓
RANSAC Homography Estimation
    ↓
Canvas Creation & Reference Image Placement
    ↓
Inverse Warping with Bilinear Interpolation
    ↓
Final Stitched Panorama
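
The canvas creation step can be illustrated with a minimal sketch. It assumes the canvas is sized by pushing the corners of the image to be warped through H and taking the bounding box together with the reference image; the README does not say how the notebook sizes its canvas, so `canvas_bounds` and its return values are hypothetical.

```python
import numpy as np

def canvas_bounds(H, warp_shape, ref_shape):
    """Return ((height, width), T) for a canvas holding both images.

    H maps the image being warped into the reference image's frame.
    T is a translation that shifts everything into non-negative
    canvas coordinates (an assumption of this sketch).
    """
    h, w = warp_shape[:2]
    corners = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype=float)

    # Project the four corners through H (homogeneous multiply, then divide).
    ph = np.hstack([corners, np.ones((4, 1))]) @ H.T
    warped = ph[:, :2] / ph[:, 2:3]

    # Bounding box over the warped corners and the reference image extent.
    rh, rw = ref_shape[:2]
    pts = np.vstack([warped, [[0, 0], [rw - 1, rh - 1]]])
    x_min, y_min = np.floor(pts.min(axis=0)).astype(int)
    x_max, y_max = np.ceil(pts.max(axis=0)).astype(int)

    T = np.array([[1, 0, -x_min], [0, 1, -y_min], [0, 0, 1]], dtype=float)
    return (int(y_max - y_min + 1), int(x_max - x_min + 1)), T
```

With this convention the reference image is pasted at the offset given by T, and the other image is warped with `T @ H`.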

Key Algorithms

AR-Specific: Perspective Warping & Masking

1. Compute homography H mapping book to video frame
2. Warp AR content using cv2.warpPerspective(ar_frame, H, frame_size)
3. Create polygon mask at mapped book corner positions
4. Composite: result = frame * (1-mask) + warped_ar * mask
5. Only pixels inside polygon show AR content

AR-Specific: Aspect Ratio Preservation

- Calculate aspect ratios of book and AR video
- Crop AR frames to match book aspect ratio
- Center-crop to avoid distortion
- Resize to exact book dimensions for warping

1. SIFT Feature Matching

- Detects keypoints in grayscale images
- Computes 128-dimensional descriptors
- BFMatcher with L2 norm
- Lowe's ratio test: distance(m1) < 0.75 * distance(m2)
- Keeps top 50 matches

2. DLT Homography Estimation

- Constructs 2N×9 matrix A from N correspondences
- Each correspondence contributes 2 equations
- Solves Ah = 0 (subject to ||h|| = 1) using SVD
- Solution: right singular vector corresponding to the smallest singular value
- Normalizes: divides H by H[2,2] so that H[2,2] = 1
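
As a sketch of these steps in NumPy (not the notebook's `DLT_HomographyEstimation()`; the name `dlt_homography` is hypothetical, and no coordinate normalization is applied because the README does not mention any):

```python
import numpy as np

def dlt_homography(src, dst):
    """Estimate H such that dst ~ H @ src from N >= 4 point pairs of shape (N, 2)."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)

    # Build the 2N x 9 matrix A: two rows per correspondence.
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, x * xp, y * xp, xp])
        rows.append([0, 0, 0, -x, -y, -1, x * yp, y * yp, yp])
    A = np.array(rows)

    # The last row of Vt is the right singular vector with the
    # smallest singular value, i.e. the least-squares solution of Ah = 0.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)

    # Fix the scale so that H[2, 2] = 1.
    return H / H[2, 2]
```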

3. RANSAC

- Iterations: 500
- Sample size: 4 points (minimum for homography)
- Inlier threshold: 5 pixels
- Refinement: Recompute H using all inliers
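
A compact sketch of this loop with the parameters listed above. The helper names (`dlt`, `project`, `ransac_homography`) and the fixed random seed are assumptions made for illustration; the notebooks' `RANSAC()` may score or refine differently.

```python
import numpy as np

def dlt(src, dst):
    """DLT homography from point pairs (same construction as in section 2)."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, x * xp, y * xp, xp])
        rows.append([0, 0, 0, -x, -y, -1, x * yp, y * yp, yp])
    H = np.linalg.svd(np.array(rows, dtype=np.float64))[2][-1].reshape(3, 3)
    with np.errstate(divide="ignore", invalid="ignore"):
        return H / H[2, 2]

def project(H, pts):
    """Apply H to (N, 2) points, including the perspective divide."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return ph[:, :2] / ph[:, 2:3]

def ransac_homography(src, dst, n_iters=500, thresh=5.0, seed=0):
    """Return (H, inlier_mask) for point pairs src -> dst of shape (N, 2)."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Minimal sample: 4 correspondences.
        idx = rng.choice(len(src), size=4, replace=False)
        H = dlt(src[idx], dst[idx])
        if not np.all(np.isfinite(H)):
            continue  # degenerate sample
        # Inliers: reprojection error below the pixel threshold.
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("RANSAC found fewer than 4 inliers")
    # Refinement: recompute H from all inliers of the best model.
    return dlt(src[best], dst[best]), best
```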

4. Inverse Warping (Panorama only)

- For each output pixel (x,y):
  1. Apply H_inverse to get source coordinates (x',y')
  2. Check if (x',y') is within source image bounds
  3. Use bilinear interpolation to get pixel value
  4. Assign to output canvas
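
The four steps can be written in vectorized form as the sketch below. It assumes H maps source coordinates to canvas coordinates and leaves unmapped canvas pixels at 0; `inverse_warp` here is an illustration and not necessarily how the notebook's function of the same name is written.

```python
import numpy as np

def inverse_warp(src, H, out_shape):
    """Backward-warp src onto a canvas of out_shape = (height, width).

    Returns a float64 array; cast back to the image dtype if needed.
    """
    out_h, out_w = out_shape
    h, w = src.shape[:2]
    img = src.astype(np.float64).reshape(h, w, -1)  # force (h, w, channels)

    # Step 1: send every canvas pixel (x, y) through H^-1.
    xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)], axis=1)
    mapped = grid @ np.linalg.inv(H).T
    with np.errstate(divide="ignore", invalid="ignore"):
        sx = mapped[:, 0] / mapped[:, 2]
        sy = mapped[:, 1] / mapped[:, 2]

    # Step 2: keep only coordinates that land inside the source image.
    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    sx, sy = sx[valid], sy[valid]

    # Step 3: bilinear interpolation between the 4 neighbouring pixels.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    ax = (sx - x0)[:, None]
    ay = (sy - y0)[:, None]
    top = (1 - ax) * img[y0, x0] + ax * img[y0, x1]
    bottom = (1 - ax) * img[y1, x0] + ax * img[y1, x1]

    # Step 4: write the interpolated values into the output canvas.
    out = np.zeros((out_h * out_w, img.shape[2]))
    out[valid] = (1 - ay) * top + ay * bottom
    return out.reshape((out_h, out_w) + src.shape[2:])
```

Because the loop runs over output pixels, every canvas pixel that maps inside the source receives a value, which is why backward warping leaves no holes.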

🧮 Mathematical Background

Homography Matrix

A 3×3 matrix representing a projective transformation:

H = [h11  h12  h13]
    [h21  h22  h23]
    [h31  h32  h33]

Maps point (x,y) to (x',y') via homogeneous coordinates:

[w'x']   [h11  h12  h13] [x]
[w'y'] = [h21  h22  h23] [y]
[w'  ]   [h31  h32  h33] [1]

Dividing the first two rows by w' = h31*x + h32*y + h33 gives:

x' = (h11*x + h12*y + h13) / (h31*x + h32*y + h33)
y' = (h21*x + h22*y + h23) / (h31*x + h32*y + h33)
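
In code, the mapping is a matrix multiply followed by the perspective divide. The sketch below uses a hypothetical `project_points` helper; it is not the signature of the notebook's `apply_homography()`.

```python
import numpy as np

def project_points(H, pts):
    """Map (N, 2) points through the 3x3 homography H."""
    pts = np.asarray(pts, dtype=np.float64).reshape(-1, 2)
    # Lift to homogeneous coordinates (x, y, 1).
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    # Divide by the third coordinate h31*x + h32*y + h33.
    return mapped[:, :2] / mapped[:, 2:3]
```

Scaling H by any non-zero constant leaves the result unchanged, which is why H can be normalized so that h33 = 1.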

DLT Formulation

For correspondence (x,y) → (x',y'), with h1..h9 denoting the entries h11, h12, ..., h33 of H read row by row:

[-x  -y  -1   0   0   0  x*x'  y*x'  x'] [h1]
[ 0   0   0  -x  -y  -1  x*y'  y*y'  y'] [h2] = 0
                                          [h3]
                                          [h4]
                                          [h5]
                                          [h6]
                                          [h7]
                                          [h8]
                                          [h9]
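
For reference, the two rows follow from multiplying the projection equations through by their common denominator and moving everything to one side, using the same h1..h9 as in the vector above:

```latex
\begin{aligned}
x'\,(h_7 x + h_8 y + h_9) = h_1 x + h_2 y + h_3
  \;&\Longrightarrow\;
  -x\,h_1 - y\,h_2 - h_3 + x x'\,h_7 + y x'\,h_8 + x'\,h_9 = 0, \\
y'\,(h_7 x + h_8 y + h_9) = h_4 x + h_5 y + h_6
  \;&\Longrightarrow\;
  -x\,h_4 - y\,h_5 - h_6 + x y'\,h_7 + y y'\,h_8 + y'\,h_9 = 0.
\end{aligned}
```

Stacking these two rows for N correspondences gives the 2N×9 system Ah = 0. Since h is defined only up to scale it has 8 degrees of freedom, so at least 4 correspondences are required.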

🎓 Key Functions

Part 1: Augmented Reality

| Function | Description |
| --- | --- |
| `sift_match_images()` | SIFT feature detection and matching between images |
| `compute_homography()` | Compute 3×3 homography from correspondences using DLT |
| `RANSAC()` | Robust homography estimation with outlier rejection |
| `apply_homography()` | Transform points using homography matrix |
| `map_book_corners_to_frame()` | Detect book position in video frame |
| `crop_and_resize_frame()` | Adjust AR content to book aspect ratio |
| `overlay_ar_frame_on_book_masked()` | Composite AR content with polygon masking |
| `load_video_frames()` | Load all frames from video file |

Part 2: Image Mosaics

| Function | Description |
| --- | --- |
| `findMatchesSift()` | SIFT detection, matching with Lowe's ratio test |
| `DLT_HomographyEstimation()` | Compute homography using SVD (custom implementation) |
| `RANSAC()` | Robust homography estimation with outlier rejection |
| `bilinear_interpolation()` | Sub-pixel sampling for smooth warping |
| `warp_image()` | Create output canvas and place reference image |
| `inverse_warp()` | Backward warping with bilinear interpolation |
| `stitch_images()` | Complete stitching pipeline |
| `verify_homography()` | Visual verification of homography accuracy |
