Aarav Patel
Project 2 is an exploration of the following: 1) analyzing edges by computing image gradients, 2) sharpening images by accentuating high frequencies (obtained by subtracting the Gaussian-blurred image from the original), 3) creating hybrid images by combining the low frequencies of one image with the high frequencies of another, and 4) multi-resolution blending.
I first computed the partial derivatives of the cameraman image with respect to x ([1, -1]) and y ([[1], [-1]]) by convolving it with the respective finite difference operators. I arrived at the gradient magnitude image by taking the Euclidean norm of the resulting partial derivative images. Finally, comparing the gradient magnitude against an empirically determined threshold yields a binarized gradient magnitude. The threshold is expressed as a scalar multiplier: the number of standard deviations above the mean of the gradient magnitude.
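A minimal sketch of this step, assuming scipy for the convolutions; the function name and the multiplier `k` are illustrative placeholders, not the exact values I used:

```python
import numpy as np
from scipy.signal import convolve2d

def binarized_gradient_magnitude(im, k=1.5):
    """im: 2D grayscale array; k: hypothetical threshold multiplier
    (standard deviations above the mean gradient magnitude)."""
    dx = np.array([[1, -1]])                    # finite difference in x
    dy = np.array([[1], [-1]])                  # finite difference in y
    gx = convolve2d(im, dx, mode="same", boundary="symm")
    gy = convolve2d(im, dy, mode="same", boundary="symm")
    mag = np.sqrt(gx**2 + gy**2)                # Euclidean norm of the gradient
    threshold = mag.mean() + k * mag.std()
    return mag, (mag >= threshold).astype(float)
```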
I first created a 2D Gaussian filter by taking the outer product of a 1D Gaussian with its transpose. I blurred the original cameraman image with this 2D Gaussian. I then repeated the steps in part 1.1. The gradient magnitude image has less noise; thus, the binarized gradient magnitude image more clearly highlights the edges.
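A sketch of the kernel construction and blur, assuming OpenCV's getGaussianKernel for the 1D Gaussian; the kernel size and sigma here are illustrative, and `binarized_gradient_magnitude` comes from the sketch above:

```python
import cv2
import numpy as np
from scipy.signal import convolve2d

def gaussian_2d(ksize, sigma):
    g = cv2.getGaussianKernel(ksize, sigma)     # 1D Gaussian, shape (ksize, 1)
    return g @ g.T                              # outer product -> 2D Gaussian

G = gaussian_2d(11, 2.0)                        # parameters chosen for illustration
blurred = convolve2d(im, G, mode="same", boundary="symm")
mag, edges = binarized_gradient_magnitude(blurred)
```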
Now, instead of convolving the image with the 2D Gaussian and subsequently convolving with the finite difference operators, I first convolved the 2D Gaussian with the finite difference operators. This resulted in the Derivative of Gaussian (DoG) filters. As you can see, there is no difference between the two processes, since convolution is associative (and commutative).
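A sketch of the DoG construction under the same assumptions as the snippets above; the full convolution grows the Gaussian kernel by one sample in each direction, and the results match the two-step process up to boundary handling:

```python
dog_x = convolve2d(G, np.array([[1, -1]]))      # Derivative of Gaussian in x
dog_y = convolve2d(G, np.array([[1], [-1]]))    # Derivative of Gaussian in y

# one pass over the image per direction, instead of blur-then-differentiate
gx = convolve2d(im, dog_x, mode="same", boundary="symm")
gy = convolve2d(im, dog_y, mode="same", boundary="symm")
```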
To sharpen an image, I first blurred the image with a Gaussian and subtracted the result from the original image. Blurring with a Gaussian removes high frequencies; the Gaussian is effectively a low-pass filter. Thus, subtracting the blurred image from the original leaves only the higher frequencies. Adding this result, multiplied by a scalar, back to the original image sharpens it; a larger scalar results in a sharper image. For color images, I repeated this process for each channel and recombined the channels at the end.
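A minimal per-channel sketch of this process; the function name and the gain parameter `alpha` are illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

def sharpen(im, G, alpha):
    """im: HxWx3 float image in [0, 1]; G: 2D Gaussian kernel;
    alpha: gain on the high frequencies (hypothetical parameter name)."""
    out = np.empty_like(im)
    for c in range(im.shape[2]):                # sharpen each color channel
        channel = im[..., c]
        blurred = convolve2d(channel, G, mode="same", boundary="symm")
        out[..., c] = channel + alpha * (channel - blurred)
    return np.clip(out, 0, 1)                   # recombine happens via out's channels
```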
Now, I compress the sharpening process into a single convolution using the unsharp mask filter. I used the following derivation shown in lecture:
f + α(f − f∗g) = (1 + α)f − α(f∗g) = f ∗ ((1 + α)e − αg)
Here, f is the image, g is the Gaussian filter, α is the scalar multiplier on the high frequencies, and e is the unit impulse (a matrix that is 0 everywhere except for a 1 at the center, with the same dimensions as the Gaussian filter in use). As you can see, there is no difference between the two processes.
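A sketch of building the filter from that derivation, under the same assumptions as above:

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_mask_filter(G, alpha):
    """(1 + alpha) * e - alpha * g, with e the unit impulse matching G's shape."""
    e = np.zeros_like(G)
    e[G.shape[0] // 2, G.shape[1] // 2] = 1.0   # 1 at the center, 0 elsewhere
    return (1 + alpha) * e - alpha * G

# one convolution per channel replaces the blur-subtract-add pipeline:
# sharpened = convolve2d(channel, unsharp_mask_filter(G, alpha),
#                        mode="same", boundary="symm")
```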
Below, I show some additional sharpening examples.
Now, I take sharp images, blur them, and then attempt to sharpen them again. As you can see, these attempts were unsuccessful. This is because the initial blurring discarded high-frequency information: sharpening can only amplify the frequencies that remain, not recover those that were removed.
I first show what an already sharp image looks like when sharpened. Then, I take the sharp image, blur it, and then attempt to sharpen it again. As you can see, this attempt was also unsuccessful.
To create a hybrid image, I first aligned the images using two manually selected points from each image. Then, I applied a low-pass filter to one image and a high-pass filter to the other. I used the standard 2D Gaussian filter as the low-pass filter, and subtracted the Gaussian-filtered image from the original as the high-pass filter. I then experimented to determine the best cutoff frequency for each hybrid image I created.
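A minimal sketch of the combination step, assuming the images are already aligned and grayscale; the two Gaussian kernels stand in for the empirically chosen cutoff frequencies:

```python
import numpy as np
from scipy.signal import convolve2d

def hybrid(im_low, im_high, G_low, G_high):
    """im_low, im_high: aligned 2D images in [0, 1]; G_low, G_high: Gaussian
    kernels whose sigmas encode the two cutoff frequencies."""
    low = convolve2d(im_low, G_low, mode="same", boundary="symm")
    high = im_high - convolve2d(im_high, G_high, mode="same", boundary="symm")
    return np.clip(low + high, 0, 1)
```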
I then created a hybrid image from two images of myself in my room at two different positions.
I then created a hybrid image of a man who appears happy at high frequencies (viewed up close) and unhappy at low frequencies (viewed from a distance).
For my best result (the happy/unhappy man), I also completed the process through frequency analysis. Below, I show the log magnitude of the Fourier transform for the original unhappy man, the original happy man, the low-pass filtered unhappy man, the high-pass filtered happy man, and the hybrid image, respectively.
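A sketch of how such a visualization can be computed with numpy; the small epsilon is my addition to guard against log(0):

```python
import numpy as np

def log_fft_magnitude(gray):
    """Log magnitude of the 2D FFT, shifted so low frequencies are centered."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
```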
Here is a failed attempt to create a hybrid image out of Tony Stark and Steve Rogers.
Here, I've reconstructed Figure 3.42 from the Szeliski textbook using Gaussian and Laplacian stacks. I first constructed a Gaussian stack for the Oraple. Using that Gaussian stack, I constructed the Laplacian stacks for the left and right halves of the image. I then constructed a Gaussian stack for my mask. Next, at each level, I computed the weighted average of the two Laplacians using the mask. Finally, I collapsed the stack by summing the levels into the final image. I repeated this for each color channel and recombined the channels at the end.
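A minimal per-channel sketch of the pipeline; it reuses a single Gaussian kernel at every level (so the blur grows with depth), which is a simplification of a per-level sigma schedule:

```python
import numpy as np
from scipy.signal import convolve2d

def blur(im, G):
    return convolve2d(im, G, mode="same", boundary="symm")

def blend_channel(left, right, mask, G, levels=5):
    """left, right: 2D images; mask: 2D weights in [0, 1] (1 selects `left`);
    G: 2D Gaussian kernel; levels: stack depth (illustrative default)."""
    gl, gr, gm = [left], [right], [mask]
    for _ in range(levels):                     # Gaussian stacks, no downsampling
        gl.append(blur(gl[-1], G))
        gr.append(blur(gr[-1], G))
        gm.append(blur(gm[-1], G))
    # Laplacian stacks: band-pass differences plus the final low-pass level,
    # so each stack sums back to its original image
    ll = [gl[i] - gl[i + 1] for i in range(levels)] + [gl[-1]]
    lr = [gr[i] - gr[i + 1] for i in range(levels)] + [gr[-1]]
    # weighted average per level, then collapse the stack by summing
    return sum(gm[i] * ll[i] + (1 - gm[i]) * lr[i] for i in range(levels + 1))
```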
I first attempted to recreate the Oraple using the given images of an apple and an orange. Then, I attempted to blend the Berkeley and Stanford campuses. Lastly, I attempted to blend myself into Iron Man using an irregular mask.