SVD
SVD (Singular Value Decomposition) is one of my favorite topics in linear algebra. It's almost magical to factorize any matrix into a product of two orthogonal matrices and a diagonal matrix.
Derivation
In simple words, any matrix 'A' can be decomposed into a product of 3 matrices : U, ∑ and V, where ∑ contains the singular values along it's diagonal, and U & V are unitary matrices. Here is a derivation for SVD
Applications
- Image compression
- Low-rank approximations
- PCA
- Rank determination
- Least squares
and a million more...
Let's look at one such application in computer vision:
Image compression
What if we delete small singular values from ∑ and its corresponding vectors from U and V ? We can then obtain the projection of A onto a lower-dimensional subspace. This technique can be used to compress an image at the loss of some high-frequency information.
Using this code, we can reconstruct the image using the first N components
from PIL import Image
import numpy as np \
import matplotlib.pyplot as plt
img = Image.open('tiger.webp') img = np.array(img) gray = 0.2989 * img[:, :, 0] + 0.5870 * img[:, :, 1] + 0.1140 * img[:, :, 2]
U,s,V = np.linalg.svd(gray)
recon_imgs=[] for n in [5, 15, 30]: S = np.zeros(np.shape(gray)) for i in range(0, n): S[i,i] = s[i] recon_img = U @ S @ V recon_imgs.append(recon_img)
fig, ax = plt.subplots(2, 2)
ax[0][0].imshow(gray, cmap='gray') ax[0][0].axis('off') ax[0][0].set_title('Original')
ax[0][1].imshow(recon_imgs[0], cmap='gray') ax[0][1].axis('off') ax[0][1].set_title(f'Reconstructed n = 5')
ax[1][0].imshow(recon_imgs[1], cmap='gray') ax[1][0].axis('off') ax[1][0].set_title(f'Reconstructed n = 15')
ax[1][1].imshow(recon_imgs[2], cmap='gray') ax[1][1].axis('off') ax[1][1].set_title(f'Reconstructed n = 30')
plt.show()
There are 942 singular values for the original image, since it's a (942x942) image with unique pixel rows.
We can recover most of the low-frequency information from the image using only the first 15-30 components of SVD. Even the finer details like eyeballs are present in the reconstructed image, with a compression rate of ~ 2000-4000.