Derivation

In simple words, any matrix 'A' can be decomposed into a product of 3 matrices : U, ∑ and V, where ∑ contains the singular values along its diagonal, and U & V are unitary matrices. Here is a derivation for SVD:

SVD Derivation Part 1
SVD Derivation Part 2
SVD Derivation Part 3
SVD Derivation Part 4

Applications

  1. Image compression
  2. Low-rank approximations
  3. PCA
  4. Rank determination
  5. Least squares

and a million more...

Let's look at one such application in computer vision:

Image compression

What if we delete small singular values from ∑ and its corresponding vectors from U and V? We can then obtain the projection of A onto a lower-dimensional subspace. This technique can be used to compress an image at the loss of some high-frequency information.

Using this code, we can reconstruct the image using the first N components:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

img = Image.open('tiger.webp')
img = np.array(img)
gray = 0.2989 * img[:, :, 0] + 0.5870 * img[:, :, 1] + 0.1140 * img[:, :, 2]

U, s, V = np.linalg.svd(gray)

recon_imgs = []
for n in [5, 15, 30]:
    S = np.zeros(np.shape(gray))
    for i in range(0, n):
        S[i, i] = s[i]
    recon_img = U @ S @ V
    recon_imgs.append(recon_img)

fig, ax = plt.subplots(2, 2)

ax[0][0].imshow(gray, cmap='gray')
ax[0][0].axis('off')
ax[0][0].set_title('Original')

ax[0][1].imshow(recon_imgs[0], cmap='gray')
ax[0][1].axis('off')
ax[0][1].set_title(f'Reconstructed n = 5')

ax[1][0].imshow(recon_imgs[1], cmap='gray')
ax[1][0].axis('off')
ax[1][0].set_title(f'Reconstructed n = 15')

ax[1][1].imshow(recon_imgs[2], cmap='gray')
ax[1][1].axis('off')
ax[1][1].set_title(f'Reconstructed n = 30')
plt.show()

There are 942 singular values for the original image, since it's a (942x942) image with unique pixel rows.

SVD Image Compression Result

We can recover most of the low-frequency information from the image using only the first 15-30 components of SVD. Even the finer details like eyeballs are present in the reconstructed image, with a compression rate of ~ 2000-4000.