The dimensionality reduction method most commonly used in machine learning is principal component analysis (PCA), and PCA itself is usually computed via singular value decomposition (SVD). So what else is SVD good for? One popular application is image compression, so let's experiment with it.
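As a quick aside, that PCA-SVD relationship is easy to check in NumPy: for a centered data matrix the right singular vectors are the principal directions, and the squared singular values divided by (n - 1) are the explained variances. A minimal sketch with made-up random data (the variable names are purely illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # toy data: 200 samples, 5 features
Xc = X - X.mean(axis=0)              # PCA works on centered data

# PCA via the eigendecomposition of the covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]    # eigh returns ascending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The same quantities via the SVD of the centered data
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(eigvals, s ** 2 / (len(Xc) - 1)))   # True
print(np.allclose(np.abs(eigvecs), np.abs(vt.T)))     # True (directions agree up to sign)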
Package used:
PIL
NumPy
Experiment
Load a color image and perform an SVD on each of its RGB channels. The proportion covered by the first k singular values is
(σ1 + σ2 + ... + σk) / (σ1 + σ2 + ... + σn)
where n is the total number of singular values. For each p in [0.1, 0.2, ..., 0.9], we keep the first k singular values whose sum reaches the proportion p of the total, and reconstruct the image from them.
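In code, this selection rule amounts to finding the smallest k whose cumulative singular-value sum reaches p times the total; the while loop inside rebuild_img below does exactly this. A standalone sketch of the rule (sigma is assumed to be the descending singular values returned by np.linalg.svd):

import numpy as np

def num_singular_values(sigma, p):
    # Smallest k such that sigma_1 + ... + sigma_k covers at least the fraction p of the total sum
    cumulative = np.cumsum(sigma)
    k = int(np.searchsorted(cumulative, p * cumulative[-1])) + 1
    return min(k, len(sigma))

For example, num_singular_values(sigma, 0.9) gives the k used for the p = 0.9 reconstruction (the post's loop uses int(sum(sigma)) as the total and a strict "exceed" condition, so its count may differ by one).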
from PIL import Image
import numpy as np


def rebuild_img(u, sigma, v, p):
    # p is the target proportion of the singular-value sum to keep
    print(p)
    m = len(u)
    n = len(v)
    a = np.zeros((m, n))

    count = int(sum(sigma))
    cur_sum = 0
    k = 0
    # Add rank-1 terms sigma[k] * u_k * v_k until the kept singular
    # values cover the proportion p of the total sum
    while cur_sum <= count * p:
        uk = u[:, k].reshape(m, 1)
        # v[k] is the k-th right singular vector (np.linalg.svd returns V transposed)
        vk = v[k].reshape(1, n)
        a += sigma[k] * np.dot(uk, vk)
        cur_sum += sigma[k]
        k += 1
    print('k:', k)

    # Clip to the valid pixel range
    a[a < 0] = 0
    a[a > 255] = 255
    # Round to the nearest integer and convert to uint8
    return np.rint(a).astype("uint8")


if __name__ == '__main__':
    img = Image.open('test.jpg', 'r')
    a = np.array(img)
    for p in np.arange(0.1, 1, 0.1):
        u, sigma, v = np.linalg.svd(a[:, :, 0])
        R = rebuild_img(u, sigma, v, p)
        u, sigma, v = np.linalg.svd(a[:, :, 1])
        G = rebuild_img(u, sigma, v, p)
        u, sigma, v = np.linalg.svd(a[:, :, 2])
        B = rebuild_img(u, sigma, v, p)
        I = np.stack((R, G, B), 2)
        # Save the reconstructed picture in the img folder
        # (round p * 100 to avoid floating-point names such as 79.99...)
        Image.fromarray(I).save("img\\svd_" + str(int(round(p * 100))) + ".jpg")
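One possible refinement, not in the original post: the SVD of each channel does not depend on p, so it can be computed once per channel and reused for every proportion, avoiding nine redundant decompositions per channel. A sketch reusing the rebuild_img function defined above:

from PIL import Image
import numpy as np

img = np.array(Image.open('test.jpg'))
# One full SVD per RGB channel, computed a single time
svds = [np.linalg.svd(img[:, :, c]) for c in range(3)]

for p in np.arange(0.1, 1, 0.1):
    channels = [rebuild_img(u, sigma, v, p) for (u, sigma, v) in svds]
    out = np.stack(channels, 2)
    Image.fromarray(out).save("img\\svd_" + str(int(round(p * 100))) + ".jpg")

Passing full_matrices=False to np.linalg.svd would also save memory, but rebuild_img currently reads the output width as len(v), which would then become min(m, n) instead of n, so it would need a small adjustment first.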
Effect
There are ten images in total; from top to bottom the singular-value proportions are [0.1, 0.2, ..., 0.9, 1.0]. The images recovered with proportions of 0.7, 0.8 and 0.9 are still fairly clear, yet the number of singular values they use is quite small, as the table below shows:
Proportion of singular-value sum    Singular values kept    Fraction of all singular values
0.7                                 45                      0.10
0.8                                 73                      0.16
0.9                                 149                     0.33
1.0                                 450                     1.00
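The exact counts depend on the image (the post does not say which channel the table was measured on); for any test.jpg they can be reproduced with a sketch like this, here using the first (R) channel as an example:

from PIL import Image
import numpy as np

a = np.array(Image.open('test.jpg'))
# compute_uv=False returns only the singular values, which is all we need here
sigma = np.linalg.svd(a[:, :, 0], compute_uv=False)
cumulative = np.cumsum(sigma)
for p in (0.7, 0.8, 0.9, 1.0):
    k = min(int(np.searchsorted(cumulative, p * cumulative[-1])) + 1, len(sigma))
    print(p, k, round(k / len(sigma), 2))   # proportion, count kept, fraction of all singular values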
Summary
Singular value decomposition can reduce the dimensionality of data very effectively. Taking the image in this post as an example, keeping 149 of the 450 singular values still retains 90% of the information (measured by the singular-value sum).
Effective as it is, SVD should not be abused: as a rule of thumb, the information lost through dimensionality reduction should stay below 5%, or even below 1%.
Andrew Ng's course mentions two common mistakes in the use of dimensionality reduction, which are worth repeating here:
Using dimensionality reduction to fix over-fitting.
Applying dimensionality reduction unconditionally, that is, treating it as a mandatory preprocessing step of model training.