ML 14 Part 2: Principal Component Analysis
Published: 2019-05-24



Given $m$ samples $x^{(1)}, \dots, x^{(m)} \in \mathbb{R}^n$, first preprocess the data so that it has zero mean: subtract $\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}$ from every sample (rescaling each coordinate to unit variance is also common).
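For the preprocessing step, a minimal MATLAB sketch (our own, not from the original post, assuming the samples are the rows of a matrix X):

Xc = bsxfun(@rdivide, bsxfun(@minus, X, mean(X, 1)), std(X, 0, 1));  % zero mean, unit variance per coordinate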

The goal is to find the unit direction $u$ that maximizes the sum of squared projections of the points onto $u$:

$$\max_{\|u\|=1}\ \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)\top}u\right)^{2} = u^{\top}\left(\frac{1}{m}\sum_{i=1}^{m}x^{(i)}x^{(i)\top}\right)u = u^{\top}\Sigma u.$$

Maximizing $u^{\top}\Sigma u$ subject to $u^{\top}u = 1$ (via a Lagrange multiplier) yields $\Sigma u = \lambda u$. So, to speak plainly, $u$ ought to be an eigenvector of the covariance matrix $\Sigma$, and the maximizing direction is the eigenvector with the largest eigenvalue.
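As a quick numeric sanity check, here is a small MATLAB example (our own synthetic data, not from the original post) showing that the top eigenvector of $\Sigma$ attains the largest value of $u^{\top}\Sigma u$:

rng(0);
X = randn(500, 2) * [3 0; 0 1];        % elongated 2-D point cloud
Xc = bsxfun(@minus, X, mean(X, 1));    % center the data
Sigma = (Xc' * Xc) / size(Xc, 1);      % empirical covariance matrix
[U, D] = eig(Sigma);                   % columns of U are eigenvectors
[lambda, idx] = max(diag(D));
u = U(:, idx);                         % principal direction
u' * Sigma * u                         % equals lambda, the largest eigenvalue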
Here we reduce the $n$-dimensional space down to a $k$-dimensional subspace whose basis is the top $k$ eigenvectors $u_1, u_2, \dots, u_k$. Given any $x$ in the $n$-dimensional space, we can express it in the $k$-dimensional space as $(u_1^{\top}x,\ u_2^{\top}x,\ \dots,\ u_k^{\top}x)$. In other words, we project $x \in \mathbb{R}^n$ onto the subspace spanned by $\{u_1, u_2, \dots, u_k\}$.

To get a more intuitive picture of the PCA algorithm, we use MATLAB to draw some figures.

X0 = read_obj('football1.obj');   % seahorse_extended.obj is another test mesh
vertex = X0.xyz';                 % vertex coordinates
faces  = X0.tri';                 % triangle index list
center0 = mean(vertex, 1);

% display the mesh
clf;
plot_mesh(vertex, faces);
shading interp;

[pc, scores, pcvars] = princomp(vertex);
u1 = pc(:,1);
u2 = pc(:,2);
u3 = pc(:,3);

hold on;   % keep the mesh so the principal axes are drawn on top of it
center = [];
center = [center; center0];
center = [center; center0 + u1'*20];
center = [center; center0];
center = [center; center0 + u2'*20];
center = [center; center0];
center = [center; center0 + u3'*20];
plot3(center(:,1), center(:,2), center(:,3), '-*');

The core PCA computation is done by the princomp function (in newer MATLAB releases, princomp has been replaced by pca). The documentation of princomp says:

[COEFF,SCORE,latent,tsquare] = princomp(X)

COEFF = princomp(X) performs principal components analysis (PCA) on the n-by-p data matrix X, and returns the principal component coefficients, also known as loadings. Rows of X correspond to observations, columns to variables. COEFF is a p-by-p matrix, each column containing coefficients for one principal component. The columns are in order of decreasing component variance.
The scores are the data formed by transforming the original data into the space of the principal components. The values of the vector latent are the variance of the columns of SCORE. Hotelling’s T^2 is a measure of the multivariate distance of each observation from the center of the data set.

Put in our own language: the columns of COEFF are the eigenvectors of the matrix $\Sigma$, and they form the basis of the reduced space; "principal components" is the documentation's name for them.
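Here is a small check of this claim (our own example, not from the post): the columns of COEFF agree, up to sign, with the eigenvectors of the covariance matrix.

X = randn(100, 3);                      % synthetic data
[coeff, score, latent] = princomp(X);
[V, D] = eig(cov(X));                   % eigen-decomposition of the covariance
[~, order] = sort(diag(D), 'descend');  % reorder to decreasing variance
V = V(:, order);
abs(coeff' * V)                         % approximately the 3-by-3 identity (signs may flip)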

In our case the original space is 3-dimensional, so the reduced space has dimension at most 3.

The data in the original space can then be mapped into the reduced space: taking the dot product of a (centered) point with each principal component yields the coordinates of a new point in the reduced space, as sketched below.
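A minimal sketch of this projection step, reusing pc and vertex from the listing above (the value of k is our own choice for illustration):

k = 2;                                          % target dimension
U = pc(:, 1:k);                                 % top-k principal components as columns
Xc = bsxfun(@minus, vertex, mean(vertex, 1));   % center the points first
Y = Xc * U;                                     % each row is a point in the reduced space
% Y matches the first k columns of the 'scores' output of princomp.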

The resulting picture looks like this:

[Figure: the football mesh with its three principal axes plotted through the centroid]

Since I am fond of sharing, the whole MATLAB source code is given in the listing above.

Application in face recognition


Assume faces are encoded as 100×100 images, so each image consists of 10,000 pixels, which means each image is a vector $x \in \mathbb{R}^{10000}$. Every pixel of the image stands for one dimension of the space.

Each image can thus be expressed as a point in $\mathbb{R}^{10000}$, and we can solve for the top $k$ eigenvectors of $\Sigma$, each of which can itself be displayed as a face image (a so-called eigenface).

If we get a new face image, we can project it onto each eigenvector to obtain its coordinates in the $k$-dimensional subspace; the closer the projections of two images are, the more alike the two faces are. A sketch follows.
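A minimal eigenfaces sketch (our own illustration, assuming X is an m-by-10000 matrix whose rows are flattened 100×100 face images; loading the image files is omitted):

mu = mean(X, 1);                        % the mean face
Xc = bsxfun(@minus, X, mu);             % center the data
[~, ~, V] = svd(Xc, 'econ');            % right singular vectors = eigenvectors of Sigma
k = 20;                                 % number of eigenfaces to keep (arbitrary choice)
U = V(:, 1:k);                          % each column of U is an eigenface in R^10000

imagesc(reshape(U(:, 1), 100, 100)); colormap gray;   % visualize the first eigenface

y = (x_new - mu) * U;   % coordinates of a new 1-by-10000 face x_new in the subspace
% Two faces can then be compared via the distance between their coordinate vectors.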
