PCA with Matlab
2 PCA example in \(\mathbb{R}^2\)
clear all; close all; % clear workspace
%% Dataset is created
N=100; % number of data points
x1=10*randn(N,1); % first coordinates
x2=4+x1+10*randn(N,1); % second coordinates - example in space of dimension D=2
%% pca
X=[(x1-mean(x1))';(x2-mean(x2))'];
S=(X*X'/N);
[U,L,V] = svd(S);
%% plot data and visualise eigenvectors scaled using their eigenvalues
figure;plot(x1,x2,'kx','LineWidth',2,'MarkerFaceColor','k'),...,
hold on, plot(mean(x1),mean(x2),'ro','LineWidth',2,'MarkerFaceColor','r'), hold on,...
quiver(mean(x1),mean(x2),V(1,1),V(2,1),sqrt(L(1,1)),'g','LineWidth',2), hold on, ...
quiver(mean(x1),mean(x2),V(1,2),V(2,2),sqrt(L(2,2)),'b','LineWidth',2),hold off, axis equalThe resulting figure shows the dataset in the original coordinates system (using standard basis):
2.1 Visualisation of \(\alpha\)
The resulting figure shows the dataset in the coordinates system defined by PCA:
3 PCA example with MNIST
MNIST dataset https://en.wikipedia.org/wiki/MNIST_database
clear all; close all; % clear workspace
websave('mnist.mat',"https://github.com/daniel-e/mnist_octave/raw/master/mnist.mat");
data=load('mnist.mat');
X = double(data.trainX);
[N,D]=size(X);
% computing the mean image of MNIST dataset
MeanImage=mean(X,1);
% centering data
Xtilde=X-mean(X);
% covariance
S=Xtilde'*Xtilde/N;
[U,L,V]=svd(S);Example (first) image from the training set:
% visualising the first image in MNIST dataset
figure;imagesc(reshape(X(1,:), 28, 28)'),colormap("gray")Mean image of the training set:
Example of a test image:
% reconstruction of a test image
xtest = double(data.testX(1,:));
figure;imagesc(reshape(xtest, 28, 28)'),colormap("gray")Reconstruction of the test image with 10 principal components:
% reconstruction using #nbVector eigenvectors
nbVector=10;
Reconstruction = MeanImage+ ((xtest-MeanImage)*U(:,1:nbVector))*U(:,1:nbVector)';
figure;imagesc(reshape(Reconstruction, 28, 28)'),colormap("gray")3.1 Eigenvectors
% visualising the first eigenvector
figure;imagesc(reshape(U(:,1),28,28)'),colormap("gray")
% visualising the second eigenvector
figure;imagesc(reshape(U(:,2),28,28)'),colormap("gray")
3.2 Eigenvalues
% visualising eigenvalues
eigenvalues=diag(L);
figure; plot(100*eigenvalues/sum(eigenvalues),'x','Linewidth',2), ...
title('percentage of variance explained by each principal component')figure; plot(100*cumsum(eigenvalues)/sum(eigenvalues),'x','Linewidth',2),...
title('percentage of variance explained by the first d principal components')3.3 Visualisation of coordinates \(\alpha\)’s
We are looking at the coordinates \(\alpha\)’s for all \(N=60000\) training images in MNIST.
nc=10; % nb of classes in MNIST
Ytrain=zeros(nc, N);
for i = 0:nc - 1
Ind = find(data.trainY == i);
Ytrain(i + 1, Ind) = 1;
end
[rows,cols,vals] = find(Ytrain==1);
clr = hsv(size(Ytrain,1));
figure;
gscatter(alpha(1,:),alpha(2,:),rows-1,clr), axis equal, ...
title('eigenspace F with d=2')