PCA simple example (principal component analysis)

Simple examples of what PCA does with a 2D set of points.
An ellipse-shaped set of points:

pca example


The upper graph is the original set of points.
The red vectors are the PCA components, scaled by the square root of the explained variance (the standard deviation along each component).
The bottom graph is the transformed set of points.

So one can see that PCA will:

1) move the origin of the axes to the mean of the points
2) find the axis with the most variance; the second axis will be perpendicular to the first one
3) project the points onto the new axes
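
The three steps above can be sketched by hand with NumPy. This is only a minimal illustration under my own assumptions, not necessarily how scikit-learn computes it: the function name pca_2d and the randomly generated cloud are made up for the example.

```python
import numpy as np

def pca_2d(points):
    """Return (mean, components, explained_variance, transformed) for an (n, 2) array."""
    points = np.asarray(points, dtype=float)
    # 1) move the origin to the mean of the points
    mean = points.mean(axis=0)
    centered = points - mean
    # 2) eigenvectors of the covariance matrix are the new, mutually perpendicular axes
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # largest variance first
    explained_variance = eigvals[order]
    components = eigvecs[:, order].T         # one axis per row
    # 3) project the centered points onto the new axes
    transformed = centered @ components.T
    return mean, components, explained_variance, transformed

# hypothetical elongated cloud, stretched along the diagonal and shifted away from (0, 0)
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2)) * [3.0, 0.5]
angle = np.pi / 4
rot = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
pts = base @ rot.T + [10.0, 5.0]

mean, components, explained_variance, transformed = pca_2d(pts)
print("mean:", mean)
print("components:\n", components)
print("explained variance:", explained_variance)
```

The first row of components should point roughly along the diagonal (its sign is arbitrary), and the transformed points are centered on (0, 0).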

Results with a small outlier (outliers make PCA less useful):

pca example outlier
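
To get a feel for the outlier effect without clicking points, here is a small sketch that compares the direction of the first axis with and without one extra point. The helper first_axis_angle, the synthetic cloud, and the position of the outlier are all assumptions chosen for illustration.

```python
import numpy as np

def first_axis_angle(points):
    """Angle in degrees (0..180) of the direction with the most variance."""
    centered = points - points.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    vx, vy = eigvecs[:, np.argmax(eigvals)]
    return np.degrees(np.arctan2(vy, vx)) % 180.0

rng = np.random.default_rng(1)
cloud = rng.normal(size=(100, 2)) * [3.0, 0.5]     # elongated along the x axis
with_outlier = np.vstack([cloud, [[0.0, 60.0]]])   # one far-away point above the cloud

clean_angle = first_axis_angle(cloud)
outlier_angle = first_axis_angle(with_outlier)
print("first axis without outlier: %.1f degrees" % clean_angle)
print("first axis with outlier:    %.1f degrees" % outlier_angle)
```

In this setup a single point out of 101 is enough to swing the first axis from roughly horizontal to roughly vertical, because variance grows with the square of the distance from the mean.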


Simple Python code to play with PCA:

[copy and paste it into an IPython environment with scikit-learn installed]

Press the left mouse button to add a point to the 2D set.
Press the right mouse button to compute and plot the PCA of the points.

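from tkinter import *  # Python 3; on Python 2 the module is named Tkinter
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# start with an empty (0, 2) array; each click appends one row
points = np.empty((0, 2))

def leftbutton(event):
    """Draw the clicked point on the canvas and add it to the set."""
    global points
    event.widget.create_oval(event.x, event.y, event.x + 5, event.y + 5, fill="red")
    points = np.append(points, [[event.x, event.y]], axis=0)

def rightbutton(event):
    """Fit PCA to the collected points and plot them before and after the transform."""
    if len(points) < 2:
        return  # PCA needs at least two points
    pca = PCA()
    pca.fit(points)
    traint = pca.transform(points)
    # original points, with their mean marked in green
    plt.scatter(points[:, 0], points[:, 1], alpha=0.5)
    pmean = np.mean(points, 0)
    plt.axis('equal')
    plt.plot([pmean[0]], [pmean[1]], 'g.', markersize=10.0)
    # components scaled by the standard deviation along each of them
    S = np.sqrt(pca.explained_variance_)
    for s, comp in zip(S, pca.components_):
        plt.plot([pmean[0], pmean[0] + s * comp[0]], [pmean[1], pmean[1] + s * comp[1]], 'r-', lw=2)
    plt.show()
    # the same points expressed in the PCA axes
    plt.scatter(traint[:, 0], traint[:, 1], alpha=0.5)
    plt.show()
    print("pca vector=", pca.components_)
    # the origin of the PCA axes maps back to the mean of the points
    print("0,0->", pca.inverse_transform([[0, 0]]))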

root = Tk()
cnv = Canvas(root, bg="white", width=500, height= 500)
cnv.pack()
cnv.bind("<Button-1>", leftbutton)
cnv.bind("<Button-3>", rightbutton)
root.mainloop()
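
For readers who want to check the printed values without the GUI, here is a non-interactive sketch of the same scikit-learn calls. The generated points stand in for clicked points and are an assumption of this example, as are the variable names.

```python
import numpy as np
from sklearn.decomposition import PCA

# hypothetical stand-in for points clicked on the 500x500 canvas
rng = np.random.default_rng(2)
pts = rng.normal(size=(50, 2)) * [4.0, 1.0] + [250.0, 250.0]

pca = PCA().fit(pts)
transformed = pca.transform(pts)

# rows of components_ are unit-length, mutually perpendicular axes
print("components:\n", pca.components_)
# explained_variance_ is the variance of the transformed points along each axis
print("explained variance:", pca.explained_variance_)
# the origin of the transformed space maps back to the mean of the points
origin_back = pca.inverse_transform([[0.0, 0.0]])[0]
print("0,0 ->", origin_back, "mean:", pts.mean(axis=0))
```

This is why the last line printed by the script above shows the mean of the points: (0, 0) in the PCA axes is the mean in the original axes.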