# PCA - How many variables need to be considered?

#1

Principal component analysis is a method of extracting important variables from a large set of variables available in a data set.

I have run PCA on the Iris data set and plotted the scree plot, which tells me how many PCs to choose. But how many variables do I choose from the PCA results given below? Please let me know.

print(ir.pca)

Standard deviations:
[1] 1.7124583 0.9523797 0.3647029 0.1656840

Rotation:
                    PC1         PC2        PC3         PC4
Sepal.Length  0.5038236 -0.45499872  0.7088547  0.19147575
Sepal.Width  -0.3023682 -0.88914419 -0.3311628 -0.09125405
Petal.Length  0.5767881 -0.03378802 -0.2192793 -0.78618732
Petal.Width   0.5674952 -0.03545628 -0.5829003  0.58044745

Importance of components:
                          PC1    PC2     PC3     PC4
Standard deviation     1.7125 0.9524 0.36470 0.16568
Proportion of Variance 0.7331 0.2268 0.03325 0.00686
Cumulative Proportion  0.7331 0.9599 0.99314 1.00000

Any insights are much appreciated.

Regards,
tony

#2

Hi @tillutony,

In PCA we aim to retain as much variance as possible with the minimum number of components. For example, if you have a thousand features and, after applying PCA, only 10 components explain 99% of the variance (assuming you consider 99% good enough), then you can keep just those 10 components.

Regards
Ankit

#3

Just sort the proportion of variance in descending order, take its cumulative sum (cumsum), and keep the top components whose cumulative proportion reaches around 95% or more…
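A minimal sketch of that sort-and-cumsum recipe, written in Python (standard library only) rather than R. The proportions are copied from the `Importance of components` output in post #1. The 0.95 threshold and the function name `components_for_threshold` are illustrative assumptions, not part of any library.

```python
from itertools import accumulate

# Proportion of variance per component, copied from the output in post #1.
proportion_of_variance = {
    "PC1": 0.7331,
    "PC2": 0.2268,
    "PC3": 0.03325,
    "PC4": 0.00686,
}


def components_for_threshold(proportions, threshold=0.95):
    """Return the fewest components whose cumulative proportion reaches `threshold`.

    Hypothetical helper for illustration; if the threshold is never reached,
    all components are returned.
    """
    # Sort components from the largest to the smallest share of variance.
    ranked = sorted(proportions.items(), key=lambda item: item[1], reverse=True)
    # Running total of the sorted proportions (the "cumsum" step).
    running_totals = accumulate(value for _, value in ranked)

    kept = []
    for (name, _), total in zip(ranked, running_totals):
        kept.append(name)
        if total >= threshold:
            break
    return kept


selected = components_for_threshold(proportion_of_variance, threshold=0.95)
print(selected)
```

For the Iris output this keeps PC1 and PC2, whose cumulative proportion is 0.9599. Raising the threshold to 0.99 would also bring in PC3.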

#4

@tillutony Select the components whose eigenvalue is greater than 1. The eigenvalue of a component is the square of its standard deviation in the PCA output.

Have a look at this link to understand the importance of eigenvalues in PCA:
http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
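As a rough illustration of the eigenvalue rule (often called the Kaiser criterion), here is a short Python sketch using only the standard library. The standard deviations are copied from post #1 and squared to obtain the eigenvalues. The rule assumes the PCA was run on standardized variables, which looks plausible here because the eigenvalues sum to about 4, the number of variables. The function names are hypothetical.

```python
# Standard deviations of the components, copied from the output in post #1.
standard_deviations = {
    "PC1": 1.7124583,
    "PC2": 0.9523797,
    "PC3": 0.3647029,
    "PC4": 0.1656840,
}


def eigenvalues_from_sdev(sdev):
    """Eigenvalue of each component = its standard deviation squared."""
    return {name: sd ** 2 for name, sd in sdev.items()}


def select_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Keep the components whose eigenvalue is strictly greater than `cutoff`."""
    return [name for name, value in eigenvalues.items() if value > cutoff]


eigenvalues = eigenvalues_from_sdev(standard_deviations)
kept = select_by_eigenvalue(eigenvalues)

for name, value in eigenvalues.items():
    print(f"{name}: eigenvalue = {value:.4f}")
print("Kept:", kept)
```

On the Iris output only PC1 passes this cutoff, since the eigenvalue of PC2 is about 0.91. The rule is therefore stricter here than a 95% cumulative-variance threshold, which would keep two components.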

#5

Hi experts,

I am unable to understand which variables to pick after the PCA analysis. Please let me know, based on the output below.

Retained PCs

```
        PC1        PC2
1 -2.303540 -0.4748260
2 -2.151310  0.6482903
3 -2.461341  0.3463921
```

``````                PC1         PC2