Hi All,

I am working on a problem with 500 numerical variables and 10,000 observations. My usual approach is to start with hypothesis generation and then move on to univariate, bivariate, and multivariate analysis, but that is not practical here with 500 variables.

I read about dimensionality reduction as a way to deal with this situation, and PCA is the method that was suggested. Are there any disadvantages to using PCA, and is it easy to communicate the results to an end user as "these are the most significant variables"?

Please suggest good techniques for dimensionality reduction, and explain which method is effective in which situation.
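
To make the question concrete, here is a minimal sketch of what I understand PCA to involve, written with NumPy only. It is not my actual data: the dataset is synthetic (1,000 rows, 20 variables driven by 3 hidden factors), and the function name `pca_summary` and the 90% variance threshold are my own assumptions for illustration.

```python
import numpy as np

def pca_summary(X, var_threshold=0.90):
    """PCA via SVD on standardized data (illustrative helper, not a library API).

    Returns the number of components needed to reach var_threshold,
    the explained-variance ratio of every component, and the loadings
    (one row per component, one column per original variable).
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # avoid dividing by zero for constant columns
    Z = (X - X.mean(axis=0)) / std           # standardize so no variable dominates by scale
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # share of total variance per component
    k = int(np.searchsorted(np.cumsum(explained), var_threshold) + 1)
    return k, explained, Vt

# Synthetic stand-in for the real data (assumption): 3 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.1 * rng.normal(size=(1000, 20))

k, explained, components = pca_summary(X)

# Variables with the largest absolute loading on the first component.
top_vars = np.argsort(-np.abs(components[0]))[:5]
print("components needed:", k)
print("largest loadings on PC1 come from variables:", top_vars)
```

What worries me about communicating this is that every component is a weighted mix of all the original variables, so the output is a set of loadings rather than a short list of "significant variables".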

Regards,

Imran