I came across the following lines while studying Dimensionality Reduction:

Dimensionality reduction can also be seen as the process of deriving a set of degrees of freedom which can be used to reproduce most of the variability of a data set. Consider a set of images produced by the rotation of a face through different angles. Clearly only one degree of freedom is being altered, and thus the images lie along a continuous one-dimensional curve through image space.

Can somebody please explain in simple terms what this means?

The purpose of any “data analysis” is to derive meaningful information from the data. One way to extract that information is to study the variability among the data points. The more variability there is, the more carefully you have to study or explore the data set so that you capture all of its meaning.

Now, when your data set contains a lot of variables or attributes, it's difficult to analyze the data across so many dimensions. Dimensionality reduction techniques focus on finding the most important attributes of your data set (or a small number of new attributes derived from the original ones), so that you can ignore the rest and use only these for data analysis. These “most important” attributes are chosen so as to capture the “maximum variability” of the data set. In this way, you don’t lose the essential information that can be obtained from your data set.
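To make “capturing the maximum variability” concrete, here is a minimal sketch using principal component analysis (PCA) computed with NumPy. PCA is only one common technique and is my choice for illustration; it derives new directions rather than picking existing attributes. The data is synthetic, and the names `hidden`, `mixing`, and `explained_variance_ratio` are made up for this example.

```python
import numpy as np

def explained_variance_ratio(data):
    """Fraction of the total variance captured by each principal direction."""
    centered = data - data.mean(axis=0)
    # Squared singular values of the centered data are proportional to the
    # variance along each principal direction.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(0)

# Assumption: one hidden factor drives all five measured attributes.
hidden = rng.normal(size=(500, 1))
mixing = np.array([[2.0, -1.0, 0.5, 3.0, 1.5]])
data = hidden @ mixing + 0.1 * rng.normal(size=(500, 5))  # plus a little noise

ratio = explained_variance_ratio(data)
print(ratio.round(3))  # one entry per principal direction, largest first
```

Although the data set has five attributes, almost all of its variability lies along the first direction, so a single derived attribute would be enough to describe it.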

The statement you quoted is quite difficult to comprehend. But as I understand it, it is trying to convey the same message, i.e., while doing data analysis, select those attributes that capture the maximum variability of your data set. In the example given, every image differs from the others only in the angle through which the face is rotated, so this single variable (the “rotation of the face”) captures all the necessary information about the data and is sufficient for doing data analysis.
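The face example can be imitated with a toy sketch. It assumes a hypothetical `render_face` function that stands in for a real image: a row of 32 pixels with a bright blob whose position depends only on the rotation angle. Nothing here comes from the quoted text; it only illustrates how points with many coordinates can still be controlled by a single number.

```python
import math

def render_face(angle_deg, n_pixels=32):
    """Toy stand-in for a face image: a blob positioned by the angle alone."""
    center = (angle_deg / 180.0) * (n_pixels - 1)
    return [math.exp(-((p - center) ** 2) / 8.0) for p in range(n_pixels)]

def distance(a, b):
    """Euclidean distance between two images in pixel space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

angles = list(range(0, 181, 5))
images = [render_face(a) for a in angles]  # each image is a point in 32-D space

# Neighbouring angles give nearby images, so the points trace a continuous curve.
steps = [distance(images[i], images[i + 1]) for i in range(len(images) - 1)]
end_to_end = distance(images[0], images[-1])

def estimate_angle(image):
    """Recover the single degree of freedom via the closest known image."""
    best = min(range(len(images)), key=lambda i: distance(images[i], image))
    return angles[best]

print(len(images[0]), max(steps) < end_to_end)
print(estimate_angle(render_face(47)))
```

Each image has 32 coordinates, yet knowing the one angle is enough to reproduce it, and an unseen image can be placed on the curve by finding its nearest neighbour. That is the sense in which the images lie on a one-dimensional curve through image space.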

Thanks for the reply @gauravkantgoel.