What is the difference between parametric and non-parametric regression?




What is the difference between parametric and non-parametric regression? What does it have to do with the sample size of our data?



Hi Pravin,

Machine learning algorithms can be classified as parametric or non-parametric.

Parametric: A parametric algorithm has a fixed number of parameters and is computationally faster, but it makes stronger assumptions about the data. These algorithms do well if the assumptions turn out to be correct and badly if they are wrong.

Non-parametric: A non-parametric algorithm uses a flexible number of parameters, and the number of parameters grows as it learns from more data. It is computationally slower but makes fewer assumptions about the data.



Parametric methods assume a form for the model (for example, in linear regression we assume that the regressand depends linearly on the regressors, with each regressor contributing an effect of beta on the regressand). They are simpler and more interpretable, and because the model form is fixed they typically need fewer data points. They work well if the assumption is correct, i.e., if the actual relationship is similar to the one we assumed, so the bias is low.
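To make this concrete, here is a minimal sketch (with made-up synthetic data) of a parametric fit: we assume y = beta1*x + beta0 and estimate just two numbers, so even a small sample can pin them down when the linearity assumption holds.

```python
import numpy as np

# Hypothetical tiny dataset: the true relationship is y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=20)            # only 20 data points
y = 2 * x + 1 + rng.normal(0, 0.5, 20)

# Parametric fit: fixed model form, only two parameters to learn.
X = np.column_stack([x, np.ones_like(x)])  # design matrix [x, 1]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)  # estimates for [beta1, beta0], close to [2, 1]
```

Because the assumed form matches the true relationship, 20 points are enough for a low-bias fit; if the true relationship were non-linear, no amount of data would fix the model's bias.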

Non-parametric methods do not assume a particular form for the relationship and instead learn it from the data. For example, Random Forests grow more complex as they see more data; they are slower to train and their interpretability is lower than that of parametric methods. Because they must learn the shape of the relationship itself, they typically require more data than parametric methods. This is the connection to sample size: with few observations, the stronger assumptions of a parametric model act as a useful constraint, while with plenty of data a non-parametric model can recover relationships a parametric model would miss.