How do the number of observations and the number of predictors affect the flexibility of a model?

#1

I am currently studying the bias-variance tradeoff of a model.

In statistics and machine learning, the **bias–variance tradeoff** (or dilemma) is the problem of simultaneously minimizing two sources of error that prevent supervised learning algorithms from generalizing beyond their training set.

As the flexibility of a function increases, its variance increases and its bias decreases.

But I am not able to understand how the flexibility of the model is affected by the number of predictors and the number of observations.

#2

Hi Harry,

I cannot completely grasp your point. I don’t know exactly what you mean by “flexibility”.

Let me share my understanding of bias and variance briefly. First you need to understand the concept of an “average” model. When you are solving a problem, you have one particular training set, but there could be many other such training sets. When analyzing a model’s complexity, we look at its average performance over all such training sets. Just keep this “average” model in mind.

Bias:

• How close the “average” model gets to the true/actual model.
• As the model complexity increases, an individual model might overfit, but on average the prediction will get closer to the true model.
• Conclusion: higher complexity -> lower bias

Variance:

• How much the predictions of the different possible models vary for the same data point.
• As complexity increases, the outcomes of the same model fitted on different training sets vary more, resulting in high variance.
• Conclusion: higher complexity -> higher variance
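
In symbols, the two definitions above are the first two terms of the usual decomposition of the expected squared error at a point. The notation is assumed here for illustration and is not from the question: f is the true function, \hat f_D is the model fitted on training set D, and y = f(x) + \varepsilon with noise of mean zero and variance \sigma^2, independent of D.

$$
\mathbb{E}_{D,\varepsilon}\big[(y - \hat f_D(x))^2\big]
= \underbrace{\big(\mathbb{E}_D[\hat f_D(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\big[(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)])^2\big]}_{\text{variance}}
+ \sigma^2
$$

Here \mathbb{E}_D[\hat f_D(x)] plays the role of the “average” model, and the last term is noise that no model can remove.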

Now it is easy to comprehend, since the model complexity:

• Increases with an increase in #predictors: high #predictors would mean low bias but high variance (for same #observations)
• Effectively reduces, relative to the amount of data, with an increase in #observations: high #observations would mean low variance, while the bias of a given model stays roughly the same (for same #predictors)
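
If you want to check these two points yourself, a small simulation is one way to do it. The sketch below is only an illustration under assumptions of my own: the function name `bias_variance`, the Gaussian predictors, and a true linear model in which only the first three of 20 predictors matter are all hypothetical choices. It draws many training sets (the idea behind the “average” model), fits ordinary least squares on the first `n_pred` predictors, and estimates bias² and variance on a fixed set of test points.

```python
import numpy as np

def bias_variance(n_obs, n_pred, n_reps=200, noise_sd=1.0, seed=0):
    """Monte Carlo estimate of (bias^2, variance) for least squares
    fitted on the first n_pred predictors with n_obs observations."""
    rng = np.random.default_rng(seed)
    p_total = 20
    beta = np.zeros(p_total)
    beta[:3] = 1.0  # assumed truth: only the first three predictors matter

    # Fixed test points, identical for every call, so results are comparable.
    X_test = np.random.default_rng(123).normal(size=(200, p_total))
    f_test = X_test @ beta

    preds = np.empty((n_reps, X_test.shape[0]))
    for r in range(n_reps):
        # One possible training set.
        X = rng.normal(size=(n_obs, p_total))
        y = X @ beta + rng.normal(scale=noise_sd, size=n_obs)
        # Fit using only the first n_pred predictors.
        coef, *_ = np.linalg.lstsq(X[:, :n_pred], y, rcond=None)
        preds[r] = X_test[:, :n_pred] @ coef

    avg_pred = preds.mean(axis=0)                # the "average" model
    bias_sq = np.mean((avg_pred - f_test) ** 2)  # its distance from the truth
    variance = np.mean(preds.var(axis=0))        # spread across training sets
    return bias_sq, variance

for n_obs, n_pred in [(30, 1), (30, 20), (300, 1), (300, 20)]:
    b, v = bias_variance(n_obs, n_pred)
    print(f"n_obs={n_obs:4d}  n_pred={n_pred:2d}  bias^2={b:.3f}  variance={v:.3f}")
```

Comparing the rows with the same `n_obs` shows the effect of #predictors, and comparing the rows with the same `n_pred` shows the effect of #observations. The numbers depend on the assumed setup, so treat them as a qualitative check rather than a general result.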

Hope this makes sense.

Cheers,
Aarshay