How to avoid overfitting in Prediction?



I have recently been looking at methods to improve the predictive power of a model, and people have suggested various approaches:

  • Focus on hypothesis generation
  • Data cleaning and exploration (finding relationships between features)
  • Choosing the right method (machine learning technique)
  • Cross-validation of the model

While reading about these methods, I came across the terms overfitting and underfitting. My understanding is that an underfit model predicts poorly even on the training data (its predictive power is low), while an overfit model fits the training data well but does not generalize to unseen data.
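To make the distinction concrete, here is a small illustrative sketch (not from the original post) that compares training accuracy against cross-validated accuracy. It assumes scikit-learn and uses its built-in digits dataset with a deliberately poor `gamma` value; a large gap between the two scores signals overfitting, while low scores on both signal underfitting.

```python
# Illustrative sketch: spotting overfitting by comparing training accuracy
# with cross-validated accuracy (scikit-learn assumed).
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# A very large gamma makes the RBF kernel extremely flexible,
# which tends to overfit the training data.
model = SVC(gamma=0.1)
model.fit(X, y)

train_acc = model.score(X, y)                       # accuracy on data it has seen
cv_acc = cross_val_score(model, X, y, cv=5).mean()  # accuracy on held-out folds

# A large gap between the two scores is the classic sign of overfitting.
print(f"train accuracy: {train_acc:.3f}, cross-val accuracy: {cv_acc:.3f}")
```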

I would appreciate your help with methods to avoid overfitting and with the metrics used to detect it.



Hi @hinduja1234

One technique you can use to avoid underfitting or overfitting is to tune the hyperparameters of the machine learning algorithm you are using. For example, suppose you are building a model with a support vector machine (SVM); an SVM has the following parameters:

  1. class_weight
  2. coef0
  3. decision_function_shape
  4. degree
  5. gamma
  6. kernel

and many other parameters, which you can find here.

You can find optimal values for these parameters using grid search cross-validation or randomized search cross-validation; either will help you choose parameter values that avoid underfitting or overfitting. You can find more details on how to implement these techniques here.

I hope this helps :slight_smile: