Check Overfitting of Model & Validate the Model



I want to know how to find the accuracy of a classification model on the train data. I want to check whether it is overfitting by comparing its accuracy on the train and test sets. For the test set we compare actual vs. predicted values to get the accuracy, so what should be compared for the train set? I would also like to know whether the approach is different for regression or remains the same. If possible, please provide Python code as well.


Hi @swarup17

Overfitting is when your model works extremely well on the train dataset but doesn’t show similar performance on the test dataset.

If you want to check the performance of your model using the train dataset, you can follow the steps below:

  1. Read the train file.
  2. Split the train file into train and validation sets.
  3. Fit the model on the train set.
  4. Check the score on the validation set, and compare it with the score on the train set.
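The steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive recipe: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for your train file, and picks an unrestricted `DecisionTreeClassifier` only because it tends to overfit and so makes the gap easy to see. For the train score you compare the actual `y_train` with the model's predictions on `X_train`, the same way you compare actual vs. predicted on the held-out set.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Step 1: synthetic stand-in for the labelled train file (assumption).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=42
)

# Step 2: split the labelled data into train and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Step 3: fit the model on the train set only.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Step 4: score on both sets (actual vs. predicted in each case).
train_acc = accuracy_score(y_train, model.predict(X_train))
val_acc = accuracy_score(y_val, model.predict(X_val))

print(f"Train accuracy:      {train_acc:.3f}")
print(f"Validation accuracy: {val_acc:.3f}")
print(f"Gap (train - val):   {train_acc - val_acc:.3f}")
```

A train accuracy that is much higher than the validation accuracy suggests overfitting. How large a gap counts as "too large" depends on the problem, so treat it as a signal to investigate rather than a fixed threshold.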

Then you can make predictions on the test file, make a submission, and check your score.

Create a validation set and check the model's performance on it.

We do not have the actual target values for the test file. If you do have the target variable for your test data, you are probably applying a train-test split to your complete train dataset. After the split, you can name the two sets ‘train and test’ or ‘train and validation’.

The approach is the same for regression; only the evaluation metric is different (for example, RMSE or R² instead of accuracy).
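For regression, the same split-fit-compare pattern might look like the sketch below. It again assumes scikit-learn and a synthetic dataset (`make_regression`), and uses an unrestricted `DecisionTreeRegressor` purely for illustration. RMSE and R² are shown as example metrics; your problem may call for a different one.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a labelled regression dataset (assumption).
X, y = make_regression(
    n_samples=1000, n_features=20, n_informative=5, noise=10.0, random_state=42
)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the train set only.
model = DecisionTreeRegressor(random_state=42)
model.fit(X_train, y_train)

pred_train = model.predict(X_train)
pred_val = model.predict(X_val)

# RMSE is the square root of the mean squared error (lower is better).
train_rmse = mean_squared_error(y_train, pred_train) ** 0.5
val_rmse = mean_squared_error(y_val, pred_val) ** 0.5

# R² is higher-is-better, with 1.0 meaning a perfect fit.
train_r2 = r2_score(y_train, pred_train)
val_r2 = r2_score(y_val, pred_val)

print(f"Train RMSE: {train_rmse:.2f}   Validation RMSE: {val_rmse:.2f}")
print(f"Train R²:   {train_r2:.3f}   Validation R²:   {val_r2:.3f}")
```

Here the sign of overfitting is a train error that is much lower than the validation error (or a train R² much higher than the validation R²).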