Model selection in a machine learning problem



Hi all,
I am new to the machine learning world, so my question may be very basic; please forgive me for that.
To solve a particular problem, we can fit several different models.
For example, I have employee data containing information such as Position, Level, and Salary.
I fitted it using Polynomial Regression as well as Support Vector Regression, and other models could be used too. How do I decide which model is better for this kind of problem?


Hi Arvind,
This is a very common question among modellers. Let me try to break it down, and hopefully it will become clear.
First, from your specific example, I believe you are trying to predict salary, which is a continuous variable, and that is why you applied polynomial regression and Support Vector Regression (both are regression models, not classifiers).
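Since this is a regression problem, a common evaluation metric is RMSE (root-mean-squared error) or MSE (mean squared error). After fitting the model, you compute RMSE by taking the differences between the predicted and actual values, squaring them, averaging, and taking the square root. You can compute it on the training dataset, but the value on data the model has not seen (a held-out test set) is a more reliable guide, since training error tends to favour more flexible models.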
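To make the RMSE calculation concrete, here is a minimal sketch in Python with NumPy. The helper name `rmse` and the salary figures are made up purely for illustration; they are not from your dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Square the errors, average them, then take the square root
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical salaries, for illustration only
actual = [50000, 60000, 80000]
predicted = [52000, 58000, 83000]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, and the result is in the same units as the target (here, salary).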
The whole point of modelling is to reduce error, but from a single model's results you cannot tell whether its RMSE is the lowest achievable. In your case, you can compare the RMSE of the two models and go with whichever is lower. If the problem were a classification one, the evaluation metrics could be the confusion matrix or ROC (AUC). However, RMSE is not the only criterion for picking the best model.
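The comparison described above could be sketched with scikit-learn roughly as follows. This is only an illustration under assumptions: the data is synthetic (a stand-in for Level and Salary), the polynomial degree and SVR settings are arbitrary choices, and 5-fold cross-validation is used so that each model is scored on data it was not fitted on.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the employee data (assumption, not real data):
# one feature (Level) and a noisy, roughly quadratic Salary
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(60, 1))
y = 40000 + 1500 * X[:, 0] ** 2 + rng.normal(0, 3000, size=60)

models = {
    "polynomial (degree 2)": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
    # SVR is sensitive to scale, so both the feature and the target are standardized
    "SVR (RBF kernel)": TransformedTargetRegressor(
        regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
        transformer=StandardScaler(),
    ),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # scikit-learn returns negated RMSE so that "higher is better"; flip the sign back
    scores = cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    results[name] = -scores.mean()
    print(f"{name}: mean CV RMSE = {results[name]:.1f}")

best = min(results, key=results.get)
print("Lowest RMSE:", best)
```

Which model comes out ahead depends on the data and the hyperparameters, so treat the result on this toy data as a demonstration of the procedure rather than a verdict on either model.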

I hope this answers your question.


Hi @arvindpednekar2013,

Here is a nice article summarizing 7 important model evaluation metrics. You need to compare different models using these metrics based on whether the problem is regression or classification. I hope this helps. :smile: