Model accuracy without overfitting data

machine_learning

#1

Hi All,

I was going through machine learning topics and came across the term “over-fitting of data”. So the question I want to ask is: is it possible for any model to have 100% accuracy without over-fitting the data?


#2

Yes! It is possible.

Explanation using an example:
A model can give 100% accuracy without overfitting if there is no “irreducible error” in your data.

Irreducible error: sometimes your data doesn’t capture all the information that it should. Imagine you are predicting the occurrence of lightning or an earthquake. You may capture features like temperature, pressure, etc., but they are not sufficient to predict lightning. So there is some error caused by the missing features. Your model tries to fit the training data (which contains this error), and when you stretch your model too much during training to fit these erroneous examples to their labels, it leads to overfitting.
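To make that concrete, here is a rough sketch with scikit-learn on a made-up synthetic dataset (the data, the 15% noise rate and the choice of a decision tree are just illustrative assumptions, and the exact scores will vary):

```python
# Minimal sketch: the same model on data with and without irreducible error.
# Assumes numpy and scikit-learn are installed; the dataset is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.uniform(size=(4000, 2))            # two features that fully determine the label
y_clean = (X[:, 0] > 0.5).astype(int)      # no irreducible error: label is a function of X

# Case 1: no irreducible error -- train and test accuracy can both be ~100%.
X_tr, X_te, y_tr, y_te = train_test_split(X, y_clean, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("clean labels -> train:", clf.score(X_tr, y_tr), " test:", clf.score(X_te, y_te))

# Case 2: irreducible error -- flip 15% of labels to mimic information missing from X.
y_noisy = y_clean.copy()
flip = rng.random(len(y_noisy)) < 0.15
y_noisy[flip] = 1 - y_noisy[flip]
X_tr, X_te, y_tr, y_te = train_test_split(X, y_noisy, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("noisy labels -> train:", clf.score(X_tr, y_tr), " test:", clf.score(X_te, y_te))
```

With the clean labels both scores should be at (or extremely close to) 100%, while with the noisy labels the unrestricted tree still memorises the training set but loses accuracy on held-out data, which is exactly the overfitting described above.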

I hope this helps…


#3

@sriram.pillutla Thank you. But I am wondering: like you said, if a model tries to predict with too few features then there will be error, but if a model tries to fit on all of the available features, isn’t it generally going to over-fit the data? One more question: is there any way to reduce the “irreducible error”?


#4

Pray to God :pray: :laughing:

Jokes aside, I will try to answer your question, but in a more abstract way.

[quote=“ajas.bakran, post:1, topic:13387”]
is it possible for any model to have 100% accuracy without over-fitting the data
[/quote]

Yes, I think it can be. But that would be on the training data. I would never expect my model to have 100% accuracy on test data; if it did, I would think that something is wrong.
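To illustrate that gap, here is a rough sketch (again just a toy example with scikit-learn and synthetic data, so treat the numbers as indicative only): as you increase the capacity of a model, training accuracy can be pushed all the way to 100% while validation accuracy stalls below it.

```python
# Rough sketch: grow model capacity and watch train accuracy reach 100%
# while validation accuracy does not. Assumes numpy and scikit-learn; data is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
# A noisy target: the features explain most, but not all, of the label.
y = ((X[:, 0] + X[:, 1] + 0.8 * rng.normal(size=2000)) > 0).astype(int)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=1)
for depth in (2, 5, 10, None):        # None lets the tree grow until it memorises the data
    clf = DecisionTreeClassifier(max_depth=depth, random_state=1).fit(X_tr, y_tr)
    print(f"max_depth={depth}: train={clf.score(X_tr, y_tr):.3f} "
          f"val={clf.score(X_val, y_val):.3f}")
```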

Do you know what the industrial standards are when evaluating a model? They say you should have 99% accuracy on validation for the model to be industrially viable. Why not 100%? Because everyone knows there is always a lurking variable to account for, and if you aren’t accounting for it, you are probably fooling yourself.

Even we humans are not perfect, so why should we expect our models to be perfect?


#5

Yes, it can be possible if there is no noise in the test data and we get 100% accuracy on the validation set, which is representative of the test data.


#6

Thanks for your reply…

If you think it can be achieved on the training set, isn’t it going to overfit the data on the training set and fail to generalize? :confused:

Everyone knows except me; I am currently a beginner in this field, thanks for the fact. You are sounding judgmental.

I asked the question out of curiosity.


#7

That may or may not be true. I once trained a model which attained 99.9% accuracy on the train set and 99.6% on the test set (MNIST dataset). It basically depends on the distributions of the train and test datasets and how well they match.

[quote=“ajas.bakran, post:6, topic:13387”]
Everyone knows except me; I am currently a beginner in this field, thanks for the fact. You are sounding judgmental.
[/quote]

I meant “everyone” as in everyone who had the power to decide that the threshold should be 99%. I’m sorry if I offended you in any way.

[quote=“ajas.bakran, post:6, topic:13387”]
I asked the question out of curiosity.
[/quote]
Yes, I know. This is a discussion portal, right? I’m allowed to have freedom of speech. :wink: