Machine Learning Project on Imbalanced data - Test dataset query

machine_learning
data_science

#1

I am not sure I understand the dataset properly. Here we are trying to classify the test data into records with income above 50k and records with income at or below it. But that column (income_level) is already given in the test dataset. Isn't the test dataset supposed to contain only the other feature values, so that we predict this specific column (the income level, in our case)? I was under the impression that only the train data would contain this column, and that for the test data we would predict its value from the other features.


#2

Hi @pudkeaayush,

Yes, the test data has the target given. This is because the dataset is for your practice: the labels let you evaluate how your model scores on unseen data.
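
In case it helps, here is a rough sketch of how the labeled test set could be used. The idea is to set the income_level column aside, predict from the remaining features only, and then compare the predictions against the held-back labels. Everything except the income_level column name is an assumption made for illustration: the rows, the age and weeks_worked features, and toy_model (a stand-in for whatever model you actually train) are all hypothetical. Since the data is imbalanced, the sketch reports precision and recall for the minority class alongside accuracy.

```python
TARGET = "income_level"  # column name taken from the thread

# Hypothetical rows standing in for the labeled test file (1 = above 50k).
test_rows = [
    {"age": 25, "weeks_worked": 20, "income_level": 0},
    {"age": 45, "weeks_worked": 52, "income_level": 1},
    {"age": 38, "weeks_worked": 52, "income_level": 0},
    {"age": 52, "weeks_worked": 50, "income_level": 0},
    {"age": 30, "weeks_worked": 40, "income_level": 0},
    {"age": 41, "weeks_worked": 30, "income_level": 1},
    {"age": 60, "weeks_worked": 10, "income_level": 0},
    {"age": 23, "weeks_worked": 52, "income_level": 0},
]


def split_features_target(rows, target=TARGET):
    """Separate the target column so the model never sees it."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels


def toy_model(row):
    """Hypothetical placeholder for a trained classifier."""
    return 1 if row["age"] >= 40 and row["weeks_worked"] >= 48 else 0


def evaluate(y_true, y_pred, positive=1):
    """Accuracy plus precision and recall for the positive (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


features, labels = split_features_target(test_rows)
predictions = [toy_model(row) for row in features]
metrics = evaluate(labels, predictions)
print(metrics)
```

On these toy rows the accuracy is 0.75 while precision and recall for the above-50k class are both 0.5, which is why accuracy alone can look better than the model really is on imbalanced data.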


#3

Got it. Thanks!