Should target be missing in the test file?

loan_prediction
target

#1

Hey guys,

I am new to data science and very excited to start learning! I took on this loan prediction challenge. I cleaned the data, applied a feature extraction technique (PCA), and implemented a KNN model with sklearn's KNeighborsClassifier. Now, to measure my prediction accuracy, I need the Loan_Status column for the test file. Is there any way of finding that somewhere?

Thank you very much


Response variable column missing from test file
#2

Hello @atybzz

Actually, to check your prediction accuracy, you can upload your submission file and see your score and leaderboard position on the DataHack portal itself.

Cheers!
Shubham :slight_smile:


#3

In other words, you predict the target values from the independent variables and submit those predictions. You could even fill the target column using a random generator and upload that. You never know: by the very nature of randomness, it could be 100% accurate :slight_smile:


#4

Also, you can make a constant submission (all 0’s or all 1’s) to get a baseline score for the model and the dataset. The practice of getting a baseline score, and the different (better) ways you can achieve it, is explained nicely here:

https://machinelearningmastery.com/how-to-get-baseline-results-and-why-they-matter/
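
As a rough illustration of the constant-submission idea, here is a minimal sketch. It picks the majority class from the training labels, reports how accurate "always predict that class" would be on the training data, and builds a constant submission for the test IDs. The column name `Loan_ID`, the `Y`/`N` label coding, and the toy data are assumptions for illustration, not taken from the actual competition files.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(train_labels)
    label, hits = counts.most_common(1)[0]
    return label, hits / len(train_labels)


def constant_submission(test_ids, label):
    """Build one submission row per test ID, all with the same label."""
    # "Loan_ID" is an assumed ID column name; "Loan_Status" is the target.
    return [{"Loan_ID": i, "Loan_Status": label} for i in test_ids]


# Made-up toy data, only to show the mechanics.
train_labels = ["Y", "Y", "N", "Y", "N", "Y"]
test_ids = ["LP1", "LP2", "LP3"]

label, acc = majority_baseline(train_labels)
submission = constant_submission(test_ids, label)
print(label, round(acc, 3))
```

Any real model should beat this number; if it does not, something is probably wrong in the pipeline.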

Regards,
Sanad


#5

I’m also facing the same issue. Can anyone please suggest what the problem is?


#6

Hi @akshay_siras,

You first train your model on the training file and then use that model to predict the target for the test file.
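
The train-then-predict flow can be sketched as below. To keep the example dependency-free, it uses a tiny hand-written k-nearest-neighbours vote as a stand-in for sklearn's KNeighborsClassifier mentioned earlier in the thread; the feature values, the test IDs, and the `Y`/`N` labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for `query` by majority vote among the k nearest training rows."""
    # Sort training rows by Euclidean distance to the query point.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


# "Training file": features plus the known target (toy values).
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["N", "N", "N", "Y", "Y", "Y"]

# "Test file": IDs and features only, no target column.
test_rows = {"T1": (1.1, 1.0), "T2": (5.1, 5.0)}

# Predict the missing target for every test ID.
predictions = {
    row_id: knn_predict(train_X, train_y, features)
    for row_id, features in test_rows.items()
}
print(predictions)
```

The test file never needs a target column here: the labels come only from the training data, and the predictions are what you submit.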


#8

Thanks @PulkitS. I’m not able to submit the file; I’m getting this error: [Please check with the test data set, all IDs are not available]
Can anyone help?


#9

Hi @akshay_siras,

Make sure that your submission file contains all the IDs present in the test file.
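
A quick way to check this before uploading is to compare the two ID lists directly. The sketch below is only an illustration: the helper name `check_ids` and the sample IDs are hypothetical, and in practice you would read the IDs from the ID column of your test and submission files.

```python
from collections import Counter


def check_ids(test_ids, submission_ids):
    """Compare submission IDs against test IDs and report any mismatches."""
    test_set, sub_set = set(test_ids), set(submission_ids)
    counts = Counter(submission_ids)
    return {
        "missing": sorted(test_set - sub_set),      # in test, not submitted
        "unexpected": sorted(sub_set - test_set),   # submitted, not in test
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


# Hypothetical IDs: one test row was lost and one row was duplicated.
test_ids = ["LP1", "LP2", "LP3", "LP4"]
submission_ids = ["LP1", "LP2", "LP2", "LP4"]

report = check_ids(test_ids, submission_ids)
print(report)
```

One possible cause of missing IDs is dropping test rows with missing values during cleaning; filling those values instead keeps every ID in the submission.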