Loan Prediction Problem Dataset


In a regression task in mlr, the test dataset CSV doesn’t have the continuous target feature, as it is only available in the train set, and the predict function throws an "undefined column" error.
I tried adding a column with the same name filled with zeros (not sure whether that is allowed). The error in the predict function went away, but the performance method does not compute the measures rsq and rmse.
May I please know how to do this?


Hi @apremgeorge,

Drop the target variable column from the train dataset before you fit the model. Then train and test will have the same number of columns.
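
One way to read this advice, sketched in pandas rather than mlr (the later code in this thread is Python): keep the target as a separate series and fit on the remaining columns, so the feature columns match the test set. The column names and values below are made up for illustration. Note that measures such as rsq and rmse compare predictions with true values, so they are only meaningful on data where the true target is known; a zero-filled placeholder column does not change that.

```python
import pandas as pd

# Made-up data for illustration; "target" is a hypothetical column name.
train = pd.DataFrame({
    "income": [30, 45, 60],
    "term": [12, 24, 36],
    "target": [1.5, 2.0, 3.2],
})
test = pd.DataFrame({"income": [50, 38], "term": [24, 12]})

# Keep the target separately and use the remaining columns as features.
y_train = train["target"]
X_train = train.drop(columns=["target"])

# The feature columns now line up with the test set.
print(list(X_train.columns) == list(test.columns))
```

To get rsq or rmse, one option is to hold out part of the train set, where the target is available, and score the predictions on that part.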


But both the test and train datasets are empty and do not contain any data. How do I open them after downloading the .csv? Please tell me.


Hi @kotrappa93

You can download the train and test files from the link below. You can also read the CSV file in Python using the following command: df = pd.read_csv("file_name.csv")
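
A minimal sketch of that `pd.read_csv` call, with a quick check that the file is not empty. The file name in the post is a placeholder, so this example reads a tiny in-memory CSV instead; its rows and the column names `Loan_ID`, `Gender` and `LoanAmount` are assumptions for illustration only.

```python
import io
import pandas as pd

# Stand-in for the downloaded file; these rows are invented for illustration.
csv_text = "Loan_ID,Gender,LoanAmount\nLP001,Male,120\nLP002,Female,\n"

# With the real file this would be pd.read_csv("file_name.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)         # (rows, columns); zero rows suggests a failed download
print(df.head())        # first few rows
print(df.isna().sum())  # missing values per column
```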


Congratulations…you may know a lot of data science but you know nothing about setting up a website or giving instructions…how come you cannot just give a URL for downloading the data set for the problem?..I CANNOT FIND IT…this is incredible…what good is data science if you cannot give a simple instruction???


@ropardo, the UI of the datahack platform is similar to that of any other platform for online hackathons, and it is pretty simple. I have given the link with every answer in the above conversation. For your reference, here you go again:

Let me just put the process into steps for easier understanding.

Step 1: Login to datahack. Link for datahack platform:

After you LOGIN, it does not mean you have REGISTERED in the problem statement.

Step 2: Register for the loan prediction practice problem. Link for loan prediction practice problem :

Once you register, you will be able to read the problem statement. Scroll down to the ‘Data’ section of the page and you’ll see the train, test and submission files.

Step 3: Download the files.

Step 4: If you still don’t get it, DM me your email id and I’ll send the datasets to you. But you won’t be able to make submissions or check your rank on the leaderboard.

Discussions for article "A Complete Tutorial to Learn Data Science with Python from Scratch"

Hi Aishwarya,

Please forward the datasets to my email, since I have registered but am not able to download the dataset.



Hi Aishwarya, I missed the registration date for the above hackathon by 1 day, so I am not able to register and consequently cannot download the test data. Can you please either mail the material to my mail id ‘’ or upload it to another URL?

Thanks in advance


Can you please mail me the data set? In my view, it would be best to make it freely available on the problem page. It should be easy to access, otherwise it defeats the whole purpose of the exercise. You can’t be mailing it to everyone who needs it. Thank you.


I can’t access the data despite registering for the problem. Nothing happens when I click on ‘data’. Can you please have a look?


okay… I’ll convey this to my tech team. For now, can you scroll down to the data section and see if you find three csv files to download? Here is a screenshot for your reference -

I’m sorry for the inconvenience.


Hi Aiswarya,
I couldn’t find the login for datahack.
I would like to get the Loan Prediction Problem data set.
My email id is
Thanking you.


Hi @chithrakishore,

Refer to this discussion thread:

You have to login, then register. Here is the screenshot for where you will find the dataset once you register.

Mailing the dataset is certainly easier for me, but if you do not register, you won’t be able to make a submission or see your score against other participants. So please try to register for the competition.


Hi Kunal,

I registered and can download the test and train files. Are they supposed to be used for the analysis?


Hi @poochau,

This is a practice problem for you to try different ML techniques and generate predictions.


Hello Kunal,
I am following AV’s 2019 DS path and I am facing a problem with a certain topic in pandas.
For the Loan Prediction dataset,
data['Gender'].fillna(mode(data['Gender']).mode[0], inplace=True) returns an error.


Hi Aishwarya, I cannot download the Test file. Nothing happens when I click on the link. I can download the Train file. Please help by sending the file to my email. Thanks.


Please try again. You should be able to download the Test file now.



Try the code below:

 data['Gender'].fillna(data['Gender'].mode()[0], inplace=True) 
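
A small sketch of what this line does, on made-up values. `Series.mode()` is the pandas method; it returns a Series of the most frequent values and skips missing values by default, so `[0]` picks the first one. The earlier call used a standalone `mode(...)` function, presumably imported from another library (an assumption, since the import is not shown in the post), which may not handle text columns with missing values.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration; the real column comes from the train file.
data = pd.DataFrame({"Gender": ["Male", "Female", "Male", np.nan, "Male"]})

# Most frequent non-missing value in the column.
most_common = data["Gender"].mode()[0]

# Assigning the result back is an alternative to inplace=True on a column selection.
data["Gender"] = data["Gender"].fillna(most_common)

print(most_common)
print(data["Gender"].isna().sum())
```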


` train = train.drop(['Income_bin', 'Coapplicant_Income_bin', 'LoanAmount_bin', 'Total_Income_bin', 'Total_Income'], axis=1)

train['Dependents'].replace('3+', 3, inplace=True)

test['Dependents'].replace('3+', 3, inplace=True)

train['Loan_Status'].replace('N', 0, inplace=True)

train['Loan_Status'].replace('Y', 1, inplace=True) `

This code doesn’t work.

On my side, I got:

KeyError: "labels ['Income_bin' 'Coapplicant_Income_bin' 'LoanAmount_bin' 'Total_Income_bin'\n 'Total_Income'] not contained in axis"
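
The KeyError says that the listed columns do not exist in `train` at the point where `drop` is called. One possible cause, assuming these `*_bin` columns are created in earlier steps that were skipped, or that the drop had already been run once, is that there is nothing left to drop. A minimal sketch on a made-up frame that reports which columns are missing and drops only the ones that are present:

```python
import pandas as pd

# Made-up frame for illustration; only one of the five columns exists here.
train = pd.DataFrame({
    "Dependents": ["0", "3+", "1"],
    "Loan_Status": ["Y", "N", "Y"],
    "Total_Income": [5000, 7200, 3100],
})

to_drop = ["Income_bin", "Coapplicant_Income_bin", "LoanAmount_bin",
           "Total_Income_bin", "Total_Income"]

# Check which of the requested columns are absent before dropping.
missing = [c for c in to_drop if c not in train.columns]
print("not in train:", missing)

# errors="ignore" skips labels that are not in the frame instead of raising KeyError.
train = train.drop(columns=to_drop, errors="ignore")

print(train.columns.tolist())
```

If `missing` lists every column, the steps that create them probably have not been run in the current session.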