Source of Loan Prediction Data




A question out of curiosity: what is the source of the data?


Hi @vinit28,

For a healthy competition, we are not supposed to reveal the source from which the data was taken. Instead, you should focus on solving the problem and learning different techniques rather than searching for the data source, because the main aim of the competition is learning.



Hey @shubham.jain

Thanks for replying. I’m not trying to undermine healthy competition. I want to use this data for a hackathon with some friends, and it’s more compelling if the source of the data is known and the problem feels like a real-world one. But if you can’t share, I understand.



How do I get the Loan_Prediction data file? I can’t find it. Thanks.


Hi @sapmarvins

Register for the competition below. You will be able to download the dataset after that.


This is a fake page for getting the data. It’s not possible to download the data.


Hi @dadoubt,

Please register yourself for the competition. Once you do that, scroll down to the bottom of the page and you will see downloadable train, test and submission csv files.


It is hard to find the train data CSV file, even after registering on this website.


Hi @dadoubt @akka,

The dataset for loan prediction is available right below the problem statement and data description. Below is a screenshot of the same.

If you are unable to locate it, let me know and I will mail you the dataset.


Good answer!


Why is there a sample submission file? What does it mean?


The sample submission file is there to give you an idea of what the submission file should look like.
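As a rough sketch of how you might use it: read the header and ID order from the sample file and write your own predictions into the same layout. The column names `id` and `target` and the helper `build_submission` below are hypothetical placeholders, not the competition's actual names; check the downloaded sample file for the real ones.

```python
import csv
import io


def build_submission(sample_csv_text, predictions):
    """Return CSV text with the same header and ID order as the sample file.

    `predictions` maps each ID (first column of the sample file) to the
    predicted value that goes in the second column.
    """
    reader = csv.reader(io.StringIO(sample_csv_text))
    header = next(reader)                      # reuse the sample's header as-is
    ids = [row[0] for row in reader if row]    # keep the sample's row order

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row_id in ids:
        writer.writerow([row_id, predictions[row_id]])
    return out.getvalue()


# Hypothetical sample file contents; the real column names may differ.
sample = "id,target\nA1,N\nA2,N\n"
print(build_submission(sample, {"A1": "Y", "A2": "N"}))
```

Copying the header and row order from the sample file, rather than typing them by hand, keeps the output in the same layout as the sample.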


Logged in, registered, but cannot locate data for download!


@jackie76, scroll down to the bottom of the page. You’ll find the train, test, and submission files. If you still can’t find them, let me know and I’ll share them with you via email.


I am not able to get to the data either. The data link on the left navigation bar does not work. Can you please email it to me?



Please DM me your email ID.


I am also not able to get the data. If you could mail it to


I have made a tutorial on building a machine learning model for this problem, and it will be published on IBM Developer Content. Is it okay to host the data there while crediting your competition as the source?


@hissah, yes.


Couldn’t find the sample dataset.