Best practices for data preprocessing and its resources

data_science

#1

Hi,
Could you suggest how one should approach the data preprocessing problem? What should we look for when cleaning the data? What does a typical data preprocessing pipeline look like?


#2

@jalFaizy, this is a very good article on the complete procedure for solving a data science problem.

Hope this helps!

Regards,
Hinduja


#3

Great article! Thanks for sharing


#4

Hi @jalFaizy

If you work in R, you can consult the caret page (Caret) or Max Kuhn's book “Applied Predictive Modeling”. By the way, M. Kuhn is the father of caret, so the book is well worth reading.
Hope this helps
Alain


#5

Thanks @Lesaffrea. I guess I have to ditch Python and learn R! The R roadblock is killing me! :stuck_out_tongue:


#6

Hi @jalFaizy

Well, good luck with R :slight_smile: It has a few tricks as well. Perhaps some people here can tell you the equivalent of caret in Python.

Have a good day

Alain
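
On the Python side, scikit-learn is probably the closest thing to caret for preprocessing. Below is a minimal sketch of a typical pipeline (impute missing values, scale numeric columns, one-hot encode categorical ones). The toy DataFrame, its column names, and the choice of median imputation are assumptions made up for illustration, not a recommendation from this thread.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numeric columns with gaps, one categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [30000.0, 45000.0, np.nan, 52000.0],
    "city": ["Pune", "Delhi", "Mumbai", "Pune"],
})

numeric_cols = ["age", "income"]
categorical_cols = ["city"]

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode; categories unseen at fit time
# are encoded as all zeros instead of raising an error.
categorical_steps = ColumnTransformer([
    ("encode", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
], sparse_threshold=0)

# Apply each branch to its own columns; sparse_threshold=0 forces a dense array.
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
], sparse_threshold=0)

X = preprocess.fit_transform(df)
print(X.shape)
```

In practice you would put a model after `preprocess` in one `Pipeline`, so that the imputation and scaling statistics are learned only from the training folds during cross-validation.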