Best practices and resources for data preprocessing



Could you suggest how one should approach the data preprocessing problem? What should we look for when cleaning the data? What does a typical data preprocessing pipeline look like?


@jalFaizy, this is a very good article covering the complete procedure for solving a data science problem.

Hope this helps!



Great article! Thanks for sharing


Hi @jalFaizy

If you work in R, you can consult the Caret page (Caret) or Max Kuhn's book “Applied Predictive Modeling”. By the way, M. Kuhn is the father of Caret, so the book is well worth reading.
Hope this helps


Thanks @Lesaffrea. I guess I have to drop Python and learn R! The R roadblock is killing me! :stuck_out_tongue:


Hi @jalFaizy

Well, good luck with R :slight_smile: It has a few tricks of its own as well. Perhaps someone here can tell you the Python equivalent of Caret.

Have a good day