Could you suggest how one should approach the data preprocessing problem? What should we look for when cleaning the data? What is the normal pipeline for data preprocessing?


@jalFaizy, this is a very good article covering the complete procedure for solving a data science problem.

Hope this helps!



Great article! Thanks for sharing


Hi @jalFaizy

If you work in R, you can consult this caret page (Caret) or Max Kuhn's book “Applied Predictive Modeling”. By the way, M. Kuhn is the father of caret, so the book is worth reading.
Hope this helps


Thanks @Lesaffrea. I guess I have to kick Python and learn R! The roadblock of learning R is killing me! :stuck_out_tongue:


Hi @jalFaizy

Well, good luck with R :slight_smile: It has a few tricks as well. Perhaps some people here can tell you the equivalent of caret in Python.

Have a good day