Resources for data preparation requested


#1

When discussing analytics projects, the typical statement is that the data preparation phase is 80 to 90% of the effort. At the same time, data preparation seems to be the area that is “glossed over” when it comes to how to actually perform this step. I realize that the data preparation process is project-specific, but I have to believe that there are some general resources on this topic that provide examples and starting recommendations for the data cleaning process.

As I work on our university’s analytics curriculum, I am therefore looking for any material (web sites, books, software, etc.) that I can use as resources and also provide to my students when teaching both undergraduate and graduate classes. In addition to these types of resources, any recommendations on where else I can post this question are also appreciated.


#2

Hi @jflatto

A good reference could be “Exploratory Data Mining and Data Cleaning” by T. Dasu and T. Johnson (Wiley).
Chapter 4, about data quality, is really worth reading. To my knowledge, this is the only book that gives an overview of the problems of data cleaning.

Hope this helps.

Alain


#3

@jflatto

You can refer to some of the articles written on Analytics Vidhya:

There are other articles on the topic as well.

Hope this helps.

Regards,
Kunal