Scrubbing of unstructured text data


#1

Can anybody help me understand how to clean, profile, or scrub unstructured text data pulled from different data sources?

I would be happy if someone could share a use case for this; even a POC would help.

Thanks,
Manish


#2

@Manishbafna

If you are working on a specific case, it is best that you ask specific questions. This is a very broad subject and the answers can vary depending on the context.

Here is an article that can help as a starter:

Regards,
Kunal


#3

Kunal,

I had read the article, but it didn’t help in my scenario, though it’s nice for a novice. I will share the specific case with an example in a while so that I can get help from someone here.

Thanks,
Manish