Data Science/Analytics use case advice

Hi guys,

Desperately need advice on the analytics use case. I have around 60 odd BI reports (cloud based data reports for employees across globe). I need to identify similar ones so that the reports can be merged and number of data models can be reduced.


Excel sheet with following columns:

Column 1: Report ID (60 reports)

Column 2: Sub category -( reports have been categorized into 4 sub-parts based on usage )

Column 3: Table names which from which data is being fetched Can be more than 1.

Column 4: Names of users who use the report ( not more than 10)

Column 5: Report Field names - SI, CI etc = final columns that we arrive at, after using function etc on the data

Based on these columns/Data for each report, is there a way I can find similar or merge-able reports to reduce the number of data models.

Somebody suggested clustering, but wasn’t sure about it.

So is there a Data Science way/method that i can apply here with good enough accuracy. Any advice would be a huge help.

Thanks & Regards

Any help would be great

© Copyright 2013-2020 Analytics Vidhya