Using customer relationships to create groups

sql
algorithms
analytics
data_science
opensource

#1

Doc1.pdf (205.9 KB)
Hi,

I have a database of about 20,000 business customers that is growing. But many of these customers are part of a larger business entity and I need to identify that. They are connected through names of owners listed on the customers. However, not all customers under the main entity have same owners. If they have even one common owner, I will assume that they are part of that major entity. For example: if I have 10 customers C1, C2…C10 and 10 owners O1,O2,O10. They are connected as below. Also, the attached image shows how companies are connected in a group.

C1: O1, O2, O3
C2: O2, O4, O5
C3: O4, O5, O6
C4: O7
C5: O6, O7
C6: O8
C7: O8, O10
C8: O9, O10
C9: O8
C10: O10

I can tell that there are 2 major groups: Group 1 includes C1, C2, C3, C4, C5 and Group 2 includes C6, C7, C8, C9, C10 from as they directly or indirectly have common owners. I would like to use an algorithm that does this or create one in SQL? Any suggestions? I am not sure if this can be done using decision trees. Please let me know if you need more clarity.

PS: There are other ways to identify if two business customers are part of the same entity for example if they same mailing addresses, email address, phone numbers and such. If I can introduce multiple criteria into my groups that would be ideal, but I can do with just one for now.

Thanks in advance.