Calculating just a single row of a dissimilarity/distance matrix

r
distance
clustering

#1

I have a data frame with 30k rows and 10 features. I would like to calculate a distance matrix like this:

gower_dist <- daisy(data_frame, metric = "gower")

This function returns the whole dissimilarity matrix. I want only the first row, that is, the distances from the first element of the data frame to all the others. How can I do that? Any ideas?


#2

Hi @aeren

Try using a subset of the data frame, like this:

gower_dist <- daisy(data_frame[1, ], metric = "gower")
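In case it helps, here is a minimal base-R sketch of computing only one row of Gower dissimilarities (row 1 against every row) without building the full matrix. `gower_row` and the `toy` data are hypothetical names made up for illustration. It assumes only numeric and factor/character columns and the usual Gower definition (range-scaled absolute differences for numeric columns, simple mismatch for categorical ones), so it may not match `daisy` for ordered factors or other column types.

```r
# Gower dissimilarities between row i and every row of a data frame.
# Assumption: columns are numeric or factor/character only.
gower_row <- function(df, i = 1) {
  n <- nrow(df)
  total <- numeric(n)  # running sum of per-column dissimilarities
  count <- numeric(n)  # number of non-missing columns per pair
  for (col in df) {
    if (is.numeric(col)) {
      # Numeric column: absolute difference scaled by the column range
      rng <- diff(range(col, na.rm = TRUE))
      d <- abs(col - col[i])
      if (rng > 0) d <- d / rng
    } else {
      # Categorical column: 0 if the values match, 1 otherwise
      col <- as.character(col)
      d <- as.numeric(col != col[i])
    }
    ok <- !is.na(d)
    total[ok] <- total[ok] + d[ok]
    count[ok] <- count[ok] + 1
  }
  total / count  # average over the columns that could be compared
}

# Hypothetical toy data, for illustration only
toy <- data.frame(
  height = c(150, 160, 170, 180),
  colour = factor(c("red", "blue", "red", "blue"))
)
first_row <- gower_row(toy, 1)
print(first_row)
```

The result is a plain numeric vector of length `nrow(df)`, so memory use grows linearly with the number of rows rather than quadratically.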

#4

Hi,

Thanks for your answer.

Yes, I did, and it actually works. I calculated the dissimilarity matrix row by row and stored it on disk.

Now my problem is that I would like to run a clustering algorithm on that matrix. I import it into RStudio like this:

Mydata <- read.csv("Mydata.csv")

Mydata <- as.dist(Mydata)

Results <- hclust(Mydata)

But R gives an out-of-memory (RAM) error when I convert the matrix to a dist object. How can I handle this? Can I run hclust in a loop or in chunks? Can I divide the matrix into pieces and run them separately? Any ideas?
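For scale, a rough back-of-the-envelope estimate of the memory involved, assuming 30,000 rows and 8-byte double-precision values (the variable names are just for illustration):

```r
n <- 30000             # number of rows in the data frame
bytes_per_double <- 8  # R stores numeric values as 8-byte doubles

# Full n x n matrix, as read back from the CSV file
full_matrix_gib <- n^2 * bytes_per_double / 1024^3

# dist object: only the n * (n - 1) / 2 entries below the diagonal
dist_object_gib <- n * (n - 1) / 2 * bytes_per_double / 1024^3

print(c(full = full_matrix_gib, dist = dist_object_gib))
```

That is roughly 6.7 GiB for the full matrix and 3.4 GiB for the dist object, and both have to be in memory at the same time during the conversion, before counting any intermediate copies.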