How to chunk large dissimilarity / distance matrices in R?

r
distance
clustering

#1

I would like to cluster mixed-type data with 50k rows and 10 features/columns, using R on my 64-bit PC. When I calculate the dissimilarity / distance matrix with the "daisy" function, I get an "Error: cannot allocate vector of size X GB" error.

library(cluster)
gower_dist <- daisy(df, metric = "gower")

This is the command that generates the distance matrix. How can I run this computation in chunks to avoid the RAM error?
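For scale, the dissimilarity object alone cannot fit: daisy stores the lower triangle, n*(n-1)/2 doubles, which for 50k rows is already about 9.3 GB before R makes any intermediate copies:

# lower-triangle storage of a dissimilarity object for n rows
n <- 50000
n * (n - 1) / 2 * 8 / 1024^3   # ~9.3 GB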


#2

@aeren, can you check this page? http://astrostatistics.psu.edu/su07/R/html/cluster/html/daisy.html


#3

Yes, it explains the Gower distance, but my question is how to compute that distance matrix in chunks, because R gives the RAM error.
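To make the question concrete, this is the rough direction I have in mind, as an untested sketch rather than working code. It assumes the bigmemory package for a file-backed matrix, handles only numeric and unordered-factor columns with equal weights, and ignores NAs and zero ranges, so it is not an exact replacement for daisy; df, chunk and block_dist are just illustrative names.

library(bigmemory)

n <- nrow(df)
p <- ncol(df)

# Gower needs each numeric column's global range, computed once up front
rng <- vapply(df, function(x)
  if (is.numeric(x)) diff(range(x, na.rm = TRUE)) else NA_real_,
  numeric(1))

# file-backed n x n matrix: the distances live on disk, not in RAM
D <- filebacked.big.matrix(n, n, type = "double",
                           backingfile = "gower.bin",
                           descriptorfile = "gower.desc")

# Gower distance between a block of rows and all rows
block_dist <- function(idx) {
  acc <- matrix(0, length(idx), n)
  for (j in seq_len(p)) {
    x <- df[[j]]
    if (is.numeric(x)) {
      acc <- acc + abs(outer(x[idx], x, "-")) / rng[j]   # range-scaled numeric part
    } else {
      acc <- acc + (outer(as.character(x[idx]), as.character(x), "!=") * 1)  # 0/1 mismatch
    }
  }
  acc / p   # equal column weights, as daisy uses by default
}

chunk <- 1000L
for (start in seq(1L, n, by = chunk)) {
  idx <- start:min(start + chunk - 1L, n)
  D[idx, ] <- block_dist(idx)   # only one block of rows is in RAM at a time
}

Even then, the backing file for 50k rows is about 19 GB, so the matrix only moves from RAM to disk; the clustering step afterwards would still need to read it in pieces.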