How to use the cut function to break the data at specific points?

r

#1

I am currently working on a data set in which I want to bucket the variable year_joined, so I have used the cut function. But I am unable to break the data at specific points.

table(pf$year_joined)

 2006  2007  2008  2009  2010  2011  2012  2013  2014 
   10    18   667  1648  4624  5462 10055 33663 42854

I want to create buckets between these values:

         (2004, 2009]
         (2009, 2011]
         (2011, 2012]
         (2012, 2014]

I am not able to break at unevenly spaced points like the ones above.


#2

It would be good if you could show a sample of your data set.


#3

Hi @harry

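I consider this a data grouping problem. You can use the cut2 function from the Hmisc package. It's quite convenient and fast too.

cut2 creates a factor from a numeric vector. The function is simple. The call below uses 3 arguments:

  1. the numeric vector to cut (here pf$year_joined)

  2. g, how many groups you want to get (cut2 chooses quantile-based cut points for them, so they will not necessarily fall at the years you listed; its cuts argument takes explicit cut points instead)

  3. onlycuts = TRUE, which returns the vector of computed cut points rather than the factor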

    install.packages("Hmisc")
    library(Hmisc)
    cut2(pf$year_joined, g = 4, onlycuts = TRUE)

Let me know if this works.

Alternatively, you can use the base function cut with explicit break points. I think this should answer your question:

    cut(pf$year_joined, breaks = c(2004, 2009, 2011, 2012, 2014))
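
To check that explicit breaks give the requested buckets, here is a small sketch. The real pf data frame is not available in this thread, so year_joined below is a stand-in rebuilt from the table() counts posted in the question; the variable names are only illustrative.

```r
# Stand-in for pf$year_joined, rebuilt from the table() counts in the question.
years  <- 2006:2014
counts <- c(10, 18, 667, 1648, 4624, 5462, 10055, 33663, 42854)
year_joined <- rep(years, times = counts)

# Explicit break points. By default cut() makes each interval open on the
# left and closed on the right, i.e. (2004, 2009], (2009, 2011], and so on.
year_joined_bucket <- cut(year_joined,
                          breaks = c(2004, 2009, 2011, 2012, 2014))

# Count how many rows fall into each bucket.
table(year_joined_bucket)
```

With the real data, the result can be stored as a new column, for example pf$year_joined_bucket (a suggested name, not one from the question). Any value outside (2004, 2014] would come back as NA.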

Best,
Manish