Faster way to search in a data.table in R

r
data_wrangling

#1

Hello,

I was reading about searching by key in the data.table package in R.

https://rpubs.com/ggData/datatable

It says

but when I run both commands using system.time on

DT = data.table(x=rep(c("a","b"),each=2), y=c(1,3), v=1:2)

I get

system.time(DT["a"])
   user  system elapsed
   0.00    0.00    0.01
system.time(DT[x=="a"])
   user  system elapsed
      0       0       0

So is DT["a"] actually slower than DT[x=="a"]?


#2

I suggest you try these on a larger dataset. With only four rows, both calls finish in less time than system.time can measure, so a 0.01 second difference is just timer noise. :smiley:
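A rough sketch of such a comparison on a larger table follows. The row count, the column names and the seed are arbitrary choices for illustration, not anything from the original post. Note that recent data.table versions can optimise `DT[x == "a"]` automatically through auto-indexing, so the sketch turns that off (assuming the `datatable.auto.index` and `datatable.use.index` options are available in your version) to time a plain vector scan. Timings will vary by machine.

```r
library(data.table)

set.seed(1)
n <- 1e6  # assumed size; increase it if both timings still show as 0

# Illustrative table: one character column to search on, one value column
DT <- data.table(x = sample(letters, n, replace = TRUE), v = runif(n))

# Disable auto-indexing so that x == "a" is a plain vector scan
# (assumption: these options exist in the installed data.table version)
old_opts <- options(datatable.auto.index = FALSE, datatable.use.index = FALSE)

# Vector scan: compares every element of x with "a"
t_scan <- system.time(res_scan <- DT[x == "a"])

# Keyed subset: setkey() sorts DT by x once, then DT["a"] uses binary search
setkey(DT, x)
t_key <- system.time(res_key <- DT["a"])

options(old_opts)  # restore the previous settings

print(t_scan)
print(t_key)

# Both approaches should return the same rows
stopifnot(nrow(res_scan) == nrow(res_key))
```

The one-off cost of `setkey()` is not included in `t_key`. It pays off when the same table is subset by key many times.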


#3

@shuvayan +1.

I suggest going through the tutorials on the data.table Getting started page, particularly the quick start guide and the new HTML vignette on keys and fast binary search based subsets, which explains this quite clearly IMHO.
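As a minimal sketch of what that vignette covers, here is the table from the question with a key set on x. The `setkey()` call is an assumption about the original setup: `DT["a"]` needs either a key or an `on=` argument to know which column to search. The variable names `by_key`, `by_scan` and `by_on` are made up for illustration.

```r
library(data.table)

# Table from the question (y and v are recycled to four rows)
DT <- data.table(x = rep(c("a", "b"), each = 2), y = c(1, 3), v = 1:2)

# Sort DT by x and mark x as the key; required for DT["a"] to work
setkey(DT, x)
key(DT)  # "x"

by_key  <- DT["a"]               # binary search on the key column
by_scan <- DT[x == "a"]          # logical expression on column x
by_on   <- DT[.("a"), on = "x"]  # binary search without relying on the key

# All three select the same rows
stopifnot(identical(by_key$v, by_scan$v), identical(by_key$v, by_on$v))
```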