# How to count the number of distinct values in a column of a data table in R?

#1

Hello,

I have a table with 2947 rows and 1 column containing only integer values in the range 1 to 30. I want to calculate the number of distinct values in that column. I used a for loop like this:

``````
k = test[1,1]
count = 1
for (i in 1:2947) {
  if (test[i,1] != k) {
    count = count + 1
    k = test[i,1]
  }
}
``````

which seems to work fine. How can I do it without using the for loop?

Thank you.

#2

For a particular column you could use:

``````
> df <- c(1,2,3,4,5,6,7,4,5,6)
> df_uniq <- unique(df)
> length(df_uniq)
[1] 7
``````
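
The same idea can be applied directly to a column of a data frame. The sketch below is only an illustration: the data frame `test` and its column name `V1` are hypothetical stand-ins for the table in the question. It also shows why the loop in the question is not a general solution: that loop counts runs of equal adjacent values, so it gives the number of distinct values only when equal values sit next to each other (for example, in sorted data).

``````r
# Hypothetical stand-in for the table in the question:
# 2947 rows, one integer column cycling through the values 1 to 30
test <- data.frame(V1 = rep(1:30, length.out = 2947))

# Number of distinct values in the first column, regardless of row order
n_distinct_values <- length(unique(test[[1]]))

# What the loop in the question computes: the number of runs of equal
# adjacent values, which here is much larger than the distinct count
n_runs <- length(rle(test[[1]])$lengths)

print(n_distinct_values)
print(n_runs)
``````

With this unsorted data the two numbers differ, so `length(unique(...))` is the safer choice when the row order is not known.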

To get the unique rows of a data frame, you can use the distinct function from the dplyr package.

``````
> df1 <- data.frame(x=c(1,2,3,2),y = c("a","b","c","b"))
> df1
  x y
1 1 a
2 2 b
3 3 c
4 2 b
``````

As you can see, the row (2, b) appears twice.

``````
> library(dplyr)
> distinct(df1)
  x y
1 1 a
2 2 b
3 3 c
``````

This gives only the distinct rows in a dataset.
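
Since the question asks for a count rather than the rows themselves, here is a minimal sketch of how dplyr could be used for that, assuming the package is installed. It reuses df1 from above; the variable names `n_values`, `distinct_x` and `n_rows` are chosen only for illustration.

``````r
library(dplyr)

df1 <- data.frame(x = c(1, 2, 3, 2), y = c("a", "b", "c", "b"))

# Number of distinct values in a single column
n_values <- n_distinct(df1$x)

# One row per distinct value of x (other columns are dropped)
distinct_x <- distinct(df1, x)

# Number of distinct rows across all columns
n_rows <- nrow(distinct(df1))

print(n_values)
print(n_rows)
``````
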
Hope this helps!!

#3

Sorry, I forgot to mention that there is another package, sqldf, which can be used like this:

``````
> library(sqldf)
> sqldf("select distinct(x) from df1")
  x
1 1
2 2
3 3
``````

As you can see, there are 3 distinct values. You can also get the count directly:

``````
> sqldf("select count(distinct(x)) from df1")
  count(distinct(x))
1                  3
``````

Hope this helps!!

#4

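You can also use the table function. It counts how often each distinct value occurs, so the length of its result is the number of distinct values.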

Here is an example:

``````
df <- data.frame(val=c(1,2,3,4,5,6,7,4,5,6,2,2,2,2,2,1,1,1,25,28,25,29,3))
table(df$val)
``````
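
To turn the frequency table into the number of distinct values, one option is to take its length. This is a small sketch built on the example above; the names `counts`, `n_values` and `most_common` are just for illustration.

``````r
df <- data.frame(val = c(1,2,3,4,5,6,7,4,5,6,2,2,2,2,2,1,1,1,25,28,25,29,3))

# Frequency of each distinct value, named by the value itself
counts <- table(df$val)

# One table entry per distinct value, so the length is the distinct count
n_values <- length(counts)

# The table also tells you which value occurs most often
most_common <- names(which.max(counts))

print(n_values)
print(most_common)
``````

Compared with unique, table does a little more work, but it is handy when you also want the frequencies.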

Hope this helps.