Zero being encoded as one after conversion from factor to numeric

r

#1

Hi there,

Yesterday I ran into an issue while converting a factor variable to the numeric data type.

as.numeric(factor(c(0,1,2,3,1,2,3)))
[1] 1 2 3 4 2 3 4

while

as.numeric(factor(c(1,1,2,3,1,2,3)))
[1] 1 1 2 3 1 2 3

Can anybody explain why this happens?

Thanks
Neeraj


#2

Hi @NSS,

When applied to a factor, as.numeric returns the underlying integer codes (the index of each value's level), not the values shown as labels.

Here,

levels(factor(c(0,1,2,3,1,2,3)))
[1] "0" "1" "2" "3"

Each level here has an index: “0” -> 1, “1” -> 2, “2” -> 3, “3” -> 4.
as.numeric returns these indices as they are.

For the second instance,

levels(factor(c(1,1,2,3,1,2,3)))
[1] "1" "2" "3"

Here the indices happen to coincide with the actual values (“1”->1, “2”->2, “3”->3), which is why you don’t see any difference.
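
To make the difference between codes and labels easier to see, here is a small sketch with made-up values (the vector `x` is only an illustration, not from the original post) that are far from 1, 2, 3:

```r
# Hypothetical example values, chosen so that codes and labels clearly differ
x <- c(10, 20, 20, 30)
f <- factor(x)

levels(f)      # the labels, stored as character strings: "10" "20" "30"
as.integer(f)  # the integer codes pointing into levels(f): 1 2 2 3
as.numeric(f)  # the same codes, returned as doubles:       1 2 2 3
```

The factor keeps only the codes plus the table of levels, so as.numeric has nothing but the codes to return.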

To avoid this problem, convert the factor to character first with as.character, and then to numeric.

as.character(factor(c(0,1,2,3,1,2,3)))
[1] "0" "1" "2" "3" "1" "2" "3"
as.numeric(as.character(factor(c(0,1,2,3,1,2,3))))
[1] 0 1 2 3 1 2 3
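
An alternative idiom, mentioned in R's help page for factor, converts the levels once and then indexes them by the factor's codes. A minimal sketch, assuming the factor is stored in a variable named `f` (the name is only for illustration):

```r
f <- factor(c(0, 1, 2, 3, 1, 2, 3))

# Convert the (few) levels to numeric, then look up each element by its code
vals <- as.numeric(levels(f))[f]
vals
# [1] 0 1 2 3 1 2 3
```

Both approaches should give the same result. This one converts only the distinct levels rather than every element, which may matter for long vectors with few levels.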

Regards,
Shashwat