How to check the level of each variable in data

r
classification

#1

I am working on a classification problem using a C5.0 decision tree. While creating the model I get the error "c50 code called exit with value 1". I read that this error occurs when a column has missing values, but I want to know how to check which column has missing values.

'data.frame': 9887 obs. of 12 variables:
 $ Var1 : Factor w/ 9887 levels "AA10000186","AA10001186",..: 4691 6023 7382 9368 118 810 1478 2130 2763 3429 ...
 $ Var2 : Factor w/ 3 levels "HAXX","HAXY",..: 1 1 1 1 1 1 1 1 2 1 ...
 $ Var3 : Factor w/ 23 levels "LC10","LC11",..: 22 22 2 2 16 18 18 14 2 18 ...
 $ Vintage : Factor w/ 144 levels "0 Month(s)","0 Year(s)",..: 138 131 131 91 122 131 142 33 119 138 ...
 $ UG_Education: Factor w/ 213 levels "A Level from NIELT",..: 11 46 11 56 46 56 56 26 56 46 ...
 $ UG_College : Factor w/ 1830 levels "(NIBM)","_","A Reputed Institute",..: 174 1271 1330 1439 921 197 66 363 703 1477 ...
 $ PG_Education: Factor w/ 252 levels "aaa","Actuary from IAI",..: 99 99 99 194 99 178 178 92 103 99 ...
 $ PG_College : Factor w/ 1973 levels "…","A Reputed University",..: 68 1485 1515 755 213 1342 1342 538 798 1591 ...
 $ Skills : Factor w/ 9778 levels "' Risk Management, Underwriting and Compliance' Appraising Loans and Credit Cards applications for Eligibility and limit assign"| __truncated__,..: 4176 8272 8318 1521 8315 9422 3584 7725 8434 8981 ...
 $ Domain : Factor w/ 47 levels "Accounts / Finance / Tax / CS / Audit",..: 3 3 3 3 3 3 29 33 3 3 ...
 $ Var4 : Factor w/ 27 levels "","RR1074","RR115E",..: 13 27 7 22 2 20 14 1 12 10 ...
 $ Category : Factor w/ 6 levels "Champion","Megastar",..: 3 3 6 3 3 4 4 4 4 3 ...


#2

@harry,

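The function is.na(x) returns a logical vector that is TRUE wherever an element of x is missing. Wrapping it in any() gives a single TRUE if x contains at least one missing value.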
You can check for the missing values in your data frame using the apply function.

index <- apply(dataframe, 2, function(x) any(is.na(x)))

This will check for the missing values column-wise.
To get the names of the columns that satisfy this condition, subset the column names of the data frame with the index:

colnames(dataframe)[index]

This would give you the column names in your data frame with missing values.
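As a rough, self-contained sketch of the same idea, the snippet below builds a small toy data frame (the values are made up; only the column names and a few level names are borrowed from the question), lists the columns with NA values, and counts the NAs per column. It also checks for empty strings, since the str() output in the question shows "" as a level of Var4 and is.na() does not treat "" as missing. Whether those blanks are related to the C5.0 error is only an assumption, but they may be worth looking at.

```r
# Toy data frame standing in for the real data (hypothetical values).
df <- data.frame(
  Var2     = factor(c("HAXX", "HAXY", NA, "HAXX")),
  Var4     = factor(c("", "RR1074", "RR115E", "")),
  Category = factor(c("Champion", "Megastar", "Champion", "Megastar"))
)

# Flag columns that contain at least one NA, then pull out their names.
has_na  <- sapply(df, function(x) any(is.na(x)))
na_cols <- names(df)[has_na]

# Count the NA values in each column.
na_counts <- colSums(is.na(df))

# Flag columns that contain empty strings, which is.na() does not catch.
has_blank  <- sapply(df, function(x) any(as.character(x) == "", na.rm = TRUE))
blank_cols <- names(df)[has_blank]

print(na_cols)
print(na_counts)
print(blank_cols)
```

sapply() is used here instead of apply() because it works on the columns directly, without first converting the data frame to a matrix; for a plain NA check both approaches should give the same result.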

Hope this helps!