Error in using GBM in multinomial prediction

r
gradient_boosting

#1

gbm_model1 <- gbm(formula = trainformula, data = trainingdata1,
                  distribution = "multinomial", n.trees = 50,
                  interaction.depth = 5, shrinkage = 0.1)

The training formula used is:

failure ~ voltmean + rotatemean + pressuremean + vibrationmean +
voltsd + rotatesd + pressuresd + vibrationsd + voltmean_24hrs +
rotatemean_24hrs + pressuremean_24hrs + vibrationmean_24hrs +
voltsd_24hrs + rotatesd_24hrs + pressuresd_24hrs + vibrationsd_24hrs +
error1count + error2count + error3count + error4count + error5count +
sincelastcomp1 + sincelastcomp2 + sincelastcomp3 + sincelastcomp4 +
model + age

str(trainingdata1)
'data.frame': 167876 obs. of 30 variables:
$ datetime : POSIXct, format: "2015-01-02 05:00:00" "2015-01-02 08:00:00" "2015-01-02 11:00:00" "2015-01-02 14:00:00" ...
$ machineID : int 1 1 1 1 1 1 1 1 1 1 ...
$ voltmean : num 180 176 160 170 163 ...
$ rotatemean : num 441 439 424 443 469 ...
$ pressuremean : num 94.1 101.6 99.6 102.4 102.7 ...
$ vibrationmean : num 41.6 36.1 36.1 40.5 40.9 ...
$ voltsd : num 21.3 19 13 16.6 17.4 ...
$ rotatesd : num 48.8 51.3 13.7 56.3 38.7 ...
$ pressuresd : num 2.14 13.79 9.99 3.31 9.11 ...
$ vibrationsd : num 10.04 6.74 1.64 8.85 3.06 ...
$ voltmean_24hrs : num 170 171 170 170 170 ...
$ rotatemean_24hrs : num 445 444 446 447 452 ...
$ pressuremean_24hrs : num 96.8 97.7 96.9 96.2 96.4 ...
$ vibrationmean_24hrs: num 40.4 39.8 40 39.9 40 ...
$ voltsd_24hrs : num 11.2 12.6 13.3 13.8 14.8 ...
$ rotatesd_24hrs : num 48.7 46.9 42.8 42.8 42.5 ...
$ pressuresd_24hrs : num 10.08 9.41 9.07 8.26 8.67 ...
$ vibrationsd_24hrs : num 5.85 6.1 5.48 5.86 5.91 ...
$ error1count : num 0 0 0 0 0 0 0 0 0 1 ...
$ error2count : num 0 0 0 0 0 0 0 0 0 0 ...
$ error3count : num 0 0 0 0 0 0 0 0 0 0 ...
$ error4count : num 0 0 0 0 0 0 0 0 0 0 ...
$ error5count : num 0 0 0 0 0 0 0 0 0 0 ...
$ sincelastcomp1 : num 20 20.1 20.2 20.3 20.5 ...
$ sincelastcomp2 : num 215 215 215 215 215 ...
$ sincelastcomp3 : num 155 155 155 155 155 ...
$ sincelastcomp4 : num 170 170 170 170 170 ...
$ model : chr "model3" "model3" "model3" "model3" ...
$ age : int 18 18 18 18 18 18 18 18 18 18 ...
$ failure : Factor w/ 5 levels "comp1","comp2",...: 5 5 5 5 5 5 5 5 5 5 ...

When I try to run the GBM model, I get this error:

Error in gbm.fit(x, y, offset = offset, distribution = distribution, w = w, :
variable 26: model is not of type numeric, ordered, or factor.

Can anyone please tell me how to resolve this error?


#2

@Sajal_Roy_92

I'm not an R expert, but I would suggest converting the variable that's causing the issue (variable 26, I presume) to a factor. You can do it as below:

trainingdata1$model <- as.factor(trainingdata1$model)


#3

@jalFaizy

Hello, the variable causing the error is $ sincelastcomp3. I tried converting it to a factor and then explicitly back to numeric, but it still throws an error.

sincelastcomp3 is already in numeric format.


#4

Could you try the above code and see if it works?


#5

Hi, it worked! Thanks a lot, @jalFaizy! Can you please tell me what caused the model to fail at execution? It will help me avoid the same mistake in future.


#6

The str() output shows model as "chr" (character) instead of a factor, and that's what was causing the issue. Glad it worked :slight_smile:
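
For anyone hitting the same error, here is a minimal sketch of how one might check for character columns before fitting and convert all of them at once. The data frame `toy` and its values (including the "none" failure level) are made up for illustration and only mimic a few columns of trainingdata1. It uses base R only.

```r
# Toy stand-in for trainingdata1 (hypothetical values, not the poster's data)
toy <- data.frame(
  voltmean = c(180, 176, 160, 170),
  age      = c(18L, 18L, 7L, 7L),
  model    = c("model3", "model3", "model1", "model1"),
  failure  = factor(c("none", "comp1", "none", "comp2")),
  stringsAsFactors = FALSE
)

# Find columns stored as character; these are the ones gbm complained about
chr_cols <- names(toy)[sapply(toy, is.character)]
print(chr_cols)

# Convert every character column to a factor in one step
toy[chr_cols] <- lapply(toy[chr_cols], as.factor)

# Confirm the types before calling gbm()
str(toy)
```

Running the same check on the real training data before calling gbm() should flag model as the only character column, going by the str() output in the first post.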