Pruning decision tree

machine_learning

#1

Hi Experts,
I applied pruning to a decision tree built on the kyphosis dataset, but the CP table remains unchanged after pruning.
Any insights would be much appreciated.

Regards,
Tony


#2

Hi @tillutony

Can you elaborate on what the dataset is and what the problem entails? There is too little information to suggest anything.


#3

Hi Jal faizy,
I am working on the kyphosis dataset.
Below is the code:

library(rpart)
str(kyphosis)
'data.frame': 81 obs. of 4 variables:
 $ Kyphosis: Factor w/ 2 levels "absent","present": 1 1 2 1 1 1 1 1 1 2 ...
 $ Age     : int 71 158 128 2 1 1 61 37 113 59 ...
 $ Number  : int 3 3 4 5 4 2 2 3 2 6 ...
 $ Start   : int 5 14 5 1 15 16 17 16 16 12 ...
ind <- sample(2, nrow(kyphosis), replace=TRUE, prob=c(0.7, 0.3))
kyphosis.train <- kyphosis[ind==1,]
kyphosis.test <- kyphosis[ind==2,]
fit <- rpart(Kyphosis ~ Age + Number + Start,
             method="class", data=kyphosis.train)

printcp(fit)

Classification tree:
rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis.train,
    method = "class")

Variables actually used in tree construction:
[1] Number Start

Root node error: 15/57 = 0.26316

n= 57

       CP nsplit rel error  xerror    xstd
1 0.23333      0   1.00000 1.00000 0.22164
2 0.01000      2   0.53333 0.73333 0.19863

pfit <- prune(fit, cp=fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"])
pfit
n= 57

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 57 15 absent (0.73684211 0.26315789)
  2) Start>=12.5 30 1 absent (0.96666667 0.03333333) *
  3) Start< 12.5 27 13 present (0.48148148 0.51851852)
    6) Number< 4.5 14 4 absent (0.71428571 0.28571429) *
    7) Number>=4.5 13 3 present (0.23076923 0.76923077) *

printcp(pfit)

Classification tree:
rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis.train,
    method = "class")

Variables actually used in tree construction:
[1] Number Start

Root node error: 15/57 = 0.26316

n= 57

       CP nsplit rel error  xerror    xstd
1 0.23333      0   1.00000 1.00000 0.22164
2 0.01000      2   0.53333 0.73333 0.19863

Regards,
Tony
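
One possible reading of the output above: the lowest xerror (0.73333) sits on the row with CP = 0.01, which is also rpart's default cp for growing a tree. Pruning at that value would therefore remove nothing, so pfit and its CP table match fit. The sketch below assumes the rpart package and an arbitrary seed of 42 (not in the original post, so the split and the numbers will differ from those shown). It prints the selected cp and compares leaf counts. n.leaves is a hypothetical helper introduced for illustration, and cp = 0.1 is an arbitrary larger value chosen only to show a prune that can actually cut splits.

```r
library(rpart)  # provides rpart(), prune(), and the kyphosis data

set.seed(42)  # assumed seed, only so the random split is repeatable
ind <- sample(2, nrow(kyphosis), replace = TRUE, prob = c(0.7, 0.3))
kyphosis.train <- kyphosis[ind == 1, ]

fit <- rpart(Kyphosis ~ Age + Number + Start,
             method = "class", data = kyphosis.train)

# cp on the row of the CP table with the lowest cross-validated error
best.row <- which.min(fit$cptable[, "xerror"])
best.cp  <- fit$cptable[best.row, "CP"]

# Prune at the selected cp, and at a larger illustrative cp for comparison
pfit       <- prune(fit, cp = best.cp)
pfit.small <- prune(fit, cp = 0.1)

# Hypothetical helper: count terminal nodes in an rpart tree
n.leaves <- function(tree) sum(tree$frame$var == "<leaf>")

cat("selected cp:", best.cp, "\n")
cat("leaves in fit:", n.leaves(fit), "\n")
cat("leaves after prune at selected cp:", n.leaves(pfit), "\n")
cat("leaves after prune at cp = 0.1:", n.leaves(pfit.small), "\n")
```

If the selected cp equals the cp the tree was grown with, the first two leaf counts should be identical, which would match the unchanged printcp output in the post.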