I am fairly new to data science and have a question. I am working with a dataset from a Kaggle competition related to marketing. To predict the dependent variable, I am fitting a CART model with the rpart package. I started adding independent (predictor) variables more or less at random: for some variables no tree is built at all, while for others I get a tree with 3-4 nodes. How should I select which variables to use to train my model?
There is also one interesting thing happening: whenever I add another variable, it counteracts the effect of two other variables and gives me a tree with fewer nodes. Could anyone explain why this is happening?
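To make both observations concrete, here is a minimal sketch in plain Python (not rpart itself) on made-up data. It assumes a CART-style greedy rule: at a node, the tree picks the predictor whose best threshold gives the largest reduction in Gini impurity. The names `gini`, `best_split_gain`, and the three toy predictors are hypothetical, and rpart's own stopping rules (such as its complexity parameter) are not modeled.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity reduction over all thresholds on one numeric predictor."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # splitting rows into "value <= t" and "value > t".
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        children = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - children)
    return best


# Made-up toy data (an assumption for illustration, not the Kaggle dataset).
y = ["no", "no", "no", "no", "yes", "yes", "yes", "yes"]
predictors = {
    "noise": [1, 2, 1, 2, 1, 2, 1, 2],   # unrelated to y
    "weak": [1, 1, 1, 2, 1, 2, 2, 2],    # partly related to y
    "strong": [1, 1, 1, 1, 2, 2, 2, 2],  # separates y perfectly
}

# Gain each predictor could offer at the root node.
gains = {name: best_split_gain(x, y) for name, x in predictors.items()}

# The greedy rule takes the predictor with the largest gain.
root_variable = max(gains, key=gains.get)

for name, gain in gains.items():
    print(name, gain)
print("root split on:", root_variable)
```

In this toy case a predictor that offers no impurity reduction would give no split at all, and once `strong` is available it takes the root split and leaves nothing for the other two to explain. I do not know whether this is what is actually happening with my data.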
Looking forward to hearing from you soon. Thanks!