What does the shrinkage option in gbm do




I am trying to use gbm for a classification problem,but there is a part I do not understand:

churn.gbm <- gbm(formula = Churned ~ .,
                 distribution = "bernoulli", data = logit.data,
                 n.trees = 5000, interaction.depth = 3,
                 shrinkage = 0.001, cv.folds = 4, verbose = TRUE)

What does the shrinkage option do in gbm?
Some places I have looked say that smaller shrinkage gives better results at the expense of more iterations, but then how do we determine an appropriate value for it?
When I run this model and try to generate the ROC curve, it says "not enough distinct predictions". I have tried shrinkage values from 0.01 down to 0.001, but the results are all the same.
Can someone please help me in understanding this parameter?


Shrinkage (also called the learning rate) scales the size of the step taken at each boosting iteration; boosting can be viewed as gradient descent in function space, converging toward Y.
Larger steps mean you converge faster, with the danger of overshooting the optimum. In gbm you have to balance shrinkage against the number of iterations, which makes sense: if you take a very small shrinkage you should increase the iterations to reach the optimum, but then you are almost certain to find it (when the change between iterations is near zero).
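A minimal sketch of that trade-off, reusing the names from the question (`Churned` and `logit.data` are assumed to exist in your session): fit with a small shrinkage and many trees, then let the built-in cross-validation pick the iteration count with `gbm.perf`.

```r
library(gbm)

# Small shrinkage needs many trees; cv.folds makes gbm track
# the cross-validated deviance at every iteration.
churn.gbm <- gbm(Churned ~ ., distribution = "bernoulli",
                 data = logit.data, n.trees = 5000,
                 interaction.depth = 3, shrinkage = 0.001,
                 cv.folds = 4)

# Optimal number of trees = where the CV deviance bottoms out.
best.iter <- gbm.perf(churn.gbm, method = "cv")

# Predict with the selected iteration, not with all 5000 trees.
p <- predict(churn.gbm, newdata = logit.data,
             n.trees = best.iter, type = "response")
```

If `best.iter` comes out at or near `n.trees`, the model has not converged yet: either raise `n.trees` or raise the shrinkage.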
In gbm this can take a huge amount of time, so a practical approach is to do a first pass using caret with a tuning grid and then go back and fine-tune around the best combination.
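That first pass might look like the following sketch. The column and data names follow the question; the grid values are illustrative assumptions, not recommendations, and caret's `method = "gbm"` exposes exactly these four tuning parameters.

```r
library(caret)

# Coarse grid over the gbm tuning parameters caret exposes.
grid <- expand.grid(n.trees = c(500, 1000, 5000),
                    interaction.depth = c(1, 3, 5),
                    shrinkage = c(0.1, 0.01, 0.001),
                    n.minobsinnode = 10)

# classProbs is required for ROC-based model selection.
ctrl <- trainControl(method = "cv", number = 4,
                     classProbs = TRUE,
                     summaryFunction = twoClassSummary)

# caret needs a factor outcome with valid R level names.
logit.data$Churned <- factor(logit.data$Churned,
                             labels = c("No", "Yes"))

fit <- train(Churned ~ ., data = logit.data, method = "gbm",
             trControl = ctrl, tuneGrid = grid,
             metric = "ROC", verbose = FALSE)

fit$bestTune  # best combination found on the grid
```

Once you know roughly which shrinkage works, refit with gbm directly (as above) for the final model.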