In a recent article, the pseudocode for GBM is described as follows:
1. Initialize the outcome
2. Iterate from 1 to the total number of trees
2.1 Update the weights for the targets based on the previous run (higher for the misclassified ones)
2.2 Fit the model on a selected sub-sample of the data
2.3 Make predictions on the full set of observations
2.4 Update the output with the current results, taking the learning rate into account
3. Return the final output
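To make my question concrete, here is a minimal sketch of how I currently read these steps. It is not the article's code. It assumes squared-error regression, so I interpret step 2.1 as computing residuals rather than reweighting samples (that interpretation is my assumption), and it uses one-split stumps in place of real trees. The names fit_stump and boost are my own.

```python
import random

def fit_stump(xs, rs):
    """Fit a one-split regression stump to (x, residual) pairs."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, rs) if x <= t]
        right = [r for x, r in zip(xs, rs) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # no valid split: fall back to the mean residual
        m = sum(rs) / len(rs)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, n_trees=50, learning_rate=0.1, subsample=0.5, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    init = sum(ys) / n                  # initialize the outcome with the mean
    pred = [init] * n
    stumps = []
    for _ in range(n_trees):            # iterate from 1 to total number of trees
        # 2.1 (my assumption): the new targets are the current residuals
        resid = [y - p for y, p in zip(ys, pred)]
        # 2.2: fit on a random sub-sample of the rows
        idx = rng.sample(range(n), max(2, int(subsample * n)))
        stump = fit_stump([xs[i] for i in idx], [resid[i] for i in idx])
        # 2.3: predict on the full set of observations
        update = [stump(x) for x in xs]
        # 2.4: add the prediction, shrunk by the learning rate
        pred = [p + learning_rate * u for p, u in zip(pred, update)]
        stumps.append(stump)
    return init, stumps, pred           # return the final output

# Toy data (made up): a step function on 40 points
xs = [i / 10 for i in range(40)]
ys = [0.0 if x < 2 else 1.0 for x in xs]
init, stumps, pred = boost(xs, ys)
```

In this reading, each stump sees only half of the rows in 2.2, but pred is updated for every row in 2.3 and 2.4, which is the part I ask about in b) below.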
It would be helpful if someone could clear up the following points of confusion.
a) Does "total number of trees" refer to n_estimators?
b) Sub-point 2.2 says the model is fitted on a sub-sample of the data,
while 2.3 refers to predictions on the full set of observations.
So are we predicting on the full data using a model fitted on only a sub-sample of it?
An explanation with an example would be helpful.
c) An example of warm_start would be helpful.
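For context on c), this is a sketch of how I currently understand warm_start with scikit-learn's GradientBoostingClassifier: refitting after raising n_estimators should add trees to the existing ensemble instead of starting over. The dataset and the parameter values are made up for illustration, and I may well be using it wrongly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic binary classification data, purely illustrative
X, y = make_classification(n_samples=200, random_state=0)

clf = GradientBoostingClassifier(
    n_estimators=50, subsample=0.8, warm_start=True, random_state=0
)
clf.fit(X, y)
n_first = clf.estimators_.shape[0]   # number of boosting stages after the first fit

# Raise the tree count and fit again; with warm_start=True the
# existing stages are kept and only the extra ones are trained
clf.set_params(n_estimators=80)
clf.fit(X, y)
n_second = clf.estimators_.shape[0]  # number of stages after the second fit
```

If that reading is right, the second fit trains only the 30 additional trees. Is that the intended use?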
Thanks in anticipation…