How does Gradient Boosting deal with class imbalance problem?



In this article on dealing with class imbalance,

it was mentioned in the Gradient Tree Boosting section that this technique can help solve the problem. Based on the explanations given there, I am not able to understand which part of Gradient Boosting actually pays attention to class imbalance?


Hi @nithanaroy,

Suppose you have an unbalanced dataset with 80 percent 1s and the remaining 20 percent 0s. When we fit a model like logistic regression or a random forest on such a dataset, there is a high chance the model will be biased toward the majority class: it might predict 1 for every data point and still be correct 80% of the time.
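To make the bias concrete, here is a minimal sketch (using the hypothetical 80/20 split from above) showing that a degenerate "model" which always predicts the majority class already scores 80% accuracy:

```python
# Toy labels: 80% are 1, 20% are 0 (the hypothetical split from the post)
labels = [1] * 80 + [0] * 20

# A degenerate "model" that ignores the features and always predicts 1
predictions = [1] * len(labels)

# Accuracy looks decent even though the model never detects class 0
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(accuracy)  # 0.8
```

This is why raw accuracy is a misleading metric on imbalanced data; the minority class can be missed entirely.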

Gradient Boosting is a sequential process: each new tree is fit to the errors of the ensemble built so far (the residuals, or more generally the negative gradients of the loss). Incorrectly predicted data points produce larger residuals, so later trees focus more on them. If the first iteration gives you 80% accuracy, subsequent iterations concentrate on the misclassified remaining 20%, which on an imbalanced dataset tends to include many minority-class points.
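The sequential refinement described above can be sketched with scikit-learn (my choice of library here, not one named in the thread), on a synthetic dataset with the same 80/20 imbalance. `staged_predict` exposes the ensemble's predictions after each boosting stage, so you can watch accuracy improve beyond the majority-class baseline as later trees correct the errors of earlier ones:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic imbalanced dataset: ~80% class 1, ~20% class 0
X, y = make_classification(
    n_samples=2000, weights=[0.2, 0.8], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = GradientBoostingClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)

# staged_predict yields predictions after each stage; each stage adds a
# tree fit to the pseudo-residuals (errors) of the ensemble so far
staged = list(clf.staged_predict(X_te))
acc_first = np.mean(staged[0] == y_te)   # after 1 tree
acc_last = np.mean(staged[-1] == y_te)   # after all 200 trees
print(f"1 tree: {acc_first:.3f}, 200 trees: {acc_last:.3f}")
```

Note that plain gradient boosting only mitigates imbalance indirectly through this error-focusing behavior; for stronger control you can also pass per-sample weights via `sample_weight` in `fit` to upweight the minority class explicitly.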


I see, so the spirit is similar to AdaBoost, which reweights misclassified points directly. Thank you.