How does Recursive Feature Elimination (RFE) work, and how is it different from backward elimination?



I read about backward elimination and understood it to work as follows:

  1. Select the significance level.
  2. Fit our model with all independent variables.
  3. Consider the variable with the highest p-value. If that p-value is greater than the significance level, remove the variable.
  4. Rebuild the model with the remaining independent variables.
  5. Repeat the process until no remaining variable has a p-value above the significance level.
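
To make the steps above concrete, here is a minimal sketch of p-value-based backward elimination for ordinary least squares. It is only an illustration under stated assumptions: the helper names (`ols_pvalues`, `backward_elimination`), the synthetic data, and the default `alpha=0.05` are all made up for this example, and the loop stops once no remaining variable has a p-value above the significance level. A statistics library such as statsmodels reports these p-values directly; they are computed by hand here only to keep the sketch self-contained.

```python
import numpy as np
from scipy import stats


def ols_pvalues(X, y):
    """Two-sided t-test p-values for OLS coefficients (intercept first)."""
    A = np.column_stack([np.ones(len(y)), X])       # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)    # least-squares fit
    resid = y - A @ beta
    dof = A.shape[0] - A.shape[1]                   # residual degrees of freedom
    sigma2 = resid @ resid / dof                    # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return 2 * stats.t.sf(np.abs(beta / se), dof)


def backward_elimination(X, y, names, alpha=0.05):
    """Drop the variable with the largest p-value until all are <= alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        pvals = ols_pvalues(X[:, keep], y)[1:]      # skip the intercept
        worst = int(np.argmax(pvals))
        if pvals[worst] <= alpha:
            break                                   # every remaining variable is significant
        keep.pop(worst)                             # remove it and refit
    return [names[i] for i in keep]


# Hypothetical data: only x0 and x1 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)

selected = backward_elimination(X, y, ["x0", "x1", "x2", "x3"])
print(selected)
```

Note that the decision to drop a variable here depends only on a statistical test of its coefficient, not on any ranking of the variables against each other.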

On the other hand, I am not able to understand how RFE works. If RFE also eliminates variables on every iteration, then what is the difference between the two methods?




The backward elimination method you describe removes variables iteratively on the basis of their p-values.
RFE is also a type of backward selection; however, it works on a feature-ranking system rather than on p-values.

First, a model (for example, linear regression) is fit on all the variables. RFE then takes the fitted coefficients (or feature importances, if the model provides them) as a measure of each variable's importance, ranks the variables accordingly, and removes the lowest-ranking variable in each iteration. The model is then refit on the remaining variables and the process repeats. The scikit-learn package can do this automatically if you specify the number of features you want to keep. Hope this helps.
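
As a rough illustration, here is how this might look with scikit-learn's `RFE` class wrapped around a linear regression. The synthetic data and the choice of `n_features_to_select=2` are assumptions made for this example only, not anything from your problem.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Hypothetical data: five features on the same scale, only the first two matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Fit, rank features by coefficient magnitude, drop the weakest one
# (step=1), and repeat until only two features remain.
selector = RFE(estimator=LinearRegression(), n_features_to_select=2, step=1)
selector.fit(X, y)

print(selector.support_)   # boolean mask of the features that were kept
print(selector.ranking_)   # 1 = kept; larger numbers were eliminated earlier
```

Because the ranking comes from coefficient magnitudes, it is usually sensible to put the features on a comparable scale first. Unlike the p-value approach, you choose how many features to keep up front, and no significance test is involved.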

You can go ahead and check the link below for better understanding.