Need help with an interview question

model
regression

#1

Recently I attended an interview in which the interviewer asked a question that I did not know how to approach. Can anyone help me with this?

Q. Let’s say you have created a linear regression model for a customer. You show the customer the model and its significant features. But the customer asks you to add one more feature, which the customer thinks is very important, even though according to the model this feature is not significant. What will you do? How will you add this feature to the model?


#2

This is a strange question; it almost seems as if, in order to keep the client happy, you are forced to include a useless feature. Having said that, you could try rebuilding the model from scratch. You could use forward/backward selection to see whether the new feature is significant or not. Or you could look at transforming the new feature, or combining it with some other feature, to see whether that improves the signal-to-noise ratio. You could also try other algorithms to see whether the new feature appears significant there. If all else fails, you should be able to tell the client that the new feature didn’t add any value.
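
As a rough illustration of the "transform the feature and re-test" idea, here is a minimal sketch on synthetic data. Everything in it is an assumption made for the example: the data, the helper names `fit_ols` and `partial_f`, and the choice of a squared transform. It compares the model with and without the customer's feature using a partial F statistic, first for the raw feature and then for its square.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residual sum of squares)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, float(resid @ resid)

def partial_f(X_base, x_new, y):
    """F statistic for adding the single column x_new to the base model."""
    n = len(y)
    _, rss_base = fit_ols(X_base, y)
    X_full = np.column_stack([X_base, x_new])
    _, rss_full = fit_ols(X_full, y)
    p_full = X_full.shape[1] + 1  # number of parameters, including the intercept
    return (rss_base - rss_full) / (rss_full / (n - p_full))

# Synthetic data (hypothetical): the customer's feature matters only through its square.
rng = np.random.default_rng(0)
n = 200
X_base = rng.normal(size=(n, 2))
x_cust = rng.normal(size=n)
y = (1.0 + 2.0 * X_base[:, 0] - 1.0 * X_base[:, 1]
     + 1.5 * x_cust ** 2 + rng.normal(scale=0.5, size=n))

f_raw = partial_f(X_base, x_cust, y)      # raw feature added linearly
f_sq = partial_f(X_base, x_cust ** 2, y)  # transformed feature
print(f"F (raw feature):     {f_raw:.2f}")
print(f"F (squared feature): {f_sq:.2f}")
```

With one added feature and about 196 residual degrees of freedom, an F value above roughly 3.9 corresponds to significance at the 5% level. A feature that looks weak in its raw form can clear that bar easily once the right transformation is found, which is one concrete way to take the customer's domain knowledge seriously.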

Hope this helps


#3

Thanks.
I also felt that transforming the feature could help, but other than that I could not think of anything else. Adding unneeded features to the model would increase its complexity and could reduce the adjusted variance explained (adjusted R²), even though plain R² never goes down when a feature is added.
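
To make the point about explained variance concrete, here is a small sketch, again on made-up data, with a hypothetical helper `r2_scores`. It fits a model with and without a pure-noise feature and reports both R² and adjusted R². Plain R² cannot decrease when a column is added; adjusted R² penalizes the extra parameter and falls whenever the added feature's t statistic is below 1 in absolute value.

```python
import numpy as np

def r2_scores(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with an intercept."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - rss / tss
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

# Synthetic data (hypothetical): two real predictors plus one unrelated feature.
rng = np.random.default_rng(42)
n = 100
X_base = rng.normal(size=(n, 2))
x_noise = rng.normal(size=n)  # has no relationship with y
y = 0.5 + 1.0 * X_base[:, 0] + 0.8 * X_base[:, 1] + rng.normal(size=n)

r2_base, adj_base = r2_scores(X_base, y)
r2_full, adj_full = r2_scores(np.column_stack([X_base, x_noise]), y)
print(f"base model: R2={r2_base:.4f}  adjusted R2={adj_base:.4f}")
print(f"with noise: R2={r2_full:.4f}  adjusted R2={adj_full:.4f}")
```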


#4

Before attempting to solve the problem, I would first ask the interviewer a few questions that I think would clear up the ambiguity:

  • Which machine learning model is being used as of now?
  • How much experience does the customer have? (i.e., do they have domain knowledge?)
  • What steps did we take to come to the conclusion that the feature is insignificant?

#5

I think the answers to your questions are below:

  1. No machine learning model is used as of now
  2. Customer is a domain expert (e.g., finance, telecom)
  3. Customer has been using the information in this feature in the past to make decisions.

#6

Hi @sonicboom8

One point: in your answer you mentioned linear regression, but it is not the only model … what about a non-linear one? When you fit the linear model, how do your residuals look? Do you see a funnel shape or any other non-constant variance (the famous heteroscedasticity)? If the residuals are not well behaved, then the suggested variable is worth trying in a non-linear model.
If your first model gives good accuracy and shows no heteroscedasticity, why not add the variable and run a backward regression? You will find out whether the suggested variable is of any importance. The suggested variable could take all…
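
The residual check described above can be sketched roughly as follows. This is only an illustration on simulated data, in the spirit of a Breusch-Pagan test: the helper name `bp_lm_statistic` and both datasets are assumptions made for the example. It regresses the squared residuals of the linear fit on the predictor and uses n times the R² of that auxiliary regression as the test statistic.

```python
import numpy as np

def bp_lm_statistic(x, y):
    """n * R^2 from regressing squared OLS residuals on x (Breusch-Pagan-style check)."""
    A = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e2 = (y - A @ beta) ** 2                      # squared residuals of the main fit
    gamma, *_ = np.linalg.lstsq(A, e2, rcond=None)
    fitted = A @ gamma
    r2 = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return len(y) * float(r2)

# Simulated data (hypothetical): same mean structure, different noise behaviour.
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
y_const = 3.0 + 2.0 * x + rng.normal(scale=2.0, size=n)       # constant variance
y_funnel = 3.0 + 2.0 * x + rng.normal(scale=0.5 * x, size=n)  # variance grows with x

lm_const = bp_lm_statistic(x, y_const)
lm_funnel = bp_lm_statistic(x, y_funnel)
print(f"LM statistic, constant variance: {lm_const:.2f}")
print(f"LM statistic, funnel-shaped:     {lm_funnel:.2f}")
```

With a single predictor, the statistic is compared against a chi-squared distribution with 1 degree of freedom, whose 5% critical value is about 3.84. A value far above that suggests the funnel pattern is real, and plotting residuals against fitted values is still the quickest first look.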
Hope this helps.
Alain