I’m working on the bike sharing problem on Kaggle. My hypothesis is that the higher the temperature and the lower the humidity, the more people prefer to use bikes. So I created this variable:
data$newfeature = data$temp/data$humidity
Although this feature has very high importance in my model, the model’s accuracy has decreased slightly.
I think the problem is that temperature and humidity are on different scales. Should I scale them? If so, how? What is the best way, for this problem, to create a feature that captures my hypothesis?
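To make the question concrete, this is roughly the kind of scaling I have in mind. It is only a sketch: the toy values below are made up to stand in for the Kaggle columns, minmax is a helper I wrote for illustration, and the eps constant is an arbitrary choice to avoid dividing by zero. I don't know whether this is the right approach.

```r
# Toy data standing in for the real temp and humidity columns (made-up values)
data <- data.frame(temp = c(9.84, 14.76, 22.14, 30.34),
                   humidity = c(81, 75, 50, 30))

# Hypothetical helper: min-max scale a numeric vector to the range [0, 1]
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

data$temp_scaled <- minmax(data$temp)
data$humidity_scaled <- minmax(data$humidity)

# Ratio of the scaled columns; eps (arbitrary) keeps the denominator above zero
eps <- 0.01
data$newfeature_scaled <- data$temp_scaled / (data$humidity_scaled + eps)
```

With this version both inputs lie in the same range before the ratio is taken, but the result is still very sensitive to rows where the scaled humidity is near zero, so I'm unsure whether a ratio is the right form at all.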