Introducing non-linear transformations of Independent variables into a Logistic Model to improve accuracy

machine_learning
datascience

#1

Hi, I built a logistic model that has an accuracy of around 87% but a sensitivity of only 37%.

I want to improve both of these statistics. Is there a way to introduce non-linear transformations of the independent variables in the equation itself?

My original logistic model is:

glm(formula = RainTomorrow ~ Cloud3pm + Pressure3pm + Pressure9am + 
    Humidity3pm + Humidity9am + WindSpeed3pm + WindSpeed9am + 
    WindGustSpeed + Evaporation, family = "binomial", data = Train)

I want to introduce, let's say, squares or logs of all of the variables, or of just some of the IVs.
My attempt at the revised formula is below:

glm(formula = RainTomorrow ~ Cloud3pm + Pressure3pm + Pressure9am + 
    Humidity3pm + Humidity9am + WindSpeed3pm + WindSpeed9am + 
    WindGustSpeed + Evaporation + poly(Cloud3pm, 2) + 
    poly(Pressure3pm, 2), family = "binomial", data = Train)

weather.csv (35.8 KB)

However, the above formula produces NAs for the added transformed variables. How can I fix this?


#2

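@Pranov_Mishra The poly() function returns a matrix, not a vector: one column per degree. Its first column is just a centered and rescaled copy of the variable, so it duplicates the Cloud3pm and Pressure3pm terms that are already in the formula. That is most likely why NAs are being introduced.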
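
A minimal sketch of what seems to be going on, using simulated data rather than weather.csv (the values, the coefficients in eta, and the object names fit_aliased, fit_poly, fit_square and fit_log are all made up for illustration). It assumes the NAs come from listing a variable both as a plain term and inside poly(), and shows two possible ways around that, plus a log transform written directly in the formula:

```r
# Simulated stand-in for the weather data (hypothetical values, not weather.csv)
set.seed(42)
n <- 500
Train <- data.frame(
  Cloud3pm    = sample(0:8, n, replace = TRUE),
  Pressure3pm = rnorm(n, mean = 1015, sd = 7),
  Evaporation = rexp(n, rate = 0.2)
)
# Hypothetical outcome with a curved effect of Cloud3pm
eta <- -2 + 0.1 * (Train$Cloud3pm - 4)^2 - 0.05 * (Train$Pressure3pm - 1015)
Train$RainTomorrow <- rbinom(n, size = 1, prob = plogis(eta))

# Plain term plus poly() on the same variable: the first poly() column is a
# linear function of Cloud3pm, so glm() cannot estimate it and reports NA
fit_aliased <- glm(RainTomorrow ~ Cloud3pm + Pressure3pm + poly(Cloud3pm, 2),
                   family = "binomial", data = Train)
coef(fit_aliased)

# Option 1: drop the plain term and let poly() supply linear + quadratic
fit_poly <- glm(RainTomorrow ~ Pressure3pm + poly(Cloud3pm, 2),
                family = "binomial", data = Train)

# Option 2: keep the plain term and add only the square, protected by I()
fit_square <- glm(RainTomorrow ~ Cloud3pm + Pressure3pm + I(Cloud3pm^2),
                  family = "binomial", data = Train)

# Logs can also go straight into the formula; log1p() tolerates zeros
fit_log <- glm(RainTomorrow ~ Cloud3pm + Pressure3pm + log1p(Evaporation),
               family = "binomial", data = Train)

anyNA(coef(fit_poly))    # no aliased coefficients expected here
anyNA(coef(fit_square))  # nor here
```

Options 1 and 2 describe the same curve in Cloud3pm, so they should give the same fitted probabilities; only the individual coefficients differ, because poly() uses orthogonal polynomials by default.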


#3

You can just fit the model on the complete cases (rows with no missing values). That should be fine, and you may get improved accuracy.
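
As a rough sketch of that suggestion, again on made-up data rather than weather.csv (the column values, the 30 injected NAs, and the names vars, Train_cc and fit are assumptions for illustration). Note that glm() already drops incomplete rows by default for plain terms, but poly() stops with an error when its input contains missing values, so filtering first is one way to keep the formula working:

```r
# Hypothetical data with a few missing values (not the real weather.csv)
set.seed(1)
n <- 300
Train <- data.frame(
  Humidity3pm = runif(n, 10, 100),
  Cloud3pm    = sample(0:8, n, replace = TRUE)
)
Train$RainTomorrow <- rbinom(n, 1, plogis(-4 + 0.05 * Train$Humidity3pm))
Train$Cloud3pm[sample(n, 30)] <- NA   # inject 30 missing values

# Columns that the model will use
vars <- c("RainTomorrow", "Humidity3pm", "Cloud3pm")

# Keep only the rows where every model variable is observed
Train_cc <- Train[complete.cases(Train[, vars]), ]

fit <- glm(RainTomorrow ~ Humidity3pm + poly(Cloud3pm, 2),
           family = "binomial", data = Train_cc)

nrow(Train) - nrow(Train_cc)  # number of rows dropped: 30
```

It is worth checking how many rows are lost this way. If a large share of the data disappears, the complete-case fit may not be representative of the full data set.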