How to use variables with exponential values for a classification problem in R


#1

Dear experts and data scientists,

I am looking for a detailed method, or a reference, that addresses the following questions:

a) I am working on a classification problem: predicting the purchase of Y from 22 features.
Two of these 22 features are in exponential (scientific) notation, e.g. 8.05E+12. How should I use them for classification with Random Forest and Logistic Regression?

b) In the same problem as part a), two of the 22 features contain dates. How should I use these dates in Random Forest and Logistic Regression?
I have tried as.numeric(as.Date(data$F15)), but this resulted in negative numbers.

c) Which data types are valid for the independent variables (features) when using Random Forest for a classification task?

Hoping for an answer as early as possible.

Thank you


#2

Hi @manishceeri

a) For the exponential data, you can either store only the power of ten instead of the whole value (for example, for 8.05E+12, store 12), or, if all the values have nearly the same magnitude, keep the leading value and drop the power.

b) For dates, you can use the date feature to extract other useful information such as day, month, year, week, and so on, and use those as features when building the model.

c) You can use integer and factor data types for the independent features.
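The three suggestions above can be sketched in base R as follows. This is a minimal illustration, not a definitive recipe: the data frame, its column names (amount, signup, channel) and the date format are all hypothetical, and the exponent split assumes non-zero values. Note that as.numeric() on a Date returns the number of days since 1970-01-01, so earlier dates come out negative, which may explain the negative numbers mentioned in the question.

```r
# Hypothetical data; column names and values are illustrative only
df <- data.frame(
  amount  = c("8.05E+12", "3.20E+11", "1.50E+13"),
  signup  = c("2016-03-15", "1965-07-04", "2001-12-31"),
  channel = c("web", "store", "web"),
  stringsAsFactors = FALSE
)

# a) Scientific notation is only a display format: convert to numeric,
#    then split into power of ten and leading value (assumes non-zero values)
df$amount     <- as.numeric(df$amount)
df$amount_exp <- floor(log10(abs(df$amount)))
df$amount_man <- df$amount / (10^(df$amount_exp))

# b) Parse the dates (the format string is an assumption) and derive parts
d <- as.Date(df$signup, format = "%Y-%m-%d")
df$year  <- as.integer(format(d, "%Y"))
df$month <- as.integer(format(d, "%m"))
df$day   <- as.integer(format(d, "%d"))
df$week  <- as.integer(format(d, "%U"))   # week of the year
df$wday  <- as.POSIXlt(d)$wday            # 0 = Sunday ... 6 = Saturday
df$days_since_epoch <- as.numeric(d)      # negative before 1970-01-01

# c) Categorical text becomes a factor; drop the raw date string
df$channel <- factor(df$channel)
df$signup  <- NULL

sapply(df, class)  # inspect the resulting column types
```

Whether the raw value, its power of ten, or a log10 transform works best depends on the data. Tree-based models such as Random Forest are generally insensitive to the scale of a numeric feature, whereas Logistic Regression tends to behave better with rescaled or log-transformed inputs.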

Hope this helps.
Shubham


#3

Before loading the data file, you should verify that all the data is in the right format. If it is not, correct it first and then load it. If a number is very large, e.g. 1,000,000,000, you can express it in a larger unit, i.e. 1,000 million. That makes the values easier to work with.
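A minimal sketch of that rescaling in R, assuming a hypothetical numeric vector named revenue:

```r
# Hypothetical large values; the variable name is illustrative only
revenue <- c(1e9, 2.5e9, 7.5e8)

# Express the values in millions instead of single units
revenue_millions <- revenue / 1e6
revenue_millions   # 1000 2500 750
```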