I am working on a project to predict sales quantity from historical data.
The data contains attributes such as MRP, size, discount, color, fabric type, etc.
The data includes both sales quantity and stock on hand. I am currently trying to predict sales quantity with a random forest in R.
I applied some basic feature transformations (one-hot encoding of the categorical variables and normalization of the data) and then trained the model.
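
For concreteness, the two transformations amount to something like the sketch below. It is written in plain Python purely for illustration (my actual code is in R). The column names `mrp` and `color`, the sample rows, and the choice of min-max scaling as the normalization are all assumptions made for the example.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per level."""
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


def min_max_scale(rows, column):
    """Rescale a numeric column to the [0, 1] range (min-max normalization)."""
    values = [row[column] for row in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant column has no spread, so map it to 0.0 instead of dividing by 0.
    return [
        {**row, column: (row[column] - lo) / span if span else 0.0}
        for row in rows
    ]


# Made-up example rows; the real data has more attributes.
items = [
    {"mrp": 500.0, "color": "red"},
    {"mrp": 1000.0, "color": "blue"},
    {"mrp": 1500.0, "color": "red"},
]

features = min_max_scale(one_hot(items, "color"), "mrp")
for row in features:
    print(row)
```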
The data has some issues, which I handled with the manipulations listed below.

- I have two datasets: one contains sales records and the other contains stock on hand. We match the two and aggregate the records by week.
- In some cases stock on hand is not available even though sales happened. For those records we use the sales quantity as the stock on hand.
- Unsold records have no discount. To fill it in, we take the average discount for the same item over the same period from other stores (a suggestion from the client).
- Some records have a negative sales quantity after aggregation. We set those negative values to zero.

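
To make these manipulations unambiguous, here is a minimal sketch of the logic in plain Python (for illustration only; my actual code is in R). The store/item/week keys and all sample values are made up. Two details are assumptions, because I have not pinned them down above: the weekly discount is taken as the mean of the discounts on that week's sales records, and the imputed discount is the mean of the other stores' weekly means.

```python
from collections import defaultdict
from statistics import mean

# Made-up sales records: (store, item, week, quantity, discount).
sales_rows = [
    ("S1", "A", 1, 3, 10.0),
    ("S1", "A", 1, 2, 20.0),
    ("S2", "A", 1, -1, 0.0),  # nets out negative for the week
    ("S1", "B", 1, 4, 5.0),   # sold, but no stock record exists
]
# Made-up stock on hand per (store, item, week).
stock_on_hand = {
    ("S1", "A", 1): 12,
    ("S2", "A", 1): 7,
    ("S3", "A", 1): 9,        # in stock, nothing sold
}

# Match and aggregate sales per (store, item, week).
qty = defaultdict(int)
discounts = defaultdict(list)
for store, item, week, quantity, discount in sales_rows:
    key = (store, item, week)
    qty[key] += quantity
    discounts[key].append(discount)

records = {}
for key in sorted(set(qty) | set(stock_on_hand)):
    sold = key in qty
    quantity = qty[key] if sold else 0
    records[key] = {
        "sales_qty": quantity,
        # Missing stock on hand: fall back to the sales quantity.
        "stock": stock_on_hand.get(key, quantity),
        "discount": mean(discounts[key]) if sold else None,
    }

# Unsold records: average discount for the same item and week in other stores.
for (store, item, week), rec in records.items():
    if rec["discount"] is None:
        others = [
            mean(d)
            for (s, i, w), d in discounts.items()
            if i == item and w == week and s != store
        ]
        rec["discount"] = mean(others) if others else None

# Negative weekly sales are set to zero.
for rec in records.values():
    rec["sales_qty"] = max(rec["sales_qty"], 0)

for key, rec in records.items():
    print(key, rec)
```

In this toy example the unsold record for store `S3` ends up with sales quantity 0 and an imputed discount, and the record with no stock entry gets its stock set equal to its sales quantity.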
For training, unsold items have a sales quantity of zero. Unsold items far outnumber sold items in the data (for example, roughly 300 unsold for every 20 sold).
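
To put a number on that imbalance, here is the arithmetic as a tiny Python snippet (illustration only), assuming the rough 20 sold to 300 unsold ratio above is representative of the whole training set.

```python
# Rough counts from my data; assumed to be representative of the whole set.
sold, unsold = 20, 300

# Fraction of training rows whose target (sales quantity) is exactly zero.
zero_share = unsold / (sold + unsold)

print(f"share of rows with zero sales: {zero_share:.4f}")
```

So about 94% of the training targets are zero, which means a model that always predicts zero would already match the target on that share of rows.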
I am not getting any useful results. I also tried the same thing with a **neural network** in R.
I am a beginner in predictive modelling and this is my first project. Please correct me if anything is wrong in my approach.