How to convert data in "Practice Problem: Time Series" to ts format in R?

r

#1

Hi all,

I am working on the dataset from the Time Series practice problem on AV, which has hourly values over a period of 2 years. I am having difficulty converting the data to time series format, since the frequency for hourly data has to be something other than the usual 12, 4 or 1 used for monthly, quarterly or annual data. I tried setting the frequency as below:

frequency = 24*7*365.25

So I used the xts package to get the data into a format with date+time as row names and the values as a column. The data is now of class xts, and I am guessing I can't use the ts functions like plot, decompose etc. to proceed.

The plot function does not show the trend for the full time period, and the decompose function does not work.

If someone can tell me how to convert this data to ts format, I think I can proceed with what is required. Appreciate your help. A snapshot of the data is below:

ID Datetime Count
1 0 2012-08-25 00:00:00 8
2 1 2012-08-25 01:00:00 2
3 2 2012-08-25 02:00:00 6
4 3 2012-08-25 03:00:00 2
5 4 2012-08-25 04:00:00 2
6 5 2012-08-25 05:00:00 2

My code at this point is below, and I would be delighted if somebody could help me proceed with time series forecasting on this dataset.

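TS.train <- read.csv("Train.csv")

# Parse the Datetime strings (day-month-year hour:minute) into POSIXct
TS.train$Datetime <- as.POSIXct(strptime(TS.train$Datetime, format = "%d-%m-%Y %H:%M"))
library(xts)
# Build an xts object: Count values indexed by Datetime
train <- xts(x = TS.train[, -c(1, 2)], order.by = TS.train[, 2])
names(train) <- "Values"
summary(train$Values)
# Attempted conversion to ts with the frequency described above
train.ts <- ts(train$Values, start = c(2012, 8), frequency = 24*7*365.25)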


#2

Hi @Pranov_Mishra

Since it is an hourly time series, you have to use 24 as the frequency, so that one seasonal period covers one day. Apart from working on the hourly data, you can also aggregate the hourly data into daily data, as done in the following course.
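
For anyone following along in R, here is a minimal sketch of both suggestions, using base R only. The data is a synthetic stand-in for the Count column (an assumption, since Train.csv is not reproduced here), and the names hourly.ts, parts and daily are just illustrative.

```r
# Synthetic stand-in for the hourly Count column: 28 days of data (assumption).
set.seed(1)
n_hours  <- 24 * 28
datetime <- seq(as.POSIXct("2012-08-25 00:00:00", tz = "UTC"),
                by = "hour", length.out = n_hours)
count    <- 10 + 5 * sin(2 * pi * (seq_len(n_hours) - 1) / 24) + rnorm(n_hours)

# frequency = 24: one seasonal period is one day of 24 hourly observations.
hourly.ts <- ts(count, frequency = 24)

# decompose() needs at least two full periods; there are 28 here.
parts <- decompose(hourly.ts)
plot(parts)

# Aggregate the hourly values to daily totals, grouping by calendar date.
daily <- tapply(count, as.Date(datetime, tz = "UTC"), sum)
head(daily)
```

With the real data, count and datetime would come from the Count and Datetime columns after parsing. Note that with frequency = 24 the time axis of the ts object is measured in days, not years.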


#3

Hi @pjoshi15,
I used the frequency of 24 and have been able to build a model. But I think a lot more needs to be done, as I can see from the leaderboard. I see you have shared an approach document on the leaderboard, but I can't get to it. Do you have a direct link?

Also, the tool I am using is R, so I am not sure a Python reference will be very helpful. I will go through it to find out. Appreciate your help.


#4

@Pranov_Mishra you can ignore that approach link; that was just for testing purposes.

Do go through that Python course once. Let me know if you face any issues.


#5

1. Extract the date and month from the Datetime string into separate fields (using a string or separator function). Group by and aggregate hourly to daily, then daily to monthly; check that the daily values aggregate properly to monthly; then remove the first column and store the result in a new df.
2. Check whether there are any missing dates, e.g. with sum(is.na(date_field)). You need to fill in the missing dates.
3. You can also use the apply.monthly function, which will aggregate the daily data to monthly. There are different apply.? functions; you can check the help for more.
4. The frequency can then be 30 days or 1 month.

Thanks,
vikas
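
The steps above could be sketched with xts roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, two hours are dropped on purpose to mimic gaps, and filling gaps with na.locf (carry the last value forward) is my own choice, not something prescribed in this thread.

```r
library(xts)

# Synthetic hourly counts over 61 days, with two hours removed (assumption).
set.seed(42)
full  <- seq(as.POSIXct("2012-08-25 00:00:00", tz = "UTC"),
             by = "hour", length.out = 24 * 61)
keep  <- setdiff(seq_along(full), c(100, 500))
datetime <- full[keep]
count    <- rpois(length(keep), lambda = 10)

# Step 2: find timestamps missing from the regular hourly grid.
present <- as.numeric(full) %in% as.numeric(datetime)
missing_hours <- full[!present]

# Put the values on the full grid (NA where missing), then fill the gaps.
vals <- rep(NA_real_, length(full))
vals[present] <- count
train <- na.locf(xts(vals, order.by = full))

# Steps 1 and 3: aggregate hourly -> daily -> monthly.
daily   <- apply.daily(train, sum)
monthly <- apply.monthly(daily, sum)

# Step 4: monthly series as ts; frequency = 12, starting August 2012.
monthly.ts <- ts(as.numeric(monthly), start = c(2012, 8), frequency = 12)
```

Comparing sum(daily) with sum(monthly) is a quick way to check that the daily values were aggregated properly to monthly.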