I have two different time series:

- Location: lat/long/floor. These are "significant location changes" from a mobile device, not updates every few meters or anything like that.
- App metrics: timestamp, app_name, metric (0-1). The apps are things like dieting apps.
I'd like to correlate the two, to see whether location changes (or the lack of them) are predictive of improved metrics in the apps.
I know that later I will be comparing two RNNs, an LSTM and an ESN, to see whether building out a well-tuned LSTM is worth the effort; that comes later. For now, I simply need a statistical (classical ML) baseline, e.g. with VARMAX.
I have generated mock data: several thousand rows covering 3 apps and 3 users over about a year of use. It is designed to contain both positive and negative correlations: between metrics and app-location pairs, and between use/disuse of apps and the resulting absence of metrics. The apps have a decay function, so disuse trends a metric down, but the overall trend is strongly positive. The data actually uses readable place names rather than lat/long/floor location objects.
I have loaded these into a Jupyter notebook and generated a crosstab to convert the location categories into columns, so I have rows like:
ts user home work relatives
<timestamp> user_1 0 0 1
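For reference, that one-hot crosstab step can be sketched with `pd.crosstab` (the rows here are made up, just matching the shape I described):

```python
import pandas as pd

# Hypothetical raw location events: one row per "significant location change".
loc = pd.DataFrame({
    "ts": pd.to_datetime(["2023-01-01 08:00", "2023-01-01 09:00", "2023-01-01 18:00"]),
    "user": ["user_1", "user_1", "user_1"],
    "location": ["home", "work", "relatives"],
})

# One-hot the location column so each category becomes its own 0/1 column,
# keyed by (timestamp, user).
wide = pd.crosstab([loc["ts"], loc["user"]], loc["location"]).reset_index()
print(wide)
```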
and in another DF:
ts user app metric
<timestamp> user_1 app_1 0.3
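One way to get these two frames onto a common grid (a sketch only, assuming a daily frequency and a single user; with multiple users you would group by user first) is to pivot the metrics wide and resample both frames to the same index before joining:

```python
import pandas as pd

# Hypothetical frames matching the two shapes described above.
loc = pd.DataFrame({
    "ts": pd.to_datetime(["2023-01-01 08:00", "2023-01-01 20:00", "2023-01-02 09:00"]),
    "user": "user_1",
    "home": [0, 1, 0], "work": [1, 0, 1], "relatives": [0, 0, 0],
})
met = pd.DataFrame({
    "ts": pd.to_datetime(["2023-01-01 21:00", "2023-01-02 07:00"]),
    "user": "user_1",
    "app": ["app_1", "app_1"],
    "metric": [0.3, 0.5],
})

# Pivot metrics so each app is a column, then resample both frames to a daily grid.
met_wide = met.pivot_table(index="ts", columns="app", values="metric")
loc_daily = loc.set_index("ts")[["home", "work", "relatives"]].resample("D").mean()
met_daily = met_wide.resample("D").mean()

# One row per day: location occupancy fractions alongside mean app metrics.
panel = loc_daily.join(met_daily, how="outer")
print(panel)
```

Note that resampling the 0/1 location columns with `.mean()` already turns them into "fraction of observations at that location per day", which is one answer to the smoothing question below.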
With VARMAX itself I have only written a "hello world" of sorts, using statsmodels.tsa.statespace.varmax. In that example there was one measurement for each series at every timestamp. I'm not sure how to do that with this data, though.
The problem I am facing now is data normalization: if I am to put a row in for each timestamp in both data sets, then I need to normalize the metric data with a rolling average or something similar (no big deal there), but the other side is 1/0 categorical data. How do I "smooth" that out?
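To make the question concrete, the kind of smoothing I mean for a 0/1 indicator would be something like this (toy series, window and halflife picked arbitrarily):

```python
import pandas as pd

# Hypothetical 0/1 "at work" indicator, one value per timestamp.
s = pd.Series([0, 0, 1, 1, 1, 0, 0, 1, 0, 0], dtype=float)

# A rolling mean turns the indicator into "fraction of recent time at work" ...
rolling = s.rolling(window=4, min_periods=1).mean()

# ... while an exponentially weighted mean gives more weight to recent samples,
# which matches the time-based-decay idea I mention below.
ewm = s.ewm(halflife=2).mean()

print(rolling.tolist())
print(ewm.round(3).tolist())
```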
Can anyone explain what I need to do? And am I on the right path here, or should I be converting the categorical data to a continuous value (based on the mean of nearby metric data with some time-based decay, or something along those lines)?