How to prove causation with confidence for telecommunications data


We have a telco business case: we installed picocells in some areas and have collected daily upload and download volumes, as well as picocell upload and download data, for the last 3 months. I also have daily upload and download numbers for the 3 years before that, when no picocells were installed.
The goal is to show statistically that any increase in data usage is caused by the picocells and not by something else. If proven, it could help reduce churn.
What should the solution approach be? Any ideas?



Yes, something very basic you can do is to apply a decomposition (trend, seasonal, remainder) to your time series (data consumption over time).

With that you will see whether your data consumption trend was stable in the previous years and whether the trend changed once you introduced the picocells.

You can do that in R with a single function, decompose(), applied to your data (stored as a ts object with a seasonal frequency).
The function returns the components of the series; when you plot them you get several charts, and the trend chart will show whether that upward trend is present.

Carlos Ortega
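
For anyone not using R, the decomposition idea above can be sketched roughly in Python. This is only an illustration of a classical additive decomposition by moving averages (the same general idea behind decompose()), written with the standard library. The data are synthetic, and every name and number here (decompose_additive, mean_slope, the 7-day period, the installation on day 56) is an assumption made for the example, not something from the original data.

```python
def centered_moving_average(series, period):
    """Trend estimate: centered moving average spanning one full period."""
    n, half = len(series), period // 2
    trend = [None] * n  # undefined at both ends of the series
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:  # even period: half weight on the two end points
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def decompose_additive(series, period):
    """Split a series into trend + seasonal + remainder (additive model)."""
    trend = centered_moving_average(series, period)
    # Average the detrended values at each position within the period.
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [sum(b) / len(b) for b in buckets]
    centre = sum(raw) / period
    pattern = [r - centre for r in raw]  # seasonal effects sum to zero
    seasonal = [pattern[t % period] for t in range(len(series))]
    remainder = [None if tr is None else y - tr - s
                 for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, remainder


def mean_slope(values):
    """Average change per day between the first and last value."""
    return (values[-1] - values[0]) / (len(values) - 1)


# Synthetic daily usage: flat before day 56, then growing by 2 units per day.
PERIOD, INSTALL_DAY, N_DAYS = 7, 56, 112
WEEKLY = [5, 3, 0, -2, -4, -3, 1]  # made-up weekly pattern, sums to zero
usage = [100 + 2 * max(0, t - (INSTALL_DAY - 1)) + WEEKLY[t % PERIOD]
         for t in range(N_DAYS)]

trend, seasonal, remainder = decompose_additive(usage, PERIOD)

half = PERIOD // 2
pre_slope = mean_slope(trend[half : INSTALL_DAY - half])
post_slope = mean_slope(trend[INSTALL_DAY + half : N_DAYS - half])
print("trend slope before installation:", pre_slope)
print("trend slope after installation: ", post_slope)
```

On real data you would compare the trend before and after the installation date in the same way. A change in the trend is suggestive, but on its own it does not rule out other causes that happened at the same time.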


If I understand your problem correctly, you can apply t-tests or ANOVA and check the impact of the picocells on data usage. These tests will help you compare the two situations (before and after installation).
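
As a minimal sketch of the t-test idea, here is Welch's two-sample t-test (the unequal-variance version) in plain Python. The function name welch_t_test and the synthetic "before" and "after" samples are assumptions for illustration. The p-value uses a normal approximation to the t distribution, which is only reasonable because the samples are large; it also assumes the daily values are independent, which real daily usage data may not be.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Welch's t-test for a difference in means between samples a and b.

    Returns (t statistic, Welch-Satterthwaite degrees of freedom,
    two-sided p-value from a normal approximation).
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances
    se2 = va / na + vb / nb            # squared standard error of the difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    p = 2 * (1 - NormalDist().cdf(abs(t)))  # large-sample approximation
    return t, df, p


# Synthetic daily usage: one year without picocells, three months with them.
rng = random.Random(0)
before = [rng.gauss(100, 10) for _ in range(365)]
after = [rng.gauss(110, 10) for _ in range(90)]

t_stat, df, p_value = welch_t_test(after, before)
print(f"t = {t_stat:.2f}, df = {df:.1f}, p = {p_value:.4g}")
```

H0 here is "mean usage is the same before and after". A small p-value says the means differ; it does not say why they differ.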


I agree with Guru. The simplest approach is a hypothesis test. Set your H0 and test the mean, median, or variance of the quantity you want to compare. Pick the right test and you should be able to tell whether there is an impact.

Regression can also help you determine whether a variable (say, a flag that is Y or N depending on whether cells are installed) has a significant effect on the outcome.
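
A rough sketch of the regression idea, assuming a model of the form usage = b0 + b1 * day + b2 * flag, where flag is 1 once the cells are installed. Adding the day term is my own assumption (it lets the flag coefficient be read net of a pre-existing trend); the helper ols, the synthetic data, and all the numbers are hypothetical. It uses only the standard library and returns the coefficients without standard errors.

```python
import random
from statistics import mean


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data: slow organic growth plus a jump of 8 when cells go live.
rng = random.Random(1)
N_DAYS, INSTALL_DAY, TRUE_EFFECT = 200, 150, 8.0
days = list(range(N_DAYS))
flags = [1.0 if t >= INSTALL_DAY else 0.0 for t in days]
usage = [50 + 0.1 * t + TRUE_EFFECT * f + rng.gauss(0, 1)
         for t, f in zip(days, flags)]

# Design matrix columns: intercept, day index, picocell flag.
X = [[1.0, float(t), f] for t, f in zip(days, flags)]
intercept, trend_coef, flag_coef = ols(X, usage)

# Naive before/after comparison, which mixes the trend into the "effect".
naive_diff = mean(usage[INSTALL_DAY:]) - mean(usage[:INSTALL_DAY])

print(f"flag coefficient (trend-adjusted): {flag_coef:.2f}")
print(f"naive after-minus-before mean:     {naive_diff:.2f}")
```

In this made-up example the naive difference in means is much larger than the jump that was built into the data, because usage was already growing before the installation. The flag coefficient from the regression is closer to it, which is the main reason to prefer a regression with a trend term over a plain before/after comparison.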