Automatic Causal Inference and Forecasting
Time Series and Forecasting Symposium
December 2, 2022
df <- read.csv("chicago.csv")
head(df)
#> Time Temperature Crime
#> 1 1 24.08 1605
#> 2 2 19.04 1119
#> 3 3 28.04 1127
#> 4 4 30.02 1154
#> 5 5 35.96 1251
#> 6 6 33.08 1276
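library(fastEDM)
# Test whether Crime CCM-causes Temperature (the candidate cause is given first).
crimeCCMCausesTemp <- easy_edm("Crime", "Temperature", data=df)
#> ✖ No evidence of CCM causation from Crime to Temperature found.
# Test the reverse direction: does Temperature CCM-cause Crime?
tempCCMCausesCrime <- easy_edm("Temperature", "Crime", data=df)
#> ✔ Some evidence of CCM causation from Temperature to Crime found.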
Jinjing Li
University of Canberra
George Sugihara
University of California San Diego
Michael J. Zyphur
University of Queensland
Patrick J. Laub
UNSW
Imagine x_t, y_t, z_t are interesting time series…
If the data is generated according to the nonlinear system:
\begin{aligned} x_{t+1} &= \sigma (y_t - x_t) \\ y_{t+1} &= x_t (\rho - z_t) - y_t \\ z_{t+1} &= x_t y_t - \beta z_t \end{aligned}
then y \Rightarrow x, both x, z \Rightarrow y, and both x, y \Rightarrow z.
Say \mathbf{x}_t = (x_t, y_t, z_t), then if:
\mathbf{x}_{t+1} = \mathbf{A} \mathbf{x}_{t}
we have a linear system. If instead
\mathbf{x}_{t+1} = f(\mathbf{x}_{t})
for some nonlinear function f, we have a nonlinear system.
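To make the distinction concrete, here is a small base R sketch that takes one step of each kind of system and checks superposition. The matrix A, the parameter values, and the function names `linear_step` and `nonlinear_step` are assumptions chosen only for illustration; the nonlinear map is the three-variable system given earlier.

```r
# Arbitrary illustrative transition matrix (an assumption, not from the talk).
A <- matrix(c(0.9, 0.1, 0,
              0,   0.8, 0.1,
              0.1, 0,   0.7), nrow = 3, byrow = TRUE)
linear_step <- function(v) as.vector(A %*% v)

# One step of the nonlinear system given earlier; parameter values are assumed.
nonlinear_step <- function(v, sigma = 10, rho = 28, beta = 8 / 3) {
  x <- v[1]; y <- v[2]; z <- v[3]
  c(sigma * (y - x),
    x * (rho - z) - y,
    x * y - beta * z)
}

a <- c(1, 2, 3)
b <- c(0.5, -1, 2)

# Superposition holds for the linear map ...
lin_ok <- isTRUE(all.equal(linear_step(a + b), linear_step(a) + linear_step(b)))
# ... but not for the nonlinear map, because of the product terms.
nonlin_ok <- isTRUE(all.equal(nonlinear_step(a + b),
                              nonlinear_step(a) + nonlinear_step(b)))
c(linear = lin_ok, nonlinear = nonlin_ok)   # TRUE, FALSE
```

The failure of superposition is what rules out the usual linear toolkit and motivates working directly from the data.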
Using a term like nonlinear science is like referring to the bulk of zoology as the study of non-elephant animals. (Stanisław Ulam)
We don’t fit a model for f; instead, we use the data non-parametrically. Hence the name empirical dynamic modelling.
Takens’ theorem to the rescue, though…
Takens’ theorem is a deep mathematical result with far-reaching implications. Unfortunately, to really understand it, it requires a background in topology. (Munch et al. 2020)
Source: Munch et al. (2020), Frequently asked questions about nonlinear dynamics and empirical dynamic modelling, ICES Journal of Marine Science.
Given two time series, create E-length trajectories
\mathbf{x}_t = (\text{Temp}_t, \text{Temp}_{t-1}, \dots, \text{Temp}_{t-(E-1)}) \in \mathbb{R}^{E}
and targets
y_t = \text{Crime}_{t} .
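The construction above can be sketched in base R. This is a minimal illustration rather than fastEDM's internal code; the helper name `make_embedding` is hypothetical, and the toy vectors are the six rows of the dataset printed earlier.

```r
# Hypothetical helper: build E-length lagged trajectories from x and pair
# each one with the contemporaneous target y_t.
make_embedding <- function(x, y, E) {
  stopifnot(length(x) == length(y), E >= 1, E <= length(x))
  t_idx <- E:length(x)                      # first time with E - 1 lags available
  # Column j holds x_{t-(j-1)}, so each row is (x_t, x_{t-1}, ..., x_{t-(E-1)}).
  X <- sapply(0:(E - 1), function(lag) x[t_idx - lag])
  X <- matrix(X, nrow = length(t_idx), ncol = E)
  list(t = t_idx, X = X, y = y[t_idx])
}

temp  <- c(24.08, 19.04, 28.04, 30.02, 35.96, 33.08)
crime <- c(1605, 1119, 1127, 1154, 1251, 1276)

emb <- make_embedding(temp, crime, E = 2)
emb$X[1, ]   # (Temp_2, Temp_1) = 19.04, 24.08
emb$y[1]     # Crime_2 = 1119
```

With n observations and embedding dimension E, only n - (E - 1) trajectories are available, because the first E - 1 time points lack a full set of lags.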
Split the embedded points into a library set \mathcal{L} and a prediction set \mathcal{P}. For a point \mathbf{x}_{s} \in \mathcal{P}, pretend we don’t know y_s and try to predict it.
\forall \, \mathbf{x} \in \mathcal{L} \quad \text{ find } \quad d(\mathbf{x}_{s}, \mathbf{x})
This is computationally demanding.
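As a rough sketch of this step, the function below computes the distance from one prediction point to every library point. The metric d is not specified here, so Euclidean distance is an assumption, and `dist_to_library` is a hypothetical name.

```r
# Distance from a single point x_s to every row of the library matrix.
# Euclidean distance is assumed for d.
dist_to_library <- function(x_s, lib_X) {
  diffs <- sweep(lib_X, 2, x_s)     # subtract x_s from each library row
  sqrt(rowSums(diffs^2))
}

lib_X <- rbind(c(0, 0), c(3, 4), c(1, 1))   # toy library of three E = 2 points
x_s   <- c(0, 0)
d <- dist_to_library(x_s, lib_X)
d   # 0, 5, sqrt(2)
```

Repeating this for every point in the prediction set requires |P| × |L| distance evaluations, each over E coordinates, which is where the computational cost comes from.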
For point \mathbf{x}_{s} \in \mathcal{P}, find k nearest neighbours in \mathcal{L}.
Say, e.g., k=2 and the neighbours are
\mathcal{NN}_k = \bigl( (\mathbf{x}_{3}, y_3), (\mathbf{x}_{5}, y_5) \bigr)
The simplex method predicts
\widehat{y}_s = w_1 y_3 + w_2 y_5 .
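A minimal base R sketch of a simplex-style prediction for one point follows. The weights w_i are not defined above, so the sketch assumes the common convention of exponential weights scaled by the nearest-neighbour distance and normalised to sum to one; the Euclidean metric and the name `simplex_predict` are also assumptions.

```r
# Simplex-style prediction for one point x_s from a library (lib_X, lib_y).
simplex_predict <- function(x_s, lib_X, lib_y, k) {
  d  <- sqrt(rowSums(sweep(lib_X, 2, x_s)^2))   # Euclidean distances (assumed)
  nn <- order(d)[seq_len(k)]                    # indices of the k nearest neighbours
  d_min <- max(d[nn][1], 1e-12)                 # guard against division by zero
  w <- exp(-d[nn] / d_min)                      # assumed weighting: closer means heavier
  w <- w / sum(w)                               # normalise so the weights sum to 1
  sum(w * lib_y[nn])                            # weighted average of neighbour targets
}

lib_X <- rbind(c(1, 1), c(2, 2), c(10, 10), c(1, 2))
lib_y <- c(10, 20, 100, 15)

# The two nearest neighbours of (1.1, 1.1) are rows 1 and 4, so the
# prediction is a weighted average of 10 and 15.
p2 <- simplex_predict(c(1.1, 1.1), lib_X, lib_y, k = 2)
p1 <- simplex_predict(c(1.1, 1.1), lib_X, lib_y, k = 1)   # just the nearest target
```

Because the prediction is a convex combination of neighbouring targets, it always lies within their range. A common choice in the literature is k = E + 1.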
Sequential Locally Weighted Global Linear Maps (S-map)
Weight the points by distance w_i = \exp\bigl\{ - \theta d(\mathbf{x}_{s}, \mathbf{x}_i) \bigr\} .
Build a local linear system \widehat{y}_s = \mathbf{x}_s^\top \boldsymbol{\beta}_s .
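The two S-map steps above can be sketched in base R as a weighted least-squares fit. This is an illustration under stated assumptions: Euclidean distance for d, an added intercept column, and the hypothetical name `smap_predict`. Many S-map implementations also divide the distance by its mean before applying \theta; that rescaling is omitted here to match the weight formula as written.

```r
# S-map-style prediction for one point x_s using all library points.
smap_predict <- function(x_s, lib_X, lib_y, theta) {
  d <- sqrt(rowSums(sweep(lib_X, 2, x_s)^2))   # Euclidean distances (assumed)
  w <- exp(-theta * d)                         # weights from the formula above
  A <- cbind(1, lib_X)                         # intercept column (an assumption)
  # Weighted least squares: solve (A' W A) beta = A' W y for local coefficients.
  beta <- solve(crossprod(A, w * A), crossprod(A, w * lib_y))
  sum(c(1, x_s) * as.vector(beta))
}

lib_X <- cbind(c(0, 1, 0, 1, 2), c(0, 0, 1, 1, 3))
lib_y <- 2 + 3 * lib_X[, 1] - lib_X[, 2]       # exactly linear toy targets

# With exactly linear data, any theta recovers the true value 2 + 1.5 - 0.5 = 3.
p <- smap_predict(c(0.5, 0.5), lib_X, lib_y, theta = 1)
```

Setting theta = 0 weights every library point equally, which reduces to a single global linear model; larger values make the fit increasingly local.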
For all s \in \mathcal{P}, compare \widehat{y}_s to the true y_s, and calculate the correlation \rho between the predictions and the observations.
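The whole evaluation loop can be sketched end to end in base R. Everything specific here is an assumption for illustration: the logistic-map toy series, the one-step-ahead target, the split into the first 100 points as library and the rest as prediction set, and the simplex-style weights.

```r
# Toy deterministic series (logistic map); an assumed stand-in for real data.
n <- 200
x <- numeric(n)
x[1] <- 0.2
for (t in 1:(n - 1)) x[t + 1] <- 3.8 * x[t] * (1 - x[t])

E <- 2
k <- E + 1
t_idx <- E:(n - 1)                                     # times with full lags and a known target
X <- sapply(0:(E - 1), function(lag) x[t_idx - lag])   # rows (x_t, x_{t-1})
y <- x[t_idx + 1]                                      # target: the next value

lib  <- seq_len(100)                                   # library set: first 100 points
pred <- setdiff(seq_along(y), lib)                     # prediction set: the rest

# Simplex-style prediction of y_s using only library points.
predict_one <- function(s) {
  d  <- sqrt(rowSums(sweep(X[lib, , drop = FALSE], 2, X[s, ])^2))
  nn <- order(d)[seq_len(k)]
  w  <- exp(-d[nn] / max(d[nn][1], 1e-12))
  sum(w * y[lib][nn]) / sum(w)
}

y_hat <- vapply(pred, predict_one, numeric(1))
rho <- cor(y_hat, y[pred])    # prediction skill on the held-out points
rho
```

A value of rho close to 1 indicates that the reconstructed trajectories carry enough information to predict the target.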
If \text{Temp}_t causes \text{Crime}_t, then information about \text{Temp}_t is somehow embedded in \text{Crime}_t.
By observing \text{Crime}_t, we should be able to forecast \text{Temp}_t.
By observing more of \text{Crime}_t (more “training data”), our forecasts of \text{Temp}_t should be more accurate.
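The convergence idea can be sketched on a toy system where the causal direction is known. The coupled logistic maps below, in which x drives y, are an assumed example and not the Chicago data. The sketch also simplifies the usual procedure by holding out a fixed prediction set rather than resampling libraries.

```r
# Toy coupled logistic maps in which x drives y (assumed example system).
n <- 1000
x <- numeric(n)
y <- numeric(n)
x[1] <- 0.4
y[1] <- 0.2
for (t in 1:(n - 1)) {
  x[t + 1] <- x[t] * (3.8 - 3.8 * x[t])
  y[t + 1] <- y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])
}

E <- 2
k <- E + 1
t_idx  <- E:n
M_y    <- sapply(0:(E - 1), function(lag) y[t_idx - lag])  # embedding of the effect, y
target <- x[t_idx]                                          # cross-map target: the cause, x

pred <- (length(t_idx) - 199):length(t_idx)   # fixed prediction set: last 200 points

# Cross-map skill when only the first L embedded points form the library.
cross_map_rho <- function(L) {
  lib <- seq_len(L)
  x_hat <- vapply(pred, function(s) {
    d  <- sqrt(rowSums(sweep(M_y[lib, , drop = FALSE], 2, M_y[s, ])^2))
    nn <- order(d)[seq_len(k)]
    w  <- exp(-d[nn] / max(d[nn][1], 1e-12))
    sum(w * target[lib][nn]) / sum(w)
  }, numeric(1))
  cor(x_hat, target[pred])
}

lib_sizes <- c(10, 50, 200, 700)
rho <- vapply(lib_sizes, cross_map_rho, numeric(1))
data.frame(library_size = lib_sizes, rho = rho)
```

If x really does drive y, the skill in the rho column should tend to rise as the library grows; a flat or erratic curve would count against a causal link in that direction.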
Example: Chicago crime and temperature.
Thanks to Rishi Dhushiyandan for his hard work on easy_edm.
😊 Give it a try; feedback would be very welcome.
😍 If you’re talented in causal inference or programming (Stata/Mata, R, JavaScript, C++, Python), we’d love contributions!
Patrick Laub, Time Series and Forecasting Symposium, University of Sydney