# Correlations in time series are sensitive to timescale

... and it's something that perhaps we don't look at quite often enough!

What am I talking about? Well, let's imagine that we're interested in the relationship between two signals, $X_1(t)$ and $X_2(t)$. One of the most basic analyses we might do is ask "are they correlated?" But perhaps the correlation depends on the timescale that we focus on. Could a pair of signals be positively correlated at one timescale and negatively correlated at another?

Since I've shown you an example, hopefully you believe that a signal could be positively correlated at one timescale and negatively correlated at another! Here, $X_1$ and $X_2$ are positively correlated on a long timescale but negatively correlated on a short timescale.

Can we characterize this sort of relationship? Yes, and I'll outline one way of characterizing this sort of timescale-dependent correlation below.

How might such a relationship occur? Imagine $X_1$ and $X_2$ depend on multiple factors, some of which they have in common. As an example, perhaps $X_1$ and $X_2$ are both positively affected by variable $X_3$, and $X_1$ is positively affected by $X_4$ while $X_2$ is negatively affected by $X_4$.

Interestingly, in such a situation, $X_1$ and $X_2$ might not even appear to be correlated at all!
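To see why, here's a quick sketch under a simplifying assumption of mine: suppose the dependence is purely additive, $X_1 = X_3 + X_4$ and $X_2 = X_3 - X_4$, with $X_3$ and $X_4$ uncorrelated. The cross terms cancel, leaving

$Cov(X_1, X_2) = Var(X_3) - Var(X_4)$

which is zero whenever the two hidden variables have equal variance, even though $X_1$ and $X_2$ share both of them.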

Looking at the problem from the opposite perspective, perhaps we've observed some temporal dynamics of two variables $X_1$ and $X_2$. Can we infer the existence of $X_3$ and $X_4$ even if we didn't directly observe them? Perhaps, but only if $X_3$ and $X_4$ have different temporal characteristics!

Here, I'll go through an approach (detailed below) motivated by the Allan deviation, which I'll call the Allan covariance/correlation or Allan cross-covariance/cross-correlation. (After writing this, I found similar approaches under the term wavelet cross-correlation; this one would be something like a Haar wavelet cross-correlation.)

##### First, the Allan variance.

In short, the Allan variance is a metric for the variance of a signal $X(t)$ as a function of the time-scale at which one looks at that signal. That is, the Allan variance of a signal is not a single number, but rather a function of timescale, taking a particular value at a particular timescale. This timescale is also called the gate time or averaging time, denoted τ, and I will denote the Allan variance as $\sigma_A^2(\tau)$.

How does one calculate $\sigma_A^2(\tau)$?

• First, average the data over intervals of $\tau$.
• Second, take the difference between the means of consecutive intervals.
• Third, calculate the variance of the resulting signal, and divide by two in order to make it consistent with the definition of variance in the case of white noise.

Put more algebraically, we first calculate a filtered version of the signal that includes only the stuff happening at a particular timescale. We'll call this filtered signal $Y(t)$.

$Y(t) = \frac{1}{\tau}\left[ \int_{t}^{t+\tau} X(T) dT - \int_{t-\tau}^t X(T) dT \right]$

Then the Allan variance is simply $\sigma_A^2(\tau) = Var(Y)/2$, and the Allan deviation is $\sigma_A(\tau) = \sqrt{Var(Y)/2}$.
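The recipe above can be sketched in a few lines of R for a single gate time. This is a minimal sketch rather than a reference implementation: `allanVar` is a hypothetical helper name, and I'm assuming non-overlapping windows with `tau` given in samples.

```r
# Sketch: Allan variance of x at one gate time tau (in samples),
# using non-overlapping windows. allanVar is a hypothetical helper name.
allanVar = function(x, tau){
  nWin = floor(length(x)/tau)                              # number of complete windows
  stopifnot(nWin >= 3)                                     # need at least two differences
  winMeans = colMeans(matrix(x[1:(nWin*tau)], nrow=tau))   # step 1: average over tau
  d = diff(winMeans)                                       # step 2: difference consecutive means
  var(d)/2                                                 # step 3: variance, divided by two
}

set.seed(1)
wn = rnorm(1e5)     # unit-variance white noise
allanVar(wn, 1)     # should be close to 1
allanVar(wn, 10)    # should be close to 1/10
```

For white noise the Allan variance falls off as $1/\tau$, which is what the division by two buys us: at $\tau = 1$ it matches the ordinary variance.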

How do we relate this to correlations on multiple timescales? Well, we will apply the same filter to both signals - first averaging the signal over a time window and then taking the difference between contiguous averages - in order to isolate the parts of that signal at that timescale! Then, we'll calculate the covariance, correlation, or cross-correlation (depending on your interest). Simple!
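In symbols (this is my notation, not a standard one): writing $Y_1$ and $Y_2$ for the filtered versions of $X_1$ and $X_2$ at gate time $\tau$, the two quantities can be sketched as

$C_A(\tau) = Cov(Y_1, Y_2)$

$\rho_A(\tau) = \frac{Cov(Y_1, Y_2)}{\sqrt{Var(Y_1) Var(Y_2)}}$

One could halve the covariance to mirror the Allan variance convention; the factor cancels in the correlation either way.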

Here's an example in R. First, we'll dream up two signals, $X_1$ and $X_2$, that are positively correlated at low frequency ($\omega_1$) and negatively correlated at high frequency ($\omega_2$).

set.seed(2)
t=1:1000
x3 = sin(0.01*2*pi*t) # one period is 100 datapoints
x4 = sin(0.05*2*pi*t) # one period is 20 datapoints
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

par(mfrow=c(2,1),mar=c(3,4,1,1))
plot(t,x1, type='l'); grid();
plot(t,x2, type='l'); grid();


This produces the signals as shown below.

A simple correlation doesn't detect any sort of relationship between the two, as is clearly evident by plotting the two against each other. While the cross-correlation suggests they may carry information about each other, it's pretty confusing - I am not sure how I would interpret a cross-correlation like this.
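For anyone following along, here's a minimal sketch of that check. It rebuilds the two sinusoid-based signals so it runs on its own; the plotting choices (`lag.max=100`, point size) are my assumptions, not necessarily what produced the original figures.

```r
# Rebuild the two example signals so this snippet runs on its own
set.seed(2)
t = 1:1000
x3 = sin(0.01*2*pi*t)   # slow component, shared with the same sign
x4 = sin(0.05*2*pi*t)   # fast component, shared with opposite signs
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

r = cor(x1, x2)         # ordinary (zero-lag) Pearson correlation
print(r)

par(mfrow=c(1,2), mar=c(4,4,2,1))
plot(x1, x2, pch=16, cex=0.5, main='x1 vs x2'); grid()
ccf(x1, x2, lag.max=100, main='cross-correlation')
```

The correlation comes out close to zero because the positive contribution of the slow component and the negative contribution of the fast one have the same variance and cancel.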

Next, let's calculate the Allan covariance and Allan correlation below:

allanCov = function(x1, x2, fs=1, corFunc=cov){
  # evaluate at log-spaced timescales (gate times, in samples)
  taus = unique(round(1.05^(0:(log(length(x1)/5)/log(1.05)))))
  acor = vector('double', length=length(taus))  # one value per timescale
  for (j in seq_along(taus)){
    # trim so the data divide evenly into windows of length taus[j]
    maxMultipleOfGateTime = taus[j] * floor(length(x1)/taus[j])
    # average within each window, then difference consecutive window means
    x1_dec = diff(colMeans(matrix(x1[1:maxMultipleOfGateTime], nrow=taus[j])))
    x2_dec = diff(colMeans(matrix(x2[1:maxMultipleOfGateTime], nrow=taus[j])))
    acor[j] = corFunc(x1_dec, x2_dec)  # cov or cor of the filtered signals
  }
  data.frame(time=taus/fs, acor=acor)
}

acov = allanCov(x1, x2, corFunc=cov)
acor = allanCov(x1, x2, corFunc=cor)
par(mfrow=c(2,1), mar=c(4,4,1,1), mgp=c(2,1,0))
plot(acov$time, acov$acor, log='x', type='o', xlab='timescale', ylab='covariance', main='Allan covariance'); abline(h=0,lty=2)
plot(acor$time, acor$acor, log='x', type='o', xlab='timescale', ylab='correlation',main='Allan correlation'); abline(h=0,lty=2)


This produces the following output:

One can clearly see in both plots that at short timescales, the signals are negatively correlated, and at longer timescales they become positively correlated. I slightly prefer interpreting the Allan covariance over the correlation, since it's not normalized and hence tells us something about both the correlation and the magnitude of the spectral power at a particular frequency, but one could easily argue that the correlation is the better metric too. Up to you.
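To put numbers on this rather than reading them off the plot, here's a minimal sketch that evaluates the Allan correlation at two hand-picked gate times. `allanCorAt` is a hypothetical helper (not part of the code above), and the choices of 10 and 50 datapoints are my assumptions, taken as half a period of the fast and slow sinusoids respectively.

```r
# allanCorAt is a hypothetical single-timescale helper: window means,
# differences of consecutive means, then an ordinary correlation.
allanCorAt = function(x1, x2, tau){
  n = tau * floor(length(x1)/tau)                  # trim to whole windows
  y1 = diff(colMeans(matrix(x1[1:n], nrow=tau)))
  y2 = diff(colMeans(matrix(x2[1:n], nrow=tau)))
  cor(y1, y2)
}

# Rebuild the sinusoid example so this snippet runs on its own
set.seed(2)
t = 1:1000
x3 = sin(0.01*2*pi*t)   # one period is 100 datapoints
x4 = sin(0.05*2*pi*t)   # one period is 20 datapoints
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

shortScale = allanCorAt(x1, x2, 10)   # half a period of the fast sinusoid
longScale  = allanCorAt(x1, x2, 50)   # half a period of the slow sinusoid
print(c(short=shortScale, long=longScale))
```

The first value should come out negative and the second positive, matching the sign change in the plots.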

What about generalizing to the cross-covariance, where we allow the signals to be lagged relative to each other? That's easily done as well. Since sinusoids produce infinitely long cross-correlations, let's generate a new example with a more biologically plausible signal. Consider two stochastic processes $X_3$ and $X_4$, both generated by low-pass filtering white noise. $X_3$ will be made by filtering white noise with a 101-point moving-average filter (and hence will be 'slow'), and $X_4$ with an 11-point moving-average filter (so we might consider it as operating on a 'fast' timescale). We'll imagine that our two processes of interest, $X_1$ and $X_2$, are functions of $X_3$ and $X_4$ in the same way as I sketched out above.
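A note on the `rep(1/sqrt(n), n)` filter weights used in the R code below: with every weight equal to $1/\sqrt{n}$, an $n$-point moving average of unit-variance white noise $W$ also has unit variance, since the $n$ terms are independent:

$Var\left( \sum_{i=1}^{n} \frac{1}{\sqrt{n}} W(t-i) \right) = \sum_{i=1}^{n} \frac{1}{n} Var(W) = 1$

So the two inputs have comparable overall amplitude and differ mainly in how long their fluctuations persist: the autocorrelation of an $n$-point moving average vanishes beyond lag $n-1$.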

We'll also add a temporal offset of 50 datapoints between $X_1$'s and $X_2$'s responses to $X_3$.

# generate filtered white noise as inputs to x1 and x2
set.seed(2)
t=1:10000
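# keep the full slow series so a shifted copy can be taken without running off the end
x3_full = na.omit(filter(rnorm(11000,0,1), rep(1/sqrt(101),101))) # slow timescale input
x4 = na.omit(filter(rnorm(11000,0,1), rep(1/sqrt(11),11)))[t] # fast timescale input
x3 = x3_full[t]
x1 = x3_full[t+50] + x4 + rnorm(length(t),0,0.2) # x1 sees x3 shifted by 50 datapoints
x2 = x3 - x4 + rnorm(length(t),0,0.2)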

par(mfrow=c(4,1),mar=c(3,4,1,1))
plot(t,x3, type='l', main='slow input (x3)'); grid();
plot(t,x4, type='l', main='fast input (x4)'); grid();
plot(t,x1, type='l', main='x1'); grid();
plot(t,x2, type='l', main='x2'); grid();


Again, there's only a weak correlation between $X_1$ and $X_2$ at the timescale of single datapoints.

And as before, we can clearly see that at short timescales they're negatively correlated, and at longer timescales they're very slightly positively correlated.

However, this positive correlation at longer timescales is partly masked by the fact that at long timescales, they're offset temporally by 50 datapoints! We should look into a cross-correlation-like metric, in which we calculate the correlation between $X_1$ and $X_2$ over a range of time offsets (lags) between the two signals.
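To pin down the lag convention (my notation, not a standard one): with $Y_1$ and $Y_2$ the filtered signals at gate time $\tau$, the quantity at lag $\ell$ can be sketched as

$C_A(\tau, \ell) = Cov\left( Y_1(t+\ell), Y_2(t) \right)$

This follows R's `ccf(x, y)`, whose lag-$k$ value estimates the relationship between `x[t+k]` and `y[t]`. At $\ell = 0$ it reduces to the Allan covariance.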


allanCrossCorr = function(x1, x2, fs=1, type='covariance'){
  taus = unique(round(1.05^(0:(log(length(x1)/5)/log(1.05)))))  # log-spaced timescales
  acc = NULL
  allanKernel = function(k) c(rep(1/k, k), -rep(1/k, k)) # this is the Allan filter kernel
  for (j in seq_along(taus)){
    # filter each signal at this timescale, dropping the NA edges left by the filter
    x1_filt = na.omit(filter(x1, allanKernel(taus[j])))
    x2_filt = na.omit(filter(x2, allanKernel(taus[j])))
    # cross-covariance (or cross-correlation) at lags -100..100
    cc = ccf(x1_filt, x2_filt, type=type, lag.max=100, plot=FALSE)$acf[,1,1]
    acc = rbind(acc, cc)  # one row per timescale
  }
  data.frame(time=taus/fs, acc=acc)
}

accov = allanCrossCorr(x1, x2, type='covariance')
accor = allanCrossCorr(x1, x2, type='correlation')

colors = colorRampPalette(c('violet','blue', 'black', 'red', 'pink'))(100)
par(mfrow=c(1, 2))
image(x=accov[,1], y=-floor((ncol(accov)-1)/2):floor((ncol(accov)-1)/2), z=as.matrix(accov[,-1]),
col=colors, log='x', zlim=c(-1.3,1.3), main='Allan cross-covariance',
xlab='timescale',ylab='lag')
image(x=accor[,1], y=-floor((ncol(accor)-1)/2):floor((ncol(accor)-1)/2), z=as.matrix(accor[,-1]),
col=colors, log='x', zlim=c(-1, 1), main='Allan cross-correlation',
xlab='timescale',ylab='lag')
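
To read the offset off numerically rather than from the image, here's a minimal sketch that filters both signals at a single long timescale and finds the lag where the cross-correlation peaks. It rebuilds the stochastic example so it runs on its own. The gate time of 100 datapoints is my hand-picked assumption, and the variable names (`x3_full`, `kern`, `peakLag`) are illustrative.

```r
# Rebuild the stochastic example so this snippet runs on its own
set.seed(2)
t = 1:10000
x3_full = na.omit(stats::filter(rnorm(11000), rep(1/sqrt(101), 101)))  # slow input
x4 = na.omit(stats::filter(rnorm(11000), rep(1/sqrt(11), 11)))[t]      # fast input
x1 = x3_full[t + 50] + x4 + rnorm(length(t), 0, 0.2)  # sees x3 shifted by 50 datapoints
x2 = x3_full[t] - x4 + rnorm(length(t), 0, 0.2)

tau = 100                                     # hand-picked long timescale (assumption)
kern = c(rep(1/tau, tau), -rep(1/tau, tau))   # Allan filter kernel at this timescale
y1 = na.omit(stats::filter(x1, kern))
y2 = na.omit(stats::filter(x2, kern))

cc = ccf(y1, y2, type='correlation', lag.max=100, plot=FALSE)
lags = cc$lag[,1,1]
peakLag = lags[which.max(cc$acf[,1,1])]       # lag with the largest positive correlation
print(peakLag)
```

With R's convention (lag $k$ pairs `x[t+k]` with `y[t]`), a negative peak lag means `x1` leads `x2`. Here the peak should land in the neighbourhood of -50, though the exact value depends on the noise and on the residual influence of the fast input.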