In the previous post, I discussed an approach to obtain autocovariances¹ of time series data through discrete Fourier transforms, which I implemented in an R function, acf_fft_R():
```r
# ACF using FFT in R
acf_fft_R <- function(x) {
  n <- length(x)
  a_j <- fft(x)
  I_x <- Mod(a_j)^2 / n
  return(
    Re(fft(I_x, inverse = TRUE) / n)
  )
}
```
An RcppArmadillo version
Recently, I wrote an Armadillo version for an Rcpp project. Here’s its definition and how to source it using Rcpp:
```r
Rcpp::cppFunction(
  "vec acf_arma(const arma::vec& x) {
    size_t n = x.n_elem;
    cx_vec a_j = fft(x);
    cx_vec I_x = conv_to<cx_vec>::from(
      square(real(a_j)) + square(imag(a_j))
    );
    cx_vec gamma_hat = ifft(I_x) / n;
    return real(gamma_hat);
  }",
  depends = "RcppArmadillo",
  includes = "using namespace arma;"
)
```
Let’s verify that acf_arma() and acf_fft_R() produce equivalent results.² We first simulate a time series y from an AR(1) DGP with $\phi = .75$:
```r
# simulate time series
set.seed(1234)
n <- 300
y <- arima.sim(
  model = list(ar = .75),
  n = n
)
```
And here’s a quick ggplot2 graphic.
```r
library(tibble)
library(ggplot2)
library(cowplot)
library(dplyr)

# plot the time series y
tibble(time = 1:n, value = y) %>%
  ggplot(aes(x = time, y = value)) +
  geom_line() +
  labs(title = "Simulated Time Series `y`", x = "Time", y = "Value") +
  theme_cowplot()
```
We now check that both functions compute equivalent autocovariances by comparing their first 21 outputs:³
```r
# compute autocovariances
armadillo_acf <- acf_arma(y)[1:21]
R_acf <- acf_fft_R(y)[1:21]
# compare
round(armadillo_acf, 10) == round(R_acf, 10)
```
```
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[16] TRUE TRUE TRUE TRUE TRUE TRUE
```
For comparison, we also plot the autocovariances for lags 0 to 20 computed by acf_arma(), acf_fft_R(), and the built-in stats::acf() function, which implements the well-known sample moment estimator.
```r
library(tidyr)

# compute and compare ACF values
acf_values <- tibble(
  lag = 0:20,
  acf_arma = armadillo_acf,
  acf_fft_R = R_acf,
  acf = acf(y, plot = FALSE, type = "covariance")$acf[1:21]
)

# create a barplot of the first 21 autocovariances
acf_values %>%
  pivot_longer(
    cols = acf_arma:acf,
    names_to = "Method",
    values_to = "value"
  ) %>%
  ggplot(aes(x = factor(lag), y = value, fill = Method)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Comparison of ACF functions",
    x = "Lag",
    y = "Autocovariance"
  ) +
  theme_cowplot() +
  theme(legend.position = "top")
```
Figure 2 shows good correspondence between acf_arma()/acf_fft_R() and the base R implementation.
Long-run Variance Estimation
“What do I need this for?” you may ask. When dealing with time series data, the long-run variance (LRV) is a frequently needed measure of the total variance of the underlying process, accounting for all autocovariances. In financial econometrics, the LRV is used to estimate the volatility of asset returns over long horizons, which is essential for risk management and portfolio optimization. In hypothesis testing, it is used to adjust standard errors in the presence of autocorrelation, ensuring valid inference. If the LRV must be estimated many times (e.g., in a simulation-based method) and/or for long time series, an efficient implementation of the estimator is crucial.
For a stationary process $Y_t$, the LRV is
$$ \begin{align} \sigma^2_{\infty} = \gamma(0) + 2\sum_{k=1}^{\infty} \gamma(k),\label{eq:lrv} \end{align} $$
where $\gamma(k)$ is the $k$-th order autocovariance of $Y_t$.
As touched upon in a previous post, there’s an interesting link between the LRV and the power spectral density (PSD) of a time series process: the PSD $S(f)$ of a stationary process is the Fourier transform of its ACF,⁴
$$ S(f) = \sum_{h=-\infty}^\infty \gamma(h) e^{-i 2\pi h f}. $$
$S(f)$ captures how the variance of the process is distributed across different frequencies $f$. Importantly, there is a direct relationship between $S(f)$ and $\sigma^2_\infty$:
$$ \sigma^2_\infty = S(0), $$
where $S(0)$ is the spectral density evaluated at frequency $f = 0$. At this frequency, the PSD aggregates the contributions of all frequencies to the total variance, reflecting the sum of all autocovariances in the time domain.
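To see why, evaluate the Fourier sum at $f = 0$: since $e^{-i 2\pi h \cdot 0} = 1$ and $\gamma(-h) = \gamma(h)$,
$$ S(0) = \sum_{h=-\infty}^{\infty} \gamma(h) = \gamma(0) + 2\sum_{k=1}^{\infty} \gamma(k) = \sigma^2_{\infty}. $$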
For an AR(1) process, it is straightforward to compute the LRV from its PSD,
$$ S(f) = \frac{\sigma^2}{1 - 2\phi \cos(2\pi f) + \phi^2}. $$
We have
$$ S(0) = \frac{\sigma^2}{(1 - \phi)^2}, $$
which you can check to yield $S(0) = 16$ for $\sigma = 1$ and $\phi = .75$, as in my R example.
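As a quick time-domain cross-check (illustrative, using the closed-form AR(1) autocovariances $\gamma(k) = \phi^k/(1-\phi^2)$ for unit innovation variance), summing the autocovariances recovers the same value:

```r
# sum the closed-form AR(1) autocovariances up to a large lag
phi <- .75
gamma0 <- 1 / (1 - phi^2)             # variance of the AR(1) process
gamma0 * (1 + 2 * sum(phi^(1:1000))) # (approximately) equals 1 / (1 - phi)^2 = 16
```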
A parametric estimator of the LRV in an AR(1) model is
$$ S^2_\textup{AR} := \frac{\widehat\sigma^2}{(1 - \widehat\phi)^2}, $$
with $\widehat\sigma^2$ and $\widehat\phi$ estimators of the error term variance $\sigma^2$ and the coefficient $\phi$, respectively.⁵
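For illustration, here is a minimal R sketch of this parametric estimator applied to our simulated series y, estimating $\phi$ by OLS without an intercept and $\sigma^2$ by the mean squared residual (both choices are assumptions for this sketch, not prescribed above):

```r
# parametric AR(1) LRV estimate (sketch)
phi_hat <- sum(y[-1] * y[-n]) / sum(y[-n]^2)  # OLS estimate of phi in y_t = phi * y_{t-1} + u_t
u_hat <- y[-1] - phi_hat * y[-n]              # AR(1) residuals
sigma2_hat <- mean(u_hat^2)                   # error variance estimate
sigma2_hat / (1 - phi_hat)^2                  # parametric LRV estimate
```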
Not all at once!
If we do not make parametric assumptions (e.g., that the time series is generated by an AR(1) process), $\sigma^2_{\infty}$ must be estimated from estimates of the $\gamma(k)$. It turns out that computing as many $\widehat\gamma(k)$ as the sample size $n$ allows (that is, $n-1$ of them) and plugging them into equation \eqref{eq:lrv} yields an inconsistent estimator. A remedy is to assign weights to the $\widehat\gamma(k)$ that decrease as the lag increases. This is crucial because it reduces the contribution of the higher-lag autocovariance estimates, which are typically less reliable and more variable (cf. Hayashi 2000).
A kernel-based LRV estimator can be defined as
$$ \widehat{\sigma}^2_{\infty} = \widehat{\gamma}(0) + 2\sum_{k=1}^{n-1} K\left(\frac{k}{b_n}\right)\cdot\widehat{\gamma}(k), $$
where $n$ is the sample size and $b_n$ is a bandwidth parameter.⁶ The bandwidth determines the rate at which the kernel weights decay, with larger values of $b_n$ including more lags in the estimation.
A fast C++ function for the estimator is straightforward to implement using our FFT-based ACF estimator acf_arma(). We first need a kernel function. A popular choice is the Bartlett kernel,
$$ K_\textup{Bartlett}(x) = \begin{cases} 1 - |x| & \textup{if } |x| \leq 1 \\ 0 & \text{otherwise} \end{cases}, $$
which ensures that $\widehat{\sigma}^2_{\infty}>0$ and $\widehat{\sigma}^2_{\infty}\to\sigma^2_{\infty}$ under some regularity conditions.
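Before turning to C++, here is a direct R transcription of the estimator with the Bartlett kernel, a minimal sketch that reuses acf_fft_R() and the rule-of-thumb bandwidth $b_n = \lfloor 12\,(n/100)^{1/4} \rfloor$ also used in the C++ implementation below (for $n = 300$, $b_n = 15$):

```r
# kernel LRV estimator in plain R (sketch, Bartlett kernel)
lrv_R <- function(x) {
  n <- length(x)
  bw <- trunc(12 * (n / 100)^0.25)  # rule-of-thumb bandwidth
  gamma_hat <- acf_fft_R(x)         # FFT-based autocovariances, lags 0 to n - 1
  k <- 1:(n - 1)
  w <- pmax(1 - k / bw, 0)          # Bartlett weights
  gamma_hat[1] + 2 * sum(w * gamma_hat[k + 1])
}
```

It should agree with the compiled lrv() below up to floating-point error.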
I’ve implemented $K_\textup{Bartlett}(x)$, along with two other common kernels, in C++. The following code chunk provides a unified kernel function that is straightforward to extend (kernfun()) and the LRV estimator (lrv()):
```cpp
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace arma;

// FFT-based ACF estimator (as defined above); repeated here so that
// lrv() below can call it in the same translation unit
// [[Rcpp::export]]
vec acf_arma(const arma::vec& x) {
  size_t n = x.n_elem;
  cx_vec a_j = fft(x);
  cx_vec I_x = conv_to<cx_vec>::from(
    square(real(a_j)) + square(imag(a_j))
  );
  cx_vec gamma_hat = ifft(I_x) / n;
  return real(gamma_hat);
}

// Unified kernel function
// [[Rcpp::export]]
double kernfun(const double& x, const std::string& type) {
  double mx = std::abs(x);
  if (type == "Bartlett") {
    if (mx <= 1) {
      return 1 - mx;
    } else {
      return 0;
    }
  } else if (type == "Truncated") {
    if (mx <= 1) {
      return 1;
    } else {
      return 0;
    }
  } else if (type == "Parzen") {
    if (0 <= mx && mx <= .5) {
      return 1 - 6 * pow(mx, 2) + 6 * pow(mx, 3);
    } else if (.5 < mx && mx <= 1) {
      return 2 * pow(1 - mx, 3);
    } else {
      return 0;
    }
  } else {
    return 0;
  }
}

// Long-run variance estimator
// [[Rcpp::export]]
double lrv(const arma::vec& x, const std::string& type) {
  size_t T = x.n_elem;
  // rule-of-thumb bandwidth
  double bw = trunc(12 * pow(T / 100.0, .25));
  vec summands(T, fill::zeros),
      acovs = acf_arma(x);
  summands(0) = acovs(0);
  for (size_t j = 1; j < T; j++) {
    summands(j) = 2 * kernfun(j / bw, type) * acovs(j);
  }
  return accu(summands);
}
```
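To make kernfun() and lrv() available in the R session, save the chunk above to a file, say lrv.cpp (the file name here is just an assumption), and compile it with Rcpp::sourceCpp():

```r
# compile the C++ file and export its functions to R
Rcpp::sourceCpp("lrv.cpp")
```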
Let’s visualize the selected kernel functions to get an impression of their shapes:
```r
library(tidyr)
library(purrr)

tibble(
  x = seq(-2, 2, 0.01)
) %>%
  mutate(
    "Bartlett" = map_dbl(x, \(x) kernfun(x, "Bartlett")),
    "Parzen" = map_dbl(x, \(x) kernfun(x, "Parzen")),
    "uniform" = map_dbl(x, \(x) kernfun(x, "Truncated"))
  ) %>%
  pivot_longer(2:4, names_to = "Kernel", values_to = "value") %>%
  ggplot(aes(x = x, y = value, color = Kernel)) +
  geom_line() +
  lims(x = c(-2, 2)) +
  cowplot::theme_cowplot() +
  theme(
    legend.position = "inside",
    legend.position.inside = c(.8, .8)
  )
```
To validate our LRV estimator, we compare it with the theoretical LRV for an AR(1) process with parameter 0.75. The theoretical LRV can be computed using the MA representation and the variance of the process:
- Use stats::ARMAtoMA() to obtain (an approximation of) the MA($\infty$) representation of the AR(1) process.⁷
- ARMAacf() yields autocorrelations, which we scale by the variance to obtain autocovariances.
- Compute the LRV as in the equation above.
```r
# MA coefficients
psi_weights <- ARMAtoMA(
  ar = c(.75),
  lag.max = 100
)
# variance of the process
variance <- 1 + sum(psi_weights^2)
# first 100 autocorrelations
acf_vals <- ARMAacf(ar = .75, lag.max = 100, pacf = FALSE)
# convert to autocovariances
autocovariances <- unname(acf_vals * variance)
# the approximate true LRV of the process
c("LRV" = autocovariances[1] + 2 * sum(autocovariances[-1]))
```
```
LRV 
 16 
```
Next, we estimate the LRV using our C++ function and compare it with the Newey-West (Newey and West 1987) estimator as implemented in the sandwich package’s lrvar() function:⁸
```r
lrv(y, type = "Bartlett")
```
```
[1] 16.16568
```
```r
sandwich::lrvar(y, prewhite = FALSE, adjust = FALSE, type = "Newey-West") * n
```
```
[1] 15.91419
```
For this particular sample, both estimates are pretty close to the true LRV of 16.
However, our FFT-based C++ implementation is much more efficient. Let’s benchmark the performance of both approaches:
```r
(
  res <- bench::mark(
    "lrv" = lrv(y, type = "Bartlett"),
    "lrvar" = sandwich::lrvar(y, kernel = "Bartlett", prewhite = FALSE),
    check = FALSE
  )
)
```
```
# A tibble: 2 × 6
  expression      min   median `itr/sec` mem_alloc `gc/sec`
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
1 lrv            12µs   12.8µs    76722.    9.02KB      0  
2 lrvar         889µs      1ms      964.    1.07MB     15.2
```
The benchmark results show that lrv() requires less memory allocation and is approximately 79 times faster than lrvar().
That’s it for now 😊.
References
Hayashi, Fumio. 2000. Econometrics. New Jersey: Princeton University Press.
Newey, Whitney K., and Kenneth D. West. 1987. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55 (3): 703–708. https://doi.org/10.2307/1913610.
Ng, Serena, and Pierre Perron. 2001. “Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power.” Econometrica 69 (6): 1519–54.
1. Note that these are empirical autocovariances, meaning they are estimates based on the data. Since this post focuses on computational aspects, I’m sloppy and just call them autocovariances now and then. ↩︎
2. The comparison uses rounding to the tenth decimal place since there are tiny differences between R and Rcpp results due to floating-point arithmetic precision. ↩︎
3. Results are rounded to the tenth decimal place to account for minor floating-point arithmetic differences between R and C++. ↩︎
4. In this post, we formulated this equivalence for sample quantities (periodogram and sample ACF). ↩︎
5. This approach to estimating the LRV in an AR model is well known for its use in unit root tests, see Ng and Perron (2001). ↩︎
6. This is a tuning parameter, which is often selected using a rule of thumb, cf. the C++ implementation. ↩︎
7. This is a general approach for ARMA processes. The closed-form solution for the variance is $\sigma^2/(1-.75^2)$, with $\sigma^2 = 1$ the innovation variance ($\sigma^2 = 1$ is the default in arima.sim()). ↩︎
8. I scale by the sample size n since lrvar() returns the variance of the mean estimator. ↩︎