Martin C. Arnold's Homepage

Causal Forests for Treatment Effect Estimation

Causal Forests
Estimating treatment effects often requires moving beyond linear models, which make rather strict assumptions about the relationships in the data and treat them as uniform across all observations. Causal forests (Athey, Tibshirani, and Wager 2019), a non-parametric ensemble method, extend the principles of random forests to estimate treatment effects that vary across subgroups. This approach captures complex, localized patterns in the data. Unlike traditional decision trees, which focus on improving overall predictive accuracy, causal trees (Athey and Imbens 2016) within a causal forest are designed to split the data based on variations in treatment effects rather than variability in the observed outcome variable. ...

Glitch proxy for API requests in Observable

Give me a reason: CORS headers
CORS (Cross-Origin Resource Sharing) is a fundamental web security feature that acts like a gatekeeper, controlling how web pages can request resources from different domains. For example, when working in an Observable notebook and trying to fetch data directly from an external API (as I did), you are making a cross-origin request because your notebook is running on observablehq.com while trying to access data from another domain. Your browser will reject this “cross-origin” action unless the server explicitly allows it! ...
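To make the gatekeeper idea concrete, here is a minimal Python sketch of the check a browser performs on a simple, non-credentialed cross-origin response. The function name and the example origins are illustrative assumptions, and real browsers apply further rules (preflight requests, credentials, allowed methods and headers) that are left out here.

```python
def browser_allows_read(page_origin, response_headers):
    """Simplified CORS check for a non-credentialed cross-origin request.

    The page may read the response only if the server sent an
    Access-Control-Allow-Origin header that is "*" or equals the page's origin.
    """
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    return allowed == "*" or allowed == page_origin


# Hypothetical origins, for illustration only.
notebook = "https://observablehq.com"
print(browser_allows_read(notebook, {"Access-Control-Allow-Origin": "*"}))
print(browser_allows_read(notebook, {"Content-Type": "application/json"}))
```

A proxy sidesteps the problem because the request to the external API is then made server-side, and the proxy itself can send a permissive header back to the notebook.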

A utility function for LaTeX in R plots

I found myself repeatedly writing similar code to generate publication-ready plots that include LaTeX annotations for my papers and teaching materials. The tikzDevice R package provides the foundation for combining R plots with LaTeX. I use the magick library to convert the compiled PDF file to the desired output format. To streamline this workflow, I wrote a utility function that handles the entire pipeline.

    create_latex_plot <- function(
        plot_expr,            # plot object / function call
        out_name,             # output file name (without extension)
        out_format = "png",   # output format
        out_dir = ".",        # output directory
        width = 9,            # plot width
        height = 6,           # plot height
        cleanup = TRUE        # remove intermediate files
    ) {
      # libraries
      if (!requireNamespace("tikzDevice", quietly = TRUE))
        stop("Please install the 'tikzDevice' package.")
      if (!requireNamespace("magick", quietly = TRUE))
        stop("Please install the 'magick' package.")
      library(tikzDevice)
      library(magick)

      # tikzDevice options
      options(tikzLatexPackages = c(
        "\\usepackage{tikz}",
        "\\usepackage[active,tightpage]{preview}",
        "\\PreviewEnvironment{pgfpicture}",
        "\\setlength\\PreviewBorder{0pt}",
        "\\usepackage{amsmath, amssymb, amsthm, amstext}",
        "\\usepackage{bm}"
      ))

      # file paths
      tex_file <- file.path(out_dir, paste0(out_name, ".tex"))
      pdf_file <- file.path(out_dir, paste0(out_name, ".pdf"))
      out_file <- file.path(out_dir, paste0(out_name, ".", out_format))

      # tikz file
      tikz(tex_file, standAlone = TRUE, width = width, height = height)
      eval(plot_expr)
      dev.off()

      # compile to PDF inside out_dir
      system(
        paste(
          "cd", shQuote(out_dir),
          "; lualatex -output-directory .",
          shQuote(basename(tex_file))
        )
      )

      # convert the PDF to the requested output format
      image_write(
        image_convert(image_read_pdf(pdf_file), format = out_format),
        path = out_file,
        format = out_format
      )
      message("Output file created at: ", out_file)

      if (cleanup) {
        # remove only this plot's intermediate files, never the output itself
        tmp <- file.path(out_dir, paste0(out_name, c(".aux", ".log", ".tex", ".pdf")))
        file.remove(setdiff(tmp[file.exists(tmp)], out_file))
        message("Removed intermediate files.")
      }
    }

The magick package handles the PDF-to-X conversion internally using ImageMagick, which provides a cross-platform solution that doesn’t depend on Ghostscript being installed. ...

Karhunen–Loève Approximation of Brownian Motion

Playing around with time series data in Observable, I decided to create a small interactive application to visualize the Karhunen-Loève (KL) approximation (Karhunen 1947) of Brownian motion. The widget at the end of this post lets you explore how KL works, illustrating how adding more terms in the series leads to increasingly accurate representations of Brownian paths. ...
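As a rough companion to the widget, the following Python sketch evaluates the truncated KL series on [0, 1], assuming the standard expansion W(t) ≈ √2 Σ_k Z_k sin((k − 1/2)πt) / ((k − 1/2)π) with independent standard normal Z_k. The function names are made up for this example and are not taken from the post.

```python
import math
import random


def kl_brownian_path(n_terms, times, seed=None):
    """Evaluate one truncated KL approximation of Brownian motion at the given times."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]  # i.i.d. N(0, 1) coefficients
    path = []
    for t in times:
        value = 0.0
        for k, z_k in enumerate(z, start=1):
            freq = (k - 0.5) * math.pi
            value += z_k * math.sqrt(2.0) * math.sin(freq * t) / freq
        path.append(value)
    return path


def kl_variance(n_terms, t):
    """Variance of the truncated series at time t; it approaches t as n_terms grows."""
    return sum(
        2.0 * math.sin((k - 0.5) * math.pi * t) ** 2 / ((k - 0.5) * math.pi) ** 2
        for k in range(1, n_terms + 1)
    )


grid = [i / 100 for i in range(101)]
path = kl_brownian_path(50, grid, seed=42)
```

Since Brownian motion has Var W(t) = t, kl_variance() gives a quick way to see how much of the variance a given number of terms captures. At t = 1 the first term alone contributes 8/π², roughly 81%.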

Mallows Model Averaging: Python Example

Quick Facts
Mallows Model Averaging (MMA) (Hansen 2007) combines predictions from multiple models to minimize the mean squared prediction error (MSE), balancing predictive accuracy against model complexity. Hansen demonstrates that MMA is asymptotically optimal, achieving the lowest possible squared error within a class of discrete model averaging estimators. ...
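The trade-off between fit and complexity can be sketched in a few lines of Python. The criterion below is the usual Mallows form, the residual sum of squares of the weighted fit plus 2σ² times the weighted number of parameters. The function names, the toy data, and the grid search over two models are assumptions made for illustration; in practice σ² is estimated (typically from the largest model) and the weights are found with a quadratic programming solver.

```python
def mallows_criterion(y, fitted_by_model, n_params, weights, sigma2):
    """RSS of the weighted-average fit plus 2 * sigma2 * (weighted parameter count)."""
    avg_fit = [
        sum(w * fit[i] for w, fit in zip(weights, fitted_by_model))
        for i in range(len(y))
    ]
    rss = sum((y_i - f_i) ** 2 for y_i, f_i in zip(y, avg_fit))
    penalty = 2.0 * sigma2 * sum(w * k for w, k in zip(weights, n_params))
    return rss + penalty


def best_two_model_weight(y, fit_small, fit_large, k_small, k_large, sigma2, grid_size=100):
    """Grid search for the weight on the small model (the large model gets 1 - w)."""
    candidates = [j / grid_size for j in range(grid_size + 1)]
    return min(
        candidates,
        key=lambda w: mallows_criterion(
            y, [fit_small, fit_large], [k_small, k_large], [w, 1.0 - w], sigma2
        ),
    )


# Toy data: an intercept-only fit versus a fit that reproduces y exactly.
y = [1.0, 2.0, 3.0, 4.0]
fit_small = [2.5, 2.5, 2.5, 2.5]
fit_large = [1.0, 2.0, 3.0, 4.0]
w_small = best_two_model_weight(y, fit_small, fit_large, 1, 2, sigma2=1.0)
```

With no noise penalty (sigma2 = 0) all weight goes to the larger model, while a large sigma2 pushes the weight toward the smaller one.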

Distance to degenerate gamma distribution

When working with distance measures between distributions, singularities can pose a significant challenge. This happens when one of the distributions is degenerate, concentrating all its probability mass on a single point. In a previous post, I discussed the comparison of a Gamma distribution (representing a complex model) with a singular Gamma distribution (representing a base model) in the context of constructing a penalized complexity (PC) prior for the overdispersion parameter $\phi$ in a Bayesian negative binomial regression. In that post, I stated that the distance measure of interest for constructing the PC prior is ...

Neg. Binomial Regression and PC Priors in R-INLA

Negative Binomial as Poisson Mixture
I have recently found it useful to represent the negative binomial (NB) distribution as a continuous mixture distribution. This makes it straightforward to understand how the NB distribution relates to the Poisson distribution (how the Poisson assumptions can be relaxed to allow for overdispersion in count data regression). The Bayesian Poisson-Gamma mixture model is also a nice way to illustrate the concept of penalized complexity priors. ...
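The mixture representation can be checked numerically. The Python sketch below integrates a Poisson pmf against a Gamma density and compares the result with the NB pmf. It assumes the mean/size parametrization (rate λ ~ Gamma(shape = size, rate = size / mu), so that the variance is mu + mu²/size); this need not match the parametrization in terms of the overdispersion parameter used in the post, and all function names are illustrative.

```python
import math


def poisson_pmf(y, lam):
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def gamma_pdf(lam, shape, rate):
    return math.exp(
        shape * math.log(rate) + (shape - 1) * math.log(lam)
        - rate * lam - math.lgamma(shape)
    )


def nb_pmf(y, mu, size):
    """NB pmf with mean mu and variance mu + mu^2 / size."""
    return math.exp(
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )


def mixture_pmf(y, mu, size, upper=100.0, steps=100_000):
    """Integrate Poisson(y | lam) * Gamma(lam | size, size / mu) over lam (midpoint rule)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += poisson_pmf(y, lam) * gamma_pdf(lam, size, size / mu)
    return total * h


for y in (0, 2, 5):
    print(y, nb_pmf(y, mu=3.0, size=2.0), mixture_pmf(y, mu=3.0, size=2.0))
```

As size grows, the Gamma mixing distribution concentrates at mu and the NB pmf approaches the Poisson pmf, which is the sense in which the Poisson model is the simpler base model.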

Reduced-Rank Linear Discriminant Analysis: R Example

Introduction
Linear Discriminant Analysis (LDA) is a widely used technique in both classification and dimensionality reduction. Its goal is to project data into a lower-dimensional subspace where class separability is maximized. While it is routinely applied in many fields, many practitioners leverage its power without fully grasping what the underlying algorithm actually does. Recently, during one of my applied statistical learning classes, students raised a question about the R implementation in MASS::lda(). They were curious about how the associated predict() method actually transforms the feature data into what is given as “LD” entries in the output object. It turns out that the method transforms the feature data into a lower-dimensional space to achieve optimal class separability. More mathematically: MASS::lda() implements reduced-rank LDA, where the optimal decision boundaries are determined in a lower-dimensional feature space created by projecting the original features into that space. ...

Understanding and Implementing the Box-Muller Transform in Python

When simulating random variables in statistics or machine learning, we often need samples from a standard normal distribution. However, programming languages that are not focused on statistics often lack a rich suite of random number generators for various distributions. Therefore, it is essential to know how to implement such generators from scratch, typically using uniformly distributed pseudo-random numbers. This is where the Box-Muller Transform comes into play—a clever method to transform two uniform random variables into two independent standard normal random variables. ...
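A minimal sketch of the basic (trigonometric) form of the transform, using only Python's standard library, might look as follows. The function names are chosen for this example and the post's own implementation may differ.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to two independent N(0, 1) draws."""
    radius = math.sqrt(-2.0 * math.log(u1))  # radius from the first uniform
    angle = 2.0 * math.pi * u2               # angle from the second uniform
    return radius * math.cos(angle), radius * math.sin(angle)


def standard_normal_sample(n, seed=None):
    """Draw n standard normal variates using only uniform random numbers."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so that log(u1) is finite
        u2 = rng.random()
        draws.extend(box_muller(u1, u2))
    return draws[:n]


sample = standard_normal_sample(10_000, seed=123)
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)
print(round(mean, 3), round(var, 3))
```

The sample mean and variance should land close to 0 and 1. Each pair of uniforms yields two normal draws, so an odd sample size simply discards the last one.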

FFT based covariance estimation in R — Pt. II

In the previous post, I discussed an approach to obtain autocovariances of time series data through discrete Fourier transforms that I implemented in an R function acf_fft_R().

    # ACF using FFT in R
    acf_fft_R <- function(x) {
      n <- length(x)
      a_j <- fft(x)            # discrete Fourier transform of the series
      I_x <- Mod(a_j)^2 / n    # periodogram
      return(
        Re(fft(I_x, inverse = TRUE) / n)
      )
    }

An RcppArmadillo version
Recently, I wrote an Armadillo version for an Rcpp project. Here’s its definition and how to source it using Rcpp: ...
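As a cross-check of what acf_fft_R() computes, the Python sketch below mirrors the same three steps (transform, periodogram, inverse transform) and compares the result with a direct sum. To stay dependency-free it uses a naive O(n²) DFT in place of a real FFT, so it illustrates the identity rather than the speed-up. Note that the function as shown above neither centers nor zero-pads the series, so the quantity it returns is the circular (wrap-around), uncentered autocovariance. All Python names are illustrative.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) DFT, unnormalized in both directions (like R's fft())."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * freq * t / n) for t, v in enumerate(values))
        for freq in range(n)
    ]


def acf_dft(x):
    """Same steps as acf_fft_R(): DFT, periodogram, inverse DFT, divide by n."""
    n = len(x)
    a = dft(x)
    periodogram = [abs(a_j) ** 2 / n for a_j in a]
    return [(value / n).real for value in dft(periodogram, inverse=True)]


def circular_autocov(x):
    """Direct computation: (1 / n) * sum_t x[t] * x[(t + h) mod n] for each lag h."""
    n = len(x)
    return [sum(x[t] * x[(t + h) % n] for t in range(n)) / n for h in range(n)]


x = [1.0, 2.0, 0.5, -1.0, 3.0]
print(acf_dft(x))
print(circular_autocov(x))
```

Both calls print the same values up to floating-point error. Subtracting the mean and zero-padding the series before the transform would give the usual non-circular sample autocovariances instead.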