Reduced-Rank Linear Discriminant Analysis: R Example

Linear Discriminant Analysis (LDA) is a widely used technique for both classification and dimensionality reduction. Its goal is to project data into a lower-dimensional subspace where class separability is maximized. While it is routinely applied across many fields, practitioners often leverage its power without fully grasping what the algorithm actually does. Recently, during one of my applied statistical learning classes, students raised a question about the R implementation in MASS::lda(): they were curious how the associated predict() method transforms the feature data into the “LD” entries of the output object. In short, the method projects the feature data into a lower-dimensional space chosen for optimal class separability. More mathematically: MASS::lda() implements reduced-rank LDA, where the optimal decision boundaries are determined in a lower-dimensional subspace obtained by projecting the original features onto the discriminant directions. ...
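
The post itself works in R with MASS::lda(), but the same idea can be sketched in Python: scikit-learn's LinearDiscriminantAnalysis exposes a transform() method that plays the role of the projection R's predict() reports as “LD” columns. The dataset (iris) and n_components=2 below are illustrative assumptions, not taken from the post.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative data: 4 features, 3 classes (not from the post itself)
X, y = load_iris(return_X_y=True)

# Reduced-rank LDA: with K classes the discriminant scores live in an
# at most (K - 1)-dimensional subspace, here 2 dimensions for 3 classes.
lda = LinearDiscriminantAnalysis(n_components=2)
scores = lda.fit_transform(X, y)  # analogue of the "LD" columns in R

print(scores.shape)  # (150, 2): each observation projected onto 2 discriminants
```

Classification then happens in this projected space, which is why the lower-dimensional “LD” coordinates are all that predict() needs to report.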

May 29, 2021 · 6 min · 1132 words · Martin C. Arnold

Understanding and Implementing the Box-Muller Transform in Python

When simulating random variables in statistics or machine learning, we often need samples from a standard normal distribution. However, programming languages that are not focused on statistics often lack a rich suite of random number generators for various distributions. It is therefore useful to know how to implement such generators from scratch, typically starting from uniformly distributed pseudo-random numbers. This is where the Box-Muller Transform comes into play: a clever method that turns two independent uniform random variables into two independent standard normal random variables. ...
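
As a minimal sketch of what the post derives, the transform fits in a few lines of NumPy: a pair of independent U(0, 1) variates (u1, u2) is mapped to z0 = sqrt(-2 ln u1) cos(2π u2) and z1 = sqrt(-2 ln u1) sin(2π u2). The function name box_muller and the sample size are illustrative; the post's own implementation may differ.

```python
import numpy as np

def box_muller(n, rng=None):
    """Draw n standard normal samples via the Box-Muller transform."""
    rng = np.random.default_rng() if rng is None else rng
    m = (n + 1) // 2                     # number of uniform pairs needed
    u1 = 1.0 - rng.uniform(size=m)       # shift [0, 1) to (0, 1] so log(u1) is finite
    u2 = rng.uniform(size=m)
    r = np.sqrt(-2.0 * np.log(u1))       # radius of the polar representation
    z0 = r * np.cos(2.0 * np.pi * u2)
    z1 = r * np.sin(2.0 * np.pi * u2)    # independent of z0
    return np.concatenate([z0, z1])[:n]

samples = box_muller(10_000)
print(samples.mean(), samples.std())     # should be close to 0 and 1
```

Note the small guard on u1: since log(0) is undefined, the uniforms feeding the logarithm are shifted to the half-open interval (0, 1].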

March 15, 2021 · 7 min · 1420 words · Martin Christopher Arnold