Introduction

Linear Discriminant Analysis (LDA) is a widely used technique for both classification and dimensionality reduction. Its goal is to project data onto a lower-dimensional subspace in which class separability is maximized. Although it is routinely applied in many fields, many practitioners use it without fully grasping what the algorithm actually does.
Recently, during one of my applied statistical learning classes, students asked a question about the R implementation in MASS::lda(). They were curious how the associated predict() method transforms the feature data into the “LD” entries of the output object. In short, the method projects the features into a lower-dimensional space chosen to achieve optimal class separability. More precisely: MASS::lda() implements reduced-rank LDA, where the optimal decision boundaries are determined in a lower-dimensional space obtained by projecting the original features onto the discriminant directions.
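To make this concrete, here is a minimal sketch (using the built-in iris data set as an illustrative example) of where those "LD" entries come from in the fitted object and the predict() output:

```r
library(MASS)

# Fit LDA on iris: 4 predictors, 3 classes.
fit <- lda(Species ~ ., data = iris)

# The fitted object stores the projection directions ("scaling"):
# a 4 x 2 matrix, one column per discriminant (at most K - 1 = 2 here).
dim(fit$scaling)

# predict() returns, among other things, the projected data:
# $x is an n x 2 matrix of discriminant scores, with columns LD1 and LD2 --
# these are the "LD" entries students asked about.
pred <- predict(fit)
head(pred$x)
```

The number of discriminant directions is at most min(p, K - 1), which is why a three-class problem yields exactly two LD columns regardless of how many predictors go in.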
...