Introduction to Statistical Modeling with SAS/STAT Software

Sweep Operator

The sweep operator (Goodnight 1979) is closely related to Gauss-Jordan elimination and the Forward Doolittle procedure. The fact that a sweep operation can produce a generalized inverse by in-place mapping with minimal storage and that its application invariably leads to some form of matrix inversion is important, but this observation does not do justice to the pervasive relevance of sweeping to statistical computing. In this section the sweep operator is discussed as a conceptual tool for further insight into linear model operations. Consider the nonnegative definite, symmetric, partitioned matrix

bold upper A equals Start 2 By 2 Matrix 1st Row 1st Column bold upper A 11 2nd Column bold upper A 12 2nd Row 1st Column bold upper A prime 12 2nd Column bold upper A 22 EndMatrix

Sweeping a matrix consists of performing a series of row operations akin to Gauss-Jordan elimination. Basic row operations are the multiplication of a row by a constant and the addition of a multiple of one row to another. The sweep operator restricts row operations to pivots on the diagonal elements of a matrix; further details about the elementary operations can be found in Goodnight (1979). The process of sweeping the matrix bold upper A on its leading partition is denoted as normal upper S normal w normal e normal e normal p left-parenthesis bold upper A comma bold upper A 11 right-parenthesis and leads to

normal upper S normal w normal e normal e normal p left-parenthesis bold upper A comma bold upper A 11 right-parenthesis equals Start 2 By 2 Matrix 1st Row 1st Column bold upper A 11 Superscript minus Baseline 2nd Column bold upper A 11 Superscript minus Baseline bold upper A 12 2nd Row 1st Column minus bold upper A prime 12 bold upper A 11 Superscript minus Baseline 2nd Column bold upper A 22 minus bold upper A prime 12 bold upper A 11 Superscript minus Baseline bold upper A 12 EndMatrix

If the kth row and column are set to zero when the pivot is zero (or in practice, less than some singularity tolerance), the generalized inverse in the leading position of the swept matrix is a reflexive, g 2-inverse. Suppose that the crossproduct matrix of the linear model is augmented with a "Y-border" as follows:

bold upper C equals Start 2 By 2 Matrix 1st Row 1st Column bold upper X prime bold upper X 2nd Column bold upper X prime bold upper Y 2nd Row 1st Column bold upper Y prime bold upper X 2nd Column bold upper Y prime bold upper Y EndMatrix

Then the result of sweeping on the rows of bold upper X is

StartLayout 1st Row 1st Column normal upper S normal w normal e normal e normal p left-parenthesis bold upper C comma bold upper X right-parenthesis equals 2nd Column Start 2 By 2 Matrix 1st Row 1st Column left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus 2nd Column left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus Baseline bold upper X prime bold upper Y 2nd Row 1st Column minus bold upper Y prime bold upper X left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus 2nd Column bold upper Y prime bold upper Y minus bold upper Y prime bold upper X left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus Baseline bold upper X prime bold upper Y EndMatrix 2nd Row 1st Column equals 2nd Column Start 2 By 2 Matrix 1st Row 1st Column left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus Baseline 2nd Column ModifyingAbove bold-italic beta With caret 2nd Row 1st Column minus ModifyingAbove bold-italic beta With caret 2nd Column bold upper Y prime bold upper M bold upper Y EndMatrix equals Start 2 By 2 Matrix 1st Row 1st Column left-parenthesis bold upper X prime bold upper X right-parenthesis Superscript minus Baseline 2nd Column ModifyingAbove bold-italic beta With caret 2nd Row 1st Column minus ModifyingAbove bold-italic beta With caret 2nd Column normal upper S normal upper S normal upper R EndMatrix EndLayout

The "Y-border" has been transformed into the least squares solution and the residual sum of squares.

Partial sweeps are common in model selection. Suppose that the bold upper X matrix is partitioned as left-bracket bold upper X 1 bold upper X 2 right-bracket, and consider the augmented crossproduct matrix

bold upper C equals Start 3 By 3 Matrix 1st Row 1st Column bold upper X prime 1 bold upper X 1 2nd Column bold upper X prime 1 bold upper X 2 3rd Column bold upper X prime 1 bold upper Y 2nd Row 1st Column bold upper X prime 2 bold upper X 1 2nd Column bold upper X prime 2 bold upper X 2 3rd Column bold upper X prime 2 bold upper Y 3rd Row 1st Column bold upper Y prime bold upper X 1 2nd Column bold upper Y prime bold upper X 2 3rd Column bold upper Y prime bold upper Y EndMatrix

Sweeping on the bold upper X 1 partition yields

normal upper S normal w normal e normal e normal p left-parenthesis bold upper C comma bold upper X 1 right-parenthesis equals Start 3 By 3 Matrix 1st Row 1st Column left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline 2nd Column left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline bold upper X prime 1 bold upper X 2 3rd Column left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline bold upper X prime 1 bold upper Y 2nd Row 1st Column minus bold upper X prime 2 bold upper X 1 left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline 2nd Column bold upper X prime 2 bold upper M 1 bold upper X 2 3rd Column bold upper X prime 2 bold upper M 1 bold upper Y 3rd Row 1st Column minus bold upper Y prime bold upper X 1 left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline 2nd Column minus bold upper Y prime bold upper M 1 bold upper X 2 3rd Column bold upper Y prime bold upper M 1 bold upper Y EndMatrix

where bold upper M 1 equals bold upper I minus bold upper X 1 left-parenthesis bold upper X prime 1 bold upper X 1 right-parenthesis Superscript minus Baseline bold upper X prime 1. The entries in the first row of this partition are the generalized inverse of bold upper X prime bold upper X, the coefficients for regressing bold upper X 2 on bold upper X 1, and the coefficients for regressing bold upper Y on bold upper X 1. The diagonal entries bold upper X prime 2 bold upper M 1 bold upper X 2 and bold upper Y prime bold upper M 1 bold upper Y are the sum of squares and crossproduct matrices for regressing bold upper X 2 on bold upper X 1 and for regressing bold upper Y on bold upper X 1, respectively. As you continue to sweep the matrix, the last cell in the partition contains the residual sum of square of a model in which bold upper Y is regressed on all columns swept up to that point.

The sweep operator is not only useful to conceptualize the computation of least squares solutions, Type I and Type II sums of squares, and generalized inverses. It can also be used to obtain other statistical information. For example, adding the logarithms of the pivots of the rows that are swept yields the log determinant of the matrix.

Last updated: December 09, 2022