Introduction to Statistical Modeling with SAS/STAT Software

Important Linear Algebra Concepts

A matrix $\mathbf{A}$ is a rectangular array of numbers. The order of a matrix with $n$ rows and $k$ columns is $(n \times k)$. The element in row $i$, column $j$ of $\mathbf{A}$ is denoted as $a_{ij}$, and the notation $[a_{ij}]$ is sometimes used to refer to the two-dimensional row-column array:

$$
\mathbf{A} = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1k} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2k} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nk}
\end{bmatrix} = [a_{ij}]
$$

A vector is a one-dimensional array of numbers. A column vector has a single column ($k = 1$). A row vector has a single row ($n = 1$). A scalar is a matrix of order $(1 \times 1)$, that is, a single number. A square matrix has the same row and column order, $n = k$. A diagonal matrix is a square matrix where all off-diagonal elements are zero, $a_{ij} = 0$ if $i \neq j$. The identity matrix $\mathbf{I}$ is a diagonal matrix with $a_{ii} = 1$ for all $i$. The unit vector $\mathbf{1}$ is a vector where all elements are 1. The unit matrix $\mathbf{J}$ is a matrix of all 1s. Similarly, the elements of the null vector and the null matrix are all 0.

Basic matrix operations are as follows:

Addition

If $\mathbf{A}$ and $\mathbf{B}$ are of the same order, then $\mathbf{A} + \mathbf{B}$ is the matrix of elementwise sums,

$$\mathbf{A} + \mathbf{B} = [a_{ij} + b_{ij}]$$

Subtraction

If $\mathbf{A}$ and $\mathbf{B}$ are of the same order, then $\mathbf{A} - \mathbf{B}$ is the matrix of elementwise differences,

$$\mathbf{A} - \mathbf{B} = [a_{ij} - b_{ij}]$$

Dot product

The dot product of two $n$-vectors $\mathbf{a}$ and $\mathbf{b}$ is the sum of their elementwise products,

$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i$$

The dot product is also known as the inner product of $\mathbf{a}$ and $\mathbf{b}$. Two vectors are said to be orthogonal if their dot product is zero.

Multiplication

Matrices $\mathbf{A}$ and $\mathbf{B}$ are said to be conformable for $\mathbf{A}\mathbf{B}$ multiplication if the number of columns in $\mathbf{A}$ equals the number of rows in $\mathbf{B}$. Suppose that $\mathbf{A}$ is of order $(n \times k)$ and that $\mathbf{B}$ is of order $(k \times p)$. The product $\mathbf{A}\mathbf{B}$ is then defined as the $(n \times p)$ matrix of the dot products of the $i$th row of $\mathbf{A}$ and the $j$th column of $\mathbf{B}$,

$$\mathbf{A}\mathbf{B} = [\mathbf{a}_i \cdot \mathbf{b}_j]_{n \times p}$$

Transposition

The transpose of the $(n \times k)$ matrix $\mathbf{A}$ is denoted as $\mathbf{A}'$ and is obtained by interchanging the rows and columns,

$$
\mathbf{A}' = \begin{bmatrix}
a_{11} & a_{21} & a_{31} & \cdots & a_{n1} \\
a_{12} & a_{22} & a_{32} & \cdots & a_{n2} \\
a_{13} & a_{23} & a_{33} & \cdots & a_{n3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{1k} & a_{2k} & a_{3k} & \cdots & a_{nk}
\end{bmatrix} = [a_{ji}]
$$

A symmetric matrix is equal to its transpose, $\mathbf{A} = \mathbf{A}'$. The inner product of two $(n \times 1)$ column vectors $\mathbf{a}$ and $\mathbf{b}$ is $\mathbf{a} \cdot \mathbf{b} = \mathbf{a}'\mathbf{b}$.
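The definitions of the dot product, matrix product, and transpose given above can be sketched in a few lines of plain Python. This is only an illustration, not SAS code: the function names `dot`, `transpose`, and `matmul` and the example matrices are assumptions made for the sketch, and matrices are stored as lists of rows.

```python
def dot(a, b):
    """Dot product: sum of elementwise products of two n-vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def transpose(A):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of an (n x k) and a (k x p) matrix.

    Entry (i, j) is the dot product of row i of A and column j of B.
    """
    if len(A[0]) != len(B):
        raise ValueError("A and B are not conformable for AB")
    columns_of_B = transpose(B)
    return [[dot(row, col) for col in columns_of_B] for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]           # order (2 x 3)
B = [[1, 0],
     [0, 1],
     [2, 2]]              # order (3 x 2)

AB = matmul(A, B)         # order (2 x 2)
A_t = transpose(A)        # order (3 x 2)

# Inner product of two (3 x 1) column vectors written as a'b
a = [[1], [2], [3]]
b = [[4], [5], [6]]
a_t_b = matmul(transpose(a), b)   # a (1 x 1) matrix

print(AB)
print(A_t)
print(a_t_b)
```

Writing the product as a table of row-by-column dot products mirrors the definition directly; it is not meant to be efficient for large matrices.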

Matrix Inversion

Regular Inverses

The right inverse of a matrix $\mathbf{A}$ is the matrix that yields the identity when $\mathbf{A}$ is postmultiplied by it. Similarly, the left inverse of $\mathbf{A}$ yields the identity if $\mathbf{A}$ is premultiplied by it. $\mathbf{A}$ is said to be invertible and $\mathbf{B}$ is said to be the inverse of $\mathbf{A}$ if $\mathbf{B}$ is its right and left inverse, $\mathbf{B}\mathbf{A} = \mathbf{A}\mathbf{B} = \mathbf{I}$. This requires $\mathbf{A}$ to be square and nonsingular. The inverse of a matrix $\mathbf{A}$ is commonly denoted as $\mathbf{A}^{-1}$. The following results are useful in manipulating inverse matrices (assuming both $\mathbf{A}$ and $\mathbf{C}$ are invertible):

$$
\begin{aligned}
\mathbf{A}\mathbf{A}^{-1} &= \mathbf{A}^{-1}\mathbf{A} = \mathbf{I} \\
(\mathbf{A}')^{-1} &= (\mathbf{A}^{-1})' \\
(\mathbf{A}^{-1})^{-1} &= \mathbf{A} \\
(\mathbf{A}\mathbf{C})^{-1} &= \mathbf{C}^{-1}\mathbf{A}^{-1} \\
\mathrm{rank}(\mathbf{A}) &= \mathrm{rank}(\mathbf{A}^{-1})
\end{aligned}
$$

If $\mathbf{D}$ is a diagonal matrix with nonzero entries on the diagonal, that is, $\mathbf{D} = \mathrm{diag}(d_1, \ldots, d_n)$, then $\mathbf{D}^{-1} = \mathrm{diag}(1/d_1, \ldots, 1/d_n)$. If $\mathbf{D}$ is a block-diagonal matrix whose blocks are invertible, then its inverse is the block-diagonal matrix of the inverted blocks:

$$
\mathbf{D} = \begin{bmatrix}
\mathbf{D}_1 & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{D}_2 & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{D}_3 & \cdots & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{D}_n
\end{bmatrix}
\qquad
\mathbf{D}^{-1} = \begin{bmatrix}
\mathbf{D}_1^{-1} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{D}_2^{-1} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{D}_3^{-1} & \cdots & \mathbf{0} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{D}_n^{-1}
\end{bmatrix}
$$

In statistical applications the following two results are particularly important, because they can significantly reduce the computational burden in working with inverse matrices.

Partitioned Matrix

Suppose $\mathbf{A}$ is a nonsingular matrix that is partitioned as

$$
\mathbf{A} = \begin{bmatrix}
\mathbf{A}_{11} & \mathbf{A}_{12} \\
\mathbf{A}_{21} & \mathbf{A}_{22}
\end{bmatrix}
$$

Then, provided that all the inverses exist, the inverse of $\mathbf{A}$ is given by

$$
\mathbf{A}^{-1} = \begin{bmatrix}
\mathbf{B}_{11} & \mathbf{B}_{12} \\
\mathbf{B}_{21} & \mathbf{B}_{22}
\end{bmatrix}
$$

where $\mathbf{B}_{11} = (\mathbf{A}_{11} - \mathbf{A}_{12}\mathbf{A}_{22}^{-1}\mathbf{A}_{21})^{-1}$, $\mathbf{B}_{12} = -\mathbf{B}_{11}\mathbf{A}_{12}\mathbf{A}_{22}^{-1}$, $\mathbf{B}_{21} = -\mathbf{A}_{22}^{-1}\mathbf{A}_{21}\mathbf{B}_{11}$, and $\mathbf{B}_{22} = (\mathbf{A}_{22} - \mathbf{A}_{21}\mathbf{A}_{11}^{-1}\mathbf{A}_{12})^{-1}$.
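As a rough check of the partitioned-inverse result, the following Python sketch builds the four blocks of the inverse from the formulas above and compares them with a direct inverse. The (4 x 4) example matrix, its (2 x 2) blocks, and the helper names (`matmul`, `sub`, `neg`, `inv`, `hstack`) are assumptions chosen for illustration; exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def neg(A):
    return [[-a for a in row] for row in A]


def hstack(L, R):
    """Place two blocks with the same number of rows side by side."""
    return [l + r for l, r in zip(L, R)]


def inv(A):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(A)
    # Augment A with the identity: [A | I]
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot_row = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular")
        M[c], M[pivot_row] = M[pivot_row], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


# Hypothetical blocks of a nonsingular (4 x 4) matrix
A11 = [[4, 1], [1, 3]]
A12 = [[1, 0], [0, 1]]
A21 = [[0, 1], [1, 0]]
A22 = [[2, 0], [1, 2]]
A = hstack(A11, A12) + hstack(A21, A22)

A11_inv, A22_inv = inv(A11), inv(A22)
B11 = inv(sub(A11, matmul(matmul(A12, A22_inv), A21)))
B12 = neg(matmul(matmul(B11, A12), A22_inv))
B21 = neg(matmul(matmul(A22_inv, A21), B11))
B22 = inv(sub(A22, matmul(matmul(A21, A11_inv), A12)))

A_inv_from_blocks = hstack(B11, B12) + hstack(B21, B22)
print(A_inv_from_blocks == inv(A))
```

In this sketch only (2 x 2) inverses are needed to assemble the (4 x 4) inverse, which is the kind of saving the partitioned form is meant to provide.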

Patterned Sum

Suppose $\mathbf{R}$ is $(n \times n)$ nonsingular, $\mathbf{G}$ is $(k \times k)$ nonsingular, and $\mathbf{B}$ and $\mathbf{C}$ are $(n \times k)$ and $(k \times n)$ matrices, respectively. Then the inverse of $\mathbf{R} + \mathbf{B}\mathbf{G}\mathbf{C}$ is given by

$$(\mathbf{R} + \mathbf{B}\mathbf{G}\mathbf{C})^{-1} = \mathbf{R}^{-1} - \mathbf{R}^{-1}\mathbf{B}\left(\mathbf{G}^{-1} + \mathbf{C}\mathbf{R}^{-1}\mathbf{B}\right)^{-1}\mathbf{C}\mathbf{R}^{-1}$$

This formula is particularly useful if $k \ll n$ and $\mathbf{R}$ has a simple form that is easy to invert. This case arises, for example, in mixed models where $\mathbf{R}$ might be a diagonal or block-diagonal matrix, and $\mathbf{B} = \mathbf{C}'$.

Another situation where this formula plays a critical role is in the computation of regression diagnostics, such as in determining the effect of removing an observation from the analysis. Suppose that $\mathbf{A} = \mathbf{X}'\mathbf{X}$ represents the crossproduct matrix in the linear model $\mathrm{E}[\mathbf{Y}] = \mathbf{X}\boldsymbol{\beta}$. If $\mathbf{x}_i'$ is the $i$th row of the $\mathbf{X}$ matrix, then $(\mathbf{X}'\mathbf{X} - \mathbf{x}_i\mathbf{x}_i')$ is the crossproduct matrix in the same model with the $i$th observation removed. Identifying $\mathbf{R} = \mathbf{X}'\mathbf{X}$, $\mathbf{B} = -\mathbf{x}_i$, $\mathbf{C} = \mathbf{x}_i'$, and $\mathbf{G} = \mathbf{I}$ (here the scalar 1) in the preceding inversion formula, you can obtain the expression for the inverse of the crossproduct matrix:

$$(\mathbf{X}'\mathbf{X} - \mathbf{x}_i\mathbf{x}_i')^{-1} = (\mathbf{X}'\mathbf{X})^{-1} + \frac{(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i\mathbf{x}_i'(\mathbf{X}'\mathbf{X})^{-1}}{1 - \mathbf{x}_i'(\mathbf{X}'\mathbf{X})^{-1}\mathbf{x}_i}$$

This expression for the inverse of the reduced data crossproduct matrix enables you to compute "leave-one-out" deletion diagnostics in linear models without refitting the model.
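The leave-one-out update can be illustrated with a small Python sketch under stated assumptions: the design matrix `X` (an intercept plus one regressor, four observations), the helper `inv2`, and the choice of which observation to drop are all hypothetical. The sketch compares the updated inverse with the inverse obtained by recomputing the crossproduct matrix from the reduced data.

```python
from fractions import Fraction


def inv2(M):
    """Closed-form inverse of a nonsingular (2 x 2) matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical design matrix: intercept plus one regressor, four observations
X = [[1, 1],
     [1, 2],
     [1, 3],
     [1, 4]]

# Crossproduct matrix X'X and its inverse for the full data
XtX = [[sum(row[j] * row[k] for row in X) for k in range(2)] for j in range(2)]
XtX_inv = inv2(XtX)

i = 3                      # drop the last observation (0-based index)
x = X[i]

# v = (X'X)^{-1} x_i and the quadratic form h = x_i' (X'X)^{-1} x_i
v = [sum(XtX_inv[j][k] * x[k] for k in range(2)) for j in range(2)]
h = sum(x[j] * v[j] for j in range(2))

# Update formula: (X'X)^{-1} + v v' / (1 - h), using the symmetry of (X'X)^{-1}
updated = [[XtX_inv[j][k] + v[j] * v[k] / (1 - h) for k in range(2)]
           for j in range(2)]

# Direct computation for comparison: rebuild X'X without observation i
X_reduced = X[:i] + X[i + 1:]
reduced = [[sum(row[j] * row[k] for row in X_reduced) for k in range(2)]
           for j in range(2)]
direct = inv2(reduced)

print(updated == direct)
```

The update needs only the full-data inverse and the row being removed, so repeating it for every observation avoids a separate inversion per deletion.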

Generalized Inverse Matrices

If $\mathbf{A}$ is rectangular (not square) or singular, then it is not invertible and the matrix $\mathbf{A}^{-1}$ does not exist. Suppose you want to find a solution to simultaneous linear equations of the form

$$\mathbf{A}\mathbf{b} = \mathbf{c}$$

If $\mathbf{A}$ is square and nonsingular, then the unique solution is $\mathbf{b} = \mathbf{A}^{-1}\mathbf{c}$. In statistical applications, the case where $\mathbf{A}$ is $(n \times k)$ rectangular is less important than the case where $\mathbf{A}$ is a $(k \times k)$ square matrix of rank less than $k$. For example, the normal equations in ordinary least squares (OLS) estimation in the model $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$ are

$$(\mathbf{X}'\mathbf{X})\boldsymbol{\beta} = \mathbf{X}'\mathbf{Y}$$

A generalized inverse matrix is a matrix $\mathbf{A}^{-}$ such that $\mathbf{A}^{-}\mathbf{c}$ is a solution to the linear system. In the OLS example, a solution can be found as $(\mathbf{X}'\mathbf{X})^{-}\mathbf{X}'\mathbf{Y}$, where $(\mathbf{X}'\mathbf{X})^{-}$ is a generalized inverse of $\mathbf{X}'\mathbf{X}$.

The following four conditions are often associated with generalized inverses. For the square or rectangular matrix $\mathbf{A}$ there exist matrices $\mathbf{G}$ that satisfy

$$
\begin{aligned}
\text{(i)} \quad & \mathbf{A}\mathbf{G}\mathbf{A} = \mathbf{A} \\
\text{(ii)} \quad & \mathbf{G}\mathbf{A}\mathbf{G} = \mathbf{G} \\
\text{(iii)} \quad & (\mathbf{A}\mathbf{G})' = \mathbf{A}\mathbf{G} \\
\text{(iv)} \quad & (\mathbf{G}\mathbf{A})' = \mathbf{G}\mathbf{A}
\end{aligned}
$$

The matrix $\mathbf{G}$ that satisfies all four conditions is unique and is called the Moore-Penrose inverse, after the first published work on generalized inverses by Moore (1920) and the subsequent definition by Penrose (1955). Only the first condition is required, however, to provide a solution to the linear system above.

Pringle and Rayner (1971) introduced a numbering system to distinguish between different types of generalized inverses. A matrix that satisfies only condition (i) is a $g_1$-inverse. The $g_2$-inverse satisfies conditions (i) and (ii). It is also called a reflexive generalized inverse. Matrices satisfying conditions (i)–(iii) or conditions (i), (ii), and (iv) are $g_3$-inverses. Note that a matrix that satisfies the first three conditions is a right generalized inverse, and a matrix that satisfies conditions (i), (ii), and (iv) is a left generalized inverse. For example, if $\mathbf{B}$ is $(n \times k)$ of rank $k$, then $(\mathbf{B}'\mathbf{B})^{-1}\mathbf{B}'$ is a left generalized inverse of $\mathbf{B}$. The notation $g_4$-inverse for the Moore-Penrose inverse, satisfying conditions (i)–(iv), is often used by extension, but note that Pringle and Rayner (1971) do not use it; rather, they call such a matrix "the" generalized inverse.

If the $(n \times k)$ matrix $\mathbf{X}$ is rank-deficient, that is, $\mathrm{rank}(\mathbf{X}) < \min\{n, k\}$, then the system of equations

$$(\mathbf{X}'\mathbf{X})\boldsymbol{\beta} = \mathbf{X}'\mathbf{Y}$$

does not have a unique solution. A particular solution depends on the choice of the generalized inverse. However, some aspects of the statistical inference are invariant to the choice of the generalized inverse. If $\mathbf{G}$ is a generalized inverse of $\mathbf{X}'\mathbf{X}$, then $\mathbf{X}\mathbf{G}\mathbf{X}'$ is invariant to the choice of $\mathbf{G}$. This result comes into play, for example, when you are computing predictions in an OLS model with a rank-deficient $\mathbf{X}$ matrix, since it implies that the predicted values

$$\mathbf{X}\widehat{\boldsymbol{\beta}} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-}\mathbf{X}'\mathbf{Y}$$

are invariant to the choice of $(\mathbf{X}'\mathbf{X})^{-}$.
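The invariance of the predicted values can be seen in a small Python sketch. Everything specific here is an assumption for illustration: the rank-deficient design `X` (an intercept plus two group indicators), the response `Y`, and the two generalized inverses `G1` and `G2`, which were obtained by inverting different nonsingular (2 x 2) submatrices of the crossproduct matrix and filling the remaining row and column with zeros.

```python
from fractions import Fraction


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Hypothetical rank-deficient design: intercept plus indicators for two groups,
# so the first column equals the sum of the other two (rank 2, not 3).
X = [[1, 1, 0],
     [1, 1, 0],
     [1, 0, 1],
     [1, 0, 1]]
Y = [[1], [3], [5], [9]]

Xt = transpose(X)
XtX = matmul(Xt, X)        # singular (3 x 3) crossproduct matrix
XtY = matmul(Xt, Y)

half = Fraction(1, 2)
# Two different generalized inverses of X'X
G1 = [[0, 0, 0],
      [0, half, 0],
      [0, 0, half]]
G2 = [[half, -half, 0],
      [-half, 1, 0],
      [0, 0, 0]]

for G in (G1, G2):
    # Condition (i): A G A = A
    assert matmul(matmul(XtX, G), XtX) == XtX

beta1 = matmul(G1, XtY)    # one solution of the normal equations
beta2 = matmul(G2, XtY)    # a different solution

pred1 = matmul(X, beta1)   # predicted values from the first solution
pred2 = matmul(X, beta2)   # predicted values from the second solution

print(beta1 != beta2, pred1 == pred2)
```

The two solutions correspond to different ways of resolving the redundancy among the columns, yet both reproduce the same fitted values (the group means of the assumed response).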

Matrix Differentiation

Taking the derivative of expressions involving matrices is a frequent task in statistical estimation. Objective functions that are to be minimized or maximized are usually written in terms of model matrices and/or vectors whose elements depend on the unknowns of the estimation problem. Suppose that $\mathbf{A}$ and $\mathbf{B}$ are real matrices whose elements depend on the scalar quantities $\beta$ and $\theta$, that is, $\mathbf{A} = [a_{ij}(\beta, \theta)]$, and similarly for $\mathbf{B}$.

The following are useful results in finding the derivative of elements of a matrix and of functions involving a matrix. For more in-depth discussion of matrix differentiation and matrix calculus, see, for example, Magnus and Neudecker (1999) and Harville (1997).

The derivative of $\mathbf{A}$ with respect to $\beta$ is denoted $\dot{\mathbf{A}}_\beta$ and is the matrix of the first derivatives of the elements of $\mathbf{A}$:

$$\dot{\mathbf{A}}_\beta = \frac{\partial}{\partial\beta}\mathbf{A} = \left[\frac{\partial a_{ij}(\beta, \theta)}{\partial\beta}\right]$$

Similarly, the second derivative of $\mathbf{A}$ with respect to $\beta$ and $\theta$ is the matrix of the second derivatives

$$\ddot{\mathbf{A}}_{\beta\theta} = \frac{\partial^2}{\partial\beta\,\partial\theta}\mathbf{A} = \left[\frac{\partial^2 a_{ij}(\beta, \theta)}{\partial\beta\,\partial\theta}\right]$$

The following are some basic results involving sums, products, and traces of matrices:

$$
\begin{aligned}
\frac{\partial}{\partial\beta}\, c_1\mathbf{A} &= c_1\dot{\mathbf{A}}_\beta \\
\frac{\partial}{\partial\beta}\, (\mathbf{A} + \mathbf{B}) &= \dot{\mathbf{A}}_\beta + \dot{\mathbf{B}}_\beta \\
\frac{\partial}{\partial\beta}\, (c_1\mathbf{A} + c_2\mathbf{B}) &= c_1\dot{\mathbf{A}}_\beta + c_2\dot{\mathbf{B}}_\beta \\
\frac{\partial}{\partial\beta}\, \mathbf{A}\mathbf{B} &= \mathbf{A}\dot{\mathbf{B}}_\beta + \dot{\mathbf{A}}_\beta\mathbf{B} \\
\frac{\partial}{\partial\beta}\, \mathrm{trace}(\mathbf{A}) &= \mathrm{trace}(\dot{\mathbf{A}}_\beta) \\
\frac{\partial}{\partial\beta}\, \mathrm{trace}(\mathbf{A}\mathbf{B}) &= \mathrm{trace}(\mathbf{A}\dot{\mathbf{B}}_\beta) + \mathrm{trace}(\dot{\mathbf{A}}_\beta\mathbf{B})
\end{aligned}
$$

The next set of results is useful in finding the derivative of elements of $\mathbf{A}$ and of functions of $\mathbf{A}$, if $\mathbf{A}$ is a nonsingular matrix:

$$
\begin{aligned}
\frac{\partial}{\partial\beta}\, \mathbf{x}'\mathbf{A}^{-1}\mathbf{x} &= -\mathbf{x}'\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta\mathbf{A}^{-1}\mathbf{x} \\
\frac{\partial}{\partial\beta}\, \mathbf{A}^{-1} &= -\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta\mathbf{A}^{-1} \\
\frac{\partial}{\partial\beta}\, |\mathbf{A}| &= |\mathbf{A}|\, \mathrm{trace}(\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta) \\
\frac{\partial}{\partial\beta}\, \log\{|\mathbf{A}|\} &= \frac{1}{|\mathbf{A}|}\frac{\partial}{\partial\beta}|\mathbf{A}| = \mathrm{trace}(\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta) \\
\frac{\partial^2}{\partial\beta\,\partial\theta}\, \mathbf{A}^{-1} &= -\mathbf{A}^{-1}\ddot{\mathbf{A}}_{\beta\theta}\mathbf{A}^{-1} + \mathbf{A}^{-1}\dot{\mathbf{A}}_\beta\mathbf{A}^{-1}\dot{\mathbf{A}}_\theta\mathbf{A}^{-1} + \mathbf{A}^{-1}\dot{\mathbf{A}}_\theta\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta\mathbf{A}^{-1} \\
\frac{\partial^2}{\partial\beta\,\partial\theta}\, \log\{|\mathbf{A}|\} &= \mathrm{trace}(\mathbf{A}^{-1}\ddot{\mathbf{A}}_{\beta\theta}) - \mathrm{trace}(\mathbf{A}^{-1}\dot{\mathbf{A}}_\beta\mathbf{A}^{-1}\dot{\mathbf{A}}_\theta)
\end{aligned}
$$
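Two of the results above, the derivative of the inverse and the derivative of the determinant, can be checked numerically. The Python sketch below is only an illustration: the matrix function `A_of`, the evaluation point, and the finite-difference step `h` are assumptions, and the comparison is against central finite differences rather than a symbolic derivation.

```python
from fractions import Fraction


def A_of(beta):
    """Hypothetical (2 x 2) matrix whose elements depend on a scalar beta."""
    return [[1 + beta, beta],
            [beta, Fraction(2)]]


A_dot = [[1, 1],
         [1, 0]]            # elementwise derivative of A_of with respect to beta


def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d],
            [-M[1][0] / d, M[0][0] / d]]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


beta = Fraction(1)
h = Fraction(1, 10**4)      # step for central finite differences

A = A_of(beta)
A_inv = inv2(A)

# Analytic results: -A^{-1} A_dot A^{-1} and |A| trace(A^{-1} A_dot)
d_inv = [[-x for x in row] for row in matmul(matmul(A_inv, A_dot), A_inv)]
AinvAdot = matmul(A_inv, A_dot)
d_det = det2(A) * (AinvAdot[0][0] + AinvAdot[1][1])

# Finite-difference approximations for comparison
up, down = A_of(beta + h), A_of(beta - h)
fd_inv = [[(u - l) / (2 * h) for u, l in zip(ru, rl)]
          for ru, rl in zip(inv2(up), inv2(down))]
fd_det = (det2(up) - det2(down)) / (2 * h)

max_err = max(abs(a - f) for ra, rf in zip(d_inv, fd_inv) for a, f in zip(ra, rf))
print(float(max_err), float(abs(d_det - fd_det)))
```

The finite-difference values agree with the analytic expressions only up to the truncation error of the approximation, so a small nonzero discrepancy for the inverse is expected.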

Now suppose that $\mathbf{a}$ and $\mathbf{b}$ are column vectors that depend on $\beta$ and/or $\theta$ and that $\mathbf{x}$ is a vector of constants. The following results are useful for manipulating derivatives of linear and quadratic forms:

$$
\begin{aligned}
\frac{\partial}{\partial\mathbf{x}}\, \mathbf{a}'\mathbf{x} &= \mathbf{a} \\
\frac{\partial}{\partial\mathbf{x}'}\, \mathbf{B}\mathbf{x} &= \mathbf{B} \\
\frac{\partial}{\partial\mathbf{x}}\, \mathbf{x}'\mathbf{B}\mathbf{x} &= (\mathbf{B} + \mathbf{B}')\mathbf{x} \\
\frac{\partial^2}{\partial\mathbf{x}\,\partial\mathbf{x}'}\, \mathbf{x}'\mathbf{B}\mathbf{x} &= \mathbf{B} + \mathbf{B}'
\end{aligned}
$$
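The gradient of a quadratic form can be verified with a short Python sketch. The matrix `B`, the point `x`, and the step `h` are hypothetical values chosen for illustration, and `B` is treated as fixed with respect to `x`. Because the function is quadratic, central finite differences in exact rational arithmetic reproduce the gradient without approximation error.

```python
from fractions import Fraction

B = [[2, 1],
     [3, 4]]               # deliberately non-symmetric, held fixed
x = [Fraction(1), Fraction(-2)]


def quad_form(x):
    """x' B x for the fixed matrix B above."""
    n = len(x)
    return sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))


# Analytic gradient (B + B') x
n = len(x)
analytic = [sum((B[i][j] + B[j][i]) * x[j] for j in range(n)) for i in range(n)]

# Central finite differences, one coordinate at a time
h = Fraction(1, 1000)
numeric = []
for i in range(n):
    up = [xj + (h if j == i else 0) for j, xj in enumerate(x)]
    down = [xj - (h if j == i else 0) for j, xj in enumerate(x)]
    numeric.append((quad_form(up) - quad_form(down)) / (2 * h))

print(analytic, numeric)
```

Using a non-symmetric `B` shows why the result involves the sum of `B` and its transpose rather than twice `B`; the two coincide only when `B` is symmetric.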

Matrix Decompositions

To decompose a matrix is to express it as a function, typically a product, of other matrices that have particular properties such as orthogonality, diagonality, or triangularity. For example, the Cholesky decomposition of a symmetric positive definite matrix $\mathbf{A}$ is $\mathbf{C}\mathbf{C}' = \mathbf{A}$, where $\mathbf{C}$ is a lower-triangular matrix. The spectral decomposition of a symmetric matrix is $\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}'$, where $\mathbf{D}$ is a diagonal matrix and $\mathbf{P}$ is an orthogonal matrix.

Matrix decompositions play an important role in statistical theory as well as in statistical computations. Calculations in terms of decompositions can have greater numerical stability. Decompositions are often necessary to extract information about matrices, such as matrix rank, eigenvalues, or eigenvectors. Decompositions are also used to form special transformations of matrices, such as to form a "square-root" matrix. This section briefly mentions several decompositions that are particularly prevalent and important.

LDU, LU, and Cholesky Decomposition

Every square matrix $\mathbf{A}$, whether it is positive definite or not, can be expressed in the form $\mathbf{A} = \mathbf{L}\mathbf{D}\mathbf{U}$, where $\mathbf{L}$ is a unit lower-triangular matrix, $\mathbf{D}$ is a diagonal matrix, and $\mathbf{U}$ is a unit upper-triangular matrix. (The diagonal elements of a unit triangular matrix are 1.) Because of the arrangement of the matrices, the decomposition is called the LDU decomposition. Since you can absorb the diagonal matrix into the triangular matrices, the decomposition

$$\mathbf{A} = \mathbf{L}\mathbf{D}^{1/2}\mathbf{D}^{1/2}\mathbf{U} = \mathbf{L}^{*}\mathbf{U}^{*}$$

is also referred to as the LU decomposition of $\mathbf{A}$.

If the matrix $\mathbf{A}$ is positive definite, then the diagonal elements of $\mathbf{D}$ are positive and the LDU decomposition is unique. If $\mathbf{A}$ is also symmetric, then the unique decomposition takes the form $\mathbf{A} = \mathbf{U}'\mathbf{D}\mathbf{U}$, where $\mathbf{U}$ is unit upper-triangular and $\mathbf{D}$ is diagonal with positive elements. Absorbing the square root of $\mathbf{D}$ into $\mathbf{U}$ by setting $\mathbf{C} = \mathbf{D}^{1/2}\mathbf{U}$ gives the Cholesky decomposition of a positive-definite matrix:

$$\mathbf{A} = \mathbf{U}'\mathbf{D}^{1/2}\mathbf{D}^{1/2}\mathbf{U} = \mathbf{C}'\mathbf{C}$$

where $\mathbf{C}$ is upper triangular.
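A minimal Python sketch of the Cholesky decomposition in the upper-triangular form described above follows. The function name `cholesky_upper` and the example matrix are assumptions for illustration; the sketch computes the factor row by row and then recovers the diagonal and unit upper-triangular pieces of the symmetric LDU form from it.

```python
import math


def cholesky_upper(A):
    """Return upper-triangular C with A = C'C for a symmetric positive definite A."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Remove the contributions of the rows of C computed so far
            s = A[i][j] - sum(C[k][i] * C[k][j] for k in range(i))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                C[i][i] = math.sqrt(s)
            else:
                C[i][j] = s / C[i][i]
    return C


# Hypothetical symmetric positive definite matrix
A = [[4, 12, -16],
     [12, 37, -43],
     [-16, -43, 98]]

C = cholesky_upper(A)
n = len(A)

# Recover the pieces of A = U'DU: D holds the squared diagonal of C,
# and U is C with each row scaled to have a unit diagonal.
D_diag = [C[i][i] ** 2 for i in range(n)]
U = [[C[i][j] / C[i][i] for j in range(n)] for i in range(n)]

# Rebuild A from the factor as C'C
rebuilt = [[sum(C[k][i] * C[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]

print(C)
print(D_diag)
print(U)
```

The check for a nonpositive pivot is where a matrix that is not positive definite would be detected in this sketch.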

If $\mathbf{A}$ is symmetric but only nonnegative definite of rank $k$, rather than being positive definite of full rank, then it has an extended Cholesky decomposition as follows. Let $\mathbf{C}^{*}$ denote the lower-triangular matrix such that

$$
\mathbf{C}^{*} = \begin{bmatrix}
\mathbf{C}_{k \times k} & \mathbf{0} \\
\mathbf{0} & \mathbf{0}
\end{bmatrix}
$$

Then $\mathbf{A} = \mathbf{C}^{*}(\mathbf{C}^{*})'$.

Spectral Decomposition

Suppose that bold upper A is an left-parenthesis n times n right-parenthesis symmetric matrix. Then there exists an orthogonal matrix bold upper Q and a diagonal matrix bold upper D such that bold upper A equals bold upper Q bold upper D bold upper Q prime. Of particular importance is the case where the orthogonal matrix is also orthonormal—that is, its column vectors have unit norm. Denote this orthonormal matrix as bold upper P. Then the corresponding diagonal matrix—bold upper Lamda equals normal d normal i normal a normal g left-parenthesis lamda Subscript i Baseline comma ellipsis comma lamda Subscript n Baseline right-parenthesis, say—contains the eigenvalues of bold upper A. The spectral decomposition of bold upper A can be written as

bold upper A equals bold upper P bold upper Lamda bold upper P Superscript prime Baseline equals sigma-summation Underscript i equals 1 Overscript n Endscripts lamda Subscript i Baseline bold p Subscript i Baseline bold p prime Subscript i

where bold p Subscript i denotes the ith column vector of bold upper P. The right-side expression decomposes bold upper A into a sum of rank-1 matrices, and the weight of each contribution is equal to the eigenvalue associated with the ith eigenvector. The sum furthermore emphasizes that the rank of bold upper A is equal to the number of nonzero eigenvalues.

Harville (1997, p. 538) refers to the spectral decomposition of $\mathbf{A}$ as the decomposition that takes the previous sum one step further and accumulates contributions associated with the distinct eigenvalues. If $\lambda_1^*, \ldots, \lambda_k^*$ are the distinct eigenvalues and $\mathbf{E}_j = \sum \mathbf{p}_i \mathbf{p}_i'$, where the sum is taken over the set of columns for which $\lambda_i = \lambda_j^*$, then

$$\mathbf{A} = \sum_{j=1}^{k} \lambda_j^* \mathbf{E}_j$$
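
As a worked illustration (an example chosen here, not taken from Harville), consider the (3 x 3) matrix that is the sum of the identity matrix and the unit matrix. Its eigenvalues are 4, 1, and 1, so there are two distinct eigenvalues, 4 and 1. The eigenvector for the eigenvalue 4 is the normalized unit vector, and the second projector follows from the fact that the rank-1 terms of all three eigenvectors sum to the identity matrix.

```latex
\begin{align*}
\mathbf{p}_1 &= \tfrac{1}{\sqrt{3}}\,\mathbf{1}, \qquad
\mathbf{E}_1 = \mathbf{p}_1\mathbf{p}_1' = \tfrac{1}{3}\,\mathbf{J} \\
\mathbf{E}_2 &= \mathbf{p}_2\mathbf{p}_2' + \mathbf{p}_3\mathbf{p}_3'
             = \mathbf{I} - \tfrac{1}{3}\,\mathbf{J} \\
\mathbf{A} &= 4\,\mathbf{E}_1 + 1\,\mathbf{E}_2
            = \tfrac{4}{3}\,\mathbf{J} + \mathbf{I} - \tfrac{1}{3}\,\mathbf{J}
            = \mathbf{I} + \mathbf{J}
\end{align*}
```

The two eigenvectors that share the eigenvalue 1 are not unique, but their combined contribution is, which is what makes the grouped form of the decomposition unique.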

You can employ the spectral decomposition of a nonnegative definite symmetric matrix to form a "square-root" matrix of $\mathbf{A}$. Suppose that $\boldsymbol{\Lambda}^{1/2}$ is the diagonal matrix containing the square roots of the $\lambda_i$. Then $\mathbf{B} = \mathbf{P}\boldsymbol{\Lambda}^{1/2}\mathbf{P}'$ is a square-root matrix of $\mathbf{A}$ in the sense that $\mathbf{B}\mathbf{B} = \mathbf{A}$, because

$$\mathbf{B}\mathbf{B} = \mathbf{P}\boldsymbol{\Lambda}^{1/2}\mathbf{P}'\mathbf{P}\boldsymbol{\Lambda}^{1/2}\mathbf{P}' = \mathbf{P}\boldsymbol{\Lambda}^{1/2}\boldsymbol{\Lambda}^{1/2}\mathbf{P}' = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}'$$
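
A small worked case may make the construction concrete. The matrix below is an example chosen for illustration. It has eigenvalues 3 and 1, so it is positive definite and its square-root matrix can be written out explicitly.

```latex
\begin{align*}
\mathbf{A} &= \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
\lambda_1 = 3,\ \mathbf{p}_1 = \tfrac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad
\lambda_2 = 1,\ \mathbf{p}_2 = \tfrac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -1 \end{bmatrix} \\
\mathbf{B} &= \sqrt{3}\,\mathbf{p}_1\mathbf{p}_1' + \mathbf{p}_2\mathbf{p}_2'
            = \frac{1}{2}\begin{bmatrix} \sqrt{3}+1 & \sqrt{3}-1 \\ \sqrt{3}-1 & \sqrt{3}+1 \end{bmatrix} \\
\mathbf{B}\mathbf{B} &= \frac{1}{4}\begin{bmatrix}
  (\sqrt{3}+1)^2 + (\sqrt{3}-1)^2 & 2(\sqrt{3}+1)(\sqrt{3}-1) \\
  2(\sqrt{3}+1)(\sqrt{3}-1) & (\sqrt{3}+1)^2 + (\sqrt{3}-1)^2
\end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\end{align*}
```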

Generating the Moore-Penrose inverse of a matrix based on the spectral decomposition is also simple. Denote as $\boldsymbol{\Delta}$ the diagonal matrix with typical element

$$\delta_i = \begin{cases} 1/\lambda_i & \lambda_i \neq 0 \\ 0 & \lambda_i = 0 \end{cases}$$

Then the matrix $\mathbf{P}\boldsymbol{\Delta}\mathbf{P}' = \sum \delta_i \mathbf{p}_i \mathbf{p}_i'$ is the Moore-Penrose ($g_4$-generalized) inverse of $\mathbf{A}$.
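
The construction above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not SAS code: it handles only a 2 x 2 symmetric matrix (so that a closed-form eigensolver suffices), the helper names `eig_sym_2x2`, `matmul`, and `pinv_sym_2x2` are hypothetical, and the exact test for a zero eigenvalue is replaced by a comparison against an assumed tolerance `tol`.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs of the symmetric matrix [[a, b], [b, c]] (2 x 2 only).

    Assumption: closed-form plane rotation, not a general eigensolver.
    Returns (lams, vecs); vecs[i] is the unit-norm eigenvector for lams[i].
    """
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return [lam1, lam2], [[cs, sn], [-sn, cs]]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def pinv_sym_2x2(A, tol=1e-12):
    """Sum of delta_i * p_i p_i', with delta_i = 1/lam_i or 0 (tol assumed)."""
    lams, vecs = eig_sym_2x2(A[0][0], A[0][1], A[1][1])
    G = [[0.0, 0.0], [0.0, 0.0]]
    for lam, p in zip(lams, vecs):
        delta = 1.0 / lam if abs(lam) > tol else 0.0
        for r in range(2):
            for s in range(2):
                G[r][s] += delta * p[r] * p[s]
    return G

A = [[1.0, 1.0], [1.0, 1.0]]        # singular: eigenvalues 2 and 0
G = pinv_sym_2x2(A)                 # every entry is 1/4 up to rounding
AGA = matmul(matmul(A, G), A)       # should reproduce A
GAG = matmul(matmul(G, A), G)       # should reproduce G
```

The two products at the end check two of the four Penrose conditions. The remaining two (symmetry of the products of the matrix and its inverse) hold here because both matrices are symmetric and share the same eigenvectors.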

Singular-Value Decomposition

The singular-value decomposition is related to the spectral decomposition of a matrix, but it is more general. The singular-value decomposition can be applied to any matrix. Let $\mathbf{B}$ be an $(n \times p)$ matrix of rank $k$. Then there exist orthogonal matrices $\mathbf{P}$ and $\mathbf{Q}$ of order $(n \times n)$ and $(p \times p)$, respectively, and a diagonal matrix $\mathbf{D}$ such that

$$\mathbf{P}'\mathbf{B}\mathbf{Q} = \mathbf{D} = \begin{bmatrix} \mathbf{D}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}$$

where $\mathbf{D}_1$ is a diagonal matrix of order $k$. The diagonal elements of $\mathbf{D}_1$ are strictly positive. As with the spectral decomposition, this result can be written as a decomposition of $\mathbf{B}$ into a weighted sum of rank-1 matrices

$$\mathbf{B} = \mathbf{P}\mathbf{D}\mathbf{Q}' = \sum_{i=1}^{k} d_i \mathbf{p}_i \mathbf{q}_i'$$

The scalars $d_1, \ldots, d_k$ are called the singular values of the matrix $\mathbf{B}$. They are the positive square roots of the nonzero eigenvalues of the matrix $\mathbf{B}'\mathbf{B}$. If the singular-value decomposition is applied to a symmetric, nonnegative definite matrix $\mathbf{A}$, then the singular values $d_1, \ldots, d_k$ are the nonzero eigenvalues of $\mathbf{A}$ and the singular-value decomposition is the same as the spectral decomposition.
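
The link between the singular values and the eigenvalues of the cross-product matrix can be illustrated with a short Python sketch. It is not SAS code, and it rests on several assumptions: the matrix has exactly two columns, so that the cross-product matrix is 2 x 2 and a closed-form eigensolver suffices; only the triplets with a positive singular value are returned (a "thin" decomposition, not the full orthogonal matrices); and the helper names and the tolerance `tol` are hypothetical. Production routines generally avoid forming the cross-product matrix explicitly because doing so squares the condition number.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs of the symmetric matrix [[a, b], [b, c]] (2 x 2 only).

    Assumption: closed-form plane rotation, not a general eigensolver.
    Returns (lams, vecs); vecs[i] is the unit-norm eigenvector for lams[i].
    """
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return [lam1, lam2], [[cs, sn], [-sn, cs]]

def thin_svd_two_columns(B, tol=1e-12):
    """Triplets (d_i, p_i, q_i) with d_i > 0 for an (n x 2) matrix B.

    tol is applied to the eigenvalues of B'B (an assumed cutoff).
    """
    # Entries of the symmetric (2 x 2) cross-product matrix B'B.
    a = sum(row[0] * row[0] for row in B)
    b = sum(row[0] * row[1] for row in B)
    c = sum(row[1] * row[1] for row in B)
    lams, qs = eig_sym_2x2(a, b, c)
    triplets = []
    for lam, q in zip(lams, qs):
        if lam > tol:
            d = math.sqrt(lam)                  # singular value
            p = [(row[0] * q[0] + row[1] * q[1]) / d for row in B]  # B q / d
            triplets.append((d, p, q))
    return triplets

def rank1_sum(triplets, n):
    """Sum of d_i * p_i q_i' over the supplied triplets."""
    total = [[0.0, 0.0] for _ in range(n)]
    for d, p, q in triplets:
        for r in range(n):
            for s in range(2):
                total[r][s] += d * p[r] * q[s]
    return total

B = [[1.0, 2.0], [2.0, 1.0], [2.0, 2.0]]   # B'B = [[9, 8], [8, 9]]
triplets = thin_svd_two_columns(B)
singular_values = sorted(d for d, _, _ in triplets)   # 1 and sqrt(17)
B_rebuilt = rank1_sum(triplets, len(B))
```

Because the cross-product matrix in this example has eigenvalues 17 and 1, the singular values are the square root of 17 and 1, and the two rank-1 terms add back up to the original matrix.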

As with the spectral decomposition, you can use the results of the singular-value decomposition to generate the Moore-Penrose inverse of a matrix. If $\mathbf{B}$ is $(n \times p)$ with singular-value decomposition $\mathbf{P}\mathbf{D}\mathbf{Q}'$, and if $\boldsymbol{\Delta}$ is a $(p \times n)$ diagonal matrix with typical element

$$\delta_i = \begin{cases} 1/d_i & |d_i| \neq 0 \\ 0 & d_i = 0 \end{cases}$$

then $\mathbf{Q}\boldsymbol{\Delta}\mathbf{P}'$ is the $g_4$-generalized inverse of $\mathbf{B}$.
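
The following Python sketch illustrates this construction on a rank-deficient example. It is an illustration under stated assumptions, not SAS code: the matrix has exactly two columns, the right singular vectors are obtained from the 2 x 2 cross-product matrix with a closed-form eigensolver, the zero test is replaced by an assumed tolerance `tol` on the eigenvalues of the cross-product matrix, and all helper names are hypothetical.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs of the symmetric matrix [[a, b], [b, c]] (2 x 2 only).

    Assumption: closed-form plane rotation, not a general eigensolver.
    Returns (lams, vecs); vecs[i] is the unit-norm eigenvector for lams[i].
    """
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2.0 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2.0 * b * cs * sn + c * cs * cs
    return [lam1, lam2], [[cs, sn], [-sn, cs]]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def pinv_two_columns(B, tol=1e-12):
    """(2 x n) Moore-Penrose inverse of an (n x 2) matrix B.

    Accumulates (1/d_i) * q_i p_i' over the nonzero singular values only;
    terms with d_i = 0 contribute nothing, which mirrors delta_i = 0.
    """
    a = sum(row[0] * row[0] for row in B)
    b = sum(row[0] * row[1] for row in B)
    c = sum(row[1] * row[1] for row in B)
    lams, qs = eig_sym_2x2(a, b, c)      # eigenpairs of B'B
    n = len(B)
    G = [[0.0] * n for _ in range(2)]
    for lam, q in zip(lams, qs):
        if lam > tol:                    # d_i = sqrt(lam) is nonzero
            d = math.sqrt(lam)
            p = [(row[0] * q[0] + row[1] * q[1]) / d for row in B]
            for r in range(2):
                for s in range(n):
                    G[r][s] += q[r] * p[s] / d
    return G

B = [[1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]   # rank 1: one zero singular value
G = pinv_two_columns(B)                    # each row is (1/18, 2/18, 2/18)
BGB = matmul(matmul(B, G), B)              # should reproduce B
GBG = matmul(matmul(G, B), G)              # should reproduce G
```

The example matrix has rank 1, so one singular value is zero and is skipped. The result still satisfies the two Penrose conditions that the final products check.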

Last updated: December 09, 2022