The GENMOD Procedure

Residuals

The GENMOD procedure computes three kinds of residuals: raw, Pearson, and deviance. Residuals are available for all generalized linear models except multinomial models for ordinal response data. For models fit with generalized estimating equations (GEEs), only raw residuals and Pearson residuals are available.

The raw residual is defined as

$$r_i = y_i - \mu_i$$

where $y_i$ is the ith response and $\mu_i$ is the corresponding predicted mean. You can request raw residuals in an output data set with the keyword RESRAW in the OUTPUT statement.

The Pearson residual is the square root of the ith contribution to the Pearson chi-square:

$$r_{Pi} = (y_i - \mu_i)\sqrt{\frac{w_i}{V(\mu_i)}}$$

where $w_i$ is the prior weight and $V(\mu_i)$ is the variance function.

You can request Pearson residuals in an output data set with the keyword RESCHI in the OUTPUT statement.

Finally, the deviance residual is defined as the square root of the contribution of the ith observation to the deviance, with the sign of the raw residual:

$$r_{Di} = \sqrt{d_i}\,\operatorname{sign}(y_i - \mu_i)$$

You can request deviance residuals in an output data set with the keyword RESDEV in the OUTPUT statement. For more information about the deviance computations, see the section Goodness of Fit.
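The three definitions above can be checked numerically for a single observation. The following is a minimal sketch in Python, not GENMOD code. It assumes a Poisson model, so that $V(\mu) = \mu$ and the deviance contribution is $d_i = 2 w_i [\, y_i \log(y_i/\mu_i) - (y_i - \mu_i) ]$; the function name `poisson_residuals` and the example values are hypothetical.

```python
import math


def poisson_residuals(y, mu, w=1.0):
    """Return (raw, Pearson, deviance) residuals for one observation.

    Assumption: Poisson model, so V(mu) = mu and
    d = 2 * w * (y * log(y / mu) - (y - mu)), with y * log(y / mu) taken as 0 when y == 0.
    """
    raw = y - mu
    # Pearson residual: (y - mu) * sqrt(w / V(mu)) with V(mu) = mu
    pearson = raw * math.sqrt(w / mu)
    # Contribution of this observation to the deviance
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    d = max(2.0 * w * (log_term - raw), 0.0)  # clamp tiny negative rounding error
    # Deviance residual: sqrt(d) carrying the sign of the raw residual
    deviance = math.copysign(math.sqrt(d), raw)
    return raw, pearson, deviance


# Hypothetical observation: observed count 3, predicted mean 2
raw, pearson, deviance = poisson_residuals(3.0, 2.0)
print(raw, pearson, deviance)
```

For other distributions, only the variance function and the deviance contribution change; the sign convention and the square-root structure stay the same.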

The adjusted Pearson, deviance, and likelihood residuals are defined by Agresti (2002), Williams (1987), and Davison and Snell (1991). These residuals are useful for outlier detection and for assessing the influence of single observations on the fitted model.

For the generalized linear model, the variance of the ith individual observation is given by

$$v_i = \frac{\phi V(\mu_i)}{w_i}$$

where $\phi$ is the dispersion parameter, $w_i$ is a user-specified prior weight (if not specified, $w_i = 1$), $\mu_i$ is the mean, and $V(\mu_i)$ is the variance function. Let

$$w_{ei} = v_i^{-1}\,\bigl(g'(\mu_i)\bigr)^{-2}$$

for the ith observation, where $g'(\mu_i)$ is the derivative of the link function, evaluated at $\mu_i$. Let $\mathbf{W}_e$ be the diagonal matrix with $w_{ei}$ denoting the ith diagonal element. The weight matrix $\mathbf{W}_e$ is used in computing the expected information matrix.

Define $h_i$ as the ith diagonal element of the matrix

$$\mathbf{W}_e^{1/2}\,\mathbf{X}\,(\mathbf{X}'\mathbf{W}_e\mathbf{X})^{-1}\,\mathbf{X}'\,\mathbf{W}_e^{1/2}$$
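As an illustration of how the weights and the diagonal elements fit together, the following Python sketch computes them for a design matrix with an intercept and one covariate. It is not GENMOD code. It assumes a Poisson model with a log link, for which $V(\mu) = \mu$ and $g'(\mu) = 1/\mu$, so the weight reduces to $w_i \mu_i / \phi$; the function names and the data values are hypothetical.

```python
def poisson_log_weights(mu, w=None, phi=1.0):
    """Diagonal of W_e, assuming a Poisson model with log link.

    v_i = phi * mu_i / w_i and g'(mu_i) = 1 / mu_i,
    so w_ei = (1 / v_i) * (1 / g'(mu_i))**2 = w_i * mu_i / phi.
    """
    if w is None:
        w = [1.0] * len(mu)  # default prior weight of 1
    return [wi * mi / phi for wi, mi in zip(w, mu)]


def leverages(x, w_e):
    """Diagonal of W_e^(1/2) X (X' W_e X)^(-1) X' W_e^(1/2) for X = [1, x]."""
    # Entries of the symmetric 2x2 matrix X' W_e X = [[s0, s1], [s1, s2]]
    s0 = sum(w_e)
    s1 = sum(wi * xi for wi, xi in zip(w_e, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w_e, x))
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("X' W_e X is singular")
    h = []
    for wi, xi in zip(w_e, x):
        # Quadratic form (1, xi) (X' W_e X)^(-1) (1, xi)' using the 2x2 inverse
        quad = (s2 - 2.0 * s1 * xi + s0 * xi * xi) / det
        h.append(wi * quad)
    return h


# Hypothetical covariate values and predicted means
x = [0.0, 1.0, 2.0, 3.0]
mu = [1.2, 2.0, 3.1, 5.4]
h = leverages(x, poisson_log_weights(mu))
print(h, sum(h))
```

Because the matrix is a projection, the values of $h_i$ lie between 0 and 1 and sum to the number of columns of $\mathbf{X}$, which gives a quick sanity check on any implementation.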

The Pearson residuals, standardized to have unit asymptotic variance, are given by

$$r_{Pi} = \frac{y_i - \mu_i}{\sqrt{v_i\,(1 - h_i)}}$$

You can request standardized Pearson residuals in an output data set with the keyword STDRESCHI in the OUTPUT statement. The deviance residuals, standardized to have unit asymptotic variance, are given by

$$r_{Di} = \frac{\operatorname{sign}(y_i - \mu_i)\,\sqrt{d_i}}{\sqrt{\phi\,(1 - h_i)}}$$

where $d_i$ is the contribution to the total deviance from observation $i$, and $\operatorname{sign}(y_i - \mu_i)$ is 1 if $y_i - \mu_i$ is positive and $-1$ if $y_i - \mu_i$ is negative. You can request standardized deviance residuals in an output data set with the keyword STDRESDEV in the OUTPUT statement. The likelihood residuals are defined by

$$r_{Gi} = \operatorname{sign}(y_i - \mu_i)\,\sqrt{(1 - h_i)\,r_{Di}^2 + h_i\,r_{Pi}^2}$$

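You can request likelihood residuals in an output data set with the keyword RESLIK in the OUTPUT statement. In the likelihood residual, $r_{Di}$ and $r_{Pi}$ denote the standardized deviance and Pearson residuals defined immediately above, not their unstandardized counterparts.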
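The standardized and likelihood residuals can be sketched in a few lines of Python once $\mu_i$, $d_i$, and $h_i$ are known. This is an illustrative sketch, not GENMOD code: the function name `adjusted_residuals`, its argument names, and the example values are hypothetical, and the variance function is passed in by the caller rather than looked up from a distribution.

```python
import math


def adjusted_residuals(y, mu, d, h, var_fn, w=1.0, phi=1.0):
    """Return (standardized Pearson, standardized deviance, likelihood) residuals.

    y, mu  : response and predicted mean for one observation
    d      : this observation's contribution to the deviance
    h      : its diagonal element h_i
    var_fn : variance function V(mu), supplied by the caller
    """
    v = phi * var_fn(mu) / w  # variance of the observation
    # copysign returns +1 when y == mu; d is 0 there, so the choice has no effect
    sign = math.copysign(1.0, y - mu)
    std_pearson = (y - mu) / math.sqrt(v * (1.0 - h))
    std_deviance = sign * math.sqrt(d) / math.sqrt(phi * (1.0 - h))
    # Weighted combination of the two squared standardized residuals
    likelihood = sign * math.sqrt(
        (1.0 - h) * std_deviance ** 2 + h * std_pearson ** 2
    )
    return std_pearson, std_deviance, likelihood


# Hypothetical Poisson observation: y = 3, mu = 2, h = 0.25, V(mu) = mu
d = 2.0 * (3.0 * math.log(3.0 / 2.0) - (3.0 - 2.0))
print(adjusted_residuals(3.0, 2.0, d, 0.25, var_fn=lambda m: m))
```

Since the squared likelihood residual is a weighted average of the two squared standardized residuals with weights $1 - h_i$ and $h_i$, its magnitude always falls between theirs, and it approaches the standardized deviance residual as $h_i$ approaches 0.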

Last updated: December 09, 2022