The PHREG Procedure

Residuals

This section describes the computation of residuals (RESMART=, RESDEV=, RESSCH=, and RESSCO=) in the OUTPUT statement.

First, consider TIES=BRESLOW. Let

$$
\begin{aligned}
S^{(0)}(\boldsymbol{\beta}, t) &= \sum_i Y_i(t)\, e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)} \\
S^{(1)}(\boldsymbol{\beta}, t) &= \sum_i Y_i(t)\, e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)}\, \mathbf{Z}_i(t) \\
\bar{\mathbf{Z}}(\boldsymbol{\beta}, t) &= \frac{S^{(1)}(\boldsymbol{\beta}, t)}{S^{(0)}(\boldsymbol{\beta}, t)} \\
d\Lambda_0(\boldsymbol{\beta}, t) &= \sum_i \frac{dN_i(t)}{S^{(0)}(\boldsymbol{\beta}, t)} \\
dM_i(\boldsymbol{\beta}, t) &= dN_i(t) - Y_i(t)\, e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)}\, d\Lambda_0(\boldsymbol{\beta}, t)
\end{aligned}
$$

The martingale residual at t is defined as

$$
\hat{M}_i(t) = \int_0^t dM_i(\hat{\boldsymbol{\beta}}, s) = N_i(t) - \int_0^t Y_i(s)\, e^{\hat{\boldsymbol{\beta}}'\mathbf{Z}_i(s)}\, d\Lambda_0(\hat{\boldsymbol{\beta}}, s)
$$

Here $\hat{M}_i(t)$ estimates the difference over $(0, t]$ between the observed number of events for the ith subject and a conditional expected number of events. The quantity $\hat{M}_i \equiv \hat{M}_i(\infty)$ is referred to as the martingale residual for the ith subject. When the counting process MODEL specification is used, the RESMART= variable contains the component $\hat{M}_i(t_2) - \hat{M}_i(t_1)$ for an observation that covers the interval $(t_1, t_2]$, instead of the martingale residual at $t_2$. The martingale residual for a subject can be obtained by summing these component residuals within the subject. For the Cox model with no time-dependent explanatory variables, the martingale residual for the ith subject with observation time $t_i$ and event status $\Delta_i$ is

$$
\hat{M}_i = \Delta_i - e^{\hat{\boldsymbol{\beta}}'\mathbf{Z}_i} \int_0^{t_i} d\Lambda_0(\hat{\boldsymbol{\beta}}, s)
$$
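The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PROC PHREG implementation: the function name `breslow_martingale_residuals`, its arguments, and the toy data are hypothetical, the linear predictors are supplied directly rather than estimated, and no covariate is time-dependent.

```python
import math

def breslow_martingale_residuals(times, status, lin_pred):
    """Martingale residuals for a Cox model without time-dependent covariates.

    times    : observation times t_i
    status   : event indicators Delta_i (1 = event, 0 = censored)
    lin_pred : linear predictors beta' Z_i evaluated at the chosen beta
    """
    risk = [math.exp(lp) for lp in lin_pred]
    event_times = sorted({t for t, d in zip(times, status) if d == 1})

    # Breslow increments: dLambda0(s) = d(s) / S0(s) at each distinct event time s
    increments = []
    for s in event_times:
        d_s = sum(1 for t, d in zip(times, status) if d == 1 and t == s)
        s0 = sum(r for t, r in zip(times, risk) if t >= s)  # Y_i(s) = 1 if t_i >= s
        increments.append((s, d_s / s0))

    residuals = []
    for t_i, delta_i, r_i in zip(times, status, risk):
        # Cumulative baseline hazard accumulated up to and including t_i
        cum_hazard = sum(inc for s, inc in increments if s <= t_i)
        residuals.append(delta_i - r_i * cum_hazard)
    return residuals

# Hypothetical toy data: five subjects, three events
times = [2.0, 3.0, 3.0, 5.0, 7.0]
status = [1, 1, 0, 1, 0]
lin_pred = [0.4, -0.2, 0.1, 0.0, -0.3]
resid = breslow_martingale_residuals(times, status, lin_pred)
```

With this baseline hazard estimate the residuals sum to zero for any choice of linear predictors, which is a convenient sanity check on an implementation.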

The deviance residuals $D_i$ are a transform of the martingale residuals:

$$
D_i = \operatorname{sign}(\hat{M}_i) \sqrt{2\left[-\hat{M}_i - N_i(\infty) \log\left(\frac{N_i(\infty) - \hat{M}_i}{N_i(\infty)}\right)\right]}
$$

The square root shrinks large negative martingale residuals, while the logarithmic transformation expands martingale residuals that are close to unity. As such, the deviance residuals are more symmetrically distributed around zero than the martingale residuals. For the Cox model, the deviance residual reduces to the form

$$
D_i = \operatorname{sign}(\hat{M}_i) \sqrt{2\left[-\hat{M}_i - \Delta_i \log(\Delta_i - \hat{M}_i)\right]}
$$
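As a minimal sketch of the Cox-model form above, the following Python function maps a martingale residual and an event status to a deviance residual. The name `deviance_residual` is hypothetical, and the handling of the case $\Delta_i = 0$ (where the logarithmic term is multiplied by zero) is an implementation choice of this sketch, not something the formula prescribes.

```python
import math

def deviance_residual(m_hat, delta):
    """Deviance residual from a martingale residual m_hat and event status delta (0 or 1)."""
    # For delta = 0 the log term is multiplied by zero, so it is skipped;
    # this also avoids evaluating log(0) when m_hat is exactly 0.
    log_term = math.log(1.0 - m_hat) if delta == 1 else 0.0
    inner = 2.0 * (-m_hat - log_term)
    # Guard against tiny negative values caused by rounding, then attach the sign of m_hat.
    return math.copysign(math.sqrt(max(inner, 0.0)), m_hat)

# A censored subject with a large negative martingale residual is pulled toward zero,
# while an event with a residual close to 1 is pushed away from it.
d_censored = deviance_residual(-3.0, 0)
d_event = deviance_residual(0.95, 1)
```

In this example the censored value shrinks in magnitude (from 3 to about 2.45) and the event value grows (from 0.95 to about 2.02), which illustrates the symmetrizing effect described above.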

When the counting process MODEL specification is used, values of the RESDEV= variable are set to missing because the deviance residuals can be calculated only on a per-subject basis.

The Schoenfeld (1982) residual vector is calculated on a per-event-time basis. At the jth event time $t_{ij}$ of the ith subject, the Schoenfeld residual

$$
\hat{\mathbf{U}}_i(t_{ij}) = \mathbf{Z}_i(t_{ij}) - \bar{\mathbf{Z}}(\hat{\boldsymbol{\beta}}, t_{ij})
$$

is the difference between the ith subject's covariate vector at $t_{ij}$ and the risk-weighted average $\bar{\mathbf{Z}}$ of the covariate vectors over the risk set at $t_{ij}$. Under the proportional hazards assumption, the Schoenfeld residuals have the sample path of a random walk; therefore, they are useful in assessing time trend or lack of proportionality. Harrell (1986) proposed a z-transform of the Pearson correlation between these residuals and the rank order of the failure time as a test statistic for nonproportional hazards.
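The definition can be illustrated with a short Python sketch. It assumes a single covariate that does not change over time and takes the coefficient as given; the function name `schoenfeld_residuals` and the toy data are hypothetical and are not part of PROC PHREG.

```python
import math

def schoenfeld_residuals(times, status, z, beta):
    """Schoenfeld residuals for one time-fixed covariate (Breslow-type risk sets).

    Returns a list of (event_time, residual) pairs, one per subject with an event.
    """
    risk = [math.exp(beta * zi) for zi in z]
    out = []
    for t_i, delta_i, z_i in zip(times, status, z):
        if delta_i != 1:
            continue  # residuals are defined only at event times
        at_risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(risk[j] for j in at_risk)
        s1 = sum(risk[j] * z[j] for j in at_risk)
        out.append((t_i, z_i - s1 / s0))  # Z_i minus the weighted risk-set mean
    return out

# Hypothetical toy data; beta = 0 makes the weighted mean a plain risk-set mean
times = [2.0, 3.0, 3.0, 5.0, 7.0]
status = [1, 1, 0, 1, 0]
z = [1.0, 0.0, 2.0, 1.0, 3.0]
res = schoenfeld_residuals(times, status, z, beta=0.0)
```

Plotting the residuals against event time, or correlating them with the rank of the event time, is the kind of diagnostic the paragraph above refers to.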

The score process for the ith subject at time t is

$$
\mathbf{L}_i(\boldsymbol{\beta}, t) = \int_0^t \left[\mathbf{Z}_i(s) - \bar{\mathbf{Z}}(\boldsymbol{\beta}, s)\right] dM_i(\boldsymbol{\beta}, s)
$$

The vector $\hat{\mathbf{L}}_i \equiv \mathbf{L}_i(\hat{\boldsymbol{\beta}}, \infty)$ is the score residual for the ith subject. When the counting process MODEL specification is used, the RESSCO= variables contain the components of $\mathbf{L}_i(\hat{\boldsymbol{\beta}}, t_2) - \mathbf{L}_i(\hat{\boldsymbol{\beta}}, t_1)$ for an observation that covers the interval $(t_1, t_2]$, instead of the score process at $t_2$. The score residual for a subject can be obtained by summing these component residuals within the subject.

The score residuals are a decomposition of the first partial derivative of the log likelihood. They are useful in assessing the influence of each subject on individual parameter estimates. Therneau, Grambsch, and Fleming (1990) have considered a Kolmogorov-type test based on the cumulative sum of the residuals for detecting nonproportional hazards. These residuals also play an important role in the computation of the robust sandwich variance estimators of Lin and Wei (1989) and Wei, Lin, and Weissfeld (1989).
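To make the decomposition concrete, the sketch below evaluates the score process at infinity for a single time-fixed covariate, splitting each residual into an observed part (the Schoenfeld-type term at the subject's own event time) and an expected part accumulated over the event times at which the subject was at risk. The function name `score_residuals` and the data are hypothetical, and the coefficient is supplied rather than estimated.

```python
import math

def score_residuals(times, status, z, beta):
    """Score residuals for one time-fixed covariate with Breslow-type risk sets."""
    risk = [math.exp(beta * zi) for zi in z]

    # For each distinct event time s, store (s, Zbar(s), dLambda0(s))
    steps = []
    for s in sorted({t for t, d in zip(times, status) if d == 1}):
        at_risk = [j for j, t_j in enumerate(times) if t_j >= s]
        s0 = sum(risk[j] for j in at_risk)
        s1 = sum(risk[j] * z[j] for j in at_risk)
        d_s = sum(1 for t, d in zip(times, status) if d == 1 and t == s)
        steps.append((s, s1 / s0, d_s / s0))

    residuals = []
    for t_i, delta_i, z_i, r_i in zip(times, status, z, risk):
        # Integral of (Z_i - Zbar) dN_i: nonzero only at the subject's event time
        observed = 0.0
        if delta_i == 1:
            observed = sum(z_i - zbar for s, zbar, _ in steps if s == t_i)
        # Integral of (Z_i - Zbar) Y_i exp(beta Z_i) dLambda0 over s <= t_i
        expected = sum((z_i - zbar) * r_i * dlam for s, zbar, dlam in steps if s <= t_i)
        residuals.append(observed - expected)
    return residuals

# Hypothetical toy data, evaluated at beta = 0 for easy hand checking
times = [2.0, 3.0, 3.0, 5.0, 7.0]
status = [1, 1, 0, 1, 0]
z = [1.0, 0.0, 2.0, 1.0, 3.0]
lr = score_residuals(times, status, z, beta=0.0)
```

Summing the residuals over subjects recovers the score (the first derivative of the log partial likelihood) at the chosen coefficient, so at the maximum partial likelihood estimate the sum would be zero.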

For TIES=EFRON, the preceding computation is modified to comply with the Efron partial likelihood. For a given time $t$, let $\Delta_i(t) = 1$ if $t$ is an event time of the ith subject and 0 otherwise. Let $d(t) = \sum_i \Delta_i(t)$, which is the number of subjects that have an event at $t$. For $1 \le k \le d(t)$, let

$$
\begin{aligned}
S^{(0)}(\boldsymbol{\beta}, k, t) &= \sum_i Y_i(t) \left\{1 - \frac{k-1}{d(t)} \Delta_i(t)\right\} e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)} \\
S^{(1)}(\boldsymbol{\beta}, k, t) &= \sum_i Y_i(t) \left\{1 - \frac{k-1}{d(t)} \Delta_i(t)\right\} e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)}\, \mathbf{Z}_i(t) \\
\bar{\mathbf{Z}}(\boldsymbol{\beta}, k, t) &= \frac{S^{(1)}(\boldsymbol{\beta}, k, t)}{S^{(0)}(\boldsymbol{\beta}, k, t)} \\
d\Lambda_0(\boldsymbol{\beta}, k, t) &= \sum_i \frac{dN_i(t)}{S^{(0)}(\boldsymbol{\beta}, k, t)} \\
dM_i(\boldsymbol{\beta}, k, t) &= dN_i(t) - Y_i(t) \left(1 - \Delta_i(t) \frac{k-1}{d(t)}\right) e^{\boldsymbol{\beta}'\mathbf{Z}_i(t)}\, d\Lambda_0(\boldsymbol{\beta}, k, t)
\end{aligned}
$$

The martingale residual at t for the ith subject is defined as

$$
\hat{M}_i(t) = \int_0^t \frac{1}{d(s)} \sum_{k=1}^{d(s)} dM_i(\hat{\boldsymbol{\beta}}, k, s) = N_i(t) - \int_0^t \frac{1}{d(s)} \sum_{k=1}^{d(s)} Y_i(s) \left(1 - \Delta_i(s) \frac{k-1}{d(s)}\right) e^{\hat{\boldsymbol{\beta}}'\mathbf{Z}_i(s)}\, d\Lambda_0(\hat{\boldsymbol{\beta}}, k, s)
$$
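The Efron-adjusted martingale residual can be sketched as follows for a model without time-dependent covariates. The function name `efron_martingale_residuals` and the toy data are hypothetical, and the linear predictors are taken as given. The sketch uses the fact that, at an event time $s$ with $d(s)$ tied events, $d\Lambda_0(\hat{\boldsymbol{\beta}}, k, s) = d(s)/S^{(0)}(\hat{\boldsymbol{\beta}}, k, s)$, so the factor $1/d(s)$ cancels.

```python
import math

def efron_martingale_residuals(times, status, lin_pred):
    """Martingale residuals under the Efron handling of ties (no time-dependent covariates)."""
    risk = [math.exp(lp) for lp in lin_pred]
    n = len(times)
    compensator = [0.0] * n  # accumulated expected number of events per subject

    for s in sorted({t for t, d in zip(times, status) if d == 1}):
        at_risk = [i for i in range(n) if times[i] >= s]
        tied = {i for i in at_risk if times[i] == s and status[i] == 1}
        d_s = len(tied)
        for k in range(1, d_s + 1):
            frac = (k - 1) / d_s
            # Subjects with an event at s are down-weighted by 1 - (k - 1) / d(s)
            weight = {i: (1.0 - frac) if i in tied else 1.0 for i in at_risk}
            s0_k = sum(weight[i] * risk[i] for i in at_risk)
            for i in at_risk:
                # (1 / d) * weight * exp(beta'Z_i) * (d / S0(k)) simplifies to this term
                compensator[i] += weight[i] * risk[i] / s0_k

    return [status[i] - compensator[i] for i in range(n)]

# Hypothetical toy data with two tied events at time 3
times = [2.0, 3.0, 3.0, 3.0, 5.0, 7.0]
status = [1, 1, 1, 0, 1, 0]
lin_pred = [0.4, -0.2, 0.1, 0.0, 0.3, -0.3]
resid_efron = efron_martingale_residuals(times, status, lin_pred)
```

When no event times are tied, every $d(s)$ equals 1 and the computation reduces to the TIES=BRESLOW formulas. As in that case, the residuals sum to zero.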

Deviance residuals are computed by using the same transform on the corresponding martingale residuals as in TIES=BRESLOW.

The Schoenfeld residual vector for the ith subject at event time $t_{ij}$ is

$$
\hat{\mathbf{U}}_i(t_{ij}) = \mathbf{Z}_i(t_{ij}) - \frac{1}{d(t_{ij})} \sum_{k=1}^{d(t_{ij})} \bar{\mathbf{Z}}(\hat{\boldsymbol{\beta}}, k, t_{ij})
$$

The score process for the ith subject at time t is given by

$$
\mathbf{L}_i(\boldsymbol{\beta}, t) = \int_0^t \frac{1}{d(s)} \sum_{k=1}^{d(s)} \left(\mathbf{Z}_i(s) - \bar{\mathbf{Z}}(\boldsymbol{\beta}, k, s)\right) dM_i(\boldsymbol{\beta}, k, s)
$$

For TIES=DISCRETE or TIES=EXACT, it is difficult to derive modifications that are consistent with the corresponding partial likelihood. Residuals for these TIES= methods are therefore computed by using the same formulas as in TIES=BRESLOW.

Last updated: December 09, 2022