The LIFETEST Procedure

Rank Tests for the Association of Survival Time with Covariates

The rank tests for the association of covariates (Kalbfleisch and Prentice 1980, Chapter 6) are more general cases of the rank tests for homogeneity. In this section, the index $\alpha$ is used to label all observations, $\alpha = 1, 2, \ldots, n$, and the indices $i, j$ range only over the observations that correspond to events, $i, j = 1, 2, \ldots, k$. The ordered event times are denoted as $t_{(i)}$, the corresponding vectors of covariates are denoted as $\mathbf{z}_{(i)}$, and the ordered times, both censored and event times, are denoted as $t_\alpha$.

The rank test statistics have the form

$$\mathbf{v} = \sum_{\alpha=1}^{n} c_{\alpha,\delta_\alpha} \mathbf{z}_\alpha$$

where $n$ is the total number of observations, $c_{\alpha,\delta_\alpha}$ are rank scores, which can be either log-rank or Wilcoxon rank scores, $\delta_\alpha$ is 1 if the observation is an event and 0 if the observation is censored, and $\mathbf{z}_\alpha$ is the vector of covariates in the TEST statement for the $\alpha$th observation. Notice that the scores, $c_{\alpha,\delta_\alpha}$, depend on the censoring pattern and that the terms are summed up over all observations.

The log-rank scores are

$$c_{\alpha,\delta_\alpha} = \sum_{(j:\, t_{(j)} \le t_\alpha)} \left( \frac{1}{n_j} - \delta_\alpha \right)$$

and the Wilcoxon scores are

$$c_{\alpha,\delta_\alpha} = 1 - (1 + \delta_\alpha) \prod_{(j:\, t_{(j)} \le t_\alpha)} \frac{n_j}{n_j + 1}$$

where $n_j$ is the number at risk just prior to $t_{(j)}$.
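The score definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure's implementation: the function names `rank_scores` and `rank_statistic` and the small data set are hypothetical, the data are assumed to contain no tied times, and the log-rank score is read in the usual Kalbfleisch and Prentice form, with the event indicator subtracted once from the sum of reciprocal risk-set sizes.

```python
from math import prod


def rank_scores(times, events, kind="logrank"):
    """Rank scores for untied data (hypothetical helper).

    times  : observed times, event or censoring, one per observation
    events : 1 if the observation is an event, 0 if it is censored
    kind   : "logrank" or "wilcoxon"
    """
    scores = []
    for t_a, d_a in zip(times, events):
        # Risk-set sizes n_j for every event time t_(j) <= t_alpha;
        # n_j counts the observations with time >= t_(j).
        at_risk = [
            sum(1 for t in times if t >= t_j)
            for t_j, d_j in zip(times, events)
            if d_j == 1 and t_j <= t_a
        ]
        if kind == "logrank":
            # Assumption: delta_alpha is subtracted once, outside the sum.
            c = sum(1.0 / n_j for n_j in at_risk) - d_a
        elif kind == "wilcoxon":
            c = 1.0 - (1 + d_a) * prod(n_j / (n_j + 1) for n_j in at_risk)
        else:
            raise ValueError("kind must be 'logrank' or 'wilcoxon'")
        scores.append(c)
    return scores


def rank_statistic(scores, z):
    """Sum of score-weighted covariate vectors over all observations."""
    p = len(z[0])
    return [sum(c * row[col] for c, row in zip(scores, z)) for col in range(p)]


# Hypothetical data: five subjects, one covariate, no ties.
times = [2, 3, 5, 7, 11]
events = [1, 0, 1, 1, 0]
z = [[1.0], [0.0], [1.0], [0.0], [1.0]]

logrank = rank_scores(times, events, "logrank")
wilcoxon = rank_scores(times, events, "wilcoxon")
v = rank_statistic(logrank, z)
```

Under this reading, both sets of scores sum to zero over all observations, which is a convenient check on any implementation.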

The estimate used for the covariance matrix of the log-rank statistics is

$$\mathbf{V} = \sum_{i=1}^{k} \frac{\mathbf{V}_i}{n_i}$$

where $\mathbf{V}_i$ is the corrected sum of squares and crossproducts matrix for the risk set at time $t_{(i)}$; that is,

$$\mathbf{V}_i = \sum_{(\alpha:\, t_\alpha \ge t_{(i)})} (\mathbf{z}_\alpha - \bar{\mathbf{z}}_i)' (\mathbf{z}_\alpha - \bar{\mathbf{z}}_i)$$

where

$$\bar{\mathbf{z}}_i = \sum_{(\alpha:\, t_\alpha \ge t_{(i)})} \frac{\mathbf{z}_\alpha}{n_i}$$
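As a rough illustration of the log-rank covariance estimate, the following Python sketch accumulates, for each event time, the corrected sum of squares and crossproducts of the covariates in the risk set, divided by the risk-set size. The function name `logrank_covariance` and the data are hypothetical, covariate vectors are stored as rows, and untied event times are assumed.

```python
def logrank_covariance(times, events, z):
    """Covariance estimate for the log-rank statistic (hypothetical helper).

    times  : observed times, event or censoring
    events : 1 for an event, 0 for a censored observation
    z      : list of covariate rows, one per observation
    """
    p = len(z[0])
    V = [[0.0] * p for _ in range(p)]
    for t_i, d_i in zip(times, events):
        if d_i != 1:
            continue  # only event times contribute a term
        # Risk set at t_(i): every observation with time >= t_(i)
        risk = [row for t_a, row in zip(times, z) if t_a >= t_i]
        n_i = len(risk)
        zbar = [sum(row[c] for row in risk) / n_i for c in range(p)]
        for row in risk:
            dev = [row[c] - zbar[c] for c in range(p)]
            for r in range(p):
                for c in range(p):
                    # corrected crossproduct, scaled by the risk-set size
                    V[r][c] += dev[r] * dev[c] / n_i
    return V


# Hypothetical data: three subjects, two covariates, no ties.
times = [1, 2, 3]
events = [1, 0, 1]
z = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]]

V = logrank_covariance(times, events, z)
```

The last event time contributes nothing here because its risk set holds a single observation, so every deviation from the risk-set mean is zero.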

The estimate used for the covariance matrix of the Wilcoxon statistics is

$$\mathbf{V} = \sum_{i=1}^{k} \left[ a_i (1 - a_i^{*}) \left( 2 \mathbf{z}_{(i)} \mathbf{z}'_{(i)} + \mathbf{S}_i \right) - (a_i^{*} - a_i) \left( a_i \mathbf{x}_i \mathbf{x}'_i + \sum_{j=i+1}^{k} a_j \left( \mathbf{x}_i \mathbf{x}'_j + \mathbf{x}_j \mathbf{x}'_i \right) \right) \right]$$

where

$$\begin{aligned}
a_i &= \prod_{j=1}^{i} \frac{n_j}{n_j + 1} \\
a_i^{*} &= \prod_{j=1}^{i} \frac{n_j + 1}{n_j + 2} \\
\mathbf{S}_i &= \sum_{(\alpha:\, t_{(i+1)} > t_\alpha > t_{(i)})} \mathbf{z}_\alpha \mathbf{z}'_\alpha \\
\mathbf{x}_i &= 2 \mathbf{z}_{(i)} + \sum_{(\alpha:\, t_{(i+1)} > t_\alpha > t_{(i)})} \mathbf{z}_\alpha
\end{aligned}$$

In the case of tied failure times, the statistics $\mathbf{v}$ are averaged over the possible orderings of the tied failure times. The covariance matrices are also averaged over the tied failure times. Averaging the covariance matrices over the tied orderings produces functions with appropriate symmetries for the tied observations; however, the actual variances of the $\mathbf{v}$ statistics would be smaller than the preceding estimates. Unless the proportion of ties is large, it is unlikely that this will be a problem.

The univariate tests for each covariate are formed from each component of $\mathbf{v}$ and the corresponding diagonal element of $\mathbf{V}$ as $v_i^2 / V_{ii}$. These statistics are treated as coming from a chi-square distribution for calculation of probability values.
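A minimal sketch of the univariate computation, assuming each statistic is referred to a chi-square distribution with one degree of freedom (the text does not state the degrees of freedom explicitly). The function name `univariate_tests` and the input values are hypothetical. The upper-tail probability uses the identity that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt


def univariate_tests(v, V):
    """Chi-square statistic and tail probability for each covariate.

    v : rank statistic, one component per covariate
    V : covariance estimate, a square list of lists
    """
    results = []
    for i, v_i in enumerate(v):
        chi2 = v_i ** 2 / V[i][i]
        # Upper tail of chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2))
        p_value = erfc(sqrt(chi2 / 2.0))
        results.append((chi2, p_value))
    return results


# Hypothetical inputs: one covariate whose statistic sits at the
# two-sided 5% point of the standard normal distribution.
v = [1.959963984540054]
V = [[1.0]]

tests = univariate_tests(v, V)
```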

The statistic $\mathbf{v}' \mathbf{V}^{-} \mathbf{v}$ is computed by sweeping each pivot of the $\mathbf{V}$ matrix in the order of greatest increase to the statistic. The corresponding sequence of partial statistics is tabulated. Sequential increments for including a given covariate and the corresponding probabilities are also included in the same table. These probabilities are calculated as the tail probabilities of a chi-square distribution with one degree of freedom. Because of the selection process, these probabilities should not be interpreted as p-values.
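The greedy ordering described above can be sketched as follows. This is an assumption-laden illustration, not the procedure's code: the function name `forward_sweep` and the inputs are hypothetical, and instead of a full sweep operator it applies the equivalent Schur-complement update to the rows that have not yet been pivoted. At each step it picks the covariate whose inclusion adds the most to the quadratic form, then adjusts the remaining components for the covariate just entered.

```python
from math import erfc, sqrt


def forward_sweep(v, V, tol=1e-12):
    """Order covariates by greatest increase to the joint statistic.

    Returns a list of (index, increment, cumulative statistic,
    one-degree-of-freedom tail probability of the increment).
    """
    p = len(v)
    w = list(v)                    # working copy, adjusted at each step
    W = [list(row) for row in V]   # working copy of the covariance
    remaining = set(range(p))
    steps = []
    total = 0.0
    while remaining:
        # Increment to the statistic if covariate k entered next;
        # near-zero pivots are skipped, as with a generalized inverse.
        gains = {k: w[k] ** 2 / W[k][k] for k in remaining if W[k][k] > tol}
        if not gains:
            break
        k = max(gains, key=gains.get)
        total += gains[k]
        steps.append((k, gains[k], total, erfc(sqrt(gains[k] / 2.0))))
        remaining.remove(k)
        pivot, w_k = W[k][k], w[k]
        col = [W[i][k] for i in range(p)]  # pivot column (V is symmetric)
        for i in remaining:
            w[i] -= col[i] * w_k / pivot
            for j in remaining:
                W[i][j] -= col[i] * col[j] / pivot
    return steps


# Hypothetical inputs: two correlated covariates.
v = [1.0, 2.0]
V = [[1.0, 0.5], [0.5, 2.0]]

steps = forward_sweep(v, V)
```

When every pivot is usable, the final cumulative value equals the full quadratic form computed with the ordinary inverse, so the ordering changes only the sequence of partial statistics and not the overall test statistic.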

If desired for data screening purposes, the output data set requested by the OUTTEST= option can be treated as a sum of squares and crossproducts matrix and processed by the REG procedure by using the option METHOD=RSQUARE. The sets of variables of a given size that give the largest test statistics can then be found. The example "Product-Limit Estimates and Tests of Association" illustrates this process.

Last updated: March 08, 2022