The PHREG Procedure

Specifics for Bayesian Analysis

To request a Bayesian analysis, you specify the BAYES statement in addition to the PROC PHREG statement and the MODEL statement. You include a CLASS statement if you have effects that involve categorical variables. The FREQ or WEIGHT statement can be included if you have a frequency or weight variable, respectively, in the input data. You can use the STRATA statement to carry out a stratified analysis for the Cox model, but it is not allowed in the proportional hazards spline model or the piecewise constant baseline hazard model. Programming statements can be used to create time-dependent covariates for the Cox model, but they are not allowed in the proportional hazards spline model or the piecewise constant baseline hazard model. You can use the counting process style of input or the ENTRY= option in the MODEL statement to specify left truncation of failure times. The HAZARDRATIO statement enables you to request a hazard ratio analysis based on the posterior samples. The ASSESS, CONTRAST, ID, OUTPUT, and TEST statements, if specified, are ignored. Also ignored are the COVM and COVS options in the PROC PHREG statement and the following options in the MODEL statement: BEST=, CORRB, COVB, DETAILS, HIERARCHY=, INCLUDE=, MAXSTEP=, NOFIT, PLCONV=, SELECTION=, SEQUENTIAL, SLENTRY=, and SLSTAY=.

Proportional Hazards Spline Model

The proportional hazards spline model (Royston and Parmar 2002) with fixed covariate vector $\mathbf{x}$ has a cumulative hazard function at time $t$,

$$H(t; \mathbf{x} \mid \boldsymbol{\gamma}) = \mathrm{e}^{s(\log(t),\, \boldsymbol{\gamma}) + \boldsymbol{\beta}'\mathbf{x}}$$

where $\boldsymbol{\beta}$ is the vector of regression coefficients and $s(\log(t), \boldsymbol{\gamma})$ is a cubic spline function as described for the SPLINE option in the BAYES statement. The number of knots is equal to one plus the degrees of freedom (which you specify in the DF= option). The knots are determined by the sequence of the distinct event times of the data. PROC PHREG places the terminal knots at the minimum and maximum of the sequence and selects the $m$ internal knots as the $k$th percentiles of the sequence, where $k = 100j/(m+1)$, $j = 1, \ldots, m$. For example, if you specify DF=4, the three internal knots are the 25th, 50th, and 75th percentiles of the distinct event times.
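The knot placement rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `spline_knots` is hypothetical, and NumPy's default (linearly interpolated) percentile rule stands in for whatever percentile definition PROC PHREG actually uses, which this section does not specify.

```python
import numpy as np

def spline_knots(event_times, df):
    """Return (lower terminal knot, internal knots, upper terminal knot).

    Assumption: percentiles follow NumPy's default linear interpolation;
    the procedure's own percentile definition may differ.
    """
    distinct = np.unique(np.asarray(event_times, dtype=float))  # sorted distinct event times
    m = df - 1                                    # knots = df + 1, two of which are terminal
    pct = [100.0 * j / (m + 1) for j in range(1, m + 1)]
    internal = np.percentile(distinct, pct)       # empty array when m == 0
    return distinct[0], internal, distinct[-1]

# DF=4 gives three internal knots at the 25th, 50th, and 75th percentiles.
lo, internal, hi = spline_knots([2, 2, 3, 5, 5, 8, 13, 21, 34], df=4)
```

Tied event times are collapsed before the percentiles are taken, which matches the statement that the knots are based on the distinct event times.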

Let $\{(t_i, \mathbf{x}_i, \delta_i),\ i = 1, \ldots, n\}$ be the observed data. The likelihood function of the proportional hazards spline model is given by

$$L_{RP}(\boldsymbol{\gamma}, \boldsymbol{\beta}) = \prod_i \left\{ \frac{1}{t_i} \frac{d s(y_i, \boldsymbol{\gamma})}{d y_i} \right\}^{\delta_i} \mathrm{e}^{\delta_i \eta_i - \mathrm{e}^{\eta_i}}$$

where $y_i = \log(t_i)$ and $\eta_i = s(y_i, \boldsymbol{\gamma}) + \boldsymbol{\beta}'\mathbf{x}_i$. Note that

$$\frac{d s(y; \boldsymbol{\gamma})}{d y} = \gamma_1 + \sum_{j=2}^{m} \gamma_j \left[ 3 (y - k_j)_+^2 - 3 \lambda_j (y - k_{\min})_+^2 - 3 (1 - \lambda_j)(y - k_{\max})_+^2 \right]$$

Piecewise Constant Baseline Hazard Model

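Let $\{(t_i, \mathbf{x}_i, \delta_i),\ i = 1, \ldots, n\}$ be the observed data. Let $a_0 = 0 < a_1 < \cdots < a_{J-1} < a_J = \infty$ be a partition of the time axis.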

Hazards in Original Scale

The hazard function for subject i is

$$h(t \mid \mathbf{x}_i; \boldsymbol{\theta}) = h_0(t) \exp(\boldsymbol{\beta}'\mathbf{x}_i)$$

where

$$h_0(t) = \lambda_j \quad \text{if } a_{j-1} \le t < a_j, \quad j = 1, \ldots, J$$

The baseline cumulative hazard function is

$$H_0(t) = \sum_{j=1}^{J} \lambda_j \Delta_j(t)$$

where

$$\Delta_j(t) = \begin{cases} 0 & t < a_{j-1} \\ t - a_{j-1} & a_{j-1} \le t < a_j \\ a_j - a_{j-1} & t \ge a_j \end{cases}$$
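As a small numerical illustration, the exposure terms $\Delta_j(t)$ and the baseline cumulative hazard can be computed as follows. The function names, the partition, and the hazard values are made up for the example and are not part of PROC PHREG.

```python
import numpy as np

def delta(t, a):
    """Exposure Delta_j(t) of time t in each interval [a[j-1], a[j]), j = 1..J."""
    a = np.asarray(a, dtype=float)
    lower, upper = a[:-1], a[1:]
    # min(t, a_j) - a_{j-1}, floored at zero, reproduces the three cases above
    return np.clip(np.minimum(t, upper) - lower, 0.0, None)

def baseline_cumhaz(t, a, lam):
    """H_0(t) = sum_j lam[j] * Delta_j(t) for piecewise constant hazards lam."""
    return float(np.dot(lam, delta(t, a)))

a = [0.0, 2.0, 5.0, np.inf]      # hypothetical partition a_0 < a_1 < a_2 < a_3
lam = [0.1, 0.2, 0.05]           # hypothetical constant hazards
h3 = baseline_cumhaz(3.0, a, lam)   # 0.1*2 + 0.2*1
h7 = baseline_cumhaz(7.0, a, lam)   # 0.1*2 + 0.2*3 + 0.05*2
```

Writing $\Delta_j(t)$ as a clipped difference avoids branching on the three cases and extends directly to arrays of times.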

The log likelihood is given by

$$l(\boldsymbol{\lambda}, \boldsymbol{\beta}) = \sum_{j=1}^{J} d_j \log(\lambda_j) + \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i - \sum_{j=1}^{J} \lambda_j \left[ \sum_{i=1}^{n} \Delta_j(t_i) \exp(\boldsymbol{\beta}'\mathbf{x}_i) \right]$$

where $d_j = \sum_{i=1}^{n} \delta_i I(a_{j-1} \le t_i < a_j)$ is the number of events in the $j$th interval.

Note that for $1 \le j \le J$, the full conditional for $\lambda_j$ is log-concave only when $d_j > 0$, but the full conditionals for the $\beta$'s are always log-concave.

For a given $\boldsymbol{\beta}$, $\partial l / \partial \boldsymbol{\lambda} = \mathbf{0}$ gives

$$\tilde{\lambda}_j(\boldsymbol{\beta}) = \frac{d_j}{\sum_{i=1}^{n} \Delta_j(t_i) \exp(\boldsymbol{\beta}'\mathbf{x}_i)}, \quad j = 1, \ldots, J$$

Substituting these values into $l(\boldsymbol{\lambda}, \boldsymbol{\beta})$ gives the profile log likelihood for $\boldsymbol{\beta}$

$$l_p(\boldsymbol{\beta}) = \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i - \sum_{j=1}^{J} d_j \log\left[ \sum_{l=1}^{n} \Delta_j(t_l) \exp(\boldsymbol{\beta}'\mathbf{x}_l) \right] + c$$

where $c = \sum_{j} (d_j \log(d_j) - d_j)$. Since the constant $c$ does not depend on $\boldsymbol{\beta}$, it can be discarded from $l_p(\boldsymbol{\beta})$ in the optimization.
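The constant can be checked by direct substitution. The sketch below assumes the piecewise exponential log likelihood in its usual collapsed form, with $d_j$ the number of events in interval $j$, and uses the shorthand $A_j(\boldsymbol{\beta})$, a name introduced here only for illustration, for the risk-weighted exposure in interval $j$. Intervals with $d_j = 0$ contribute nothing under the convention $0 \log 0 = 0$.

```latex
\begin{aligned}
A_j(\boldsymbol{\beta}) &= \sum_{i=1}^{n} \Delta_j(t_i) \exp(\boldsymbol{\beta}'\mathbf{x}_i),
\qquad
\tilde{\lambda}_j(\boldsymbol{\beta}) = \frac{d_j}{A_j(\boldsymbol{\beta})} \\
l(\tilde{\boldsymbol{\lambda}}(\boldsymbol{\beta}), \boldsymbol{\beta})
 &= \sum_{j=1}^{J} d_j \log\frac{d_j}{A_j(\boldsymbol{\beta})}
  + \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i
  - \sum_{j=1}^{J} \frac{d_j}{A_j(\boldsymbol{\beta})} A_j(\boldsymbol{\beta}) \\
 &= \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i
  - \sum_{j=1}^{J} d_j \log A_j(\boldsymbol{\beta})
  + \underbrace{\sum_{j=1}^{J} \left( d_j \log d_j - d_j \right)}_{c}
\end{aligned}
```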

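The MLE $\hat{\boldsymbol{\beta}}$ of $\boldsymbol{\beta}$ is obtained by maximizing

$$l_p(\boldsymbol{\beta}) = \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i - \sum_{j=1}^{J} d_j \log\left[ \sum_{l=1}^{n} \Delta_j(t_l) \exp(\boldsymbol{\beta}'\mathbf{x}_l) \right]$$

with respect to $\boldsymbol{\beta}$, and the MLE $\hat{\boldsymbol{\lambda}}$ of $\boldsymbol{\lambda}$ is given by

$$\hat{\boldsymbol{\lambda}} = \tilde{\boldsymbol{\lambda}}(\hat{\boldsymbol{\beta}})$$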
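The closed-form hazard estimates for a fixed coefficient vector can be illustrated with a short Python sketch. The function name `profile_hazards` and the four-subject data set are hypothetical, and the code assumes that the event count for an interval is the number of uncensored times falling in that interval.

```python
import numpy as np

def profile_hazards(t, status, X, a, beta):
    """Return (d, s0, lam_tilde) for piecewise constant hazards at a fixed beta.

    d[j]         : events with a[j] <= t_i < a[j+1]   (assumed definition of d_j)
    s0[j]        : sum_i Delta_j(t_i) * exp(beta'x_i)
    lam_tilde[j] : d[j] / s0[j]
    """
    t = np.asarray(t, dtype=float)
    status = np.asarray(status, dtype=int)
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float)
    beta = np.asarray(beta, dtype=float)
    lower, upper = a[:-1], a[1:]
    tt = t[:, None]                                    # column of times, broadcast over intervals
    exposure = np.clip(np.minimum(tt, upper) - lower, 0.0, None)   # Delta_j(t_i), shape (n, J)
    in_interval = (tt >= lower) & (tt < upper)
    d = (status[:, None] * in_interval).sum(axis=0)
    s0 = exposure.T @ np.exp(X @ beta)
    return d, s0, d / s0

# Hypothetical data: four subjects, one binary covariate, three intervals.
d, s0, lam_tilde = profile_hazards(
    t=[1.0, 3.0, 4.0, 6.0], status=[1, 0, 1, 1],
    X=[[0.0], [1.0], [0.0], [1.0]], a=[0.0, 2.0, 5.0, np.inf], beta=[0.0])
```

With the coefficient set to zero, each estimate reduces to the number of events divided by the total time at risk in the interval, which is the familiar occurrence/exposure rate.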

For $j = 1, \ldots, J$, let

$$\begin{aligned} \mathbf{S}_j^{(r)}(\boldsymbol{\beta}) &= \sum_{l=1}^{n} \Delta_j(t_l)\, \mathrm{e}^{\boldsymbol{\beta}'\mathbf{x}_l}\, \mathbf{x}_l^{\otimes r}, \quad r = 0, 1, 2 \\ \mathbf{E}_j(\boldsymbol{\beta}) &= \frac{\mathbf{S}_j^{(1)}(\boldsymbol{\beta})}{S_j^{(0)}(\boldsymbol{\beta})} \end{aligned}$$

The partial derivatives of $l_p(\boldsymbol{\beta})$ are

$$\begin{aligned} \frac{\partial l_p(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}} &= \sum_{i=1}^{n} \delta_i \mathbf{x}_i - \sum_{j=1}^{J} d_j \mathbf{E}_j(\boldsymbol{\beta}) \\ -\frac{\partial^2 l_p(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}^2} &= \sum_{j=1}^{J} d_j \left\{ \frac{\mathbf{S}_j^{(2)}(\boldsymbol{\beta})}{S_j^{(0)}(\boldsymbol{\beta})} - \left[ \mathbf{E}_j(\boldsymbol{\beta}) \right] \left[ \mathbf{E}_j(\boldsymbol{\beta}) \right]' \right\} \end{aligned}$$

The asymptotic covariance matrix for $(\hat{\boldsymbol{\lambda}}, \hat{\boldsymbol{\beta}})$ is obtained as the inverse of the information matrix given by

$$\begin{aligned} -\frac{\partial^2 l(\hat{\boldsymbol{\lambda}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\lambda}^2} &= \mathcal{D}\left( \frac{d_1}{\hat{\lambda}_1^2}, \ldots, \frac{d_J}{\hat{\lambda}_J^2} \right) \\ -\frac{\partial^2 l(\hat{\boldsymbol{\lambda}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\beta}^2} &= \sum_{j=1}^{J} \hat{\lambda}_j \mathbf{S}_j^{(2)}(\hat{\boldsymbol{\beta}}) \\ -\frac{\partial^2 l(\hat{\boldsymbol{\lambda}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\lambda}\, \partial \boldsymbol{\beta}} &= \left( \mathbf{S}_1^{(1)}(\hat{\boldsymbol{\beta}}), \ldots, \mathbf{S}_J^{(1)}(\hat{\boldsymbol{\beta}}) \right) \end{aligned}$$

See Example 6.5.1 in Lawless (2003) for details.

Hazards in Log Scale

By letting

$$\alpha_j = \log(\lambda_j), \quad j = 1, \ldots, J$$

you can build a prior correlation among the $\lambda_j$'s by using a correlated prior $\boldsymbol{\alpha} \sim N(\boldsymbol{\alpha}_0, \boldsymbol{\Psi}_0)$, where $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_J)'$.

The log likelihood is given by

$$l(\boldsymbol{\alpha}, \boldsymbol{\beta}) = \sum_{j=1}^{J} d_j \alpha_j + \sum_{i=1}^{n} \delta_i \boldsymbol{\beta}'\mathbf{x}_i - \sum_{j=1}^{J} \mathrm{e}^{\alpha_j} S_j^{(0)}(\boldsymbol{\beta})$$

Then the MLE of $\lambda_j$ is given by

$$\mathrm{e}^{\hat{\alpha}_j} = \hat{\lambda}_j = \frac{d_j}{S_j^{(0)}(\hat{\boldsymbol{\beta}})}$$

Note that the full conditionals for the $\alpha$'s and $\beta$'s are always log-concave.

The asymptotic covariance matrix for $(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})$ is obtained as the inverse of the information matrix formed by

$$\begin{aligned} -\frac{\partial^2 l(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\alpha}^2} &= \mathcal{D}\left( \mathrm{e}^{\hat{\alpha}_1} S_1^{(0)}(\hat{\boldsymbol{\beta}}), \ldots, \mathrm{e}^{\hat{\alpha}_J} S_J^{(0)}(\hat{\boldsymbol{\beta}}) \right) \\ -\frac{\partial^2 l(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\beta}^2} &= \sum_{j=1}^{J} \mathrm{e}^{\hat{\alpha}_j} \mathbf{S}_j^{(2)}(\hat{\boldsymbol{\beta}}) \\ -\frac{\partial^2 l(\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\beta}})}{\partial \boldsymbol{\alpha}\, \partial \boldsymbol{\beta}} &= \left( \mathrm{e}^{\hat{\alpha}_1} \mathbf{S}_1^{(1)}(\hat{\boldsymbol{\beta}}), \ldots, \mathrm{e}^{\hat{\alpha}_J} \mathbf{S}_J^{(1)}(\hat{\boldsymbol{\beta}}) \right) \end{aligned}$$

Priors for Model Parameters

For a Cox model, the model parameters are the regression coefficients. For a piecewise exponential model, the model parameters consist of the regression coefficients and the hazards or log-hazards. The priors for the hazards and the priors for the regression coefficients are assumed to be independent, while you can have a joint multivariate normal prior for the log-hazards and the regression coefficients. For a proportional hazards spline model, the model parameters consist of the regression coefficients and the cubic spline parameters. You can have a joint multivariate normal prior for the cubic spline parameters and the regression coefficients. Otherwise, the prior for the cubic spline parameters and the prior for the regression coefficients are assumed to be independent.

Cubic Spline Parameters

Uniform Prior

The joint prior density is given by

$$p(\gamma_1, \ldots, \gamma_J) \propto 1, \quad \forall\ -\infty < \gamma_i < \infty$$

Normal Prior

Assume $\boldsymbol{\gamma}$ has a multivariate normal prior with mean vector $\boldsymbol{\gamma}_0$ and covariance matrix $\boldsymbol{\Psi}_0$. The joint prior density is given by

$$p(\boldsymbol{\gamma}) \propto \mathrm{e}^{-\frac{1}{2} (\boldsymbol{\gamma} - \boldsymbol{\gamma}_0)' \boldsymbol{\Psi}_0^{-1} (\boldsymbol{\gamma} - \boldsymbol{\gamma}_0)}$$

Hazard Parameters

Let $\lambda_1, \ldots, \lambda_J$ be the constant baseline hazards.

Improper Prior

The joint prior density is given by

$$p(\lambda_1, \ldots, \lambda_J) = \prod_{j=1}^{J} \frac{1}{\lambda_j} \quad \text{for all } \lambda_j > 0$$

This prior is improper (nonintegrable), but the posterior distribution is proper as long as there is at least one event time in each of the constant hazard intervals.

Uniform Prior

The joint prior density is given by

$$p(\lambda_1, \ldots, \lambda_J) \propto 1 \quad \text{for all } \lambda_j > 0$$

This prior is improper (nonintegrable), but the posteriors are proper as long as there is at least one event time in each of the constant hazard intervals.

Gamma Prior

The gamma distribution has a PDF

$$f_{a,b}(t) = \frac{b (b t)^{a-1} \mathrm{e}^{-b t}}{\Gamma(a)}, \quad t > 0$$

where $a$ is the shape parameter and $b^{-1}$ is the scale parameter. The mean is $a/b$ and the variance is $a/b^2$.

Independent Gamma Prior

Suppose for $j = 1, \ldots, J$, $\lambda_j$ has an independent gamma $G(a_j, b_j)$ prior. The joint prior density is given by

$$p(\lambda_1, \ldots, \lambda_J) \propto \prod_{j=1}^{J} \left\{ \lambda_j^{a_j - 1} \mathrm{e}^{-b_j \lambda_j} \right\}, \quad \forall\ \lambda_j > 0$$

AR1 Prior

$\lambda_1, \ldots, \lambda_J$ are correlated as follows, where $G(a, b)$ denotes the gamma distribution with shape $a$ and inverse scale $b$:

$$\begin{aligned} \lambda_1 &\sim G(a_1, b_1) \\ \lambda_2 &\sim G\left(a_2, \frac{b_2}{\lambda_1}\right) \\ &\ \ \vdots \\ \lambda_J &\sim G\left(a_J, \frac{b_J}{\lambda_{J-1}}\right) \end{aligned}$$

The joint prior density is given by

$$p(\lambda_1, \ldots, \lambda_J) \propto \lambda_1^{a_1 - 1} \mathrm{e}^{-b_1 \lambda_1} \prod_{j=2}^{J} \left( \frac{b_j}{\lambda_{j-1}} \right)^{a_j} \lambda_j^{a_j - 1} \mathrm{e}^{-\frac{b_j}{\lambda_{j-1}} \lambda_j}$$

Log-Hazard Parameters

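Write $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_J)' \equiv (\log \lambda_1, \ldots, \log \lambda_J)'$.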

Uniform Prior

The joint prior density is given by

$$p(\alpha_1, \ldots, \alpha_J) \propto 1, \quad \forall\ -\infty < \alpha_i < \infty$$

Note that the uniform prior for the log-hazards is the same as the improper prior for the hazards.
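The equivalence follows from a change of variables. As a brief worked step, assuming only the transformation $\alpha_j = \log \lambda_j$ applied componentwise:

```latex
\alpha_j = \log \lambda_j, \qquad \frac{\partial \alpha_j}{\partial \lambda_j} = \frac{1}{\lambda_j}
\quad\Longrightarrow\quad
p(\lambda_1, \ldots, \lambda_J)
  = p(\alpha_1, \ldots, \alpha_J) \prod_{j=1}^{J} \left| \frac{\partial \alpha_j}{\partial \lambda_j} \right|
  \propto \prod_{j=1}^{J} \frac{1}{\lambda_j}, \qquad \lambda_j > 0
```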

Normal Prior

Assume $\boldsymbol{\alpha}$ has a multivariate normal prior with mean vector $\boldsymbol{\alpha}_0$ and covariance matrix $\boldsymbol{\Psi}_0$. The joint prior density is given by

$$p(\boldsymbol{\alpha}) \propto \mathrm{e}^{-\frac{1}{2} (\boldsymbol{\alpha} - \boldsymbol{\alpha}_0)' \boldsymbol{\Psi}_0^{-1} (\boldsymbol{\alpha} - \boldsymbol{\alpha}_0)}$$

Regression Coefficients

Let $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_k)'$ be the vector of regression coefficients.

Uniform Prior

The joint prior density is given by

$$p(\beta_1, \ldots, \beta_k) \propto 1, \quad \forall\ -\infty < \beta_i < \infty$$

This prior is improper, but the posterior distributions for $\boldsymbol{\beta}$ are proper.

Normal Prior

Assume $\boldsymbol{\beta}$ has a multivariate normal prior with mean vector $\boldsymbol{\beta}_0$ and covariance matrix $\boldsymbol{\Sigma}_0$. The joint prior density is given by

$$p(\boldsymbol{\beta}) \propto \mathrm{e}^{-\frac{1}{2} (\boldsymbol{\beta} - \boldsymbol{\beta}_0)' \boldsymbol{\Sigma}_0^{-1} (\boldsymbol{\beta} - \boldsymbol{\beta}_0)}$$

Joint Multivariate Normal Prior for Log-Hazards and Regression Coefficients

Assume $(\boldsymbol{\alpha}', \boldsymbol{\beta}')'$ has a multivariate normal prior with mean vector $(\boldsymbol{\alpha}_0', \boldsymbol{\beta}_0')'$ and covariance matrix $\boldsymbol{\Phi}_0$. The joint prior density is given by

$$p(\boldsymbol{\alpha}, \boldsymbol{\beta}) \propto \mathrm{e}^{-\frac{1}{2} \left[ (\boldsymbol{\alpha} - \boldsymbol{\alpha}_0)', (\boldsymbol{\beta} - \boldsymbol{\beta}_0)' \right] \boldsymbol{\Phi}_0^{-1} \left[ (\boldsymbol{\alpha} - \boldsymbol{\alpha}_0)', (\boldsymbol{\beta} - \boldsymbol{\beta}_0)' \right]'}$$

Joint Multivariate Normal Prior for Cubic Spline Parameters and Regression Coefficients

Assume $(\boldsymbol{\gamma}', \boldsymbol{\beta}')'$ has a multivariate normal prior with mean vector $(\boldsymbol{\gamma}_0', \boldsymbol{\beta}_0')'$ and covariance matrix $\boldsymbol{\Phi}_0$. The joint prior density is given by

$$p(\boldsymbol{\gamma}, \boldsymbol{\beta}) \propto \mathrm{e}^{-\frac{1}{2} \left[ (\boldsymbol{\gamma} - \boldsymbol{\gamma}_0)', (\boldsymbol{\beta} - \boldsymbol{\beta}_0)' \right] \boldsymbol{\Phi}_0^{-1} \left[ (\boldsymbol{\gamma} - \boldsymbol{\gamma}_0)', (\boldsymbol{\beta} - \boldsymbol{\beta}_0)' \right]'}$$

Zellner’s g-Prior

Assume $\boldsymbol{\beta}$ has a multivariate normal prior with mean vector $\mathbf{0}$ and covariance matrix $(g \mathbf{X}'\mathbf{X})^{-1}$, where $\mathbf{X}$ is the design matrix and $g$ is either a constant or it follows a gamma prior with density $f(\tau) = \frac{b (b\tau)^{a-1} \mathrm{e}^{-b\tau}}{\Gamma(a)}$, where $a$ and $b$ are the SHAPE= and ISCALE= parameters. Let $k$ be the rank of $\mathbf{X}$. The joint prior density with $g$ being a constant $c$ is given by

$$p(\boldsymbol{\beta}) \propto c^{\frac{k}{2}}\, \mathrm{e}^{-\frac{1}{2} \boldsymbol{\beta}' (c \mathbf{X}'\mathbf{X}) \boldsymbol{\beta}}$$

The joint prior density with $g$ having a gamma prior is given by

$$p(\boldsymbol{\beta}, \tau) \propto \tau^{\frac{k}{2}}\, \mathrm{e}^{-\frac{1}{2} \boldsymbol{\beta}' (\tau \mathbf{X}'\mathbf{X}) \boldsymbol{\beta}}\, \frac{b (b\tau)^{a-1} \mathrm{e}^{-b\tau}}{\Gamma(a)}$$

Dispersion Parameter for Frailty Model

Improper Prior

The density is

$$p(\theta) = \frac{1}{\theta}$$

Inverse Gamma Prior

The inverse gamma distribution has a density

$$p(\theta \mid a, b) = \frac{b^a\, \theta^{-(a+1)}\, \mathrm{e}^{-\frac{b}{\theta}}}{\Gamma(a)}$$

where a and b are the SHAPE= and SCALE= parameters, respectively.

Gamma Prior

The gamma distribution has a density

$$p(\theta \mid a, b) = \frac{b^a\, \theta^{a-1}\, \mathrm{e}^{-b\theta}}{\Gamma(a)}$$

where a and b are the SHAPE= and ISCALE= parameters, respectively.

Posterior Distribution

Denote the observed data as D.

Cox Model

$$\pi(\boldsymbol{\beta} \mid D) \propto \underbrace{L(D \mid \boldsymbol{\beta})}_{\text{partial likelihood}}\ \overbrace{p(\boldsymbol{\beta})}^{\text{prior}}$$

Proportional Hazards Spline Model

$$\pi(\boldsymbol{\gamma}, \boldsymbol{\beta} \mid D) \propto \begin{cases} L_{\mathrm{RP}}(D \mid \boldsymbol{\gamma}, \boldsymbol{\beta})\, p(\boldsymbol{\gamma}, \boldsymbol{\beta}) & \text{if } (\boldsymbol{\gamma}', \boldsymbol{\beta}')' \sim \text{MVN} \\ L_{\mathrm{RP}}(D \mid \boldsymbol{\gamma}, \boldsymbol{\beta})\, p(\boldsymbol{\gamma})\, p(\boldsymbol{\beta}) & \text{otherwise} \end{cases}$$

where $L_{\mathrm{RP}}(D \mid \boldsymbol{\gamma}, \boldsymbol{\beta})$ is the likelihood function with cubic spline parameters and regression coefficients as parameters.

Frailty Model

Based on the framework of Sargent (1998),

$$\pi(\boldsymbol{\beta}, \boldsymbol{\gamma}, \theta \mid D) \propto \underbrace{L(D \mid \boldsymbol{\beta}, \boldsymbol{\gamma})}_{\text{partial likelihood}}\ \overbrace{g(\boldsymbol{\gamma} \mid \theta)}^{\text{random effects}}\ \underbrace{p(\boldsymbol{\beta})\, p(\theta)}_{\text{priors}}$$

where the joint density of the random effects is given by

$$g(\boldsymbol{\gamma} \mid \theta) \propto \begin{cases} \prod_i \exp\left( \frac{\gamma_i}{\theta} \right) \exp\left( -\exp\left( \frac{\gamma_i}{\theta} \right) \right) & \text{gamma frailty} \\ \prod_i \exp\left( -\frac{\gamma_i^2}{2\theta} \right) & \text{lognormal frailty} \end{cases}$$

Piecewise Exponential Model

Hazard Parameters

$$\pi(\boldsymbol{\lambda}, \boldsymbol{\beta} \mid D) \propto L_H(D \mid \boldsymbol{\lambda}, \boldsymbol{\beta})\, p(\boldsymbol{\lambda})\, p(\boldsymbol{\beta})$$

where $L_H(D \mid \boldsymbol{\lambda}, \boldsymbol{\beta})$ is the likelihood function with hazards and regression coefficients as parameters.

Log-Hazard Parameters

$$\pi(\boldsymbol{\alpha}, \boldsymbol{\beta} \mid D) \propto \begin{cases} L_{\mathrm{LH}}(D \mid \boldsymbol{\alpha}, \boldsymbol{\beta})\, p(\boldsymbol{\alpha}, \boldsymbol{\beta}) & \text{if } (\boldsymbol{\alpha}', \boldsymbol{\beta}')' \sim \text{MVN} \\ L_{\mathrm{LH}}(D \mid \boldsymbol{\alpha}, \boldsymbol{\beta})\, p(\boldsymbol{\alpha})\, p(\boldsymbol{\beta}) & \text{otherwise} \end{cases}$$

where $L_{\mathrm{LH}}(D \mid \boldsymbol{\alpha}, \boldsymbol{\beta})$ is the likelihood function with log-hazards and regression coefficients as parameters.

Sampling from the Posterior Distribution

For the Gibbs sampler, PROC PHREG uses the ARMS (adaptive rejection Metropolis sampling) algorithm of Gilks, Best, and Tan (1995) to sample from the full conditionals. This is the default sampling scheme. Alternatively, you can request the random walk Metropolis (RWM) algorithm to sample an entire parameter vector from the posterior distribution. For a general discussion of these algorithms, see the section "Markov Chain Monte Carlo Method" in Chapter 8, "Introduction to Bayesian Analysis Procedures."

You can output these posterior samples into a SAS data set by using the OUTPOST= option in the BAYES statement, or you can use the following SAS statement to output the posterior samples into the SAS data set Post:

 ods output PosteriorSample=Post;

The output data set also includes the variables LogLike and LogPost, which represent the log of the likelihood and the log of the posterior density, respectively.

Let $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_k)'$ be the parameter vector. For the Cox model, the $\theta_i$'s are the regression coefficients $\beta_i$'s, and for the piecewise constant baseline hazard model, the $\theta_i$'s consist of the baseline hazards $\lambda_i$'s (or log baseline hazards $\alpha_i$'s) and the regression coefficients $\beta_j$'s. Let $L(D \mid \boldsymbol{\theta})$ be the likelihood function, where $D$ is the observed data. Note that for the Cox model, the likelihood contains the infinite-dimensional baseline hazard function, and the gamma process is perhaps the most commonly used prior process (Ibrahim, Chen, and Sinha 2001). However, Sinha, Ibrahim, and Chen (2003) justify using the partial likelihood as the likelihood function for the Bayesian analysis. Let $p(\boldsymbol{\theta})$ be the prior distribution. The posterior $\pi(\boldsymbol{\theta} \mid D)$ is proportional to the joint distribution $L(D \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})$.

Gibbs Sampler

The full conditional distribution of $\theta_i$ is proportional to the joint distribution; that is,

$$\pi(\theta_i \mid \theta_j, i \ne j, D) \propto L(D \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})$$

For example, the one-dimensional conditional distribution of $\theta_1$, given $\theta_j = \theta_j^*$, $2 \le j \le k$, is computed as

$$\pi(\theta_1 \mid \theta_j = \theta_j^*, 2 \le j \le k, D) = L(D \mid \boldsymbol{\theta} = (\theta_1, \theta_2^*, \ldots, \theta_k^*)')\; p(\boldsymbol{\theta} = (\theta_1, \theta_2^*, \ldots, \theta_k^*)')$$

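Suppose you have a set of arbitrary starting values $\{\theta_1^{(0)}, \ldots, \theta_k^{(0)}\}$. Using the ARMS algorithm, an iteration of the Gibbs sampler consists of the following:

draw $\theta_1^{(1)}$ from $\pi(\theta_1 \mid \theta_2^{(0)}, \ldots, \theta_k^{(0)}, D)$
draw $\theta_2^{(1)}$ from $\pi(\theta_2 \mid \theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_k^{(0)}, D)$
$\vdots$
draw $\theta_k^{(1)}$ from $\pi(\theta_k \mid \theta_1^{(1)}, \ldots, \theta_{k-1}^{(1)}, D)$

After one iteration, you have $\{\theta_1^{(1)}, \ldots, \theta_k^{(1)}\}$. After $n$ iterations, you have $\{\theta_1^{(n)}, \ldots, \theta_k^{(n)}\}$. Cumulatively, a chain of $n$ samples is obtained.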
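The coordinate-by-coordinate update can be sketched in Python. This is a minimal illustration under stated assumptions, not the PROC PHREG implementation: the target is an assumed standard bivariate normal with correlation rho, whose full conditionals are normal and can be drawn exactly, so the ARMS step is replaced by a direct draw. The function name gibbs_bivariate_normal and its parameters are hypothetical.

```python
import random

def gibbs_bivariate_normal(n_iter, rho, start=(0.0, 0.0), seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Assumption for illustration: each full conditional is
    N(rho * other, 1 - rho**2), so it is sampled directly instead of
    with ARMS.
    """
    rng = random.Random(seed)
    theta1, theta2 = start          # arbitrary starting values
    sd = (1.0 - rho * rho) ** 0.5   # conditional standard deviation
    chain = []
    for _ in range(n_iter):
        # draw theta1 from its full conditional given the current theta2
        theta1 = rng.gauss(rho * theta2, sd)
        # draw theta2 from its full conditional given the updated theta1
        theta2 = rng.gauss(rho * theta1, sd)
        chain.append((theta1, theta2))
    return chain

chain = gibbs_bivariate_normal(n_iter=5000, rho=0.5)
```

Each pass through the loop is one iteration of the sampler, and the list that is returned plays the role of the chain of n samples.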

Random Walk Metropolis Algorithm

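PROC PHREG uses a multivariate normal proposal distribution $q(\cdot \mid \boldsymbol\theta)$ centered at $\boldsymbol\theta$. With an initial parameter vector $\boldsymbol\theta^{(0)}$, a new sample $\boldsymbol\theta^{(1)}$ is obtained as follows:

sample $\boldsymbol\theta^{*}$ from $q(\cdot \mid \boldsymbol\theta^{(0)})$
calculate the quantity $r = \min\left\{ \dfrac{\pi(\boldsymbol\theta^{*} \mid D)}{\pi(\boldsymbol\theta^{(0)} \mid D)},\, 1 \right\}$
sample $u$ from the uniform distribution $U(0, 1)$
set $\boldsymbol\theta^{(1)} = \boldsymbol\theta^{*}$ if $u < r$; otherwise set $\boldsymbol\theta^{(1)} = \boldsymbol\theta^{(0)}$

With $\boldsymbol\theta^{(1)}$ taking the role of $\boldsymbol\theta^{(0)}$, the previous steps are repeated to generate the next sample $\boldsymbol\theta^{(2)}$. After $n$ iterations, a chain of $n$ samples $\{\boldsymbol\theta^{(1)}, \ldots, \boldsymbol\theta^{(n)}\}$ is obtained.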
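The propose, compare, and accept-or-reject cycle can be sketched in Python as follows. This is an illustration under assumptions rather than the procedure's own code: the proposal uses independent normal steps with a common standard deviation (the hypothetical tuning parameter step) instead of a full covariance matrix, and the acceptance ratio is evaluated on the log scale for numerical stability. The names random_walk_metropolis and std_normal_log_post are hypothetical.

```python
import math
import random

def random_walk_metropolis(log_post, theta0, n_iter, step=1.0, seed=1):
    """Random walk Metropolis with independent normal proposal steps.

    log_post: function returning the log posterior density up to a constant.
    step: proposal standard deviation (hypothetical tuning parameter).
    """
    rng = random.Random(seed)
    current = list(theta0)
    current_lp = log_post(current)
    chain = []
    for _ in range(n_iter):
        # sample a candidate from a normal proposal centered at the current value
        proposal = [t + rng.gauss(0.0, step) for t in current]
        proposal_lp = log_post(proposal)
        # log of r = min(posterior ratio, 1)
        log_r = min(proposal_lp - current_lp, 0.0)
        # u is uniform on (0, 1]; accept the candidate when u < r
        u = 1.0 - rng.random()
        if math.log(u) < log_r:
            current, current_lp = proposal, proposal_lp
        chain.append(list(current))
    return chain

def std_normal_log_post(theta):
    """Toy target assumed for illustration: independent standard normals."""
    return -0.5 * sum(t * t for t in theta)

chain = random_walk_metropolis(std_normal_log_post, [3.0, -3.0], n_iter=20000)
```

Because only the ratio of posterior densities enters the acceptance step, the log posterior needs to be known only up to an additive constant.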

Starting Values of the Markov Chains

When the BAYES statement is specified, PROC PHREG generates one Markov chain that contains the approximate posterior samples of the model parameters. Additional chains are produced when the Gelman-Rubin diagnostics are requested. Starting values (initial values) can be specified in the INITIAL= data set in the BAYES statement. If the INITIAL= option is not specified, PROC PHREG picks its own initial values for the chains based on the maximum likelihood estimates of $\boldsymbol\theta$ and the prior information of $\boldsymbol\theta$.

Denote $[x]$ as the integer part of $x$.

Constant Baseline Hazard Parameters $\lambda_i$’s

For the first chain that the summary statistics and diagnostics are based on, the initial values are

$$\lambda_i^{(0)} = \hat\lambda_i$$

For subsequent chains, the starting values are picked in two different ways according to the total number of chains specified. If the total number of chains specified is less than or equal to 10, initial values of the $r$th chain ($2 \le r \le 10$) are given by

$$\lambda_i^{(0)} = \hat\lambda_i\, e^{\pm\left(\left[\frac{r}{2}\right] + 2\right) \hat s(\hat\lambda_i)}$$

with the plus sign for odd $r$ and minus sign for even $r$. If the total number of chains is greater than 10, initial values are picked at random over a wide range of values. Let $u_i$ be a uniform random number between 0 and 1; the initial value for $\lambda_i$ is given by

$$\lambda_i^{(0)} = \hat\lambda_i\, e^{16 (u_i - 0.5)\, \hat s(\hat\lambda_i)}$$

Regression Coefficients and Log-Hazard Parameters $\theta_i$’s

The $\theta_i$’s are the regression coefficients $\beta_i$’s, and in the piecewise exponential model, include the log-hazard parameters $\alpha_i$’s. For the first chain that the summary statistics and regression diagnostics are based on, the initial values are

$$\theta_i^{(0)} = \hat\theta_i$$

If the number of chains requested is less than or equal to 10, initial values for the $r$th chain ($2 \le r \le 10$) are given by

$$\theta_i^{(0)} = \hat\theta_i \pm \left(2 + \left[\frac{r}{2}\right]\right) \hat s(\hat\theta_i)$$

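with the plus sign for odd $r$ and minus sign for even $r$. When there are more than 10 chains, the initial value for $\theta_i$ is picked at random over the range $\big(\hat\theta_i - 8\hat s(\hat\theta_i),\ \hat\theta_i + 8\hat s(\hat\theta_i)\big)$; that is,

$$\theta_i^{(0)} = \hat\theta_i + 16 (u_i - 0.5)\, \hat s(\hat\theta_i)$$

where $u_i$ is a uniform random number between 0 and 1.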
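The deterministic rule for at most 10 chains can be sketched in Python. The sketch assumes that the estimate and its standard error are already available; the function name initial_value and the multiplicative flag are hypothetical, and the random rule for more than 10 chains is not covered.

```python
import math

def initial_value(estimate, std_err, r, multiplicative=False):
    """Starting value for chain r (1 <= r <= 10).

    estimate, std_err: an estimate and its standard error (assumed known).
    multiplicative=True applies the baseline hazard form (estimate times an
    exponential factor); False applies the additive form used for regression
    coefficients and log-hazard parameters.
    """
    if not 1 <= r <= 10:
        raise ValueError("this sketch covers at most 10 chains")
    if r == 1:
        return estimate                     # first chain starts at the estimate
    shift = (r // 2 + 2) * std_err          # ([r/2] + 2) times the standard error
    sign = 1.0 if r % 2 == 1 else -1.0      # plus for odd r, minus for even r
    if multiplicative:
        return estimate * math.exp(sign * shift)
    return estimate + sign * shift

# Starting values of one coefficient for chains 1 through 4
starts = [initial_value(1.0, 0.5, r) for r in range(1, 5)]
```

With an estimate of 1.0 and a standard error of 0.5, chains 2 and 3 start at three standard errors below and above the estimate, and chain 4 starts four standard errors below it.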

Fit Statistics

Denote the observed data by $D$. Let $\boldsymbol\theta$ be the vector of parameters of length $k$. Let $L(D \mid \boldsymbol\theta)$ be the likelihood. The deviance information criterion (DIC) proposed in Spiegelhalter et al. (2002) is a Bayesian model assessment tool. Let $\mathrm{Dev}(\boldsymbol\theta) = -2 \log L(D \mid \boldsymbol\theta)$. Let $\overline{\mathrm{Dev}(\boldsymbol\theta)}$ and $\bar{\boldsymbol\theta}$ be the corresponding posterior means of $\mathrm{Dev}(\boldsymbol\theta)$ and $\boldsymbol\theta$, respectively. The deviance information criterion is computed as

$$\mathrm{DIC} = 2\, \overline{\mathrm{Dev}(\boldsymbol\theta)} - \mathrm{Dev}(\bar{\boldsymbol\theta})$$

Also computed is

$$p_D = \overline{\mathrm{Dev}(\boldsymbol\theta)} - \mathrm{Dev}(\bar{\boldsymbol\theta})$$

where $p_D$ is interpreted as the effective number of parameters.

Note that $\mathrm{Dev}(\boldsymbol\theta)$ defined here does not have the standardizing term as in the section Deviance Information Criterion (DIC) in Chapter 8, Introduction to Bayesian Analysis Procedures. Nevertheless, the DIC calculated here is still useful for variable selection.
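The two quantities can be computed from a posterior sample as in the following Python sketch. It is an illustration only: the likelihood is an assumed toy model (independent normal data with unknown mean and unit variance), the parameter is scalar, and the names normal_log_lik and dic are hypothetical.

```python
import math

def normal_log_lik(mu, data, sigma=1.0):
    """Log likelihood of i.i.d. N(mu, sigma^2) data (toy model, assumed)."""
    n = len(data)
    sse = sum((y - mu) ** 2 for y in data)
    return -0.5 * n * math.log(2.0 * math.pi * sigma ** 2) - sse / (2.0 * sigma ** 2)

def dic(samples, data, log_lik):
    """Return (DIC, pD) from posterior draws of a scalar parameter."""
    deviances = [-2.0 * log_lik(theta, data) for theta in samples]
    mean_dev = sum(deviances) / len(deviances)      # posterior mean of the deviance
    theta_bar = sum(samples) / len(samples)         # posterior mean of the parameter
    dev_at_mean = -2.0 * log_lik(theta_bar, data)   # deviance at the posterior mean
    p_d = mean_dev - dev_at_mean
    return 2.0 * mean_dev - dev_at_mean, p_d

data = [0.2, -0.4, 1.1, 0.7]
samples = [0.1, 0.3, 0.5, 0.7]   # stand-in for a posterior chain
dic_value, p_d = dic(samples, data, normal_log_lik)
```

Because the DIC equals the deviance at the posterior mean plus twice the effective number of parameters, it rewards fit and penalizes complexity at the same time.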

Posterior Distribution for Quantities of Interest

Let $\boldsymbol\theta = (\theta_1, \ldots, \theta_k)'$ be the parameter vector. For the Cox model, the $\theta_i$’s are the regression coefficients $\beta_i$’s; for the proportional hazards spline model, the $\theta_i$’s consist of the cubic spline parameters $\gamma_i$’s and the regression coefficients $\beta_j$’s; for the piecewise constant baseline hazard model, the $\theta_i$’s consist of the baseline hazards $\lambda_i$’s (or log baseline hazards $\alpha_i$’s) and the regression coefficients $\beta_j$’s.

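Let $\mathcal{S} = \{\boldsymbol\theta^{(r)}, r = 1, \ldots, N\}$ be the chain that represents the posterior distribution for $\boldsymbol\theta$.

Consider a quantity of interest $\tau$ that can be expressed as a function $f(\boldsymbol\theta)$ of the parameter vector $\boldsymbol\theta$. You can construct the posterior distribution of $\tau$ by evaluating the function $f(\boldsymbol\theta^{(r)})$ for each $\boldsymbol\theta^{(r)}$ in $\mathcal{S}$. The posterior chain for $\tau$ is $\{f(\boldsymbol\theta^{(r)}), r = 1, \ldots, N\}$. Summary statistics such as mean, standard deviation, percentiles, and credible intervals are used to describe the posterior distribution of $\tau$.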

Hazard Ratio

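As shown in the section Hazard Ratios, a log-hazard ratio is a linear combination of the regression coefficients. Let $\mathbf{e}$ be the vector of linear coefficients. The posterior sample for this hazard ratio is the set $\{\exp(\mathbf{e}'\boldsymbol\beta^{(r)}), r = 1, \ldots, N\}$, where $N$ is the number of posterior draws.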
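The transformation of a chain of regression coefficients into a chain of hazard ratios, followed by simple summaries, can be sketched in Python. This is an illustration under assumptions: the draws are made-up stand-ins for posterior samples, the interval is a plain equal-tail interval based on order statistics (not necessarily the rule that PROC PHREG applies), and the names hazard_ratio_chain and summarize are hypothetical.

```python
import math

def hazard_ratio_chain(beta_draws, coefs):
    """Map each posterior draw of beta to exp(coefs' beta)."""
    return [math.exp(sum(c * b for c, b in zip(coefs, beta))) for beta in beta_draws]

def summarize(values, alpha=0.05):
    """Mean, standard deviation, and equal-tail interval of a chain."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    ordered = sorted(values)
    # crude order-statistic rule for the interval end points
    lower = ordered[int(math.floor((alpha / 2) * (n - 1)))]
    upper = ordered[int(math.ceil((1 - alpha / 2) * (n - 1)))]
    return {"mean": mean, "sd": sd, "interval": (lower, upper)}

# Three made-up draws of a two-dimensional coefficient vector
beta_draws = [(0.0, 1.0), (math.log(2.0), 0.5), (math.log(4.0), 0.0)]
hr = hazard_ratio_chain(beta_draws, coefs=(1.0, 0.0))
summary = summarize(hr)
```

The same pattern applies to any function of the parameters: evaluate it draw by draw, then summarize the resulting chain.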

Survival Distribution

Let $\mathbf{x}$ be a covariate vector of interest.

Cox Model

Let $\{(t_i, \mathbf{z}_i, \delta_i), i = 1, \ldots, n\}$ be the observed data. Define

$$Y_i(t) = \begin{cases} 1 & t < t_i \\ 0 & \text{otherwise} \end{cases}$$

Consider the $r$th draw $\boldsymbol\beta^{(r)}$ of the posterior chain. The baseline cumulative hazard function at time $t$ is given by

$$H_0(t \mid \boldsymbol\beta^{(r)}) = \sum_{i:\, t_i \le t} \frac{\delta_i}{\sum_{l=1}^{n} Y_l(t_i) \exp(\mathbf{z}_l' \boldsymbol\beta^{(r)})}$$

For the given covariate vector $\mathbf{x}$, the cumulative hazard function at time $t$ is

$$H(t; \mathbf{x} \mid \boldsymbol\beta^{(r)}) = H_0(t \mid \boldsymbol\beta^{(r)}) \exp(\mathbf{x}' \boldsymbol\beta^{(r)})$$

and the survival function at time $t$ is

$$S(t; \mathbf{x} \mid \boldsymbol\beta^{(r)}) = \exp\left[-H(t; \mathbf{x} \mid \boldsymbol\beta^{(r)})\right]$$

Proportional Hazards Spline Model

Consider the $r$th draw $\boldsymbol\theta^{(r)}$ in the posterior chain, where $\boldsymbol\theta^{(r)}$ consists of $\boldsymbol\gamma^{(r)}$ and $\boldsymbol\beta^{(r)}$. The baseline cumulative hazard function at time $t$ is

$$H_0(t \mid \boldsymbol\gamma^{(r)}) = \exp\left[s(\log(t), \boldsymbol\gamma^{(r)})\right]$$

where $s(\log(t), \boldsymbol\gamma)$ is a cubic spline function as described for the SPLINE option in the BAYES statement. For the given covariate vector $\mathbf{x}$, the cumulative hazard function at time $t$ is

$$H(t; \mathbf{x} \mid \boldsymbol\gamma^{(r)}, \boldsymbol\beta^{(r)}) = H_0(t \mid \boldsymbol\gamma^{(r)}) \exp(\mathbf{x}' \boldsymbol\beta^{(r)})$$

and the survival function at time $t$ is

$$S(t; \mathbf{x} \mid \boldsymbol\gamma^{(r)}, \boldsymbol\beta^{(r)}) = \exp\left[-H(t; \mathbf{x} \mid \boldsymbol\gamma^{(r)}, \boldsymbol\beta^{(r)})\right]$$

Piecewise Exponential Model

Let $a_0 = 0 < a_1 < \cdots < a_{J-1} < a_J = \infty$ be a partition of the time axis. Consider the $r$th draw $\boldsymbol\theta^{(r)}$ in the posterior chain, where $\boldsymbol\theta^{(r)}$ consists of $\boldsymbol\lambda^{(r)} = (\lambda_1^{(r)}, \ldots, \lambda_J^{(r)})'$ and $\boldsymbol\beta^{(r)}$. The baseline cumulative hazard function at time $t$ is

$$H_0(t \mid \boldsymbol\lambda^{(r)}) = \sum_{j=1}^{J} \lambda_j^{(r)} \Delta_j(t)$$

where

$$\Delta_j(t) = \begin{cases} 0 & t < a_{j-1} \\ t - a_{j-1} & a_{j-1} \le t < a_j \\ a_j - a_{j-1} & t \ge a_j \end{cases}$$

For the given covariate vector $\mathbf{x}$, the cumulative hazard function at time $t$ is

$$H(t; \mathbf{x} \mid \boldsymbol\lambda^{(r)}, \boldsymbol\beta^{(r)}) = H_0(t \mid \boldsymbol\lambda^{(r)}) \exp(\mathbf{x}' \boldsymbol\beta^{(r)})$$

and the survival function at time $t$ is

$$S(t; \mathbf{x} \mid \boldsymbol\lambda^{(r)}, \boldsymbol\beta^{(r)}) = \exp\left[-H(t; \mathbf{x} \mid \boldsymbol\lambda^{(r)}, \boldsymbol\beta^{(r)})\right]$$
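For the piecewise exponential model, the survival function for each posterior draw can be evaluated as in the following Python sketch. It is an illustration under assumptions: the first interval is taken to start at 0, the last interval is taken to be unbounded, the cut points and draws are made up, and the function names are hypothetical.

```python
import math

def piecewise_cumulative_hazard(t, cuts, hazards):
    """Baseline cumulative hazard for a piecewise constant hazard.

    cuts: interior cut points in increasing order; the first interval starts
    at 0 and the last one is unbounded (assumed). hazards: one value per interval.
    """
    if len(hazards) != len(cuts) + 1:
        raise ValueError("need one hazard per interval")
    edges = [0.0] + list(cuts) + [math.inf]
    total = 0.0
    for j, lam in enumerate(hazards):
        lo, hi = edges[j], edges[j + 1]
        exposure = max(0.0, min(t, hi) - lo)   # time spent in interval j up to t
        total += lam * exposure
    return total

def survival(t, x, beta, cuts, hazards):
    """Survival at t for covariates x: exp(-H0(t) * exp(x' beta))."""
    linear = sum(xi * bi for xi, bi in zip(x, beta))
    return math.exp(-piecewise_cumulative_hazard(t, cuts, hazards) * math.exp(linear))

def survival_chain(t, x, draws, cuts):
    """Evaluate the survival function at t for every (hazards, beta) draw."""
    return [survival(t, x, beta, cuts, hazards) for hazards, beta in draws]

cuts = [1.0, 2.0]
draws = [([0.1, 0.2, 0.3], [0.0]), ([0.1, 0.2, 0.3], [math.log(2.0)])]
surv = survival_chain(1.5, [1.0], draws, cuts)
```

The resulting list is the posterior chain of the survival probability at the chosen time and covariate vector, and it can be summarized like any other quantity of interest.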

Last updated: March 08, 2022