The ICPHREG Procedure

Computational Details

Design Matrix

The linear predictor part of a proportional hazards model is

$$\boldsymbol{\mu} = \mathbf{Z}'\boldsymbol{\beta}$$

where $\boldsymbol{\beta}$ is a vector of unknown regression coefficients and $\mathbf{Z}$ is a known design matrix. The ordering of these parameters is displayed in the "CLASS Level Information" table and in tables that display the parameter estimates of the fitted model.

When you use the PARAM=GLM option in the CLASS statement to specify an overparameterized model, some columns of $\mathbf{Z}$ can be linearly dependent on other columns. For example, when you specify a model that consists of a classification variable, the column that corresponds to any one of the levels of the classification variable is linearly dependent on the other columns of $\mathbf{Z}$. The columns of $\mathbf{Z}'\mathbf{Z}$ are checked in the order in which the model is specified for dependence on preceding columns. If a dependency is found, the parameter that corresponds to the dependent column and its standard error are set to 0 to indicate that it is not estimated. The test for linear dependence is controlled by the SINGULAR= option in the MODEL statement. You can use the ORDER= option in the CLASS statement to specify the order in which the levels of a classification variable are checked for dependencies. For full-rank parameterizations, the columns of the $\mathbf{Z}$ matrix are designed to be linearly independent.
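The ordered dependency check can be illustrated with a small sketch. This is not the procedure's implementation: it applies Gram-Schmidt orthogonalization to the columns of $\mathbf{Z}$ directly instead of working on $\mathbf{Z}'\mathbf{Z}$, and the function name `find_dependent_columns` and the tolerance argument `singular` (a loose stand-in for SINGULAR=) are hypothetical. The constant column in the example is also an assumption, included only so that the dependency among the level columns is visible.

```python
def find_dependent_columns(columns, singular=1e-9):
    """Return indices of columns that depend linearly on preceding columns.

    `columns` is a list of equal-length lists, given in model order.
    """
    basis = []      # orthogonalized independent columns found so far
    dependent = []  # indices flagged as linearly dependent
    for idx, col in enumerate(columns):
        resid = [float(x) for x in col]
        # Remove the part of this column explained by earlier columns.
        for b in basis:
            coef = sum(r * x for r, x in zip(resid, b)) / sum(x * x for x in b)
            resid = [r - coef * x for r, x in zip(resid, b)]
        resid_ss = sum(r * r for r in resid)
        col_ss = sum(x * x for x in col)
        if resid_ss <= singular * max(col_ss, 1.0):
            dependent.append(idx)   # nothing left: column is redundant
        else:
            basis.append(resid)
    return dependent


# GLM-style coding of a three-level classification variable (illustrative).
constant = [1, 1, 1, 1, 1, 1]
level_a = [1, 1, 0, 0, 0, 0]
level_b = [0, 0, 1, 1, 0, 0]
level_c = [0, 0, 0, 0, 1, 1]
dependent = find_dependent_columns([constant, level_a, level_b, level_c])
print(dependent)
```

Because the columns are examined in order, the last level is the one flagged here; listing the levels in a different order would flag a different column, which is the role that ORDER= plays in the text above.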

Initial Values

The initial values of the regression coefficients $\boldsymbol{\beta}$ are all set to 0.

For the piecewise constant model, the initial values of the hazard parameters are set equal to the exponential rate that is estimated from an imputed data set. The data set is obtained by imputing a middle point for the interval-censored and left-censored observations while retaining the right-censored and exact observations. For the cubic spline model, the first spline coefficient, $\gamma_0$, is set to the log of the exponential rate estimated from the same imputed data set, and the second spline coefficient, $\gamma_1$, is set to 1. The remaining spline coefficients, if there are any, are set to 0.
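A minimal sketch of this initialization follows. Several details are assumptions rather than documented behavior: observations are represented as `(left, right)` pairs with `right=None` for right-censored rows, left-censored rows are assumed to have a left boundary of 0, and the exponential rate is taken to be the usual estimate (number of events divided by total time). The function names are hypothetical.

```python
import math


def impute_midpoints(observations):
    """Convert (left, right) pairs into (time, is_event) pairs.

    right is None  -> right-censored at `left`, retained as censored
    left == right  -> exact event time, retained
    otherwise      -> interval- or left-censored, imputed at the midpoint
    """
    imputed = []
    for left, right in observations:
        if right is None:
            imputed.append((left, False))
        elif left == right:
            imputed.append((left, True))
        else:
            imputed.append(((left + right) / 2.0, True))
    return imputed


def exponential_rate(imputed):
    """Assumed estimator: events divided by total observed time."""
    events = sum(1 for _, is_event in imputed if is_event)
    total_time = sum(time for time, _ in imputed)
    return events / total_time


observations = [(0, 4), (2, 6), (5, 5), (8, None)]
imputed = impute_midpoints(observations)
rate = exponential_rate(imputed)

# Piecewise constant model: every hazard parameter starts at `rate`.
# Cubic spline model with three coefficients (illustrative count):
gamma = [math.log(rate), 1.0, 0.0]
print(rate, gamma)
```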

Maximum Likelihood Estimation

By default, the ICPHREG procedure uses a Newton-Raphson algorithm to maximize the log-likelihood function with respect to the parameters.

Denote the set of parameters that need to be estimated as $\boldsymbol{\omega} = \{\omega_j\}$, which consists of the parameters that determine the baseline hazard function $\Lambda_0(t)$ and the regression coefficients $\boldsymbol{\beta}$. On the $r$th iteration, the algorithm updates the parameter vector $\boldsymbol{\omega}_r$ with

$$\boldsymbol{\omega}_{r+1} = \boldsymbol{\omega}_r - \mathbf{H}^{-1}\mathbf{g}$$

where $\mathbf{H}$ is the Hessian (second derivative) matrix, and $\mathbf{g}$ is the gradient (first derivative) vector of the log-likelihood function, both evaluated at the current value of the parameter vector. That is,

$$\mathbf{g} = [g_j] = \left[\frac{\partial l}{\partial \omega_j}\right]$$

and

$$\mathbf{H} = [h_{ij}] = \left[\frac{\partial^2 l}{\partial \omega_i \, \partial \omega_j}\right]$$
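The Newton-Raphson update can be seen on a deliberately simple problem. The sketch below is not the ICPHREG likelihood: it assumes a one-parameter exponential model with fully observed event times, so that the gradient and Hessian are scalars and $\mathbf{H}^{-1}\mathbf{g}$ reduces to a division. The function name, the starting value, and the tolerance are illustrative choices.

```python
def newton_raphson(gradient, hessian, start, tol=1e-10, max_iter=50):
    """Iterate omega <- omega - H^{-1} g for a single parameter."""
    omega = start
    for _ in range(max_iter):
        step = gradient(omega) / hessian(omega)  # scalar form of H^{-1} g
        omega = omega - step
        if abs(step) < tol:
            break
    return omega


# Illustrative log likelihood: l(rate) = n * log(rate) - rate * sum(times)
times = [2.0, 4.0, 5.0, 8.0]
n = len(times)
total = sum(times)


def gradient(rate):
    return n / rate - total


def hessian(rate):
    return -n / rate ** 2


estimate = newton_raphson(gradient, hessian, start=0.1)
print(estimate)  # closed-form maximizer for this toy model is n / total
```

For this toy likelihood the plain update converges only when the starting value lies between 0 and twice the maximizer, which is one reason that safeguarded variants such as ridging exist.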

The ICPHREG procedure also supports other optimization methods, such as quasi-Newton and Newton-Raphson with ridging. These methods are described in the section "Choosing an Optimization Algorithm" in Chapter 20, "Shared Concepts and Topics."

Covariance and Correlation Matrix

The estimated covariance matrix of the parameter estimator is

$$\boldsymbol{\Sigma} = -\mathbf{H}^{-1}$$

where $\mathbf{H}$ is the Hessian matrix that is evaluated using the parameter estimates on the last iteration. If some parameters in the baseline function are held fixed, they are not incorporated in $\mathbf{H}$. Rows and columns that correspond to aliased parameters are not included in $\boldsymbol{\Sigma}$.

The correlation matrix is the normalized covariance matrix. That is, if $\sigma_{ij}$ is an element of $\boldsymbol{\Sigma}$, then the corresponding element of the correlation matrix is $\sigma_{ij} / (\sigma_i \sigma_j)$, where $\sigma_i = \sqrt{\sigma_{ii}}$.
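As a small worked illustration of this normalization, the sketch below converts a covariance matrix, stored as a list of rows, into a correlation matrix. The function name and the numeric values are made up for the example.

```python
import math


def correlation_from_covariance(cov):
    """Divide each sigma_ij by sigma_i * sigma_j, with sigma_i = sqrt(sigma_ii)."""
    size = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(size)]
            for i in range(size)]


# Illustrative 2 x 2 covariance matrix (standard deviations 2 and 3).
cov = [[4.0, 1.2],
       [1.2, 9.0]]
corr = correlation_from_covariance(cov)
print(corr)
```

The diagonal of the result is 1 by construction, and the off-diagonal element here is $1.2 / (2 \times 3) = 0.2$.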

Choosing Break Points

There are no obvious ways to choose break points for parameterizing the baseline function in terms of a piecewise constant function or a cubic spline curve. For right-censored data, PROC PHREG chooses a set of points such that the resulting time intervals contain approximately equal numbers of event times. This is difficult for interval-censored data because event times are not fully observed. Friedman (1982) recommends choosing the points so that the expected number of events is comparable among the time intervals. For an interval-censored spline model, Cai and Betensky (2003) propose an ad hoc approach that uses the quantile values of the unique time points among $\{L_i, R_i, (L_i + R_i)/2,\ i = 1, \ldots, n\}$ for choosing the knot values.

Ibrahim, Chen, and Sinha (2001) propose the equally spaced quantile partition (ESQP) method for selecting break points when fitting the piecewise constant model to right-censored data. Suppose there are $Q$ break points to be determined. The ICPHREG procedure modifies this method to handle interval-censored data. First, it imputes a middle point for each observation that is not right-censored. Then, it merges these values with the observed boundary values in the input data set, except for the right-censored observations. Next, it sorts these values in increasing order.

Suppose the unique values of the sorted sequence are $u_1 < u_2 < \cdots < u_M$. First, PROC ICPHREG computes the targeted quantile for each break point as $q_j = j/(Q+1)$ $(j = 1, \ldots, Q)$. Then, it chooses the point $u_{m+1}$, where $m$ is the integer part of the product $q_j M$. If $q_j M$ is already an integer, then the chosen break point is set to $(u_m + u_{m+1})/2$. When there are no ties in the sorted sequence for right-censored data, this method is identical to the original ESQP method.

Fit Statistics

Suppose that the model contains $q$ estimated parameters and that $n$ observations are used in model fitting. The fit criteria displayed by the ICPHREG procedure are calculated as follows:

  • –2 log likelihood:

    $$-2\log(L)$$

    where $L$ is the maximized likelihood for the model.

  • Akaike’s information criterion:

    $$\mathrm{AIC} = -2\log(L) + 2q$$

  • corrected Akaike’s information criterion:

    $$\mathrm{AICC} = \mathrm{AIC} + \frac{2q(q+1)}{n - q - 1}$$

  • Bayesian information criterion:

    $$\mathrm{BIC} = -2\log(L) + q\log(n)$$
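The criteria listed above are simple functions of the maximized log likelihood, the number of estimated parameters, and the number of observations. The following sketch evaluates them for made-up inputs; the function name and the dictionary keys are illustrative and are not the labels that the procedure prints.

```python
import math


def fit_statistics(log_likelihood, q, n):
    """Compute -2 log L, AIC, AICC, and BIC from the formulas above.

    Assumes n - q - 1 > 0 so that the AICC correction is defined.
    """
    neg2ll = -2.0 * log_likelihood
    aic = neg2ll + 2 * q
    aicc = aic + 2 * q * (q + 1) / (n - q - 1)
    bic = neg2ll + q * math.log(n)
    return {"neg2ll": neg2ll, "aic": aic, "aicc": aicc, "bic": bic}


# Illustrative values: log L = -100 with 3 parameters and 50 observations.
stats = fit_statistics(log_likelihood=-100.0, q=3, n=50)
print(stats)
```

For these inputs, the AICC correction adds $24/46$ to the AIC of 206, and the BIC penalty is $3\log(50)$.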

For more information about AIC and BIC, see Akaike (1981, 1979). For a discussion of using AIC, AICC, and BIC in statistical modeling, see Simonoff (2003).

Last updated: March 08, 2022