The ICPHREG Procedure

BASELINE Statement

  • BASELINE <OUT=SAS-data-set> <COVARIATES=SAS-data-set> <TIMELIST=list> <keyword=name …keyword=name> </ options>;

The BASELINE statement creates a SAS data set (named by the OUT= option) that contains the predicted values at specified times that partition the time axis for every set of covariates in the COVARIATES= data set. If the COVARIATES= data set is not specified, PROC ICPHREG uses a reference set of covariates that consists of the reference levels for the CLASS variables and the average values for the continuous variables.
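As a minimal sketch of how these pieces fit together, the following steps build a small COVARIATES= data set and request predicted survival estimates for each of its rows. The data set names (Study, Cov, Pred), the interval-boundary variables (Left, Right), and the covariates (Stage, Dose) are hypothetical and stand in for your own data.

```sas
/* Hypothetical covariate sets: one row per profile to predict for.   */
/* The variable names must match the explanatory variables in MODEL.  */
data Cov;
   input Stage Dose;
   datalines;
0 10
1 10
;

proc icphreg data=Study;              /* Study is an assumed input data set */
   class Stage;
   model (Left, Right) = Stage Dose;  /* interval-censored response         */
   /* One set of predictions per row of Cov, written to Pred */
   baseline out=Pred covariates=Cov
            survival=S lower=S_Lower upper=S_Upper;
run;
```

If the COVARIATES= option were omitted here, the predictions would instead be computed for the reference covariate set described above.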

Table 3 summarizes the options that you can specify in the BASELINE statement.

Table 3: BASELINE Statement Options

Option          Description

Data Set and Time List Options
OUT=            Specifies the output BASELINE data set
COVARIATES=     Specifies the SAS data set that contains the explanatory variables
TIMELIST=       Specifies a list of time points for computing the predicted values

Keyword Options for Variables
CUMHAZ=         Specifies the cumulative hazard function estimate
HAZARD=         Specifies the hazard function estimate
LOGLOGS=        Specifies the log of the negative log of the survival function
LOGSURV=        Specifies the log of the survival function
LOWERCUMHAZ=    Specifies the lower pointwise confidence limit for the cumulative hazard function
LOWER=          Specifies the lower pointwise confidence limit for the survival function
STDCUMHAZ=      Specifies the estimated standard error of the cumulative hazard function
STDERR=         Specifies the standard error of the survival function
STDXBETA=       Specifies the estimated standard error of the linear predictor estimator
SURVIVAL=       Specifies the survival function estimate
UPPERCUMHAZ=    Specifies the upper pointwise confidence limit for the cumulative hazard function
UPPER=          Specifies the upper pointwise confidence limit for the survival function
XBETA=          Specifies the estimate of the linear predictor $\mathbf{z}'\boldsymbol{\beta}$

Other Options
ALPHA=          Specifies the level of the confidence interval for the survival function
CLTYPE=         Specifies the transformation that is used to compute confidence limits for the survival function
GROUP=          Names a variable whose values identify or group predicted survival or cumulative hazard functions in plots
ROWID=          Names the variable in the COVARIATES= data set for identifying the predicted survival or cumulative hazard functions in plots


You can specify the following options in the BASELINE statement.

OUT=SAS-data-set

names the output data set. If you omit the OUT= option, the data set is created and given a default name by using the DATAn convention. For more information, see the section OUT= Output Data Set in the BASELINE Statement.

COVARIATES=SAS-data-set

names the SAS data set that contains the sets of explanatory variable values for which the functions of interest are estimated. All variables in the COVARIATES= data set are copied to the OUT= data set. Thus, any variable in the COVARIATES= data set can be used to identify the covariate sets in the OUT= data set.

TIMELIST=list

specifies a list of time points at which the predicted values are computed. The following specifications are equivalent:

   timelist=5,20 to 50 by 10
   timelist=5 20 30 40 50

If you do not specify this option, predicted values are computed at all the times that partition the time axis.
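For illustration, a TIMELIST= specification sits inside a full BASELINE statement as sketched below. The data set and variable names (Study, Left, Right, Dose, PredAt, S) are hypothetical.

```sas
proc icphreg data=Study;              /* Study is an assumed input data set */
   model (Left, Right) = Dose;
   /* Predicted survival at times 5, 20, 30, 40, and 50 only */
   baseline out=PredAt timelist=5,20 to 50 by 10 survival=S;
run;
```

Dropping the TIMELIST= option from this statement would produce predictions at every time that partitions the time axis instead.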

keyword=name

specifies the statistics to be included in the OUT= data set and assigns names to the variables that contain these statistics. Specify a keyword for each desired statistic, an equal sign, and the name of the variable for the statistic. You can specify the following keywords:

CUMHAZ=name

specifies the cumulative hazard function estimate. Specifying CUMHAZ=_ALL_ is equivalent to specifying CUMHAZ=CumHaz, STDCUMHAZ=StdErrCumHaz, LOWERCUMHAZ=LowerCumHaz, and UPPERCUMHAZ=UpperCumHaz.

HAZARD=name

specifies the hazard function estimate.

LOGLOGS=name
CLOGLOGS=name

specifies the log of the negative log of the estimated survival function.

LOGSURV=name

specifies the log of the estimated survival function.

LOWER=name
L=name
LOWERSDF=name

specifies the lower pointwise confidence limit for the survival function. The confidence level is determined by the ALPHA= option.

LOWERCUMHAZ=name

specifies the lower pointwise confidence limit for the cumulative hazard function. The confidence level is determined by the ALPHA= option.

STDERR=name
STDSDF=name

specifies the standard error of the survival function estimator.

STDCUMHAZ=name

specifies the estimated standard error of the cumulative hazard function estimator.

STDXBETA=name

specifies the estimated standard error of the linear predictor estimator.

SURVIVAL=name
SDF=name

specifies the estimated survival function, $S(t) = [S_0(t)]^{\exp(\mathbf{z}'\boldsymbol{\beta})}$. Specifying SURVIVAL=_ALL_ is equivalent to specifying SURVIVAL=Survival, STDERR=StdErrSurvival, LOWER=LowerSurvival, and UPPER=UpperSurvival.

UPPER=name
U=name
UPPERSDF=name

specifies the upper pointwise confidence limit for the survival function. The confidence level is determined by the ALPHA= option.

UPPERCUMHAZ=name

specifies the upper pointwise confidence limit for the cumulative hazard function. The confidence level is determined by the ALPHA= option.

XBETA=name

specifies the estimate of the linear predictor $\mathbf{z}'\boldsymbol{\beta}$. If there is an offset, it is added to the predictor.
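The keyword=name options above can be combined freely in one statement. The sketch below contrasts the _ALL_ shortcuts with their documented expanded forms; the data set and variable names (Study, Left, Right, Dose, PredAll, PredNamed, LinPred, LinPredSE) are hypothetical.

```sas
proc icphreg data=Study;              /* Study is an assumed input data set */
   model (Left, Right) = Dose;
   /* Shortcut form: default names for estimate, standard error, and limits */
   baseline out=PredAll survival=_all_ cumhaz=_all_
            xbeta=LinPred stdxbeta=LinPredSE;
run;

proc icphreg data=Study;
   model (Left, Right) = Dose;
   /* Expanded form, using the names that the _ALL_ shortcuts assign */
   baseline out=PredNamed
            survival=Survival stderr=StdErrSurvival
            lower=LowerSurvival upper=UpperSurvival
            cumhaz=CumHaz stdcumhaz=StdErrCumHaz
            lowercumhaz=LowerCumHaz uppercumhaz=UpperCumHaz
            xbeta=LinPred stdxbeta=LinPredSE;
run;
```

The expanded form is useful mainly when you want to rename only some of the output variables.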

You can specify the following options after a slash (/).

ALPHA=value

specifies the significance level $\alpha$ for the $100(1-\alpha)\%$ pointwise confidence limits for the survival and cumulative hazard functions. The value must be between 0 and 1. The default is the value of the ALPHA= option in the PROC ICPHREG statement, or 0.05 (95% confidence limits) if that option is not specified.

CLTYPE=method
CITYPE=method
TYPE=method
CLTRANSFORM=method
TRANSFORM=method

specifies the transformation that is used to compute the confidence limits for $S(t, \mathbf{z})$, which is the survival function for a subject that has a fixed covariate vector $\mathbf{z}$ at event time $t$. You can specify the following methods:

LOG

uses normal theory approximation to compute the confidence limits for $\log(S(t, \mathbf{z}))$. The confidence limits for $S(t, \mathbf{z})$ are obtained by back-transforming the confidence limits for $\log(S(t, \mathbf{z}))$.

LOGLOG

uses normal theory approximation to compute the confidence limits for $\log(-\log(S(t, \mathbf{z})))$. The confidence limits for $S(t, \mathbf{z})$ are obtained by back-transforming the confidence limits for $\log(-\log(S(t, \mathbf{z})))$.

NORMAL
IDENTITY
LINEAR
PLAIN
DIRECT

uses normal theory approximation to compute the confidence limits for $S(t, \mathbf{z})$.

By default, CLTYPE=LOG.
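To make the three methods concrete, the following sketch writes out the usual delta-method form of each interval. It assumes that $\hat{S} = \hat{S}(t, \mathbf{z})$ is the survival estimate, $\hat{\sigma}$ is its estimated standard error, and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution. These are the textbook forms and are offered as an illustration; the procedure's exact computation may differ in detail.

```latex
\begin{aligned}
\text{NORMAL:} \quad & \hat{S} \pm z_{\alpha/2}\,\hat{\sigma} \\[4pt]
\text{LOG:}    \quad & \hat{S}\,\exp\!\left( \pm z_{\alpha/2}\,\frac{\hat{\sigma}}{\hat{S}} \right) \\[4pt]
\text{LOGLOG:} \quad & \hat{S}^{\,\exp\left( \pm z_{\alpha/2}\,\hat{\tau} \right)},
\qquad \hat{\tau} = \frac{\hat{\sigma}}{\hat{S}\,\lvert \log \hat{S} \rvert}
\end{aligned}
```

In the LOGLOG form the plus sign gives the lower limit, because $\hat{S} < 1$ shrinks when raised to a larger power. Unlike NORMAL, the LOGLOG limits always lie within $[0, 1]$, and the LOG limits are always nonnegative.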

GROUP=variable

names a variable whose values identify or group the predicted curves. The variable must be a numeric variable in the COVARIATES= data set. Survival curves for observations that have the same value of the variable are overlaid in the same plot.

ROWID=variable
ID=variable
ROW=variable

names a variable in the COVARIATES= data set for identifying plotted survival functions and cumulative hazard functions. This option has no effect if the PLOTS= option in the PROC ICPHREG statement is not specified. Values of this variable are used to label the plotted functions for the corresponding rows in the COVARIATES= data set. You can specify ROWID=_OBS_ to use the observation numbers in the COVARIATES= data set for identification.
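The sketch below shows one way that ROWID= might be used to label plotted curves. It assumes that ODS Graphics is enabled and that PLOTS=SURVIVAL is the desired plot request; the data set and variable names (Study, Left, Right, Dose, Cov, Label, Pred, S) are hypothetical.

```sas
/* Hypothetical covariate sets with a character label for each row */
data Cov;
   input Dose Label :$12.;
   datalines;
10 LowDose
40 HighDose
;

ods graphics on;

proc icphreg data=Study plots=survival;   /* ROWID= has no effect without PLOTS= */
   model (Left, Right) = Dose;
   /* Label each plotted survival curve with the value of Label */
   baseline out=Pred covariates=Cov survival=S / rowid=Label;
run;
```

Replacing rowid=Label with rowid=_OBS_ would label the curves by observation number in Cov instead.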

Last updated: March 08, 2022