The MODEL statement identifies the variables to be used as the failure time variables, the optional censoring variable, and the explanatory effects, including covariates, main effects, and interactions. For more information about explanatory effects, see the section Specification of Effects in Chapter 53, The GLM Procedure. A note of caution: specifying the effect T*A in the MODEL statement, where T is the time variable and A is a CLASS variable, does not make the effect time-dependent.
You must specify exactly one MODEL statement. Specify either of two forms of MODEL syntax: the first form allows one time variable, and the second form allows two time variables for the counting process style of input. For more information on the counting process style of input, see the section Counting Process Style of Input.
For the first form of the MODEL statement, the name of the failure time variable (response) precedes the equal sign. This variable can optionally be followed by an asterisk, the name of the censoring variable, and a list of censoring values (separated by blanks or commas if there is more than one) enclosed in parentheses. If the censoring variable takes on one of these values, the corresponding failure time is considered to be censored. The variables following the equal sign (effects) are the explanatory variables (sometimes called independent variables or covariates) for the model.
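As a minimal sketch of the first form, the step below fits a model with one failure time variable. The data set `work.survey` and the variables `stratum`, `psu`, `samplewt`, `time`, `status`, `trt`, and `age` are hypothetical names chosen for illustration; they do not come from this documentation.

```sas
/* Hypothetical data set and variable names; adjust them to your design. */
proc surveyphreg data=work.survey;
   strata stratum;       /* design strata                      */
   cluster psu;          /* primary sampling units             */
   weight samplewt;      /* sampling weights                   */
   class trt;            /* trt is treated as a CLASS variable */
   /* time is the failure time variable; an observation is     */
   /* treated as censored when status equals 0 or 2            */
   model time*status(0, 2) = trt age;
run;
```

If only one value marks censoring, the list in parentheses shrinks to that single value, for example `status(0)`.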
Instead of using a single failure time variable, the second form of the MODEL statement identifies a pair of failure time variables. Their names are enclosed in parentheses, and they signify the endpoints of a semiclosed interval (t1, t2] during which the subject is at risk. If the censoring variable takes on one of the censoring values, the time t2 is considered to be censored.
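The second form might look like the following sketch. The data set `work.survey_cp` and the variables `tstart`, `tstop`, `status`, `samplewt`, and `age` are assumed names used only for illustration. Each input record is assumed to describe one at-risk interval for a subject.

```sas
/* Hypothetical counting process style input: one record per interval. */
proc surveyphreg data=work.survey_cp;
   weight samplewt;
   /* The subject is at risk between tstart and tstop; the time tstop  */
   /* is treated as censored when status equals 0                      */
   model (tstart, tstop)*status(0) = age;
run;
```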
The censoring variable must be numeric. The failure time variable must contain nonnegative values. Any observation that has a negative failure time is excluded from the analysis, as is any observation that has a missing value for any of the variables listed in the MODEL statement. For more information, see the section Missing Values. Failure time variables that have a SAS date format are not recommended because the dates might be translated into negative numbers and consequently the corresponding observation would be discarded.
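One way to avoid date-valued failure times is to convert them to durations before you fit the model. SAS stores a date as the number of days since January 1, 1960, so any earlier date is a negative number. The sketch below assumes hypothetical date variables `entry_date` and `exit_date` in a hypothetical data set `work.survey`.

```sas
/* Hypothetical variable names; the duration is measured in days. */
data work.survey_days;
   set work.survey;
   /* The difference of two SAS dates is a count of days. It is    */
   /* nonnegative whenever exit_date is on or after entry_date.    */
   followup = exit_date - entry_date;
run;

proc surveyphreg data=work.survey_days;
   weight samplewt;
   model followup*status(0) = age;
run;
```

If either date is missing, `followup` is missing too, and the observation is excluded under the missing value rule described above.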
Table 8 summarizes the options that you can specify in the MODEL statement after a slash (/).
Table 8: MODEL Statement Options
| Option | Description |
|---|---|
| ALPHA= | Specifies α for the confidence limits |
| CLPARM | Computes confidence limits for regression parameters |
| COVB | Displays covariance matrix |
| DF= | Specifies the denominator degrees of freedom |
| ENTRYTIME= | Specifies the delayed entry time variable |
| FIRTH | Specifies Firth’s penalized likelihood method |
| HESS | Displays the Hessian matrix |
| INVHESS | Displays the inverse of the Hessian matrix |
| RISKLIMITS | Computes confidence limits for the exponentials of the regression parameters |
| SERATIO= | Computes the ratio of two standard errors for the regression coefficients |
| SINGULAR= | Specifies tolerance for testing singularity |
| TIES= | Specifies the method of handling ties in failure times |
| VADJUST= | Specifies a variance adjustment factor |
| VARRATIO= | Computes the ratio of two variances for the regression coefficients |
-
ALPHA=α
-
sets the level of the confidence limits for the estimated regression parameters and the hazard ratios. The value of α must be between 0 and 1, and the default is 0.05. A confidence level of α produces 100(1−α)% confidence limits. The default of ALPHA=0.05 produces 95% confidence limits.
The ALPHA= option has no effect unless you also specify the CLPARM or RISKLIMITS option.
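For illustration, the following sketch requests 90% confidence limits for both the regression parameters and the hazard ratios. The data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variable names. */
proc surveyphreg data=work.survey;
   weight samplewt;
   /* ALPHA=0.1 gives 100(1 - 0.1)% = 90% limits. CLPARM and        */
   /* RISKLIMITS are the options that make ALPHA= take effect.      */
   model time*status(0) = age / clparm risklimits alpha=0.1;
run;
```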
-
CLPARM
produces confidence limits for regression parameters of Cox proportional hazards models. You can specify the confidence coefficient by using the ALPHA= option. Classification main effects that use parameterizations other than REF, EFFECT, or GLM are ignored. For more information, see the section Confidence Intervals.
-
COVB
displays the estimated covariance matrix of the parameter estimates.
-
DF=value | keyword <(value)>
-
specifies the denominator degrees of freedom for hypothesis tests, specifies the degrees of freedom for confidence limits, and requests adjustments to the Wald test statistics. If you specify a value, it must be a nonnegative number.
In the description that follows, d denotes the usual degrees of freedom computed from the survey data by using the number of strata, clusters, or replicate weights. For more information, see the section Degrees of Freedom.
By default, DF=PARMADJ when you use the Taylor series linearized variance estimator, and DF=DESIGN when you use the replication variance estimator. Alternatively, you can specify a nonnegative value for the degrees of freedom, or you can specify one of the following keywords:
-
ALLREPS
computes the denominator degrees of freedom for replication methods by using the total number of replicate samples. By default, PROC SURVEYPHREG computes the denominator degrees of freedom based on the number of replicate samples that are used. Some replicate samples might not be usable, in the sense that they cannot be used for variance estimation because of factors such as inestimability or nonconvergence. These replicate samples are not accounted for in the denominator degrees of freedom unless you specify DF=ALLREPS. For more information, see the section Degrees of Freedom.
-
DESIGN
computes the denominator degrees of freedom as d. When you specify DF=DESIGN, the corresponding Wald F statistics do not account for the number of parameters in the model. This option is useful if you do not want to apply the adjustment described in Korn and Graubard (1999, p. 93). For more information, see the section Testing the Global Null Hypothesis.
-
DESIGN (value)
computes the denominator degrees of freedom as value. When you specify DF=DESIGN (value), the corresponding Wald F statistics do not account for the number of parameters in the model. This option is useful if you do not want to apply the adjustment described in Korn and Graubard (1999, p. 93) and you want to specify the denominator degrees of freedom. You might want to specify a denominator degrees of freedom other than d for reasons such as missing values or domain estimation for relatively small domains. For more information, see the section Testing the Global Null Hypothesis.
-
DESIGNADJ
computes the denominator degrees of freedom as d. When you specify DF=DESIGNADJ, the corresponding Wald F statistics account for the number of parameters in the model. This option is useful if you are fitting a model that has many parameters relative to d but you want to use d as the denominator degrees of freedom. For more information, see the section Testing the Global Null Hypothesis.
-
NONE
specifies the denominator degrees of freedom to be infinite. This option is useful if you want to compute chi-square tests and normal confidence intervals. For more information, see the section Testing the Global Null Hypothesis.
-
PARMADJ
computes the denominator degrees of freedom as d minus the number of nonsingular parameters plus 1. When you specify DF=PARMADJ, the corresponding Wald F statistics account for the number of parameters in the model. This option is useful if you are fitting a model that has many parameters relative to d. For more information, see the section Testing the Global Null Hypothesis.
-
PARMADJ (value)
computes the denominator degrees of freedom as value. When you specify DF=PARMADJ (value), the corresponding Wald F statistics account for the number of parameters in the model. This option is useful if you are fitting a model that has many parameters relative to d and you want to specify the denominator degrees of freedom. You might want to specify the denominator degrees of freedom for reasons such as missing values or domain estimation for relatively small domains. For more information, see the section Testing the Global Null Hypothesis.
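The two steps below sketch how the DF= keywords might be used. All data set and variable names are hypothetical, and the numbers in the comments are made-up values chosen only to show the arithmetic. Because a step accepts exactly one MODEL statement, each variant is shown in its own step.

```sas
/* Variant 1: DF=PARMADJ. As an illustration, if d = 30 and the model */
/* has 3 nonsingular parameters, the denominator degrees of freedom   */
/* are 30 - 3 + 1 = 28.                                               */
proc surveyphreg data=work.survey;
   strata stratum;
   cluster psu;
   weight samplewt;
   model time*status(0) = age bmi smoker / df=parmadj;
run;

/* Variant 2: DF=DESIGN (value). The value 25 is an arbitrary example */
/* of user-supplied denominator degrees of freedom. The Wald F        */
/* statistics are not adjusted for the number of parameters.          */
proc surveyphreg data=work.survey;
   strata stratum;
   cluster psu;
   weight samplewt;
   model time*status(0) = age bmi smoker / df=design (25);
run;
```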
-
ENTRYTIME=variable
ENTRY=variable
specifies the name of the variable that represents the left-truncation time. This option has no effect when the counting process style of input is specified. For more information, see the section Left-Truncation of Failure Times.
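A minimal sketch of delayed entry with the first form of the MODEL statement follows. The variable `enroll_time` is a hypothetical left-truncation time, assumed to be measured on the same scale as `time`. The other names are also hypothetical.

```sas
/* Hypothetical data set and variable names. */
proc surveyphreg data=work.survey;
   weight samplewt;
   /* enroll_time is the time at which the subject came under       */
   /* observation                                                   */
   model time*status(0) = age / entrytime=enroll_time;
run;
```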
-
FIRTH
performs Firth’s penalized maximum likelihood estimation (Mukhopadhyay 2020; Heinze and Schemper 2001; Firth 1993). This method is useful when the likelihood is monotone; that is, the likelihood converges to a finite value while at least one parameter estimate diverges to infinity. This option is available only for the Breslow likelihood. When you specify this option, the likelihood ratio statistics are computed using the unadjusted likelihoods, and only the Wald test for the overall null hypothesis is available. For more information, see the section Firth’s Modification for Maximum Likelihood Estimation.
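As a sketch, a request for Firth's method might look like the following. The data set and variable names are hypothetical. TIES=BRESLOW is written out only to make the Breslow requirement visible, because it is already the default.

```sas
/* Hypothetical data set and variable names. */
proc surveyphreg data=work.survey;
   weight samplewt;
   class trt;
   /* FIRTH is available only with the Breslow likelihood */
   model time*status(0) = trt / firth ties=breslow;
run;
```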
-
HESS
displays the last evaluation of the Hessian matrix.
-
INVHESS
displays the inverse of the Hessian matrix that is evaluated at the estimated regression parameters.
-
RISKLIMITS
RL
produces confidence limits for hazard ratios and related quantities. For more information, see the section Hazard Ratios. You can specify the confidence coefficient by using the ALPHA= option. You must take great care with any interpretation of the estimates and their confidence limits if interaction effects are involved in the model or if parameterizations other than REF, EFFECT, or GLM are used.
-
SERATIO=ALL | MODEL | IND | SRSWOR | SRSWR
-
computes the ratio of two standard errors for the regression parameters. The standard error in the numerator uses the complete design information that you specify. You can specify the following options to compute different standard errors for the denominator:
-
ALL
requests IND, MODEL, and either SRSWR or SRSWOR standard error ratios. If you specify the RATE= or the TOTAL= option in the PROC SURVEYPHREG statement, then SRSWOR standard error ratios are computed; otherwise, SRSWR standard error ratios are computed.
-
IND
computes the standard errors in the denominator by ignoring stratification and clustering. For more information, see the section Variance Ratios and Standard Error Ratios.
-
MODEL
computes the standard errors in the denominator as the square root of the diagonals of the inverse Hessian matrix evaluated at the estimated regression parameters. For more information, see the section Variance Ratios and Standard Error Ratios.
-
SRSWOR
computes the standard errors in the denominator as the square root of the diagonals of a scaled inverse Hessian matrix evaluated at the estimated regression parameters. If you specify the RATE= or the TOTAL= option in the PROC SURVEYPHREG statement, then the scaling factor also includes the sampling fractions. For more information, see the section Variance Ratios and Standard Error Ratios.
-
SRSWR
computes the standard errors in the denominator as the square root of the diagonals of a scaled inverse Hessian matrix evaluated at the estimated regression parameters. For more information, see the section Variance Ratios and Standard Error Ratios.
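The following sketch requests all standard error ratios. The data set, the variable names, and the population total of 5000 are hypothetical values chosen for illustration. Because TOTAL= is specified in the PROC SURVEYPHREG statement, SERATIO=ALL would produce the IND, MODEL, and SRSWOR ratios.

```sas
/* Hypothetical data set, variable names, and population total. */
proc surveyphreg data=work.survey total=5000;
   strata stratum;
   cluster psu;
   weight samplewt;
   /* The numerator uses the full design: strata, clusters, weights */
   model time*status(0) = age / seratio=all;
run;
```

Without RATE= or TOTAL=, the same request would produce SRSWR ratios in place of SRSWOR ratios.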
-
SINGULAR=value
specifies the singularity criterion for determining linear dependencies in the set of explanatory variables. The default value is
.
-
TIES=method
-
specifies how to handle ties in the failure time. You can specify the following methods:
-
BRESLOW
uses the approximate partial likelihood of Breslow (1974).
-
EFRON
uses the approximate partial likelihood of Efron (1977).
If there are no ties, both methods result in the same likelihood and yield identical estimates. By default, TIES=BRESLOW, which is the most efficient method when there are no ties.
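To switch from the default to Efron's approximation, a step such as the following sketch could be used. The data set and variable names are hypothetical.

```sas
/* Hypothetical data set and variable names. */
proc surveyphreg data=work.survey;
   weight samplewt;
   /* Use the Efron approximation when failure times are tied */
   model time*status(0) = age / ties=efron;
run;
```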
-
VADJUST=DF | PARMADJ | NONE | AVGREPSS
-
specifies variance adjustment factors. You can specify the following keywords:
-
VARRATIO=ALL | MODEL | IND | SRSWOR | SRSWR
-
computes the ratio of two variances for the regression parameters. The variance in the numerator uses the complete design information. You can specify the following options to compute different variances for the denominator:
-
ALL
requests IND, MODEL, and either SRSWR or SRSWOR variance ratios. If you specify the RATE= or the TOTAL= option in the PROC SURVEYPHREG statement, then SRSWOR variance ratios are computed; otherwise, SRSWR variance ratios are computed.
-
IND
computes the variances in the denominator by ignoring stratification and clustering. For more information, see the section Variance Ratios and Standard Error Ratios.
-
MODEL
computes the variances in the denominator as the diagonals of the inverse Hessian matrix evaluated at the estimated regression parameters. For more information, see the section Variance Ratios and Standard Error Ratios.
-
SRSWOR
computes the variances in the denominator as the diagonals of a scaled inverse Hessian matrix evaluated at the estimated regression parameters. If you specify the RATE= or the TOTAL= option in the PROC SURVEYPHREG statement, then the scaling factor also includes the sampling fractions. For more information, see the section Variance Ratios and Standard Error Ratios.
-
SRSWR
computes the variances in the denominator as the diagonals of a scaled inverse Hessian matrix evaluated at the estimated regression parameters. For more information, see the section Variance Ratios and Standard Error Ratios.
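A variance ratio request follows the same pattern as a standard error ratio request. The sketch below uses hypothetical data set and variable names and asks for the ratio whose denominator ignores stratification and clustering.

```sas
/* Hypothetical data set and variable names. */
proc surveyphreg data=work.survey;
   strata stratum;
   cluster psu;
   weight samplewt;
   /* Numerator: variance under the full design.                    */
   /* Denominator: variance ignoring stratification and clustering. */
   model time*status(0) = age / varratio=ind;
run;
```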