The PHREG Procedure

Effect Selection Methods

Five effect selection methods are available. The simplest method (and the default) is SELECTION=NONE, for which PROC PHREG fits the complete model as specified in the MODEL statement. The other four methods are FORWARD for forward selection, BACKWARD for backward elimination, STEPWISE for stepwise selection, and SCORE for best subsets selection. These methods are specified with the SELECTION= option in the MODEL statement and are based on the score test or Wald test as described in the section Type 3 Tests and Joint Tests.

SELECTION=FORWARD begins with the effects that are forced into the model. These are the first n effects in the MODEL statement, where n is the number specified by the START= or INCLUDE= option in the MODEL statement (n is zero by default). The forward selection process then sequentially adds the effect that most improves the fit. The statistic that is used to determine whether to add an effect is the significance level of the score test (Kalbfleisch and Prentice 1980, Section 3.4.1), which reflects the effect’s contribution to the model if it is included. At each step, the most significant of the effects not yet in the model is added. The process is repeated until none of the remaining effects meet the specified level for entry or until the STOP= value is reached.
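
The forward selection request described above can be sketched as follows. This is a minimal illustration, not a recommended analysis: the data set Work.Trial, the time variable T, the censoring indicator Status (with 0 assumed to mean censored), and the covariates Age, Stage, and Marker1 through Marker3 are all hypothetical names, and the entry level of 0.10 is an arbitrary choice.

```sas
/* Hypothetical data set and variable names; replace with your own. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3
         / selection=forward
           include=1      /* force the first effect (Age) into every model  */
           slentry=0.10   /* score-test significance level needed for entry */
           stop=3         /* stop once the model contains three effects     */
           details;       /* print details for each selection step          */
run;
```

Because INCLUDE=1 refers to position in the MODEL statement, the effect that should always be retained has to be listed first.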

SELECTION=BACKWARD starts with the full model, which includes all the explanatory effects unless the START= option is specified. In that case, only the first n effects in the MODEL statement are included in the initial model, where n is the number specified by the START= option. The backward elimination process sequentially drops the effect that contributes least to the fit. The statistic that is used to determine whether to drop an effect is the significance level of the Wald test (Kalbfleisch and Prentice 1980, Section 3.4.2). At each step, the least significant effect is dropped, and the process continues until all effects that remain in the model are significant at the SLSTAY= level or until the STOP= value is reached.
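
A backward elimination request might look like the following sketch. The data set Work.Trial and all variable names are hypothetical, Status=0 is assumed to mark censored observations, and the SLSTAY= and STOP= values are arbitrary illustrations.

```sas
/* Hypothetical data set and variable names; replace with your own. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3
         / selection=backward
           slstay=0.05   /* Wald-test significance level needed to stay   */
           stop=2        /* do not eliminate below two effects            */
           details;      /* print details for each elimination step       */
run;
```

Without START=, elimination begins from the model that contains all five listed effects.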

SELECTION=STEPWISE modifies the forward selection technique by allowing effects already in the model to be removed. See Example 92.1 for an illustration of the stepwise selection process. The same entry and removal significance levels for the forward selection and backward elimination methods are used to assess contributions of effects as they are added to or removed from a model. If, at a step of the stepwise method, any effect in the model is not significant at the SLSTAY= level, then the least significant of these effects is removed from the model and the algorithm proceeds to the next step. This ensures that no effect can be added to a model while some effect currently in the model is not deemed significant. Another effect can be added to the model only after all necessary deletions have been accomplished. In this case the effect whose addition is the most significant is added to the model and the algorithm proceeds to the next step. The stepwise process ends when none of the effects outside the model is significant at the SLENTRY= level and every effect in the model is significant at the SLSTAY= level. In some cases, neither of these two conditions for stopping is met and the sequence of models cycles. In this case, the stepwise method terminates at the end of the cycle.
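
As a minimal sketch, a stepwise request sets both the entry and the stay levels. The data set Work.Trial and the variable names below are hypothetical, Status=0 is assumed to mark censored observations, and the two significance levels are arbitrary values chosen only to show that they can differ.

```sas
/* Hypothetical data set and variable names; replace with your own. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3
         / selection=stepwise
           slentry=0.25   /* level an outside effect must meet to enter   */
           slstay=0.15    /* level an included effect must meet to remain */
           details;       /* print the entry and removal tests per step   */
run;
```

Choosing an SLENTRY= value that is larger than SLSTAY= lets an effect enter on weaker evidence than it needs to remain, which is one situation in which an effect can be added and then removed in a later step.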

SELECTION=SCORE uses the branch-and-bound algorithm of Furnival and Wilson (1974) to find a specified number of models with the highest score (chi-square) statistic for each possible model size, from one variable up to the single model that contains all of the explanatory variables. The number of models displayed for each model size is controlled by the BEST= option. You can use the START= option to impose a minimum model size, and you can use the STOP= option to impose a maximum model size. For instance, with BEST=3, START=2, and STOP=5, the SCORE selection method displays the best three models (that is, the three models with the highest score chi-squares) for each model size of 2, 3, 4, and 5 variables. One limitation of the branch-and-bound algorithm is that it works only when each explanatory effect contains exactly one parameter. Consequently, the SELECTION=SCORE option is not allowed when an explanatory effect in the MODEL statement contains a CLASS variable.
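
The BEST=3, START=2, STOP=5 request mentioned above can be written as in the following sketch. The data set Work.Trial and the six covariates are hypothetical, Status=0 is assumed to mark censored observations, and every covariate is assumed to be a single numeric variable so that each effect contributes exactly one parameter.

```sas
/* Hypothetical data set and variable names; replace with your own.  */
/* No CLASS statement: each effect must contain exactly one parameter. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3 Marker4
         / selection=score
           best=3    /* display the three best models of each size */
           start=2   /* smallest model size to report              */
           stop=5;   /* largest model size to report               */
run;
```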

The SEQUENTIAL and STOPRES options can alter the default criteria for adding variables to or removing variables from the model when they are used with the FORWARD, BACKWARD, or STEPWISE selection method.
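
The two steps below sketch how these options are attached to a selection request. The sketch assumes the behavior given in the MODEL statement option descriptions: SEQUENTIAL considers effects in the order in which they are listed in the MODEL statement (adding in that order, or eliminating in reverse order), and STOPRES bases the add or drop decision on the residual score chi-square for the effects that are not in the model. Check those option descriptions before relying on these details. The data set Work.Trial and all variable names are hypothetical, and Status=0 is assumed to mark censored observations.

```sas
/* Hypothetical data set and variable names; replace with your own. */

/* Forward selection that considers effects in MODEL-statement order. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3
         / selection=forward sequential slentry=0.10;
run;

/* Backward elimination driven by the residual chi-square. */
proc phreg data=Work.Trial;
   model T*Status(0) = Age Stage Marker1 Marker2 Marker3
         / selection=backward stopres slstay=0.05;
run;
```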

Last updated: March 08, 2022