The LOGISTIC Procedure

Conditional Logistic Regression

The method of maximum likelihood described in the preceding sections relies on large-sample asymptotic normality for the validity of estimates and especially of their standard errors. When you do not have a large sample size compared to the number of parameters, this approach might be inappropriate and might result in biased inferences. This situation typically arises when your data are stratified and you fit intercepts to each stratum so that the number of parameters is of the same order as the sample size. For example, in a 1:1 matched pairs study with $n$ pairs and $p$ covariates, you would estimate $n-1$ intercept parameters and $p$ slope parameters. Taking the stratification into account by "conditioning out" (and not estimating) the stratum-specific intercepts gives consistent and asymptotically normal MLEs for the slope coefficients. See Breslow and Day (1980) and Stokes, Davis, and Koch (2012) for more information. If your nuisance parameters are not just stratum-specific intercepts, you can perform an exact conditional logistic regression.

Computational Details

For each stratum $h$, $h=1,\ldots,H$, number the observations as $i=1,\ldots,n_h$ so that $hi$ indexes the $i$th observation in stratum $h$. Denote the $p$ covariates for the $hi$th observation as $\mathbf{x}_{hi}$ and its binary response as $y_{hi}$, and let $\mathbf{y}=(y_{11},\ldots,y_{1n_1},\ldots,y_{H1},\ldots,y_{Hn_H})'$, $\mathbf{X}_h=(\mathbf{x}_{h1}\ \ldots\ \mathbf{x}_{hn_h})'$, and $\mathbf{X}=(\mathbf{X}_1'\ \ldots\ \mathbf{X}_H')'$. Let the dummy variables $z_h$, $h=1,\ldots,H$, be indicator functions for the strata ($z_h=1$ if the observation is in stratum $h$), and denote $\mathbf{z}_{hi}=(z_1,\ldots,z_H)$ for the $hi$th observation, $\mathbf{Z}_h=(\mathbf{z}_{h1}\ \ldots\ \mathbf{z}_{hn_h})'$, and $\mathbf{Z}=(\mathbf{Z}_1'\ \ldots\ \mathbf{Z}_H')'$. Denote $\mathbf{X}^*=(\mathbf{Z}\mid\mathbf{X})$ and $\mathbf{x}^*_{hi}=(\mathbf{z}_{hi}'\mid\mathbf{x}_{hi}')'$. Arrange the observations in each stratum $h$ so that $y_{hi}=1$ for $i=1,\ldots,m_h$, and $y_{hi}=0$ for $i=m_h+1,\ldots,n_h$. Suppose all observations have unit frequency.

Consider the binary logistic regression model written as

$$\operatorname{logit}(\boldsymbol{\pi}) = \mathbf{X}^{*}\boldsymbol{\theta}$$

where the parameter vector $\boldsymbol{\theta}=(\boldsymbol{\alpha}',\boldsymbol{\beta}')'$ consists of $\boldsymbol{\alpha}=(\alpha_1,\ldots,\alpha_H)'$, $\alpha_h$ is the intercept for stratum $h$, $h=1,\ldots,H$, and $\boldsymbol{\beta}$ is the parameter vector for the $p$ covariates.

From the section Determining Observations for Likelihood Contributions, you can write the likelihood contribution of observation $hi$, $i=1,\ldots,n_h$, $h=1,\ldots,H$, as

$$L_{hi}(\boldsymbol{\theta}) = \frac{e^{y_{hi}\,\mathbf{x}^{*\prime}_{hi}\boldsymbol{\theta}}}{1+e^{\mathbf{x}^{*\prime}_{hi}\boldsymbol{\theta}}}$$

where $y_{hi}=1$ when the response takes Ordered Value 1, and $y_{hi}=0$ otherwise.

The full likelihood is

$$L(\boldsymbol{\theta}) = \prod_{h=1}^{H}\prod_{i=1}^{n_h} L_{hi}(\boldsymbol{\theta}) = \frac{e^{\mathbf{y}'\mathbf{X}^{*}\boldsymbol{\theta}}}{\prod_{h=1}^{H}\prod_{i=1}^{n_h}\left(1+e^{\mathbf{x}^{*\prime}_{hi}\boldsymbol{\theta}}\right)}$$

Unconditional likelihood inference is based on maximizing this likelihood function.
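As a minimal sketch of what the unconditional likelihood involves, the following Python function evaluates the log of the full likelihood for given stratum intercepts and slopes. The function name and the data layout (a list of strata, each a list of `(y, x)` pairs) are assumptions made for this illustration only; they are not part of PROC LOGISTIC.

```python
import math

def full_log_likelihood(strata, alpha, beta):
    """Log of the full (unconditional) likelihood L(theta), theta = (alpha, beta).

    strata: list of H strata; each stratum is a list of (y, x) pairs, where
            y is 0 or 1 and x is a list of p covariate values (assumed layout).
    alpha:  list of H stratum-specific intercepts.
    beta:   list of p slope parameters.
    """
    total = 0.0
    for alpha_h, stratum in zip(alpha, strata):
        for y, x in stratum:
            # Linear predictor x*' theta = alpha_h + x' beta
            eta = alpha_h + sum(b * v for b, v in zip(beta, x))
            # log( e^{y * eta} / (1 + e^{eta}) )
            total += y * eta - math.log1p(math.exp(eta))
    return total
```

Note that the function takes one intercept per stratum, so the number of free parameters grows with the number of strata; this is the situation that motivates the conditional approach.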

When your nuisance parameters are the stratum-specific intercepts $(\alpha_1,\ldots,\alpha_H)'$, and the slopes $\boldsymbol{\beta}$ are your parameters of interest, "conditioning out" the nuisance parameters produces the conditional likelihood (Lachin 2000)

$$L(\boldsymbol{\beta}) = \prod_{h=1}^{H} L_h(\boldsymbol{\beta}) = \prod_{h=1}^{H} \frac{\prod_{i=1}^{m_h}\exp(\mathbf{x}_{hi}'\boldsymbol{\beta})}{\sum\prod_{j=j_1}^{j_{m_h}}\exp(\mathbf{x}_{hj}'\boldsymbol{\beta})}$$

where the summation is over all $\binom{n_h}{m_h}$ subsets $\{j_1,\ldots,j_{m_h}\}$ of $m_h$ observations chosen from the $n_h$ observations in stratum $h$. Note that the nuisance parameters have been factored out of this equation.
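The conditional likelihood can be evaluated directly from its definition by enumerating every subset in the denominator. The sketch below does exactly that; it is only practical for small strata, and the function names and the convention that the first `m` rows of each stratum are the events (matching the ordering assumed earlier in this section) are choices made for this illustration.

```python
import math
from itertools import combinations

def stratum_conditional_likelihood(xs, m, beta):
    """L_h(beta) for one stratum by brute-force enumeration.

    xs:   list of n_h covariate vectors; the first m rows are the events.
    m:    number of events m_h in the stratum.
    beta: list of p slope parameters.
    """
    # u[i] = exp(x_hi' beta)
    u = [math.exp(sum(b * v for b, v in zip(beta, x))) for x in xs]
    numerator = math.prod(u[:m])
    # Sum over all C(n_h, m_h) subsets of size m of the product of u over the subset
    denominator = sum(math.prod(u[j] for j in subset)
                      for subset in combinations(range(len(u)), m))
    return numerator / denominator

def conditional_likelihood(strata, beta):
    """L(beta): product of the stratum contributions; strata is a list of (xs, m)."""
    return math.prod(stratum_conditional_likelihood(xs, m, beta) for xs, m in strata)
```

For a 1:1 matched pair the stratum contribution reduces to $1/(1+\exp(-(\mathbf{x}_{h1}-\mathbf{x}_{h2})'\boldsymbol{\beta}))$, which does not involve any intercept.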

For conditional asymptotic inference, maximum likelihood estimates $\hat{\boldsymbol{\beta}}$ of the regression parameters are obtained by maximizing the conditional likelihood, and asymptotic results are applied to the conditional likelihood function and the maximum likelihood estimators. A relatively fast method of computing this conditional likelihood and its derivatives is given by Gail, Lubin, and Rubinstein (1981) and Howard (1972). The optimization techniques can be controlled by specifying the NLOPTIONS statement.
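The denominator of each stratum's conditional likelihood is a sum over $\binom{n_h}{m_h}$ subsets, but it can be accumulated without enumerating them. The sketch below uses the standard recursion $B(k,n)=B(k,n-1)+u_n B(k-1,n-1)$, where $u_i=\exp(\mathbf{x}_{hi}'\boldsymbol{\beta})$ and $B(k,n)$ is the sum over all size-$k$ subsets of the first $n$ observations. This is offered as an illustration of the kind of recursion associated with the cited references, not as a description of how PROC LOGISTIC is implemented; the function name is hypothetical.

```python
def subset_product_sum(u, m):
    """Sum over all size-m subsets of u of the product of the chosen elements.

    Uses B(k, n) = B(k, n-1) + u_n * B(k-1, n-1) with a single working array,
    so the cost is O(n * m) instead of C(n, m).
    """
    b = [1.0] + [0.0] * m          # b[k] holds B(k, n) for the elements seen so far
    for u_i in u:
        # Update from high k to low k so b[k-1] is still the previous-n value
        for k in range(m, 0, -1):
            b[k] += u_i * b[k - 1]
    return b[m]
```

For example, with `u = [1.0, 2.0, 3.0]` and `m = 2` the function returns `1*2 + 1*3 + 2*3 = 11.0`.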

Sometimes the log likelihood converges but the estimates diverge. This condition is flagged by having inordinately large standard errors for some of your parameter estimates, and can be monitored by specifying the ITPRINT option. Unfortunately, broad existence criteria such as those discussed in the section Existence of Maximum Likelihood Estimates do not exist for this model. It might be possible to circumvent such a problem by standardizing your independent variables before fitting the model.
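As a small illustration of the last suggestion, the sketch below standardizes one covariate column by centering it at its sample mean and scaling it to unit sample standard deviation. Interpreting "standardizing" this way is an assumption, and the helper name is hypothetical; in SAS you would normally do this with a data-preparation step before calling the procedure.

```python
import statistics

def standardize(column):
    """Center a list of numbers at its mean and scale to unit sample std. dev."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)   # sample standard deviation (n - 1 divisor)
    return [(value - mean) / sd for value in column]
```

Slopes estimated on standardized covariates are expressed per standard deviation of the original variable, so they need to be rescaled if you want them on the original scale.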

Regression Diagnostic Details

Diagnostics are used to indicate observations that might have undue influence on the model fit or that might be outliers. Further investigation should be performed before removing such an observation from the data set.

The derivations in this section use an augmentation method described by Storer and Crowley (1985), which provides estimates of the "one-step" DFBETAS advocated by Pregibon (1984). The method also provides estimates of conditional stratum-specific predicted values, residuals, and leverage for each observation. The augmentation method can require substantial time and memory.

Following Storer and Crowley (1985), the log-likelihood contribution can be written as

$$\begin{aligned}
l_h &= \log(L_h) = \mathbf{y}_h'\boldsymbol{\gamma}_h - a(\boldsymbol{\gamma}_h) \quad \text{where} \\
a(\boldsymbol{\gamma}_h) &= \log\left[\sum\prod_{j=j_1}^{j_{m_h}}\exp(\gamma_{hj})\right]
\end{aligned}$$

and the $h$ subscript on matrices indicates the submatrix for the stratum, $\boldsymbol{\gamma}_h=(\gamma_{h1},\ldots,\gamma_{hn_h})'$, and $\gamma_{hi}=\mathbf{x}_{hi}'\boldsymbol{\beta}$. Then the gradient and information matrix are

$$\begin{aligned}
\mathbf{g}(\boldsymbol{\beta}) &= \left\{\frac{\partial l_h}{\partial\boldsymbol{\beta}}\right\}_{h=1}^{H} = \mathbf{X}'(\mathbf{y}-\boldsymbol{\pi}) \\
\boldsymbol{\Lambda}(\boldsymbol{\beta}) &= \left\{\frac{\partial^2 l_h}{\partial\boldsymbol{\beta}^2}\right\}_{h=1}^{H} = \mathbf{X}'\operatorname{diag}(\mathbf{U}_1,\ldots,\mathbf{U}_H)\,\mathbf{X}
\end{aligned}$$

where

$$\begin{aligned}
\pi_{hi} &= \frac{\partial a(\boldsymbol{\gamma}_h)}{\partial\gamma_{hi}} = \frac{\sum_{j(i)}\prod_{j=j_1}^{j_{m_h}}\exp(\gamma_{hj})}{\sum\prod_{j=j_1}^{j_{m_h}}\exp(\gamma_{hj})} \\
\boldsymbol{\pi}_h &= (\pi_{h1},\ldots,\pi_{hn_h}) \\
\mathbf{U}_h &= \frac{\partial^2 a(\boldsymbol{\gamma}_h)}{\partial\boldsymbol{\gamma}_h^2} = \left\{\frac{\partial^2 a(\boldsymbol{\gamma}_h)}{\partial\gamma_{hi}\,\partial\gamma_{hj}}\right\} = \{a_{ij}\} \\
a_{ij} &= \frac{\sum_{k(i,j)}\prod_{k=k_1}^{k_{m_h}}\exp(\gamma_{hk})}{\sum\prod_{k=k_1}^{k_{m_h}}\exp(\gamma_{hk})} - \frac{\partial a(\boldsymbol{\gamma}_h)}{\partial\gamma_{hi}}\,\frac{\partial a(\boldsymbol{\gamma}_h)}{\partial\gamma_{hj}} = \pi_{hij} - \pi_{hi}\pi_{hj}
\end{aligned}$$

and where $\pi_{hi}$ is the conditional stratum-specific probability that subject $i$ in stratum $h$ is a case, the summation on $j(i)$ is over all subsets from $\{1,\ldots,n_h\}$ of size $m_h$ that contain the index $i$, and the summation on $k(i,j)$ is over all subsets from $\{1,\ldots,n_h\}$ of size $m_h$ that contain the indices $i$ and $j$.
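For a small stratum, the conditional probabilities and the matrix of second derivatives can be computed straight from these definitions. The following sketch enumerates all subsets of size $m_h$; the function name and argument layout are assumptions for illustration, and the enumeration is far slower than the recursive methods cited earlier.

```python
import math
from itertools import combinations

def conditional_moments(gamma, m):
    """Return (pi, U) for one stratum by enumerating all size-m subsets.

    gamma: list of linear predictors gamma_hi = x_hi' beta for the stratum.
    m:     number of events m_h in the stratum.
    """
    n = len(gamma)
    subsets = list(combinations(range(n), m))
    # Weight of a subset: product of exp(gamma_hj) over its members
    weights = [math.exp(sum(gamma[j] for j in s)) for s in subsets]
    total = sum(weights)

    def prob(*indices):
        # Share of the total weight carried by subsets containing all the indices
        return sum(w for s, w in zip(subsets, weights)
                   if all(i in s for i in indices)) / total

    pi = [prob(i) for i in range(n)]
    # a_ij = pi_hij - pi_hi * pi_hj; on the diagonal pi_hii equals pi_hi
    U = [[prob(i, j) - pi[i] * pi[j] for j in range(n)] for i in range(n)]
    return pi, U
```

Two quick checks follow from the definitions: the entries of `pi` sum to `m`, and each row of `U` sums to zero because the number of events in the stratum is fixed by the conditioning.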

To produce the true one-step estimate $\boldsymbol{\beta}^1_{hi}$, start at the MLE $\hat{\boldsymbol{\beta}}$, delete the $hi$th observation, and use this reduced data set to compute the next Newton-Raphson step. Note that if there is only one event or one nonevent in a stratum, deletion of that single observation is equivalent to deletion of the entire stratum. The augmentation method does not take this into account.

The augmented model is

$$\operatorname{logit}\left(\Pr(y_{hi}=1 \mid \mathbf{x}_{hi})\right) = \mathbf{x}_{hi}'\boldsymbol{\beta} + \mathbf{z}_{hi}'\gamma$$

where $\mathbf{z}_{hi}=(0,\ldots,0,1,0,\ldots,0)'$ has a 1 in the $hi$th coordinate, and use $\boldsymbol{\beta}^0=(\hat{\boldsymbol{\beta}}',0)'$ as the initial estimate for $(\boldsymbol{\beta}',\gamma)'$. The gradient and information matrix before the step are

$$\begin{aligned}
\mathbf{g}(\boldsymbol{\beta}^0) &= \begin{pmatrix}\mathbf{X}' \\ \mathbf{z}_{hi}'\end{pmatrix}(\mathbf{y}-\boldsymbol{\pi}) = \begin{pmatrix}\mathbf{0} \\ y_{hi}-\pi_{hi}\end{pmatrix} \\
\boldsymbol{\Lambda}(\boldsymbol{\beta}^0) &= \begin{pmatrix}\mathbf{X}' \\ \mathbf{z}_{hi}'\end{pmatrix}\mathbf{U}\,[\mathbf{X}\ \ \mathbf{z}_{hi}] = \begin{bmatrix}\boldsymbol{\Lambda}(\boldsymbol{\beta}) & \mathbf{X}'\mathbf{U}\mathbf{z}_{hi} \\ \mathbf{z}_{hi}'\mathbf{U}\mathbf{X} & \mathbf{z}_{hi}'\mathbf{U}\mathbf{z}_{hi}\end{bmatrix}
\end{aligned}$$

Inserting the $\boldsymbol{\beta}^0$ and $(\mathbf{X}',\mathbf{z}_{hi}')'$ into the Gail, Lubin, and Rubinstein (1981) algorithm provides the appropriate estimates of $\mathbf{g}(\boldsymbol{\beta}^0)$ and $\boldsymbol{\Lambda}(\boldsymbol{\beta}^0)$. Indicate these estimates with $\hat{\boldsymbol{\pi}}=\boldsymbol{\pi}(\hat{\boldsymbol{\beta}})$, $\hat{\mathbf{U}}=\mathbf{U}(\hat{\boldsymbol{\beta}})$, $\hat{\mathbf{g}}$, and $\hat{\boldsymbol{\Lambda}}$.

DFBETA is computed from the information matrix as

$$\begin{aligned}
\Delta_{hi}\boldsymbol{\beta} &= \boldsymbol{\beta}^0 - \boldsymbol{\beta}^1_{hi} \\
&= -\hat{\boldsymbol{\Lambda}}^{-1}(\boldsymbol{\beta}^0)\,\hat{\mathbf{g}}(\boldsymbol{\beta}^0) \\
&= -\hat{\boldsymbol{\Lambda}}^{-1}(\hat{\boldsymbol{\beta}})\,(\mathbf{X}'\hat{\mathbf{U}}\mathbf{z}_{hi})\,\mathbf{M}^{-1}\,\mathbf{z}_{hi}'(\mathbf{y}-\hat{\boldsymbol{\pi}})
\end{aligned}$$

where

$$\mathbf{M} = (\mathbf{z}_{hi}'\hat{\mathbf{U}}\mathbf{z}_{hi}) - (\mathbf{z}_{hi}'\hat{\mathbf{U}}\mathbf{X})\,\hat{\boldsymbol{\Lambda}}^{-1}(\hat{\boldsymbol{\beta}})\,(\mathbf{X}'\hat{\mathbf{U}}\mathbf{z}_{hi})$$

For each observation in the data set, a DFBETA statistic is computed for each parameter $\beta_j$, $1\le j\le p$, and standardized by the standard error of $\beta_j$ from the full data set to produce the estimate of DFBETAS.

The estimated leverage is defined as

$$h_{hi} = \frac{\operatorname{trace}\left\{(\mathbf{z}_{hi}'\hat{\mathbf{U}}\mathbf{X})\,\hat{\boldsymbol{\Lambda}}^{-1}(\hat{\boldsymbol{\beta}})\,(\mathbf{X}'\hat{\mathbf{U}}\mathbf{z}_{hi})\right\}}{\operatorname{trace}\left\{\mathbf{z}_{hi}'\hat{\mathbf{U}}\mathbf{z}_{hi}\right\}}$$

This definition of leverage produces different values from those defined by Pregibon (1984), by Moolgavkar, Lustbader, and Venzon (1985), and by Hosmer and Lemeshow (2000). However, it has the advantage that no extra computations beyond those for the DFBETAS are required.
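To make the DFBETA and leverage formulas concrete, the sketch below evaluates them for the special case of a single covariate ($p=1$), where $\boldsymbol{\Lambda}$, $\mathbf{X}'\mathbf{U}\mathbf{z}_{hi}$, and $\mathbf{M}$ are all scalars. The function names and data layout are hypothetical, `beta_hat` is assumed to be the conditional MLE obtained elsewhere, and the subset enumeration is a simplification that is only workable for small strata.

```python
import math
from itertools import combinations

def conditional_moments(gamma, m):
    """Return (pi, U) for one stratum by enumerating all size-m subsets."""
    n = len(gamma)
    subsets = list(combinations(range(n), m))
    weights = [math.exp(sum(gamma[j] for j in s)) for s in subsets]
    total = sum(weights)

    def prob(*indices):
        return sum(w for s, w in zip(subsets, weights)
                   if all(i in s for i in indices)) / total

    pi = [prob(i) for i in range(n)]
    U = [[prob(i, j) - pi[i] * pi[j] for j in range(n)] for i in range(n)]
    return pi, U

def dfbeta_and_leverage(strata, beta_hat):
    """One (DFBETA, leverage) pair per observation, for a single covariate.

    strata:   list of (x, y) pairs; x is a list of covariate values and
              y the matching list of 0/1 responses (assumed layout).
    beta_hat: scalar slope, assumed to be the conditional MLE.
    """
    pieces, lam = [], 0.0
    for x, y in strata:
        pi, U = conditional_moments([beta_hat * v for v in x], sum(y))
        # Row i of U_h x_h equals X' U z_hi because U is block diagonal
        Ux = [sum(U[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
        lam += sum(xi * uxi for xi, uxi in zip(x, Ux))   # Lambda = X' U X
        pieces.append((y, pi, U, Ux))

    results = []
    for y, pi, U, Ux in pieces:
        for i in range(len(y)):
            c = Ux[i]                 # X' U z_hi
            d = U[i][i]               # z_hi' U z_hi
            M = d - c * c / lam
            dfbeta = -(c / lam) * (y[i] - pi[i]) / M
            leverage = (c * c / lam) / d
            results.append((dfbeta, leverage))
    return results
```

The DFBETA values returned here are not yet standardized; dividing each by the standard error of the slope from the full data set would give the DFBETAS described above.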

The estimated residuals $e_{hi}=y_{hi}-\hat{\pi}_{hi}$ are obtained from $\hat{\mathbf{g}}(\boldsymbol{\beta}^0)$, and the weights, or predicted probabilities, are then $\hat{\pi}_{hi}=y_{hi}-e_{hi}$. The residuals are standardized and reported as (estimated) Pearson residuals:

$$\frac{r_{hi}-n_{hi}\hat{\pi}_{hi}}{\sqrt{n_{hi}\hat{\pi}_{hi}(1-\hat{\pi}_{hi})}}$$

where $r_{hi}$ is the number of events in the observation and $n_{hi}$ is the number of trials.

The STDRES option in the INFLUENCE option computes the standardized Pearson residual:

$$e_{s,hi} = \frac{e_{hi}}{\sqrt{1-h_{hi}}}$$
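The two residual formulas translate directly into code. The following sketch takes the predicted probability and leverage as given inputs; the function names are hypothetical and chosen only for this illustration.

```python
import math

def pearson_residual(r, n, pi_hat):
    """(r - n*pi_hat) / sqrt(n*pi_hat*(1 - pi_hat)) for r events in n trials."""
    return (r - n * pi_hat) / math.sqrt(n * pi_hat * (1.0 - pi_hat))

def standardized_residual(e, leverage):
    """Residual e divided by sqrt(1 - leverage), as in the formula above."""
    return e / math.sqrt(1.0 - leverage)
```

Both functions are undefined at the boundaries (a predicted probability of 0 or 1, or a leverage of 1), so a caller would need to guard against those cases.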

For events/trials MODEL statement syntax, treat each observation as two observations (the first for the nonevents and the second for the events) with frequencies $f_{h,2i-1}=n_{hi}-r_{hi}$ and $f_{h,2i}=r_{hi}$, and augment the model with a matrix $\mathbf{Z}_{hi}=[\mathbf{z}_{h,2i-1}\ \mathbf{z}_{h,2i}]$ instead of a single $\mathbf{z}_{hi}$ vector. Writing $\gamma_{hi}=\mathbf{x}_{hi}'\boldsymbol{\beta}f_{hi}$ in the preceding section results in the following gradient and information matrix:

$$\begin{aligned}
\mathbf{g}(\boldsymbol{\beta}^0) &= \begin{bmatrix}\mathbf{0} \\ f_{h,2i-1}\,(y_{h,2i-1}-\pi_{h,2i-1}) \\ f_{h,2i}\,(y_{h,2i}-\pi_{h,2i})\end{bmatrix} \\
\boldsymbol{\Lambda}(\boldsymbol{\beta}^0) &= \begin{bmatrix}\boldsymbol{\Lambda}(\boldsymbol{\beta}) & \mathbf{X}'\operatorname{diag}(\mathbf{f})\,\mathbf{U}\operatorname{diag}(\mathbf{f})\,\mathbf{Z}_{hi} \\ \mathbf{Z}_{hi}'\operatorname{diag}(\mathbf{f})\,\mathbf{U}\operatorname{diag}(\mathbf{f})\,\mathbf{X} & \mathbf{Z}_{hi}'\operatorname{diag}(\mathbf{f})\,\mathbf{U}\operatorname{diag}(\mathbf{f})\,\mathbf{Z}_{hi}\end{bmatrix}
\end{aligned}$$

The predicted probabilities are then $\hat{\pi}_{hi}=y_{h,2i}-e_{h,2i}/r_{h,2i}$, while the leverage and the DFBETAS are produced from $\boldsymbol{\Lambda}(\boldsymbol{\beta}^0)$ in a fashion similar to that for the preceding single-trial equations.
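The splitting of each events/trials observation into a nonevent row and an event row can be sketched as follows. The function name and the tuple layouts are assumptions made for illustration; the ordering (nonevents first, then events) follows the description above.

```python
def split_events_trials(rows):
    """Expand (r, n, x) events/trials rows into (y, f, x) rows with frequencies.

    For the i-th input row, the output holds the nonevent row (y=0) with
    frequency n - r followed by the event row (y=1) with frequency r,
    corresponding to positions 2i-1 and 2i.
    """
    expanded = []
    for r, n, x in rows:
        expanded.append((0, n - r, x))   # f_{h,2i-1} = n_hi - r_hi
        expanded.append((1, r, x))       # f_{h,2i}   = r_hi
    return expanded
```

For example, a row with 3 events in 10 trials becomes a nonevent row with frequency 7 and an event row with frequency 3, both carrying the same covariates.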

Last updated: December 09, 2022