The LOGISTIC Procedure

Odds Ratio Estimation

Consider a dichotomous response variable with outcomes event and nonevent. Consider a dichotomous risk factor variable X that takes the value 1 if the risk factor is present and 0 if the risk factor is absent. According to the logistic model, the log odds function, $\operatorname{logit}(X)$, is given by

$$\operatorname{logit}(X) \equiv \log\left(\frac{\Pr(\text{event} \mid X)}{\Pr(\text{nonevent} \mid X)}\right) = \alpha + X\beta$$

The odds ratio $\psi$ is defined as the ratio of the odds for those with the risk factor (X = 1) to the odds for those without the risk factor (X = 0). The log of the odds ratio is given by

$$\log(\psi) \equiv \log(\psi(X=1, X=0)) = \operatorname{logit}(X=1) - \operatorname{logit}(X=0) = (\alpha + 1 \times \beta) - (\alpha + 0 \times \beta) = \beta$$

In general, the odds ratio can be computed by exponentiating the difference of the logits between any two population profiles. This is the approach taken by the ODDSRATIO statement, so the computations are available regardless of parameterization, interactions, and nestings. However, as shown in the preceding equation for $\log(\psi)$, odds ratios of main effects can be computed as functions of the parameter estimates, and the remainder of this section is concerned with this methodology.

The parameter, $\beta$, associated with X represents the change in the log odds from $X = 0$ to $X = 1$. So the odds ratio is obtained by simply exponentiating the value of the parameter associated with the risk factor. The odds ratio indicates how the odds of the event change as you change X from 0 to 1. For instance, $\psi = 2$ means that the odds of an event when X = 1 are twice the odds of an event when X = 0. You can also express this as follows: the percent change in the odds of an event from X = 0 to X = 1 is $(\psi - 1)100\% = 100\%$.
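As a small numeric illustration of this relationship, the following Python sketch exponentiates a slope to obtain the odds ratio and the percent change in the odds. The value of `beta` is hypothetical, chosen so that $\psi = 2$, and the function names are illustrative only; none of this is PROC LOGISTIC syntax.

```python
import math

def odds_ratio(beta):
    """Odds ratio for a change in X from 0 to 1: psi = exp(beta)."""
    return math.exp(beta)

def percent_change_in_odds(psi):
    """Percent change in the odds of the event: (psi - 1) * 100."""
    return (psi - 1.0) * 100.0

beta = math.log(2.0)  # hypothetical slope, chosen so that psi = 2
psi = odds_ratio(beta)
pct = percent_change_in_odds(psi)
print(f"psi = {psi:.4f}, percent change in odds = {pct:.1f}%")
```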

Suppose the values of the dichotomous risk factor are coded as constants a and b instead of 0 and 1. The odds when $X = a$ become $\exp(\alpha + a\beta)$, and the odds when $X = b$ become $\exp(\alpha + b\beta)$. The odds ratio corresponding to an increase in X from a to b is

$$\psi = \exp[(b - a)\beta] = [\exp(\beta)]^{b-a} \equiv [\exp(\beta)]^{c}$$

Note that for any a and b such that $c = b - a = 1$, $\psi = \exp(\beta)$. So the odds ratio can be interpreted as the change in the odds for any increase of one unit in the corresponding risk factor. However, the change in odds for some amount other than one unit is often of greater interest. For example, a change of one pound in body weight might be too small to be considered important, while a change of 10 pounds might be more meaningful. The odds ratio for a change in X from a to b is estimated by raising the odds ratio estimate for a unit change in X to the power of $c = b - a$, as shown previously.
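The body weight example can be sketched numerically as follows. The slope `beta_weight` and the weights 150 and 160 are hypothetical values chosen only for illustration; the point is that exponentiating $(b - a)\beta$ and raising the unit odds ratio to the power $c$ give the same result.

```python
import math

def odds_ratio_for_change(beta, a, b):
    """Odds ratio for moving X from a to b: exp((b - a) * beta)."""
    return math.exp((b - a) * beta)

beta_weight = 0.02  # hypothetical slope per pound of body weight

unit_or = odds_ratio_for_change(beta_weight, 0, 1)        # c = 1
ten_lb_or = odds_ratio_for_change(beta_weight, 150, 160)  # c = 10

# Raising the unit odds ratio to the power c = b - a gives the same value.
assert math.isclose(ten_lb_or, unit_or ** 10)
print(f"1 lb: {unit_or:.4f}, 10 lb: {ten_lb_or:.4f}")
```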

For a polytomous risk factor, the computation of odds ratios depends on how the risk factor is parameterized. For illustration, suppose that Race is a risk factor with four categories: White, Black, Hispanic, and Other.

For the effect parameterization scheme (PARAM=EFFECT) with White as the reference group (REF='White'), the design variables for Race are as follows:

Design Variables

| Race | $X_1$ | $X_2$ | $X_3$ |
|---|---|---|---|
| Black | 1 | 0 | 0 |
| Hispanic | 0 | 1 | 0 |
| Other | 0 | 0 | 1 |
| White | -1 | -1 | -1 |

The log odds for Black is

$$\begin{aligned}
\operatorname{logit}(\text{Black}) &= \alpha + (X_1 = 1)\beta_1 + (X_2 = 0)\beta_2 + (X_3 = 0)\beta_3 \\
&= \alpha + \beta_1
\end{aligned}$$

The log odds for White is

$$\begin{aligned}
\operatorname{logit}(\text{White}) &= \alpha + (X_1 = -1)\beta_1 + (X_2 = -1)\beta_2 + (X_3 = -1)\beta_3 \\
&= \alpha - \beta_1 - \beta_2 - \beta_3
\end{aligned}$$

Therefore, the log odds ratio of Black versus White becomes

$$\begin{aligned}
\log(\psi(\text{Black}, \text{White})) &= \operatorname{logit}(\text{Black}) - \operatorname{logit}(\text{White}) \\
&= 2\beta_1 + \beta_2 + \beta_3
\end{aligned}$$

For the reference cell parameterization scheme (PARAM=REF) with White as the reference cell, the design variables for race are as follows:

Design Variables

| Race | $X_1$ | $X_2$ | $X_3$ |
|---|---|---|---|
| Black | 1 | 0 | 0 |
| Hispanic | 0 | 1 | 0 |
| Other | 0 | 0 | 1 |
| White | 0 | 0 | 0 |

The log odds ratio of Black versus White is given by

$$\begin{aligned}
\log(\psi(\text{Black}, \text{White})) &= \operatorname{logit}(\text{Black}) - \operatorname{logit}(\text{White}) \\
&= (\alpha + (X_1 = 1)\beta_1 + (X_2 = 0)\beta_2 + (X_3 = 0)\beta_3) \, - \\
&\phantom{=} \;\; (\alpha + (X_1 = 0)\beta_1 + (X_2 = 0)\beta_2 + (X_3 = 0)\beta_3) \\
&= \beta_1
\end{aligned}$$

For the GLM parameterization scheme (PARAM=GLM), the design variables are as follows:

Design Variables

| Race | $X_1$ | $X_2$ | $X_3$ | $X_4$ |
|---|---|---|---|---|
| Black | 1 | 0 | 0 | 0 |
| Hispanic | 0 | 1 | 0 | 0 |
| Other | 0 | 0 | 1 | 0 |
| White | 0 | 0 | 0 | 1 |

The log odds ratio of Black versus White is

$$\begin{aligned}
\log(\psi(\text{Black}, \text{White})) &= \operatorname{logit}(\text{Black}) - \operatorname{logit}(\text{White}) \\
&= (\alpha + (X_1 = 1)\beta_1 + (X_2 = 0)\beta_2 + (X_3 = 0)\beta_3 + (X_4 = 0)\beta_4) \, - \\
&\phantom{=} \;\; (\alpha + (X_1 = 0)\beta_1 + (X_2 = 0)\beta_2 + (X_3 = 0)\beta_3 + (X_4 = 1)\beta_4) \\
&= \beta_1 - \beta_4
\end{aligned}$$
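The three derivations share one pattern: the intercept cancels, and the coefficient on each $\beta_i$ in the log odds ratio is the difference between the two levels' design rows. The following Python sketch, with design rows copied from the three tables above and an illustrative helper name `log_or_coefficients`, recovers those coefficients for Black versus White.

```python
# Design rows for Race under each parameterization, copied from the tables above.
DESIGN = {
    "EFFECT": {"Black": [1, 0, 0], "Hispanic": [0, 1, 0],
               "Other": [0, 0, 1], "White": [-1, -1, -1]},
    "REF":    {"Black": [1, 0, 0], "Hispanic": [0, 1, 0],
               "Other": [0, 0, 1], "White": [0, 0, 0]},
    "GLM":    {"Black": [1, 0, 0, 0], "Hispanic": [0, 1, 0, 0],
               "Other": [0, 0, 1, 0], "White": [0, 0, 0, 1]},
}

def log_or_coefficients(param, level, reference):
    """Coefficients on (beta_1, beta_2, ...) in logit(level) - logit(reference).

    The intercept alpha appears in both logits and cancels.
    """
    rows = DESIGN[param]
    return [x - r for x, r in zip(rows[level], rows[reference])]

for param in DESIGN:
    print(param, log_or_coefficients(param, "Black", "White"))
# EFFECT [2, 1, 1]
# REF [1, 0, 0]
# GLM [1, 0, 0, -1]
```

The printed vectors correspond to $2\beta_1 + \beta_2 + \beta_3$, $\beta_1$, and $\beta_1 - \beta_4$, respectively.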

Consider the hypothetical example of heart disease by race in Hosmer and Lemeshow (2000, p. 56). The entries in the following contingency table represent counts:

| Disease Status \ Race | White | Black | Hispanic | Other |
|---|---|---|---|---|
| Present | 5 | 20 | 15 | 10 |
| Absent | 20 | 10 | 10 | 10 |

The computation of the odds ratio of Black versus White under the various parameterization schemes is shown in Table 10.

Table 10: Odds Ratio of Heart Disease Comparing Black to White

Parameter Estimates

| PARAM= | $\hat{\beta}_1$ | $\hat{\beta}_2$ | $\hat{\beta}_3$ | $\hat{\beta}_4$ | Odds Ratio Estimates |
|---|---|---|---|---|---|
| EFFECT | 0.7651 | 0.4774 | 0.0719 | | $\exp(2 \times 0.7651 + 0.4774 + 0.0719) = 8$ |
| REF | 2.0794 | 1.7917 | 1.3863 | | $\exp(2.0794) = 8$ |
| GLM | 2.0794 | 1.7917 | 1.3863 | 0.0000 | $\exp(2.0794) = 8$ |
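The entries of Table 10 can be checked against the raw counts. The following Python sketch computes the Black versus White odds ratio directly from the contingency table and again from the reported parameter estimates under each scheme; the variable names are illustrative, and small discrepancies are expected because the estimates are rounded to four decimal places.

```python
import math

# Counts from the contingency table: (present, absent) for each race.
counts = {"White": (5, 20), "Black": (20, 10),
          "Hispanic": (15, 10), "Other": (10, 10)}

def odds(race):
    """Odds of disease for one race: present / absent."""
    present, absent = counts[race]
    return present / absent

direct = odds("Black") / odds("White")  # (20/10) / (5/20)

# Odds ratios rebuilt from the parameter estimates reported in Table 10.
effect = math.exp(2 * 0.7651 + 0.4774 + 0.0719)  # 2*b1 + b2 + b3
ref = math.exp(2.0794)                           # b1
glm = math.exp(2.0794 - 0.0000)                  # b1 - b4

for name, value in [("direct", direct), ("EFFECT", effect),
                    ("REF", ref), ("GLM", glm)]:
    print(f"{name:8s}{value:.3f}")
```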


Because the log odds ratio, $\log(\psi)$, is a linear function of the parameters, the Wald confidence interval for $\log(\psi)$ can be derived from the parameter estimates and the estimated covariance matrix. Confidence intervals for the odds ratios are obtained by exponentiating the corresponding confidence limits for the log odds ratios. In the displayed output of PROC LOGISTIC, the "Odds Ratio Estimates" table contains the odds ratio estimates and the corresponding 95% Wald confidence intervals. For continuous explanatory variables, these odds ratios correspond to a unit increase in the risk factors.
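A minimal sketch of this Wald computation for the Black versus White comparison under PARAM=REF follows. It assumes a model that contains Race only, in which case the variance of $\hat{\beta}_1$ coincides with the familiar sum of reciprocal cell counts for the two columns involved. That variance is an assumption of this sketch and is not taken from PROC LOGISTIC output.

```python
import math
from statistics import NormalDist

beta1_hat = 2.0794  # PARAM=REF estimate of beta_1 from Table 10

# Assumed variance of beta1_hat for a Race-only model: the sum of reciprocal
# cell counts (Black present/absent, White present/absent).
var_beta1 = 1 / 20 + 1 / 10 + 1 / 5 + 1 / 20
se_beta1 = math.sqrt(var_beta1)

z = NormalDist().inv_cdf(0.975)  # standard normal quantile for a 95% interval

# Wald limits on the log odds ratio scale, then exponentiated.
lower_log = beta1_hat - z * se_beta1
upper_log = beta1_hat + z * se_beta1
lower_or, upper_or = math.exp(lower_log), math.exp(upper_log)

print(f"odds ratio = {math.exp(beta1_hat):.3f}, "
      f"95% Wald CI = ({lower_or:.3f}, {upper_or:.3f})")
```

Because the limits are exponentiated, the interval is symmetric around the estimate on the log scale but not on the odds ratio scale.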

To customize odds ratios for specific units of change for a continuous risk factor, you can use the UNITS statement to specify a list of relevant units for each explanatory variable in the model. Estimates of these customized odds ratios are given in a separate table. Let $(L_j, U_j)$ be a confidence interval for $\log(\psi)$. The corresponding lower and upper confidence limits for the customized odds ratio $\exp(c\beta_j)$ are $\exp(cL_j)$ and $\exp(cU_j)$, respectively (for $c > 0$), or $\exp(cU_j)$ and $\exp(cL_j)$, respectively (for $c < 0$). You use the CLODDS= option or ODDSRATIO statement to request the confidence intervals for the odds ratios.
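The swap of limits for negative $c$ can be sketched as follows. The estimate and confidence limits are hypothetical values, and `customized_odds_ratio_ci` is an illustrative helper name, not a PROC LOGISTIC feature.

```python
import math

def customized_odds_ratio_ci(beta_hat, lower, upper, c):
    """Return (estimate, lower limit, upper limit) for exp(c * beta).

    (lower, upper) is a confidence interval on the log odds scale.
    For c < 0, multiplying by c reverses the order of the limits.
    """
    lo, hi = math.exp(c * lower), math.exp(c * upper)
    if c < 0:
        lo, hi = hi, lo
    return math.exp(c * beta_hat), lo, hi

# Hypothetical estimate and confidence limits for a continuous risk factor.
beta_hat, L, U = 0.02, 0.005, 0.035

for c in (10, -10):
    est, lo, hi = customized_odds_ratio_ci(beta_hat, L, U, c)
    print(f"c = {c:+d}: odds ratio = {est:.4f}, CI = ({lo:.4f}, {hi:.4f})")
```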

For a generalized logit model, odds ratios are computed similarly, except k odds ratios are computed for each effect, corresponding to the k logits in the model.

Last updated: March 08, 2022