This example demonstrates causal mediation analysis with treatment, outcome, and mediator variables that are all binary. The data contain information about infant mortality in 2003 and were obtained from the US National Center for Health Statistics. A random sample of 100,000 observations is used in this example. The analysis and its interpretation are purely illustrative; definitive conclusions should not be drawn from this example.
The following statements print the first 10 observations of the data set, which are shown in Output 38.3.1:
```sas
proc print data=sashelp.birthwgt(obs=10);
run;
```
Output 38.3.1: First 10 Observations of birthwgt Data Set
| Obs | LowBirthWgt | Married | AgeGroup | Race | Drinking | Death | Smoking | SomeCollege |
|---|---|---|---|---|---|---|---|---|
| 1 | No | No | 3 | Asian | No | No | No | Yes |
| 2 | No | No | 2 | White | No | No | No | No |
| 3 | Yes | Yes | 2 | Native | No | Yes | No | No |
| 4 | No | No | 2 | White | No | No | No | No |
| 5 | No | No | 2 | White | No | No | No | Yes |
| 6 | No | No | 2 | White | No | No | No | |
| 7 | No | No | 2 | Asian | No | No | No | Yes |
| 8 | No | No | 3 | White | No | No | No | Yes |
| 9 | No | Yes | 1 | Black | No | No | No | No |
| 10 | No | No | 2 | Native | No | No | No | Yes |
The main variables in the analysis are as follows:

- The treatment variable is Smoking. It is an indicator of maternal smoking behavior, with values Yes and No.
- The outcome variable is Death. It is an indicator of infant death within one year of birth, with values Yes and No.
- The mediator variable is LowBirthWgt. It is an indicator of low birth weight (less than 2,500 grams), with values Yes and No.

The analysis also includes five confounding covariates:

- AgeGroup represents maternal ages of less than 20, between 20 and 35, and greater than 35, with values 1, 2, and 3, respectively.
- Drinking is an indicator of maternal drinking during pregnancy, with values Yes and No.
- Married is an indicator of marital status, with values Yes and No.
- Race is an indicator of race, with values Asian, Black, Hispanic, Native (Native American), and White.
- SomeCollege is an indicator of whether the mother has 12 or more years of education, with values Yes and No.
The following statements specify a causal mediation model:
```sas
proc causalmed data=sashelp.birthwgt decomp;
   class LowBirthWgt Smoking Death AgeGroup Married Race
         Drinking SomeCollege / descending;
   mediator LowBirthWgt = Smoking;
   model Death = LowBirthWgt | Smoking;
   covar AgeGroup Married Race Drinking SomeCollege;
   evaluate 'Low Birth-Weight' LowBirthWgt='Yes' / nodecomp;
   evaluate 'Normal Birth-Weight' LowBirthWgt='No' / nodecomp;
run;
```
The DECOMP option requests various decompositions of the total effect. The MEDIATOR statement specifies the mediator model, in which LowBirthWgt is the response and Smoking is the treatment. The MODEL statement specifies the outcome model for the response Death and includes an interaction between LowBirthWgt and Smoking. The CLASS statement names the categorical variables in the analysis; its DESCENDING option reverses the order of the levels so that the Yes level of both responses is modeled (Death=Yes and LowBirthWgt=Yes). The COVAR statement specifies the five covariates. Finally, the two EVALUATE statements set the mediator to the levels Yes and No so that the patterns of causal mediation effects at the two levels can be compared.
Output 38.3.2 displays the model information, which includes the outcome, treatment, and mediator variables, the distributions, and the link functions of the response variables. Because observations that have missing values are not included, only 93,292 observations are used for analysis.
Output 38.3.2: Model Information
| Model Information | |
|---|---|
| Data Set | SASHELP.BIRTHWGT |
| Outcome Variable | Death |
| Treatment Variable | Smoking |
| Mediator Variable | LowBirthWgt |
| Outcome Modeling | Generalized Linear Model |
| Outcome Distribution | Binomial |
| Outcome Link Function | Logit |
| Mediator Modeling | Generalized Linear Model |
| Mediator Distribution | Binomial |
| Mediator Link Function | Logit |
| Number of Observations Read | 100000 |
| Number of Observations Used | 93292 |
Output 38.3.3 displays the levels of the categorical variables, including binary response variables.
Output 38.3.3: Class Levels
| Class Level Information | ||
|---|---|---|
| Class | Levels | Values |
| LowBirthWgt | 2 | Yes No |
| Smoking | 2 | Yes No |
| Death | 2 | Yes No |
| AgeGroup | 3 | 3 2 1 |
| Married | 2 | Yes No |
| Race | 5 | White Native Hispanic Black Asian |
| Drinking | 2 | Yes No |
| SomeCollege | 2 | Yes No |
Output 38.3.4 displays frequency counts of the binary outcome, mediator, and treatment variables. It also shows which levels of the response variables are being modeled.
Output 38.3.4: Profiles of Binary Outcome, Mediator, and Treatment Variables
| Response Profile | | |
|---|---|---|
| Ordered Value | Death | Total Frequency |
| 1 | Yes | 527 |
| 2 | No | 92765 |

Outcome probability modeled is Death='Yes'.

| Mediator Profile | | |
|---|---|---|
| Ordered Value | LowBirthWgt | Total Frequency |
| 1 | Yes | 7562 |
| 2 | No | 85730 |

Mediator probability modeled is LowBirthWgt='Yes'.

| Treatment Profile | | |
|---|---|---|
| Ordered Value | Smoking | Total Frequency |
| 1 | Yes | 20984 |
| 2 | No | 72308 |
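As a quick consistency check on the three profiles, the frequencies in each one add up to the number of observations used (93,292). The raw proportions below are unadjusted sample shares computed from those counts; they are shown only for orientation and are not part of the PROC CAUSALMED output.

```latex
\begin{align*}
527 + 92765 &= 7562 + 85730 = 20984 + 72308 = 93292 \\
\frac{527}{93292} &\approx 0.56\% \quad \text{(infant deaths)} \\
\frac{7562}{93292} &\approx 8.11\% \quad \text{(low birth weight)} \\
\frac{20984}{93292} &\approx 22.49\% \quad \text{(maternal smoking)}
\end{align*}
```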
Output 38.3.5 displays the major decompositions of effects on infant mortality on both the odds ratio (OR) scale and the excess relative risk scale. Percentages of the total effect are displayed only on the excess relative risk scale.
Output 38.3.5: Summary of Effects on Infant Mortality
| Summary of Effects | Estimate | Standard Error | Wald 95% Lower (Nontransformed) | Wald 95% Upper (Nontransformed) | Z (Nontransformed) | Pr > \|Z\| (Nontransformed) | 95% Lower (Log Scale) | 95% Upper (Log Scale) | Z (Log Scale) | Pr > \|Z\| (Log Scale) |
|---|---|---|---|---|---|---|---|---|---|---|
| Odds Ratio Total Effect | 1.7071 | 0.2215 | 1.2729 | 2.1412 | 3.19 | 0.0014 | 1.3237 | 2.2014 | 4.12 | <.0001 |
| Odds Ratio Controlled Direct Effect (CDE) | 1.8940 | 0.3540 | 1.2002 | 2.5879 | 2.53 | 0.0116 | 1.3131 | 2.7320 | 3.42 | 0.0006 |
| Odds Ratio Natural Direct Effect (NDE) | 1.3626 | 0.1768 | 1.0160 | 1.7092 | 2.05 | 0.0403 | 1.0566 | 1.7572 | 2.38 | 0.0171 |
| Odds Ratio Natural Indirect Effect (NIE) | 1.2528 | 0.03432 | 1.1855 | 1.3201 | 7.37 | <.0001 | 1.1873 | 1.3219 | 8.23 | <.0001 |
| Total Excess Relative Risk | 0.7071 | 0.2215 | 0.2729 | 1.1412 | 3.19 | 0.0014 | | | | |
| Excess Relative Risk Due to CDE | 0.3246 | 0.1207 | 0.08810 | 0.5611 | 2.69 | 0.0071 | | | | |
| Excess Relative Risk Due to NDE | 0.3626 | 0.1768 | 0.01604 | 0.7092 | 2.05 | 0.0403 | | | | |
| Excess Relative Risk Due to NIE | 0.3445 | 0.06119 | 0.2245 | 0.4644 | 5.63 | <.0001 | | | | |
| Percentage Mediated | 48.7165 | 9.8917 | 29.3291 | 68.1040 | 4.92 | <.0001 | | | | |
| Percentage Due to Interaction | 8.1202 | 19.8380 | -30.7615 | 47.0020 | 0.41 | 0.6823 | | | | |
| Percentage Eliminated | 54.0930 | 11.6647 | 31.2306 | 76.9554 | 4.64 | <.0001 | | | | |
The first four rows of the table in Output 38.3.5 summarize the effects on the odds ratio scale. The controlled direct effect (CDE) on this scale is 1.894. This is the CDE when the mediator variable LowBirthWgt is controlled at the level No. In other words, this is the CDE odds ratio for the group that has normal birth weights. The natural direct effect (NDE) and natural indirect effect (NIE) on the odds ratio scale are 1.363 and 1.253, respectively. Their product, rather than their sum, is the same as the total effect on the odds ratio scale, which is 1.707.
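The multiplicative relationship can be checked directly from the reported estimates. The symbols $\mathrm{OR}^{\mathrm{TE}}$, $\mathrm{OR}^{\mathrm{NDE}}$, and $\mathrm{OR}^{\mathrm{NIE}}$ are shorthand introduced here for the odds ratio total, natural direct, and natural indirect effects; they do not appear in the output.

```latex
\begin{align*}
\mathrm{OR}^{\mathrm{NDE}} \times \mathrm{OR}^{\mathrm{NIE}}
  &= 1.3626 \times 1.2528 \approx 1.7071 = \mathrm{OR}^{\mathrm{TE}} \\
\mathrm{OR}^{\mathrm{NDE}} + \mathrm{OR}^{\mathrm{NIE}}
  &= 1.3626 + 1.2528 = 2.6154 \neq \mathrm{OR}^{\mathrm{TE}}
\end{align*}
```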
There are two sets of confidence limits, z-values, and p-values for the odds ratio effects in Output 38.3.5. One set assumes normality on the original (nontransformed) scale, and the other assumes normality on the log scale and back-transforms the results. The log scale inferences are sometimes preferred because they are range-respecting. That is, unlike the nontransformed Wald confidence intervals, which could include negative values for odds ratios, the confidence limits constructed by back-transformation from the log scale are always positive. For the current example, however, the two sets of confidence limits, z-values, and p-values lead to essentially the same statistical inferences about the four causal mediation effects.
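The two sets of inferences for the odds ratio total effect can be reproduced approximately by hand. The sketch below assumes a first-order (delta-method) standard error on the log scale, equal to the reported standard error divided by the estimate, and a null value of 1 for the odds ratio. The source does not state these formulas, so this is an illustration that happens to agree with the reported values up to rounding, not a description of the procedure's internals.

```latex
\begin{align*}
\text{Nontransformed:}\quad
  & 1.7071 \pm 1.96 \times 0.2215 \approx (1.273,\ 2.141), \qquad
    z = \frac{1.7071 - 1}{0.2215} \approx 3.19 \\
\text{Log scale:}\quad
  & \log(1.7071) \approx 0.5348, \qquad
    \widehat{\mathrm{SE}}_{\log} \approx \frac{0.2215}{1.7071} \approx 0.1298 \\
  & \exp\left(0.5348 \pm 1.96 \times 0.1298\right) \approx (1.32,\ 2.20), \qquad
    z = \frac{0.5348}{0.1298} \approx 4.12
\end{align*}
```

Because the back-transformed limits are exponentials, they cannot fall below zero, which is the range-respecting property described above.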
The next seven rows of the table in Output 38.3.5 summarize effects on the excess relative risk (ERR) scale. The natural direct effect (0.363) and natural indirect effect (0.345) have an additive property on this scale; they sum to the total excess relative risk, which is 0.707. Additivity makes it easier to use these values to deduce the 'Percentage Mediated', which is 48.72% (= 0.3445/0.7071 × 100%). Therefore, about half of the smoking effect on infant mortality is mediated through the lowering of babies' birth weights. However, the 95% confidence interval for the 'Percentage Mediated' is (29.3%, 68.1%), which is fairly wide. More data would yield a more precise interval estimate.
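The link between the odds ratio rows and the excess relative risk rows can be written out explicitly. The notation is introduced here for illustration, and the expression for the NIE term is an algebraic identity that is consistent with the reported numbers rather than a formula quoted from the source.

```latex
\begin{align*}
\mathrm{ERR}^{\mathrm{TE}}  &= \mathrm{OR}^{\mathrm{TE}} - 1 = 1.7071 - 1 = 0.7071 \\
\mathrm{ERR}^{\mathrm{NDE}} &= \mathrm{OR}^{\mathrm{NDE}} - 1 = 1.3626 - 1 = 0.3626 \\
\mathrm{ERR}^{\mathrm{NIE}} &= \mathrm{OR}^{\mathrm{NDE}}\left(\mathrm{OR}^{\mathrm{NIE}} - 1\right)
                             = 1.3626 \times 0.2528 \approx 0.3445 \\
\mathrm{ERR}^{\mathrm{NDE}} + \mathrm{ERR}^{\mathrm{NIE}} &= 0.3626 + 0.3445 = 0.7071 \\
\text{Percentage Mediated} &= \frac{0.3445}{0.7071} \times 100\% \approx 48.72\%
\end{align*}
```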
The percentage of total effect due to the interaction between smoking and low birth weights is about 8%, which is relatively small. Again, the corresponding 95% confidence interval, (–30.8%, 47.0%), is quite wide.
The DECOMP option requests various total effect decompositions, which are shown in Output 38.3.6. Following VanderWeele (2014), all these decompositions are computed on the excess relative risk scale.
Output 38.3.6: Decompositions of Smoking Effects on Infant Mortality
Decompositions of Total Excess Relative Risk

| Decomposition | Excess Relative Risk | Estimate | Standard Error | Wald 95% Lower | Wald 95% Upper | Z | Pr > \|Z\| |
|---|---|---|---|---|---|---|---|
| NDE+NIE | Natural Direct | 0.3626 | 0.1768 | 0.01604 | 0.7092 | 2.05 | 0.0403 |
| | Natural Indirect | 0.3445 | 0.06119 | 0.2245 | 0.4644 | 5.63 | <.0001 |
| CDE+PE | Controlled Direct | 0.3246 | 0.1207 | 0.08810 | 0.5611 | 2.69 | 0.0071 |
| | Portion Eliminated | 0.3825 | 0.1556 | 0.07752 | 0.6874 | 2.46 | 0.0140 |
| TDE+PIE | Total Direct | 0.3820 | 0.2188 | -0.04688 | 0.8109 | 1.75 | 0.0809 |
| | Pure Indirect | 0.3251 | 0.03534 | 0.2558 | 0.3943 | 9.20 | <.0001 |
| NDE+PIE+IMD | Natural Direct | 0.3626 | 0.1768 | 0.01604 | 0.7092 | 2.05 | 0.0403 |
| | Pure Indirect | 0.3251 | 0.03534 | 0.2558 | 0.3943 | 9.20 | <.0001 |
| | Mediated Interaction | 0.01940 | 0.05229 | -0.08309 | 0.1219 | 0.37 | 0.7106 |
| CDE+PIE+PAI | Controlled Direct | 0.3246 | 0.1207 | 0.08810 | 0.5611 | 2.69 | 0.0071 |
| | Pure Indirect | 0.3251 | 0.03534 | 0.2558 | 0.3943 | 9.20 | <.0001 |
| | Portion Due to Interaction | 0.05742 | 0.1547 | -0.2457 | 0.3606 | 0.37 | 0.7105 |
| Four-Way | Controlled Direct | 0.3246 | 0.1207 | 0.08810 | 0.5611 | 2.69 | 0.0071 |
| | Reference Interaction | 0.03801 | 0.1024 | -0.1627 | 0.2387 | 0.37 | 0.7105 |
| | Mediated Interaction | 0.01940 | 0.05229 | -0.08309 | 0.1219 | 0.37 | 0.7106 |
| | Pure Indirect | 0.3251 | 0.03534 | 0.2558 | 0.3943 | 9.20 | <.0001 |
| Total | Excess Relative Risk | 0.7071 | 0.2215 | 0.2729 | 1.1412 | 3.19 | 0.0014 |

Note: NDE=CDE+IRF, NIE=PIE+IMD, PAI=IRF+IMD, PE=PAI+PIE, TDE=CDE+PAI.
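The identities in the table note can be verified from the four-way components reported in Output 38.3.6, where IRF is the 'Reference Interaction' row and IMD is the 'Mediated Interaction' row. Small discrepancies in the last digit come from rounding in the displayed values.

```latex
\begin{align*}
\mathrm{NDE} &= \mathrm{CDE} + \mathrm{IRF} = 0.3246 + 0.03801 \approx 0.3626 \\
\mathrm{NIE} &= \mathrm{PIE} + \mathrm{IMD} = 0.3251 + 0.01940 \approx 0.3445 \\
\mathrm{PAI} &= \mathrm{IRF} + \mathrm{IMD} = 0.03801 + 0.01940 \approx 0.0574 \\
\mathrm{PE}  &= \mathrm{PAI} + \mathrm{PIE} = 0.05742 + 0.3251 \approx 0.3825 \\
\mathrm{TDE} &= \mathrm{CDE} + \mathrm{PAI} = 0.3246 + 0.05742 \approx 0.3820 \\
\text{Total} &= \mathrm{CDE} + \mathrm{IRF} + \mathrm{IMD} + \mathrm{PIE}
              = 0.3246 + 0.03801 + 0.01940 + 0.3251 \approx 0.7071
\end{align*}
```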
As shown in Output 38.3.7, PROC CAUSALMED also displays the corresponding decompositions by their percentage contribution to the total effect on the excess relative risk scale.
Output 38.3.7: Percentage Decomposition of Smoking Effects on Infant Mortality
Percentage Decompositions of Total Excess Relative Risk

| Decomposition | Excess Relative Risk | Percent | Standard Error | Wald 95% Lower | Wald 95% Upper | Z | Pr > \|Z\| |
|---|---|---|---|---|---|---|---|
| NDE+NIE | Natural Direct | 51.28 | 9.89 | 31.90 | 70.67 | 5.18 | <.0001 |
| | Natural Indirect | 48.72 | 9.89 | 29.33 | 68.10 | 4.92 | <.0001 |
| CDE+PE | Controlled Direct | 45.91 | 11.66 | 23.04 | 68.77 | 3.94 | <.0001 |
| | Portion Eliminated | 54.09 | 11.66 | 31.23 | 76.96 | 4.64 | <.0001 |
| TDE+PIE | Total Direct | 54.03 | 14.49 | 25.62 | 82.43 | 3.73 | 0.0002 |
| | Pure Indirect | 45.97 | 14.49 | 17.57 | 74.38 | 3.17 | 0.0015 |
| NDE+PIE+IMD | Natural Direct | 51.28 | 9.89 | 31.90 | 70.67 | 5.18 | <.0001 |
| | Pure Indirect | 45.97 | 14.49 | 17.57 | 74.38 | 3.17 | 0.0015 |
| | Mediated Interaction | 2.74 | 6.70 | -10.40 | 15.88 | 0.41 | 0.6823 |
| CDE+PIE+PAI | Controlled Direct | 45.91 | 11.66 | 23.04 | 68.77 | 3.94 | <.0001 |
| | Pure Indirect | 45.97 | 14.49 | 17.57 | 74.38 | 3.17 | 0.0015 |
| | Portion Due to Interaction | 8.12 | 19.84 | -30.76 | 47.00 | 0.41 | 0.6823 |
| Four-Way | Controlled Direct | 45.91 | 11.66 | 23.04 | 68.77 | 3.94 | <.0001 |
| | Reference Interaction | 5.38 | 13.14 | -20.37 | 31.13 | 0.41 | 0.6824 |
| | Mediated Interaction | 2.74 | 6.70 | -10.40 | 15.88 | 0.41 | 0.6823 |
| | Pure Indirect | 45.97 | 14.49 | 17.57 | 74.38 | 3.17 | 0.0015 |

Note: NDE=CDE+IRF, NIE=PIE+IMD, PAI=IRF+IMD, PE=PAI+PIE, TDE=CDE+PAI.
The entries for the four-way decomposition in Output 38.3.7 show that 46% of the total effect is attributed to neither interaction nor mediation ('Controlled Direct'), 5% is attributed to interaction but not mediation ('Reference Interaction'), 3% is attributed to both mediation and interaction ('Mediated Interaction'), and 46% is attributed to mediation but not interaction ('Pure Indirect').
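Each percentage is the corresponding excess relative risk component divided by the total excess relative risk of 0.7071. The ratios below use the rounded component estimates from Output 38.3.6, so they match the reported percentages only to about one decimal place.

```latex
\begin{align*}
\text{Controlled Direct:}     \quad & \frac{0.3246}{0.7071}  \times 100\% \approx 45.9\% \\
\text{Reference Interaction:} \quad & \frac{0.03801}{0.7071} \times 100\% \approx 5.4\% \\
\text{Mediated Interaction:}  \quad & \frac{0.01940}{0.7071} \times 100\% \approx 2.7\% \\
\text{Pure Indirect:}         \quad & \frac{0.3251}{0.7071}  \times 100\% \approx 46.0\% \\
\text{Reported percentages:}  \quad & 45.91 + 5.38 + 2.74 + 45.97 = 100.00
\end{align*}
```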
In the three-way decomposition labeled 'CDE+PIE+PAI,' the percentage of total effect that is attributed to the interaction (PAI or 'Portion Due to Interaction' in the table) is about 8%, which is not large but is also not ignorable.
Note that some of the confidence intervals in this table include zero, spanning from negative to positive values. The corresponding point estimates are therefore imprecise, and those components cannot be statistically distinguished from zero at the 5% level.
As requested by the first EVALUATE statement, the table in Output 38.3.8 displays the major effects and percentages when the mediator LowBirthWgt is set at the level Yes.
Output 38.3.8: Summary of Smoking Effects for the Low Birth-Weight Group
| Summary of Effects: Low Birth-Weight | Estimate | Standard Error | Wald 95% Lower (Nontransformed) | Wald 95% Upper (Nontransformed) | Z (Nontransformed) | Pr > \|Z\| (Nontransformed) | 95% Lower (Log Scale) | 95% Upper (Log Scale) | Z (Log Scale) | Pr > \|Z\| (Log Scale) |
|---|---|---|---|---|---|---|---|---|---|---|
| Odds Ratio Total Effect | 1.7071 | 0.2215 | 1.2729 | 2.1412 | 3.19 | 0.0014 | 1.3237 | 2.2014 | 4.12 | <.0001 |
| Odds Ratio Controlled Direct Effect (CDE) | 1.0917 | 0.1591 | 0.7799 | 1.4036 | 0.58 | 0.5643 | 0.8205 | 1.4527 | 0.60 | 0.5470 |
| Odds Ratio Natural Direct Effect (NDE) | 1.3626 | 0.1768 | 1.0160 | 1.7092 | 2.05 | 0.0403 | 1.0566 | 1.7572 | 2.38 | 0.0171 |
| Odds Ratio Natural Indirect Effect (NIE) | 1.2528 | 0.03432 | 1.1855 | 1.3201 | 7.37 | <.0001 | 1.1873 | 1.3219 | 8.23 | <.0001 |
| Total Excess Relative Risk | 0.7071 | 0.2215 | 0.2729 | 1.1412 | 3.19 | 0.0014 | | | | |
| Excess Relative Risk Due to CDE | 0.8669 | 1.4959 | -2.0649 | 3.7988 | 0.58 | 0.5622 | | | | |
| Excess Relative Risk Due to NDE | 0.3626 | 0.1768 | 0.01604 | 0.7092 | 2.05 | 0.0403 | | | | |
| Excess Relative Risk Due to NIE | 0.3445 | 0.06119 | 0.2245 | 0.4644 | 5.63 | <.0001 | | | | |
| Percentage Mediated | 48.7165 | 9.8917 | 29.3291 | 68.1040 | 4.92 | <.0001 | | | | |
| Percentage Due to Interaction | -68.5853 | 167.58 | -397.04 | 259.87 | -0.41 | 0.6823 | | | | |
| Percentage Eliminated | -22.6126 | 179.59 | -374.60 | 329.38 | -0.13 | 0.8998 | | | | |
The odds ratio CDE (which is evaluated for the low birth-weight group) is 1.09, with a corresponding 95% confidence interval of (0.82, 1.45) when the log scale is used for inferences.
As requested by the second EVALUATE statement, the table in Output 38.3.9 displays the major effects and percentages when the mediator LowBirthWgt is set at the level No.
Output 38.3.9: Summary of Smoking Effects for the Normal Birth-Weight Group
| Summary of Effects: Normal Birth-Weight | Estimate | Standard Error | Wald 95% Lower (Nontransformed) | Wald 95% Upper (Nontransformed) | Z (Nontransformed) | Pr > \|Z\| (Nontransformed) | 95% Lower (Log Scale) | 95% Upper (Log Scale) | Z (Log Scale) | Pr > \|Z\| (Log Scale) |
|---|---|---|---|---|---|---|---|---|---|---|
| Odds Ratio Total Effect | 1.7071 | 0.2215 | 1.2729 | 2.1412 | 3.19 | 0.0014 | 1.3237 | 2.2014 | 4.12 | <.0001 |
| Odds Ratio Controlled Direct Effect (CDE) | 1.8940 | 0.3540 | 1.2002 | 2.5879 | 2.53 | 0.0116 | 1.3131 | 2.7320 | 3.42 | 0.0006 |
| Odds Ratio Natural Direct Effect (NDE) | 1.3626 | 0.1768 | 1.0160 | 1.7092 | 2.05 | 0.0403 | 1.0566 | 1.7572 | 2.38 | 0.0171 |
| Odds Ratio Natural Indirect Effect (NIE) | 1.2528 | 0.03432 | 1.1855 | 1.3201 | 7.37 | <.0001 | 1.1873 | 1.3219 | 8.23 | <.0001 |
| Total Excess Relative Risk | 0.7071 | 0.2215 | 0.2729 | 1.1412 | 3.19 | 0.0014 | | | | |
| Excess Relative Risk Due to CDE | 0.3246 | 0.1207 | 0.08810 | 0.5611 | 2.69 | 0.0071 | | | | |
| Excess Relative Risk Due to NDE | 0.3626 | 0.1768 | 0.01604 | 0.7092 | 2.05 | 0.0403 | | | | |
| Excess Relative Risk Due to NIE | 0.3445 | 0.06119 | 0.2245 | 0.4644 | 5.63 | <.0001 | | | | |
| Percentage Mediated | 48.7165 | 9.8917 | 29.3291 | 68.1040 | 4.92 | <.0001 | | | | |
| Percentage Due to Interaction | 8.1202 | 19.8380 | -30.7615 | 47.0020 | 0.41 | 0.6823 | | | | |
| Percentage Eliminated | 54.0930 | 11.6647 | 31.2306 | 76.9554 | 4.64 | <.0001 | | | | |
The odds ratio CDE (which is evaluated for the normal birth-weight group) is now 1.89, with a corresponding 95% confidence interval of (1.31, 2.73) when the log scale is used for inferences.
Note that the controlled level of the mediator requested by the second EVALUATE statement coincides with the default setting, which uses the last level of the mediator as the controlled level. Hence, the results in Output 38.3.9 and Output 38.3.5 are identical.
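The 'Percentage Eliminated' rows in the two EVALUATE tables differ only through the controlled direct effect. The sketch below assumes that the CDE+PE relation from Output 38.3.6 (total excess relative risk = CDE + PE) applies at each evaluated mediator level; the results agree with the reported percentages up to rounding of the displayed inputs.

```latex
\begin{align*}
\text{LowBirthWgt = No:}\quad
  & \frac{0.7071 - 0.3246}{0.7071} \times 100\%
    = \frac{0.3825}{0.7071} \times 100\% \approx 54.1\% \\
\text{LowBirthWgt = Yes:}\quad
  & \frac{0.7071 - 0.8669}{0.7071} \times 100\%
    = \frac{-0.1598}{0.7071} \times 100\% \approx -22.6\%
\end{align*}
```

The negative value for the low birth-weight group arises because the excess relative risk due to the CDE at that level (0.8669) exceeds the total excess relative risk; its standard error (1.4959) is large, so this estimate is very imprecise.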