The CAUSALMED Procedure

Example 38.3 Smoking Effect on Infant Mortality


This example demonstrates causal mediation analysis with treatment, outcome, and mediator variables that are all binary. The data contain information about infant mortality in 2003 and were obtained from the US National Center for Health Statistics. A random sample of 100,000 observations is used in this example. The analysis and its interpretation are purely illustrative; definitive conclusions should not be drawn from this example.

The following statements print the first 10 observations of the data set, which are shown in Output 38.3.1:

proc print data=sashelp.birthwgt(obs=10);
run;

Output 38.3.1: First 10 Observations of birthwgt Data Set

Obs  LowBirthWgt  Married  AgeGroup  Race    Drinking  Death  Smoking  SomeCollege
1    No           No       3         Asian   No        No     No       Yes
2    No           No       2         White   No        No     No       No
3    Yes          Yes      2         Native  No        Yes    No       No
4    No           No       2         White   No        No     No       No
5    No           No       2         White   No        No     No       Yes
6    No           No       2         White   No        No     No
7    No           No       2         Asian   No        No     No       Yes
8    No           No       3         White   No        No     No       Yes
9    No           Yes      1         Black   No        No     No       No
10   No           No       2         Native  No        No     No       Yes


The main variables in the analysis are as follows:

  • The treatment variable is Smoking. It is an indicator of maternal smoking behavior, with values Yes and No.

  • The outcome variable is Death. It is an indicator of infant death within one year of birth, with values Yes and No.

  • The mediator variable is LowBirthWgt. It is an indicator of low birth weight (less than 2,500 grams), with values Yes and No.

The analysis also includes five confounding covariates:

  • AgeGroup represents maternal ages of less than 20, between 20 and 35, and greater than 35, with values 1, 2, and 3, respectively.

  • Drinking is an indicator of maternal drinking during pregnancy, with values Yes and No.

  • Married is an indicator of marital status, with values Yes and No.

  • Race is a categorical variable for race, with values Asian, Black, Hispanic, Native (Native American), and White.

  • SomeCollege is an indicator of whether the mother has 12 or more years of education, with values Yes and No.

The following statements specify a causal mediation model:

proc causalmed data=sashelp.birthwgt decomp;
   class LowBirthWgt Smoking Death AgeGroup Married Race
         Drinking SomeCollege /descending;
   mediator LowBirthWgt = Smoking;
   model Death = LowBirthWgt | Smoking;
   covar AgeGroup Married Race Drinking SomeCollege;
   evaluate 'Low Birth-Weight' LowBirthWgt='Yes' / nodecomp;
   evaluate 'Normal Birth-Weight' LowBirthWgt='No' / nodecomp;
run;

The DECOMP option requests various total effect decompositions. The MEDIATOR statement specifies the mediator model for the response LowBirthWgt. The MODEL statement specifies the outcome model for the response Death and assumes an interaction between LowBirthWgt and Smoking. The CLASS statement names the categorical variables in the analysis. Its DESCENDING option reverses the sort order of the class levels so that Yes becomes the first ordered level of both responses; the procedure therefore models the probabilities of Death=Yes and LowBirthWgt=Yes. The COVAR statement specifies the five covariates. Finally, the two EVALUATE statements specify the mediator levels at which the causal mediation effects are evaluated, so that the patterns of effects at the two levels can be compared.

Output 38.3.2 displays the model information, which includes the outcome, treatment, and mediator variables, the distributions, and the link functions of the response variables. Because observations that have missing values are not included, only 93,292 observations are used for analysis.
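One way to check the observation count is to tally the complete cases directly. The following DATA step is a minimal sketch, assuming that an observation is dropped whenever any of the eight analysis variables is missing; the variable name nComplete is introduced here for illustration only:

```sas
data _null_;
   set sashelp.birthwgt end=last;
   /* CMISS counts missing arguments and accepts both numeric and character values */
   if cmiss(LowBirthWgt, Smoking, Death, AgeGroup, Married,
            Race, Drinking, SomeCollege) = 0 then nComplete + 1;
   /* Write the tally to the log after the last observation is read */
   if last then put 'Complete cases: ' nComplete;
run;
```

If that assumption holds, the count written to the log should agree with the number of observations used that is reported in Output 38.3.2.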

Output 38.3.2: Model Information

Model Information
Data Set                SASHELP.BIRTHWGT
Outcome Variable        Death
Treatment Variable      Smoking
Mediator Variable       LowBirthWgt
Outcome Modeling        Generalized Linear Model
Outcome Distribution    Binomial
Outcome Link Function   Logit
Mediator Modeling       Generalized Linear Model
Mediator Distribution   Binomial
Mediator Link Function  Logit

Number of Observations Read  100000
Number of Observations Used  93292


Output 38.3.3 displays the levels of the categorical variables, including binary response variables.

Output 38.3.3: Class Levels

Class Level Information
Class        Levels  Values
LowBirthWgt  2       Yes No
Smoking      2       Yes No
Death        2       Yes No
AgeGroup     3       3 2 1
Married      2       Yes No
Race         5       White Native Hispanic Black Asian
Drinking     2       Yes No
SomeCollege  2       Yes No


Output 38.3.4 displays frequency counts of the binary outcome, mediator, and treatment variables. It also shows which levels of the response variables are being modeled.

Output 38.3.4: Profiles of Binary Outcome, Mediator, and Treatment Variables

Response Profile
Ordered Value  Death  Total Frequency
1              Yes    527
2              No     92765

Outcome probability modeled is Death='Yes'.

Mediator Profile
Ordered Value  LowBirthWgt  Total Frequency
1              Yes          7562
2              No           85730

Mediator probability modeled is LowBirthWgt='Yes'.

Treatment Profile
Ordered Value  Smoking  Total Frequency
1              Yes      20984
2              No       72308


Output 38.3.5 displays the major decompositions of effects on infant mortality on both the odds ratio (OR) scale and the excess relative risk scale. Percentages of the total effect are displayed only on the excess relative risk scale.

Output 38.3.5: Summary of Effects on Infant Mortality

Summary of Effects
                                                               Nontransformed Scale                 Back-Transformed from Log Scale
                                                     Standard  Wald 95%                             95%
                                           Estimate  Error     Confidence Limits   Z      Pr > |Z|  Confidence Limits   Z      Pr > |Z|
Odds Ratio Total Effect                    1.7071    0.2215    1.2729    2.1412    3.19   0.0014    1.3237    2.2014    4.12   <.0001
Odds Ratio Controlled Direct Effect (CDE)  1.8940    0.3540    1.2002    2.5879    2.53   0.0116    1.3131    2.7320    3.42   0.0006
Odds Ratio Natural Direct Effect (NDE)     1.3626    0.1768    1.0160    1.7092    2.05   0.0403    1.0566    1.7572    2.38   0.0171
Odds Ratio Natural Indirect Effect (NIE)   1.2528    0.03432   1.1855    1.3201    7.37   <.0001    1.1873    1.3219    8.23   <.0001
Total Excess Relative Risk                 0.7071    0.2215    0.2729    1.1412    3.19   0.0014
Excess Relative Risk Due to CDE            0.3246    0.1207    0.08810   0.5611    2.69   0.0071
Excess Relative Risk Due to NDE            0.3626    0.1768    0.01604   0.7092    2.05   0.0403
Excess Relative Risk Due to NIE            0.3445    0.06119   0.2245    0.4644    5.63   <.0001
Percentage Mediated                        48.7165   9.8917    29.3291   68.1040   4.92   <.0001
Percentage Due to Interaction              8.1202    19.8380   -30.7615  47.0020   0.41   0.6823
Percentage Eliminated                      54.0930   11.6647   31.2306   76.9554   4.64   <.0001


The first four rows of the table in Output 38.3.5 summarize the effects on the odds ratio scale. The controlled direct effect (CDE) on this scale is 1.894. This is the CDE when the mediator variable LowBirthWgt is controlled at the level No. In other words, this is the CDE odds ratio for the group that has normal birth weights. The natural direct effect (NDE) and natural indirect effect (NIE) on the odds ratio scale are 1.363 and 1.253, respectively. Their product, rather than their sum, is the same as the total effect on the odds ratio scale, which is 1.707.
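The multiplicative relationship can be checked with a short DATA step. This is only an arithmetic sketch that uses the rounded estimates from Output 38.3.5; the variable names are illustrative:

```sas
data _null_;
   nde = 1.3626;          /* odds ratio NDE from Output 38.3.5 */
   nie = 1.2528;          /* odds ratio NIE from Output 38.3.5 */
   product = nde * nie;   /* compare with the odds ratio total effect */
   put product= 6.4;
run;
```

The product written to the log is roughly 1.7071, which agrees with the reported odds ratio total effect up to rounding of the inputs.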

Output 38.3.5 provides two sets of confidence limits, z-values, and p-values for the odds ratio effects. One set assumes normality on the original (nontransformed) scale, and the other assumes normality on the log scale and back-transforms the limits. The log-scale inferences are sometimes preferred because they are range-respecting: unlike the nontransformed Wald confidence intervals, which can include negative values for odds ratios, confidence limits that are back-transformed from the log scale are always positive. In the current example, however, the two sets of confidence limits, z-values, and p-values lead to essentially the same statistical inferences about the four causal mediation effects.
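The following DATA step sketches how log-scale limits of this kind can be computed for the odds ratio total effect. It assumes that the standard error on the log scale is obtained by the delta method (the nontransformed standard error divided by the estimate); that assumption is not stated in this example, and the variable names are illustrative:

```sas
data _null_;
   est = 1.7071;                      /* odds ratio total effect from Output 38.3.5 */
   se  = 0.2215;                      /* its standard error on the nontransformed scale */
   seLog = se / est;                  /* assumed delta-method standard error of log(est) */
   z     = quantile('NORMAL', 0.975); /* two-sided 95% critical value */
   lower = exp(log(est) - z*seLog);   /* back-transformed lower limit */
   upper = exp(log(est) + z*seLog);   /* back-transformed upper limit */
   zLog  = log(est) / seLog;          /* z statistic for log(est) = 0 */
   put lower= 6.4 upper= 6.4 zLog= 5.2;
run;
```

Under this assumption, the computed limits and z-value come out close to the back-transformed values reported for the total effect, with small differences that are attributable to rounding of the inputs. Because the limits are exponentiated, they cannot be negative.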

The next seven rows of the table in Output 38.3.5 summarize effects on the excess relative risk (ERR) scale. The natural direct effect (0.363) and natural indirect effect (0.345) have an additive property on this scale; they sum to the total excess relative risk, which is 0.707. Additivity makes it easier to use these values to deduce the 'Percentage Mediated', which is 48.72% (= 0.3445/0.7071 × 100%). Therefore, about 49% of the smoking effect on infant mortality is mediated through the lowering of babies’ birth weights. However, the 95% confidence interval for the 'Percentage Mediated' is (29.3%, 68.1%), which is fairly wide. More data would yield a more precise interval estimate.
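The excess relative risk values can also be related back to the odds ratios. The sketch below assumes the usual ratio-scale relationships (ERR due to NDE equals the NDE odds ratio minus 1, and ERR due to NIE equals the NDE odds ratio times the NIE odds ratio minus 1); these formulas are not given in this example and are stated here as an assumption, and the variable names are illustrative:

```sas
data _null_;
   orNDE = 1.3626;                       /* odds ratio NDE from Output 38.3.5 */
   orNIE = 1.2528;                       /* odds ratio NIE from Output 38.3.5 */
   errNDE   = orNDE - 1;                 /* assumed ERR due to NDE */
   errNIE   = orNDE * (orNIE - 1);       /* assumed ERR due to NIE */
   errTotal = errNDE + errNIE;           /* additive on the ERR scale */
   pctMediated = 100 * errNIE / errTotal;
   put errNDE= 6.4 errNIE= 6.4 errTotal= 6.4 pctMediated= 6.2;
run;
```

With the rounded inputs, the results are close to the ERR estimates and the 'Percentage Mediated' in Output 38.3.5; small discrepancies in the last digit reflect rounding.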

The percentage of total effect due to the interaction between smoking and low birth weights is about 8%, which is relatively small. Again, the corresponding 95% confidence interval, (-30.8%, 47.0%), is quite wide, and it includes zero (p = 0.68).

The DECOMP option requests various total effect decompositions, which are shown in Output 38.3.6. Following VanderWeele (2014), all these decompositions are computed on the excess relative risk scale.

Output 38.3.6: Decompositions of Smoking Effects on Infant Mortality

Decompositions of Total Excess Relative Risk
                                                     Standard  Wald 95%
Decomposition  Excess Relative Risk        Estimate  Error     Confidence Limits   Z      Pr > |Z|
NDE+NIE        Natural Direct              0.3626    0.1768    0.01604   0.7092    2.05   0.0403
               Natural Indirect            0.3445    0.06119   0.2245    0.4644    5.63   <.0001
CDE+PE         Controlled Direct           0.3246    0.1207    0.08810   0.5611    2.69   0.0071
               Portion Eliminated          0.3825    0.1556    0.07752   0.6874    2.46   0.0140
TDE+PIE        Total Direct                0.3820    0.2188    -0.04688  0.8109    1.75   0.0809
               Pure Indirect               0.3251    0.03534   0.2558    0.3943    9.20   <.0001
NDE+PIE+IMD    Natural Direct              0.3626    0.1768    0.01604   0.7092    2.05   0.0403
               Pure Indirect               0.3251    0.03534   0.2558    0.3943    9.20   <.0001
               Mediated Interaction        0.01940   0.05229   -0.08309  0.1219    0.37   0.7106
CDE+PIE+PAI    Controlled Direct           0.3246    0.1207    0.08810   0.5611    2.69   0.0071
               Pure Indirect               0.3251    0.03534   0.2558    0.3943    9.20   <.0001
               Portion Due to Interaction  0.05742   0.1547    -0.2457   0.3606    0.37   0.7105
Four-Way       Controlled Direct           0.3246    0.1207    0.08810   0.5611    2.69   0.0071
               Reference Interaction       0.03801   0.1024    -0.1627   0.2387    0.37   0.7105
               Mediated Interaction        0.01940   0.05229   -0.08309  0.1219    0.37   0.7106
               Pure Indirect               0.3251    0.03534   0.2558    0.3943    9.20   <.0001
Total Excess Relative Risk                 0.7071    0.2215    0.2729    1.1412    3.19   0.0014
Note: NDE=CDE+IRF, NIE=PIE+IMD, PAI=IRF+IMD, PE=PAI+PIE, TDE=CDE+PAI.
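The identities in the table note can be verified from the four-way components. The following DATA step is an arithmetic sketch that uses the rounded estimates from Output 38.3.6 (IRF denotes the reference interaction); the variable names are illustrative:

```sas
data _null_;
   /* Four-way components from Output 38.3.6 */
   cde = 0.3246;  irf = 0.03801;  imd = 0.01940;  pie = 0.3251;
   nde   = cde + irf;               /* natural direct */
   nie   = pie + imd;               /* natural indirect */
   pai   = irf + imd;               /* portion due to interaction */
   pe    = pai + pie;               /* portion eliminated */
   tde   = cde + pai;               /* total direct */
   total = cde + irf + imd + pie;   /* total excess relative risk */
   put nde= 6.4 nie= 6.4 pai= 7.5 pe= 6.4 tde= 6.4 total= 6.4;
run;
```

Each derived value should agree with the corresponding row of Output 38.3.6 up to rounding in the last digit, which shows that the two-way and three-way decompositions are regroupings of the same four components.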


As shown in Output 38.3.7, PROC CAUSALMED also displays the corresponding decompositions by their percentage contribution to the total effect on the excess relative risk scale.

Output 38.3.7: Percentage Decomposition of Smoking Effects on Infant Mortality

Percentage Decompositions of Total Excess Relative Risk
                                                     Standard  Wald 95%
Decomposition  Excess Relative Risk        Percent   Error     Confidence Limits   Z      Pr > |Z|
NDE+NIE        Natural Direct              51.28     9.89      31.90     70.67     5.18   <.0001
               Natural Indirect            48.72     9.89      29.33     68.10     4.92   <.0001
CDE+PE         Controlled Direct           45.91     11.66     23.04     68.77     3.94   <.0001
               Portion Eliminated          54.09     11.66     31.23     76.96     4.64   <.0001
TDE+PIE        Total Direct                54.03     14.49     25.62     82.43     3.73   0.0002
               Pure Indirect               45.97     14.49     17.57     74.38     3.17   0.0015
NDE+PIE+IMD    Natural Direct              51.28     9.89      31.90     70.67     5.18   <.0001
               Pure Indirect               45.97     14.49     17.57     74.38     3.17   0.0015
               Mediated Interaction        2.74      6.70      -10.40    15.88     0.41   0.6823
CDE+PIE+PAI    Controlled Direct           45.91     11.66     23.04     68.77     3.94   <.0001
               Pure Indirect               45.97     14.49     17.57     74.38     3.17   0.0015
               Portion Due to Interaction  8.12      19.84     -30.76    47.00     0.41   0.6823
Four-Way       Controlled Direct           45.91     11.66     23.04     68.77     3.94   <.0001
               Reference Interaction       5.38      13.14     -20.37    31.13     0.41   0.6824
               Mediated Interaction        2.74      6.70      -10.40    15.88     0.41   0.6823
               Pure Indirect               45.97     14.49     17.57     74.38     3.17   0.0015
Note: NDE=CDE+IRF, NIE=PIE+IMD, PAI=IRF+IMD, PE=PAI+PIE, TDE=CDE+PAI.


The entries for the four-way decomposition in Output 38.3.7 show that 46% of the total effect is attributed to neither interaction nor mediation ('Controlled Direct'), 5% is attributed to interaction but not mediation ('Reference Interaction'), 3% is attributed to both mediation and interaction ('Mediated Interaction'), and 46% is attributed to mediation but not interaction ('Pure Indirect').
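The four-way percentages are the four excess relative risk components expressed as shares of the total. The following DATA step is a sketch that recomputes them from the rounded estimates in Output 38.3.6; the array and variable names are illustrative:

```sas
data _null_;
   /* Four-way ERR components from Output 38.3.6: CDE, IRF, IMD, PIE */
   array comp{4} cde irf imd pie (0.3246 0.03801 0.01940 0.3251);
   total = 0.7071;                   /* total excess relative risk */
   do i = 1 to 4;
      pct = 100 * comp{i} / total;   /* share of the total effect */
      pctSum + pct;                  /* running sum of the four shares */
      put pct= 6.2;
   end;
   put pctSum= 7.2;
run;
```

The four shares should match the four-way rows of Output 38.3.7 up to rounding in the last digit, and they sum to approximately 100%.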

In the three-way decomposition labeled 'CDE+PIE+PAI,' the percentage of total effect that is attributed to the interaction (PAI or 'Portion Due to Interaction' in the table) is about 8%, which is not large but is also not ignorable.

Note that some of the confidence intervals in this table include zero. The corresponding percentages are therefore not statistically distinguishable from zero, and their point estimates are imprecise.

As requested by the first EVALUATE statement, the table in Output 38.3.8 displays the major effects and percentages when the mediator LowBirthWgt is set at the level Yes.

Output 38.3.8: Summary of Smoking Effects for the Low Birth-Weight Group

Summary of Effects: Low Birth-Weight
                                                               Nontransformed Scale                 Back-Transformed from Log Scale
                                                     Standard  Wald 95%                             95%
                                           Estimate  Error     Confidence Limits   Z      Pr > |Z|  Confidence Limits   Z      Pr > |Z|
Odds Ratio Total Effect                    1.7071    0.2215    1.2729    2.1412    3.19   0.0014    1.3237    2.2014    4.12   <.0001
Odds Ratio Controlled Direct Effect (CDE)  1.0917    0.1591    0.7799    1.4036    0.58   0.5643    0.8205    1.4527    0.60   0.5470
Odds Ratio Natural Direct Effect (NDE)     1.3626    0.1768    1.0160    1.7092    2.05   0.0403    1.0566    1.7572    2.38   0.0171
Odds Ratio Natural Indirect Effect (NIE)   1.2528    0.03432   1.1855    1.3201    7.37   <.0001    1.1873    1.3219    8.23   <.0001
Total Excess Relative Risk                 0.7071    0.2215    0.2729    1.1412    3.19   0.0014
Excess Relative Risk Due to CDE            0.8669    1.4959    -2.0649   3.7988    0.58   0.5622
Excess Relative Risk Due to NDE            0.3626    0.1768    0.01604   0.7092    2.05   0.0403
Excess Relative Risk Due to NIE            0.3445    0.06119   0.2245    0.4644    5.63   <.0001
Percentage Mediated                        48.7165   9.8917    29.3291   68.1040   4.92   <.0001
Percentage Due to Interaction              -68.5853  167.58    -397.04   259.87    -0.41  0.6823
Percentage Eliminated                      -22.6126  179.59    -374.60   329.38    -0.13  0.8998


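The odds ratio CDE (which is evaluated for the low birth-weight group) is 1.09, with a corresponding 95% confidence interval of (0.82, 1.45) when the log scale is used for inferences. Because this interval includes 1, the controlled direct effect of smoking is not statistically significant for this group.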
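The negative 'Percentage Eliminated' in Output 38.3.8 can be traced to the CDE+PE decomposition, in which the portion eliminated is the total excess relative risk minus the excess relative risk due to the CDE. The following DATA step is an arithmetic sketch that uses the rounded estimates from Output 38.3.8 and Output 38.3.5; the variable names are illustrative:

```sas
data _null_;
   total     = 0.7071;   /* total excess relative risk */
   cdeLow    = 0.8669;   /* ERR due to CDE at LowBirthWgt='Yes' (Output 38.3.8) */
   cdeNormal = 0.3246;   /* ERR due to CDE at LowBirthWgt='No' (Output 38.3.5) */
   pctElimLow    = 100 * (total - cdeLow) / total;
   pctElimNormal = 100 * (total - cdeNormal) / total;
   put pctElimLow= 8.2 pctElimNormal= 8.2;
run;
```

For the low birth-weight group, the excess relative risk due to the CDE exceeds the total, so the percentage eliminated is negative (about -22.6%); for the normal birth-weight group it is about 54.1%. Both values agree with the reported percentages up to rounding.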

As requested by the second EVALUATE statement, the table in Output 38.3.9 displays the major effects and percentages when the mediator LowBirthWgt is set at the level No.

Output 38.3.9: Summary of Smoking Effects for the Normal Birth-Weight Group

Summary of Effects: Normal Birth-Weight
                                                               Nontransformed Scale                 Back-Transformed from Log Scale
                                                     Standard  Wald 95%                             95%
                                           Estimate  Error     Confidence Limits   Z      Pr > |Z|  Confidence Limits   Z      Pr > |Z|
Odds Ratio Total Effect                    1.7071    0.2215    1.2729    2.1412    3.19   0.0014    1.3237    2.2014    4.12   <.0001
Odds Ratio Controlled Direct Effect (CDE)  1.8940    0.3540    1.2002    2.5879    2.53   0.0116    1.3131    2.7320    3.42   0.0006
Odds Ratio Natural Direct Effect (NDE)     1.3626    0.1768    1.0160    1.7092    2.05   0.0403    1.0566    1.7572    2.38   0.0171
Odds Ratio Natural Indirect Effect (NIE)   1.2528    0.03432   1.1855    1.3201    7.37   <.0001    1.1873    1.3219    8.23   <.0001
Total Excess Relative Risk                 0.7071    0.2215    0.2729    1.1412    3.19   0.0014
Excess Relative Risk Due to CDE            0.3246    0.1207    0.08810   0.5611    2.69   0.0071
Excess Relative Risk Due to NDE            0.3626    0.1768    0.01604   0.7092    2.05   0.0403
Excess Relative Risk Due to NIE            0.3445    0.06119   0.2245    0.4644    5.63   <.0001
Percentage Mediated                        48.7165   9.8917    29.3291   68.1040   4.92   <.0001
Percentage Due to Interaction              8.1202    19.8380   -30.7615  47.0020   0.41   0.6823
Percentage Eliminated                      54.0930   11.6647   31.2306   76.9554   4.64   <.0001


The odds ratio CDE (which is evaluated for the normal birth-weight group) is now 1.89, with a corresponding 95% confidence interval of (1.31, 2.73) when the log scale is used for inferences.

Note that the controlled level of the mediator requested by the second EVALUATE statement coincides with the default setting, which uses the last level of the mediator as the controlled level. Hence, the results in Output 38.3.9 and Output 38.3.5 are identical.

Last updated: December 09, 2022