This section illustrates basic features of the CAUSALMED procedure for estimating total, direct, and indirect effects and their corresponding percentages.
The example presented in this section is patterned after the theoretical educational models that are discussed by Marjoribanks (1974). However, the data in this example are simulated, and neither the analysis nor the interpretation of procedure output mirrors that of Marjoribanks (1974).
A study is conducted to understand whether an encouraging environment provided by parents has an effect on the cognitive development of children. A key question is whether the effect of parental encouragement is due in part to its enhancement of children’s motivation to learn. Two pathways for the effect of parental encouragement are possible:
Direct pathway: Encourage → CogPerform
Indirect pathway: Encourage → Motivation → CogPerform
In these pathways, the variable Encourage represents parental encouragement, the variable Motivation represents the learning motivation of the children, and the variable CogPerform represents the cognitive performance of the children. In the terminology of mediation analysis, Encourage is a treatment or an exposure, Motivation is a mediator, and CogPerform is an outcome.
A simulated sample of 300 observations is saved in a data set named Cognitive. Each observation has six variable values, as shown in Figure 2.
Figure 2: First 10 Observations of the Input Data Set
| Obs | SubjectID | FamSize | SocStatus | Encourage | Motivation | CogPerform |
|---|---|---|---|---|---|---|
| 1 | 1 | 7 | 31 | 36 | 40 | 103 |
| 2 | 2 | 3 | 27 | 36 | 40 | 103 |
| 3 | 3 | 0 | 25 | 35 | 40 | 99 |
| 4 | 4 | 6 | 29 | 36 | 40 | 103 |
| 5 | 5 | 4 | 22 | 33 | 37 | 79 |
| 6 | 6 | 2 | 23 | 34 | 38 | 87 |
| 7 | 7 | 0 | 29 | 37 | 41 | 112 |
| 8 | 8 | 4 | 23 | 34 | 38 | 87 |
| 9 | 9 | 3 | 20 | 32 | 36 | 71 |
| 10 | 10 | 3 | 28 | 36 | 40 | 103 |
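A listing like Figure 2 could be reproduced from the ten rows shown there. The following is only a sketch: the data set name Cognitive10 is hypothetical, and it holds just these 10 observations, not the full simulated sample of 300.

```sas
/* Cognitive10 is a hypothetical name; it contains only the 10 rows of Figure 2 */
data Cognitive10;
   input SubjectID FamSize SocStatus Encourage Motivation CogPerform;
   datalines;
1 7 31 36 40 103
2 3 27 36 40 103
3 0 25 35 40 99
4 6 29 36 40 103
5 4 22 33 37 79
6 2 23 34 38 87
7 0 29 37 41 112
8 4 23 34 38 87
9 3 20 32 36 71
10 3 28 36 40 103
;
run;

/* List the observations; the Obs column is added by PROC PRINT */
proc print data=Cognitive10;
run;
```

Ten observations are enough to inspect the variable layout but not to reproduce the effect estimates reported in this section.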
The variables are defined as follows:
CogPerform: the child’s score on a cognitive test (outcome)
Encourage: the sum score of the ratings of three items about parents’ encouraging behavior in a questionnaire (treatment)
FamSize: the size of the child’s family
Motivation: the sum score of the child’s levels of motivation as evaluated by the child, the teacher, and the primary caretaker (mediator)
SocStatus: the child’s social status, which is an aggregate measure of household income, parents’ occupations, and parents’ educational levels
SubjectID: the child’s identifier
The variables FamSize and SocStatus are background, or pretreatment, characteristics that you want to control for when estimating causal effects, whether total, direct, or mediated.
First, consider an analysis in which the pretreatment characteristics are omitted. The following statements invoke PROC CAUSALMED to estimate various effects without controlling for background confounding variables:
proc causalmed data=Cognitive all;
   model CogPerform = Encourage Motivation;
   mediator Motivation = Encourage;
run;
The ALL option in the PROC CAUSALMED statement displays all available output. The MODEL statement specifies the outcome model for CogPerform, which is affected by Encourage and Motivation. The MEDIATOR statement specifies the mediator model for Motivation, which is affected only by Encourage.
The output produced by PROC CAUSALMED is displayed in Figure 3 through Figure 6.
Figure 3 echoes the modeling information and displays the number of observations read and used in the analysis; it also identifies the outcome, treatment, and mediator variables. By default, PROC CAUSALMED assumes normal distributions and identity links for the response variables in the outcome and mediator models because they are continuous.
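Written out, the default specification corresponds to two linear regression models with normally distributed errors. The following is a sketch in notation that does not appear in the procedure output; the symbols $\beta$, $\theta$, $\varepsilon$, and $\sigma$ are labels assumed here for illustration:

```latex
\begin{aligned}
\text{Motivation} &= \beta_0 + \beta_1\,\text{Encourage} + \varepsilon_M,
  & \varepsilon_M &\sim N(0, \sigma_M^2) \\
\text{CogPerform} &= \theta_0 + \theta_1\,\text{Encourage} + \theta_2\,\text{Motivation} + \varepsilon_Y,
  & \varepsilon_Y &\sim N(0, \sigma_Y^2)
\end{aligned}
```

The first line is the mediator model from the MEDIATOR statement, and the second is the outcome model from the MODEL statement.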
Figure 3: Model Information
| Model Information | |
|---|---|
| Data Set | WORK.COGNITIVE |
| Outcome Variable | CogPerform |
| Treatment Variable | Encourage |
| Mediator Variable | Motivation |
| Outcome Modeling | Generalized Linear Model |
| Outcome Distribution | Normal |
| Outcome Link Function | Identity |
| Mediator Modeling | Generalized Linear Model |
| Mediator Distribution | Normal |
| Mediator Link Function | Identity |
| Number of Observations Read | 300 |
| Number of Observations Used | 300 |
Figure 4 presents the estimated effects. All effect estimates and percentage estimates are significant. The total effect estimate is 8.04, which is decomposed into the natural direct effect (NDE=4.28) and natural indirect effect (NIE=3.76). The estimated controlled direct effect (CDE) is 4.28, which is evaluated at the mean value of the mediator variable Motivation by default. In the current model, CDE is the same as NDE. The 'Percentage Mediated' is 46.74%. This means that slightly less than half of the parental encouragement effect on children’s cognitive development can be attributed to the enhancement of children’s learning motivation.
Figure 4: Summary of Total, Direct, and Mediated Effects
| Summary of Effects | ||||||
|---|---|---|---|---|---|---|
| Effect | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Z | Pr > \|Z\| |
| Total Effect | 8.0423 | 0.03200 | 7.9796 | 8.1050 | 251.30 | <.0001 |
| Controlled Direct Effect (CDE) | 4.2835 | 0.1062 | 4.0754 | 4.4917 | 40.33 | <.0001 |
| Natural Direct Effect (NDE) | 4.2835 | 0.1062 | 4.0754 | 4.4917 | 40.33 | <.0001 |
| Natural Indirect Effect (NIE) | 3.7588 | 0.1091 | 3.5449 | 3.9727 | 34.44 | <.0001 |
| Percentage Mediated | 46.7377 | 1.3254 | 44.1400 | 49.3353 | 35.26 | <.0001 |
| Percentage Due to Interaction | 0 | . | . | . | . | . |
| Percentage Eliminated | 46.7377 | 1.3254 | 44.1400 | 49.3353 | 35.26 | <.0001 |
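The relationships among the rows of Figure 4 can be checked by hand. The arithmetic below uses only the displayed estimates; the formula for the percentage eliminated is the usual definition based on the controlled direct effect and is stated here as an assumption about how that row is computed:

```latex
\begin{aligned}
\text{NDE} + \text{NIE} &= 4.2835 + 3.7588 = 8.0423 = \text{Total Effect} \\[4pt]
\text{Percentage Mediated} &= 100 \times \frac{\text{NIE}}{\text{Total Effect}}
  = 100 \times \frac{3.7588}{8.0423} \approx 46.74 \\[4pt]
\text{Percentage Eliminated} &= 100 \times \frac{\text{Total Effect} - \text{CDE}}{\text{Total Effect}}
  = 100 \times \frac{8.0423 - 4.2835}{8.0423} \approx 46.74
\end{aligned}
```

Because the model has no interaction term, CDE equals NDE, so the percentage eliminated coincides with the percentage mediated.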
The tables in Figure 5 and Figure 6 are useful for confirming the direction of the effects. Figure 5 shows the estimates of the outcome model for CogPerform.
Figure 5: Estimates of the Outcome Model
| Outcome Model Estimates | ||||||
|---|---|---|---|---|---|---|
| Parameter | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Wald Chi-Square | Pr > ChiSq |
| Intercept | -201.21 | 0.6426 | -202.47 | -199.95 | 98053.6157 | <.0001 |
| Encourage | 4.2835 | 0.1062 | 4.0754 | 4.4917 | 1626.7935 | <.0001 |
| Motivation | 3.7576 | 0.1052 | 3.5514 | 3.9639 | 1274.6903 | <.0001 |
| Scale | 0.4605 | 0.01880 | 0.4251 | 0.4989 | ||
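Figure 6 shows the estimates of the mediator model for Motivation. The estimated effect of Encourage on Motivation (1.0003) is positive and significant, thus confirming the positive effect of parental encouragement on children’s learning motivation. In the outcome model (Figure 5), the estimated direct effects of Encourage and Motivation on CogPerform are likewise both positive and significant.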
Figure 6: Estimates of the Mediator Model
| Mediator Model Estimates | ||||||
|---|---|---|---|---|---|---|
| Parameter | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Wald Chi-Square | Pr > ChiSq |
| Intercept | 4.0428 | 0.2641 | 3.5251 | 4.5605 | 234.2732 | <.0001 |
| Encourage | 1.0003 | 0.007663 | 0.9853 | 1.0153 | 17040.9178 | <.0001 |
| Scale | 0.2526 | 0.01031 | 0.2332 | 0.2737 | ||
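For linear outcome and mediator models without a treatment-mediator interaction, the effects in Figure 4 can be recovered from the regression coefficients in Figure 5 and Figure 6. In the sketch below, $\theta_1$ and $\theta_2$ denote the outcome-model coefficients of Encourage and Motivation, and $\beta_1$ denotes the mediator-model coefficient of Encourage; this notation is assumed here for illustration, and the match with Figure 4 is consistent with effects being expressed per one-unit increase in Encourage:

```latex
\begin{aligned}
\text{NDE} &= \theta_1 = 4.2835 \\
\text{NIE} &= \beta_1\,\theta_2 = 1.0003 \times 3.7576 \approx 3.7587 \\
\text{Total Effect} &= \theta_1 + \beta_1\,\theta_2 \approx 4.2835 + 3.7587 = 8.0422
\end{aligned}
```

The small differences from the reported values (NIE = 3.7588, total effect = 8.0423) come from rounding in the displayed coefficients.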
Although the preceding analysis is interpretable, it does not take full advantage of the causal analytic techniques that are available in the CAUSALMED procedure. In order to draw valid causal interpretations from observational data, you must statistically control for all important confounding background characteristics.
Assume that FamSize and SocStatus are the only important confounding background characteristics that need to be controlled for. You can specify these variables as covariates in the COVAR statement and use PROC CAUSALMED as follows to fit an appropriate causal mediation model:
proc causalmed data=Cognitive;
   model CogPerform = Encourage Motivation;
   mediator Motivation = Encourage;
   covar FamSize SocStatus;
run;
When the confounding covariates FamSize and SocStatus are included, the procedure adjusts the estimates of the causal effects. The new results are summarized in Figure 7.
Figure 7: Summary of Causal Effects
| Summary of Effects | ||||||
|---|---|---|---|---|---|---|
| Effect | Estimate | Standard Error | Wald 95% Lower Confidence Limit | Wald 95% Upper Confidence Limit | Z | Pr > \|Z\| |
| Total Effect | 6.8435 | 0.1525 | 6.5446 | 7.1424 | 44.88 | <.0001 |
| Controlled Direct Effect (CDE) | 4.2962 | 0.1098 | 4.0811 | 4.5114 | 39.14 | <.0001 |
| Natural Direct Effect (NDE) | 4.2962 | 0.1098 | 4.0811 | 4.5114 | 39.14 | <.0001 |
| Natural Indirect Effect (NIE) | 2.5473 | 0.1563 | 2.2410 | 2.8536 | 16.30 | <.0001 |
| Percentage Mediated | 37.2219 | 1.7523 | 33.7874 | 40.6564 | 21.24 | <.0001 |
| Percentage Due to Interaction | 0 | . | . | . | . | . |
| Percentage Eliminated | 37.2219 | 1.7523 | 33.7874 | 40.6564 | 21.24 | <.0001 |
The total effect of Encourage on CogPerform is now 6.84, which is about 1.2 points lower than the total effect that is obtained without including the confounding covariates in the analysis (see Figure 4). This discrepancy suggests that part of the observed association between Encourage and CogPerform is indeed due to their associations with the confounding background covariates. Failure to adjust for the confounding covariates led to an inflated estimate of the total causal effect in Figure 4.
The natural direct effect (NDE) in the current analysis is 4.30, which is not much different from that of the preceding analysis. However, the natural indirect effect (NIE) is now 2.55, which is more than 1 point lower than the NIE in Figure 4. Finally, the 'Percentage Mediated' is now only 37%, which is almost 10 percentage points lower than the 'Percentage Mediated' (47%) that Figure 4 shows.
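The size of each shift can be read directly from Figure 4 (unadjusted) and Figure 7 (adjusted). The differences below are simple subtractions of the displayed estimates, shown as a worked check rather than as procedure output:

```latex
\begin{aligned}
\Delta\,\text{Total Effect} &= 8.0423 - 6.8435 = 1.1988 \\
\Delta\,\text{NDE} &= 4.2962 - 4.2835 = 0.0127 \\
\Delta\,\text{NIE} &= 3.7588 - 2.5473 = 1.2115 \\
\Delta\,\text{Percentage Mediated} &= 46.7377 - 37.2219 = 9.5158 \ \text{percentage points}
\end{aligned}
```

Nearly all of the drop in the total effect is therefore a drop in the indirect effect; the direct effect is almost unchanged by the covariate adjustment.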
These results demonstrate that you must carefully consider the set of confounding covariates when conducting a causal mediation analysis. First and foremost, the assumption of no unmeasured confounding must be reasonably satisfied. That is, to enable causal interpretations of the effect estimates, the baseline covariates for which adjustment is made must suffice to control for treatment-outcome, mediator-outcome, and treatment-mediator confounding. Second, causal analysis from observational data might involve many other assumptions that require serious attention. For instance, in the current example you could consider the following questions:
1. Why should the analysis assume that the variables Encourage and Motivation do not have an interaction effect on CogPerform? Is there any justification for this assumption?
2. If there is an interaction effect between the treatment and the mediator, what is the amount of this effect?
3. What justifies treating Encourage as the cause and CogPerform as the effect?
4. Is the causal sequence among the variables Encourage, Motivation, and CogPerform properly captured in the data?
You can address Questions 1 and 2 by fitting a more general model that includes the interaction term to determine whether the interaction effect is ignorable. PROC CAUSALMED supports outcome models that have interaction effects, as illustrated in Example 38.1, which presents a continuation of the current analysis.
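One way such a model might be specified is sketched below. It assumes that the full Cognitive data set from this section is available, and it adds an Encourage*Motivation term to the outcome model; it is an illustration and not necessarily the exact code that the continuation example uses.

```sas
/* Sketch: same causal mediation model as before, plus a
   treatment-by-mediator interaction in the outcome model.
   Assumes the 300-observation Cognitive data set exists. */
proc causalmed data=Cognitive;
   model CogPerform = Encourage Motivation Encourage*Motivation;
   mediator Motivation = Encourage;
   covar FamSize SocStatus;
run;
```

With the interaction term in the model, the 'Percentage Due to Interaction' row of the effect summary is no longer fixed at 0, so its estimate and confidence limits give a direct indication of whether the interaction can reasonably be ignored.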
Question 3 does not have a definite statistical answer. Substantive knowledge or existing evidence about the relationships is required to justify the assumed causal direction.
An answer to Question 4 must also be justified by using substantive knowledge about the system of interest. In many systems there are temporal conditions that the data must satisfy so that the effects of the treatment on the outcome, the treatment on the mediator, and the mediator on the outcome can be observed.
Some researchers use longitudinal studies to establish the causal sequence. For instance, you can collect data in stages to ensure a proper temporal ordering of the causal, mediation, and outcome events. In this example, you could collect data for CogPerform several months after collecting data for Motivation, which were collected several years after you obtained the information about Encourage and any pretreatment confounders.
If you collect all the data at the same time point, you would need to justify that the parental encouragement pattern has long been established and that its effect on children’s learning motivation has been stabilized well before the children took the cognitive performance test. In addition, you would also need to justify that the background or pretreatment characteristics had been stabilized well before the measurements of the treatment, mediator, and the outcome. Substantive knowledge is required to support these justifications.
The role of the CAUSALMED procedure is to estimate causal mediation effects given that all related assumptions are satisfied. The procedure can only serve as a tool to refute the presence of causal effects (when estimates are close to zero) given the model. The procedure cannot be used to establish causal interpretations of effects if the necessary methodological and statistical assumptions are not satisfied. For more information about assumptions of causal mediation analysis, see the sections "Causal Mediation Effects: Theory, Definitions, and Effect Decompositions" and "Causal Mediation Effects: Assumptions, Identification, and Estimation."