The CAUSALMED Procedure

Getting Started: CAUSALMED Procedure

This section illustrates basic features of the CAUSALMED procedure for estimating total, direct, and indirect effects and their corresponding percentages.

The example presented in this section is patterned after the theoretical educational models that are discussed by Marjoribanks (1974). However, the data in this example are simulated, and neither the analysis nor the interpretation of procedure output mirrors that of Marjoribanks (1974).

A study is conducted to understand whether an encouraging environment provided by parents has an effect on the cognitive development of children. A key question is whether the effect of parental encouragement is due in part to its enhancement of children’s motivation to learn. Two pathways of parental encouragement effect are possible:

  • a direct pathway, which can be denoted as Encourage → CogPerform

  • a mediated or indirect pathway, which can be denoted as Encourage → Motivation → CogPerform

In these pathways, the variable Encourage represents parental encouragement, the variable Motivation represents the learning motivation of the children, and the variable CogPerform represents the cognitive performance of the children. In the terminology of mediation analysis, Encourage is a treatment or an exposure, Motivation is a mediator, and CogPerform is an outcome.

A simulated sample of 300 observations is saved in a data set named Cognitive. Each observation has six variable values, as shown in Figure 2.

Figure 2: First 10 Observations of the Input Data Set

Obs SubjectID FamSize SocStatus Encourage Motivation CogPerform
1 1 7 31 36 40 103
2 2 3 27 36 40 103
3 3 0 25 35 40 99
4 4 6 29 36 40 103
5 5 4 22 33 37 79
6 6 2 23 34 38 87
7 7 0 29 37 41 112
8 8 4 23 34 38 87
9 9 3 20 32 36 71
10 10 3 28 36 40 103
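
A listing like the one in Figure 2 can be produced with PROC PRINT. The following is a minimal sketch, assuming the Cognitive data set already exists in the WORK library; the OBS= data set option limits the listing to the first 10 observations:

```sas
/* Sketch: list the first 10 observations of the simulated data set. */
/* Assumes WORK.Cognitive has already been created.                   */
proc print data=Cognitive(obs=10);
run;
```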


The variables are defined as follows:

  • CogPerform: the child’s score on a cognitive test (outcome)

  • Encourage: the sum score of the ratings of three items about parents’ encouraging behavior in a questionnaire (treatment)

  • FamSize: the size of the child’s family

  • Motivation: the sum score of the child’s levels of motivation as evaluated by the child, the teacher, and the primary caretaker (mediator)

  • SocStatus: the child’s social status, which is an aggregate measure of household income, parents’ occupations, and parents’ educational levels

  • SubjectID: the child’s identifier

The variables FamSize and SocStatus are background or pretreatment characteristics that you would like to control for when estimating the causal effects, whether total, direct, or mediated.

First, consider an analysis in which the pretreatment characteristics are omitted. The following statements invoke PROC CAUSALMED to estimate various effects without controlling for background confounding variables:

proc causalmed data=Cognitive all;
   model    CogPerform  = Encourage Motivation;
   mediator Motivation  = Encourage;
run;

The ALL option in the PROC CAUSALMED statement displays all available output. The MODEL statement specifies the outcome model for CogPerform, which is affected by Encourage and Motivation. The MEDIATOR statement specifies the mediator model for Motivation, which is affected only by Encourage.

The output produced by PROC CAUSALMED is displayed in Figure 3 through Figure 6.

Figure 3 echoes the modeling information and displays the number of observations read and used in the analysis; it also identifies the outcome, treatment, and mediator variables. By default, PROC CAUSALMED assumes normal distributions and identity links for the response variables in the outcome and mediator models because they are continuous.

Figure 3: Model Information

Model Information
Data Set                       WORK.COGNITIVE
Outcome Variable               CogPerform
Treatment Variable             Encourage
Mediator Variable              Motivation
Outcome Modeling               Generalized Linear Model
Outcome Distribution           Normal
Outcome Link Function          Identity
Mediator Modeling              Generalized Linear Model
Mediator Distribution          Normal
Mediator Link Function         Identity

Number of Observations Read    300
Number of Observations Used    300


Figure 4 presents the estimated effects. All effect estimates and percentage estimates are significant. The total effect estimate is 8.04, which is decomposed into the natural direct effect (NDE=4.28) and natural indirect effect (NIE=3.76). The estimated controlled direct effect (CDE) is 4.28, which is evaluated at the mean value of the mediator variable Motivation by default. In the current model, CDE is the same as NDE. The 'Percentage Mediated' is 46.74%. This means that slightly less than half of the parental encouragement effect on children’s cognitive development can be attributed to the enhancement of children’s learning motivation.

Figure 4: Summary of Total, Direct, and Mediated Effects

Summary of Effects
Effect                            Estimate  Standard Error  Wald 95% Confidence Limits        Z  Pr > |Z|
Total Effect                        8.0423         0.03200        7.9796        8.1050   251.30    <.0001
Controlled Direct Effect (CDE)      4.2835          0.1062        4.0754        4.4917    40.33    <.0001
Natural Direct Effect (NDE)         4.2835          0.1062        4.0754        4.4917    40.33    <.0001
Natural Indirect Effect (NIE)       3.7588          0.1091        3.5449        3.9727    34.44    <.0001
Percentage Mediated                46.7377          1.3254       44.1400       49.3353    35.26    <.0001
Percentage Due to Interaction            0               .             .             .        .         .
Percentage Eliminated              46.7377          1.3254       44.1400       49.3353    35.26    <.0001
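
The entries in the summary table can be checked by hand. The following worked sketch uses the standard definitions from the mediation literature; the abbreviations TE, PM, and PE (total effect, percentage mediated, and percentage eliminated) are shorthand introduced here, not names that the procedure prints:

```latex
\begin{align*}
\mathrm{TE} &= \mathrm{NDE} + \mathrm{NIE} = 4.2835 + 3.7588 = 8.0423,\\
\mathrm{PM} &= 100 \times \frac{\mathrm{NIE}}{\mathrm{TE}}
             = 100 \times \frac{3.7588}{8.0423} \approx 46.74\%,\\
\mathrm{PE} &= 100 \times \frac{\mathrm{TE} - \mathrm{CDE}}{\mathrm{TE}}
             = 100 \times \frac{8.0423 - 4.2835}{8.0423} \approx 46.74\%.
\end{align*}
```

Because the model contains no treatment-by-mediator interaction, CDE equals NDE, so the percentage eliminated coincides with the percentage mediated.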


The tables in Figure 5 and Figure 6 are useful for confirming the direction of the effects. Figure 5 shows the estimates of the outcome model for CogPerform. The estimated effects of Encourage and Motivation on CogPerform are both positive and significant.

Figure 5: Estimates of the Outcome Model

Outcome Model Estimates
Parameter     Estimate  Standard Error  Wald 95% Confidence Limits  Wald Chi-Square  Pr > ChiSq
Intercept      -201.21          0.6426       -202.47       -199.95       98053.6157      <.0001
Encourage       4.2835          0.1062        4.0754        4.4917        1626.7935      <.0001
Motivation      3.7576          0.1052        3.5514        3.9639        1274.6903      <.0001
Scale           0.4605         0.01880        0.4251        0.4989


Figure 6 shows the estimates of the mediator model for Motivation. The estimated effect of Encourage on Motivation is positive and significant, thus confirming the positive effect of parental encouragement on children’s learning motivation.

Figure 6: Estimates of the Mediator Model

Mediator Model Estimates
Parameter     Estimate  Standard Error  Wald 95% Confidence Limits  Wald Chi-Square  Pr > ChiSq
Intercept       4.0428          0.2641        3.5251        4.5605         234.2732      <.0001
Encourage       1.0003        0.007663        0.9853        1.0153       17040.9178      <.0001
Scale           0.2526         0.01031        0.2332        0.2737
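
When both the outcome model and the mediator model are linear and there is no treatment-by-mediator interaction, the natural direct effect for a one-unit change in the treatment is the treatment coefficient in the outcome model, and the natural indirect effect is a product of two coefficients. The following worked check uses the rounded estimates in Figure 5 and Figure 6; the symbols θ (outcome model coefficients) and β (mediator model coefficient) are notation assumed here for illustration:

```latex
\begin{align*}
\mathrm{NDE} &= \theta_{\text{Encourage}} = 4.2835,\\
\mathrm{NIE} &= \theta_{\text{Motivation}} \times \beta_{\text{Encourage}}
              = 3.7576 \times 1.0003 \approx 3.7587,\\
\mathrm{TE}  &= \mathrm{NDE} + \mathrm{NIE} \approx 4.2835 + 3.7587 = 8.0422.
\end{align*}
```

These values agree with the NIE (3.7588) and total effect (8.0423) in Figure 4, up to rounding of the displayed coefficients.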


Although the preceding analysis is interpretable, it does not take full advantage of the causal analytic techniques that are available in the CAUSALMED procedure. In order to draw valid causal interpretations from observational data, you must statistically control for all important confounding background characteristics.

Assume that FamSize and SocStatus are the only important confounding background characteristics that need to be controlled for. You can specify these variables as covariates in the COVAR statement and use PROC CAUSALMED as follows to fit an appropriate causal mediation model:

proc causalmed data=Cognitive;
   model    CogPerform  = Encourage Motivation;
   mediator Motivation  = Encourage;
   covar FamSize SocStatus;
run;

When the confounding covariates FamSize and SocStatus are included, the procedure adjusts the estimates of the causal effects. The new results are summarized in Figure 7.

Figure 7: Summary of Causal Effects

Summary of Effects
Effect                            Estimate  Standard Error  Wald 95% Confidence Limits        Z  Pr > |Z|
Total Effect                        6.8435          0.1525        6.5446        7.1424    44.88    <.0001
Controlled Direct Effect (CDE)      4.2962          0.1098        4.0811        4.5114    39.14    <.0001
Natural Direct Effect (NDE)         4.2962          0.1098        4.0811        4.5114    39.14    <.0001
Natural Indirect Effect (NIE)       2.5473          0.1563        2.2410        2.8536    16.30    <.0001
Percentage Mediated                37.2219          1.7523       33.7874       40.6564    21.24    <.0001
Percentage Due to Interaction            0               .             .             .        .         .
Percentage Eliminated              37.2219          1.7523       33.7874       40.6564    21.24    <.0001


The total effect of Encourage on CogPerform is now 6.84, which is about 1.2 points lower than the total effect that is obtained without including the confounding covariates in the analysis (see Figure 4). This discrepancy suggests that part of the observed association between Encourage and CogPerform is indeed due to their associations with the confounding background covariates. Failure to adjust for the confounding covariates led to an inflated estimate of the total causal effect in Figure 4.

The natural direct effect (NDE) in the current analysis is 4.30, which is not much different from that of the preceding analysis (4.28). However, the natural indirect effect (NIE) is now 2.55, which is more than 1 point lower than the NIE in Figure 4 (3.76). Finally, the 'Percentage Mediated' is now only 37%, which is almost 10 percentage points lower than the 'Percentage Mediated' (47%) that Figure 4 shows.
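
The size of the adjustment can be made explicit by subtracting the covariate-adjusted estimates (Figure 7) from the unadjusted estimates (Figure 4). In this worked comparison, Δ denotes "unadjusted minus adjusted" and PM denotes the percentage mediated; both are shorthand introduced here:

```latex
\begin{align*}
\Delta\mathrm{TE}  &= 8.0423 - 6.8435 = 1.1988,\\
\Delta\mathrm{NDE} &= 4.2835 - 4.2962 = -0.0127,\\
\Delta\mathrm{NIE} &= 3.7588 - 2.5473 = 1.2115,\\
\Delta\mathrm{PM}  &= 46.7377 - 37.2219 = 9.5158 \text{ percentage points}.
\end{align*}
```

Almost all of the change in the total effect comes from the indirect pathway: the NIE drops by about 1.21 while the NDE barely moves.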

These results demonstrate that you must carefully consider the set of confounding covariates when conducting a causal mediation analysis. First and foremost, the no unmeasured confounding assumption must be reasonably satisfied. That is, to enable causal interpretations of the effect estimates, the baseline covariates for which adjustment is made must suffice to control for treatment-outcome, mediator-outcome, and treatment-mediator confounding. Second, causal analysis from observational data might involve many other assumptions that require serious attention. For instance, in the current example you could consider the following questions:

  1. Why should the analysis assume that the variables Encourage and Motivation do not have an interaction effect on CogPerform? Is there any justification for this assumption?

  2. If there is an interaction effect between the treatment and the mediator, what is the amount of this effect?

  3. What justifies treating Encourage as the cause and CogPerform as the effect?

  4. Is the causal sequence among the variables Encourage, Motivation, and CogPerform properly captured in the data?

You can address Questions 1 and 2 by fitting a more general model that includes the interaction term to determine whether the interaction effect is ignorable. PROC CAUSALMED supports outcome models that have interaction effects, as illustrated in Example 38.1, which presents a continuation of the current analysis.
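
As a rough illustration of such a model, the following sketch adds a treatment-by-mediator interaction to the covariate-adjusted analysis. It assumes that the MODEL statement accepts the usual SAS crossed-effect notation Encourage*Motivation; the statements and options that Example 38.1 actually uses might differ:

```sas
/* Sketch only: covariate-adjusted mediation model with a            */
/* treatment-by-mediator interaction in the outcome model.           */
/* Assumes the crossed effect Encourage*Motivation is valid syntax   */
/* in the MODEL statement of PROC CAUSALMED.                         */
proc causalmed data=Cognitive;
   model    CogPerform  = Encourage Motivation Encourage*Motivation;
   mediator Motivation  = Encourage;
   covar    FamSize SocStatus;
run;
```

With an interaction in the outcome model, the 'Percentage Due to Interaction' row would no longer be fixed at 0, and the CDE and NDE would in general differ.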

Question 3 does not have a definite statistical answer. Substantive knowledge or existing evidence about the relationships is required to justify the assumed causal direction.

An answer to Question 4 must also be justified by using substantive knowledge about the system of interest. In many systems there are temporal conditions that the data must satisfy so that the effects of the treatment on the outcome, the treatment on the mediator, and the mediator on the outcome can be observed.

Some researchers use longitudinal studies to establish the causal sequence. For instance, you can collect data in stages to ensure a proper temporal ordering of the causal, mediation, and outcome events. In this example, you could collect data for CogPerform several months after collecting data for Motivation, which in turn you could collect several years after obtaining the information about Encourage and any pretreatment confounders.

If you collect all the data at the same time point, you would need to justify that the parental encouragement pattern has long been established and that its effect on children’s learning motivation has been stabilized well before the children took the cognitive performance test. In addition, you would also need to justify that the background or pretreatment characteristics had been stabilized well before the measurements of the treatment, mediator, and the outcome. Substantive knowledge is required to support these justifications.

The role of the CAUSALMED procedure is to estimate causal mediation effects given that all related assumptions are satisfied. The procedure can only serve as a tool to refute the presence of causal effects (when estimates are close to zero) given the model. The procedure cannot be used to establish causal interpretations of effects if the necessary methodological and statistical assumptions are not satisfied. For more information about assumptions of causal mediation analysis, see the sections Causal Mediation Effects: Theory, Definitions, and Effect Decompositions and Causal Mediation Effects: Assumptions, Identification, and Estimation.

Last updated: December 09, 2022