PROC CAUSALGRAPH analyzes DAGs that represent a causal model. Because a DAG cannot contain a directed cycle, difficulties can arise in situations where two variables seem to cause each other, either directly or indirectly. In such situations, a common approach is to introduce additional variables that describe the data generating process at a more refined temporal scale (Greenland, Pearl, and Robins 1999; Elwert 2013).
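For example, suppose that two variables A and B appear to influence each other over time. You can represent each variable at two time points so that the resulting graph remains acyclic. The following statements are a minimal sketch of this approach; the variable names (A1, B1, A2, B2, and U) and the edges are hypothetical and purely illustrative:

   proc causalgraph;
      /* Time-indexed copies of A and B replace the apparent cycle between A and B */
      model "TwoPeriod"
         A1 ==> B1 A2,   /* A at time 1 affects B at time 1 and A at time 2 */
         B1 ==> A2 B2,   /* B at time 1 affects A at time 2 and B at time 2 */
         A2 ==> B2,      /* A at time 2 affects B at time 2 */
         U  ==> A1 B1;   /* U is a common cause of the time-1 variables */
      identify A2 ==> B2;
   run;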
The CAUSALGRAPH procedure enables you to specify multiple treatment variables and multiple outcome variables in an identification analysis.
When you specify multiple treatment variables, the causal effect is interpreted as a joint causal effect. That is, the causal effect is interpreted as the hypothetical result of imposing specific values on all treatment variables simultaneously (Elwert 2013). In the language of the do-operator, the joint causal effect is treated as a conjunction, so that $do(T_1 = t_1, T_2 = t_2)$ is interpreted as $do(T_1 = t_1)$ and $do(T_2 = t_2)$.
You can also interpret multiple treatment variables as sequential treatment actions, provided that the treatment sequence is predetermined (Elwert 2013). However, you cannot use PROC CAUSALGRAPH to assess the identifiability of a dynamic treatment regime.
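For example, the following statements request identification of the joint effect of two treatment variables by listing both variables on the left-hand side of the IDENTIFY statement. This is a minimal sketch; the variables T1, T2, Y, and the common cause C are hypothetical:

   proc causalgraph;
      model "JointTreatment"
         T1 ==> Y,
         T2 ==> Y,
         C  ==> T1 T2 Y;   /* C is a common cause of both treatments and the outcome */
      /* The requested effect is the joint effect of setting T1 and T2 simultaneously */
      identify T1 T2 ==> Y;
   run;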
When you specify multiple outcome variables, each outcome is interpreted separately as a unique causal effect. Although the interpretation is separate, PROC CAUSALGRAPH constructs only those adjustment sets that are valid for every outcome variable. In some situations, there might not be any such adjustment sets, even though it is possible to identify the causal effect on each outcome separately. For example, if the causal effect of $X$ on $Y_1$ can be identified only with an adjustment set $Z_1$ and the causal effect of $X$ on $Y_2$ can be identified only with an adjustment set $Z_2$, for disjoint sets $Z_1$ and $Z_2$, then there is no adjustment set that is valid for both outcome variables simultaneously.
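For example, the following statements request adjustment sets for a single treatment and two outcomes; any set that the procedure reports must be valid for Y1 and Y2 simultaneously. This is a minimal sketch with hypothetical variables X, Y1, Y2, Z1, and Z2:

   proc causalgraph;
      model "TwoOutcomes"
         X  ==> Y1 Y2,
         Z1 ==> X Y1,   /* Z1 confounds the effect of X on Y1 */
         Z2 ==> X Y2;   /* Z2 confounds the effect of X on Y2 */
      identify X ==> Y1 Y2;
   run;

In this particular model, the set {Z1, Z2} is valid for both outcomes. In the situation described previously, where the only valid sets for the two outcomes are disjoint, no common adjustment set would be reported even though each effect is identifiable separately.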
A causal effect that you estimate from observational data cannot have a valid causal interpretation unless those data are supplemented by a set of causal assumptions in the form of a causal model (Pearl 2009b). However, the causal model represents assumed relationships between variables at the population level and not at the level of an individual subject. Therefore, the theory that describes causal effect identification by using DAGs does not consider sampling variability. The conditions for identification are valid in the asymptotic limit (as the number of observations increases) (Elwert 2013). For this reason, a successful identification strategy (using either an adjustment set or a conditional instrumental variable) is a necessary first step to estimate a causal effect by using data from a nonrandomized experiment (Elwert and Winship 2014). You should carefully consider the role of sampling variability when estimating a causal effect and when examining the testable implications of a model.
The identifiability of a causal effect is a fully nonparametric concept in the sense that it does not depend on distributional or functional-form assumptions about the variables and edges in a causal model. However, an identification strategy, as well as any estimate that is computed by using that strategy, should be understood to be conditional on the validity of the assumed causal model (Elwert 2013). In addition, when a causal effect is shown to be identified (for example, by using an adjustment set), this does not mean that you can freely choose a parametric estimator to quantify the effect. The suitability of a parametric estimator is contingent on parametric assumptions. These assumptions are separate from the assumptions of the causal model and must be justified for each specific situation (Elwert 2013).
When a causal effect cannot be identified in a particular causal model, there are two actions that you can take. First, you can revise the assumptions of the causal model to see whether the data generating process might be equally well described by an alternative model. In some cases, this might involve testing the observable implications of a model against existing data. For more information about the testable implications of a model, see the section Statistical Properties of Causal Models. Second, you can consider observing additional variables. This might take the form of adding observations for a previously unmeasured variable or adding new variables and edges to an existing model (Pearl 2009b). However, adding edges to an existing set of variables never helps identification and might harm it (Pearl 2009b; Elwert and Winship 2014).
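For example, you can specify more than one MODEL statement in a single PROC CAUSALGRAPH step in order to compare identification results under alternative causal assumptions. The following statements are a sketch with hypothetical variables; the second model differs from the first only by the additional edge U ==> Y, and the UNMEASURED and TESTID statements are used here under the assumption that they are available in your release:

   proc causalgraph;
      /* Original assumptions */
      model "Baseline"
         T ==> Y,
         Z ==> T Y,
         U ==> T;
      /* Alternative assumptions: U also affects Y directly */
      model "Alternative"
         T ==> Y,
         Z ==> T Y,
         U ==> T Y;
      unmeasured U;   /* U is assumed to be unobserved */
      identify T ==> Y;
      testid;         /* request the testable implications of each model */
   run;

Comparing the results for the two models shows whether the additional edge changes the identifiability of the effect of T on Y and which conditional independencies each model implies.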