The CAUSALTRT procedure can estimate the average treatment effect (ATE) by using inverse probability weighting, regression adjustment, and doubly robust methods, which combine the first two. The following sections describe each of these estimation methods.
The CAUSALTRT procedure estimates the potential outcome means and ATE by using the three inverse probability weighting methods that are described in Lunceford and Davidian (2004). All three methods compute weights using estimates for the conditional probability of receiving treatment, $e(\mathbf{x}) = \Pr(T = 1 \mid \mathbf{x})$, where $\mathbf{x}$ is a vector of the observed covariates. The conditional probability $e(\mathbf{x})$ is called the propensity score, and the model that is used to estimate $e(\mathbf{x})$ is called the propensity score model. The inverse probability weight for an observation is equal to $1/e(\mathbf{x})$ if the observation is in the treatment condition ($T = 1$) and $1/(1 - e(\mathbf{x}))$ if it is in the control condition ($T = 0$).
In addition to the stable unit treatment value assumption (SUTVA), two more assumptions on the propensity scores are required for the use of inverse probability weighting methods. The positivity assumption, that the propensity score satisfies $0 < e(\mathbf{x}) < 1$ for every covariate vector $\mathbf{x}$, ensures that each observation has a nonzero probability of receiving each treatment condition.
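The assumption of no unmeasured confounders, also known as strong ignorability, requires that the potential outcomes $Y(0)$ and $Y(1)$ be independent of the treatment variable $T$ conditional on the covariates $\mathbf{x}$. In practice, this assumption means that the set of covariates $\mathbf{x}$ should include all the important confounders in the propensity score model. If these conditions are satisfied and the propensity score model is correctly specified, then it follows that

$$E\left[\frac{T\,Y}{e(\mathbf{x})}\right] = E[Y(1)] = \mu_1 \quad \text{and} \quad E\left[\frac{(1 - T)\,Y}{1 - e(\mathbf{x})}\right] = E[Y(0)] = \mu_0$$

where $Y$ is the observed outcome, $e(\mathbf{x})$ is the propensity score, and $\mu_0$ and $\mu_1$ are the potential outcome means for the control and treatment conditions, respectively.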
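The CAUSALTRT procedure uses the effects that are specified in the PSMODEL statement to fit a logistic regression model for the propensity score. The inverse probability weights are then computed from the estimated propensity scores: the weight is the inverse of the estimated propensity score for observations in the treatment condition, and the inverse of one minus the estimated propensity score for observations in the control condition.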
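As an illustration only, the weight computation can be sketched in a few lines of Python. This is not PROC CAUSALTRT syntax or its implementation; the function name `inverse_probability_weights` and the sample values are hypothetical, and the propensity scores are assumed to have been estimated already.

```python
def inverse_probability_weights(t, e_hat):
    """Return one weight per observation from treatment indicators and propensity scores."""
    weights = []
    for t_i, e_i in zip(t, e_hat):
        # Positivity: every propensity score must lie strictly between 0 and 1.
        if not 0.0 < e_i < 1.0:
            raise ValueError(f"propensity score {e_i} violates positivity")
        # Treated units get 1/e_i; control units get 1/(1 - e_i).
        weights.append(1.0 / e_i if t_i == 1 else 1.0 / (1.0 - e_i))
    return weights


# Hypothetical treatment indicators and estimated propensity scores.
t = [1, 0, 1, 0]
e_hat = [0.8, 0.5, 0.25, 0.2]
weights = inverse_probability_weights(t, e_hat)
print(weights)
```

A treated observation with a small propensity score, or a control observation with a large one, receives a large weight, which is why the positivity check comes first.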
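Suppose you have observations $(y_i, t_i, \mathbf{x}_i)$, for $i = 1, \ldots, n$. The parameter estimates for the propensity score model are given by $\hat{\boldsymbol{\beta}}_{ps}$, and the predicted values for the propensity score are given by

$$\hat{e}_i = e(\mathbf{x}_i, \hat{\boldsymbol{\beta}}_{ps}) = \frac{\exp(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_{ps})}{1 + \exp(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_{ps})}$$

The three weighting methods that PROC CAUSALTRT implements obtain unbiased estimates of the potential outcome means $\boldsymbol{\mu} = (\mu_0, \mu_1)$ by solving the equations

$$S_{ipw}(\boldsymbol{\mu}) = \sum_{i=1}^{n} \begin{bmatrix} \dfrac{(1 - t_i)(y_i - \mu_0)}{1 - \hat{e}_i} - \eta_0 \dfrac{t_i - \hat{e}_i}{1 - \hat{e}_i} \\[2ex] \dfrac{t_i (y_i - \mu_1)}{\hat{e}_i} + \eta_1 \dfrac{t_i - \hat{e}_i}{\hat{e}_i} \end{bmatrix} = \mathbf{0}$$

These estimation methods differ in the values of $\boldsymbol{\eta} = (\eta_0, \eta_1)$. The choices of $\boldsymbol{\eta}$ and the corresponding potential outcome mean estimates for each method are as follows:

IPW: $\boldsymbol{\eta} = (\mu_0, \mu_1)$, which gives

$$\hat{\mu}_0 = \frac{1}{n} \sum_{i=1}^{n} \frac{(1 - t_i)\, y_i}{1 - \hat{e}_i} \quad \text{and} \quad \hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} \frac{t_i\, y_i}{\hat{e}_i}$$

IPWR: $\boldsymbol{\eta} = (0, 0)$, which gives

$$\hat{\mu}_0 = \left( \sum_{i=1}^{n} \frac{1 - t_i}{1 - \hat{e}_i} \right)^{-1} \sum_{i=1}^{n} \frac{(1 - t_i)\, y_i}{1 - \hat{e}_i} \quad \text{and} \quad \hat{\mu}_1 = \left( \sum_{i=1}^{n} \frac{t_i}{\hat{e}_i} \right)^{-1} \sum_{i=1}^{n} \frac{t_i\, y_i}{\hat{e}_i}$$

IPWS: $\boldsymbol{\eta}$ is estimated from the data, which gives

$$\hat{\mu}_0 = \left( \sum_{i=1}^{n} \frac{1 - t_i}{1 - \hat{e}_i} \left( 1 - \frac{C_0}{1 - \hat{e}_i} \right) \right)^{-1} \sum_{i=1}^{n} \frac{(1 - t_i)\, y_i}{1 - \hat{e}_i} \left( 1 - \frac{C_0}{1 - \hat{e}_i} \right)$$

$$\hat{\mu}_1 = \left( \sum_{i=1}^{n} \frac{t_i}{\hat{e}_i} \left( 1 - \frac{C_1}{\hat{e}_i} \right) \right)^{-1} \sum_{i=1}^{n} \frac{t_i\, y_i}{\hat{e}_i} \left( 1 - \frac{C_1}{\hat{e}_i} \right)$$

where

$$C_0 = - \frac{\sum_{i=1}^{n} (t_i - \hat{e}_i) / (1 - \hat{e}_i)}{\sum_{i=1}^{n} \left( (t_i - \hat{e}_i) / (1 - \hat{e}_i) \right)^2} \quad \text{and} \quad C_1 = \frac{\sum_{i=1}^{n} (t_i - \hat{e}_i) / \hat{e}_i}{\sum_{i=1}^{n} \left( (t_i - \hat{e}_i) / \hat{e}_i \right)^2}$$

If the values of $\boldsymbol{\eta}$ that motivate the IPWS method were known constants, they would produce estimates for the potential outcome means that have the smallest asymptotic variance among the class of estimators that is defined by $S_{ipw}(\boldsymbol{\mu}) = \mathbf{0}$. The IPWS method estimates $\boldsymbol{\eta}$ by solving the estimating equations

$$\sum_{i=1}^{n} \begin{bmatrix} \dfrac{(1 - t_i)(y_i - \mu_0)}{(1 - \hat{e}_i)^2} + \eta_0 \dfrac{(t_i - \hat{e}_i)^2}{(1 - \hat{e}_i)^2} \\[2ex] \dfrac{t_i (y_i - \mu_1)}{\hat{e}_i^2} + \eta_1 \dfrac{(t_i - \hat{e}_i)^2}{\hat{e}_i^2} \end{bmatrix} = \mathbf{0}$$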
Although you do not need to specify an outcome model when you use the inverse probability weighting methods, you still need to specify the outcome variable in the MODEL statement. The IPW, IPWR, or IPWS estimation method can be requested using the METHOD= option in the PROC CAUSALTRT statement. For more information about which model specifications are required for different estimation methods and how default estimation methods are determined, see the section Outline of Estimation Method Requirements.
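The difference between the unnormalized and ratio-normalized weighting estimators can be sketched as follows. This is a minimal Python illustration, not the procedure's implementation. It assumes that IPW divides the weighted sums by the sample size and that IPWR divides them by the sum of the weights, as in Lunceford and Davidian (2004). The function names and data are hypothetical, and the propensity scores are taken as given.

```python
def ipw_means(y, t, e_hat):
    """Unnormalized estimator: weighted sums divided by the sample size n."""
    n = len(y)
    mu0 = sum((1 - t_i) * y_i / (1 - e_i) for y_i, t_i, e_i in zip(y, t, e_hat)) / n
    mu1 = sum(t_i * y_i / e_i for y_i, t_i, e_i in zip(y, t, e_hat)) / n
    return mu0, mu1


def ipwr_means(y, t, e_hat):
    """Ratio estimator: weighted sums divided by the sum of the weights."""
    w0 = [(1 - t_i) / (1 - e_i) for t_i, e_i in zip(t, e_hat)]
    w1 = [t_i / e_i for t_i, e_i in zip(t, e_hat)]
    mu0 = sum(w * y_i for w, y_i in zip(w0, y)) / sum(w0)
    mu1 = sum(w * y_i for w, y_i in zip(w1, y)) / sum(w1)
    return mu0, mu1


# Hypothetical outcomes, treatment indicators, and propensity scores.
y = [3.0, 1.0, 5.0, 2.0]
t = [1, 0, 1, 0]
e_hat = [0.5, 0.5, 0.25, 0.5]

ipw_mu0, ipw_mu1 = ipw_means(y, t, e_hat)
ipwr_mu0, ipwr_mu1 = ipwr_means(y, t, e_hat)
print("IPW  ATE:", ipw_mu1 - ipw_mu0)
print("IPWR ATE:", ipwr_mu1 - ipwr_mu0)
```

In this toy data set the treated weights do not sum to the sample size, so the two estimators give different treatment means. The ratio form always stays within the range of the observed outcomes.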
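The robust sandwich covariance estimate for the potential outcome mean estimates $\hat{\boldsymbol{\mu}}$ is computed by stacking the inverse probability weighting estimating equations $S_{ipw}(\boldsymbol{\mu}) = \mathbf{0}$ and the score function for the logistic regression model that is used to obtain the propensity score parameter estimates $\hat{\boldsymbol{\beta}}_{ps}$. By stacking the equations, PROC CAUSALTRT adjusts the covariance matrix for $\hat{\boldsymbol{\mu}}$ to account for the estimation of $\hat{\boldsymbol{\beta}}_{ps}$ and the resulting predicted propensity scores $\hat{e}_i$. For more information about using the robust sandwich covariance estimate to compute standard errors, see the section Asymptotic Formulas.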
The finite sample properties of the potential outcome mean estimates $\hat{\boldsymbol{\mu}}$ and their covariance matrix can be affected by observations that have large weights. By default, a note is added to the SAS log if any observations have weights greater than 50. You can specify the value that is used to flag large weights by using the WGTFLAG= option in the PSMODEL statement. You can also inspect the values of the predicted weights by requesting graphics in the PLOTS= option in the PROC CAUSALTRT or PSMODEL statement.
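The kind of check that the log note performs can be sketched in Python as follows. The function name `flag_large_weights` is hypothetical. The default threshold of 50 and the strict "greater than" comparison are taken from the description above.

```python
def flag_large_weights(weights, threshold=50.0):
    """Return the positions of observations whose weight exceeds the threshold."""
    return [i for i, w in enumerate(weights) if w > threshold]


# Hypothetical weights; a weight exactly equal to the threshold is not flagged.
example_weights = [1.2, 75.0, 50.0, 50.1]
flagged = flag_large_weights(example_weights)
print(flagged)
```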
As described in Rosenbaum and Rubin (1983), the propensity score is also a balancing score. To assess the balance that is produced by a propensity score model, you can request the display of weighted and unweighted standardized mean differences and variance ratios for propensity score model effects by specifying the COVDIFFPS option in the PROC CAUSALTRT statement. You can also request weighted and unweighted densities for propensity score model effects by specifying PSCOVDEN in the PLOTS= option in the PROC CAUSALTRT or PSMODEL statement. To inspect the balance of a propensity score model without showing the causal effects, specify the NOEFFECT option in the PROC CAUSALTRT statement.
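To make the balance diagnostics concrete, the following Python sketch computes a standardized mean difference and a variance ratio for one covariate, with and without weights. The formulas are assumptions chosen for illustration: weighted variances are divided by the sum of the weights, and the difference is standardized by the square root of the average of the two group variances. The exact definitions that PROC CAUSALTRT uses might differ. All names and data are hypothetical.

```python
import math


def weighted_mean(x, w):
    return sum(w_i * x_i for x_i, w_i in zip(x, w)) / sum(w)


def weighted_variance(x, w):
    m = weighted_mean(x, w)
    return sum(w_i * (x_i - m) ** 2 for x_i, w_i in zip(x, w)) / sum(w)


def balance_statistics(x, t, w=None):
    """Return (standardized mean difference, variance ratio), treated versus control."""
    if w is None:
        w = [1.0] * len(x)  # unweighted case
    x1, w1 = zip(*[(x_i, w_i) for x_i, t_i, w_i in zip(x, t, w) if t_i == 1])
    x0, w0 = zip(*[(x_i, w_i) for x_i, t_i, w_i in zip(x, t, w) if t_i == 0])
    m1, m0 = weighted_mean(x1, w1), weighted_mean(x0, w0)
    v1, v0 = weighted_variance(x1, w1), weighted_variance(x0, w0)
    std_diff = (m1 - m0) / math.sqrt((v1 + v0) / 2.0)
    return std_diff, v1 / v0


# Hypothetical covariate values, treatment indicators, and weights.
x = [2.0, 4.0, 1.0, 3.0]
t = [1, 1, 0, 0]
unweighted = balance_statistics(x, t)
weighted = balance_statistics(x, t, w=[1.0, 1.0, 1.0, 3.0])
print(unweighted, weighted)
```

A standardized difference closer to 0 and a variance ratio closer to 1 after weighting would suggest better balance for that covariate.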
Based on the balancing score property, you can use the propensity scores to create matched samples or strata for the data. For more information about applying the propensity score in matching and stratification, see Chapter 101, The PSMATCH Procedure.
Regression adjustment estimates the ATE by using predicted values for the potential outcomes. The CAUSALTRT procedure predicts the potential outcomes from generalized linear models that are fit to the data by maximum likelihood estimation. For more information about generalized linear models, see the section Generalized Linear Models Theory in Chapter 51, The GENMOD Procedure.
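The effects that you specify in the MODEL statement are used to fit models for the outcome within each treatment condition. The predicted potential outcomes are then given by

$$\hat{y}_i(0) = g^{-1}(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_c) \quad \text{and} \quad \hat{y}_i(1) = g^{-1}(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_t)$$

where $g$ is the link function being used, $\mathbf{x}_i$ contains the outcome model effects for observation $i$, and $\hat{\boldsymbol{\beta}}_c$ and $\hat{\boldsymbol{\beta}}_t$ are the parameter estimates for the control and treatment outcome models, respectively. The potential outcome mean $\mu_j$, $j = 0, 1$, is then estimated by averaging the predicted $\hat{y}_i(j)$ for all observations $i = 1, \ldots, n$, regardless of the observed treatment assignment. Therefore, the regression adjustment estimates for the potential outcome means solve the estimating equations

$$S_{reg}(\boldsymbol{\mu}) = \sum_{i=1}^{n} \begin{bmatrix} \hat{y}_i(0) - \mu_0 \\ \hat{y}_i(1) - \mu_1 \end{bmatrix} = \mathbf{0}$$

The regression adjustment estimates for the potential outcome means are equal to

$$\hat{\mu}_0 = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i(0) \quad \text{and} \quad \hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i(1)$$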
If the outcome model is correctly specified and the assumption of no unmeasured confounders is satisfied, then regression adjustment produces unbiased estimates for the potential outcome means. The outcome model effects should also have similar distributions between treatment conditions to avoid extrapolation when the potential outcomes are predicted.
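The regression adjustment idea can be sketched in Python under simplifying assumptions: a single covariate, a normal outcome with the identity link, and ordinary least squares in place of a general maximum likelihood fit. The function names and data are hypothetical, and this is not the procedure's implementation.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((x_i - x_bar) ** 2 for x_i in x)
    s_xy = sum((x_i - x_bar) * (y_i - y_bar) for x_i, y_i in zip(x, y))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def regadj_means(y, t, x):
    """Fit one outcome model per treatment condition, then average predictions over everyone."""
    a0, b0 = fit_line([x_i for x_i, t_i in zip(x, t) if t_i == 0],
                      [y_i for y_i, t_i in zip(y, t) if t_i == 0])
    a1, b1 = fit_line([x_i for x_i, t_i in zip(x, t) if t_i == 1],
                      [y_i for y_i, t_i in zip(y, t) if t_i == 1])
    n = len(y)
    mu0 = sum(a0 + b0 * x_i for x_i in x) / n  # predicted Y(0) for all observations
    mu1 = sum(a1 + b1 * x_i for x_i in x) / n  # predicted Y(1) for all observations
    return mu0, mu1


# Hypothetical data: treated units have larger covariate values than controls.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
t = [0, 0, 0, 1, 1, 1]
y = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]

mu0, mu1 = regadj_means(y, t, x)
naive = sum(y[3:]) / 3 - sum(y[:3]) / 3
print("regression adjustment ATE:", mu1 - mu0, "naive difference:", naive)
```

The toy data were built so that the covariate distributions do not overlap between treatment conditions. The predictions therefore extrapolate, which is the situation that the preceding paragraph warns about.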
Although you do not need to specify a propensity score model in order to estimate the potential outcome means by the regression adjustment method, you still need to specify the treatment variable in the PSMODEL statement. You can specify METHOD=REGADJ in the PROC CAUSALTRT statement to request the regression adjustment method. For more information about which model specifications are required for different estimation methods and how default estimation methods are determined, see the section Outline of Estimation Method Requirements.
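The robust sandwich covariance estimate for the potential outcome mean estimates $\hat{\boldsymbol{\mu}}$ is computed by stacking the regression adjustment estimating equations for the potential outcome means and the score function for the outcome models that are used to estimate the control and treatment outcome model parameters, $\hat{\boldsymbol{\beta}}_c$ and $\hat{\boldsymbol{\beta}}_t$. By stacking the equations, PROC CAUSALTRT adjusts the covariance matrix for $\hat{\boldsymbol{\mu}}$ to account for the prediction of the potential outcomes $\hat{y}_i(0)$ and $\hat{y}_i(1)$. For more information about using the robust sandwich covariance estimate to compute standard errors, see the section Asymptotic Formulas.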
Doubly robust estimation methods fit models for both the outcome and treatment variables. They combine inverse probability weighting and regression adjustment to estimate the potential outcome means. The methods are said to be doubly robust because they provide unbiased estimates for the potential outcome means $\boldsymbol{\mu}$ even if one of the models is misspecified (Bang and Robins 2005). In this sense, you have two opportunities to obtain unbiased estimation of causal effects: by specifying either a correct treatment model or a correct outcome model. The CAUSALTRT procedure implements two doubly robust estimation methods: the augmented inverse probability weighting (AIPW) method described in Lunceford and Davidian (2004) and the inverse probability weighted regression adjustment (IPWREG) method described in Wooldridge (2010).
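The AIPW estimation method estimates propensity scores $\hat{e}_i$ and the associated parameter estimates $\hat{\boldsymbol{\beta}}_{ps}$ by fitting a logistic regression model for treatment assignment, as described in the section Inverse Probability Weighting. It also estimates the potential outcomes $\hat{y}_i(0)$ and $\hat{y}_i(1)$ by using the maximum likelihood fitting of the generalized linear models, as described in the section Regression Adjustment. Note that the treatment assignment and outcome models do not need to be fit using the same set of model effects.

The AIPW estimates for potential outcome means solve the estimating equations

$$S_{aipw}(\boldsymbol{\mu}) = \sum_{i=1}^{n} \begin{bmatrix} \dfrac{(1 - t_i)\, y_i}{1 - \hat{e}_i} + \dfrac{t_i - \hat{e}_i}{1 - \hat{e}_i}\, \hat{y}_i(0) - \mu_0 \\[2ex] \dfrac{t_i\, y_i}{\hat{e}_i} - \dfrac{t_i - \hat{e}_i}{\hat{e}_i}\, \hat{y}_i(1) - \mu_1 \end{bmatrix} = \mathbf{0}$$

The AIPW estimates for the potential outcome means are given by

$$\hat{\mu}_0 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{(1 - t_i)\, y_i}{1 - \hat{e}_i} + \frac{t_i - \hat{e}_i}{1 - \hat{e}_i}\, \hat{y}_i(0) \right) \quad \text{and} \quad \hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{t_i\, y_i}{\hat{e}_i} - \frac{t_i - \hat{e}_i}{\hat{e}_i}\, \hat{y}_i(1) \right)$$

If one of the models is misspecified, $\hat{\boldsymbol{\mu}}$ remains an unbiased estimate of $\boldsymbol{\mu}$. However, the efficiency of $\hat{\boldsymbol{\mu}}$ and the estimation of its covariance matrix are affected.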
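A short Python sketch of the AIPW combination step is given below. It uses the form of the estimator in Lunceford and Davidian (2004) and takes the predicted propensity scores and predicted potential outcomes as inputs. The function name and all data values are hypothetical, and the model fitting itself is not shown.

```python
def aipw_means(y, t, e_hat, y0_hat, y1_hat):
    """Combine weighted outcomes with outcome-model predictions for each observation."""
    n = len(y)
    mu0 = sum((1 - t_i) * y_i / (1 - e_i) + (t_i - e_i) / (1 - e_i) * m0
              for y_i, t_i, e_i, m0 in zip(y, t, e_hat, y0_hat)) / n
    mu1 = sum(t_i * y_i / e_i - (t_i - e_i) / e_i * m1
              for y_i, t_i, e_i, m1 in zip(y, t, e_hat, y1_hat)) / n
    return mu0, mu1


# Hypothetical data in which the outcome predictions match every observed outcome.
y = [3.0, 1.0, 5.0, 2.0]
t = [1, 0, 1, 0]
y1_hat = [3.0, 4.0, 5.0, 4.0]  # predicted outcomes under treatment
y0_hat = [1.0, 1.0, 2.0, 2.0]  # predicted outcomes under control

# Two different sets of propensity scores give the same answer here.
result_a = aipw_means(y, t, [0.5, 0.5, 0.5, 0.5], y0_hat, y1_hat)
result_b = aipw_means(y, t, [0.8, 0.3, 0.6, 0.4], y0_hat, y1_hat)
print(result_a, result_b)
```

Because the outcome predictions reproduce the observed outcomes exactly in this toy example, the weighted residual terms vanish and the estimates equal the averages of the predictions regardless of the propensity scores. This is a small-scale view of why a correct outcome model protects against a misspecified treatment model.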
The IPWREG estimation method uses the same logistic model for predicting the propensity scores $\hat{e}_i$, but it fits weighted models for the outcome variable separately for each treatment condition. To fit the outcome models, each observation is weighted by the inverse probability weights as follows:

$$w_i = \frac{t_i}{\hat{e}_i} + \frac{1 - t_i}{1 - \hat{e}_i}$$

Let $\hat{\boldsymbol{\beta}}_c^{w}$ and $\hat{\boldsymbol{\beta}}_t^{w}$ denote the parameter estimates from the weighted regression outcome models for the control and treatment conditions, respectively. The predicted potential outcomes are then given by

$$\hat{y}_i^{w}(0) = g^{-1}(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_c^{w}) \quad \text{and} \quad \hat{y}_i^{w}(1) = g^{-1}(\mathbf{x}_i' \hat{\boldsymbol{\beta}}_t^{w})$$

where $g$ is the link function being used. For the IPWREG method, PROC CAUSALTRT enforces the use of the canonical link for the distribution of the outcome variable. You cannot override the canonical link by using the LINK= option. For information about the canonical link for each distribution, see Table 5.
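The weighting, fitting, and prediction steps can be sketched in Python under simplifying assumptions: one covariate, a normal outcome with the identity link (which is the canonical link for the normal distribution), and weighted least squares in place of a general weighted maximum likelihood fit. The function names and data are hypothetical, and the propensity scores are taken as given.

```python
def fit_weighted_line(x, y, w):
    """Weighted least squares fit of y = a + b * x; returns (a, b)."""
    s_w = sum(w)
    x_bar = sum(w_i * x_i for x_i, w_i in zip(x, w)) / s_w
    y_bar = sum(w_i * y_i for y_i, w_i in zip(y, w)) / s_w
    s_xx = sum(w_i * (x_i - x_bar) ** 2 for x_i, w_i in zip(x, w))
    s_xy = sum(w_i * (x_i - x_bar) * (y_i - y_bar) for x_i, y_i, w_i in zip(x, y, w))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


def ipwreg_means(y, t, x, e_hat):
    """Weight each arm by inverse probability weights, fit, then average predictions."""
    w = [1.0 / e_i if t_i == 1 else 1.0 / (1.0 - e_i) for t_i, e_i in zip(t, e_hat)]
    fits = {}
    for arm in (0, 1):
        rows = [(x_i, y_i, w_i) for x_i, y_i, t_i, w_i in zip(x, y, t, w) if t_i == arm]
        xs, ys, ws = zip(*rows)
        fits[arm] = fit_weighted_line(xs, ys, ws)
    n = len(y)
    mu0 = sum(fits[0][0] + fits[0][1] * x_i for x_i in x) / n
    mu1 = sum(fits[1][0] + fits[1][1] * x_i for x_i in x) / n
    return mu0, mu1


# Hypothetical data and propensity scores.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
t = [0, 0, 0, 1, 1, 1]
y = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
e_hat = [0.2, 0.3, 0.4, 0.6, 0.7, 0.8]

mu0, mu1 = ipwreg_means(y, t, x, e_hat)
print("IPWREG ATE:", mu1 - mu0)
```

In this example the outcomes lie exactly on a line within each treatment condition, so the weights do not change the fitted lines. With noisy data, the weights would shift each fit toward observations that are underrepresented in their treatment condition.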
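The IPWREG estimates for the potential outcome means are given by

$$\hat{\mu}_0 = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i^{w}(0) \quad \text{and} \quad \hat{\mu}_1 = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i^{w}(1)$$

where $\hat{y}_i^{w}(0)$ and $\hat{y}_i^{w}(1)$ are the potential outcomes that are predicted from the weighted outcome models for the control and treatment conditions. These estimates solve the following estimating equations:

$$S_{ipwreg}(\boldsymbol{\mu}) = \sum_{i=1}^{n} \begin{bmatrix} \hat{y}_i^{w}(0) - \mu_0 \\ \hat{y}_i^{w}(1) - \mu_1 \end{bmatrix} = \mathbf{0}$$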