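PROC CAUSALTRT computes standard errors for the potential outcome means and treatment effect by using the robust sandwich covariance formula that is based on asymptotic theory. In general, let $\theta$ denote the vector of parameters of interest. PROC CAUSALTRT treats all of its estimation as an M-estimation problem that solves the following estimating equations:

$$S_{\theta}(\theta) = \sum_{i=1}^{n} S_{\theta,i}(\theta) = 0$$

When the function $S_{\theta}(\theta)$ is evaluated at the estimate $\hat{\theta}$, it satisfies $S_{\theta}(\hat{\theta}) = 0$. By the theory of M-estimation (Stefanski and Boos 2002), the robust sandwich covariance matrix, $\hat{V}(\hat{\theta})$, for the estimates $\hat{\theta}$ is given by

$$\hat{V}(\hat{\theta}) = \frac{1}{n}\, A(\hat{\theta})^{-1}\, B(\hat{\theta})\, A(\hat{\theta})^{-T}$$

where $n$ is the sample size, $A^{-T}$ denotes $(A^{-1})^{T}$, and

$$A(\hat{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \left. -\frac{\partial S_{\theta,i}(\theta)}{\partial \theta^{T}} \right|_{\theta = \hat{\theta}}, \qquad B(\hat{\theta}) = \frac{1}{n} \sum_{i=1}^{n} S_{\theta,i}(\hat{\theta})\, S_{\theta,i}(\hat{\theta})^{T}$$

PROC CAUSALTRT computes standard errors by taking the square roots of the diagonal elements of $\hat{V}(\hat{\theta})$.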
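To make the sandwich recipe concrete, the following minimal Python sketch applies it to the simplest possible M-estimator, the mean of a sample. The estimating function $S_i(\theta) = y_i - \theta$, the function name, and the data are illustrative assumptions; this is not the procedure's implementation, which works with the full parameter vector and modified expressions.

```python
import math

def sandwich_se_mean(y):
    """Sandwich standard error for the M-estimator of a mean.

    Illustrative estimating function (an assumption): S_i(theta) = y_i - theta.
    """
    n = len(y)
    theta_hat = sum(y) / n                            # solves sum_i S_i(theta) = 0
    a = 1.0                                           # A = (1/n) sum_i -dS_i/dtheta = 1
    b = sum((yi - theta_hat) ** 2 for yi in y) / n    # B = (1/n) sum_i S_i(theta_hat)^2
    v = (b / (a * a)) / n                             # V = A^{-1} B A^{-T} / n (scalar case)
    return theta_hat, math.sqrt(v)

# Hypothetical data, used only for illustration.
y = [2.0, 4.0, 4.0, 6.0]
theta_hat, se = sandwich_se_mean(y)
```

In the scalar case the matrix inverse and transpose reduce to ordinary division, which is why the standard error is simply the square root of `b / n`.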
The form and dimension of $\theta$, and hence those of the robust sandwich covariance matrix, depend on which estimation method is used. Whenever an outcome or treatment (propensity score) model is used in a particular estimation method, the estimating equations must also include the corresponding score vectors for that model.
Let $\beta$ be a generic vector of parameters of a model, and let $\mu = (\mu_0, \mu_1)^{T}$ be the vector of potential outcome means. Further, let $S_{ps}(\beta)$, $S_{reg}(\beta)$, and $S_{wreg}(\beta)$ denote the score functions from the maximum likelihood estimation of the propensity score, outcome, and inverse probability weighted outcome models, respectively. Then the composition of $\theta$ and $S_{\theta}(\theta)$ for each estimation method is as follows:
The Wald confidence interval for the potential outcome mean $\mu_j$ is given by

$$\hat{\mu}_j \pm z_{1-\alpha/2}\, \mathrm{SE}(\hat{\mu}_j)$$

where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution and $\mathrm{SE}(\hat{\mu}_j)$ is the standard error estimate for the potential outcome estimate $\hat{\mu}_j$. Because the ATE is the difference in potential outcome means, its standard error is computed by applying the standard rule for the variance of a difference to the covariance matrix of the potential outcome estimates. The corresponding Wald confidence interval for the ATE is then constructed by the same formula. For the estimates of conditional potential outcome means, the standard errors and confidence limits are computed using the same approach but with $\mu_j$ and $\hat{\mu}_j$ replaced by the corresponding conditional mean and its estimate.
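As a small worked illustration of the variance rule for the ATE, the Python sketch below builds Wald intervals from a covariance matrix of the two potential outcome estimates. The estimates, the covariance values, and the helper name `wald_ci` are hypothetical and chosen only to show the arithmetic.

```python
import math
from statistics import NormalDist

def wald_ci(estimate, variance, alpha=0.05):
    """Wald interval: estimate +/- z_{1-alpha/2} * sqrt(variance)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

# Hypothetical estimates and covariance matrix for (mu_0, mu_1).
mu_hat = [10.0, 12.5]
cov = [[0.25, 0.05],
       [0.05, 0.36]]

ci_mu0 = wald_ci(mu_hat[0], cov[0][0])

# ATE = mu_1 - mu_0, so Var(ATE) = Var(mu_1) + Var(mu_0) - 2 Cov(mu_0, mu_1).
ate_hat = mu_hat[1] - mu_hat[0]
ate_var = cov[1][1] + cov[0][0] - 2 * cov[0][1]
ci_ate = wald_ci(ate_hat, ate_var)
```

The covariance term matters: ignoring it here would overstate the ATE variance as 0.61 instead of 0.51.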
To provide better theoretical and computational properties, PROC CAUSALTRT uses modified expressions for components of $A(\hat{\theta})$ and $B(\hat{\theta})$ when the covariance matrix $\hat{V}(\hat{\theta})$ is computed. For descriptions of the modifications, their motivation, and theoretical justification, see Pierce (1982); Robins, Rotnitzky, and Zhao (1995); Lunceford and Davidian (2004); and Wooldridge (2010).
If you specify the BOOTSTRAP statement, PROC CAUSALTRT uses bootstrap resampling to compute standard errors and confidence intervals for the potential outcome means and treatment effect. The procedure samples as many bootstrap sample data sets (replicates) as you specify in the NBOOT= option and then estimates the potential outcome means and treatment effect for each replication. The bootstrap samples are taken from within the treatment conditions. Each bootstrap replicate contains the same numbers of usable observations in the control and treatment conditions as the number of usable observations that are included in the input data set.
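The within-condition resampling described above can be sketched in Python as follows. This is an illustrative sketch only: the record layout, the key name `"t"`, and the function name are assumptions, and the procedure's own sampling code is not documented here.

```python
import random

def stratified_bootstrap(rows, treat_key, rng):
    """Resample with replacement separately within each treatment condition,
    so every replicate keeps the original group sizes."""
    groups = {}
    for row in rows:
        groups.setdefault(row[treat_key], []).append(row)
    replicate = []
    for members in groups.values():
        replicate.extend(rng.choices(members, k=len(members)))
    return replicate

# Hypothetical data: three control (t=0) and two treated (t=1) observations.
rng = random.Random(2024)
data = [{"t": 0, "y": 1.0}, {"t": 0, "y": 2.0}, {"t": 0, "y": 3.0},
        {"t": 1, "y": 4.0}, {"t": 1, "y": 5.0}]
rep = stratified_bootstrap(data, "t", rng)
```

Each replicate would then be passed through the same estimation method as the original data to obtain one bootstrap estimate.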
Bootstrap confidence intervals are computed only for the potential outcome means and treatment effect. You can specify one or more of the following types of bootstrap confidence intervals in the BOOTCI option in the BOOTSTRAP statement:
The BOOTCI(NORMAL) option uses the normal approximation method to compute the bootstrap confidence interval. That is, the normal bootstrap confidence interval is given by

$$\hat{\mu}_j \pm z_{1-\alpha/2}\, \sigma_j^{*}$$

where $\hat{\mu}_j$ is the estimate of $\mu_j$ from the original sample, $\sigma_j^{*}$ is the standard deviation of the bootstrap parameter estimates, and $z_{1-\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. PROC CAUSALTRT requires at least 50 bootstrap samples for normal bootstrap confidence intervals and does not compute them if 40 or fewer of the samples produce usable estimates.
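A minimal Python sketch of the normal approximation interval follows. The bootstrap estimates are hypothetical (and far fewer than a real run would use), and the divisor of the standard deviation is an assumption: the sketch uses the sample standard deviation with divisor B - 1, which the text does not specify.

```python
from statistics import NormalDist, stdev

def normal_bootstrap_ci(estimate, boot_estimates, alpha=0.05):
    """Original-sample estimate +/- z_{1-alpha/2} * SD of bootstrap estimates."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    sd = stdev(boot_estimates)  # sample SD (divisor B - 1), an assumption
    return estimate - z * sd, estimate + z * sd

# Hypothetical bootstrap estimates of one potential outcome mean.
boot = [9.6, 10.4, 10.1, 9.9, 10.0]
lo, hi = normal_bootstrap_ci(10.0, boot)
```

Note that the interval is centered on the original-sample estimate, not on the mean of the bootstrap estimates.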
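The BOOTCI(PERC) option requests the percentile method, which uses the $100(\alpha/2)$th and $100(1-\alpha/2)$th percentiles of the bootstrap parameter estimates as the confidence limits. These percentiles are computed as follows. Let $\hat{\mu}^{*}_{j(1)} \le \dots \le \hat{\mu}^{*}_{j(B)}$ represent the ordered values of the $B$ bootstrap estimates for the potential outcome mean $\mu_j$. Let the $k$th weighted average percentile be $q$, set $p = k/100$, and let

$$Bp = l + g$$

where $l$ is the integer part of $Bp$ and $g$ is the fractional part of $Bp$. Then the $k$th percentile, $q$, is computed as follows, which corresponds to the default percentile definition of the UNIVARIATE procedure:

$$q = \begin{cases} \tfrac{1}{2}\left(\hat{\mu}^{*}_{j(l)} + \hat{\mu}^{*}_{j(l+1)}\right) & \text{if } g = 0 \\ \hat{\mu}^{*}_{j(l+1)} & \text{if } g > 0 \end{cases}$$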
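The percentile rule can be sketched in Python under the assumption that the default UNIVARIATE definition is the empirical distribution function with averaging: average the two neighboring order statistics when the fractional part g is zero, and otherwise take the next order statistic. The function name and the clamping at the extremes (k = 0 or k = 100) are choices of this sketch, not documented behavior.

```python
import math

def percentile_edf_avg(values, k):
    """k-th percentile by the empirical distribution function with averaging."""
    x = sorted(values)
    b = len(x)
    bp = b * k / 100.0          # B * p with p = k / 100
    l = math.floor(bp)          # integer part
    g = bp - l                  # fractional part
    if g == 0:
        # Average x_(l) and x_(l+1) (1-based), clamped to the available range.
        lower = x[max(l - 1, 0)]
        upper = x[min(l, b - 1)]
        return 0.5 * (lower + upper)
    return x[l]                 # x_(l+1) in 1-based indexing

def percentile_ci(boot_estimates, alpha=0.05):
    """Percentile interval from the 100(alpha/2)th and 100(1-alpha/2)th percentiles."""
    return (percentile_edf_avg(boot_estimates, 100 * alpha / 2),
            percentile_edf_avg(boot_estimates, 100 * (1 - alpha / 2)))

# Hypothetical bootstrap estimates 1, 2, ..., 10.
median = percentile_edf_avg(range(1, 11), 50)
quartile = percentile_edf_avg(range(1, 11), 25)
```

With ten values, the 50th percentile has g = 0 and averages the fifth and sixth order statistics, while the 25th percentile has g = 0.5 and returns the third.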
The BOOTCI(BC) option requests bias-corrected bootstrap confidence intervals, which use the cumulative distribution function (CDF), $G(\cdot)$, of the bootstrap parameter estimates to determine the upper and lower endpoints of the confidence interval. The bias-corrected bootstrap confidence interval is given by

$$G^{-1}\left(\Phi(2 z_0 \pm z_{\alpha})\right)$$

where $\Phi$ is the standard normal CDF, $z_{\alpha} = \Phi^{-1}(1 - \alpha/2)$, and $z_0$ is a bias correction,

$$z_0 = \Phi^{-1}\left(\frac{N(\hat{\mu}_j)}{B}\right)$$

where $\hat{\mu}_j$ is the original sample estimate of $\mu_j$ from the input data set; $N(\hat{\mu}_j)$ is the number of bootstrap estimates, $\hat{\mu}^{*}_j$, that are less than or equal to $\hat{\mu}_j$; and $B$ is the number of bootstrap replicates for which an estimate for the treatment effect is obtained.
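The bias-corrected interval can be sketched in Python as shown below. Two points are assumptions of this sketch rather than documented behavior: the upper normal quantile is taken at 1 - alpha/2, and the inverse of the bootstrap CDF is approximated by the smallest order statistic whose empirical CDF value reaches the target probability. The data are hypothetical.

```python
import math
from statistics import NormalDist

def bc_bootstrap_ci(estimate, boot_estimates, alpha=0.05):
    """Bias-corrected bootstrap interval G^{-1}(Phi(2*z0 -/+ z_alpha))."""
    nd = NormalDist()
    x = sorted(boot_estimates)
    b = len(x)
    n_le = sum(1 for v in x if v <= estimate)   # N: bootstrap estimates <= estimate
    z0 = nd.inv_cdf(n_le / b)                   # raises StatisticsError if n_le is 0 or b
    z_alpha = nd.inv_cdf(1 - alpha / 2)         # assumption: upper alpha/2 quantile

    def g_inverse(p):
        # Simplified inverse ECDF: smallest order statistic with ECDF >= p.
        idx = min(max(math.ceil(p * b), 1), b)
        return x[idx - 1]

    return (g_inverse(nd.cdf(2 * z0 - z_alpha)),
            g_inverse(nd.cdf(2 * z0 + z_alpha)))

# Hypothetical bootstrap estimates 1, 2, ..., 1000.
boot = [float(i) for i in range(1, 1001)]
lower, upper = bc_bootstrap_ci(500.0, boot)     # estimate at the bootstrap median
lower2, upper2 = bc_bootstrap_ci(600.0, boot)   # estimate above the bootstrap median
```

When the original estimate sits at the bootstrap median, the bias correction is zero and the interval coincides with a percentile-type interval; when the estimate lies above the median, both endpoints shift upward.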
The bias-corrected bootstrap confidence intervals are the default intervals used if you do not override them by specifying the NORMAL or PERC suboptions in the BOOTCI option.
PROC CAUSALTRT requires at least 1,000 bootstrap samples for the percentile and bias-corrected bootstrap confidence intervals and does not compute them if fewer than 900 of the samples produce usable estimates. If the number of samples n specified in the NBOOT=n option is less than 1,000 and the percentile or bias-corrected bootstrap confidence intervals are requested, the value of n is ignored. Bootstrap confidence intervals and the bootstrap standard deviations are computed in the same manner for the estimates of the ATE and ATT.