The CAUSALTRT Procedure

Standard Errors and Confidence Intervals

Asymptotic Formulas

PROC CAUSALTRT computes standard errors for the potential outcome means and treatment effect by using the robust sandwich covariance formula, which is based on asymptotic theory. In general, let $\boldsymbol{\theta}_0$ denote the vector of parameters of interest. PROC CAUSALTRT treats each of its estimation methods as an M-estimation problem that solves the following estimating equations:

$$\mathbf{S}(\boldsymbol{\theta}) = \sum_{i=1}^{n} \mathbf{S}_i = \mathbf{0}$$

When the function $\mathbf{S}(\cdot)$ is evaluated at $\boldsymbol{\theta}_0$, it satisfies $E[\mathbf{S}] = \mathbf{0}$. By the theory of M-estimation (Stefanski and Boos 2002), the robust sandwich covariance matrix, $\hat{V}(\hat{\boldsymbol{\theta}})$, for the estimates $\hat{\boldsymbol{\theta}}$ is given by

$$n^{-1}\, \mathbf{A}_n(\hat{\boldsymbol{\theta}})^{-1}\, \mathbf{B}_n(\hat{\boldsymbol{\theta}})\, \mathbf{A}_n(\hat{\boldsymbol{\theta}})^{-T}$$

where $n$ is the sample size, $\mathbf{A}^{-T}$ denotes $(\mathbf{A}^{T})^{-1}$, and

$$\mathbf{A}_n(\hat{\boldsymbol{\theta}}) = n^{-1} \sum_{i=1}^{n} \frac{-\partial \mathbf{S}_i}{\partial \boldsymbol{\theta}^{T}}$$
$$\mathbf{B}_n(\hat{\boldsymbol{\theta}}) = n^{-1} \sum_{i=1}^{n} \mathbf{S}_i \mathbf{S}_i^{T}$$

PROC CAUSALTRT computes standard errors by taking the square roots of the diagonal elements of $\hat{V}(\hat{\boldsymbol{\theta}})$.
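As a minimal illustration of the sandwich formula, the following Python sketch applies it to the simplest possible M-estimator, a sample mean with estimating function $S_i(\theta) = y_i - \theta$. The estimating function, the function name `sandwich_se_mean`, and the data are assumptions chosen for illustration; they are not the estimating equations that PROC CAUSALTRT solves, and the sketch omits the modifications that the procedure applies to $\mathbf{A}_n$ and $\mathbf{B}_n$.

```python
import math

def sandwich_se_mean(y):
    """Sandwich standard error for the M-estimator of a mean.

    Illustrative estimating function: S_i(theta) = y_i - theta.
    """
    n = len(y)
    theta_hat = sum(y) / n  # solves sum_i S_i(theta) = 0
    # A_n = n^{-1} sum_i (-dS_i/dtheta); the derivative of S_i is -1, so A_n = 1
    a_n = 1.0
    # B_n = n^{-1} sum_i S_i^2, evaluated at theta_hat
    b_n = sum((yi - theta_hat) ** 2 for yi in y) / n
    # V_hat = n^{-1} A_n^{-1} B_n A_n^{-T} (scalars here)
    v_hat = (1.0 / n) * (1.0 / a_n) * b_n * (1.0 / a_n)
    return theta_hat, math.sqrt(v_hat)

theta_hat, se = sandwich_se_mean([1.0, 2.0, 3.0, 4.0, 5.0])
```

In this scalar case the sandwich variance reduces to the average squared residual divided by $n$. With several stacked estimating functions, $\mathbf{A}_n$ and $\mathbf{B}_n$ become matrices and the standard errors come from the diagonal of the product.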

The form and dimension of $\mathbf{S}(\boldsymbol{\theta})$, and hence those of the robust sandwich covariance matrix, depend on which estimation method is used. Whenever an outcome or treatment (propensity score) model is used in a particular estimation method, the estimating equations must also include the corresponding score vectors for that model.

Let $\boldsymbol{\beta}$ be a generic vector of parameters of a model, and let $\boldsymbol{\mu} = (\mu_0, \mu_1)$ be the vector of potential outcome means. Further, let $\mathbf{S}_{\mathrm{ps}}$, $\mathbf{S}_{\mathrm{out}}$, and $\mathbf{S}^{w}_{\mathrm{out}}$ denote the score functions from the maximum likelihood estimation of the propensity score, outcome, and inverse probability weighted outcome models, respectively. Then the composition of $\boldsymbol{\theta}_0$ and $\mathbf{S}_i$ for each estimation method is as follows:

  • AIPW

    $$\boldsymbol{\theta}_0 = (\boldsymbol{\beta}_{\mathrm{ps}}, \boldsymbol{\beta}_{\mathrm{c}}, \boldsymbol{\beta}_{\mathrm{t}}, \boldsymbol{\mu})$$
    $$\mathbf{S}_i = (\mathbf{S}_{\mathrm{ps},i}, \mathbf{S}_{\mathrm{out},i}, \mathbf{S}_{\mathrm{ipw},i})$$
  • IPW and IPWR

    $$\boldsymbol{\theta}_0 = (\boldsymbol{\beta}_{\mathrm{ps}}, \boldsymbol{\mu})$$
    $$\mathbf{S}_i = (\mathbf{S}_{\mathrm{ps},i}, \mathbf{S}_{\mathrm{ipw},i})$$
  • IPWS

    $$\boldsymbol{\theta}_0 = (\boldsymbol{\beta}_{\mathrm{ps}}, \boldsymbol{\eta}, \boldsymbol{\mu})$$
    $$\mathbf{S}_i = (\mathbf{S}_{\mathrm{ps},i}, \mathbf{S}_{\boldsymbol{\eta},i}, \mathbf{S}_{\mathrm{ipw},i})$$
  • IPWREG

    $$\boldsymbol{\theta}_0 = (\boldsymbol{\beta}_{\mathrm{ps}}, \boldsymbol{\beta}^{w}_{\mathrm{c}}, \boldsymbol{\beta}^{w}_{\mathrm{t}}, \boldsymbol{\mu})$$
    $$\mathbf{S}_i = (\mathbf{S}_{\mathrm{ps},i}, \mathbf{S}^{w}_{\mathrm{out},i}, \mathbf{S}^{w}_{\mathrm{reg},i})$$
  • REGADJ

    $$\boldsymbol{\theta}_0 = (\boldsymbol{\beta}_{\mathrm{c}}, \boldsymbol{\beta}_{\mathrm{t}}, \boldsymbol{\mu})$$
    $$\mathbf{S}_i = (\mathbf{S}_{\mathrm{out},i}, \mathbf{S}_{\mathrm{reg},i})$$

The $(1-\alpha)100\%$ Wald confidence interval for the potential outcome mean $\mu_i$ is given by

$$\hat{\mu}_i \pm z_{1-\alpha/2}\, \hat{\sigma}_i$$

where $z_p$ is the $100p$th percentile of the standard normal distribution and $\hat{\sigma}_i$ is the standard error estimate for the potential outcome estimate $\hat{\mu}_i$. Because the ATE is the difference in potential outcome means, its standard error is computed by applying standard variance rules to the estimates from the covariance matrix for the potential outcomes. The corresponding Wald confidence interval for the ATE is then constructed by the standard formula, as shown in the previous expression. For the estimates of conditional potential outcome means, the standard errors and confidence limits are computed by using the same approach but with $\mathbf{S}_{\mathrm{ipw},i}$ and $\mathbf{S}_{\mathrm{reg},i}$ replaced by $\mathbf{S}_{\mathrm{ipw},i \mid T=1}$ and $\mathbf{S}_{\mathrm{reg},i \mid T=1}$.
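The variance rule for a difference and the Wald interval can be sketched in Python as follows. The function names `wald_ci` and `ate_with_se`, the point estimates, and the 2x2 covariance matrix are hypothetical values for illustration only. The sketch assumes that the ATE is $\mu_1 - \mu_0$, so that its variance is $\mathrm{Var}(\hat{\mu}_1) + \mathrm{Var}(\hat{\mu}_0) - 2\,\mathrm{Cov}(\hat{\mu}_0, \hat{\mu}_1)$.

```python
import math
from statistics import NormalDist

def wald_ci(estimate, std_err, alpha=0.05):
    """(1 - alpha)100% Wald interval: estimate +/- z_{1-alpha/2} * std_err."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * std_err, estimate + z * std_err

def ate_with_se(mu0_hat, mu1_hat, cov):
    """ATE = mu1 - mu0 and its standard error.

    cov is the 2x2 covariance matrix of (mu0_hat, mu1_hat).
    """
    ate = mu1_hat - mu0_hat
    var = cov[1][1] + cov[0][0] - 2.0 * cov[0][1]
    return ate, math.sqrt(var)

# Hypothetical estimates and covariance matrix, for illustration only
mu0_hat, mu1_hat = 10.0, 12.5
cov_hat = [[0.04, 0.01],
           [0.01, 0.09]]

ate, se_ate = ate_with_se(mu0_hat, mu1_hat, cov_hat)
lower, upper = wald_ci(ate, se_ate, alpha=0.05)
```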

To provide better theoretical and computational properties, PROC CAUSALTRT uses modified expressions for components of $\mathbf{A}_n$ and $\mathbf{B}_n$ when the covariance matrix $\hat{V}(\hat{\boldsymbol{\theta}})$ is computed. For descriptions of the modifications, their motivation, and theoretical justification, see Pierce (1982); Robins, Rotnitzky, and Zhao (1995); Lunceford and Davidian (2004); Wooldridge (2010).

Bootstrap Methods

If you specify the BOOTSTRAP statement, PROC CAUSALTRT uses bootstrap resampling to compute standard errors and confidence intervals for the potential outcome means and treatment effect. The procedure draws as many bootstrap sample data sets (replicates) as you specify in the NBOOT= option and then estimates the potential outcome means and treatment effect for each replicate. The bootstrap samples are taken from within the treatment conditions, so each replicate contains the same number of usable observations in the control condition and in the treatment condition as the input data set does.
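Resampling within treatment conditions can be sketched in Python as follows. This is a sketch of stratified resampling with replacement under simple assumptions (records stored as `(treatment, outcome)` tuples, a seeded `random.Random` generator); the function name `stratified_bootstrap_sample` and the data are hypothetical, and the sketch does not reproduce how PROC CAUSALTRT draws its samples internally.

```python
import random

def stratified_bootstrap_sample(records, rng):
    """Draw one bootstrap replicate, resampling with replacement
    separately within each treatment condition.

    records: list of (treatment, outcome) tuples.
    """
    by_condition = {}
    for rec in records:
        by_condition.setdefault(rec[0], []).append(rec)
    replicate = []
    for group in by_condition.values():
        # Same number of observations per condition as in the input data
        replicate.extend(rng.choices(group, k=len(group)))
    return replicate

rng = random.Random(12345)  # fixed seed for reproducibility
data = [(0, 1.2), (0, 0.7), (0, 1.9), (1, 2.4), (1, 3.1)]
replicate = stratified_bootstrap_sample(data, rng)
```

Because each condition is resampled separately, every replicate here has three control and two treated observations, which matches the group sizes of the input.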

Bootstrap confidence intervals are computed only for the potential outcome means and treatment effect. You can specify one or more of the following types of bootstrap confidence intervals in the BOOTCI option in the BOOTSTRAP statement:

  • The BOOTCI(NORMAL) option uses the normal approximation method to compute the bootstrap confidence interval. That is, the $(1-\alpha)100\%$ normal bootstrap confidence interval is given by

    $$\hat{\mu}_j \pm \sigma_{\mu_j^*} \times z_{(1-\alpha/2)}$$

    where $\hat{\mu}_j$ is the estimate of $\mu_j$ from the original sample, $\sigma_{\mu_j^*}$ is the standard deviation of the bootstrap parameter estimates, and $z_{(1-\alpha/2)}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. PROC CAUSALTRT requires at least 50 bootstrap samples for normal bootstrap confidence intervals and does not compute them if 40 or fewer of the samples produce usable estimates.

  • The BOOTCI(PERC) option requests the percentile method, which uses the $100(\alpha/2)$th and $100(1-\alpha/2)$th percentiles of the bootstrap parameter estimates as the confidence limits. These percentiles are computed as follows. Let $\mu^*_{j,1}, \mu^*_{j,2}, \ldots, \mu^*_{j,B}$ represent the ordered values of the bootstrap estimates for the potential outcome mean $\mu_j$. Let the $k$th weighted average percentile be $q$, set $p = \frac{k}{100}$, and let

    $$np = l + g$$

    where $l$ is the integer part of $np$ and $g$ is the fractional part of $np$. Then the $k$th percentile, $q$, is computed as follows, which corresponds to the default percentile definition of the UNIVARIATE procedure:

    $$q = \begin{cases} \frac{1}{2}\left(\mu^*_{j,l} + \mu^*_{j,l+1}\right) & \text{if } g = 0 \\ \mu^*_{j,l+1} & \text{if } g > 0 \end{cases}$$

  • The BOOTCI(BC) option requests bias-corrected bootstrap confidence intervals, which use the cumulative distribution function (CDF), $G(\mu^*)$, of the bootstrap parameter estimates to determine the upper and lower endpoints of the confidence interval. The bias-corrected bootstrap confidence interval is given by

    $$G^{-1}\left(\Phi\left(2 z_0 \pm z_{\alpha/2}\right)\right)$$

    where $\Phi$ is the standard normal CDF, $z_{\alpha/2} = \Phi^{-1}(\alpha/2)$, and $z_0$ is a bias correction,

    $$z_0 = \Phi^{-1}\left(\frac{\mathrm{N}(\mu_j^* \le \hat{\mu}_j)}{B}\right)$$

    where $\hat{\mu}_j$ is the original sample estimate of $\mu_j$ from the input data set; $\mathrm{N}(\mu_j^* \le \hat{\mu}_j)$ is the number of bootstrap estimates, $\mu_j^*$, that are less than or equal to $\hat{\mu}_j$; and $B$ is the number of bootstrap replicates for which an estimate for the treatment effect is obtained.
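The normal and percentile intervals described in the list above can be sketched in Python as follows. Several details are assumptions rather than documented behavior: the count $n$ in $np = l + g$ is taken to be the number of bootstrap estimates, ranks that fall outside $1, \ldots, n$ are clamped to the nearest end, the standard deviation uses the $n-1$ divisor, and the function names are hypothetical. The replicate-count requirements of PROC CAUSALTRT are not enforced here.

```python
import math
from statistics import NormalDist, stdev

def weighted_average_percentile(ordered, p):
    """Percentile rule from the text: with n*p = l + g,
    return the mean of the l-th and (l+1)-th ordered values if g == 0,
    and the (l+1)-th ordered value if g > 0 (ranks are 1-based)."""
    n = len(ordered)
    np_ = round(n * p, 9)  # rounding guards against floating-point noise
    l = math.floor(np_)
    g = np_ - l

    def at(rank):
        # Assumption: clamp out-of-range ranks to the first or last value
        return ordered[min(max(rank, 1), n) - 1]

    if g == 0:
        return 0.5 * (at(l) + at(l + 1))
    return at(l + 1)

def percentile_ci(boot_estimates, alpha=0.05):
    """Percentile interval: the alpha/2 and 1 - alpha/2 percentiles."""
    ordered = sorted(boot_estimates)
    return (weighted_average_percentile(ordered, alpha / 2.0),
            weighted_average_percentile(ordered, 1.0 - alpha / 2.0))

def normal_ci(estimate, boot_estimates, alpha=0.05):
    """Normal interval: estimate +/- sd(bootstrap estimates) * z_{1-alpha/2}."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    sd = stdev(boot_estimates)  # assumption: n - 1 divisor
    return estimate - sd * z, estimate + sd * z

# Tiny hypothetical set of bootstrap estimates, for illustration only
boot = [float(v) for v in range(1, 11)]
perc_lower, perc_upper = percentile_ci(boot, alpha=0.2)
norm_lower, norm_upper = normal_ci(5.5, boot, alpha=0.05)
```

With ten ordered estimates and $\alpha = 0.2$, $np$ equals 1 and 9 exactly, so both limits fall in the $g = 0$ branch and are averages of adjacent ordered values.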

The bias-corrected bootstrap confidence intervals are the default intervals used if you do not override them by specifying the NORMAL or PERC suboptions in the BOOTCI option.
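A bias-corrected interval can be sketched in Python as follows. The sketch assumes that $G^{-1}$ is evaluated with the same weighted-average percentile rule that is described for the percentile method; the documentation does not state how the inverse CDF is evaluated, so this choice, the function names, and the data are illustrative assumptions.

```python
import math
from statistics import NormalDist

def empirical_quantile(ordered, p):
    """Weighted-average percentile rule (n*p = l + g), 1-based ranks.
    Assumption: this rule stands in for the inverse bootstrap CDF."""
    n = len(ordered)
    np_ = round(n * p, 9)  # rounding guards against floating-point noise
    l = math.floor(np_)
    g = np_ - l

    def at(rank):
        return ordered[min(max(rank, 1), n) - 1]

    return 0.5 * (at(l) + at(l + 1)) if g == 0 else at(l + 1)

def bias_corrected_ci(estimate, boot_estimates, alpha=0.05):
    """Interval G^{-1}(Phi(2*z0 +/- z_{alpha/2}))."""
    nd = NormalDist()
    ordered = sorted(boot_estimates)
    b = len(ordered)
    # Proportion of bootstrap estimates <= the original-sample estimate
    prop = sum(1 for v in ordered if v <= estimate) / b
    if not 0.0 < prop < 1.0:
        raise ValueError("bias correction is undefined when the proportion is 0 or 1")
    z0 = nd.inv_cdf(prop)
    z_alpha = nd.inv_cdf(alpha / 2.0)  # negative for alpha < 1
    p_lower = nd.cdf(2.0 * z0 + z_alpha)
    p_upper = nd.cdf(2.0 * z0 - z_alpha)
    return empirical_quantile(ordered, p_lower), empirical_quantile(ordered, p_upper)

# Tiny hypothetical set of bootstrap estimates, for illustration only
boot = [float(v) for v in range(1, 11)]
centered = bias_corrected_ci(5.0, boot, alpha=0.2)  # half the estimates are <= 5
shifted = bias_corrected_ci(8.0, boot, alpha=0.2)   # 80% of the estimates are <= 8
```

When exactly half of the bootstrap estimates are at or below the original estimate, $z_0 = 0$ and the interval coincides with the percentile interval. When the proportion differs from one half, both probability levels move in the same direction.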

PROC CAUSALTRT requires at least 1,000 bootstrap samples for the percentile and bias-corrected bootstrap confidence intervals and does not compute them if fewer than 900 of the samples produce usable estimates. If the number of samples n specified in the NBOOT=n option is less than 1,000 and the percentile or bias-corrected bootstrap confidence intervals are requested, the value of n is ignored. Bootstrap confidence intervals and the bootstrap standard deviations are computed in the same manner for the estimates of the ATE and ATT.
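The usable-replicate thresholds stated in this section can be collected in a small Python helper. The function name and the string labels are hypothetical; the sketch only encodes the thresholds quoted in the text (40 or fewer usable estimates for normal intervals, fewer than 900 for percentile and bias-corrected intervals) and does not model any other behavior of the procedure.

```python
def enough_usable_replicates(ci_type, n_usable):
    """Return True if the number of usable bootstrap estimates meets
    the threshold stated in the text for the given interval type."""
    kind = ci_type.upper()
    if kind == "NORMAL":
        # Not computed if 40 or fewer samples produce usable estimates
        return n_usable > 40
    if kind in ("PERC", "BC"):
        # Not computed if fewer than 900 samples produce usable estimates
        return n_usable >= 900
    raise ValueError(f"unknown interval type: {ci_type!r}")

checks = {
    ("NORMAL", 40): enough_usable_replicates("NORMAL", 40),
    ("NORMAL", 41): enough_usable_replicates("NORMAL", 41),
    ("PERC", 899): enough_usable_replicates("PERC", 899),
    ("BC", 900): enough_usable_replicates("BC", 900),
}
```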

Last updated: December 09, 2022