Propensity score analysis assumes that the true propensity scores are known. When the propensity scores are estimated—as is usually the case in practice—you need to assess how well the distributions of the propensity scores (or their logits) and the adjusted variables are balanced between the treatment group and the control group. PROC PSMATCH provides numerical and graphical tools that you can use to assess the balance of the propensity scores (or their logits) and the adjusted variables that are either continuous or binary.
The ASSESS statement in the PSMATCH procedure provides a variety of statistical measures and graphical displays for comparing these distributions. You can make these assessments for all the observations in the data set, the observations in the support region, or the matched observations (if you specify a MATCH statement).
Two statistical measures for balance assessment are the standardized mean difference between the treatment and control groups and the treated-to-control variance ratio. For good variable balance, the absolute standardized mean difference should be less than or equal to 0.25, and the variance ratio should be between 0.5 and 2 (Rubin 2001, p. 174; Stuart 2010, p. 11). Some authors have applied a smaller threshold of 0.1 to the absolute standardized mean difference (Normand et al. 2001; Mamdani et al. 2005; Austin 2009).
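The thresholds above can be written as a simple check. The following Python sketch is only an illustration of the rule of thumb, not part of PROC PSMATCH; the function name `is_balanced` and its parameter names are hypothetical, and the default limits are the ones quoted from Rubin (2001) and Stuart (2010).

```python
def is_balanced(std_mean_diff, variance_ratio,
                smd_limit=0.25, vr_low=0.5, vr_high=2.0):
    """Return True if a variable meets both balance criteria.

    std_mean_diff  : standardized mean difference (treated - control)
    variance_ratio : treated-to-control variance ratio
    smd_limit      : upper bound on |standardized mean difference|
    vr_low/vr_high : acceptable range for the variance ratio
    """
    smd_ok = abs(std_mean_diff) <= smd_limit
    vr_ok = vr_low <= variance_ratio <= vr_high
    return smd_ok and vr_ok


# A variable with SMD = -0.18 and variance ratio 1.4 passes the 0.25 rule
# but fails the stricter 0.1 threshold that some authors apply.
print(is_balanced(-0.18, 1.4))
print(is_balanced(-0.18, 1.4, smd_limit=0.1))
```

Passing `smd_limit=0.1` switches to the stricter threshold without changing the variance ratio bounds.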
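The standardized mean difference is computed by dividing the difference in the means of the variable in the two groups by an estimate of the standard deviation. Two estimates of the standard deviation are available in the PSMATCH procedure: the pooled standard deviation (STDDEV=POOLED), which is the square root of the average of the treatment group and control group variances, and the standard deviation of the treatment group (STDDEV=TREATED).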
For binary classification variables, the mean is taken to be the proportion $p$ of units having the first classification level, and the variance is computed as $p(1-p)$ (Austin, Grootendorst, and Anderson 2007, p. 737).
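As a small worked illustration of the binary case, the following Python sketch computes a standardized mean difference from two groups of 0/1 indicators, using the proportion as the mean and $p(1-p)$ as the variance. It assumes a pooled standard deviation (the square root of the average of the two group variances); the function name `binary_smd` is hypothetical and the code is not the procedure's implementation.

```python
import math


def binary_smd(treated, control):
    """Standardized difference in proportions for a binary variable.

    treated, control: sequences of 0/1 values, where 1 marks the first
    classification level. Assumes a pooled standard deviation.
    """
    p_t = sum(treated) / len(treated)   # proportion in the treatment group
    p_c = sum(control) / len(control)   # proportion in the control group
    var_t = p_t * (1 - p_t)             # variance taken as p(1 - p)
    var_c = p_c * (1 - p_c)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    if pooled_sd == 0:
        raise ValueError("both groups are constant; the difference is undefined")
    return (p_t - p_c) / pooled_sd


# 75% of treated units and 25% of control units are at the first level.
d = binary_smd([1, 1, 1, 0], [1, 0, 0, 0])
print(round(d, 4))
```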
If you specify a STRATA statement, then stratum-specific standardized mean differences are computed for observations in the support region.
The PSMATCH procedure displays the standardized mean differences in plots. You can also request box plots and cloud plots for continuous variables, and bar charts for binary classification variables. These plots are also produced for each stratum if you specify a STRATA statement.
The next three subsections describe how standardized mean differences and treated-to-control variance ratios are computed for all observations, observations in the support region, and matched observations.
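For all observations in the data set, let $\bar{x}_t$ be the mean of the observations in the treatment group and let $\bar{x}_c$ be the mean of the observations in the control group, with corresponding sample variances $s_t^2$ and $s_c^2$. Then the standardized mean difference is

$$d = \frac{\bar{x}_t - \bar{x}_c}{s}$$

where the standard deviation $s$ is given by

$$s = \begin{cases} \sqrt{(s_t^2 + s_c^2)/2} & \text{if STDDEV=POOLED} \\ s_t & \text{if STDDEV=TREATED} \end{cases}$$

The treated-to-control variance ratio is

$$\frac{s_t^2}{s_c^2}$$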
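The computation for all observations can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure's own code: the function name `smd_and_variance_ratio` and the `stddev` argument are hypothetical, sample variances use the usual $n-1$ denominator, and the pooled estimate is taken to be the square root of the average of the two group variances.

```python
import math
from statistics import mean, variance


def smd_and_variance_ratio(treated, control, stddev="pooled"):
    """Return (standardized mean difference, treated-to-control variance ratio).

    treated, control: sequences of values for one continuous variable.
    stddev: "pooled" uses sqrt((var_t + var_c) / 2); "treated" uses the
            treatment group standard deviation.
    """
    var_t = variance(treated)   # sample variance, n - 1 denominator
    var_c = variance(control)
    if stddev == "pooled":
        s = math.sqrt((var_t + var_c) / 2)
    elif stddev == "treated":
        s = math.sqrt(var_t)
    else:
        raise ValueError("stddev must be 'pooled' or 'treated'")
    d = (mean(treated) - mean(control)) / s   # treated minus control
    return d, var_t / var_c


treated = [2.0, 4.0, 6.0]   # mean 4, variance 4
control = [1.0, 2.0, 3.0]   # mean 2, variance 1
print(smd_and_variance_ratio(treated, control, stddev="pooled"))
print(smd_and_variance_ratio(treated, control, stddev="treated"))
```

The variance ratio does not depend on which standard deviation estimate is chosen; only the standardized mean difference changes.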
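For observations in the support region, let $\bar{x}_{tr}$ be the mean of observations in the treatment group and $\bar{x}_{cr}$ be the mean of observations in the control group, with corresponding sample variances $s_{tr}^2$ and $s_{cr}^2$. Then the standardized mean difference is

$$d_r = \frac{\bar{x}_{tr} - \bar{x}_{cr}}{s_r}$$

where the standard deviation $s_r$ is given by

$$s_r = \begin{cases} \sqrt{(s_t^2 + s_c^2)/2} & \text{if STDDEV=POOLED(ALLOBS=YES)} \\ \sqrt{(s_{tr}^2 + s_{cr}^2)/2} & \text{if STDDEV=POOLED(ALLOBS=NO)} \\ s_t & \text{if STDDEV=TREATED(ALLOBS=YES)} \\ s_{tr} & \text{if STDDEV=TREATED(ALLOBS=NO)} \end{cases}$$

Here $s_t^2$ and $s_c^2$ are the treatment group and control group sample variances computed from all observations in the data set. That is, with ALLOBS=YES, the standard deviation that is derived from all observations in the data set is used to compute the standardized mean difference. With ALLOBS=NO, the standard deviation that is derived from observations in the support region is used to compute the standardized mean difference.

The treated-to-control variance ratio is

$$\frac{s_{tr}^2}{s_{cr}^2}$$

The percentage reduction in the standardized mean difference is computed as

$$100 \times \frac{|d| - |d_r|}{|d|}$$

where $d$ is the standardized mean difference for all observations in the data set.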
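The percentage reduction can be illustrated with a short Python sketch. It assumes that the reduction is measured as the drop in the absolute standardized mean difference relative to its value for all observations; the function name `percent_reduction` is hypothetical, and the sketch does not reproduce any rounding or truncation that the procedure might apply when it displays the result.

```python
def percent_reduction(d_all, d_after):
    """Percentage reduction in the absolute standardized mean difference.

    d_all   : standardized mean difference for all observations
    d_after : standardized mean difference after restricting to the
              support region (or after weighting or matching)
    A negative result means the absolute difference increased.
    """
    if d_all == 0:
        raise ValueError("no initial difference; the reduction is undefined")
    return 100.0 * (abs(d_all) - abs(d_after)) / abs(d_all)


# An absolute difference that falls from 0.50 to 0.20 is a 60% reduction.
print(percent_reduction(0.50, 0.20))
# The sign of the differences does not matter, only their magnitudes.
print(percent_reduction(-0.50, 0.20))
```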
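After stratification, let $\bar{x}_{tw}$ be the weighted stratum mean of treated observations, and let $\bar{x}_{cw}$ be the weighted stratum mean of control observations, with corresponding variances $s_{tw}^2$ and $s_{cw}^2$. For information about these statistics, see the section Weighting after Stratification.

The standardized mean difference is

$$d_w = \frac{\bar{x}_{tw} - \bar{x}_{cw}}{s_w}$$

where the standard deviation $s_w$ is the pooled or treatment group estimate selected for the assessment.

The treated-to-control variance ratio is

$$\frac{s_{tw}^2}{s_{cw}^2}$$

The percentage reduction for the standardized mean difference is computed as

$$100 \times \frac{|d| - |d_w|}{|d|}$$

where $d$ is the standardized mean difference for all observations in the data set.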
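Let $\bar{x}_{tm}$ be the mean of matched observations in the treatment group, and let $\bar{x}_{cm}$ be the mean of matched observations in the control group, with corresponding sample variances $s_{tm}^2$ and $s_{cm}^2$. Then the standardized mean difference is

$$d_m = \frac{\bar{x}_{tm} - \bar{x}_{cm}}{s_m}$$

where the standard deviation $s_m$ is the pooled or treatment group estimate selected for the assessment.

The treated-to-control variance ratio is

$$\frac{s_{tm}^2}{s_{cm}^2}$$

The percentage reduction for the standardized mean difference is computed as

$$100 \times \frac{|d| - |d_m|}{|d|}$$

where $d$ is the standardized mean difference for all observations in the data set.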