The PSMATCH Procedure

Variable Balance Assessment

Propensity score analysis assumes that the true propensity scores are known. When the propensity scores are estimated—as is usually the case in practice—you need to assess how well the distributions of the propensity scores (or their logits) and the adjusted variables are balanced between the treatment group and the control group. PROC PSMATCH provides numerical and graphical tools that you can use to assess the balance of the propensity scores (or their logits) and the adjusted variables that are either continuous or binary.

The ASSESS statement in the PSMATCH procedure provides a variety of statistical measures and graphical displays for comparing these distributions. You can make these assessments for all the observations in the data set, the observations in the support region, or the matched observations (if you specify a MATCH statement).

Two statistical measures for balance assessment are the standardized mean difference between the treatment and control groups and the treated-to-control variance ratio. For good variable balance, the absolute standardized mean difference should be less than or equal to 0.25, and the variance ratio should be between 0.5 and 2 (Rubin 2001, p. 174; Stuart 2010, p. 11). Some authors have applied a smaller threshold of 0.1 to the absolute standardized mean difference (Normand et al. 2001; Mamdani et al. 2005; Austin 2009).
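As a rough illustration of these rules of thumb, the following Python sketch flags whether a variable meets both criteria. The function name `is_balanced` and its parameters are hypothetical and are not part of PROC PSMATCH; the default cutoffs are the 0.25 and 0.5-to-2 thresholds quoted above, and the input values are made up.

```python
def is_balanced(std_mean_diff, variance_ratio,
                max_abs_diff=0.25, ratio_bounds=(0.5, 2.0)):
    """Return True when both rule-of-thumb balance criteria hold.

    std_mean_diff  : standardized mean difference (treated minus control)
    variance_ratio : treated-to-control variance ratio
    max_abs_diff   : cutoff for |std_mean_diff| (0.25 here; some authors use 0.1)
    ratio_bounds   : inclusive bounds for the variance ratio
    """
    low, high = ratio_bounds
    return abs(std_mean_diff) <= max_abs_diff and low <= variance_ratio <= high


# Hypothetical values for illustration only.
print(is_balanced(-0.18, 1.4))                    # within both default limits
print(is_balanced(-0.18, 1.4, max_abs_diff=0.1))  # fails the stricter 0.1 cutoff
print(is_balanced(0.05, 2.3))                     # variance ratio above 2
```

Passing `max_abs_diff=0.1` applies the stricter threshold that some of the cited authors use.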

The standardized mean difference is computed by dividing the difference in the means of the variable in the two groups by an estimate of the standard deviation. Two estimates of the standard deviation are available in the PSMATCH procedure:

the square root of the average of the variances in the treatment and control groups (Rosenbaum and Rubin 1985, p. 37), which corresponds to STDDEV=POOLED
the standard deviation of observations in the treatment group only (Stuart 2010, p. 11), which corresponds to STDDEV=TREATED

For binary classification variables, the mean is taken to be the proportion $p$ of units having the first classification level, and the variance is computed as $p(1-p)$ (Austin, Grootendorst, and Anderson 2007, p. 737).
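The treatment of a binary variable can be sketched in Python as follows. This is a minimal illustration, not the procedure's implementation: the helper name and the sample data are hypothetical, the variance is taken to be p(1 - p) (the usual variance of a binary indicator), and the pooled form of the standard deviation is assumed.

```python
import math


def binary_mean_and_variance(values, first_level):
    """Return (p, p*(1-p)), where p is the share of units at first_level."""
    p = sum(1 for v in values if v == first_level) / len(values)
    return p, p * (1 - p)


# Hypothetical data: a two-level classification variable in each group.
treated = ["F", "F", "M", "F", "M"]
control = ["F", "M", "M", "M"]

p_t, var_t = binary_mean_and_variance(treated, first_level="F")
p_c, var_c = binary_mean_and_variance(control, first_level="F")

# Pooled standard deviation: square root of the average of the two variances.
pooled_sd = math.sqrt((var_t + var_c) / 2)
std_mean_diff = (p_t - p_c) / pooled_sd

print(p_t, var_t)
print(p_c, var_c)
print(std_mean_diff)
```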

If you specify a STRATA statement, then stratum-specific standardized mean differences are computed for observations in the support region.

The PSMATCH procedure displays the standardized mean differences in plots. You can also request box plots and cloud plots for continuous variables, and bar charts for binary classification variables. These plots are also produced for each stratum if you specify a STRATA statement.

The following subsections describe how standardized mean differences and treated-to-control variance ratios are computed for all observations, observations in the support region, observations pooled across strata, and matched observations.

Standardized Mean Differences for All Observations

For all observations in the data set, let $\bar{x}_{\mathrm{t}(\mathrm{A})}$ be the mean of the observations in the treatment group and let $\bar{x}_{\mathrm{c}(\mathrm{A})}$ be the mean of the observations in the control group, with corresponding sample variances $V(x_{\mathrm{t}(\mathrm{A})})$ and $V(x_{\mathrm{c}(\mathrm{A})})$. Then the standardized mean difference is

\[
d_{(\mathrm{A})} = \frac{\bar{x}_{\mathrm{t}(\mathrm{A})} - \bar{x}_{\mathrm{c}(\mathrm{A})}}{s_{(\mathrm{A})}}
\]

where the standard deviation is given by

\[
s_{(\mathrm{A})} =
\begin{cases}
\sqrt{\dfrac{V(x_{\mathrm{t}(\mathrm{A})}) + V(x_{\mathrm{c}(\mathrm{A})})}{2}} & \text{if STDDEV=POOLED} \\[2ex]
\sqrt{V(x_{\mathrm{t}(\mathrm{A})})} & \text{if STDDEV=TREATED}
\end{cases}
\]

The treated-to-control variance ratio is

\[
\frac{V(x_{\mathrm{t}(\mathrm{A})})}{V(x_{\mathrm{c}(\mathrm{A})})}
\]
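The formulas in this subsection can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's own code: the function names and data are hypothetical, and `statistics.variance` (which divides by n - 1) stands in for the sample variance.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control, stddev="POOLED"):
    """(mean of treated - mean of control) / s, with s chosen by stddev."""
    v_t = variance(treated)  # sample variance of the treatment group
    v_c = variance(control)  # sample variance of the control group
    if stddev == "POOLED":
        s = math.sqrt((v_t + v_c) / 2)
    elif stddev == "TREATED":
        s = math.sqrt(v_t)
    else:
        raise ValueError("stddev must be 'POOLED' or 'TREATED'")
    return (mean(treated) - mean(control)) / s


def variance_ratio(treated, control):
    """Treated-to-control ratio of sample variances."""
    return variance(treated) / variance(control)


# Hypothetical values of one continuous variable in each group.
treated = [2.0, 4.0, 6.0, 8.0]
control = [1.0, 2.0, 3.0, 4.0]

d_pooled = standardized_mean_difference(treated, control, "POOLED")
d_treated = standardized_mean_difference(treated, control, "TREATED")
ratio = variance_ratio(treated, control)
print(d_pooled, d_treated, ratio)
```

With these made-up data the treated group has the larger variance, so the TREATED denominator is larger than the pooled one and yields the smaller standardized difference.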

Standardized Mean Differences for Observations in the Support Region

For observations in the support region, let $\bar{x}_{\mathrm{t}(\mathrm{R})}$ be the mean of observations in the treatment group and let $\bar{x}_{\mathrm{c}(\mathrm{R})}$ be the mean of observations in the control group, with corresponding sample variances $V(x_{\mathrm{t}(\mathrm{R})})$ and $V(x_{\mathrm{c}(\mathrm{R})})$. Then the standardized mean difference is

\[
d_{(\mathrm{R})} = \frac{\bar{x}_{\mathrm{t}(\mathrm{R})} - \bar{x}_{\mathrm{c}(\mathrm{R})}}{s}
\]

where the standard deviation $s$ is given by

\[
s =
\begin{cases}
s_{(\mathrm{A})} & \text{if ALLOBS=YES} \\
s_{(\mathrm{R})} & \text{if ALLOBS=NO}
\end{cases}
\]

and

\[
s_{(\mathrm{R})} =
\begin{cases}
\sqrt{\dfrac{V(x_{\mathrm{t}(\mathrm{R})}) + V(x_{\mathrm{c}(\mathrm{R})})}{2}} & \text{if STDDEV=POOLED} \\[2ex]
\sqrt{V(x_{\mathrm{t}(\mathrm{R})})} & \text{if STDDEV=TREATED}
\end{cases}
\]

That is, with ALLOBS=YES, the standard deviation that is derived from all observations in the data set is used to compute the standardized mean difference. With ALLOBS=NO, the standard deviation that is derived from observations in the support region is used to compute the standardized mean difference.
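The effect of the ALLOBS= choice can be sketched in Python as follows. The function names, the data, and the way the support-region subsets are picked are all hypothetical; the sketch assumes only what the paragraph above states, namely that the means always come from the support region while the standard deviation comes from either all observations or the support region.

```python
import math
from statistics import mean, variance


def std_dev(treated, control, stddev="POOLED"):
    """Standard deviation used as the denominator (POOLED or TREATED)."""
    if stddev == "POOLED":
        return math.sqrt((variance(treated) + variance(control)) / 2)
    if stddev == "TREATED":
        return math.sqrt(variance(treated))
    raise ValueError("stddev must be 'POOLED' or 'TREATED'")


def support_region_smd(all_t, all_c, region_t, region_c,
                       stddev="POOLED", allobs=True):
    """Difference in support-region means, scaled by the chosen s."""
    if allobs:
        s = std_dev(all_t, all_c, stddev)        # derived from all observations
    else:
        s = std_dev(region_t, region_c, stddev)  # derived from the support region
    return (mean(region_t) - mean(region_c)) / s


# Hypothetical data; the region lists are subsets assumed to lie in the support region.
all_treated = [2.0, 4.0, 6.0, 8.0]
all_control = [1.0, 2.0, 3.0, 4.0]
region_treated = [2.0, 4.0, 6.0]
region_control = [2.0, 3.0, 4.0]

d_yes = support_region_smd(all_treated, all_control,
                           region_treated, region_control, allobs=True)
d_no = support_region_smd(all_treated, all_control,
                          region_treated, region_control, allobs=False)
print(d_yes, d_no)
```

Using the all-observations denominator keeps the scale fixed, so a change in the standardized difference reflects only a change in the group means.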

The treated-to-control variance ratio is

\[
\frac{V(x_{\mathrm{t}(\mathrm{R})})}{V(x_{\mathrm{c}(\mathrm{R})})}
\]

The percentage reduction in the standardized mean difference is computed as

\[
100 \times \frac{\max\left( \left| d_{(\mathrm{A})} \right| - \left| d_{(\mathrm{R})} \right|,\; 0 \right)}{\left| d_{(\mathrm{A})} \right|}
\]
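A small Python sketch of this percentage reduction follows. The function name and inputs are hypothetical. The guard against a zero standardized difference for all observations is an added assumption, since the text does not say how that case is handled.

```python
def percent_reduction(d_all, d_adjusted):
    """100 * max(|d_all| - |d_adjusted|, 0) / |d_all|.

    d_all      : standardized mean difference for all observations
    d_adjusted : standardized mean difference after restriction or matching
    """
    if d_all == 0:
        # Assumption: the ratio is undefined when there is no initial difference.
        raise ValueError("percent reduction is undefined when d_all is 0")
    return 100 * max(abs(d_all) - abs(d_adjusted), 0) / abs(d_all)


# Hypothetical standardized mean differences.
improved = percent_reduction(0.5, 0.1)    # absolute difference shrinks
worsened = percent_reduction(0.2, -0.3)   # absolute difference grows
print(improved, worsened)
```

Because of the max with 0, a variable whose absolute standardized difference grows is reported with a reduction of 0 rather than a negative value.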

Pooled Standardized Mean Differences across the Strata

Let $\bar{x}_{\mathrm{t}(\mathrm{S})}$ be the weighted stratum mean of treated observations, and let $\bar{x}_{\mathrm{c}(\mathrm{S})}$ be the weighted stratum mean of control observations, with corresponding variances $V(x_{\mathrm{t}(\mathrm{S})})$ and $V(x_{\mathrm{c}(\mathrm{S})})$. For information about these statistics, see the section Weighting after Stratification.

The standardized mean difference is

\[
d_{(\mathrm{S})} = \frac{\bar{x}_{\mathrm{t}(\mathrm{S})} - \bar{x}_{\mathrm{c}(\mathrm{S})}}{s}
\]

where the standard deviation $s$ is $s_{(\mathrm{A})}$ with ALLOBS=YES and $s_{(\mathrm{S})}$ with ALLOBS=NO, and

\[
s_{(\mathrm{S})} =
\begin{cases}
\sqrt{\dfrac{V(x_{\mathrm{t}(\mathrm{S})}) + V(x_{\mathrm{c}(\mathrm{S})})}{2}} & \text{if STDDEV=POOLED} \\[2ex]
\sqrt{V(x_{\mathrm{t}(\mathrm{S})})} & \text{if STDDEV=TREATED}
\end{cases}
\]

The treated-to-control variance ratio is

\[
\frac{V(x_{\mathrm{t}(\mathrm{S})})}{V(x_{\mathrm{c}(\mathrm{S})})}
\]

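The percentage reduction for the standardized mean difference is computed as

\[
100 \times \frac{\max\left( \left| d_{(\mathrm{A})} \right| - \left| d_{(\mathrm{S})} \right|,\; 0 \right)}{\left| d_{(\mathrm{A})} \right|}
\]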

Standardized Mean Differences for Matched Observations

Let $\bar{x}_{\mathrm{t}(\mathrm{M})}$ be the mean of matched observations in the treatment group, and let $\bar{x}_{\mathrm{c}(\mathrm{M})}$ be the mean of matched observations in the control group, with corresponding sample variances $V(x_{\mathrm{t}(\mathrm{M})})$ and $V(x_{\mathrm{c}(\mathrm{M})})$. Then the standardized mean difference is

\[
d_{(\mathrm{M})} = \frac{\bar{x}_{\mathrm{t}(\mathrm{M})} - \bar{x}_{\mathrm{c}(\mathrm{M})}}{s}
\]

where the standard deviation $s$ is $s_{(\mathrm{A})}$ with ALLOBS=YES and $s_{(\mathrm{M})}$ with ALLOBS=NO, and

\[
s_{(\mathrm{M})} =
\begin{cases}
\sqrt{\dfrac{V(x_{\mathrm{t}(\mathrm{M})}) + V(x_{\mathrm{c}(\mathrm{M})})}{2}} & \text{if STDDEV=POOLED} \\[2ex]
\sqrt{V(x_{\mathrm{t}(\mathrm{M})})} & \text{if STDDEV=TREATED}
\end{cases}
\]

The treated-to-control variance ratio is

\[
\frac{V(x_{\mathrm{t}(\mathrm{M})})}{V(x_{\mathrm{c}(\mathrm{M})})}
\]

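The percentage reduction for the standardized mean difference is computed as

\[
100 \times \frac{\max\left( \left| d_{(\mathrm{A})} \right| - \left| d_{(\mathrm{M})} \right|,\; 0 \right)}{\left| d_{(\mathrm{A})} \right|}
\]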

Last updated: December 09, 2022