Introduction to Causal Analysis Procedures

PROC PSMATCH

The PSMATCH procedure provides a variety of tools for propensity score analysis (Rosenbaum and Rubin 1983). It outputs data sets that contain pertinent propensity score information that you can use in subsequent outcome analyses to estimate causal effects.

PROC PSMATCH conducts propensity score analysis of a binary treatment variable T (for example, T=1 indicates the treated level), which is posited to have a causal effect on an outcome variable Y. You fit a propensity score model to the data to predict the propensity scores from a set of pretreatment (or baseline) characteristics. A commonly used form of the propensity score model is logistic regression, which is the model that PROC PSMATCH fits. For an introduction to propensity score analysis, see Guo and Fraser (2015).

For example, the following statements specify a propensity score analysis of the treatment variable Music (which indicates musical training in subjects):

proc psmatch data=School;
   class Music Gender;
   psmodel Music = Gender Absence;
   match method=optimal;
   output out(obs=match)=OutMatch;
run;

In the PSMODEL statement, the propensity score model specifies that the probability of receiving musical training is determined by the pretreatment characteristics Gender and Absence. In the MATCH statement, you request an optimal one-to-one matching of subjects between the treated and control groups. PROC PSMATCH then selects subsets of observations from the original data so that the selected observations in the treated and control groups match their propensity score distributions as closely as possible.

In the OUTPUT statement, you request that PROC PSMATCH output a data set called OutMatch. This output data set contains the original data as well as information about the matched subjects. For the current example, PROC PSMATCH creates a weighting variable named _MATCHWGT_ to indicate the matched subjects. You can then use this data set to estimate the causal treatment effect of Music on the outcome variable of interest.

For example, you can use PROC TTEST to estimate the causal treatment effect of Music on GPA (representing academic performance) and to test the statistical significance of the estimated causal effect by specifying the following statements:

proc ttest data=OutMatch;
   class Music;
   var GPA;
   weight _MATCHWGT_;
run;

PROC PSMATCH implements various propensity score methods, which are summarized as follows:

propensity score matching method: PROC PSMATCH outputs matched observation weights for all subjects; subjects that are not matched are indicated by zero weights. Each set of matched subjects in the treated and control groups is also indicated by distinct identification numbers.
propensity score weighting method: PROC PSMATCH outputs weights that are computed on the basis of the predicted propensity scores for all subjects.
propensity score stratification method: PROC PSMATCH outputs stratum identification numbers for subjects. It also provides stratum weights that you can use to combine causal effect estimates from separate outcome analyses for the strata that PROC PSMATCH creates.
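The weighting and stratification methods can be requested in much the same way as matching. The following statements are a minimal sketch that reuses the School data set and the propensity score model from the earlier example. The output data set names (OutWgt, OutStrata), the variable names given in the OUTPUT statements, and the choice of five strata are illustrative assumptions; confirm the option details in Chapter 101, The PSMATCH Procedure.

```sas
/* Weighting sketch: write ATE and ATT weights for every observation. */
/* OutWgt, _ATEWgt_, and _ATTWgt_ are names chosen for illustration.  */
proc psmatch data=School;
   class Music Gender;
   psmodel Music = Gender Absence;
   output out(obs=all)=OutWgt atewgt=_ATEWgt_ attwgt=_ATTWgt_;
run;

/* Stratification sketch: group subjects into five strata by          */
/* propensity score. The number of strata and the names OutStrata and */
/* _Strata_ are assumptions made for this example.                    */
proc psmatch data=School;
   class Music Gender;
   psmodel Music = Gender Absence;
   strata nstrata=5;
   output out(obs=all)=OutStrata strata=_Strata_;
run;
```

In the stratification case, a subsequent outcome analysis would typically be run separately within each stratum and the stratum-specific estimates then combined by using the stratum weights.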

For each of these propensity score methods, PROC PSMATCH outputs the weights that you need in subsequent outcome analyses to estimate causal treatment effects. The weights depend on the estimand: PROC PSMATCH computes one type of weight for estimating the average treatment effect (ATE) and another for estimating the average treatment effect for the treated (ATT).
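As a point of reference, the standard inverse probability weights for the two estimands can be written as follows. The notation is introduced here for illustration and is not from the text above: e_i denotes the predicted propensity score of subject i, that is, the estimated probability that T_i = 1 given the pretreatment characteristics. Whether PROC PSMATCH applies any further normalization or stabilization to these weights is documented in Chapter 101, The PSMATCH Procedure.

```latex
% Standard inverse probability weights (illustrative notation):
% T_i is the treatment indicator; e_i is the predicted propensity score.
\[
  w_i^{\mathrm{ATE}} = \frac{T_i}{e_i} + \frac{1 - T_i}{1 - e_i},
  \qquad
  w_i^{\mathrm{ATT}} = T_i + (1 - T_i)\,\frac{e_i}{1 - e_i}
\]
```

Under the ATE weights, both groups are reweighted to resemble the full population. Under the ATT weights, treated subjects keep a weight of 1 and control subjects are reweighted to resemble the treated group.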

An important step in propensity score analysis is to evaluate the covariate balance after you fit a propensity score model. PROC PSMATCH provides numerical and graphical tools that you can use to assess the balance, including these:

standardized mean differences between treated and control groups in the covariates
percentage reductions of absolute mean differences after matching or weighting
comparisons of distributions of covariates and propensity scores before and after matching or weighting
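Balance assessments such as those listed above are requested by using the ASSESS statement. The following statements are a sketch that adds an ASSESS statement to the earlier matching example. The choice of variables and plots is illustrative, and the sketch assumes that Gender is a binary classification variable; see Chapter 101, The PSMATCH Procedure, for the full set of options.

```sas
/* Balance assessment sketch: compare the logit of the propensity    */
/* score (LPS) and the covariates Gender and Absence between the     */
/* treated and control groups, before and after matching.            */
/* The plot choices are illustrative.                                */
proc psmatch data=School;
   class Music Gender;
   psmodel Music = Gender Absence;
   match method=optimal;
   assess lps var=(Gender Absence) / plots=(boxplot barchart);
   output out(obs=match)=OutMatch;
run;
```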

When you apply a propensity score–based adjustment method that is derived from a good propensity score model, you expect good balance between the treated and control groups for all pretreatment covariates. If you are not satisfied with the covariate balance, you need to refit the propensity score model by using a different modeling strategy and then reassess the balance.

Other main features of PROC PSMATCH include the following:

input of propensity scores that are estimated by methods outside PROC PSMATCH (for more information about inputting propensity scores, see the description of the PSDATA statement in Chapter 101, The PSMATCH Procedure)
various matching methods: greedy nearest-neighbor matching, optimal matching, and matching with replacement
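These two features can be combined, as the following sketch shows. It estimates propensity scores outside PROC PSMATCH by using PROC LOGISTIC, reads them in through the PSDATA statement, and requests greedy nearest-neighbor matching. Several details are assumptions made for illustration and are not from the text above: that the treated level of Music is coded as 'Yes', the names SchoolPS, MyPS, and OutGreedy, and the caliper value.

```sas
/* Step 1 (assumption: Music='Yes' is the treated level).            */
/* Estimate propensity scores outside PROC PSMATCH and save them in  */
/* the hypothetical variable MyPS.                                   */
proc logistic data=School;
   class Gender;
   model Music(event='Yes') = Gender Absence;
   output out=SchoolPS predicted=MyPS;
run;

/* Step 2: read the precomputed scores by using the PSDATA statement */
/* and request greedy nearest-neighbor one-to-one matching.          */
/* The caliper value is illustrative.                                */
proc psmatch data=SchoolPS;
   class Music;
   psdata treatvar=Music(Treated='Yes') ps=MyPS;
   match method=greedy(k=1) caliper=0.25;
   output out(obs=match)=OutGreedy;
run;
```

Because a PSDATA statement supplies the scores, no PSMODEL statement is specified in the second step.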

For more information about PROC PSMATCH, see Chapter 101, The PSMATCH Procedure, and Yuan, Yung, and Stokes (2017).

Last updated: December 09, 2022