The PSMATCH Procedure

PROC PSMATCH Statement

The PROC PSMATCH statement invokes the PSMATCH procedure. Table 1 summarizes the options available in the PROC PSMATCH statement.

Table 1: Summary of PROC PSMATCH Options

Option	Description
DATA=	Specifies the input data set
REGION=	Specifies the support region of observations for stratification and matching

DATA=SAS-data-set

names the input SAS data set. If the propensity scores are to be derived from this data set, you must also include a PSMODEL statement to specify the binary logistic model. Otherwise, a PSDATA statement is required to identify the variable that contains either the propensity scores or the logits of the propensity scores. If you do not specify this option, the procedure uses the most recently created SAS data set.

REGION=region <(region-options)>

specifies an interval region of propensity scores (or equivalently, logits of propensity scores) that determines which observations are used in stratification and matching. Only those observations whose propensity scores lie in the region are used in stratification and matching. This option also determines which observations are included in the output data set if you specify the OUT(OBS=REGION) option in the OUTPUT statement (even when you omit the STRATA and MATCH statements). By default, REGION=TREATED if you specify a MATCH statement, and REGION=ALLOBS otherwise.

When you perform entropy balancing, all observations are used and the REGION= option is ignored.

You can specify the following regions along with their region-options:

REGION=ALLOBS <(region-options)>

selects all available observations. You can specify the following region-options to select observations whose propensity scores lie in a specified range:

PSMIN=pmin: specifies the minimum propensity score in the support region, where pmin ≥ 0. Observations whose propensity scores are less than pmin are excluded from the support region. By default, PSMIN=0, so that observations that have small propensity scores are not excluded.
PSMAX=pmax: specifies the maximum propensity score in the support region, where pmax ≤ 1. Observations whose propensity scores are greater than pmax are excluded from the support region. By default, PSMAX=1, so that observations that have large propensity scores are not excluded.

You can also use the PSMIN= and PSMAX= options to exclude observations that have extreme propensity scores from the output data set.
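
As an illustration outside SAS, the PSMIN= and PSMAX= selection rule can be sketched in Python. The function name in_allobs_region and the sample scores are hypothetical, and the code only mirrors the rule described above; it is not the procedure's implementation.

```python
def in_allobs_region(ps, psmin=0.0, psmax=1.0):
    """Return True if a propensity score lies in the REGION=ALLOBS support region.

    Scores less than psmin or greater than psmax are excluded;
    scores equal to an endpoint are kept.
    """
    if psmin < 0.0 or psmax > 1.0:
        raise ValueError("expected psmin >= 0 and psmax <= 1")
    return psmin <= ps <= psmax


# Hypothetical propensity scores, for illustration only.
scores = [0.02, 0.10, 0.50, 0.93, 0.99]

# Analogue of REGION=ALLOBS(PSMIN=0.05 PSMAX=0.95).
kept = [ps for ps in scores if in_allobs_region(ps, psmin=0.05, psmax=0.95)]
print(kept)
```

With the default arguments (psmin=0, psmax=1), every valid propensity score is kept, which matches the default behavior of REGION=ALLOBS.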

REGION=CS <(ext-option)>

selects observations whose propensity scores (or equivalently, logits of propensity scores) lie in the region of common support for the treated and control groups. This region is the largest interval that contains propensity scores (or logits of propensity scores) for subjects in both groups. The lower endpoint of the region is the larger of the minimum propensity scores (or logits of propensity scores) for the two groups. The upper endpoint is the smaller of the maximum propensity scores (or logits of propensity scores) for the two groups.
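
The endpoint rule above can be sketched in Python as follows. The function name common_support and the sample scores are hypothetical, and keeping observations that fall exactly on an endpoint is an assumption of this sketch rather than documented behavior.

```python
def common_support(treated, control):
    """Return (lower, upper) for the region of common support.

    lower: the larger of the two group minimums
    upper: the smaller of the two group maximums
    """
    lower = max(min(treated), min(control))
    upper = min(max(treated), max(control))
    return lower, upper


# Hypothetical propensity scores, for illustration only.
treated_ps = [0.30, 0.45, 0.60, 0.85]
control_ps = [0.10, 0.25, 0.40, 0.70]

lower, upper = common_support(treated_ps, control_ps)

# Assumption: observations on an endpoint are kept in the region.
in_region = [ps for ps in treated_ps + control_ps if lower <= ps <= upper]
print(lower, upper, in_region)
```

Here the treated observation at 0.85 and the control observations at 0.10 and 0.25 fall outside the common support. Because the logit is a monotone transformation, the same observations are selected when the rule is applied to logits of propensity scores.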

You can specify the following ext-option:

EXTEND <(type-options)> = p <(LOWER=p UPPER=p)>

extends the lower and upper ends of the common support region for the support region by p, where p ≥ 0. By default, EXTEND=0.25.

You can use the following type-options to prescribe the extension requirement:

DISTANCE=LPS | PS

specifies the type of the distance that is used to extend the support region.

LPS: extends the region by using the logit of the propensity score.
PS: extends the region by using the propensity score.

By default, DISTANCE=LPS.

MULT=ONE | STDDEV

specifies the multiplier that is applied to the extension p when the support region is extended.

ONE: extends the region by p.
STDDEV: extends the region by p times the pooled estimate of the standard deviation of either LPS (DISTANCE=LPS) or PS (DISTANCE=PS), where this estimate is computed as the square root of the average of the variances in the treated and control groups.

By default, MULT=STDDEV.

The DISTANCE= and MULT= type-options prescribe the extension requirement as follows:

EXTEND(DISTANCE=PS MULT=ONE)=p extends the specified support region by p in propensity score. That is, if [L, U] denotes the propensity score interval region that is computed from the specified region, then the range of the extended support region is given by [L - p, U + p].
EXTEND(DISTANCE=PS MULT=STDDEV)=p extends the specified support region by p × s, where s is the square root of the average variance of the propensity score in the treated and control groups. That is, if [L, U] denotes the propensity score interval region that is computed from the specified region, then the range of the extended support region is given by [L - p × s, U + p × s].
EXTEND(DISTANCE=LPS MULT=ONE)=p extends the specified support region by p in the logit of the propensity score.
EXTEND(DISTANCE=LPS MULT=STDDEV)=p extends the specified support region by p × s, where s is the square root of the average variance of the logit of the propensity score in the treated and control groups.
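
The four combinations of DISTANCE= and MULT= can be sketched in Python as follows. All names here (logit, pooled_sd, extend_region) and the sample scores are hypothetical. The sketch assumes that the group variances are sample variances (n - 1 denominator) and that all scores lie strictly between 0 and 1 so that the logit is defined. It illustrates the arithmetic only and is not the procedure's implementation.

```python
import math
from statistics import variance  # sample variance (n - 1 denominator), an assumption


def logit(p):
    """Logit of a propensity score strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def pooled_sd(treated, control):
    """Square root of the average of the two group variances."""
    return math.sqrt((variance(treated) + variance(control)) / 2.0)


def extend_region(lower, upper, p, treated, control,
                  distance="LPS", mult="STDDEV"):
    """Extend a region whose endpoints are given as propensity scores.

    Returns the extended (lower, upper) on the scale named by `distance`:
    logits for "LPS", propensity scores for "PS".
    """
    if p < 0:
        raise ValueError("p must be nonnegative")
    if distance == "LPS":
        # Work on the logit scale for both the endpoints and the group values.
        lower, upper = logit(lower), logit(upper)
        treated = [logit(x) for x in treated]
        control = [logit(x) for x in control]
    elif distance != "PS":
        raise ValueError("distance must be 'LPS' or 'PS'")

    if mult == "STDDEV":
        width = p * pooled_sd(treated, control)
    elif mult == "ONE":
        width = p
    else:
        raise ValueError("mult must be 'ONE' or 'STDDEV'")
    return lower - width, upper + width


# Hypothetical propensity scores and region endpoints, for illustration only.
treated_ps = [0.30, 0.45, 0.60, 0.85]
control_ps = [0.10, 0.25, 0.40, 0.70]
region = (0.30, 0.70)

ps_one = extend_region(*region, 0.05, treated_ps, control_ps, "PS", "ONE")
ps_sd = extend_region(*region, 0.25, treated_ps, control_ps, "PS", "STDDEV")
lps_one = extend_region(*region, 0.25, treated_ps, control_ps, "LPS", "ONE")
lps_sd = extend_region(*region, 0.25, treated_ps, control_ps)  # defaults
print(ps_one, ps_sd, lps_one, lps_sd)
```

The default arguments of extend_region mirror the documented defaults DISTANCE=LPS and MULT=STDDEV, so the last call corresponds to EXTEND=0.25 with no type-options.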

You can specify one of the following two options to use an extension other than p:

LOWER=p: extends the lower end of the specified region by p, where p ≥ 0.
UPPER=p: extends the upper end of the specified region by p, where p ≥ 0.

REGION=TREATED <(ext-option)>

selects observations whose propensity scores lie in the region of propensity scores for observations in the treated group.

You can specify the following ext-option:

EXTEND <(type-options)> = p <(LOWER=p UPPER=p)>

extends the lower and upper ends of the range of treated observations for the support region by p, where p ≥ 0. By default, EXTEND=0.25.

You can use the type-options to prescribe the extension requirement, and these are identical to the type-options in the REGION=CS option. You can also specify the LOWER=p or UPPER=p suboption to use an extension other than p.

Last updated: December 09, 2022