In addition to performing propensity score analyses, the PSMATCH procedure also performs entropy balancing (Hainmueller 2012). Entropy balancing is a method of reweighting observations that does not require propensity scores and instead directly targets balancing the moments of covariates between the treated and control conditions. The procedure obtains the entropy balancing weights by solving a constrained optimization problem. The optimization problem is defined by a set of reference weights, constraints on the weighted moments for a set of covariates, and a measure of divergence.
To describe the entropy balancing weights and how they are computed, the following definitions are used: $n$ is the number of observations in the treatment group for which weights are computed; $q_i$ and $w_i$, $i = 1, \ldots, n$, are the reference weight and the entropy balance weight for observation $i$; $c_{ri}$ is the value of the $r$th constraint function for observation $i$; and $m_r$ is the target value for the $r$th constraint.
As described in Hainmueller (2012), the entropy balance optimization problem imposes the constraints $c_r$, $r = 1, \ldots, R$, on the weighted moments of the covariates. In particular, the PSMATCH procedure supports constraints on the first and possibly second moments of continuous variables. For categorical variables, the procedure uses the GLM parameterization and imposes balance constraints on the level proportions. To define the constraints for a continuous variable $x$, let $\bar{x}$ be an estimate of the variable mean, let $m_r$ denote the target value of the weighted central moment, and assume that the weights $w_i$ are nonnegative and sum to 1. A constraint on the first moment of $x$ is then of the form
$$\sum_{i=1}^{n} w_i (x_i - \bar{x}) = m_r$$
where $m_r$ is equal to 0. A constraint on the second moment of $x$ is of the form
$$\sum_{i=1}^{n} w_i (x_i - \bar{x})^2 = m_r$$
For a categorical predictor, let $z$ denote a column that is created for a level of the variable by using the GLM parameterization, and let $m_r$ denote the target proportion for that level. A constraint on the level proportion is of the form
$$\sum_{i=1}^{n} w_i z_i = m_r$$
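The three kinds of balance constraints described above can be illustrated with a minimal sketch in plain Python. This is not the procedure's internal code; the function names `moment_columns`, `level_column`, and `max_violation` and the toy data are hypothetical, chosen only to show how centered columns and level indicators turn into weighted-sum constraints.

```python
def moment_columns(x, center, moments=2):
    """Constraint columns for one continuous variable.

    The first column holds x_i - center; the optional second column
    holds (x_i - center)**2, matching a second central moment.
    """
    cols = [[xi - center for xi in x]]
    if moments >= 2:
        cols.append([(xi - center) ** 2 for xi in x])
    return cols


def level_column(values, level):
    """0/1 indicator column for one level of a categorical variable."""
    return [1.0 if v == level else 0.0 for v in values]


def max_violation(w, cols, targets):
    """Largest absolute gap between a weighted column sum and its target."""
    return max(
        abs(sum(wi * ci for wi, ci in zip(w, col)) - m)
        for col, m in zip(cols, targets)
    )


# Toy data (assumed for illustration): four units with uniform weights.
x = [1.0, 2.0, 3.0, 6.0]
group = ["a", "b", "a", "a"]
w = [0.25, 0.25, 0.25, 0.25]

cols = moment_columns(x, center=3.0) + [level_column(group, "a")]
targets = [0.0, 3.5, 0.75]  # first moment, second central moment, proportion of "a"
print(max_violation(w, cols, targets))
```

With these toy values the uniform weights already satisfy all three constraints, so the reported violation is zero; changing any target makes the violation positive.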
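The objective function that is used for the entropy balance optimization problem is
$$H(\mathbf{w}) = \sum_{i=1}^{n} w_i \log\left(\frac{w_i}{q_i}\right)$$
where $w_i$, $i = 1, \ldots, n$, are nonnegative weights that are additionally constrained to sum to 1. This objective function corresponds to the Kullback-Leibler divergence, or relative entropy, between the probability distributions that are defined by the reference weights $q_i$ and the weights $w_i$. The entropy balance weights for the treatment group in which the treatment indicator equals $t$ are determined by the optimal solution $\mathbf{w}^*$ to the optimization problem
$$\min_{\mathbf{w}} \; H(\mathbf{w}) \quad \text{subject to} \quad \sum_{i=1}^{n} w_i c_{ri} = m_r, \; r = 1, \ldots, R; \quad \sum_{i=1}^{n} w_i = 1; \quad w_i \ge 0, \; i = 1, \ldots, n$$
where $c_{ri}$ is the value of the $r$th constraint function for observation $i$ and $m_r$ is the corresponding target value.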
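As a small illustration of the divergence measure, the following sketch evaluates the relative entropy of a weight vector with respect to a set of reference weights. The function name `relative_entropy` is hypothetical, and the usual convention that a zero weight contributes zero to the sum is assumed.

```python
import math


def relative_entropy(w, q):
    """Kullback-Leibler divergence sum_i w_i * log(w_i / q_i).

    Both w and q are assumed to be nonnegative and to sum to 1.
    Terms with w_i == 0 contribute 0 by convention.
    """
    total = 0.0
    for wi, qi in zip(w, q):
        if wi > 0.0:
            total += wi * math.log(wi / qi)
    return total


q = [0.25, 0.25, 0.25, 0.25]          # uniform reference weights
print(relative_entropy(q, q))          # identical distributions
print(relative_entropy([0.5, 0.5, 0.0, 0.0], q))
```

The divergence is zero when the weights equal the reference weights and grows as the weights move away from them, which is why minimizing it keeps the balancing weights as close to the reference weights as the constraints allow.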
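As described in Hainmueller (2012), by introducing Lagrange multipliers for the constraints, you can obtain the entropy balance weights by solving a more tractable dual formulation of the problem. In particular, let $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_R)$ denote a vector of dimension $R$, and consider the unconstrained dual problem of
$$\min_{\boldsymbol{\lambda}} \; \log\left(\sum_{i=1}^{n} q_i \exp\left(-\sum_{r=1}^{R} \lambda_r c_{ri}\right)\right) + \sum_{r=1}^{R} \lambda_r m_r$$
where $q_i$ is the reference weight for observation $i$, $c_{ri}$ is the value of the $r$th constraint function for observation $i$, and $m_r$ is the target value for the $r$th constraint.
Note that the maximum absolute gradient element of the dual problem corresponds to the maximum absolute violation of the balance constraints for the primal problem.
Given the solution $\boldsymbol{\lambda}^*$ to the dual problem, you can obtain the entropy balance weights by evaluating the expression
$$w_i^* = \frac{q_i \exp\left(-\sum_{r=1}^{R} \lambda_r^* c_{ri}\right)}{\sum_{j=1}^{n} q_j \exp\left(-\sum_{r=1}^{R} \lambda_r^* c_{rj}\right)}$$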
You can rescale the weights to sum to any positive target value.
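The link between the dual solution, the weights, and the gradient can be sketched as follows. The sketch assumes the dual objective is written as a log-sum-exp of the reference-weighted exponential terms plus a linear term in the targets (the form given in Hainmueller 2012); the sign convention, the function names, and the layout `C[r][i]` for constraint column `r` at observation `i` are assumptions made for illustration.

```python
import math


def weights_from_dual(lam, q, C):
    """Normalized weights w_i proportional to q_i * exp(-sum_r lam[r] * C[r][i])."""
    n = len(q)
    expo = [-sum(lam[r] * C[r][i] for r in range(len(lam))) for i in range(n)]
    shift = max(expo)  # subtract the maximum exponent for numerical stability
    raw = [q[i] * math.exp(expo[i] - shift) for i in range(n)]
    total = sum(raw)
    return [v / total for v in raw]


def dual_gradient(lam, q, C, m):
    """Gradient of the dual: target minus weighted column sum, per constraint.

    Each element is therefore the (signed) violation of one balance constraint.
    """
    w = weights_from_dual(lam, q, C)
    return [
        m[r] - sum(wi * ci for wi, ci in zip(w, C[r]))
        for r in range(len(lam))
    ]


q = [1.0 / 3, 1.0 / 3, 1.0 / 3]
C = [[-1.0, 0.0, 1.0]]   # one centered covariate
m = [0.0]
print(weights_from_dual([0.0], q, C))   # multipliers of zero return the reference weights
print(dual_gradient([-1.0], q, C, m))   # a nonzero entry signals imbalance
```

Because each gradient element equals a constraint violation, a gradient-based convergence check on the dual doubles as a balance check on the primal weights.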
The PSMATCH procedure formulates and solves the entropy balance optimization problem as follows. When WEIGHT=CONTROL, one optimization problem is solved to obtain the entropy balance weights for the control units, and the values of the mean estimates $\bar{x}$ and the target moments $m_r$ are determined by the central moments of the treated units. In this case, the treated units are assigned either the reference weight values that are determined by the REFWEIGHT= option or a weight of 1 if no reference weights are specified. When WEIGHT=ALLOBS, two optimization problems are solved, one for the control units and a second one for the treated units, and the values of the mean estimates $\bar{x}$ and the target moments $m_r$ are determined by the central moments of all observations.
By default, the reference weights are from a uniform distribution where $q_i = 1/n$ for each of the $n$ observations in the treatment group. Alternatively, you can use the REFWEIGHT= option to specify a variable in the input data set that contains the reference weights. Note that the PSMATCH procedure internally rescales the input reference weights so that they sum to 1. When WEIGHT=CONTROL, the unscaled values are used as weights for the treated units in the balance diagnostics and OUT= data set. When you specify reference weights, you can use the WEIGHTMOMENTS= suboption to specify whether or not the target moments for the entropy balance constraints should be weighted by the reference weights. By default, WEIGHTMOMENTS=YES.
Balance constraints are imposed on the variables that you specify in the BALANCEVARS= option. By default, balance constraints are added for both the first and second moments of continuous variables. You can use the MOMENTS=1 suboption to add balance constraints only for the first moments of continuous variables. If PROC PSMATCH detects linearly dependent balance constraints, a warning message is printed to the log. The procedure attempts to solve the entropy balance problem for the linearly independent constraints, and it returns an error if the resulting weights do not also satisfy the linearly dependent constraints.
PROC PSMATCH attempts to solve the dual formulation of the entropy balancing problem by using a Newton-Raphson optimization that combines a line-search algorithm with ridging. The procedure evaluates an absolute gradient convergence criterion for the dual problem that you can specify by using the ABSGTOL= option. In addition to the absolute gradient convergence criterion, you can specify a second, weaker value for evaluating the balance constraints by using the BALTOL= option. If the entropy balance optimization problem fails to converge, the weights from the final iteration are accepted if the maximum absolute violation of the balance constraints is less than or equal to the balance constraint tolerance value. By default, the values of the ABSGTOL= and BALTOL= options are the same.
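The general shape of such an optimization can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the procedure's actual algorithm: it uses a fixed ridge added to the Hessian diagonal, a simple step-halving line search, and the log-sum-exp dual form from Hainmueller (2012). The function `entropy_balance` and its arguments `absgtol`, `baltol`, `ridge`, and `max_iter` are hypothetical names that only mirror the roles of the ABSGTOL= and BALTOL= options.

```python
import math


def _dual(lam, q, C, m):
    """Return the dual objective value and the implied normalized weights."""
    n, R = len(q), len(lam)
    expo = [-sum(lam[r] * C[r][i] for r in range(R)) for i in range(n)]
    shift = max(expo)  # stabilizes the exponentials
    raw = [q[i] * math.exp(expo[i] - shift) for i in range(n)]
    total = sum(raw)
    value = shift + math.log(total) + sum(l * t for l, t in zip(lam, m))
    return value, [v / total for v in raw]


def _solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    k = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * k
    for r in reversed(range(k)):
        x[r] = (M[r][k] - sum(M[r][j] * x[j] for j in range(r + 1, k))) / M[r][r]
    return x


def entropy_balance(q, C, m, absgtol=1e-8, baltol=1e-8, ridge=1e-10, max_iter=50):
    """Return (weights, multipliers, accepted) for constraint columns C[r][i]."""
    R = len(C)
    lam = [0.0] * R
    value, w = _dual(lam, q, C, m)
    for _ in range(max_iter):
        means = [sum(wi * ci for wi, ci in zip(w, C[r])) for r in range(R)]
        grad = [m[r] - means[r] for r in range(R)]
        if max(abs(g) for g in grad) <= absgtol:
            return w, lam, True  # gradient criterion met
        # Hessian: weighted covariance of the constraint columns, plus a ridge.
        H = [[sum(wi * C[r][i] * C[s][i] for i, wi in enumerate(w))
              - means[r] * means[s] + (ridge if r == s else 0.0)
              for s in range(R)] for r in range(R)]
        step = _solve(H, [-g for g in grad])
        t = 1.0
        while True:  # halve the step until the dual objective does not increase
            trial = [l + t * s for l, s in zip(lam, step)]
            new_value, new_w = _dual(trial, q, C, m)
            if new_value <= value + 1e-12 or t < 1e-10:
                break
            t *= 0.5
        lam, value, w = trial, new_value, new_w
    # No gradient convergence: accept only if balance holds within baltol.
    viol = max(abs(m[r] - sum(wi * ci for wi, ci in zip(w, C[r]))) for r in range(R))
    return w, lam, viol <= baltol


# Toy example (assumed data): reweight five control units to a target mean of 3.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
q = [0.2] * 5
w, lam, ok = entropy_balance(q, [[xi - 3.5 for xi in x]], [0.0])
print(ok, sum(wi * xi for wi, xi in zip(w, x)))
```

The final fallback mirrors the idea that a weaker balance tolerance can rescue a run that stops before the gradient criterion is met, since the quantity of practical interest is the constraint violation rather than the dual gradient itself.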
When a set of entropy balance weights is obtained for a treatment condition, the weights are rescaled to sum to the value of the TOTALWEIGHT= option. If you omit the TOTALWEIGHT= option and WEIGHT=CONTROL, the weights for the control units are rescaled to sum to the total frequency of the treated units. When WEIGHT=CONTROL, the weights for the treated units that are used in the balance assessments and OUT= data set are not rescaled to sum to the TOTALWEIGHT= value. If you omit the TOTALWEIGHT= option and WEIGHT=ALLOBS, the weights for the control units and the treated units are separately rescaled so that the total weight for each treatment condition equals the total frequency of the input observations.
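The default rescaling rules can be summarized in a short sketch. The helper names `default_total` and `rescale_to_total` are hypothetical, and the sketch assumes that every observation has a frequency of 1, so that total frequency is simply a count of units.

```python
def default_total(weight_mode, n_treated, n_control):
    """Default target total when no explicit total weight is given.

    Assumes unit frequencies, so totals are plain counts.
    """
    if weight_mode == "CONTROL":
        return n_treated                 # control weights sum to the treated count
    if weight_mode == "ALLOBS":
        return n_treated + n_control     # each group sums to the overall count
    raise ValueError("weight_mode must be 'CONTROL' or 'ALLOBS'")


def rescale_to_total(w, total):
    """Rescale nonnegative weights so that they sum to the given positive total."""
    s = sum(w)
    return [wi * total / s for wi in w]


control_w = [0.2, 0.3, 0.5]  # normalized entropy balance weights (toy values)
scaled = rescale_to_total(control_w, default_total("CONTROL", n_treated=40, n_control=3))
print(sum(scaled))
```

Rescaling multiplies every weight by the same constant, so it changes the total weight without affecting the weighted means or proportions that the balance constraints target.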