The CAUSALTRT Procedure

MODEL Statement

MODEL outcome <(variable-options)><= effects> </ model-options>;

For all the estimation methods that PROC CAUSALTRT implements, you must specify the outcome variable in the MODEL statement. For more information about the estimation methods implemented in PROC CAUSALTRT and the model specifications that are required, see the section Outline of Estimation Method Requirements.

In addition to specifying the outcome variable, you can also specify the following:

effects

specify one or more explanatory variables or a combination of variables that are used to fit regression models for the outcome variable within each treatment condition. Explanatory variables that represent nominal (classification) data must be declared in the CLASS statement. For more information about specifying effects, see the section Specification of Effects in Chapter 53, The GLM Procedure.

variable-options

specify one or more of the following options within parentheses for a binary outcome variable immediately after the variable:

DESCENDING
DESC

reverses the order of the outcome categories. If both the DESCENDING and ORDER= options are specified, PROC CAUSALTRT orders the outcome categories according to the ORDER= option and then reverses that order.

EVENT='category' | FIRST | LAST

specifies the event category for the binary outcome model. PROC CAUSALTRT models the probability of the event category. You can specify one of the following:

'category': specifies the value (formatted if a format is applied) of the event category in quotation marks.
FIRST: designates the first-ordered category as the event.
LAST: designates the last-ordered category as the event.

By default, EVENT=FIRST.

One of the most common sets of outcome levels is {0,1}, where 1 represents the event for which the probability is to be modeled. Consider the following example, where Y takes the values 1 and 0 for event and nonevent, respectively, and X is the explanatory variable. To specify the value 1 as the event category, use the following MODEL statement:

model Y(event='1') = X;

ORDER=DATA | FORMATTED | FREQ | INTERNAL

specifies the sort order for the levels of the outcome variable. The following table displays the available ORDER= options.

ORDER=	Levels Sorted By
DATA	Order of appearance in the input data set.
FORMATTED	External formatted value, except for numeric variables that have no explicit format, which are sorted by their unformatted (internal) value. The sort order is machine-dependent.
FREQ	Descending frequency count. Levels that have the most observations come first in the order.
INTERNAL	Unformatted value. The sort order is machine-dependent.

By default, ORDER=FORMATTED.

For more information about sort order, see the chapter on the SORT procedure in the Base SAS Procedures Guide and the discussion of BY-group processing in SAS Programmer's Guide: Essentials.
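
As a minimal sketch of how ORDER= and DESCENDING interact with the default EVENT=FIRST, suppose a hypothetical data set Trial contains a binary outcome Y coded 0 and 1, a binary treatment Trt coded 0 and 1, and a covariate X. All of these names are assumptions made for illustration. Sorting by internal value places 0 first; DESCENDING then reverses the order so that 1 becomes the first-ordered category and therefore the event:

```sas
/* Trial, Y, Trt, and X are hypothetical names used only for illustration */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   /* ORDER=INTERNAL sorts the levels as 0, 1; DESCENDING reverses them  */
   /* to 1, 0, so the default EVENT=FIRST models the probability of Y=1  */
   model Y(order=internal descending) = X / dist=bin;
run;
```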

REFERENCE='category' | FIRST | LAST
REF='category' | FIRST | LAST

specifies the reference category for the binary outcome model. Specifying one outcome category as the reference is the same as specifying the other outcome category as the event category. You can specify one of the following:

'category': specifies the value (formatted if a format is applied) of the reference category in quotation marks.
FIRST: designates the first-ordered category as the reference.
LAST: designates the last-ordered category as the reference.

By default, REF=LAST.
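
The equivalence between the EVENT= and REFERENCE= options can be sketched as follows. The data set Trial and the variables Y (coded 0 and 1), Trt (coded 0 and 1), and X are hypothetical. Because the outcome has only two categories, naming 0 as the reference in the second step should request the same outcome model as naming 1 as the event in the first:

```sas
/* Trial, Y, Trt, and X are hypothetical names used only for illustration */

/* Name the event category directly */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Y(event='1') = X / dist=bin;
run;

/* Name the other category as the reference instead */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Y(ref='0') = X / dist=bin;
run;
```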

model-options

specify additional options for the outcome model after a slash (/). These model-options are summarized in Table 4 and described in detail after the table.

Table 4: model-options in the MODEL Statement

model-option	Description
ALPHA=	Specifies the level for confidence limits
DIST=	Specifies the probability distribution
LINK=	Specifies the link function
NOSCALE	Holds the scale parameter fixed
SCALE=	Specifies the value used for the scale

ALPHA=number

specifies the level for the 100(1 – number)% confidence limits that the MODEL statement computes. The value of number must be between 0 and 1. By default, number is the value set by the ALPHA= option in the PROC CAUSALTRT statement.
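
For illustration, the following sketch requests 90% confidence limits for the outcome model. The data set Trial and the variables Score (a continuous outcome), Trt, and X are hypothetical, and the POUTCOMEMOD option in the PROC CAUSALTRT statement is included on the assumption that it displays the outcome model parameter estimates:

```sas
/* Trial, Score, Trt, and X are hypothetical names used only for illustration */
proc causaltrt data=Trial method=regadj poutcomemod;
   psmodel Trt(ref='0');
   /* ALPHA=0.1 corresponds to 100(1 - 0.1)% = 90% confidence limits */
   model Score = X / alpha=0.1;
run;
```

Because Score is not listed in a CLASS statement and neither DIST= nor LINK= is specified, this sketch relies on the default normal distribution with the identity link.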

DIST=keyword
DISTRIBUTION=keyword

specifies the built-in probability distribution to use in the model. If you specify the DIST= option and you omit the LINK= option, a default link function is chosen as displayed in Table 5. If you specify neither the DIST= option nor the LINK= option, then the CAUSALTRT procedure defaults to the binomial distribution with logit link if the outcome variable is listed in the CLASS statement. If the outcome variable is not listed in the CLASS statement, then the CAUSALTRT procedure defaults to the normal distribution with the identity link function.

Table 5: Distributions and Default Link Functions

DIST=	Distribution	Default Link Function
BIN | B	Binary	Logit
GAMMA | GAM | G	Gamma	Reciprocal
IGAUSSIAN | IG	Inverse Gaussian	Reciprocal square
NORMAL | NOR | N	Normal	Identity
POISSON | POI | P	Poisson	Log

Responses for the inverse Gaussian and gamma distributions must be strictly positive. For the Poisson distribution, responses must be nonnegative, but they can take noninteger values. Observations whose response values are outside of the distribution’s support are not used to estimate the causal effect, even if the estimation method does not fit an outcome model.
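
The following sketch shows the DIST= option with two different outcome types and the default links from Table 5. The data set Trial and the variables Visits (a nonnegative count), Cost (a strictly positive amount), Trt, and X are hypothetical names chosen for illustration:

```sas
/* Trial, Visits, Cost, Trt, and X are hypothetical names used only for illustration */

/* Count outcome: Poisson distribution, default log link */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Visits = X / dist=poisson;
run;

/* Strictly positive outcome: gamma distribution, default reciprocal link */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Cost = X / dist=gamma;
run;
```

In the second step, observations whose value of Cost is zero or negative fall outside the support of the gamma distribution and would not be used.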

LINK=keyword

specifies the link function in the model. You can specify the keywords shown in Table 6.

Table 6: Built-In Link Functions of the CAUSALTRT Procedure

LINK=	Link Function
CLOGLOG | CLL	Complementary log-log
IDENTITY | ID	Identity
INVERSE | RECIPROCAL	Reciprocal
LOG	Log
LOGIT	Logit
POWERMINUS2	Power with exponent –2
PROBIT	Probit
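
For reference, the link functions named in Table 6 are conventionally defined as follows, where \mu is the mean of the outcome and \Phi^{-1} is the quantile function of the standard normal distribution. These are the standard textbook forms and are given here on the assumption that the keywords follow them:

```latex
\begin{align*}
\text{CLOGLOG:}     \quad & g(\mu) = \log\bigl(-\log(1-\mu)\bigr) \\
\text{IDENTITY:}    \quad & g(\mu) = \mu \\
\text{INVERSE:}     \quad & g(\mu) = 1/\mu \\
\text{LOG:}         \quad & g(\mu) = \log(\mu) \\
\text{LOGIT:}       \quad & g(\mu) = \log\bigl(\mu/(1-\mu)\bigr) \\
\text{POWERMINUS2:} \quad & g(\mu) = 1/\mu^{2} \\
\text{PROBIT:}      \quad & g(\mu) = \Phi^{-1}(\mu)
\end{align*}
```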

The probit link is the quantile function of the standard normal distribution, commonly written Φ⁻¹. By default, the link function is chosen as shown in Table 5.
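
As a sketch of overriding the default link, the following steps pair the gamma distribution with a log link and the binary distribution with a probit link. The data set Trial and the variables Cost, Y, Trt, and X are hypothetical:

```sas
/* Trial, Cost, Y, Trt, and X are hypothetical names used only for illustration */

/* Gamma outcome with a log link instead of the default reciprocal link */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Cost = X / dist=gamma link=log;
run;

/* Binary outcome with a probit link instead of the default logit link */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Y(event='1') = X / dist=bin link=probit;
run;
```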

NOSCALE

holds the scale parameter fixed. If you omit the SCALE= option, the scale parameter is fixed at the value 1. If you do not specify the NOSCALE option, the scale parameter is estimated by maximum likelihood for the normal, inverse Gaussian, and gamma distributions.

SCALE=number

specifies the value used for the scale parameter when the NOSCALE option is specified. For the binomial and Poisson distributions, which have no free scale parameter, you can use this option to specify an overdispersed model. If the NOSCALE option is not specified, then number is used as an initial estimate of the scale parameter.
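
The interaction between the NOSCALE and SCALE= options can be sketched as follows. The data set Trial, the variables Score, Trt, and X, and the value 2 are hypothetical choices for illustration:

```sas
/* Trial, Score, Trt, and X are hypothetical names used only for illustration */

/* NOSCALE with SCALE=2: the scale parameter is held fixed at 2 */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Score = X / dist=normal noscale scale=2;
run;

/* SCALE=2 without NOSCALE: 2 is only the initial estimate, and the */
/* scale parameter is then estimated by maximum likelihood          */
proc causaltrt data=Trial method=regadj;
   psmodel Trt(ref='0');
   model Score = X / dist=normal scale=2;
run;
```

Specifying NOSCALE without SCALE= would instead fix the scale parameter at the value 1.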

Last updated: December 09, 2022