The CATMOD Procedure

RESPONSE Statement

  • RESPONSE <function> </ options>;

The RESPONSE statement specifies functions of the response probabilities. The procedure models these response functions as linear combinations of the parameters.

By default, PROC CATMOD uses the standard response functions (generalized logits, which are explained in detail in the section Understanding the Standard Response Functions). With these standard response functions, the default estimation method is maximum likelihood, but you can use the WLS option in the MODEL statement to request weighted least squares estimation. With other response functions (specified in the RESPONSE statement), the default (and only) estimation method is weighted least squares.

You can specify more than one RESPONSE statement, in which case each RESPONSE statement produces a separate analysis. If the computed response functions for any population are linearly dependent (yielding a singular covariance matrix), then PROC CATMOD displays an error message and stops processing. See the section Cautions for methods of dealing with this.

The function specification can be any of the items in the following list. For an example of response functions generated and formulas for q (the number of response functions), see the section More on Response Functions.

Table 6 summarizes the options available in the RESPONSE statement.

Table 6: RESPONSE Statement Options

Option     Description
ALOGIT     Specifies response functions as adjacent-category logits
CLOGIT     Specifies that the response functions are cumulative logits
JOINT      Specifies that the response functions are the joint response probabilities
LOGIT      Specifies that the response functions are generalized logits
MARGINAL   Specifies that the response functions are marginal probabilities
MEAN       Specifies that the response functions are the means of the dependent variables
READ       Directly reads the response functions and their covariance matrix from the input data set
OUT=       Produces a SAS data set that contains predicted values, standard errors, and residuals
OUTEST=    Produces a SAS data set that contains the estimated parameter vector and its estimated covariance matrix
TITLE=     Displays the title


ALOGIT
ALOGITS

specifies response functions as adjacent-category logits of the marginal probabilities for each of the dependent variables. For each dependent variable, the response functions are a set of linearly independent adjacent-category logits, obtained by taking the logarithms of the ratios of two probabilities. The denominator of the kth ratio is the marginal probability corresponding to the kth level of the variable, and the numerator is the marginal probability corresponding to the (k + 1)th level. If a dependent variable has two levels, then the adjacent-category logit is the negative of the generalized logit.

CLOGIT
CLOGITS

specifies that the response functions are cumulative logits of the marginal probabilities for each of the dependent variables. For each dependent variable, the response functions are a set of linearly independent cumulative logits, obtained by taking the logarithms of the ratios of two probabilities. The denominator of the kth ratio is the cumulative probability, $c_k$, corresponding to the kth level of the variable, and the numerator is $1 - c_k$ (Agresti 1984, 113–114). If a dependent variable has two levels, then PROC CATMOD computes its cumulative logit as the negative of its generalized logit. You should use cumulative logits only when the dependent variables are ordinally scaled.

JOINT

specifies that the response functions are the joint response probabilities. A linearly independent set is created by deleting the last response probability. For the case of one dependent variable, the JOINT and MARGINALS specifications are equivalent.

LOGIT
LOGITS

specifies that the response functions are generalized logits of the marginal probabilities for each of the dependent variables. For each dependent variable, the response functions are a set of linearly independent generalized logits, obtained by taking the logarithms of the ratios of two probabilities. The denominator of each ratio is the marginal probability corresponding to the last observed level of the variable, and the numerators are the marginal probabilities corresponding to each of the other levels. If there is one dependent variable, then specifying LOGIT is equivalent to using the standard response functions.

MARGINAL
MARGINALS

specifies that the response functions are marginal probabilities for each of the dependent variables in the MODEL statement. For each dependent variable, the response functions are a set of linearly independent marginals, obtained by deleting the marginal probability corresponding to the last level.

MEAN
MEANS

specifies that the response functions are the means of the dependent variables in the MODEL statement. This specification requires that all of the dependent variables be numeric.

READ variables

specifies that the response functions and their covariance matrix are to be read directly from the input data set with one response function for each variable named. See the section Inputting Response Functions and Covariances Directly for more information.

transformation

specifies response functions that can be expressed by using successive applications of the four operations: LOG, EXP, * matrix-literal, or + matrix-literal. The operations are described in detail in the section Using a Transformation to Specify Response Functions.

You can specify the following options in the RESPONSE statement after a slash.

OUT=SAS-data-set

produces a SAS data set that contains, for each population, the observed and predicted values of the response functions, their standard errors, and the residuals. Moreover, if you use the standard response functions, the data set also includes observed and predicted values of the cell frequencies or the cell probabilities. For further information, see the section Output Data Sets.

OUTEST=SAS-data-set

produces a SAS data set that contains the estimated parameter vector and its estimated covariance matrix. For further information, see the section Output Data Sets.

TITLE='title'

displays the title at the top of certain pages of output that correspond to this RESPONSE statement.

More on Response Functions

Suppose the dependent variable A has three levels and is the only response-effect in the MODEL statement. The following table shows the proportions upon which the response functions are defined:

Value of A:    1       2       3
Proportions:   $p_1$   $p_2$   $p_3$

Note that $\sum_j p_j = 1$. The following table shows the response functions generated for each population:

Function Specification   Value of q   Response Function
none*                    2            $\ln\left(\frac{p_1}{p_3}\right),\ \ln\left(\frac{p_2}{p_3}\right)$
ALOGITS                  2            $\ln\left(\frac{p_2}{p_1}\right),\ \ln\left(\frac{p_3}{p_2}\right)$
CLOGITS                  2            $\ln\left(\frac{1 - p_1}{p_1}\right),\ \ln\left(\frac{1 - (p_1 + p_2)}{p_1 + p_2}\right)$
JOINT                    2            $p_1,\ p_2$
LOGITS                   2            $\ln\left(\frac{p_1}{p_3}\right),\ \ln\left(\frac{p_2}{p_3}\right)$
MARGINAL                 2            $p_1,\ p_2$
MEAN                     1            $1p_1 + 2p_2 + 3p_3$
* Without a function specification, the default response functions are generalized logits.
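
As a numerical illustration of the preceding table, the following Python sketch (not SAS code) evaluates each row for one population. The proportions 0.2, 0.3, and 0.5 are made-up values chosen only for this example.

```python
import math

# Hypothetical sample proportions for the three levels of A (they sum to 1).
p1, p2, p3 = 0.2, 0.3, 0.5

response_functions = {
    # Generalized logits: each level is contrasted with the last level.
    "LOGITS": [math.log(p1 / p3), math.log(p2 / p3)],
    # Adjacent-category logits: level k+1 over level k.
    "ALOGITS": [math.log(p2 / p1), math.log(p3 / p2)],
    # Cumulative logits: (1 - c_k) / c_k, where c_k is the cumulative probability.
    "CLOGITS": [math.log((1 - p1) / p1),
                math.log((1 - (p1 + p2)) / (p1 + p2))],
    # With one dependent variable, JOINT and MARGINAL give the same functions.
    "JOINT": [p1, p2],
    "MARGINAL": [p1, p2],
    # Mean score, using the level values 1, 2, 3 as scores.
    "MEAN": [1 * p1 + 2 * p2 + 3 * p3],
}

for name, values in response_functions.items():
    print(name, [round(v, 4) for v in values])
```

Every specification except MEAN yields two values here, which matches the q column of the table.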

Now, suppose the dependent variables A and B each have three levels (valued 1, 2, and 3 each) and the response-effect in the MODEL statement is A*B. The following table shows the proportions upon which the response functions are defined:

Value of A:    1       1       1       2       2       2       3       3       3
Value of B:    1       2       3       1       2       3       1       2       3
Proportions:   $p_1$   $p_2$   $p_3$   $p_4$   $p_5$   $p_6$   $p_7$   $p_8$   $p_9$

The marginal totals for the preceding table are defined as follows:

$$
\begin{aligned}
p_{1\cdot} &= p_1 + p_2 + p_3 &\qquad p_{\cdot 1} &= p_1 + p_4 + p_7 \\
p_{2\cdot} &= p_4 + p_5 + p_6 &\qquad p_{\cdot 2} &= p_2 + p_5 + p_8 \\
p_{3\cdot} &= p_7 + p_8 + p_9 &\qquad p_{\cdot 3} &= p_3 + p_6 + p_9
\end{aligned}
$$

where $\sum_j p_j = 1$. The following table shows the response functions generated for each population:

Function Specification   Value of q   Response Function
none*                    8            $\ln\left(\frac{p_1}{p_9}\right),\ \ln\left(\frac{p_2}{p_9}\right),\ \ln\left(\frac{p_3}{p_9}\right),\ \ldots,\ \ln\left(\frac{p_8}{p_9}\right)$
ALOGITS                  4            $\ln\left(\frac{p_{2\cdot}}{p_{1\cdot}}\right),\ \ln\left(\frac{p_{3\cdot}}{p_{2\cdot}}\right),\ \ln\left(\frac{p_{\cdot 2}}{p_{\cdot 1}}\right),\ \ln\left(\frac{p_{\cdot 3}}{p_{\cdot 2}}\right)$
CLOGITS                  4            $\ln\left(\frac{1 - p_{1\cdot}}{p_{1\cdot}}\right),\ \ln\left(\frac{1 - (p_{1\cdot} + p_{2\cdot})}{p_{1\cdot} + p_{2\cdot}}\right),\ \ln\left(\frac{1 - p_{\cdot 1}}{p_{\cdot 1}}\right),\ \ln\left(\frac{1 - (p_{\cdot 1} + p_{\cdot 2})}{p_{\cdot 1} + p_{\cdot 2}}\right)$
JOINT                    8            $p_1,\ p_2,\ p_3,\ p_4,\ p_5,\ p_6,\ p_7,\ p_8$
LOGITS                   4            $\ln\left(\frac{p_{1\cdot}}{p_{3\cdot}}\right),\ \ln\left(\frac{p_{2\cdot}}{p_{3\cdot}}\right),\ \ln\left(\frac{p_{\cdot 1}}{p_{\cdot 3}}\right),\ \ln\left(\frac{p_{\cdot 2}}{p_{\cdot 3}}\right)$
MARGINAL                 4            $p_{1\cdot},\ p_{2\cdot},\ p_{\cdot 1},\ p_{\cdot 2}$
MEAN                     2            $1p_{1\cdot} + 2p_{2\cdot} + 3p_{3\cdot},\ 1p_{\cdot 1} + 2p_{\cdot 2} + 3p_{\cdot 3}$
* Without a function specification, the default response functions are generalized logits.
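
To make the role of the marginal totals concrete, the following Python sketch (not SAS code) computes the row and column marginals from nine cell proportions and then forms the MARGINAL, LOGITS, MEAN, and JOINT response functions. The cell proportions are invented for illustration, and the cell ordering (B varying fastest within A) is taken from the table of proportions above.

```python
import math

# Hypothetical cell proportions p_1..p_9 in the order
# (A=1,B=1), (A=1,B=2), (A=1,B=3), (A=2,B=1), ..., (A=3,B=3); they sum to 1.
p = [0.10, 0.05, 0.05,
     0.10, 0.20, 0.10,
     0.05, 0.15, 0.20]

# Marginals for A (p_{i.}) sum over B; marginals for B (p_{.j}) sum over A.
row = [sum(p[3 * i: 3 * i + 3]) for i in range(3)]
col = [sum(p[j::3]) for j in range(3)]

# MARGINAL: drop the last level of each variable (q = 4).
marginal = row[:2] + col[:2]

# LOGITS: generalized logits of the marginals, last level as denominator (q = 4).
logits = [math.log(row[0] / row[2]), math.log(row[1] / row[2]),
          math.log(col[0] / col[2]), math.log(col[1] / col[2])]

# MEAN: one mean score per dependent variable, with scores 1, 2, 3 (q = 2).
mean = [sum((k + 1) * row[k] for k in range(3)),
        sum((k + 1) * col[k] for k in range(3))]

# JOINT: all cell proportions except the last (q = 8).
joint = p[:8]

print(marginal, mean)
```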

The READ and transformation function specifications are not shown in the preceding table. For these two situations, there is not a general response function; the response functions that are generated depend on what you specify.

Another important aspect of the function specification is the number of response functions generated per population, $q$. Let $m_i$ represent the number of levels for the $i$th dependent variable in the MODEL statement, and let $d$ represent the number of dependent variables in the MODEL statement. Then, if the function specification is ALOGITS, CLOGITS, LOGITS, or MARGINALS, the number of response functions is

$$q = \sum_{i=1}^{d} (m_i - 1)$$

If the function specification is JOINT or the default (generalized logits), the number of response functions per population is

$$q = r - 1$$

where $r$ is the number of response profiles. If every possible cross-classification of the dependent variables is observed in the samples, then

$$r = \prod_{i=1}^{d} m_i$$

Otherwise, $r$ is the number of cross-classifications actually observed.

If the function specification is MEANS, the number of response functions per population is $q = d$.
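
The counting rules above can be collected into a small helper. This is a Python sketch for checking values of q by hand, not part of PROC CATMOD; the function name, the "DEFAULT" label for the standard response functions, and the `observed_profiles` argument are assumptions made for this example.

```python
from math import prod

def num_response_functions(spec, levels, observed_profiles=None):
    """Return q, the number of response functions per population.

    spec: function specification name (hypothetical labels, see below).
    levels: list of m_i, the number of levels of each dependent variable.
    observed_profiles: r, if fewer than prod(m_i) profiles were observed.
    """
    spec = spec.upper()
    if spec in {"ALOGITS", "CLOGITS", "LOGITS", "MARGINALS"}:
        # q = sum over dependent variables of (m_i - 1)
        return sum(m - 1 for m in levels)
    if spec in {"JOINT", "DEFAULT"}:
        # q = r - 1, with r = prod(m_i) when every profile is observed
        r = prod(levels) if observed_profiles is None else observed_profiles
        return r - 1
    if spec == "MEANS":
        # q = d, the number of dependent variables
        return len(levels)
    raise ValueError(f"no general formula for specification {spec!r}")

# Two dependent variables with three levels each, as in the A*B example.
print(num_response_functions("LOGITS", [3, 3]))   # 4
print(num_response_functions("JOINT", [3, 3]))    # 8
print(num_response_functions("MEANS", [3, 3]))    # 2
```

READ and transformation specifications are deliberately rejected, because their number of response functions depends on what you specify.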

Response Statement Examples

Some example response statements are shown in the following table:

Example                  Result
response marginals;      Marginals for each dependent variable
response means;          The mean of each dependent variable
response logits;         Generalized logits of the marginal probabilities
response clogits;        Cumulative logits of the marginal probabilities
response alogits;        Adjacent-category logits of the marginal probabilities
response joint;          The joint probabilities
response 1 -1 log;       The logit
response;                Generalized logits
response 1 2 3;          The mean score, with scores of 1, 2, and 3 corresponding to the three response levels
response read b1-b4;     Four response functions and their covariance matrix, read directly from the input data set

Using a Transformation to Specify Response Functions

If you specify a transformation, it is applied to the vector that contains the sample proportions in each population. The transformation can be any combination of the following four operations:

Operation            Specification
Linear combination   * matrix-literal
Linear combination   matrix-literal
Logarithm            LOG
Exponential          EXP
Adding constant      + matrix-literal

If more than one operation is specified, then PROC CATMOD applies the operations consecutively from right to left.

A matrix literal is a matrix of numbers with each row of the matrix separated from the next by a comma. If you specify a linear combination, in most cases the asterisk (*) is not needed. The following statement defines the response function $p_1 + 1$. The asterisk is needed to separate the two matrix literals '1' and '1 0'.

response + 1 * 1 0;

The LOG of a vector transforms each element of the vector into its natural logarithm; the EXP of a vector transforms each element into its exponential function (antilogarithm).

In order to specify a linear response function for data that have r = 3 response categories, you can specify either of the following RESPONSE statements:

response  * 1 0 0 , 0 1 0;
response    1 0 0 , 0 1 0;

The matrix literal in the preceding statements specifies a $2 \times 3$ matrix, which is applied to each population as follows:

$$
\begin{bmatrix} F_1 \\ F_2 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
*
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}
$$

where $p_1$, $p_2$, and $p_3$ are sample proportions for the three response categories in a population, and $F_1$ and $F_2$ are the two response functions computed for that population. Therefore, this response function sets $F_1 = p_1$ and $F_2 = p_2$ in each population.

As another example of the linear response function, suppose you have two dependent variables corresponding to two observers who evaluate the same subjects. If the observers grade on the same three-point scale and if all nine possible responses are observed, then the following RESPONSE statement would compute the probability that the observers agree on their assessments:

response 1 0 0 0 1 0 0 0 1;

This response function is then computed as

$$
F = p_{11} + p_{22} + p_{33}
=
\begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \end{bmatrix}
*
\begin{bmatrix} p_{11} \\ p_{12} \\ p_{13} \\ p_{21} \\ p_{22} \\ p_{23} \\ p_{31} \\ p_{32} \\ p_{33} \end{bmatrix}
$$

where $p_{ij}$ denotes the probability that a subject gets a grade of $i$ from the first observer and $j$ from the second observer.
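
The agreement probability is an ordinary row-times-vector product, which the following Python sketch (not SAS code) reproduces for a single population. The nine proportions are invented for illustration.

```python
# Hypothetical joint proportions p_ij for two observers on a three-point
# scale, in the order p11, p12, p13, p21, ..., p33 (they sum to 1).
p = [0.30, 0.05, 0.02,
     0.04, 0.25, 0.06,
     0.01, 0.07, 0.20]

# The 1 x 9 matrix literal from "response 1 0 0 0 1 0 0 0 1;".
weights = [1, 0, 0, 0, 1, 0, 0, 0, 1]

# Linear combination: the single row multiplied by the vector of proportions.
agreement = sum(w * p_ij for w, p_ij in zip(weights, p))

print(round(agreement, 4))  # equals p11 + p22 + p33
```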

If the function is a compound function, requiring more than one operation to specify it, then the operations should be listed so that the first operation to be applied is on the right and the last operation to be applied is on the left. For example, if there are two response levels, you can have the following response function:

response 1 -1 log;

This is equivalent to the matrix expression

$$
F = \begin{bmatrix} 1 & -1 \end{bmatrix}
*
\begin{bmatrix} \log(p_1) \\ \log(p_2) \end{bmatrix}
= \log(p_1) - \log(p_2)
= \log\left(\frac{p_1}{p_2}\right)
$$

which is the logit response function since $p_2 = 1 - p_1$ when there are only two response levels.

The following statement specifies another example of a compound response function:

response exp 1 -1 * 1 0 0 1, 0 1 1 0 log;

This is equivalent to the matrix expression

$$F = \mathbf{EXP}(\mathbf{A} * \mathbf{B} * \mathbf{LOG}(\mathbf{P}))$$

where $\mathbf{P}$ is the vector of sample proportions for some population,

$$
\mathbf{A} = \begin{bmatrix} 1 & -1 \end{bmatrix}
\quad \text{and} \quad
\mathbf{B} = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{bmatrix}
$$

If the four responses are based on two dependent variables, each with two levels, then the function can also be written as

$$F = \frac{p_{11}\, p_{22}}{p_{12}\, p_{21}}$$

which is the odds (crossproduct) ratio for a $2 \times 2$ table.
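
The right-to-left rule for compound functions can be checked numerically. The following Python sketch (not SAS code) applies a list of operations starting from the rightmost one and confirms that the compound function above reproduces the odds ratio. The helper names `matvec` and `apply_operations` and the proportions in `P` are assumptions made for this example.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

def apply_operations(operations, proportions):
    """Apply operations listed left to right, starting from the rightmost."""
    result = list(proportions)
    for op in reversed(operations):
        if op == "log":
            result = [math.log(x) for x in result]
        elif op == "exp":
            result = [math.exp(x) for x in result]
        else:
            # Anything else is treated as a matrix literal (linear combination).
            result = matvec(op, result)
    return result

# Hypothetical 2 x 2 table in the order p11, p12, p21, p22 (sums to 1).
P = [0.4, 0.1, 0.2, 0.3]
A = [[1, -1]]
B = [[1, 0, 0, 1],
     [0, 1, 1, 0]]

# Mirrors "response exp 1 -1 * 1 0 0 1, 0 1 1 0 log;".
F = apply_operations(["exp", A, B, "log"], P)
odds_ratio = (P[0] * P[3]) / (P[1] * P[2])

print(F[0], odds_ratio)
```

The same helper covers the two-level logit: `apply_operations([[[1, -1]], "log"], [p1, p2])` returns the single value log(p1/p2).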

Understanding the Standard Response Functions

If no RESPONSE statement is specified, PROC CATMOD computes the standard response functions, which contrast the log of each response probability with the log of the probability for the last response category. If there are $r$ response categories, then there are $r - 1$ standard response functions. For example, if there are four response categories, using no RESPONSE statement is equivalent to specifying the following:

response  1 0 0 -1,
          0 1 0 -1,
          0 0 1 -1  log;

This results in three response functions:

$$
F = \begin{bmatrix} F_1 \\ F_2 \\ F_3 \end{bmatrix}
= \begin{bmatrix} \log(p_1 / p_4) \\ \log(p_2 / p_4) \\ \log(p_3 / p_4) \end{bmatrix}
$$

If there are only two response levels, the resulting response function would be a logit, which is why the standard response functions are called generalized logits. They are useful in dealing with the log-linear model:

$$\boldsymbol{\pi} = \mathbf{EXP}(\mathbf{X}\boldsymbol{\beta})$$

If $\mathbf{C}$ denotes the matrix in the preceding RESPONSE statement, then because of the restriction that the probabilities sum to 1, it follows that an equivalent model is

$$\mathbf{C} * \mathbf{LOG}(\boldsymbol{\pi}) = (\mathbf{C}\mathbf{X})\boldsymbol{\beta}$$

But $\mathbf{C} * \mathbf{LOG}(\mathbf{P})$ is simply the vector of standard response functions. Thus, fitting a log-linear model on the cell probabilities is equivalent to fitting a linear model on the generalized logits.
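
The equivalence can be verified numerically. The following Python sketch (not SAS code) builds cell probabilities from a log-linear model and checks that applying the contrast matrix to their logarithms gives the same vector as the product of the contrast matrix, the design matrix, and the parameters. The design matrix `X`, the parameter vector `beta`, and the explicit division by the total (used here to impose the sum-to-1 restriction) are assumptions made for this example.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

# Contrast matrix C from the four-category RESPONSE statement.
C = [[1, 0, 0, -1],
     [0, 1, 0, -1],
     [0, 0, 1, -1]]

# Hypothetical 4 x 2 design matrix and parameter vector.
X = [[1, 0],
     [0, 1],
     [1, 1],
     [0, 0]]
beta = [0.5, -0.25]

# Log-linear cell probabilities, normalized so that they sum to 1.
unnormalized = [math.exp(eta) for eta in matvec(X, beta)]
total = sum(unnormalized)
pi = [u / total for u in unnormalized]

# Left side: C * LOG(pi). Right side: (C X) beta.
lhs = matvec(C, [math.log(x) for x in pi])
CX = [[sum(C[i][k] * X[k][j] for k in range(4)) for j in range(2)]
      for i in range(3)]
rhs = matvec(CX, beta)

print(lhs)
print(rhs)
```

Each row of C sums to zero, so the normalizing constant cancels in every contrast; this is why the two sides agree.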

Last updated: December 09, 2022