The LOGISTIC Procedure

Score Statistics and Tests

To understand the general form of the score statistics, let $\mathbf{g}(\boldsymbol{\beta})$ be the vector of first partial derivatives of the log likelihood with respect to the parameter vector $\boldsymbol{\beta}$, and let $\mathbf{H}(\boldsymbol{\beta})$ be the matrix of second partial derivatives of the log likelihood with respect to $\boldsymbol{\beta}$. That is, $\mathbf{g}(\boldsymbol{\beta})$ is the gradient vector, and $\mathbf{H}(\boldsymbol{\beta})$ is the Hessian matrix. Let $\mathbf{I}(\boldsymbol{\beta})$ be either $-\mathbf{H}(\boldsymbol{\beta})$ or the expected value of $-\mathbf{H}(\boldsymbol{\beta})$. Consider a null hypothesis $H_0$. Let $\hat{\boldsymbol{\beta}}_{H_0}$ be the MLE of $\boldsymbol{\beta}$ under $H_0$. The chi-square score statistic for testing $H_0$ is defined by

$$\mathbf{g}'(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{I}^{-1}(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{g}(\hat{\boldsymbol{\beta}}_{H_0})$$

and it has an asymptotic $\chi^2$ distribution with $r$ degrees of freedom under $H_0$, where $r$ is the number of restrictions imposed on $\boldsymbol{\beta}$ by $H_0$.
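As an illustration only, the quadratic form above can be evaluated directly for a binary logit model. The sketch below is not PROC LOGISTIC's implementation; the function name `score_statistic` and the toy data are hypothetical, and it assumes a binary response with the logit link, for which the observed and expected information coincide. The null hypothesis is that the single slope is zero, so the restricted MLE is the intercept-only fit, which has a closed form.

```python
import numpy as np

def score_statistic(X, y, beta_h0):
    """Return g' I^{-1} g for a binary logit model evaluated at beta_h0."""
    p = 1.0 / (1.0 + np.exp(-X @ beta_h0))        # fitted probabilities under H0
    g = X.T @ (y - p)                             # gradient of the log likelihood
    info = X.T @ (X * (p * (1.0 - p))[:, None])   # information matrix (minus Hessian)
    return float(g @ np.linalg.solve(info, g))

# Hypothetical toy data: one binary explanatory variable.
x = np.array([0., 0., 0., 0., 1., 1., 1., 1., 1., 1.])
y = np.array([0., 0., 0., 1., 0., 1., 1., 1., 1., 1.])
X = np.column_stack([np.ones_like(x), x])

# Under H0 (slope = 0) the MLE of the intercept is the logit of the mean response.
ybar = y.mean()
beta_h0 = np.array([np.log(ybar / (1.0 - ybar)), 0.0])

stat = score_statistic(X, y, beta_h0)
df = 1  # one restriction: the slope is set to zero
print(stat, df)
```

Note that only the model under $H_0$ has to be fit; the full model enters solely through the gradient and information evaluated at the restricted estimate.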

Score statistics are used in forward, stepwise, and score selection; for more information, see the section "Effect-Selection Methods."

Residual Chi-Square

When you use SELECTION=FORWARD, BACKWARD, or STEPWISE, the procedure calculates a residual chi-square score statistic and reports the statistic, its degrees of freedom, and the p-value. This section describes how the statistic is calculated.

Suppose there are $s$ explanatory effects of interest. The full cumulative response model has a parameter vector

$$\boldsymbol{\beta} = (\alpha_1, \ldots, \alpha_k, \beta_1, \ldots, \beta_s)'$$

where $\alpha_1, \ldots, \alpha_k$ are intercept parameters, and $\beta_1, \ldots, \beta_s$ are the common slope parameters for the $s$ explanatory effects. The full generalized logit model has a parameter vector

$$\begin{aligned}
\boldsymbol{\beta} &= (\alpha_1, \ldots, \alpha_k, \boldsymbol{\beta}_1', \ldots, \boldsymbol{\beta}_k')' \quad \text{with} \\
\boldsymbol{\beta}_i' &= (\beta_{i1}, \ldots, \beta_{is}), \quad i = 1, \ldots, k
\end{aligned}$$

where $\beta_{ij}$ is the slope parameter for the $j$th effect in the $i$th logit.

Consider the null hypothesis $H_0\colon \beta_{t+1} = \cdots = \beta_s = 0$, where $t < s$ for the cumulative response model, and $H_0\colon \beta_{i,t+1} = \cdots = \beta_{is} = 0$, $t < s$, $i = 1, \ldots, k$, for the generalized logit model. For the reduced model with $t$ explanatory effects, let $\hat{\alpha}_1, \ldots, \hat{\alpha}_k$ be the MLEs of the unknown intercept parameters, let $\hat{\beta}_1, \ldots, \hat{\beta}_t$ be the MLEs of the unknown slope parameters, and let $\hat{\boldsymbol{\beta}}_{i(t)}' = (\hat{\beta}_{i1}, \ldots, \hat{\beta}_{it})$, $i = 1, \ldots, k$, be those for the generalized logit model. The residual chi-square is the chi-square score statistic testing the null hypothesis $H_0$; that is, the residual chi-square is

$$\mathbf{g}'(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{I}^{-1}(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{g}(\hat{\boldsymbol{\beta}}_{H_0})$$

where for the cumulative response model $\hat{\boldsymbol{\beta}}_{H_0} = (\hat{\alpha}_1, \ldots, \hat{\alpha}_k, \hat{\beta}_1, \ldots, \hat{\beta}_t, 0, \ldots, 0)'$, and for the generalized logit model $\hat{\boldsymbol{\beta}}_{H_0} = (\hat{\alpha}_1, \ldots, \hat{\alpha}_k, \hat{\boldsymbol{\beta}}_{1(t)}', \mathbf{0}_{(s-t)}', \ldots, \hat{\boldsymbol{\beta}}_{k(t)}', \mathbf{0}_{(s-t)}')'$, where $\mathbf{0}_{(s-t)}$ denotes a vector of $s - t$ zeros.

The residual chi-square has an asymptotic chi-square distribution with $s - t$ degrees of freedom ($k(s - t)$ for the generalized logit model). A special case is the global score chi-square, where the reduced model consists of the $k$ intercepts and no explanatory effects. The global score statistic is displayed in the "Testing Global Null Hypothesis: BETA=0" table. The table is not produced when the NOFIT option is used, but the global score statistic is displayed.
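The calculation described above can be sketched for the simplest case, a binary response with one intercept and the logit link. This is a minimal illustration under those assumptions, not the procedure's own code: the helper names `fit_logit` and `residual_chi_square` and the simulated data are hypothetical. The reduced model is fit by Newton-Raphson, its estimates are padded with zeros for the excluded effects, and the score statistic is evaluated in the full model.

```python
import numpy as np

def fit_logit(X, y, iters=50, tol=1e-10):
    """Newton-Raphson MLE for a binary logit model (no step-halving)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        g = X.T @ (y - p)
        info = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(info, g)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def residual_chi_square(X_full, y, n_reduced):
    """Score test that the last (columns - n_reduced) parameters are zero."""
    beta_reduced = fit_logit(X_full[:, :n_reduced], y)
    n_zero = X_full.shape[1] - n_reduced
    beta_h0 = np.concatenate([beta_reduced, np.zeros(n_zero)])  # pad with zeros
    p = 1.0 / (1.0 + np.exp(-X_full @ beta_h0))
    g = X_full.T @ (y - p)
    info = X_full.T @ (X_full * (p * (1.0 - p))[:, None])
    return float(g @ np.linalg.solve(info, g)), n_zero

# Hypothetical simulated data: s = 3 effects, only the first one matters.
rng = np.random.default_rng(0)
n = 200
V = rng.normal(size=(n, 3))
eta = -0.3 + 0.8 * V[:, 0]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)
X = np.column_stack([np.ones(n), V])

# Reduced model: intercept plus the first effect (t = 1), so df = s - t = 2.
stat, df = residual_chi_square(X, y, n_reduced=2)
print(stat, df)
```

Calling `residual_chi_square(X, y, n_reduced=1)` in this sketch gives the global score chi-square, because the reduced model then contains only the intercept.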

Testing Individual Effects Not in the Model

These tests are performed when you specify SELECTION=FORWARD or STEPWISE, and are displayed when the DETAILS option is specified. In the displayed output, the tests are labeled "Score Chi-Square" in the "Analysis of Effects Eligible for Entry" table and in the "Summary of Stepwise (Forward) Selection" table. This section describes how the tests are calculated.

Suppose that $k$ intercepts and $t$ explanatory variables (say $v_1, \ldots, v_t$) have been fit to a model and that $v_{t+1}$ is another explanatory variable of interest. Consider a full model with the $k$ intercepts and $t + 1$ explanatory variables ($v_1, \ldots, v_t, v_{t+1}$) and a reduced model with $v_{t+1}$ excluded. The significance of $v_{t+1}$ adjusted for $v_1, \ldots, v_t$ can be determined by comparing the corresponding residual chi-square with a chi-square distribution with one degree of freedom ($k$ degrees of freedom for the generalized logit model).
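The following sketch shows how such entry tests could be computed for a binary logit model, one candidate variable at a time. It is an illustration under stated assumptions rather than the procedure's algorithm: the names `grad_info`, `fit_logit`, and `entry_score_tests` and the simulated data are hypothetical, and each candidate is a single column, so every test has one degree of freedom. The reduced model is fit once and reused for every candidate.

```python
import math
import numpy as np

def grad_info(X, y, beta):
    """Gradient and information matrix of the binary logit log likelihood."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (y - p), X.T @ (X * (p * (1.0 - p))[:, None])

def fit_logit(X, y, iters=50, tol=1e-10):
    """Newton-Raphson MLE for a binary logit model (no step-halving)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        g, info = grad_info(X, y, beta)
        step = np.linalg.solve(info, g)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def entry_score_tests(X_in, candidates, y):
    """Score chi-square and p-value for adding each candidate column."""
    beta_reduced = fit_logit(X_in, y)            # fit the current model once
    results = []
    for j in range(candidates.shape[1]):
        X_full = np.column_stack([X_in, candidates[:, j]])
        beta_h0 = np.append(beta_reduced, 0.0)   # candidate slope fixed at 0
        g, info = grad_info(X_full, y, beta_h0)
        stat = max(float(g @ np.linalg.solve(info, g)), 0.0)
        p_value = math.erfc(math.sqrt(stat / 2.0))  # upper tail of chi-square, 1 df
        results.append((j, stat, p_value))
    return results

# Hypothetical simulated data: v1 is in the model; candidate 0 is related
# to the response and candidate 1 is pure noise.
rng = np.random.default_rng(1)
n = 400
v1 = rng.normal(size=n)
cand = rng.normal(size=(n, 2))
eta = 0.2 + 0.7 * v1 + 1.5 * cand[:, 0]
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

X_in = np.column_stack([np.ones(n), v1])
results = entry_score_tests(X_in, cand, y)
for j, stat, p_value in results:
    print(j, stat, p_value)
```

Because no candidate model is actually fit, screening many variables this way costs one model fit plus one linear solve per candidate.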

Testing the Parallel Lines Assumption

For an ordinal response, PROC LOGISTIC performs a test of the parallel lines assumption. In the displayed output, this test is labeled "Score Test for the Equal Slopes Assumption" when the LINK= option is NORMIT or CLOGLOG. When LINK=LOGIT, the test is labeled "Score Test for the Proportional Odds Assumption." For small sample sizes, this test might be too liberal (Stokes, Davis, and Koch 2000, p. 249). This section describes the methods used to calculate the test.

For this test the number of response levels, $k + 1$, is assumed to be strictly greater than 2. Let $Y$ be the response variable taking values $1, \ldots, k, k + 1$. Suppose there are $s$ explanatory variables. Consider the general cumulative model without making the parallel lines assumption

$$g(\Pr(Y \le i \mid \mathbf{x})) = (1, \mathbf{x}') \boldsymbol{\beta}_i, \quad 1 \le i \le k$$

where $g(\cdot)$ is the link function, and $\boldsymbol{\beta}_i = (\alpha_i, \beta_{i1}, \ldots, \beta_{is})'$ is a vector of unknown parameters consisting of an intercept $\alpha_i$ and $s$ slope parameters $\beta_{i1}, \ldots, \beta_{is}$. The parameter vector for this general cumulative model is

$$\boldsymbol{\beta} = (\boldsymbol{\beta}_1', \ldots, \boldsymbol{\beta}_k')'$$

Under the null hypothesis of parallelism $H_0\colon \beta_{1m} = \beta_{2m} = \cdots = \beta_{km}$, $1 \le m \le s$, there is a single common slope parameter for each of the $s$ explanatory variables. Let $\beta_1, \ldots, \beta_s$ be the common slope parameters. Let $\hat{\alpha}_1, \ldots, \hat{\alpha}_k$ and $\hat{\beta}_1, \ldots, \hat{\beta}_s$ be the MLEs of the intercept parameters and the common slope parameters. Then, under $H_0$, the MLE of $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}}_{H_0} = (\hat{\boldsymbol{\beta}}_1', \ldots, \hat{\boldsymbol{\beta}}_k')' \quad \text{with} \quad \hat{\boldsymbol{\beta}}_i = (\hat{\alpha}_i, \hat{\beta}_1, \ldots, \hat{\beta}_s)', \quad 1 \le i \le k$$

and the chi-square score statistic $\mathbf{g}'(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{I}^{-1}(\hat{\boldsymbol{\beta}}_{H_0}) \, \mathbf{g}(\hat{\boldsymbol{\beta}}_{H_0})$ has an asymptotic chi-square distribution with $s(k - 1)$ degrees of freedom. This tests the parallel lines assumption by testing the equality of separate slope parameters simultaneously for all explanatory variables.
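The degrees of freedom follow from counting parameters. As a worked illustration with hypothetical values (four response levels, so $k = 3$, and $s = 3$ explanatory variables), the general cumulative model has one intercept and $s$ slopes for each of the $k$ cumulative logits, while the model under $H_0$ has $k$ intercepts and $s$ common slopes:

$$\begin{aligned}
\text{general model:} \quad & k(s + 1) = 3 \times 4 = 12 \text{ parameters} \\
\text{model under } H_0\text{:} \quad & k + s = 3 + 3 = 6 \text{ parameters} \\
\text{restrictions:} \quad & k(s + 1) - (k + s) = s(k - 1) = 3 \times 2 = 6
\end{aligned}$$

In this example the score statistic would therefore be compared with a chi-square distribution with 6 degrees of freedom.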

Last updated: December 09, 2022