The statistics that this section discusses are useful for comparing competing models that are not necessarily nested. They are different measures of how well your model fits your data.
Let $N$ be the number of observations in your data. For the $j$th observation, let $r_j$ be the number of events and $n_j$ be the number of trials when events/trials syntax is specified, or $n_j = 1$ for single-trial syntax. Let $w_j$ and $f_j$ be the weight and frequency values, respectively, and denote $F = \sum_{j=1}^{N} f_j$ and $n = \sum_{j=1}^{N} f_j n_j$. The total sample size is $n$.
Let $p$ denote the number of parameters in the model, including the intercept parameters. Let $s$ be the number of explanatory effects, that is, the number of slope parameters. Let $k$ be the total number of response functions; this is the same as the number of intercepts in the model unless you specify the NOINT option in the MODEL statement. For this section, assume that the NOINT option is not specified. For binary and cumulative response models, $p = k + s$. For the generalized logit model, $p = k(1 + s)$.
For the $j$th observation, let $\hat{\pi}_j$ be the estimated probability of the observed response. The criteria that the LOGISTIC procedure displays are calculated as follows:
–2 log likelihood:

$$-2\,\mathrm{Log\,L} = -2 \sum_{j} \frac{w_j f_j}{\sigma^2} \log(\hat{\pi}_j)$$

where $\sigma^2$ is the dispersion parameter, which equals 1 unless the SCALE= option is specified. For binary response models that use events/trials MODEL statement syntax, this is

$$-2\,\mathrm{Log\,L} = -2 \sum_{j} \frac{w_j f_j}{\sigma^2} \left[\, r_j \log(\hat{\pi}_j) + (n_j - r_j) \log(1 - \hat{\pi}_j) \,\right]$$

where $\hat{\pi}_j$ denotes the estimated event probability. This statistic is reported both with and without the constant term.
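For example, statements such as the following fit an events/trials model and estimate a dispersion parameter through the SCALE= option; the data set Assay and its variables are hypothetical placeholders:

   /* Hypothetical sketch: events/trials syntax with a
      Williams-method dispersion parameter */
   proc logistic data=Assay;
      model Cases/Total = Dose / scale=williams;
   run;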
Akaike’s information criterion:

$$\mathrm{AIC} = -2\,\mathrm{Log\,L} + 2p$$
Schwarz (Bayesian information) criterion:

$$\mathrm{SC} = -2\,\mathrm{Log\,L} + p \log(n)$$
Akaike’s corrected information criterion:

$$\mathrm{AICC} = -2\,\mathrm{Log\,L} + \frac{2pn}{n - p - 1}$$
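As a numeric illustration (all values hypothetical), suppose a fitted model has $-2\,\mathrm{Log\,L} = 100$, $p = 3$, and total sample size $n = 50$. Then

$$\mathrm{AIC} = 100 + 2(3) = 106, \qquad \mathrm{SC} = 100 + 3\log(50) \approx 111.7, \qquad \mathrm{AICC} = 100 + \frac{2(3)(50)}{50 - 3 - 1} \approx 106.5$$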
The AIC and SC statistics give two different ways of adjusting the –2 log-likelihood statistic for the number of terms in the model and the number of observations used. You can use these statistics to compare different models for the same data (for example, when you use the SELECTION=STEPWISE option in the MODEL statement). The models that you are comparing do not have to be nested; lower values of the statistics indicate a more desirable model.
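For example, a sketch such as the following fits a sequence of models by stepwise selection so that their fit criteria can be compared; the data set Study and its variables are hypothetical:

   /* Hypothetical sketch: compare the fit statistics of the
      models visited during stepwise selection */
   proc logistic data=Study;
      model Y(event='1') = X1 X2 X3 X4 / selection=stepwise;
   run;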
The AICC is a small-sample, bias-corrected version of Akaike’s information criterion, as promoted in Hurvich and Tsai (1989) and Burnham and Anderson (1998), for example. The AICC is displayed in the "Model Fit Statistics" table for the selected model when you specify the GOF option in the MODEL statement, and in the "Fit Statistics for SCORE Data" table when you specify the FITSTAT option in the SCORE statement.
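A minimal sketch that requests the AICC in both tables, assuming hypothetical training and validation data sets Train and Valid:

   /* GOF displays the AICC in the "Model Fit Statistics" table;
      FITSTAT displays it in the "Fit Statistics for SCORE Data"
      table. All data set and variable names are hypothetical. */
   proc logistic data=Train;
      model Y(event='1') = X1 X2 / gof;
      score data=Valid out=ScoredValid fitstat;
   run;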
The difference in the –2 log-likelihood statistics between the intercepts-only model and the specified model has a chi-square distribution with $p - k$ degrees of freedom under the null hypothesis that all the explanatory effects in the model are zero. The likelihood ratio test in the "Testing Global Null Hypothesis: BETA=0" table displays this difference and the associated p-value for this statistic. The score and Wald tests in that table test the same hypothesis and are asymptotically equivalent; for more information, see the sections Residual Chi-Square and Testing Linear Hypotheses about the Regression Coefficients.
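For example, if the intercept-only model has $-2\log L_0 = 140$ and a model with $p - k = 4$ slope parameters has $-2\log L = 100$ (hypothetical values), then the likelihood ratio statistic is $\chi^2 = 140 - 100 = 40$ with 4 degrees of freedom.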
Like the AIC and SC statistics that are described in the section Information Criteria, R-square statistics are most useful for comparing competing models that are not necessarily nested—larger values indicate better models. The statistics that are discussed in this section are based on the likelihood of the fitted model. Specifying the NORMALIZE option in the WEIGHT statement makes these coefficients invariant to the scale of the weights.
Let $L_0$ denote the likelihood of the intercept-only model, and let $L$ denote the likelihood of the specified model.
Maddala (1983) and Cox and Snell (1989, pp. 208–209) propose the following generalization of the coefficient of determination to a more general linear model:

$$R^2 = 1 - \left( \frac{L_0}{L} \right)^{2/n}$$

The quantity $R^2$ achieves a maximum of less than one for discrete models, where the maximum is given by

$$R^2_{\max} = 1 - (L_0)^{2/n}$$
Cragg and Uhler (1970), Maddala (1983), and Nagelkerke (1991) propose the following adjusted coefficient, which can achieve a maximum value of 1:

$$\tilde{R}^2 = \frac{R^2}{R^2_{\max}}$$
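As a numeric illustration (hypothetical values), if $-2\log L_0 = 140$, $-2\log L = 100$, and $n = 100$, then

$$R^2 = 1 - e^{-(140 - 100)/100} \approx 0.330, \qquad R^2_{\max} = 1 - e^{-140/100} \approx 0.753, \qquad \tilde{R}^2 \approx 0.330 / 0.753 \approx 0.438$$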
The RSQUARE option in the MODEL statement displays $R^2$, labeled as "R-Square," and $\tilde{R}^2$, labeled as "Max-rescaled R-Square," in the "RSquare" table. The GOF option in the MODEL statement displays these two statistics in the "Model Fit Statistics" table. The FITSTAT option in the SCORE statement displays them in the "Fit Statistics for SCORE Data" table.
You produce the remaining statistics in this section by specifying the GOF option when you have a binary logistic regression model. These statistics are displayed in the "Model Fit Statistics" table.
McFadden (1974) suggests a measure analogous to the R-square in the linear regression model, labeled as "McFadden’s R-Square," also called the likelihood ratio index or the deviance R-square:

$$R^2_{\mathrm{McF}} = 1 - \frac{\log L}{\log L_0}$$
Estrella (1998) devised an R-square measure, labeled "Estrella’s R-Square," which satisfies a requirement that its derivative equal 1 when its value is 0:

$$R^2_{E} = 1 - \left( \frac{\log L}{\log L_0} \right)^{-(2/n) \log L_0}$$
Estrella also adjusts $R^2_{E}$, in a measure labeled "Estrella’s Adjusted R-Square," by imposing a penalty for the number of parameters in the model in the same fashion as the AIC:

$$R^2_{E,\mathrm{adj}} = 1 - \left( \frac{\log L - p}{\log L_0} \right)^{-(2/n) \log L_0}$$
Aldrich and Nelson (1984) based another measure on the model chi-square statistic, $\chi^2 = 2(\log L - \log L_0)$, that compares the full model to the intercept-only model:

$$R^2_{AN} = \frac{\chi^2}{\chi^2 + n}$$
Veall and Zimmermann (1996) adjust $R^2_{AN}$ to obtain an upper limit of 1:

$$R^2_{VZ} = R^2_{AN} \, \frac{n - 2\log L_0}{-2\log L_0}$$
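Continuing the same hypothetical example ($\log L_0 = -70$, $\log L = -50$, $n = 100$, $\chi^2 = 40$):

$$R^2_{\mathrm{McF}} = 1 - \frac{-50}{-70} \approx 0.286, \qquad R^2_{E} = 1 - \left( \frac{-50}{-70} \right)^{1.4} \approx 0.376$$

$$R^2_{AN} = \frac{40}{40 + 100} \approx 0.286, \qquad R^2_{VZ} \approx 0.286 \times \frac{100 + 140}{140} \approx 0.490$$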
Discussions of these and other pseudo-$R^2$ measures can also be found in Allison (2014), Menard (2000), Smith and McKenna (2013), and Windmeijer (1995).
Measures that are discussed in this section use residuals and predicted probabilities to evaluate the strength of the fit.
For binary response models, let $\hat{\pi}_j$ denote the estimated event probability for observation $j$. For polytomous response models, let $\hat{\pi}_{ij}$ denote the predicted probability of response level $i$, $i = 1, \ldots, k + 1$. Let $e_j$ be the residual, which for binomial (events/trials) models is $e_j = r_j / n_j - \hat{\pi}_j$, for binary response models is $e_j = y_j - \hat{\pi}_j$, where $y_j = 1$ for an event and $y_j = 0$ for a nonevent, and for polytomous response models is $e_j = 1 - \hat{\pi}_{ij}$ for the observed response level $i$.
The average square error is computed as

$$\mathrm{ASE} = \frac{1}{n} \sum_{j=1}^{N} f_j n_j e_j^2$$

If you have specified a WEIGHT statement, then the weighted average square error is

$$\mathrm{ASE}_w = \frac{\sum_{j=1}^{N} w_j f_j n_j e_j^2}{\sum_{j=1}^{N} w_j f_j n_j}$$
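For example, for three unweighted single-trial observations with $f_j = n_j = 1$ and residuals $e_j = 0.2, -0.5, 0.1$ (hypothetical values),

$$\mathrm{ASE} = \frac{0.2^2 + (-0.5)^2 + 0.1^2}{3} = 0.10$$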
If you specify the FITSTAT option in the SCORE statement, then these statistics are displayed in the "Fit Statistics for SCORE Data" table, and they are labeled as "Brier Score" or "Brier Reliability," as discussed in the section Fit Statistics for Scored Data Sets. If you specify the GOF option and are fitting a binary logistic regression model, then these statistics are displayed in the "Model Fit Statistics" table.
The misclassification rate, or error rate, is the proportion of observations that are incorrectly classified. By default, observations for which $\hat{\pi}_j \ge 0.5$ are classified as events; otherwise they are classified as nonevents.
If you specify the FITSTAT option in the SCORE statement, then the error rate is displayed in the "Fit Statistics for SCORE Data" table. You can change the cutpoint from 0.5 by specifying the PEVENT= option in the MODEL statement; the first value that you specify in that option is used.
If you specify the GOF option in the MODEL statement and are fitting a binary logistic regression model, then the misclassification rate is displayed in the "Model Fit Statistics" table. You can change the cutpoint from 0.5 by specifying the GOF(CUTPOINT=) option.
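A minimal sketch that lowers the cutpoint to 0.3, using hypothetical data set and variable names:

   /* Hypothetical sketch: classify an observation as an event
      when its predicted probability is at least 0.3 */
   proc logistic data=Train;
      model Y(event='1') = X1 X2 / gof(cutpoint=0.3);
   run;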
Lave (1970) and Efron and Hinkley (1978) propose a statistic that is analogous to the general linear model $R^2$,

$$R^2_{\mathrm{res}} = 1 - \frac{\sum_j w_j f_j e_j^2}{\sum_j w_j f_j (y_j - \bar{y})^2}$$

where $\bar{y} = n_E / n$ and $n_E$ is the total frequency $\times$ weight of the events. If you specify the GOF option in the MODEL statement and are fitting a binary logistic regression model, then this statistic is displayed in the "Model Fit Statistics" table.
For a binary response model, write the mean of the model-predicted probabilities of the event (Y=1) observations as

$$\bar{\pi}_1 = \frac{1}{n_1} \sum_j f_j \, I(y_j = 1) \, \hat{\pi}_j$$

and of the nonevent (Y=2) observations as

$$\bar{\pi}_2 = \frac{1}{n_2} \sum_j f_j \, I(y_j = 2) \, \hat{\pi}_j$$

where $n_1$ is the total frequency of events, $n_2$ is the total frequency of nonevents, and $I(\cdot)$ is the indicator function. Tjur (2009) defines the statistic

$$D = \bar{\pi}_1 - \bar{\pi}_2$$

and relates it to other R-square measures. If you specify the GOF option in the MODEL statement and are fitting a binary logistic regression model, then this statistic is displayed in the "Model Fit Statistics" table and labeled "Tjur’s R-Square." Tjur calls it the coefficient of discrimination because it is a measure of the model’s ability to distinguish between the event and nonevent distributions; it is called the "Mean Difference" and the "Difference of Means" in other SAS procedures. This statistic is the same as the $d'$ statistic (with unit standard error) that is discussed in the signal detection literature (McNicol 2005).
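For example, if the events have mean predicted probability $\bar{\pi}_1 = 0.70$ and the nonevents have $\bar{\pi}_2 = 0.30$ (hypothetical values), then Tjur’s $D = 0.70 - 0.30 = 0.40$.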
The predicted mean score of an observation is the sum of the Ordered Values (shown in the "Response Profile" table) minus one, weighted by the corresponding predicted probabilities for that observation; that is, the predicted mean score is $S_j = \sum_{i=1}^{k+1} (i - 1) \hat{\pi}_{ij}$, where $k + 1$ is the number of response levels and $\hat{\pi}_{ij}$ is the predicted probability of the $i$th (ordered) response.
A pair of observations with different observed responses is said to be concordant if the observation with the lower ordered response value has a lower predicted mean score than the observation with the higher ordered response value. If the observation with the lower ordered response value has a higher predicted mean score than the observation with the higher ordered response value, then the pair is discordant. If the pair is neither concordant nor discordant, it is a tie. If you have more than two response levels, enumeration of the total numbers of concordant and discordant pairs is carried out by categorizing the predicted mean score into intervals of length $k/500$ and accumulating the corresponding frequencies of observations. You can change the length of these intervals by specifying the BINWIDTH= option in the MODEL statement.
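For example, the following sketch requests exact enumeration of the pairs by setting the bin width to 0; the data set Ratings and its variables are hypothetical:

   /* Hypothetical sketch: BINWIDTH=0 requests exact computation
      of the concordant and discordant pairs */
   proc logistic data=Ratings;
      model Severity = X1 X2 / binwidth=0;
   run;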
Let $N$ be the sum of observation frequencies in the data. Suppose there are a total of $t$ pairs with different responses: $n_c$ of them are concordant, $n_d$ of them are discordant, and $t - n_c - n_d$ of them are tied. PROC LOGISTIC computes the following four indices of rank correlation for assessing the predictive ability of a model:

$$c = \frac{n_c + 0.5\,(t - n_c - n_d)}{t}$$

$$\text{Somers' } D = \frac{n_c - n_d}{t}$$

$$\text{Goodman–Kruskal Gamma} = \frac{n_c - n_d}{n_c + n_d}$$

$$\text{Kendall's Tau-}a = \frac{n_c - n_d}{0.5\,N(N - 1)}$$
If there are no ties, then Somers’ D (Gini’s coefficient) $= 2c - 1$. Note that the concordance index, $c$, also gives an estimate of the area under the receiver operating characteristic (ROC) curve when the response is binary (Hanley and McNeil 1982). See the section ROC Computations for more information about this area.
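For example, suppose $N = 7$ observations (2 events and 5 nonevents) yield $t = 2 \times 5 = 10$ pairs with different responses, of which $n_c = 7$ are concordant, $n_d = 2$ are discordant, and 1 is tied (hypothetical counts). Then $c = (7 + 0.5 \times 1)/10 = 0.75$, Somers’ $D = (7 - 2)/10 = 0.50$, Gamma $= 5/9 \approx 0.56$, and Tau-$a = 5/(0.5 \times 7 \times 6) \approx 0.24$.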
For binary responses, the predicted mean score is equal to the predicted probability for Ordered Value 2.
These statistics are not available when the STRATA statement is specified.