For binary response data, the response is either an event or a nonevent. In PROC LOGISTIC, the response with Ordered Value 1 is regarded as the event, and the response with Ordered Value 2 is the nonevent. PROC LOGISTIC models the probability of the event. From the fitted model, a predicted event probability can be computed for each observation. A method to compute a reduced-bias estimate of the predicted probability is given in the section "Predicted Probability of an Event for Classification." If the (reduced-bias) predicted event probability equals or exceeds some cutpoint value $z$, the observation is predicted to be an event observation; otherwise, it is predicted to be a nonevent observation.
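In symbols, with $\hat{\pi}$ denoting the (reduced-bias) predicted event probability and $z$ the cutpoint, the classification rule is

$$\text{predicted response} = \begin{cases} \text{event} & \text{if } \hat{\pi} \ge z \\ \text{nonevent} & \text{if } \hat{\pi} < z \end{cases}$$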
Suppose that $n_1$ of $n$ individuals experience an event, such as a disease, and the remaining $n_2 = n - n_1$ individuals do not experience that event (that is, they have a nonevent response). The $2 \times 2$ frequency (classification, confusion, decision, error) table in Table 12 is obtained by cross-classifying the observed and predicted responses, where $n_{ij}$ is the total number of observations that are observed to have $Y = i$ and are classified into $j$. In this table, let $Y = 1$ denote an observed event and $Y = 2$ denote an observed nonevent, and let the decision rule $D$ classify an observation as an event when $\hat{\pi} \ge z$; $D = 1$ indicates that the observation is classified as an event, and $D = 2$ indicates that the observation is classified as a nonevent.

Table 12: 2x2 Frequency (Classification) Table with Cutpoint $z$

                        D = 1 (Event)      D = 2 (Nonevent)   Total
   Y = 1 (Event)        $n_{11}$           $n_{12}$           $n_1$
   Y = 2 (Nonevent)     $n_{21}$           $n_{22}$           $n_2$
   Total                $n_{11}+n_{21}$    $n_{12}+n_{22}$    $n$
The CTABLE option produces this table, and the PPROB= option selects one or more cutpoints $z$. Each cutpoint generates a classification table. If the PEVENT= option is also specified, a classification table is produced for each combination of PEVENT= and PPROB= values.
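For example, the following statements request classification tables for a range of cutpoints and two prior probabilities. This is a sketch that assumes the sashelp.heart sample data set and its Status, AgeAtStart, and Cholesterol variables; substitute your own data set, response, and covariates:

   proc logistic data=sashelp.heart;
      model Status(event='Dead') = AgeAtStart Cholesterol
            / ctable pprob=(0.2 to 0.8 by 0.1) pevent=(0.05 0.10);
   run;

Each of the seven PPROB= cutpoints is paired with each of the two PEVENT= values, so fourteen classification tables are produced.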
The cells of the classification matrix in Table 12 have the following interpretations:

- $n_{11}$ is the number of events that are correctly classified as events (true positives)
- $n_{12}$ is the number of events that are incorrectly classified as nonevents (false negatives)
- $n_{21}$ is the number of nonevents that are incorrectly classified as events (false positives)
- $n_{22}$ is the number of nonevents that are correctly classified as nonevents (true negatives)
The statistics in Table 13 are computed from the classification table in Table 12.

Table 13: Statistics from the Classification Matrix with Cutpoint $z$

   Statistic                           Conditional Probability   Estimator
   Sensitivity (TPF)                   $\Pr(D=1 \mid Y=1)$       $n_{11}/n_1$
   Specificity (1 - FPF)               $\Pr(D=2 \mid Y=2)$       $n_{22}/n_2$
   Positive predictive value (PPV)     $\Pr(Y=1 \mid D=1)$       $n_{11}/(n_{11}+n_{21})$
   Negative predictive value (NPV)     $\Pr(Y=2 \mid D=2)$       $n_{22}/(n_{12}+n_{22})$
   Correct classification rate (PC)    $\Pr(D = Y)$              $(n_{11}+n_{22})/n$
   Misclassification rate              $\Pr(D \ne Y)$            $(n_{12}+n_{21})/n$
The accuracy of the classification is measured by its ability to predict events and nonevents correctly. Sensitivity (true positive fraction, TPF, or recall) is the proportion of event responses that are predicted to be events. Specificity (true negative fraction, 1–FPF) is the proportion of nonevent responses that are predicted to be nonevents.
You can also measure accuracy by how well the classification predicts the response. The positive predictive value (precision, PPV) is the proportion of observations classified as events that are correctly classified. The negative predictive value (NPV) is the proportion of observations classified as nonevents that are correctly classified. The correct classification rate (accuracy, PC) is the proportion of observations that are correctly classified, and the misclassification rate (error rate) is the proportion of observations that are incorrectly classified. Given prior probabilities (prevalence, $\rho$) that are specified by the PEVENT= option, you can compute these conditional probabilities as posterior probabilities by using Bayes’ theorem, as shown in the following section.
Note: Current literature defines the false positive fraction as $\mathrm{FPF} = \Pr(D=1 \mid Y=2) = 1 - \text{specificity}$ and the false negative fraction as $\mathrm{FNF} = \Pr(D=2 \mid Y=1) = 1 - \text{sensitivity}$; $1 - \mathrm{PPV}$ is called the false discovery rate, and $1 - \mathrm{NPV}$ is called the false omission rate.
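As a concrete illustration (not PROC LOGISTIC output), the following DATA step computes the Table 13 statistics from the four cell counts; the counts here are made up:

   /* Compute the Table 13 statistics from hypothetical cell counts */
   data ctable_stats;
      n11 = 40; n12 = 10;          /* events:    correctly / incorrectly classified */
      n21 = 15; n22 = 85;          /* nonevents: incorrectly / correctly classified */
      n1 = n11 + n12;              /* number of observed events    */
      n2 = n21 + n22;              /* number of observed nonevents */
      n  = n1 + n2;
      sensitivity = n11 / n1;              /* Pr(D=1 | Y=1) */
      specificity = n22 / n2;              /* Pr(D=2 | Y=2) */
      ppv   = n11 / (n11 + n21);           /* Pr(Y=1 | D=1) */
      npv   = n22 / (n12 + n22);           /* Pr(Y=2 | D=2) */
      pc    = (n11 + n22) / n;             /* correct classification rate */
      error = (n12 + n21) / n;             /* misclassification rate      */
   run;

   proc print data=ctable_stats noobs;
   run;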
If the prevalence of the disease in the population is provided by the value of the PEVENT= option, then PROC LOGISTIC uses Bayes’ theorem to modify the PPV, NPV, and PC as follows (Fleiss, Levin, and Paik 2003):

$$\mathrm{PPV} = \frac{\rho \cdot \text{sensitivity}}{\rho \cdot \text{sensitivity} + (1 - \rho)(1 - \text{specificity})}$$

$$\mathrm{NPV} = \frac{(1 - \rho) \cdot \text{specificity}}{\rho \,(1 - \text{sensitivity}) + (1 - \rho) \cdot \text{specificity}}$$

$$\mathrm{PC} = \rho \cdot \text{sensitivity} + (1 - \rho) \cdot \text{specificity}$$

where $\rho$ is the prevalence specified by the PEVENT= option.
If you do not specify the PEVENT= option, then PROC LOGISTIC uses the sample proportion of diseased individuals; that is, $\rho = n_1/n$. In such a case, the preceding values reduce to those in Table 13. Note that for a stratified sampling or case-control situation in which $n_1$ and $n_2$ are chosen a priori, $n_1/n$ is not a desirable estimate of $\rho$, so you should specify the PEVENT= option.
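The following DATA step sketches the Bayes adjustment for a single prevalence value; the sensitivity, specificity, and prevalence values are hypothetical:

   /* Bayes-adjusted PPV, NPV, and PC for a specified prevalence rho
      (the PEVENT= value); the inputs below are made-up numbers */
   data bayes_adjust;
      rho  = 0.05;      /* population prevalence, as given by PEVENT= */
      sens = 0.80;      /* sample-based sensitivity  */
      spec = 0.85;      /* sample-based specificity  */
      ppv = (rho*sens) / (rho*sens + (1-rho)*(1-spec));
      npv = ((1-rho)*spec) / (rho*(1-sens) + (1-rho)*spec);
      pc  = rho*sens + (1-rho)*spec;
   run;

With a low prevalence such as 0.05, the adjusted PPV is much smaller than the sample-based PPV, which is why specifying PEVENT= matters for case-control data.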
When you classify a set of binary data, if the same observations used to fit the model are also used to estimate the classification error, the resulting error-count estimate is biased. One way of reducing the bias is to remove the binary observation to be classified from the data, reestimate the parameters of the model, and then classify the observation on the basis of the new parameter estimates. However, it would be costly to fit the model by leaving out each observation one at a time. The LOGISTIC procedure provides a less expensive one-step approximation to the preceding parameter estimates. Let $\hat{\boldsymbol{\beta}}$ be the MLE of the parameter vector $\boldsymbol{\beta}$ based on all observations, and let $\hat{\boldsymbol{\beta}}_{(j)}$ denote the MLE computed without the $j$th observation. The one-step estimate of $\hat{\boldsymbol{\beta}}_{(j)}$ is given by

$$\hat{\boldsymbol{\beta}}_{(j)}^{1} = \hat{\boldsymbol{\beta}} - \frac{w_j \,(y_j - \hat{\pi}_j)}{1 - h_j}\, \widehat{\mathbf{I}}^{-1}(\hat{\boldsymbol{\beta}})\, \mathbf{x}_j$$

where

- $y_j$ is 1 for an observed event response and 0 otherwise
- $w_j$ is the weight of the observation
- $\hat{\pi}_j$ is the predicted event probability of the $j$th observation based on $\hat{\boldsymbol{\beta}}$
- $\mathbf{x}_j$ is the covariate vector (including the intercept) of the $j$th observation
- $h_j = w_j\,\hat{\pi}_j(1-\hat{\pi}_j)\,\mathbf{x}_j'\,\widehat{\mathbf{I}}^{-1}(\hat{\boldsymbol{\beta}})\,\mathbf{x}_j$ is the hat diagonal element of the $j$th observation, and $\widehat{\mathbf{I}}(\hat{\boldsymbol{\beta}})$ is the information matrix evaluated at $\hat{\boldsymbol{\beta}}$
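The following PROC IML program is a minimal sketch of the one-step computation under the preceding formula. The design matrix, response, and full-data MLE are made-up values; in practice, PROC LOGISTIC performs this computation internally when you specify the CTABLE option:

   proc iml;
      X = {1 2, 1 4, 1 6, 1 8, 1 10};   /* hypothetical design matrix (intercept + one covariate) */
      y = {0, 0, 1, 1, 1};              /* observed event indicators */
      w = j(nrow(X), 1, 1);             /* unit weights */
      b = {-4.0, 0.8};                  /* assumed full-data MLE (for illustration) */

      pi = 1 / (1 + exp(-(X*b)));       /* predicted event probabilities */
      V  = w # pi # (1 - pi);           /* variance weights w*pi*(1-pi) */
      Info    = X` * (V # X);           /* information matrix I(b) = X'VX */
      InfoInv = inv(Info);

      do j = 1 to nrow(X);
         xj = X[j, ]`;                            /* covariate vector of observation j */
         hj = V[j] * xj` * InfoInv * xj;          /* hat diagonal element */
         b1 = b - (w[j]*(y[j] - pi[j]) / (1 - hj)) * InfoInv * xj;  /* one-step estimate */
         print j b1;
      end;
   quit;

Each b1 approximates the MLE that would be obtained by deleting observation j and refitting, at the cost of a single matrix-vector product per observation rather than a full refit.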