The HPLOGISTIC Procedure

The Hosmer-Lemeshow Goodness-of-Fit Test

To evaluate the fit of the model, Hosmer and Lemeshow (2000) proposed a statistic that they show, through simulation, is distributed as chi-square when there is no replication in any of the subpopulations. This goodness-of-fit test is available only for binary response models.

The unit interval is partitioned into 2,000 equal-sized bins, and each observation i is placed into the bin that contains its estimated event probability. This effectively sorts the observations in increasing order of their estimated event probability.
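
The binning step can be sketched in Python as follows. This is only an illustration, not the procedure's implementation: the names `bin_index` and `bin_observations` are hypothetical, and assigning a probability of exactly 1 to the last bin is an assumption that the text does not state.

```python
N_BINS = 2000  # number of equal-width bins stated in the text


def bin_index(p, n_bins=N_BINS):
    """Return the 0-based index of the bin that contains probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("estimated probability must lie in [0, 1]")
    # Assumption: p == 1.0 is assigned to the last bin.
    return min(int(p * n_bins), n_bins - 1)


def bin_observations(p_hat, events, freqs=None, n_bins=N_BINS):
    """Accumulate per-bin totals: [frequency, event frequency, sum of f * p_hat]."""
    if freqs is None:
        freqs = [1] * len(p_hat)  # assumption: unit frequency when none is given
    bins = [[0, 0, 0.0] for _ in range(n_bins)]
    for p, y, f in zip(p_hat, events, freqs):
        b = bins[bin_index(p, n_bins)]
        b[0] += f
        b[1] += f * y
        b[2] += f * p
    # Keep only the nonzero bins; they are already in increasing
    # order of estimated event probability.
    return [b for b in bins if b[0] > 0]


nonzero_bins = bin_observations([0.9, 0.1, 0.5, 0.1], [1, 0, 1, 0])
```

Because the bins are visited in index order, no explicit sort of the observations is needed.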

The observations (and frequencies) are further combined into $G$ groups. By default $G = 10$, but you can specify $G \ge 5$ with the NGROUPS= suboption of the LACKFIT option in the MODEL statement. Let $F$ be the total frequency. The target frequency for each group is $T = \lfloor F/G + 0.5 \rfloor$, which is the integer part of $F/G + 0.5$. Load the first group ($g_j$, $j = 1$) with the first of the 2,000 bins that has nonzero frequency $f_1$, and let the next nonzero bin have a frequency of $f$. PROC HPLOGISTIC performs the following steps for each nonzero bin to create the groups:

  1. If $j = G$, then add this bin to group $g_j$.

  2. Otherwise, if $f_j < T$ and $f_j + \lfloor f/2 \rfloor \le T$, then add this bin to group $g_j$.

  3. Otherwise, start loading the next group ($g_{j+1}$) with $f_{j+1} = f$, and set $j = j + 1$.

If the final group $g_j$ has frequency $f_j < T/2$, then add these observations to the preceding group. The total number of groups actually created, $g$, can be less than $G$.
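
The grouping rules, including the final adjustment for a small last group, can be sketched in Python as follows. The function name `form_groups` and its input layout (a list of nonzero bin frequencies in increasing order of probability) are assumptions made for illustration; the sketch follows the steps as written here and is not taken from the procedure's source.

```python
import math


def form_groups(bin_freqs, n_groups=10):
    """Group nonzero bins (in probability order); return lists of bin positions."""
    if n_groups < 5:
        raise ValueError("the text requires G >= 5")
    if not bin_freqs:
        return []
    total = sum(bin_freqs)                       # F
    target = math.floor(total / n_groups + 0.5)  # T
    groups = [[0]]                 # group 1 starts with the first nonzero bin
    group_freq = [bin_freqs[0]]    # running frequency f_j of each group
    for pos in range(1, len(bin_freqs)):
        f = bin_freqs[pos]
        f_j = group_freq[-1]
        if len(groups) == n_groups:
            extend = True                        # step 1: j = G
        else:
            # step 2: the current group is still below the target
            extend = f_j < target and f_j + math.floor(f / 2) <= target
        if extend:
            groups[-1].append(pos)
            group_freq[-1] += f
        else:                                    # step 3: start group j + 1
            groups.append([pos])
            group_freq.append(f)
    # Final adjustment: fold a too-small last group into the preceding one.
    if len(groups) > 1 and group_freq[-1] < target / 2:
        last = groups.pop()
        groups[-1].extend(last)
    return groups
```

For example, `form_groups([5, 5, 5, 5], n_groups=5)` yields only four groups, because every bin already exceeds the target frequency $T = 4$ and the data run out before a fifth group is started.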

The Hosmer-Lemeshow goodness-of-fit statistic is obtained by calculating the Pearson chi-square statistic from the $2 \times g$ table of observed and expected frequencies. The statistic is written

$$\chi^2_{HL} = \sum_{j=1}^{g} \frac{\left( O_j - F_j \bar{\pi}_j \right)^2}{F_j \bar{\pi}_j \left( 1 - \bar{\pi}_j \right)}$$

where, for the $j$th group $g_j$, $F_j = \sum_{i \in g_j} f_i$ is the total frequency of subjects, $O_j$ is the total frequency of event outcomes, and $\bar{\pi}_j = \sum_{i \in g_j} f_i \hat{p}_i / F_j$ is the average estimated predicted probability of an event outcome. Let $\epsilon$ be the square root of the machine epsilon divided by 4,000, which is about 2.5E-12. Any $\bar{\pi}_j < \epsilon$ is set to $\epsilon$; similarly, any $\bar{\pi}_j > 1 - \epsilon$ is set to $1 - \epsilon$.
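
As a rough illustration of the formula, the statistic can be computed from grouped data as in the Python sketch below. The function name and the per-observation tuple layout `(f_i, y_i, p_hat_i)` are hypothetical, and the clamping threshold simply reuses the approximate value 2.5E-12 quoted in the text rather than deriving it from the machine epsilon.

```python
EPS = 2.5e-12  # approximate clamping threshold quoted in the text


def hosmer_lemeshow_statistic(groups, eps=EPS):
    """groups: list of groups, each a list of (f_i, y_i, p_hat_i) tuples,
    where y_i is 1 for an event outcome and 0 otherwise."""
    chi2 = 0.0
    for group in groups:
        F_j = sum(f for f, _, _ in group)              # total frequency
        O_j = sum(f * y for f, y, _ in group)          # event frequency
        pi_bar = sum(f * p for f, _, p in group) / F_j
        pi_bar = min(max(pi_bar, eps), 1.0 - eps)      # keep away from 0 and 1
        expected = F_j * pi_bar
        chi2 += (O_j - expected) ** 2 / (expected * (1.0 - pi_bar))
    return chi2


# Two small groups: the first matches its expected count exactly,
# the second observes 4 events where 2 are expected.
example = [[(1, 1, 0.5), (1, 0, 0.5)], [(4, 1, 0.5)]]
stat = hosmer_lemeshow_statistic(example)
```

The clamping step is what keeps the denominator positive when every estimated probability in a group is 0 or 1.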

The Hosmer-Lemeshow statistic is compared to a chi-square distribution with $g - r$ degrees of freedom. You can specify $r$ with the DFREDUCE= suboption of the LACKFIT option in the MODEL statement. By default, $r = 2$, and to compute the Hosmer-Lemeshow statistic you must have $g - r \ge 1$. Large values of $\chi^2_{HL}$ (and small p-values) indicate a lack of fit of the model.
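
The comparison against the chi-square distribution can be sketched in Python using only the standard library. The helper names `chi2_sf` and `hl_pvalue` are hypothetical, and the series expansion of the regularized incomplete gamma function used here is a generic textbook method chosen for self-containment; it is not claimed to be how PROC HPLOGISTIC computes the p-value, and it is intended only for moderate values of the statistic.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df
    degrees of freedom, via the series for the lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a          # n = 0 term of sum z^n / (a (a+1) ... (a+n))
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    log_lower = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))


def hl_pvalue(stat, g, r=2):
    """p-value of the statistic with g - r degrees of freedom (r = 2 by default)."""
    if g - r < 1:
        raise ValueError("need g - r >= 1 to compute the test")
    return chi2_sf(stat, g - r)


p_value = hl_pvalue(8.0, g=10)  # default r = 2 gives 8 degrees of freedom
```

A small `p_value` would point toward lack of fit, in line with the interpretation above.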

Last updated: December 09, 2022