The IRT Procedure

Model and Item Fit

The IRT procedure includes five model fit statistics: log likelihood, Akaike’s information criterion (AIC), Bayesian information criterion (BIC), likelihood ratio chi-square $G^2$, and Pearson’s chi-square.
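
As a rough illustration of how the two information criteria relate to the log likelihood, the following Python sketch uses the standard textbook definitions, AIC = -2 log L + 2m and BIC = -2 log L + m log N. The function name `information_criteria` and its arguments are hypothetical, and the sketch assumes that PROC IRT follows these standard definitions; it is not taken from the procedure itself.

```python
import math

def information_criteria(log_likelihood, n_free_params, n_subjects):
    """Return (AIC, BIC) under the standard definitions (an assumption here).

    log_likelihood : maximized log likelihood of the fitted model
    n_free_params  : number of free parameters m
    n_subjects     : number of subjects N
    """
    aic = -2.0 * log_likelihood + 2.0 * n_free_params
    bic = -2.0 * log_likelihood + n_free_params * math.log(n_subjects)
    return aic, bic

# Hypothetical values, for illustration only.
aic, bic = information_criteria(log_likelihood=-100.0, n_free_params=5, n_subjects=100)
print(aic, bic)
```

Smaller values of either criterion indicate a better trade-off between fit and model complexity when competing models are compared on the same data.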

The following two equations compute the likelihood ratio chi-square $G^2$ and Pearson’s chi-square $\chi^2$:

$$G^2 = 2 \left[ \sum_{l=1}^{L} r_l \log \frac{r_l}{N P_l} \right]$$
$$\chi^2 = \sum_{l=1}^{L} \frac{(r_l - N P_l)^2}{N P_l}$$

where $N$ is the number of subjects, $L$ is the number of possible response patterns, $P_l$ is the estimated probability of observing response pattern $l$, and $r_l$ is the number of subjects who have response pattern $l$. If the model is true, these two statistics asymptotically follow a central chi-square distribution with $L - m - 1$ degrees of freedom, where $m$ is the number of free parameters in the model. When $L$ is much greater than $N$, the frequency table is sparse: most response patterns are observed rarely or not at all. For example, 20 binary items already produce $2^{20} = 1{,}048{,}576$ possible response patterns. Sparseness invalidates the chi-square distribution as the asymptotic distribution of these two statistics, so in that case the likelihood ratio chi-square and Pearson’s chi-square statistics should not be used to evaluate overall model fit.
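
The two overall fit statistics can be sketched in a few lines of Python. This is a minimal illustration of the formulas above, not PROC IRT's implementation: the function name `overall_fit` and the example counts and probabilities are hypothetical, and the sketch assumes the common convention that a response pattern with zero observed count contributes nothing to the likelihood ratio sum.

```python
import math

def overall_fit(counts, probs, n_free_params):
    """Return (G2, Pearson chi-square, degrees of freedom).

    counts        : r_l, observed number of subjects for each of the L patterns
    probs         : P_l, model-estimated probability of each pattern
    n_free_params : m, number of free parameters in the model
    """
    n = sum(counts)                      # N, total number of subjects
    g2_sum = 0.0
    pearson = 0.0
    for r, p in zip(counts, probs):
        expected = n * p                 # N * P_l
        if r > 0:                        # assumed convention: 0 * log(0) = 0
            g2_sum += r * math.log(r / expected)
        pearson += (r - expected) ** 2 / expected
    df = len(counts) - n_free_params - 1  # L - m - 1
    return 2.0 * g2_sum, pearson, df

# Hypothetical example: 4 response patterns, 100 subjects, 2 free parameters.
g2, pearson, df = overall_fit([30, 20, 25, 25], [0.25, 0.25, 0.25, 0.25], 2)
print(g2, pearson, df)
```

With these made-up inputs every pattern has an expected count of 25, so Pearson’s chi-square is 2.0, the likelihood ratio statistic is about 2.01, and there is 1 degree of freedom.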

For item fit, PROC IRT computes the likelihood ratio $G^2$ and Pearson’s chi-square. Pearson’s chi-square statistic, proposed by Yen (1981), has the form

$$Q_{1j} = \sum_{k=1}^{10} N_k \frac{(O_{jk} - E_{jk})^2}{E_{jk}(1 - E_{jk})}$$

The likelihood ratio $G^2$, proposed by McKinley and Mills (1985), uses the following equation:

$$G^2 = 2 \sum_{k=1}^{10} N_k \left[ O_{jk} \log \frac{O_{jk}}{E_{jk}} + (1 - O_{jk}) \log \frac{1 - O_{jk}}{1 - E_{jk}} \right]$$

These two statistics approximately follow a central chi-square distribution with $10 - m_j$ degrees of freedom, where $m_j$ is the number of free parameters for item $j$.

To calculate these two statistics, first order all the subjects according to their estimated factor scores, and then partition them into 10 intervals such that the number of subjects in each interval is approximately equal. Here $N_k$ is the number of subjects in interval $k$, and $O_{jk}$ and $E_{jk}$ are the observed proportion and expected proportion, respectively, of subjects in interval $k$ who have a correct response on item $j$. The expected proportion $E_{jk}$ is computed as the mean predicted probability of a correct response in interval $k$.
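
The ordering, partitioning, and summation steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure's actual algorithm: the function name `item_fit` and its arguments are hypothetical, subjects are split into intervals by simple index slicing of the sorted order (how PROC IRT handles ties or uneven group sizes is not specified here), predicted probabilities are assumed to lie strictly between 0 and 1, and terms of the form 0 log 0 are treated as zero.

```python
import math

def item_fit(scores, responses, predicted, n_groups=10):
    """Return (Q1, G2) for one binary item.

    scores    : estimated factor score for each subject
    responses : observed 0/1 response of each subject on the item
    predicted : model-predicted probability of a correct response per subject
    """
    # Order subjects by estimated factor score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    q1 = 0.0
    g2_sum = 0.0
    for k in range(n_groups):
        # Slice the sorted subjects into roughly equal-sized intervals.
        group = order[k * n // n_groups:(k + 1) * n // n_groups]
        n_k = len(group)
        if n_k == 0:
            continue
        o = sum(responses[i] for i in group) / n_k   # observed proportion O_jk
        e = sum(predicted[i] for i in group) / n_k   # expected proportion E_jk
        q1 += n_k * (o - e) ** 2 / (e * (1.0 - e))
        if o > 0.0:                                  # assumed: 0 * log(0) = 0
            g2_sum += n_k * o * math.log(o / e)
        if o < 1.0:
            g2_sum += n_k * (1.0 - o) * math.log((1.0 - o) / (1.0 - e))
    return q1, 2.0 * g2_sum

# Hypothetical data: 20 subjects, every predicted probability equal to 0.5.
scores = list(range(20))
q1, g2 = item_fit(scores, [0, 1] * 10, [0.5] * 20)
print(q1, g2)
```

In this made-up example each interval holds two subjects with an observed proportion of 0.5 that matches the expected proportion, so both statistics are zero. Larger values indicate a bigger gap between observed and expected proportions across the score intervals.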

Last updated: December 09, 2022