The REG Procedure

Testing for Lack of Fit

The test for lack of fit compares the variation around the model with "pure" variation within replicated observations. This comparison measures the adequacy of the specified model. In particular, if there are replicated observations of the response all at the same values of the regressors, then you can estimate the true response at those values either by using the predicted value based on the model or by using the mean of the replicated values. The test for lack of fit decomposes the residual error into a component due to the variation of the replications around their mean value (the "pure" error) and a component due to the variation of the mean values around the model prediction (the "bias" error):

\[
\sum_{i} \sum_{j=1}^{n_i} \left( Y_{ij} - \hat{Y}_i \right)^2
= \sum_{i} \sum_{j=1}^{n_i} \left( Y_{ij} - \bar{Y}_i \right)^2
+ \sum_{i} n_i \left( \bar{Y}_i - \hat{Y}_i \right)^2
\]

If the model is adequate, then both components, each divided by its degrees of freedom, estimate the nominal level of error; however, if the bias component of error is much larger than the pure error, then this constitutes evidence that there is significant lack of fit.
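The decomposition can be checked numerically. The following Python sketch is an illustration only and is not how PROC REG computes the test: it assumes a simple linear model with one regressor, uses made-up data, and the function name `lack_of_fit` is hypothetical. It computes the pure error and bias (lack-of-fit) sums of squares and forms the usual F ratio of their mean squares. The p-value is omitted because the Python standard library has no F distribution.

```python
from collections import defaultdict


def lack_of_fit(x, y, n_params=2):
    """Decompose the residual SS of a simple linear fit (hypothetical helper).

    Assumes one regressor plus an intercept (n_params = 2) and at least
    one replicated x value, so that pure error has positive df.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Ordinary least squares for y = intercept + slope * x
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Group the responses that share the same regressor value
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    # Left-hand side of the identity: total residual sum of squares
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    ss_pure = 0.0  # replicates around their group mean
    ss_lof = 0.0   # group means around the model prediction
    for xi, ys in groups.items():
        group_mean = sum(ys) / len(ys)
        fitted = intercept + slope * xi
        ss_pure += sum((yi - group_mean) ** 2 for yi in ys)
        ss_lof += len(ys) * (group_mean - fitted) ** 2

    df_pure = n - len(groups)        # observations minus distinct x values
    df_lof = len(groups) - n_params  # distinct x values minus parameters
    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)

    return {
        "ss_resid": ss_resid, "ss_pure": ss_pure, "ss_lof": ss_lof,
        "df_pure": df_pure, "df_lof": df_lof, "f_stat": f_stat,
    }


# Made-up data: two replicates at each of four x values, with a curved
# trend that a straight line cannot capture.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 4.2, 3.8, 8.9, 9.1, 16.2, 15.8]

result = lack_of_fit(x, y)
for name, value in result.items():
    print(name, value)
```

For these data the bias mean square is far larger than the pure error mean square, which is the pattern the text describes as evidence of lack of fit. Comparing `f_stat` against an F distribution with `df_lof` and `df_pure` degrees of freedom would give the significance level.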

If some observations in your design are replicated, you can test for lack of fit by specifying the LACKFIT option in the MODEL statement (see Example 105.6). Note that, since all other tests use total error rather than pure error, you might want to hand-calculate the tests with respect to pure error if the lack of fit is significant. On the other hand, significant lack of fit indicates that the specified model is inadequate, so if this is a problem you can also try to refine the model.

Last updated: December 09, 2022