The LOESS Procedure

Statistical Inference and Lookup Degrees of Freedom

If you denote the $i$th measurement of the response by $y_i$ and the corresponding measurement of predictors by $x_i$, then

$$y_i = g(x_i) + \epsilon_i$$

where $g$ is the regression function and $\epsilon_i$ are independent random errors with mean zero. If the errors are normally distributed with constant variance, then you can obtain confidence intervals for the predictions from PROC LOESS. You can also obtain confidence limits in the case where $\epsilon_i$ is heteroscedastic but $a_i \epsilon_i$ has constant variance and $a_i$ are a priori weights that are specified using the WEIGHT statement of PROC LOESS. You can do inference in the case in which the error distribution is symmetric by using iterative reweighting. Formulas for doing statistical inference under the preceding conditions can be found in Cleveland and Grosse (1991) and Cleveland, Grosse, and Shyu (1992). Cleveland and Grosse (1991) show that standardized residuals for a loess model follow a $t$ distribution with $\rho$ degrees of freedom where

$$
\begin{aligned}
\delta_1 &\equiv \mathrm{Trace}\left[(\mathbf{I}-\mathbf{L})'(\mathbf{I}-\mathbf{L})\right] \\
\delta_2 &\equiv \mathrm{Trace}\left[\left((\mathbf{I}-\mathbf{L})'(\mathbf{I}-\mathbf{L})\right)^2\right] \\
\rho &\equiv \text{Lookup Degrees of Freedom} \\
     &\equiv \delta_1^2 / \delta_2
\end{aligned}
$$
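As a sanity check on these definitions, consider a special case that is not a loess fit. The sketch below assumes that $\mathbf{L}$ denotes the $n \times n$ smoothing matrix that maps the responses to the fitted values, and it takes $\mathbf{L}$ to be the hat matrix of an ordinary least squares fit with $p$ parameters, which is symmetric and idempotent with trace $p$. The symbols $n$ and $p$ are introduced here only for illustration.

```latex
\begin{aligned}
(\mathbf{I}-\mathbf{L})'(\mathbf{I}-\mathbf{L})
  &= (\mathbf{I}-\mathbf{L})^2 = \mathbf{I}-\mathbf{L} \\
\delta_1 &= \mathrm{Trace}(\mathbf{I}-\mathbf{L}) = n - p \\
\delta_2 &= \mathrm{Trace}\left[(\mathbf{I}-\mathbf{L})^2\right] = n - p \\
\rho &= \delta_1^2 / \delta_2 = (n-p)^2 / (n-p) = n - p
\end{aligned}
```

In this special case the lookup degrees of freedom reduce to the familiar residual degrees of freedom of least squares. A loess smoothing matrix is in general neither symmetric nor idempotent, so $\delta_1$ and $\delta_2$ differ and $\rho$ is usually not an integer.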

The residual standard error that you find in the "Fit Summary" table is defined by

$$\text{Residual Standard Error} \equiv \sqrt{\text{Residual SS} / \delta_1}$$

The determination of $\rho$ is computationally expensive and is not done by default. It is computed if you specify the DFMETHOD=EXACT or DFMETHOD=APPROX option in the MODEL statement. It is also computed if you specify any of the options CLM, STD, or T in the MODEL statement. Note that the values of $\delta_1$, $\delta_2$, and $\rho$ are reported in the "Fit Summary" table.
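The following statements are a minimal sketch of how you might request the lookup degrees of freedom explicitly. The choice of the Sashelp.Class sample data set and of the variables Weight and Height is an assumption made only for illustration; substitute your own data set and variables.

```sas
/* Sashelp.Class and the variables Weight and Height are    */
/* illustrative assumptions, not part of the original text. */
proc loess data=Sashelp.Class;
   /* DFMETHOD=EXACT requests computation of the lookup     */
   /* degrees of freedom, which is not done by default.     */
   model Weight = Height / dfmethod=exact;
run;
```

Specifying DFMETHOD=APPROX in place of DFMETHOD=EXACT is the other option named above that triggers the computation.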

If you specify the CLM option in the MODEL statement, confidence limits are added to the OutputStatistics table. By default, 95% limits are computed, but you can change this by using the ALPHA= option in the MODEL statement.
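As a hedged illustration of the CLM and ALPHA= options, the following statements request 90% confidence limits and save the resulting table to a data set. The input data set Sashelp.Class, the variables Weight and Height, and the output data set name Work.LoessLimits are assumptions chosen for this example.

```sas
/* Data set and variable names are illustrative assumptions. */
proc loess data=Sashelp.Class;
   /* CLM adds confidence limits to the OutputStatistics     */
   /* table; ALPHA=0.1 requests 90% limits instead of the    */
   /* default 95% limits.                                    */
   model Weight = Height / clm alpha=0.1;
   /* Save the OutputStatistics table as a data set.         */
   ods output OutputStatistics=Work.LoessLimits;
run;

/* Display the saved predictions and confidence limits. */
proc print data=Work.LoessLimits;
run;
```

Because CLM is specified, the lookup degrees of freedom are also computed for this fit even though no DFMETHOD= option appears.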

Last updated: December 09, 2022