The ROBUSTREG Procedure

Implementation of the WEIGHT Statement

You can use the WEIGHT statement to specify a weight variable in the input data set. (For more information, see the section WEIGHT Statement.) This section describes how PROC ROBUSTREG implements the WEIGHT statement for each of the estimation methods and for leverage detection.

M Estimation

If you use M estimation with a known scale, then instead of minimizing $Q(\boldsymbol{\theta}) = \sum_{i=1}^{n} \rho\left(\frac{r_i}{\sigma}\right)$, the weighted M estimation minimizes the weighted Huber-type objective function

$$Q(\boldsymbol{\theta}) = \sum_{i=1}^{n} v_i \, \rho\left(\frac{r_i}{\sigma}\right)$$

where $v_i$ is the weight variable that is specified by the WEIGHT statement. If you use M estimation with an unknown scale, the weight variable is used in the location steps but not in the scale steps. (For more information, see the section M Estimation and the SCALE= option.) For estimating the covariance of the weighted M estimation, $\psi(r_i)$ and $\psi'(r_i)$ are obtained from the final iteration of the weighted M estimation, and $\mathbf{X}'\mathbf{X}$ and $\mathbf{W}$ are replaced, respectively, by $\mathbf{X}'\mathbf{V}\mathbf{X}$ and $W_{jk} = \sum v_i \, \psi'(r_i) \, x_{ij} x_{ik}$, where $\mathbf{V}$ is a diagonal matrix whose diagonal elements are $v_i$. (For more information, see the section Asymptotic Covariance and Confidence Intervals.) The weight variable does not affect the model degrees of freedom $p$ and the error degrees of freedom $n - p$.
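The weighted objective and the weighted matrices above can be sketched in a few lines of Python. This is only an illustration, not the code that PROC ROBUSTREG runs: the choice of the Huber $\rho$ with tuning constant c = 1.345, the function names, and the toy data are all assumptions made for the example, and the residuals are taken as given rather than computed from a fit.

```python
def huber_rho(u, c=1.345):
    """Huber rho: quadratic near zero, linear in the tails (c is an assumed constant)."""
    a = abs(u)
    return 0.5 * u * u if a <= c else c * a - 0.5 * c * c


def huber_psi_prime(u, c=1.345):
    """Derivative of the Huber psi function: 1 inside [-c, c], 0 outside."""
    return 1.0 if abs(u) <= c else 0.0


def weighted_objective(residuals, weights, sigma, c=1.345):
    """Q(theta) = sum_i v_i * rho(r_i / sigma) for a known scale sigma."""
    return sum(v * huber_rho(r / sigma, c) for r, v in zip(residuals, weights))


def weighted_matrices(X, residuals, weights, c=1.345):
    """Return (X'VX, W), where W[j][k] = sum_i v_i * psi'(r_i) * x_ij * x_ik."""
    p = len(X[0])
    xvx = [[0.0] * p for _ in range(p)]
    w = [[0.0] * p for _ in range(p)]
    for row, r, v in zip(X, residuals, weights):
        d = huber_psi_prime(r, c)
        for j in range(p):
            for k in range(p):
                xvx[j][k] += v * row[j] * row[k]
                w[j][k] += v * d * row[j] * row[k]
    return xvx, w


# Hypothetical toy data: intercept plus one regressor; the third residual is large.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
residuals = [0.5, -0.2, 4.0]
weights = [1.0, 2.0, 1.0]

q = weighted_objective(residuals, weights, sigma=1.0)
xvx, w = weighted_matrices(X, residuals, weights)
```

Setting every weight to 1 reduces both functions to their unweighted counterparts, which is a quick way to check the sketch against the unweighted formulas.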

LTS Estimation

LTS estimation ignores the weight variable.

S Estimation

S estimation applies the weight variable only in its M-refinement step. Except for the initial estimates, the M-refinement step of S estimation is the same as the weighted M estimation with unknown scale. If you use the NOREFINE suboption, S estimation skips the M-refinement step and therefore ignores the weight variable.

MM Estimation

By default, the initial step of MM estimation is the initial LTS estimation. Unlike the regular LTS estimation, the initial LTS estimation is applied to the weighted data $(y_i^*, \mathbf{x}_i^*)$, where $y_i^* = \sqrt{v_i} \, y_i$ and $\mathbf{x}_i^* = \sqrt{v_i} \, \mathbf{x}_i$. After the initial LTS estimation, the weight variable is ignored for the subsequent scale adjustment.
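The square-root weighting of the data described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure's implementation: the function name and the toy data are hypothetical, and the example assumes that the intercept column is part of each row of X and is therefore scaled along with the other regressors.

```python
import math


def weight_transform(y, X, weights):
    """Return (y*, X*) with y_i* = sqrt(v_i) * y_i and x_i* = sqrt(v_i) * x_i."""
    roots = [math.sqrt(v) for v in weights]
    y_star = [s * yi for s, yi in zip(roots, y)]
    X_star = [[s * xij for xij in row] for s, row in zip(roots, X)]
    return y_star, X_star


# Hypothetical toy data; the first column of X is assumed to be the intercept.
y = [1.0, 2.0, 3.0]
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]]
weights = [4.0, 1.0, 9.0]

y_star, X_star = weight_transform(y, X, weights)
```

An estimator that is run on `y_star` and `X_star` then sees each observation scaled by the square root of its weight, even if the estimator itself has no notion of weights.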

You can use INITEST=S to specify the initial S estimation as the initial step of the MM estimation. As with the regular S estimation, the weight variable is used only in the M-refinement step of the initial S estimation. There is no subsequent scale adjustment step if the initial S estimation is applied.

Except for the initial estimates, the final M estimation of the MM estimation is the same as the weighted M estimation with known scale.

Final Weighted Least Squares Estimation

Final weighted least squares estimation is always applied to the weighted data $(y_i^*, \mathbf{x}_i^*)$, where $y_i^* = \sqrt{v_i} \, y_i$ and $\mathbf{x}_i^* = \sqrt{v_i} \, \mathbf{x}_i$, no matter how the weight variable is applied in the preceding estimation. For example, if you specify METHOD=LTS together with the FWLS option, the outliers that LTS estimation identifies do not depend on the weight variable, but final weighted least squares estimation still applies the weight variable to all the points that are not outliers.
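To make the final step concrete, the sketch below fits a weighted least squares line to the points that a preceding robust fit did not flag as outliers. It is an assumption-laden illustration rather than the procedure's algorithm: the model is restricted to an intercept and one regressor, the outlier flags are hard-coded instead of coming from an actual LTS fit, and the function name and data are hypothetical. Ordinary least squares on the square-root-weighted data solves the same weighted normal equations that the function solves directly.

```python
def final_wls(y, x, weights, is_outlier):
    """Weighted least squares fit of y = b0 + b1 * x on the non-outlier points."""
    kept = [
        (yi, xi, vi)
        for yi, xi, vi, out in zip(y, x, weights, is_outlier)
        if not out
    ]
    # Weighted sums that make up the 2x2 normal equations X'VX b = X'Vy.
    sw = sum(v for _, _, v in kept)
    sx = sum(v * xi for _, xi, v in kept)
    sy = sum(v * yi for yi, _, v in kept)
    sxx = sum(v * xi * xi for _, xi, v in kept)
    sxy = sum(v * xi * yi for yi, xi, v in kept)
    det = sw * sxx - sx * sx
    b1 = (sw * sxy - sx * sy) / det
    b0 = (sy - b1 * sx) / sw
    return b0, b1


# Hypothetical data: the first three points lie on y = 1 + 2x; the fourth is flagged.
y = [1.0, 3.0, 5.0, 40.0]
x = [0.0, 1.0, 2.0, 3.0]
weights = [1.0, 2.0, 1.0, 5.0]
is_outlier = [False, False, False, True]  # assumed output of a preceding robust fit

b0, b1 = final_wls(y, x, weights, is_outlier)
```

The flagged point carries the largest weight in this example, yet it has no influence on the fit, because the weights are applied only to the points that are kept.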

Robust Distances and Leverage Detection

Robust distance computation ignores the weight variable. Because leverage detection depends on robust distance, it also ignores the weight variable.

Last updated: December 09, 2022