The HPPRINCOMP Procedure
The OUTPUT statement creates a data set that contains observationwise statistics, which are computed after PROC HPPRINCOMP fits the model. If you do not specify a keyword, then only the principal component scores are included.
The OUTPUT statement causes the OUT= option in the PROC HPPRINCOMP statement to be ignored.
The variables in the input data set are not included in the output data set, in order to avoid data duplication for large data sets; however, variables that you specify in the ID statement are included.
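The behavior described above can be sketched as follows. This is a minimal illustration, not an official example: the data set names work.demo and work.scores and the variables obsid, x1, x2, and x3 are hypothetical and are generated here only so that the step is self-contained.

```sas
/* Hypothetical input data: three numeric variables and an identifier */
data work.demo;
   call streaminit(2022);
   do obsid = 1 to 50;
      x1 = rand('normal');
      x2 = 0.6*x1 + rand('normal');
      x3 = rand('normal');
      output;
   end;
run;

proc hpprincomp data=work.demo;
   var x1 x2 x3;
   id obsid;                /* ID variables are copied to the output data set */
   output out=work.scores;  /* no keyword: only principal component scores    */
run;

/* Inspect which variables were written */
proc contents data=work.scores;
run;
```

Under these assumptions, work.scores should contain obsid and the score variables but not x1, x2, or x3, because VAR variables are not copied to the output data set.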
You can specify the following syntax elements:
-
OUT=SAS-data-set
DATA=SAS-data-set
specifies the name of the output data set; OUT= and DATA= are equivalent. If you omit this option, the procedure uses the DATAn convention to name the output data set.
-
keyword <=prefix>
specifies a statistic to include in the output data set and optionally a prefix for naming the output variables. If you do not provide a prefix, the HPPRINCOMP procedure assigns a default prefix based on the type of statistic requested. For example, for the VAR variables x1 and x2, RESIDUAL produces two residual value variables, R_x1 and R_x2.
You can specify the following keywords to add statistics to the OUTPUT data set:
-
H
requests the approximate leverage. The default prefix is H.
-
STD
requests standardized (centered and scaled) VAR variable values for each VAR variable. The default prefix is Std.
-
STDSSE
requests the sum of squares of residuals for standardized VAR variables. The default prefix is StdSSE.
-
TSQUARE
T2
requests the scaled sum of squares of the score values. The default prefix is TSquare.
-
RESIDUAL
RESID
R
requests residuals for each VAR variable. The default prefix is R.
-
SCORE
requests principal component scores for each principal component. The default prefix is Score.
If you specify METHOD=EIG, the only valid keywords are RESIDUAL (if you also specify the PARTIAL statement) and SCORE. Other keywords are ignored.
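As a hedged sketch of the METHOD=EIG restriction, the following step requests only the two keywords that are valid for that method. The data set work.demo_eig and its variables are hypothetical; x3 is partialed out so that RESIDUAL is meaningful.

```sas
/* Hypothetical input data, generated only for illustration */
data work.demo_eig;
   call streaminit(36);
   do obsid = 1 to 50;
      x3 = rand('normal');
      x1 = 0.5*x3 + rand('normal');
      x2 = 0.3*x3 + rand('normal');
      output;
   end;
run;

proc hpprincomp data=work.demo_eig method=eig;
   var x1 x2;
   partial x3;                              /* required for RESIDUAL with METHOD=EIG */
   id obsid;
   output out=work.eig_out score residual;  /* the only keywords honored here        */
run;
```

With the default prefixes, the residual variables should be named R_x1 and R_x2, and the score variables should start with Score. Adding a keyword such as H or STD to this OUTPUT statement would have no effect, because those keywords are ignored for METHOD=EIG.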
The output variables that contain the requested statistic are named as follows, according to the keyword that you specify:
The keywords RESIDUAL and STD define an output variable for each VAR variable, so each output variable is named by joining the prefix and the name of the corresponding VAR variable with an underscore. For example, if the model has the VAR variables x1 and x2, then RESIDUAL=R produces the variables R_x1 and R_x2.
The keyword SCORE defines an output variable for each principal component, so the variables that correspond to each successive component are named by appending the component number to the prefix. For example, if the model has three principal components, then SCORE=T produces the variables T1, T2, and T3.
The keywords H, STDSSE, and TSQUARE each define a single output variable, so the variable name matches the prefix.
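The naming rules above can be tried with a step like the following. It is a sketch under stated assumptions: the data set and variable names are hypothetical, all prefixes are arbitrary choices, and METHOD=NIPALS and N=2 are assumed to be available as PROC HPPRINCOMP statement options for selecting an iterative method and the number of components (they are not described in this section).

```sas
/* Hypothetical input data, generated only for illustration */
data work.demo_names;
   call streaminit(40);
   do obsid = 1 to 50;
      x1 = rand('normal');
      x2 = 0.6*x1 + rand('normal');
      x3 = rand('normal');
      output;
   end;
run;

/* A method other than EIG is assumed so that all keywords are honored */
proc hpprincomp data=work.demo_names method=nipals n=2;
   var x1 x2 x3;
   id obsid;
   output out=work.stats
          score=PC       /* one variable per component: PC1, PC2          */
          residual=R     /* one variable per VAR variable: R_x1, ...      */
          std=Z          /* one variable per VAR variable: prefix Z       */
          h=Lev          /* single variable named Lev                     */
          stdsse=SSE     /* single variable named SSE                     */
          tsquare=TSq;   /* single variable named TSq                     */
run;

/* List the resulting variable names to confirm the convention */
proc contents data=work.stats;
run;
```

The prefix for SCORE is deliberately not T here: SCORE=T with two components would produce a variable named T2, which is easy to confuse with the T2 alias of the TSQUARE keyword.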
Last updated: December 09, 2022