The SURVEYIMPUTE Procedure
The OUTPUT statement creates a SAS data set that contains the imputed data. You must use the OUTPUT statement to store the imputed data in a SAS data set. If you use multiple OUTPUT statements, then PROC SURVEYIMPUTE uses only the first OUTPUT statement and ignores the rest. The OUTPUT OUT= data set contains all the variables from the DATA= input data set, imputed values for missing values for the variables in the VAR statement, and some observation-level quantities. These quantities can include the fractionally adjusted weights, replicate weights, recipient numbers, and donor identifications.
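As a minimal sketch of where the OUTPUT statement sits in a procedure call, the following step imputes two categorical items and stores the result. The data set names (Survey, Imputed) and the variable names (Q1, Q2, Stratum, PSU, SampWt) are hypothetical and are not taken from this documentation.

```sas
/* Hypothetical input data set Survey with categorical items Q1 and Q2 */
proc surveyimpute data=Survey method=FEFI;
   class Q1 Q2;          /* treat the items as categorical */
   var Q1 Q2;            /* variables whose missing values are imputed */
   strata Stratum;       /* design strata (assumed variable name) */
   cluster PSU;          /* primary sampling units (assumed variable name) */
   weight SampWt;        /* base sampling weights (assumed variable name) */
   output out=Imputed;   /* store the imputed data in Imputed */
run;
```

Without the OUTPUT statement, this step would not save the imputed data in a data set.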
You can specify the following in the OUTPUT statement:
- OUT=SAS-data-set
names the output data set. If you use the OUTPUT statement but omit the OUT= option, then the output data set is named by using the DATAn convention. For more information, see the section OUT= Output Data Set.
- OUTJKCOEFS=SAS-data-set
names a SAS data set that contains the jackknife coefficients for VARMETHOD=JACKKNIFE.
- keyword <=name>
specifies the quantities to include in the output data set and optionally names the new variables that contain the quantities. Specify a keyword for each desired quantity (see the following list of keywords), optionally followed by an equal sign and a variable name to contain the quantity. If you specify a keyword without a variable name, then PROC SURVEYIMPUTE uses default names. You can specify the following keywords:
- DONORID<=name>
includes the identification of the donor units in the output data set and optionally names the corresponding variable. If you do not specify this keyword, the donor IDs are not saved in the output data set. If you specify this keyword but do not specify a name, then the donor IDs are stored in a new variable named DonorID. This keyword is available only when METHOD=HOTDECK.
- FRACTIONALWEIGHTS=name
includes the fractional weights of donor cells in the output data set and specifies the corresponding variable name. If you do not specify this keyword, the fractional weights are not saved in the output data set. This keyword is available when METHOD=FEFI or METHOD=FHDI.
- IMPADJWEIGHTS=name
includes the imputation-adjusted weights in the output data set and specifies the corresponding variable name. The imputation-adjusted weights are computed by multiplying the base weights by the fractional weights. If you do not specify this keyword, then the imputation-adjusted weights are stored in a new variable named ImpWt. This keyword is available when METHOD=FEFI or METHOD=FHDI.
- IMPSTATUS=name
includes an imputation status index with the values shown in Table 4.
Table 4: Imputation Status Index
| Index | Imputation Status |
|---|---|
| 0 | All items are observed |
| 1 | All missing items are imputed |
| 2 | No missing items are imputed |
| 3 | Some missing items are imputed but some missing items are not imputed |
| 4 | Invalid observation; observation is not used in the imputation |
- OBSID=name
includes in the output data set an index variable that contains a unique numeric identifier for every unit in the input data set, and specifies the corresponding variable name. If you do not specify this keyword, then the default unit ID is stored in a new variable named UnitID. The OBSID= option is not applicable when the ID statement is specified.
- IMPINDEX=name
includes the imputation index in the output data set and specifies the corresponding variable name. The imputation index can be 0 (which indicates a nonmissing unit) or any positive integer (which represents multiple donor units for a recipient unit). If you do not specify this keyword, then the imputation index is stored in a new variable named ImpIndex.
- OUTCONTLEVELS <=YES | NO>
specifies whether to include in the OUTPUT OUT= data set the imputed values for the variables that are specified in the CLEVVAR= option in the VAR statement or the imputed levels for the imputation bins (when the CLEVELS= option is specified in the VAR statement). This option does not apply when METHOD=HOTDECK.
By default, the imputed values for the variables that are specified in the CLEVVAR= option or the imputed levels for the imputation bins are included in the OUTPUT OUT= data set. Optionally, you can specify one of the following values:
- YES
includes the imputed values for CLEVVAR= variables or the imputed levels for the imputation bins (when the CLEVELS= option is used) in the OUTPUT OUT= data set.
- NO
does not include the imputed values for CLEVVAR= variables or the imputed levels for the imputation bins in the OUTPUT OUT= data set.
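The options above can be combined in a single OUTPUT statement. The following sketch requests several observation-level quantities under METHOD=FEFI and then inspects them. All data set names (Survey, Imputed, JKCoefs, WeightCheck) and variable names (Q1, Q2, SampWt, FracWt, AdjWt, Status, RespID, ImpNum) are hypothetical, and the check in the DATA step assumes only the relationship stated for IMPADJWEIGHTS= (base weight multiplied by fractional weight).

```sas
/* Hypothetical input data set Survey; all names are illustrative only */
proc surveyimpute data=Survey method=FEFI varmethod=jackknife;
   class Q1 Q2;
   var Q1 Q2;
   weight SampWt;                  /* base weights (assumed variable name) */
   output out=Imputed              /* imputed data */
          outjkcoefs=JKCoefs       /* jackknife coefficients */
          fractionalweights=FracWt /* fractional weights */
          impadjweights=AdjWt      /* imputation-adjusted weights */
          impstatus=Status         /* imputation status index (Table 4) */
          obsid=RespID             /* unit identifier; no ID statement is used */
          impindex=ImpNum;         /* 0 for nonmissing units, positive otherwise */
run;

/* Compare the adjusted weight with base weight times fractional weight */
data WeightCheck;
   set Imputed;
   Diff = AdjWt - SampWt*FracWt;   /* expected to be close to 0 */
run;

/* Tabulate the imputation status index over the output rows */
proc freq data=Imputed;
   tables Status;
run;
```

Because a recipient unit can appear in more than one output row, the PROC FREQ table counts rows rather than distinct units.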
Last updated: December 09, 2022