The two-stage fully efficient fractional imputation method uses multiple donor units for a recipient unit. Each donor donates a fraction of the original weight of the recipient unit such that the sum of the fractional weights from all the donors is equal to the original weight of the recipient. The fraction of the recipient weight that a donor unit contributes to the recipient unit is known as the fractional weight. The method is called fully efficient because it does not introduce additional variability that is caused by the selection of donor units (Kim and Fuller 2004). Two-stage FEFI has two hierarchical imputation stages. One disadvantage of the two-stage FEFI method is that it can greatly increase the size of the imputed data set.
Two-stage FEFI is useful when you want to impute variables that have many unique observed values. FEFI creates many imputed rows if the variable that you are imputing has many unique observed values. Two-stage FEFI imputes these variables conditional on the imputed levels from the first-stage FEFI. Thus, two-stage FEFI often uses fewer imputed rows compared to a similar first-stage FEFI.
Variables that have many observed levels are grouped into imputation bins. The first-stage imputation is performed for all categorical variables by using the FEFI method. The categorical variables include the character variables, the CLASS variables that you also specify in the VAR statement, and the variables that contain the imputation bins of the continuous variables.
The second-stage imputation is performed for the continuous variables within each first-stage donor cell. Observations that contain missing values in any of the continuous items are the recipients, and observations that contain observed values for these missing items are the donors. The second-stage donor cells are defined by the unique combinations of the observed values for the continuous variables within the first-stage donor cells.
Imputation-adjusted replicate weights are computed by repeating both the first-stage and second-stage imputation in every replicate sample independently.
The method is similar to the method of Im, Kim, and Fuller (2015).
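Suppose you want to impute $P$ items jointly. Let $\mathbf{z}_i = (z_{i1}, \ldots, z_{iP_1})$ be the response for $P_1$ items in unit $i$, and let $\mathbf{y}_i = (y_{i1}, \ldots, y_{iP_2})$ be the response for $P_2$ items in unit $i$, where $P = P_1 + P_2$. Let $z_{ij}$ be categorical with $J$ levels for item $j$, and let $y_{ij}$ be continuous. Further assume that $\tilde{y}_{ij}$ contains the discretized levels (imputation bins) for $y_{ij}$, where $\tilde{y}_{ij}$ has $J$ levels. Define $\mathbf{x}_i = (\mathbf{z}_i, \tilde{\mathbf{y}}_i)$. Then $x_{ij}$ is categorical and has $J$ levels for item $j$. Denote $\mathbf{x}_{i,\mathrm{obs}}$ as the observed part and $\mathbf{x}_{i,\mathrm{mis}}$ as the missing part of $\mathbf{x}_i$.

Let $\pi_{\mathbf{x}}$ be the population proportion that falls in category $\mathbf{x}$. Assume that it is possible to estimate the population categories from the observed sample. For example, the conditional probability, $\pi(\mathbf{x}_{i,\mathrm{mis}} \mid \mathbf{x}_{i,\mathrm{obs}})$, is the same for the observed data as it is for the data where $\mathbf{x}_{i,\mathrm{mis}}$ are missing. The initial conditional probabilities are estimated by

$$\hat{\pi}(\mathbf{x}_{\mathrm{mis}} \mid \mathbf{x}_{\mathrm{obs}}) = \frac{\hat{\pi}(\mathbf{x}_{\mathrm{obs}}, \mathbf{x}_{\mathrm{mis}})}{\sum_{\mathbf{x}'_{\mathrm{mis}}} \hat{\pi}(\mathbf{x}_{\mathrm{obs}}, \mathbf{x}'_{\mathrm{mis}})}$$

where

$$\hat{\pi}(\mathbf{x}) = \frac{\sum_{i} w_i \, I(\mathbf{x}_i = \mathbf{x})}{\sum_{i} w_i}$$

is the estimated joint probability, $I(\cdot)$ is an indicator function, and $w_i$ is the observation weight for unit $i$.

Let $M_i$ be the number of observed combinations of $\mathbf{x}_{i,\mathrm{mis}}$ in the sample. Let $\mathbf{x}^{*(l)}_{i,\mathrm{mis}}$ be the $l$th realization of $\mathbf{x}_{i,\mathrm{mis}}$ in the sample. You must assume that at least one realization is available; otherwise, the observation is not imputed.

The two-stage FEFI method first computes the fully efficient fractional weights by using an EM-by-weighting algorithm like that of Kim and Fuller (2013) to impute the missing values in $\mathbf{x}_i$. The missing values in $\mathbf{y}_i$ are imputed in the second-stage imputation. Two-stage FEFI weights for imputing $\mathbf{y}_i$ are computed independently in every imputed level $\mathbf{x}^{*(l)}_{i,\mathrm{mis}}$, $l = 1, \ldots, M_i$, where $M_i$ is the number of first-stage donor cells.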
The following steps describe the two-stage FEFI technique. If you do not use the CELL statement to specify imputation cells, PROC SURVEYIMPUTE uses the entire data set as one imputation cell. If you specify imputation cells, then all the probabilities are computed by using observations from the same imputation cell as the recipient unit. To simplify notation, subscripts are not used for imputation cells in the following description. Imputation cells are defined for the first-stage imputation. Steps 1 to 4 describe the first-stage FEFI for the categorical variables, which also include the imputation bins for the continuous variables. Step 5 describes the second-stage FEFI.
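Step 1 (Initialization): For each observation that has missing items, determine the number of first-stage donor cells. The first-stage donor cells are determined by using the number of unique combinations of observed levels in $\mathbf{x}$ for imputing the missing items in $\mathbf{x}_{i,\mathrm{mis}}$. Only the responding units in the imputation cell are used to determine the number of first-stage donor cells. Compute the initial fractional weight from donor cell $l$ to unit $i$, $w^*_{il(0)}$, by

$$w^*_{il(0)} = \hat{\pi}(\mathbf{x}^{*(l)}_{i,\mathrm{mis}} \mid \mathbf{x}_{i,\mathrm{obs}}), \quad l = 1, \ldots, M_i$$

where $M_i$ is the number of first-stage donor cells, $\mathbf{x}^{*(l)}_{i,\mathrm{mis}}$ denotes the levels that donor cell $l$ supplies for the missing items, and

$$\hat{\pi}(\mathbf{x}^{*(l)}_{i,\mathrm{mis}} \mid \mathbf{x}_{i,\mathrm{obs}}) = \frac{\hat{\pi}(\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(l)}_{i,\mathrm{mis}})}{\sum_{h=1}^{M_i} \hat{\pi}(\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(h)}_{i,\mathrm{mis}})}$$

The sum of the fractional weights over all the donor cells is 1 for every observation unit; that is, $\sum_{l=1}^{M_i} w^*_{il(0)} = 1$ for all $i$. The $l$th imputed row for unit $i$ is created by keeping the observed items unchanged, replacing the missing items with the observed levels from the $l$th donor cell, and computing the fractional weight by $w^*_{il(0)}$. Only the complete observations (observations that have no missing items) are used to compute the fractional weights in this step. If unit $i$ has no missing items, then $w^*_{il(0)} = 1$. The initial FEFI data set contains all the observed units, the imputed rows for observations that have missing items, and the corresponding fractional weights.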
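The initialization step can be sketched in Python as follows. This is a minimal illustration, not the PROC SURVEYIMPUTE implementation: the function name `initial_fractional_weights`, the use of `None` to mark a missing item, and the toy data are all assumptions made for the example. The initial fractional weights are the weighted conditional frequencies of the donor cells among the complete observations.

```python
from collections import defaultdict

def initial_fractional_weights(rows, weights):
    """Return {row index: {donor cell: initial fractional weight}}.

    rows    -- list of tuples of categorical levels; None marks a missing item
    weights -- observation weights, one per row
    """
    # Weighted frequency of each fully observed combination (complete cases only).
    joint = defaultdict(float)
    for row, w in zip(rows, weights):
        if None not in row:
            joint[row] += w

    result = {}
    for i, row in enumerate(rows):
        if None not in row:
            result[i] = {row: 1.0}  # nothing to impute
            continue
        # Donor cells: observed combinations that agree with the observed items.
        donors = {cell: f for cell, f in joint.items()
                  if all(o is None or o == c for o, c in zip(row, cell))}
        total = sum(donors.values())
        # With no matching donor cell the unit is left unimputed (empty dict).
        result[i] = {cell: f / total for cell, f in donors.items()} if total else {}
    return result

# Hypothetical toy data: two categorical items, unit weights.
rows = [(0, 0), (0, 0), (0, 1), (1, 1), (0, None)]
fw = initial_fractional_weights(rows, [1.0] * len(rows))
print(fw[4])  # donor cells (0, 0) and (0, 1) with weights 2/3 and 1/3
```

The last row has its first item observed at level 0, so only the two complete combinations that start with 0 qualify as donor cells, and their weights sum to 1.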
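Step 2 (M-step): The $t$th maximization step (M-step) computes the joint probabilities by using the fractional weights from the $(t-1)$th expectation step,

$$\hat{\pi}^{(t)}(\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(l)}_{i,\mathrm{mis}}) = \frac{\sum_{k} \sum_{h=1}^{M_k} w_k \, w^*_{kh(t-1)} \, I\{(\mathbf{x}_{k,\mathrm{obs}}, \mathbf{x}^{*(h)}_{k,\mathrm{mis}}) = (\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(l)}_{i,\mathrm{mis}})\}}{\sum_{k} w_k}$$

for all $i$, all $l$, and $t = 1, 2, \ldots$, where $w_k$ is the observation weight for unit $k$ and $w^*_{kh(t-1)}$ is the fractional weight from donor cell $h$ to unit $k$ after step $t-1$. Note that for $t = 1$, $\hat{\pi}^{(1)}$ uses all observation units, including observations that have missing items that are imputed in the initialization step.

Step 3 (E-step): The $t$th expectation step (E-step) computes the fractional weights by using the joint probabilities from the $t$th M-step. The $t$th fractional weight for unit $i$ and donor cell $l$ is given by

$$w^*_{il(t)} = \frac{\hat{\pi}^{(t)}(\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(l)}_{i,\mathrm{mis}})}{\sum_{h=1}^{M_i} \hat{\pi}^{(t)}(\mathbf{x}_{i,\mathrm{obs}}, \mathbf{x}^{*(h)}_{i,\mathrm{mis}})}$$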
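Step 4 (Repetition): The expectation maximization steps (EM-steps, steps 2 and 3) are repeated for $t = 1, 2, \ldots$ until the changes in fractional weights over all observation units between two successive EM-steps are negligible or the maximum number of EM repetitions is reached.

The maximum absolute difference convergence criterion, $\epsilon_a^{(t)}$, at step $t$ is defined as

$$\epsilon_a^{(t)} = \max_{i,l} \left| w^*_{il(t)} - w^*_{il(t-1)} \right|$$

The maximum absolute relative difference convergence criterion, $\epsilon_r^{(t)}$, at step $t$ is defined as

$$\epsilon_r^{(t)} = \max_{i,l} \frac{\left| w^*_{il(t)} - w^*_{il(t-1)} \right|}{w^*_{il(t-1)}}$$

Here $w^*_{il(t)}$ is the fractional weight from donor cell $l$ to unit $i$ after the $t$th E-step.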
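The two convergence criteria can be illustrated with a short Python sketch. The function names and the sample weights are hypothetical, and the sketch assumes that the relative criterion divides by the fractional weight from the previous step.

```python
def max_abs_diff(new, old):
    """Largest absolute change in any fractional weight between two EM steps."""
    return max(abs(new[k] - old[k]) for k in new)

def max_abs_rel_diff(new, old):
    """Largest absolute change relative to the previous fractional weight.

    Assumes every previous weight is positive, which holds for weights that
    come from observed donor cells.
    """
    return max(abs(new[k] - old[k]) / old[k] for k in new)

# Hypothetical fractional weights keyed by (unit, donor cell) at two
# successive EM steps.
old = {(5, 1): 0.5714, (5, 2): 0.4286}
new = {(5, 1): 0.5439, (5, 2): 0.4561}
print(max_abs_diff(new, old), max_abs_rel_diff(new, old))
```

Both weights change by the same absolute amount here, so the relative criterion is driven by the smaller of the two previous weights.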
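Step 5 (Second-stage imputation): The second-stage imputation replaces the missing values in the continuous variables by using the observed values within each selected first-stage donor cell. This step is similar to step 1 but is applied to impute the continuous variables.

For a particular observation unit $i$, let $\mathbf{x}^{*(l)}_i$ be the $l$th donor cell from the first-stage imputation, where $l$ ranges from 1 to $M_i$. For each observation unit, $i$, the possible number of second-stage donor cells is equal to the number of unique combinations of the observed levels for the missing items in $\mathbf{y}_i$ from the responding units in the first-stage donor cell $\mathbf{x}^{*(l)}_i$.

Let $\pi_{\mathbf{y} \mid \mathbf{x}}$ be the population proportion that falls in category $\mathbf{y}$ within the first-stage donor cell $\mathbf{x}$. Assume that it is possible to estimate the population categories from the observed sample. For example, the conditional probability, $\pi(\mathbf{y}_{i,\mathrm{mis}} \mid \mathbf{x}^{*(l)}_i)$, is the same for the observed data as it is for the data in which $\mathbf{y}_{i,\mathrm{mis}}$ are missing. The conditional probabilities are estimated by

$$\hat{\pi}(\mathbf{y}_{\mathrm{mis}} \mid \mathbf{x}) = \frac{\hat{\pi}(\mathbf{y}_{\mathrm{mis}}, \mathbf{x})}{\sum_{\mathbf{y}'_{\mathrm{mis}}} \hat{\pi}(\mathbf{y}'_{\mathrm{mis}}, \mathbf{x})}$$

where

$$\hat{\pi}(\mathbf{y}_{\mathrm{mis}}, \mathbf{x}) = \frac{\sum_{i} w_i \, I(\mathbf{y}_{i,\mathrm{mis}} = \mathbf{y}_{\mathrm{mis}}, \mathbf{x}_i = \mathbf{x})}{\sum_{i} w_i}$$

is the estimated joint probability, $I(\cdot)$ is an indicator function, and $w_i$ is the observation weight for unit $i$.

Let $M_{il}$ be the number of observed combinations of $\mathbf{y}_{i,\mathrm{mis}}$ in the sample within the first-stage donor cell $\mathbf{x}^{*(l)}_i$. Let $\mathbf{y}^{*(m)}_{i,\mathrm{mis}}$ be the $m$th realization of $\mathbf{y}_{i,\mathrm{mis}}$ in the sample. You must assume that at least one realization is available; otherwise, missing values in the continuous items for the observation are not imputed.

Compute the second-stage fractional weight from the second-stage donor cell $m$ conditional on the first-stage donor cell $l$ for unit $i$, $w^*_{im \mid l}$:

$$w^*_{im \mid l} = \hat{\pi}(\mathbf{y}^{*(m)}_{i,\mathrm{mis}} \mid \mathbf{x}^{*(l)}_i), \quad m = 1, \ldots, M_{il}$$

where $M_{il}$ is the number of second-stage donor cells and

$$\hat{\pi}(\mathbf{y}^{*(m)}_{i,\mathrm{mis}} \mid \mathbf{x}^{*(l)}_i) = \frac{\hat{\pi}(\mathbf{y}^{*(m)}_{i,\mathrm{mis}}, \mathbf{x}^{*(l)}_i)}{\sum_{h=1}^{M_{il}} \hat{\pi}(\mathbf{y}^{*(h)}_{i,\mathrm{mis}}, \mathbf{x}^{*(l)}_i)}$$

The sum of the second-stage fractional weights over all second-stage donor cells is 1 for every observation unit; that is, $\sum_{m=1}^{M_{il}} w^*_{im \mid l} = 1$ for all $l$ and $i$. The $m$th second-stage imputed row in the $l$th first-stage imputed row for unit $i$ is created by keeping the observed items unchanged, replacing the missing items in $\mathbf{y}_i$ with the observed values from the $m$th second-stage donor cell, and computing the two-stage fractional weight by $w^*_{il} \, w^*_{im \mid l}$, where $w^*_{il}$ is the first-stage fractional weight for the first-stage donor cell $l$. The maximum number of donor cells for unit $i$ is $\sum_{l=1}^{M_i} M_{il}$. Only the complete observations are used to compute the second-stage fractional weights.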
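The second-stage computation for a single first-stage donor cell can be sketched in Python as follows. The function name `second_stage_weights` and the donor values are hypothetical; the sketch only illustrates how one first-stage fractional weight is split across the unique observed values in the cell in proportion to their weighted frequencies.

```python
from collections import defaultdict

def second_stage_weights(donor_values, donor_weights, first_stage_weight):
    """Split one first-stage fractional weight across second-stage donor cells.

    donor_values       -- observed values (or tuples of values) of the missing
                          continuous items among complete units in the cell
    donor_weights      -- observation weights of those donor units
    first_stage_weight -- fractional weight of the first-stage donor cell
    """
    freq = defaultdict(float)
    for value, w in zip(donor_values, donor_weights):
        freq[value] += w  # one second-stage donor cell per unique value
    total = sum(freq.values())
    # Two-stage weight = first-stage weight times second-stage weight.
    return {value: first_stage_weight * f / total for value, f in freq.items()}

# Hypothetical donor cell: four equally weighted donors, one repeated value.
two_stage = second_stage_weights([2.5, 4.0, 4.0, 7.1], [1, 1, 1, 1], 0.6)
print(two_stage)
```

The repeated value forms a single donor cell that receives twice the weight of the other two, and the two-stage weights sum to the first-stage weight of the cell.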
The imputation-adjusted replicate weights are created by using the following:
The small data set shown in Figure 11 is used to illustrate the imputation technique. The data set contains 18 observation units, and each unit has four items (X, CX, Y, and CY). The variable Unit contains the observation identification. Variables CX and CY contain the imputation bins for variables X and Y, respectively. In this example, X and CX are missing for units 14 and 18, and Y and CY are missing for units 5 and 18.
Figure 11: Sample Data with Missing Items
| Unit | X | CX | Y | CY |
|---|---|---|---|---|
| 1 | 0.3 | 0 | -0.54 | 0 |
| 2 | 0.2 | 0 | -0.77 | 0 |
| 3 | 1.7 | 0 | -0.59 | 0 |
| 4 | 1.7 | 0 | -0.59 | 0 |
| 5 | 1.0 | 0 | . | . |
| 6 | 1.8 | 0 | -0.03 | 1 |
| 7 | 2.0 | 0 | 0.95 | 1 |
| 8 | 1.9 | 0 | 0.78 | 1 |
| 9 | 6.7 | 1 | -0.15 | 0 |
| 10 | 6.0 | 1 | -1.01 | 0 |
| 11 | 3.3 | 1 | -1.86 | 0 |
| 12 | 7.3 | 1 | -0.21 | 0 |
| 13 | 6.7 | 1 | 0.80 | 1 |
| 14 | . | . | 1.23 | 1 |
| 15 | 2.9 | 1 | 0.65 | 1 |
| 16 | 9.6 | 1 | 0.95 | 1 |
| 17 | 10.0 | 1 | 0.13 | 1 |
| 18 | . | . | . | . |
The following statements request joint imputation of X and Y by using the two-stage FEFI method. Two CLEVVAR= options specify variables CX and CY, which contain the imputation bins for variables X and Y, respectively. The following statements also request imputation-adjusted replicate weights for the jackknife replication method. The OUTPUT statement stores the imputed values in the Imputed data set and stores the jackknife coefficients in the OJKC data set. The FRACTIONALWEIGHTS= option in the OUTPUT statement saves the fractional weights in the Imputed data set.
proc surveyimpute data=Example method=fefi;
   var X (clevvar=CX) Y (clevvar=CY);
   output out=Imputed fractionalweights=FracWt outjkcoefs=OJKC;
run;
The first-stage FEFI imputes the imputation bin variables CX and CY by using the FEFI method. The imputed data set after the first-stage imputation is displayed in Figure 12. Variables X and Y are not imputed in the first-stage imputation.
Figure 12: First-Stage Fractional Imputation
| Unit | ImpIndex | ImpWt | FracWt | X | CX | Y | CY |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 1.0000 | 1.0000 | 0.3 | 0 | -0.54 | 0 |
| 2 | 0 | 1.0000 | 1.0000 | 0.2 | 0 | -0.77 | 0 |
| 3 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 |
| 4 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 |
| 5 | 1 | 0.5360 | 0.5360 | 1.0 | 0 | . | 0 |
| 5 | 2 | 0.4640 | 0.4640 | 1.0 | 0 | . | 1 |
| 6 | 0 | 1.0000 | 1.0000 | 1.8 | 0 | -0.03 | 1 |
| 7 | 0 | 1.0000 | 1.0000 | 2.0 | 0 | 0.95 | 1 |
| 8 | 0 | 1.0000 | 1.0000 | 1.9 | 0 | 0.78 | 1 |
| 9 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | -0.15 | 0 |
| 10 | 0 | 1.0000 | 1.0000 | 6.0 | 1 | -1.01 | 0 |
| 11 | 0 | 1.0000 | 1.0000 | 3.3 | 1 | -1.86 | 0 |
| 12 | 0 | 1.0000 | 1.0000 | 7.3 | 1 | -0.21 | 0 |
| 13 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | 0.80 | 1 |
| 14 | 1 | 0.4640 | 0.4640 | . | 0 | 1.23 | 1 |
| 14 | 2 | 0.5360 | 0.5360 | . | 1 | 1.23 | 1 |
| 15 | 0 | 1.0000 | 1.0000 | 2.9 | 1 | 0.65 | 1 |
| 16 | 0 | 1.0000 | 1.0000 | 9.6 | 1 | 0.95 | 1 |
| 17 | 0 | 1.0000 | 1.0000 | 10.0 | 1 | 0.13 | 1 |
| 18 | 1 | 0.2668 | 0.2668 | . | 0 | . | 0 |
| 18 | 2 | 0.2310 | 0.2310 | . | 0 | . | 1 |
| 18 | 3 | 0.2353 | 0.2353 | . | 1 | . | 0 |
| 18 | 4 | 0.2668 | 0.2668 | . | 1 | . | 1 |
The first-stage FEFI is described as follows:
Observation unit 1 has no missing value. Therefore, the ImpIndex value is 0; the FracWt value is 1; and the values of X, CX, Y, and CY are the same as the observed values for observation unit 1 in Figure 12. Because all observation units have a weight of 1, the fractional weights (FracWt) and the imputation-adjusted weights (ImpWt) are the same for all rows.
Observation unit 5 has missing values in Y and CY. In the first stage, only CY is imputed, conditional on the observed level of CX. The observed level of CX for observation unit 5 is 0. For CX=0, two levels of CY are observed: CY=0 and CY=1. Therefore, observation unit 5 receives two donor cells (ImpIndex=1 and ImpIndex=2). The fractional weights for these two donor cells are computed by applying FEFI to variables CX and CY. For more information about FEFI, see the section Example of FEFI. The fractional weights after the first-stage imputation are 0.536 and 0.464. Because CX is observed, the CX values in both rows for observation unit 5 are the same as the observed value. However, the first recipient row for observation unit 5 has an imputed CY value of 0, the second recipient row has an imputed CY value of 1, and each of these rows has a corresponding fractional weight. Because no imputation is performed for Y in the first stage, both rows for observation unit 5 contain missing values for Y.
Observation unit 14 has missing values in X and CX. In the first stage, only CX is imputed, conditional on the observed level of CY. The observed level of CY for unit 14 is 1. For CY=1, two levels of CX are observed: CX=0 and CX=1. Therefore, observation unit 14 receives two donor cells (ImpIndex=1 and ImpIndex=2). The fractional weights for these two donor cells are computed by applying FEFI to variables CX and CY. For more information about FEFI, see the section Example of FEFI. The fractional weights after the first-stage imputation are 0.464 and 0.536. Because CY is observed, the CY values in both rows for unit 14 are the same as the observed value. However, the first recipient row for unit 14 has an imputed CX value of 0, the second recipient row has an imputed CX value of 1, and each of these rows has a corresponding fractional weight. Because no imputation is performed for X in the first stage, both rows for unit 14 contain missing values for X.
Observation unit 18 has missing values in all four variables: X, CX, Y, and CY. Only variables CX and CY are imputed in the first stage. From the observed data, CX and CY can take the following values: (CX=0, CY=0), (CX=0, CY=1), (CX=1, CY=0), and (CX=1, CY=1). The four imputed rows (ImpIndex=1, ImpIndex=2, ImpIndex=3, and ImpIndex=4) for observation unit 18 represent the four observed combinations of CX and CY along with their fractional weights. The fractional weights for these four donor cells are computed by applying FEFI to variables CX and CY. For more information about FEFI, see the section Example of FEFI.
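The first-stage fractional weights in Figure 12 can be approximated with a short Python sketch of EM by weighting on the bins (CX, CY). This is an illustration under stated assumptions, not the procedure's implementation: the function names, the `None` marker for missing bins, and the stopping rule (maximum absolute difference below a hypothetical tolerance `tol`) are choices made for the example. Because the sketch iterates to a tight tolerance, its results can differ from Figure 12 in the last displayed digit.

```python
from collections import defaultdict

# Imputation bins (CX, CY) from Figure 11; None marks a missing bin.
bins = {1: (0, 0), 2: (0, 0), 3: (0, 0), 4: (0, 0), 5: (0, None),
        6: (0, 1), 7: (0, 1), 8: (0, 1), 9: (1, 0), 10: (1, 0),
        11: (1, 0), 12: (1, 0), 13: (1, 1), 14: (None, 1), 15: (1, 1),
        16: (1, 1), 17: (1, 1), 18: (None, None)}

def matches(row, cell):
    """True if a donor cell agrees with every observed item of the row."""
    return all(o is None or o == c for o, c in zip(row, cell))

def conditional(row, joint):
    """Fractional weights: joint frequencies renormalized over matching cells."""
    donors = {c: p for c, p in joint.items() if matches(row, c)}
    total = sum(donors.values())
    return {c: p / total for c, p in donors.items()}

def first_stage_fefi(bins, weights, tol=1e-8, max_iter=1000):
    # Initialization: joint frequencies from complete observations only.
    joint = defaultdict(float)
    for u, row in bins.items():
        if None not in row:
            joint[row] += weights[u]
    frac = {u: conditional(row, joint) for u, row in bins.items()}
    for _ in range(max_iter):
        # M-step: joint frequencies from all units and current fractional weights.
        joint = defaultdict(float)
        for u, cells in frac.items():
            for c, f in cells.items():
                joint[c] += weights[u] * f
        # E-step: update the fractional weights from the new joint frequencies.
        new = {u: conditional(row, joint) for u, row in bins.items()}
        change = max(abs(new[u][c] - frac[u][c]) for u in new for c in new[u])
        frac = new
        if change < tol:
            break
    return frac

frac = first_stage_fefi(bins, {u: 1.0 for u in bins})
for u in (5, 14, 18):
    print(u, {c: round(f, 4) for c, f in frac[u].items()})
```

For units 5 and 14 the sketch yields weights of approximately 0.536 and 0.464, and for unit 18 the four weights agree with Figure 12 to about three decimal places.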
The second-stage FEFI imputes the missing values in X and Y conditional on the imputed levels for imputation bin variables CX and CY from the first-stage imputation. The imputed data set after the second-stage imputation is displayed in Figure 13. Variables X and Y are imputed in the second stage.
Figure 13: Two-Stage Fractional Imputation
| Unit | ImpIndex | ImpWt | FracWt | X | CX | Y | CY |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 1.0000 | 1.0000 | 0.3 | 0 | -0.54 | 0 |
| 2 | 0 | 1.0000 | 1.0000 | 0.2 | 0 | -0.77 | 0 |
| 3 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 |
| 4 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 |
| 5 | 1 | 0.1340 | 0.1340 | 1.0 | 0 | -0.77 | 0 |
| 5 | 2 | 0.1340 | 0.1340 | 1.0 | 0 | -0.54 | 0 |
| 5 | 3 | 0.2680 | 0.2680 | 1.0 | 0 | -0.59 | 0 |
| 5 | 4 | 0.1547 | 0.1547 | 1.0 | 0 | -0.03 | 1 |
| 5 | 5 | 0.1547 | 0.1547 | 1.0 | 0 | 0.78 | 1 |
| 5 | 6 | 0.1547 | 0.1547 | 1.0 | 0 | 0.95 | 1 |
| 6 | 0 | 1.0000 | 1.0000 | 1.8 | 0 | -0.03 | 1 |
| 7 | 0 | 1.0000 | 1.0000 | 2.0 | 0 | 0.95 | 1 |
| 8 | 0 | 1.0000 | 1.0000 | 1.9 | 0 | 0.78 | 1 |
| 9 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | -0.15 | 0 |
| 10 | 0 | 1.0000 | 1.0000 | 6.0 | 1 | -1.01 | 0 |
| 11 | 0 | 1.0000 | 1.0000 | 3.3 | 1 | -1.86 | 0 |
| 12 | 0 | 1.0000 | 1.0000 | 7.3 | 1 | -0.21 | 0 |
| 13 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | 0.80 | 1 |
| 14 | 1 | 0.1547 | 0.1547 | 1.8 | 0 | 1.23 | 1 |
| 14 | 2 | 0.1547 | 0.1547 | 1.9 | 0 | 1.23 | 1 |
| 14 | 3 | 0.1547 | 0.1547 | 2.0 | 0 | 1.23 | 1 |
| 14 | 4 | 0.1340 | 0.1340 | 2.9 | 1 | 1.23 | 1 |
| 14 | 5 | 0.1340 | 0.1340 | 6.7 | 1 | 1.23 | 1 |
| 14 | 6 | 0.1340 | 0.1340 | 9.6 | 1 | 1.23 | 1 |
| 14 | 7 | 0.1340 | 0.1340 | 10.0 | 1 | 1.23 | 1 |
| 15 | 0 | 1.0000 | 1.0000 | 2.9 | 1 | 0.65 | 1 |
| 16 | 0 | 1.0000 | 1.0000 | 9.6 | 1 | 0.95 | 1 |
| 17 | 0 | 1.0000 | 1.0000 | 10.0 | 1 | 0.13 | 1 |
| 18 | 1 | 0.0667 | 0.0667 | 0.2 | 0 | -0.77 | 0 |
| 18 | 2 | 0.0667 | 0.0667 | 0.3 | 0 | -0.54 | 0 |
| 18 | 3 | 0.1334 | 0.1334 | 1.7 | 0 | -0.59 | 0 |
| 18 | 4 | 0.0770 | 0.0770 | 1.8 | 0 | -0.03 | 1 |
| 18 | 5 | 0.0770 | 0.0770 | 1.9 | 0 | 0.78 | 1 |
| 18 | 6 | 0.0770 | 0.0770 | 2.0 | 0 | 0.95 | 1 |
| 18 | 7 | 0.0588 | 0.0588 | 3.3 | 1 | -1.86 | 0 |
| 18 | 8 | 0.0588 | 0.0588 | 6.0 | 1 | -1.01 | 0 |
| 18 | 9 | 0.0588 | 0.0588 | 6.7 | 1 | -0.15 | 0 |
| 18 | 10 | 0.0588 | 0.0588 | 7.3 | 1 | -0.21 | 0 |
| 18 | 11 | 0.0667 | 0.0667 | 2.9 | 1 | 0.65 | 1 |
| 18 | 12 | 0.0667 | 0.0667 | 6.7 | 1 | 0.80 | 1 |
| 18 | 13 | 0.0667 | 0.0667 | 9.6 | 1 | 0.95 | 1 |
| 18 | 14 | 0.0667 | 0.0667 | 10.0 | 1 | 0.13 | 1 |
The second-stage FEFI is described as follows:
Observation unit 1 has no missing value. Therefore, the ImpIndex value is 0; the FracWt value is 1; and the values of X, CX, Y, and CY are the same as the observed values for observation unit 1 in Figure 13. Because all observation units have a weight of 1, the fractional weights (FracWt) and the imputation-adjusted weights (ImpWt) are the same for all rows.
Observation unit 5 has missing values in Y and CY. The variable CY has two imputed levels (0 and 1) from the first-stage imputation. The observed level of CX for observation unit 5 is 0.
The row that contains Unit 5 and ImpIndex=1 in Figure 12 has CX=0 and CY=0. Units 1, 2, 3, and 4 in the complete data have CX=0 and CY=0. These units are possible donors for a missing Y when CX=0 and CY=0. The four donor units have three unique values for Y: –0.54, –0.59, and –0.77. These three unique values define three donor cells for imputing Y when CX=0 and CY=0. The missing value in Y for CX=0 and CY=0 is replaced by all three possible observed values. Because the weight for the donor cell that is defined by Y=–0.59 is double the weight for the other two donor cells, the imputed row that contains Y=–0.59 is assigned a second-stage fractional weight of 1/2 and the other two rows are each assigned a second-stage fractional weight of 1/4. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.536) for CX=0, CY=0, and ImpIndex=1 to obtain the two-stage FEFI weight.
The row that contains Unit 5 and ImpIndex=2 in Figure 12 has CX=0 and CY=1. Units 6, 7, and 8 in the complete data have CX=0 and CY=1. These units are possible donors for a missing Y when CX=0 and CY=1. The three donor units have three unique values for Y: –0.03, 0.95, and 0.78. These three unique values define three donor cells for imputing Y when CX=0 and CY=1. The missing value in Y for CX=0 and CY=1 is replaced by all three possible observed values. Because all three donor cells have equal weights, all three imputed rows are assigned a second-stage fractional weight of 1/3. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.464) for CX=0, CY=1, and ImpIndex=2 to obtain the two-stage FEFI weight.
Therefore, after two-stage FEFI, missing values in Y are replaced by six imputed values that have fractional weights proportional to the observed weighted frequencies of the second-stage donor cells conditional on the first-stage FEFI.
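The arithmetic for unit 5 can be checked with a few lines of Python. The sketch takes the first-stage fractional weights from Figure 12 and the donor values from Figure 11 as given; the variable names are illustrative only.

```python
from collections import Counter

# Y values of the complete donors that share unit 5's observed bin CX=0.
donor_y = {0: [-0.54, -0.77, -0.59, -0.59],   # CY=0: units 1 to 4
           1: [-0.03, 0.95, 0.78]}            # CY=1: units 6 to 8
first_stage = {0: 0.5360, 1: 0.4640}          # FracWt for unit 5 in Figure 12

rows = []
for cy, values in donor_y.items():
    counts = Counter(values)                  # every donor has weight 1
    for y in sorted(counts):
        second_stage = counts[y] / len(values)
        rows.append((cy, y, round(first_stage[cy] * second_stage, 4)))

for row in rows:
    print(row)
```

The six resulting weights (0.134, 0.268, and 0.134 for CY=0, and 0.1547 three times for CY=1) match the FracWt values for unit 5 in Figure 13.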
Observation unit 14 has missing values in X and CX. The variable CX has two imputed levels (0 and 1) from the first-stage imputation. The observed level for CY for unit 14 is 1.
The row that contains Unit 14 and ImpIndex=1 in Figure 12 has CX=0 and CY=1. Units 6, 7, and 8 in the complete data have CX=0 and CY=1. These units are possible donors for a missing X when CX=0 and CY=1. The three donor units have three unique values for X: 1.8, 1.9, and 2.0. These three unique values define three donor cells for imputing X when CX=0 and CY=1. The missing value in X for CX=0 and CY=1 is replaced by all three possible observed values. Because all three donor cells have equal weights, all three imputed rows are assigned a second-stage fractional weight of 1/3. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.464) for CX=0, CY=1, and ImpIndex=1 to obtain the two-stage FEFI weight.
The row that contains Unit 14 and ImpIndex=2 in Figure 12 has CX=1 and CY=1. Units 13, 15, 16, and 17 in the complete data have CX=1 and CY=1. These units are possible donors for a missing X when CX=1 and CY=1. The four donor units have four unique values for X: 2.9, 6.7, 9.6, and 10.0. These four unique values define four donor cells for imputing X when CX=1 and CY=1. The missing value in X for CX=1 and CY=1 is replaced by all four possible observed values. Because all four donor cells have equal weights, all four imputed rows are assigned a second-stage fractional weight of 1/4. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.536) for CX=1, CY=1, and ImpIndex=2 to obtain the two-stage FEFI weight.
Therefore, after two-stage FEFI, missing values in X are replaced by seven imputed values that have fractional weights proportional to the observed weighted frequencies of the second-stage donor cells conditional on the first-stage FEFI.
Observation unit 18 has missing values in all four variables X, CX, Y, and CY. The variable CX has two imputed levels (0 and 1), and the variable CY has two imputed levels (0 and 1) from the first-stage imputation.
The row that contains Unit 18 and ImpIndex=1 in Figure 12 has CX=0 and CY=0. Units 1, 2, 3, and 4 in the complete data have CX=0 and CY=0. These units are possible donors for a missing (X, Y) when CX=0 and CY=0. The four donor units have three unique values for (X, Y): (0.3, –0.54), (0.2, –0.77), and (1.7, –0.59). These three unique values define three donor cells for imputing (X, Y) when CX=0 and CY=0. The missing value in (X, Y) for CX=0 and CY=0 is replaced by all three possible observed values. Because the weight for the donor cell that is defined by (X, Y) = (1.7, –0.59) is double the weight for the other two donor cells, the imputed row that contains X=1.7 and Y=–0.59 is assigned a second-stage fractional weight of 1/2 and the other two rows are each assigned a second-stage fractional weight of 1/4. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.2668) for CX=0, CY=0, and ImpIndex=1 to obtain the two-stage FEFI weight.
The row that contains Unit 18 and ImpIndex=2 in Figure 12 has CX=0 and CY=1. Units 6, 7, and 8 in the complete data have CX=0 and CY=1. These units are possible donors for a missing (X, Y) when CX=0 and CY=1. The three donor units have three unique values for (X, Y): (1.8, –0.03), (1.9, 0.78), and (2.0, 0.95). These three unique values define three donor cells for imputing (X, Y) when CX=0 and CY=1. The missing value in (X, Y) for CX=0 and CY=1 is replaced by all three possible observed values. Because all three donor cells have equal weights, all three imputed rows are assigned a second-stage fractional weight of 1/3. The second-stage fractional weight is then multiplied by the first-stage FEFI weight (0.2310) for CX=0, CY=1, and ImpIndex=2 to obtain the two-stage FEFI weight.
Missing values in (X, Y) in the rows for unit 18 that have ImpIndex=3 and ImpIndex=4 in Figure 12 are imputed similarly.
Thus, after two-stage FEFI, missing values in (X, Y) are replaced by 14 imputed values that have fractional weights proportional to the observed weighted frequencies of the second-stage donor cells conditional on the first-stage FEFI.
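The donor-cell counts in this example (six for unit 5, seven for unit 14, and 14 for unit 18) can be verified directly from the data in Figure 11. The following Python sketch is an illustration only; the helper name `imputed_row_count` is hypothetical, and the sketch assumes, as in this example, that an imputation bin is missing exactly when its continuous variable is missing.

```python
# (Unit, X, CX, Y, CY) from Figure 11; None marks a missing value.
data = [
    (1, 0.3, 0, -0.54, 0), (2, 0.2, 0, -0.77, 0), (3, 1.7, 0, -0.59, 0),
    (4, 1.7, 0, -0.59, 0), (5, 1.0, 0, None, None), (6, 1.8, 0, -0.03, 1),
    (7, 2.0, 0, 0.95, 1), (8, 1.9, 0, 0.78, 1), (9, 6.7, 1, -0.15, 0),
    (10, 6.0, 1, -1.01, 0), (11, 3.3, 1, -1.86, 0), (12, 7.3, 1, -0.21, 0),
    (13, 6.7, 1, 0.80, 1), (14, None, None, 1.23, 1), (15, 2.9, 1, 0.65, 1),
    (16, 9.6, 1, 0.95, 1), (17, 10.0, 1, 0.13, 1),
    (18, None, None, None, None),
]
complete = [r for r in data if None not in r]

def imputed_row_count(unit_row):
    """Number of rows the unit contributes to the two-stage FEFI data set."""
    _, x, cx, y, cy = unit_row
    if None not in unit_row:
        return 1
    # Donors: complete units whose bins agree with the recipient's observed bins.
    donors = [d for d in complete
              if (cx is None or d[2] == cx) and (cy is None or d[4] == cy)]
    # One row per unique (first-stage cell, donated continuous values) combination.
    cells = {(d[2], d[4],
              d[1] if x is None else None,
              d[3] if y is None else None) for d in donors}
    return len(cells)

counts = {r[0]: imputed_row_count(r) for r in data}
print(counts[5], counts[14], counts[18], sum(counts.values()))  # 6 7 14 42
```

Units 3 and 4 share the same (X, Y) pair, which is why unit 18 receives 14 rows rather than 15.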
The resulting data set has 42 rows: 15 rows for fully observed units (ImpIndex=0), six rows for unit 5, seven rows for unit 14, and 14 rows for unit 18. The sum of the fractional weights is 1 for every unit. The imputation-adjusted replicate weights are computed by applying the first-stage and the second-stage imputation independently in each replicate sample as discussed in the previous list. The imputed data set along with the first four imputation-adjusted replicate weights is displayed in Figure 14.
Figure 14: Two-Stage Fractional Imputation with the Imputation-Adjusted Replicate Weights
| Unit | ImpIndex | ImpWt | FracWt | X | CX | Y | CY | ImpRepWt_1 | ImpRepWt_2 | ImpRepWt_3 | ImpRepWt_4 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 1.0000 | 1.0000 | 0.3 | 0 | -0.54 | 0 | 0 | 1.0588 | 1.0588 | 1.0588 |
| 2 | 0 | 1.0000 | 1.0000 | 0.2 | 0 | -0.77 | 0 | 1.0588 | 0 | 1.0588 | 1.0588 |
| 3 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 | 1.0588 | 1.0588 | 0 | 1.0588 |
| 4 | 0 | 1.0000 | 1.0000 | 1.7 | 0 | -0.59 | 0 | 1.0588 | 1.0588 | 1.0588 | 0 |
| 5 | 1 | 0.1340 | 0.1340 | 1.0 | 0 | -0.77 | 0 | 0.1637 | 0 | 0.1637 | 0.1637 |
| 5 | 2 | 0.1340 | 0.1340 | 1.0 | 0 | -0.54 | 0 | 0 | 0.1637 | 0.1637 | 0.1637 |
| 5 | 3 | 0.2680 | 0.2680 | 1.0 | 0 | -0.59 | 0 | 0.3274 | 0.3274 | 0.1637 | 0.1637 |
| 5 | 4 | 0.1547 | 0.1547 | 1.0 | 0 | -0.03 | 1 | 0.1893 | 0.1893 | 0.1893 | 0.1893 |
| 5 | 5 | 0.1547 | 0.1547 | 1.0 | 0 | 0.78 | 1 | 0.1893 | 0.1893 | 0.1893 | 0.1893 |
| 5 | 6 | 0.1547 | 0.1547 | 1.0 | 0 | 0.95 | 1 | 0.1893 | 0.1893 | 0.1893 | 0.1893 |
| 6 | 0 | 1.0000 | 1.0000 | 1.8 | 0 | -0.03 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 7 | 0 | 1.0000 | 1.0000 | 2.0 | 0 | 0.95 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 8 | 0 | 1.0000 | 1.0000 | 1.9 | 0 | 0.78 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 9 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | -0.15 | 0 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 10 | 0 | 1.0000 | 1.0000 | 6.0 | 1 | -1.01 | 0 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 11 | 0 | 1.0000 | 1.0000 | 3.3 | 1 | -1.86 | 0 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 12 | 0 | 1.0000 | 1.0000 | 7.3 | 1 | -0.21 | 0 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 13 | 0 | 1.0000 | 1.0000 | 6.7 | 1 | 0.80 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 14 | 1 | 0.1547 | 0.1547 | 1.8 | 0 | 1.23 | 1 | 0.1656 | 0.1656 | 0.1656 | 0.1656 |
| 14 | 2 | 0.1547 | 0.1547 | 1.9 | 0 | 1.23 | 1 | 0.1656 | 0.1656 | 0.1656 | 0.1656 |
| 14 | 3 | 0.1547 | 0.1547 | 2.0 | 0 | 1.23 | 1 | 0.1656 | 0.1656 | 0.1656 | 0.1656 |
| 14 | 4 | 0.1340 | 0.1340 | 2.9 | 1 | 1.23 | 1 | 0.1405 | 0.1405 | 0.1405 | 0.1405 |
| 14 | 5 | 0.1340 | 0.1340 | 6.7 | 1 | 1.23 | 1 | 0.1405 | 0.1405 | 0.1405 | 0.1405 |
| 14 | 6 | 0.1340 | 0.1340 | 9.6 | 1 | 1.23 | 1 | 0.1405 | 0.1405 | 0.1405 | 0.1405 |
| 14 | 7 | 0.1340 | 0.1340 | 10.0 | 1 | 1.23 | 1 | 0.1405 | 0.1405 | 0.1405 | 0.1405 |
| 15 | 0 | 1.0000 | 1.0000 | 2.9 | 1 | 0.65 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 16 | 0 | 1.0000 | 1.0000 | 9.6 | 1 | 0.95 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 17 | 0 | 1.0000 | 1.0000 | 10.0 | 1 | 0.13 | 1 | 1.0588 | 1.0588 | 1.0588 | 1.0588 |
| 18 | 1 | 0.0667 | 0.0667 | 0.2 | 0 | -0.77 | 0 | 0.0764 | 0 | 0.0764 | 0.0764 |
| 18 | 2 | 0.0667 | 0.0667 | 0.3 | 0 | -0.54 | 0 | 0 | 0.0764 | 0.0764 | 0.0764 |
| 18 | 3 | 0.1334 | 0.1334 | 1.7 | 0 | -0.59 | 0 | 0.1528 | 0.1528 | 0.0764 | 0.0764 |
| 18 | 4 | 0.0770 | 0.0770 | 1.8 | 0 | -0.03 | 1 | 0.0884 | 0.0884 | 0.0884 | 0.0884 |
| 18 | 5 | 0.0770 | 0.0770 | 1.9 | 0 | 0.78 | 1 | 0.0884 | 0.0884 | 0.0884 | 0.0884 |
| 18 | 6 | 0.0770 | 0.0770 | 2.0 | 0 | 0.95 | 1 | 0.0884 | 0.0884 | 0.0884 | 0.0884 |
| 18 | 7 | 0.0588 | 0.0588 | 3.3 | 1 | -1.86 | 0 | 0.0662 | 0.0662 | 0.0662 | 0.0662 |
| 18 | 8 | 0.0588 | 0.0588 | 6.0 | 1 | -1.01 | 0 | 0.0662 | 0.0662 | 0.0662 | 0.0662 |
| 18 | 9 | 0.0588 | 0.0588 | 6.7 | 1 | -0.15 | 0 | 0.0662 | 0.0662 | 0.0662 | 0.0662 |
| 18 | 10 | 0.0588 | 0.0588 | 7.3 | 1 | -0.21 | 0 | 0.0662 | 0.0662 | 0.0662 | 0.0662 |
| 18 | 11 | 0.0667 | 0.0667 | 2.9 | 1 | 0.65 | 1 | 0.0750 | 0.0750 | 0.0750 | 0.0750 |
| 18 | 12 | 0.0667 | 0.0667 | 6.7 | 1 | 0.80 | 1 | 0.0750 | 0.0750 | 0.0750 | 0.0750 |
| 18 | 13 | 0.0667 | 0.0667 | 9.6 | 1 | 0.95 | 1 | 0.0750 | 0.0750 | 0.0750 | 0.0750 |
| 18 | 14 | 0.0667 | 0.0667 | 10.0 | 1 | 0.13 | 1 | 0.0750 | 0.0750 | 0.0750 | 0.0750 |
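The first replicate column in Figure 14 can be approximated by repeating both imputation stages with replicate weights. The following Python sketch is a rough check, not the procedure's algorithm. It assumes a delete-one jackknife in which replicate 1 sets the weight of unit 1 to 0 and multiplies every other weight by 18/17 (an inference from the ImpRepWt_1 column), and it uses a hypothetical function `two_stage_weights` with an arbitrary tight tolerance, so the last displayed digit can differ from the figure.

```python
from collections import defaultdict

# (Unit, X, CX, Y, CY) from Figure 11; None marks a missing value.
data = [
    (1, 0.3, 0, -0.54, 0), (2, 0.2, 0, -0.77, 0), (3, 1.7, 0, -0.59, 0),
    (4, 1.7, 0, -0.59, 0), (5, 1.0, 0, None, None), (6, 1.8, 0, -0.03, 1),
    (7, 2.0, 0, 0.95, 1), (8, 1.9, 0, 0.78, 1), (9, 6.7, 1, -0.15, 0),
    (10, 6.0, 1, -1.01, 0), (11, 3.3, 1, -1.86, 0), (12, 7.3, 1, -0.21, 0),
    (13, 6.7, 1, 0.80, 1), (14, None, None, 1.23, 1), (15, 2.9, 1, 0.65, 1),
    (16, 9.6, 1, 0.95, 1), (17, 10.0, 1, 0.13, 1),
    (18, None, None, None, None),
]

def conditional(bins, joint):
    """Renormalize joint frequencies over cells that agree with observed bins."""
    donors = {c: p for c, p in joint.items()
              if all(o is None or o == v for o, v in zip(bins, c))}
    total = sum(donors.values())
    return {c: p / total for c, p in donors.items()}

def two_stage_weights(data, w, tol=1e-8, max_iter=1000):
    """Return {unit: {(x, cx, y, cy): weight}} for observation weights w."""
    bins = {r[0]: (r[2], r[4]) for r in data}
    complete = [r for r in data if None not in r]
    # First stage: EM by weighting on the bins (CX, CY).
    joint = defaultdict(float)
    for r in complete:
        joint[(r[2], r[4])] += w[r[0]]
    frac = {u: conditional(b, joint) for u, b in bins.items()}
    for _ in range(max_iter):
        joint = defaultdict(float)
        for u, cells in frac.items():
            for c, f in cells.items():
                joint[c] += w[u] * f
        new = {u: conditional(b, joint) for u, b in bins.items()}
        change = max(abs(new[u][c] - frac[u][c]) for u in new for c in new[u])
        frac = new
        if change < tol:
            break
    # Second stage: split each first-stage weight over the donated values.
    out = {}
    for u, x, cx, y, cy in data:
        out[u] = defaultdict(float)
        for cell, f in frac[u].items():
            donors = [d for d in complete if (d[2], d[4]) == cell]
            total = sum(w[d[0]] for d in donors)
            if total == 0:
                continue  # no donor weight left in this cell
            for d in donors:
                row = (d[1] if x is None else x, cell[0],
                       d[3] if y is None else y, cell[1])
                out[u][row] += w[u] * f * w[d[0]] / total
    return out

n = len(data)
base = {r[0]: 1.0 for r in data}
rep1 = {u: (0.0 if u == 1 else n / (n - 1)) for u in base}  # delete unit 1
imp, rep = two_stage_weights(data, base), two_stage_weights(data, rep1)
for row in imp[5]:
    print(row, round(imp[5][row], 4), round(rep[5][row], 4))
```

For unit 5, the row that takes its Y value from the deleted unit 1 (Y=-0.54) receives a replicate weight of 0, and the remaining replicate weights agree with the ImpRepWt_1 column to about three decimal places.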