The SURVEYFREQ Procedure

Fay’s BRR Method

When you specify the FAY method-option for VARMETHOD=BRR, PROC SURVEYFREQ uses Fay’s BRR method, which is a modification of the traditional BRR variance estimation method. Fay’s BRR method also requires a stratified sample design with two PSUs in each stratum. You can provide replicate weights by using a REPWEIGHTS statement, or the procedure can construct replicate weights for the analysis. PROC SURVEYFREQ estimates the parameter of interest (a proportion, total, odds ratio, or other statistic) from each replicate, and then uses the variability among replicate estimates to estimate the overall variance of the parameter estimate.

Replicate Weight Construction

If you do not provide replicate weights by using a REPWEIGHTS statement, PROC SURVEYFREQ constructs replicates based on the stratified design with two PSUs in each stratum. As for traditional BRR, the number of replicates R is the smallest multiple of 4 that is greater than the number of strata H, or you can specify the number of replicates with the REPS= method-option. You can provide a Hadamard matrix for replicate construction in the HADAMARD= method-option, or PROC SURVEYFREQ generates an appropriate Hadamard matrix.
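The default replicate count described above can be checked with a few lines of arithmetic. This is a minimal sketch, not part of SAS; the function name `default_brr_replicates` is hypothetical and used only for illustration.

```python
def default_brr_replicates(num_strata: int) -> int:
    """Smallest multiple of 4 that is strictly greater than num_strata."""
    if num_strata < 1:
        raise ValueError("num_strata must be a positive integer")
    # Integer division finds the multiple of 4 at or below H; adding one
    # step of 4 guarantees the result is strictly greater than H.
    return 4 * (num_strata // 4 + 1)


for h in (3, 4, 7, 8):
    print(h, default_brr_replicates(h))
```

Note that a stratum count that is already a multiple of 4 (for example, H = 8) moves up to the next multiple (R = 12), because R must exceed H.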

The traditional BRR method constructs half-sample replicates by deleting one PSU per stratum according to the Hadamard matrix and doubling the original weights to form replicate weights. Fay’s BRR method instead adjusts the original weights by a coefficient $\epsilon$, where $0 \le \epsilon < 1$. You can specify the value of $\epsilon$ with the FAY= method-option. If you do not specify the value of $\epsilon$, PROC SURVEYFREQ uses $\epsilon = 0.5$ by default. For information about the value of the Fay coefficient, see Judkins (1990) and Rao and Shao (1999). When $\epsilon = 0$, Fay’s method becomes the traditional BRR method. For more information, see Dippo, Fay, and Morganstein (1984), Fay (1989), and Judkins (1990).

PROC SURVEYFREQ constructs Fay BRR replicates by using the first H columns of the Hadamard matrix, where H denotes the number of strata. The rth replicate ($r = 1, 2, \ldots, R$) is drawn from the full sample according to the rth row of the Hadamard matrix as follows:

If element (r, h) of the Hadamard matrix is 1, the sampling weight of the first PSU in stratum h is multiplied by $\epsilon$, and the sampling weight of the second PSU is multiplied by $2 - \epsilon$ to form the rth replicate weights.
If element (r, h) of the Hadamard matrix is –1, then the sampling weight of the second PSU in stratum h is multiplied by $\epsilon$, and the sampling weight of the first PSU is multiplied by $2 - \epsilon$ to form the rth replicate weights.
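The construction rule can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure's implementation: it assumes the two multipliers are the Fay coefficient `eps` and `2 - eps`, and it builds the Hadamard matrix by Sylvester doubling, which works only for orders that are powers of 2 and may differ from the matrix that PROC SURVEYFREQ generates. The function names are hypothetical.

```python
def sylvester_hadamard(order):
    """Hadamard matrix of a power-of-2 order, built by Sylvester doubling."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of 2")
    matrix = [[1]]
    while len(matrix) < order:
        # Doubling step: [[M, M], [M, -M]]
        matrix = ([row + row for row in matrix]
                  + [row + [-x for x in row] for row in matrix])
    return matrix


def fay_replicate_weights(psu_weights, hadamard, eps=0.5):
    """Return one list of adjusted (first PSU, second PSU) weights per replicate.

    psu_weights holds one (w_first, w_second) pair per stratum.
    Assumed multipliers: eps and 2 - eps.
    """
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    num_strata = len(psu_weights)
    if any(len(row) < num_strata for row in hadamard):
        raise ValueError("Hadamard matrix needs at least H columns")
    replicates = []
    for row in hadamard:
        rep = []
        for h in range(num_strata):  # only the first H columns are used
            w_first, w_second = psu_weights[h]
            if row[h] == 1:
                rep.append((w_first * eps, w_second * (2 - eps)))
            else:
                rep.append((w_first * (2 - eps), w_second * eps))
        replicates.append(rep)
    return replicates


weights = [(10.0, 10.0), (20.0, 20.0), (5.0, 5.0)]  # H = 3 strata
reps = fay_replicate_weights(weights, sylvester_hadamard(4), eps=0.5)
print(reps[0])  # [(5.0, 15.0), (10.0, 30.0), (2.5, 7.5)]
```

Setting `eps=0` gives multipliers 0 and 2, which drops one PSU per stratum and doubles the other, as in the traditional BRR method.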

You can use the OUTWEIGHTS= method-option to store the replicate weights in a SAS data set. For information about the contents of the OUTWEIGHTS= data set, see the section Replicate Weight Output Data Set. You can provide these replicate weights to the procedure for subsequent analyses by using a REPWEIGHTS statement.

Variance Estimation

Let $\theta$ denote the population parameter to be estimated (for example, a proportion, total, odds ratio, or other statistic). Let $\hat{\theta}$ denote the estimate of $\theta$ from the full sample, and let $\hat{\theta}_r$ denote the estimate from the rth BRR replicate, which is computed by using the replicate weights. The Fay BRR variance estimate for $\hat{\theta}$ is computed as

$$
\hat{V}(\hat{\theta}) = \frac{1}{R\,(1-\epsilon)^2} \sum_{r=1}^{R} \left( \hat{\theta}_r - \hat{\theta} \right)^2
$$

where R is the total number of replicates and $\epsilon$ is the Fay coefficient.
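As a numerical check of this expression, the following sketch computes the variance estimate from a full-sample estimate and a list of replicate estimates. The function name and the sample values are hypothetical and chosen only for illustration.

```python
def fay_brr_variance(theta_full, theta_reps, eps=0.5):
    """Fay BRR variance estimate centered on the full-sample estimate."""
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    if not theta_reps:
        raise ValueError("at least one replicate estimate is required")
    num_reps = len(theta_reps)
    # Sum of squared deviations of replicate estimates from the full-sample estimate
    sum_sq = sum((t - theta_full) ** 2 for t in theta_reps)
    return sum_sq / (num_reps * (1 - eps) ** 2)


print(fay_brr_variance(10.0, [9.0, 11.0, 10.5, 9.5], eps=0.5))  # 2.5
```

The factor $(1-\epsilon)^2$ in the denominator compensates for the smaller perturbation of the weights: with the same replicate estimates and `eps=0`, the function returns 0.625.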

If you request Fay’s BRR method and also include a REPWEIGHTS statement, PROC SURVEYFREQ uses the replicate weights that you provide and includes the Fay coefficient in the denominator of the variance estimate in the preceding expression.

If you specify the CENTER=REPLICATES method-option, the Fay BRR variance estimate is computed as

$$
\hat{V}(\hat{\theta}) = \frac{1}{R\,(1-\epsilon)^2} \sum_{r=1}^{R} \left( \hat{\theta}_r - \bar{\theta} \right)^2
$$

where $\bar{\theta}$ is the average of the replicate estimates and is computed as follows:

$$
\bar{\theta} = \frac{1}{R} \sum_{r=1}^{R} \hat{\theta}_r
$$
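The effect of centering on the replicate average rather than on the full-sample estimate can be seen in a small sketch. It assumes the centered estimate keeps the same $R(1-\epsilon)^2$ denominator; the function name and the values are hypothetical.

```python
def fay_brr_variance_centered(theta_reps, eps=0.5):
    """Fay BRR variance estimate centered on the mean of the replicate estimates."""
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    if not theta_reps:
        raise ValueError("at least one replicate estimate is required")
    num_reps = len(theta_reps)
    theta_bar = sum(theta_reps) / num_reps  # average of the replicate estimates
    sum_sq = sum((t - theta_bar) ** 2 for t in theta_reps)
    return sum_sq / (num_reps * (1 - eps) ** 2)


reps = [9.0, 11.0, 10.5, 10.5]  # replicate average is 10.25
print(fay_brr_variance_centered(reps, eps=0.5))  # 2.25
```

For these replicate estimates and a full-sample estimate of 10.0, centering on the full-sample estimate would give 2.5. Centering on the replicate average never gives a larger value, because the average minimizes the sum of squared deviations.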

If a parameter cannot be estimated from one or more replicates, the variance estimate is computed by using only those replicates from which the parameter can be estimated. For example, suppose the parameter is a column proportion, that is, the proportion of column j for table cell (i, j). If a replicate r contains no observations in column j, then the column j proportion is not estimable from replicate r. In this case, the Fay BRR variance estimate is computed as

$$
\hat{V}(\hat{\theta}) = \frac{1}{R'\,(1-\epsilon)^2} \sum_{r=1}^{R'} \left( \hat{\theta}_r - \hat{\theta} \right)^2
$$

where the summation is over the replicates for which the parameter is estimable and where $R'$ is the number of those replicates.
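The handling of non-estimable replicates can be sketched as follows. Representing a non-estimable replicate as `None` is an assumption of this sketch, as are the function name and the values.

```python
def fay_brr_variance_estimable(theta_full, theta_reps, eps=0.5):
    """Fay BRR variance estimate over the estimable replicates only.

    A replicate from which the parameter cannot be estimated is passed as None.
    """
    if not 0 <= eps < 1:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    usable = [t for t in theta_reps if t is not None]
    if not usable:
        raise ValueError("the parameter is not estimable from any replicate")
    num_usable = len(usable)  # plays the role of R' in the expression above
    sum_sq = sum((t - theta_full) ** 2 for t in usable)
    return sum_sq / (num_usable * (1 - eps) ** 2)


# The second replicate has no observations in the column of interest.
print(fay_brr_variance_estimable(10.0, [9.0, None, 11.0, 10.5], eps=0.5))  # 3.0
```

The divisor uses the count of usable replicates rather than the total number of replicates, so dropped replicates do not shrink the estimate.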

Last updated: December 09, 2022