The SURVEYPHREG Procedure

Balanced Repeated Replication (BRR) Method

The balanced repeated replication (BRR) method requires that the full sample be drawn by using a stratified sample design with two primary sampling units (PSUs) per stratum. The BRR method constructs half-sample replicates by deleting one PSU per stratum according to a Hadamard matrix and doubling the original weight of the other PSU in that stratum. Let $H$ be the total number of strata. By default, the total number of replicates $R$ is the smallest multiple of 4 that is greater than $H$. If you prefer a larger number of replicates, you can specify the REPS=n method-option. If an $n \times n$ Hadamard matrix cannot be constructed, the number of replicates is increased until a Hadamard matrix becomes available.
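The default replicate count described above can be illustrated with a short Python sketch. The function name `default_brr_replicates` is hypothetical and is not part of SAS. The sketch only applies the "smallest multiple of 4 greater than H" rule and does not model the fallback that increases the count when a Hadamard matrix of that size cannot be constructed.

```python
def default_brr_replicates(num_strata: int) -> int:
    """Smallest multiple of 4 that is strictly greater than num_strata.

    Hypothetical helper for illustration only; it ignores the fallback
    used when no Hadamard matrix of the resulting size is available.
    """
    if num_strata < 1:
        raise ValueError("num_strata must be a positive integer")
    return 4 * (num_strata // 4 + 1)


for h in (3, 4, 7, 8):
    print(h, default_brr_replicates(h))
```

Note that the rule is strict: a design with exactly 4 strata gets 8 replicates under this reading, not 4.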

Each replicate is obtained by deleting one PSU per stratum according to a corresponding Hadamard matrix and adjusting the original weights for the remaining PSUs. The new weights are called replicate weights.

Replicates are constructed by using the first $H$ columns of the $R \times R$ Hadamard matrix. The $r$th ($r = 1, 2, \ldots, R$) replicate is drawn from the full sample according to the $r$th row of the Hadamard matrix as follows:

  • If the $(r, h)$ element of the Hadamard matrix is 1, then the first PSU of stratum $h$ is included in the $r$th replicate and the second PSU of stratum $h$ is excluded.

  • If the $(r, h)$ element of the Hadamard matrix is –1, then the second PSU of stratum $h$ is included in the $r$th replicate and the first PSU of stratum $h$ is excluded.

The replicate weights of the remaining PSUs in each half sample are then set to twice their original weights. For more detail about the BRR method, see Wolter (2007) and Lohr (2010).
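The replicate construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's implementation: the function name `brr_replicate_weights`, the 0-based stratum index, and the coding of PSUs as 1 or 2 are all hypothetical conventions chosen for the example.

```python
def brr_replicate_weights(hadamard, strata, psu, weights):
    """Return one list of replicate weights per row of the Hadamard matrix.

    hadamard : R x R list of lists with entries +1 or -1
    strata   : 0-based stratum index (< H <= R) for each observation
    psu      : 1 or 2, the PSU of each observation within its stratum
    weights  : original sampling weights
    """
    replicate_weights = []
    for row in hadamard:
        rep = []
        for h, p, w in zip(strata, psu, weights):
            # +1 keeps the first PSU of stratum h, -1 keeps the second.
            kept_psu = 1 if row[h] == 1 else 2
            # Kept PSUs get twice the original weight; deleted PSUs get 0.
            rep.append(2.0 * w if p == kept_psu else 0.0)
        replicate_weights.append(rep)
    return replicate_weights


# Three strata (H = 3), so only the first 3 columns of a 4 x 4 matrix are used.
H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
strata = [0, 0, 1, 1, 2, 2]
psu = [1, 2, 1, 2, 1, 2]
weights = [10.0, 12.0, 8.0, 9.0, 15.0, 11.0]

reps = brr_replicate_weights(H4, strata, psu, weights)
print(reps[1])  # second replicate: [20.0, 0.0, 0.0, 18.0, 30.0, 0.0]
```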

By default, an appropriate Hadamard matrix is generated automatically to create the replicates. You can display the Hadamard matrix by specifying the VARMETHOD=BRR(PRINTH) method-option. If you provide a Hadamard matrix by specifying the VARMETHOD=BRR(HADAMARD=) method-option, then the replicates are generated according to the provided Hadamard matrix. You can use the VARMETHOD=BRR(OUTWEIGHTS=) method-option to store the replicate weights in a SAS data set.

Let $\hat{\boldsymbol{\beta}}$ be the estimated proportional hazards regression coefficients from the full sample, and let $\hat{\boldsymbol{\beta}}_r$ be the estimated proportional hazards regression coefficients from the $r$th replicate by using replicate weights. PROC SURVEYPHREG estimates the covariance matrix of $\hat{\boldsymbol{\beta}}$ by

$$\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \frac{1}{R} \sum_{r=1}^{R} \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right) \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right)'$$

with $H$ degrees of freedom, where $H$ is the number of strata.
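As a numerical illustration of this estimator, the following Python sketch averages the outer products of the deviations of the replicate estimates from the full-sample estimate. The function name `brr_covariance` and the example coefficient values are made up for the illustration; in practice the replicate estimates come from refitting the model with each set of replicate weights.

```python
def brr_covariance(beta_full, beta_reps):
    """Average of (beta_r - beta)(beta_r - beta)' over all R replicates."""
    num_reps = len(beta_reps)
    p = len(beta_full)
    cov = [[0.0] * p for _ in range(p)]
    for beta_r in beta_reps:
        # Deviation of this replicate estimate from the full-sample estimate.
        d = [b_r - b for b_r, b in zip(beta_r, beta_full)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / num_reps
    return cov


# Made-up estimates for a model with two coefficients and R = 4 replicates.
beta_full = [0.5, -1.0]
beta_reps = [[0.6, -1.1], [0.4, -0.9], [0.5, -1.2], [0.5, -0.8]]
cov = brr_covariance(beta_full, beta_reps)
for row in cov:
    print(row)
```

For these values the diagonal entries are approximately 0.005 and 0.025, and the off-diagonal entries are approximately -0.005.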

If you specify the CENTER=REPLICATES method-option, then PROC SURVEYPHREG computes the covariance matrix of $\hat{\boldsymbol{\beta}}$ by

$$\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \frac{1}{R} \sum_{r=1}^{R} \left(\hat{\boldsymbol{\beta}}_r - \overline{\hat{\boldsymbol{\beta}}_r}\right) \left(\hat{\boldsymbol{\beta}}_r - \overline{\hat{\boldsymbol{\beta}}_r}\right)'$$

where $\overline{\hat{\boldsymbol{\beta}}_r}$ is the average of the replicate estimates as follows:

$$\overline{\hat{\boldsymbol{\beta}}_r} = \frac{1}{R} \sum_{r=1}^{R} \hat{\boldsymbol{\beta}}_r$$
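The replicate-centered version can be sketched in the same way. In this hedged Python illustration, the function name `brr_covariance_centered` and the input values are hypothetical; the only change from the default estimator is that deviations are taken from the mean of the replicate estimates rather than from the full-sample estimate.

```python
def brr_covariance_centered(beta_reps):
    """Covariance centered at the average of the replicate estimates.

    Returns (cov, mean), where mean is the average replicate estimate.
    """
    num_reps = len(beta_reps)
    p = len(beta_reps[0])
    mean = [sum(b[i] for b in beta_reps) / num_reps for i in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for beta_r in beta_reps:
        d = [b_r - m for b_r, m in zip(beta_r, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / num_reps
    return cov, mean


# Made-up replicate estimates: two replicates, two coefficients.
cov_c, mean_c = brr_covariance_centered([[1.0, 0.0], [3.0, 0.0]])
print(mean_c)  # [2.0, 0.0]
print(cov_c)   # [[1.0, 0.0], [0.0, 0.0]]
```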

If one or more components of $\hat{\boldsymbol{\beta}}_r$ cannot be calculated for some replicates, then the variance estimate is computed by using only the replicates for which the proportional hazards regression coefficients can be estimated. Estimability and nonconvergence are the two most common reasons why $\hat{\boldsymbol{\beta}}_r$ might not be available for a replicate sample even if $\hat{\boldsymbol{\beta}}$ is defined for the full sample. Let $R_a$ be the number of replicates where $\hat{\boldsymbol{\beta}}_r$ is available, and let $R - R_a$ be the number of replicates where $\hat{\boldsymbol{\beta}}_r$ is not available. Without loss of generality, assume that $\hat{\boldsymbol{\beta}}_r$ is available only for the first $R_a$ replicates; then the BRR variance estimator is

$$\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \frac{1}{R_a} \sum_{r=1}^{R_a} \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right) \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right)'$$

with degrees of freedom equal to the minimum of $H$ and $R_a$, where $H$ is the number of strata. Alternatively, you can use the FAY= method-option to request Fay’s BRR method, as discussed in the following section.
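The handling of unavailable replicates can be sketched as follows. This Python illustration assumes, purely as a convention for the example, that a failed replicate fit is recorded as `None`; the function name `brr_covariance_available` is hypothetical.

```python
def brr_covariance_available(beta_full, beta_reps, num_strata):
    """BRR covariance that uses only the replicates with an estimate.

    beta_reps may contain None for replicates whose fit failed.
    Returns (cov, df), where df = min(num_strata, R_a).
    """
    available = [b for b in beta_reps if b is not None]
    r_a = len(available)
    if r_a == 0:
        raise ValueError("no replicate estimates are available")
    p = len(beta_full)
    cov = [[0.0] * p for _ in range(p)]
    for beta_r in available:
        d = [b_r - b for b_r, b in zip(beta_r, beta_full)]
        for i in range(p):
            for j in range(p):
                # Divide by R_a, not by the original replicate count R.
                cov[i][j] += d[i] * d[j] / r_a
    return cov, min(num_strata, r_a)


# Made-up example: R = 4 replicates, two of which failed; H = 3 strata.
cov_a, df_a = brr_covariance_available([1.0], [[2.0], None, [0.0], None], 3)
print(cov_a, df_a)  # [[1.0]] 2
```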

Fay’s BRR Method

The traditional BRR method constructs half-sample replicates by deleting one PSU per stratum according to a Hadamard matrix and doubling the original weight of the other PSU. Fay’s BRR method uses the Fay coefficient, $\epsilon$ ($0 \le \epsilon < 1$). Instead of deleting one PSU per stratum, it multiplies the original weight of that PSU by the coefficient $\epsilon$. The original weight of the remaining PSU in that stratum is multiplied by $2 - \epsilon$. PROC SURVEYPHREG uses $\epsilon = 0.5$ as the default value; alternatively, you can specify a value for $\epsilon$ with the FAY= method-option. When $\epsilon = 0$, Fay’s method becomes the traditional BRR method. For more details, see Dippo, Fay, and Morganstein (1984); Fay (1984, 1989); Judkins (1990). Because the traditional BRR method uses only half of the total sample in every replicate, several replicate estimators ($\hat{\boldsymbol{\beta}}_r$) might be undefined even when the full sample estimator ($\hat{\boldsymbol{\beta}}$) is defined. Fay’s BRR method is especially useful for this situation because it uses all the sampled units in every replicate.

Let $\hat{\boldsymbol{\beta}}$ be the estimated proportional hazards regression coefficients from the full sample, and let $\hat{\boldsymbol{\beta}}_r$ be the estimated regression coefficients that are obtained from the $r$th replicate by using replicate weights. PROC SURVEYPHREG estimates the covariance matrix of $\hat{\boldsymbol{\beta}}$ by

$$\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \frac{1}{R(1-\epsilon)^2} \sum_{r=1}^{R} \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right) \left(\hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}}\right)'$$

with $H$ degrees of freedom, where $H$ is the number of strata.
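The following Python sketch illustrates Fay's weight adjustment for a single replicate together with the scale factor $1 / (R(1-\epsilon)^2)$ from the estimator above. The function names `fay_replicate_weights` and `fay_scale`, the 0-based stratum index, and the coding of PSUs as 1 or 2 are hypothetical conventions for this example, not part of the procedure.

```python
def fay_replicate_weights(hadamard_row, strata, psu, weights, fay=0.5):
    """Replicate weights for one Hadamard row under Fay's method.

    The PSU that traditional BRR would delete gets weight fay * w;
    the other PSU in the stratum gets (2 - fay) * w.
    """
    if not 0.0 <= fay < 1.0:
        raise ValueError("the Fay coefficient must satisfy 0 <= fay < 1")
    out = []
    for h, p, w in zip(strata, psu, weights):
        favored_psu = 1 if hadamard_row[h] == 1 else 2
        factor = (2.0 - fay) if p == favored_psu else fay
        out.append(factor * w)
    return out


def fay_scale(num_replicates, fay=0.5):
    """Multiplier 1 / (R * (1 - fay)**2) applied to the sum of outer products."""
    return 1.0 / (num_replicates * (1.0 - fay) ** 2)


row = [1, -1]
strata = [0, 0, 1, 1]
psu = [1, 2, 1, 2]
weights = [10.0, 10.0, 20.0, 20.0]

print(fay_replicate_weights(row, strata, psu, weights, fay=0.5))
# [15.0, 5.0, 10.0, 30.0]
print(fay_replicate_weights(row, strata, psu, weights, fay=0.0))
# [20.0, 0.0, 0.0, 40.0], the traditional BRR weights
print(fay_scale(4, fay=0.5))  # 1.0
```

In both cases the replicate weights within each stratum sum to the original stratum total, and every observation keeps a positive weight whenever the coefficient is greater than 0.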

Hadamard Matrix

PROC SURVEYPHREG uses a Hadamard matrix to construct replicates for BRR variance estimation. You can provide a Hadamard matrix for replicate construction by using the HADAMARD= method-option for VARMETHOD=BRR. Otherwise, PROC SURVEYPHREG generates an appropriate Hadamard matrix. You can display the Hadamard matrix by specifying the PRINTH method-option.

A Hadamard matrix $\mathbf{A}$ of dimension $R$ is a square matrix that has all elements equal to 1 or –1 such that $\mathbf{A}'\mathbf{A} = R\mathbf{I}$, where $\mathbf{I}$ is an identity matrix of appropriate order. The dimension of a Hadamard matrix must equal 1, 2, or a multiple of 4.

For example, the following matrix is a Hadamard matrix of dimension 8:

$$\begin{array}{rrrrrrrr}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{array}$$
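The 8 × 8 example above coincides with the matrix produced by the Sylvester doubling construction, which can be sketched in Python as follows. This is offered only as one well-known way to build Hadamard matrices whose dimension is a power of 2; the source does not state which construction PROC SURVEYPHREG uses, and the function name `sylvester_hadamard` is hypothetical.

```python
def sylvester_hadamard(k):
    """Return a 2**k x 2**k Hadamard matrix by repeated doubling.

    Each step maps a matrix M to the block matrix [[M, M], [M, -M]].
    """
    h = [[1]]
    for _ in range(k):
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h


H8 = sylvester_hadamard(3)
for row in H8:
    print(" ".join(f"{x:2d}" for x in row))
```

This construction covers only dimensions 1, 2, 4, 8, 16, and so on; other multiples of 4, such as 12, require different constructions.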

For BRR replicate construction, the dimension of the Hadamard matrix must be at least H, where H denotes the number of first-stage strata in your design. If a Hadamard matrix of a given dimension exists, it is not necessarily unique. Therefore, if you want to use a specific Hadamard matrix, you must provide the matrix as a SAS data set in the HADAMARD= method-option. You must ensure that the matrix that you provide is actually a Hadamard matrix; PROC SURVEYPHREG does not check the validity of your Hadamard matrix.
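Because the validity of a user-supplied matrix is not checked, it can be worth verifying the defining property before you store the matrix in a SAS data set. The following Python sketch is one way to do that outside SAS; the function name `is_hadamard` is hypothetical, and the matrix is assumed to be available as a list of rows.

```python
def is_hadamard(a):
    """Check that a is square, has entries +1/-1, and satisfies A'A = n I."""
    n = len(a)
    if n == 0 or any(len(row) != n for row in a):
        return False
    if any(x not in (1, -1) for row in a for x in row):
        return False
    # A'A = n I means every pair of distinct columns is orthogonal
    # and every column has squared norm n.
    for i in range(n):
        for j in range(i, n):
            dot = sum(a[k][i] * a[k][j] for k in range(n))
            if dot != (n if i == j else 0):
                return False
    return True


print(is_hadamard([[1, 1], [1, -1]]))  # True
print(is_hadamard([[1, 1], [1, 1]]))   # False
```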

See the section Balanced Repeated Replication (BRR) Method for details about how the Hadamard matrix is used to construct replicates for BRR variance estimation.

Last updated: December 09, 2022