The SURVEYPHREG Procedure

Jackknife Method

The jackknife method of variance estimation deletes one PSU at a time from the full sample to create replicates. This method is also known as the delete-1 jackknife method because it deletes exactly one PSU in every replicate. The total number of replicates $R$ is the same as the total number of PSUs. In each replicate, the sampling weights of the remaining PSUs are modified by the jackknife coefficient $\alpha_r$. The modified weights are called replicate weights.

Let PSU $i$ in stratum $h_r$ be omitted for the $r$th replicate, and let $n_{h_r}$ be the number of PSUs in that stratum; then the jackknife coefficient and replicate weights are computed as

$$
\alpha_r =
\begin{cases}
\dfrac{n_{h_r} - 1}{n_{h_r}} & \text{for a stratified design} \\[2ex]
\dfrac{R - 1}{R} & \text{for designs without stratification}
\end{cases}
$$

and

$$
w_{hij}^{(r)} =
\begin{cases}
w_{hij} & \text{if observation unit } j \text{ is not in donor stratum } h_r \\
0 & \text{if observation unit } j \text{ is in PSU } i \text{ of donor stratum } h_r \\
w_{hij} / \alpha_r & \text{if observation unit } j \text{ is not in PSU } i \text{ but in donor stratum } h_r
\end{cases}
$$
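The two definitions above can be illustrated with a minimal Python sketch. It is not the procedure's implementation: the function name `jackknife_replicates` and the `(stratum, psu, weight)` tuple layout are hypothetical, chosen only for illustration. The sketch assumes that every stratum contains at least two PSUs and that a design without stratification is passed with a single stratum label, in which case $n_{h_r} = R$ and the coefficient reduces to $(R - 1)/R$.

```python
from collections import Counter


def jackknife_replicates(units):
    """Build delete-1 jackknife coefficients and replicate weights.

    units: list of (stratum, psu, weight) tuples, one per observation unit.
    Returns one (alpha_r, replicate_weights) pair per PSU, in first-seen order.
    The tuple layout is an assumption made for this sketch.
    """
    # Each distinct (stratum, PSU) pair defines one replicate.
    psus = list(dict.fromkeys((h, i) for h, i, _ in units))
    n_h = Counter(h for h, _ in psus)  # number of PSUs per stratum
    if not psus or min(n_h.values()) < 2:
        raise ValueError("every stratum needs at least two PSUs")

    replicates = []
    for h_r, i_r in psus:
        alpha = (n_h[h_r] - 1) / n_h[h_r]
        weights = []
        for h, i, w in units:
            if h != h_r:
                weights.append(w)          # unit is not in the donor stratum
            elif i == i_r:
                weights.append(0.0)        # unit is in the deleted PSU
            else:
                weights.append(w / alpha)  # other PSUs of the donor stratum
        replicates.append((alpha, weights))
    return replicates


# Hypothetical design: stratum A has 2 PSUs, stratum B has 3 PSUs.
units = [
    ("A", 1, 10.0), ("A", 1, 10.0), ("A", 2, 20.0),
    ("B", 1, 5.0), ("B", 2, 5.0), ("B", 3, 5.0),
]
for alpha, weights in jackknife_replicates(units):
    print(round(alpha, 4), weights)
```

In this toy design, deleting PSU 1 of stratum A gives a coefficient of 1/2, so the weight of the remaining PSU in stratum A doubles while the weights in stratum B are unchanged.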

You can use the VARMETHOD=JACKKNIFE(OUTJKCOEFS=) method-option to store the jackknife coefficients in a SAS data set and use the VARMETHOD=JACKKNIFE(OUTWEIGHTS=) method-option to store the replicate weights in a SAS data set.

If you provide your own replicate weights with a REPWEIGHTS statement, then you can also provide corresponding jackknife coefficients with the JKCOEFS= option. If you provide replicate weights with a REPWEIGHTS statement but do not provide jackknife coefficients, then the procedure uses $(R - 1)/R$ as the default jackknife coefficient for every replicate, where $R$ is the total number of replicates.
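The default rule described above amounts to the following small Python sketch. The helper name `default_jk_coefs` is hypothetical and does not model the syntax of the JKCOEFS= option; it only shows which coefficient applies when none is supplied.

```python
def default_jk_coefs(n_replicates, coefs=None):
    """Return one jackknife coefficient per replicate.

    coefs: optional list of user-supplied coefficients (hypothetical input
    format for this sketch). When it is omitted, every replicate receives
    the default (R - 1) / R, where R is the number of replicates.
    """
    if coefs is not None:
        if len(coefs) != n_replicates:
            raise ValueError("need exactly one coefficient per replicate")
        return list(coefs)
    return [(n_replicates - 1) / n_replicates] * n_replicates


print(default_jk_coefs(4))
print(default_jk_coefs(2, coefs=[0.5, 0.5]))
```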

Let $\hat{\boldsymbol{\beta}}$ be the estimated proportional hazards regression coefficients from the full sample, and let $\hat{\boldsymbol{\beta}}_r$ be the estimated regression coefficients for the $r$th replicate. PROC SURVEYPHREG estimates the covariance matrix of $\hat{\boldsymbol{\beta}}$ by

$$
\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \sum_{r=1}^{R} \alpha_r \left( \hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}} \right) \left( \hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}} \right)'
$$

with $R - H$ degrees of freedom, where $R$ is the number of replicates and $H$ is the number of strata, or $R - 1$ when there is no stratification.
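As an illustration of this estimator, the following Python sketch accumulates the weighted outer products of the deviations of the replicate estimates from the full-sample estimate. The function names and the list-of-lists matrix representation are assumptions made for the example, and the coefficient values are made up rather than taken from any fitted model.

```python
def jackknife_covariance(beta_full, beta_reps, alphas):
    """Sum of alpha_r * (beta_r - beta)(beta_r - beta)' over all replicates."""
    p = len(beta_full)
    cov = [[0.0] * p for _ in range(p)]
    for alpha, beta_r in zip(alphas, beta_reps):
        dev = [b_r - b for b_r, b in zip(beta_r, beta_full)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += alpha * dev[a] * dev[b]
    return cov


def jackknife_df(n_replicates, n_strata=None):
    """R - H for a stratified design, R - 1 when there is no stratification."""
    if n_strata is None:
        return n_replicates - 1
    return n_replicates - n_strata


# Made-up numbers: 2 strata with 2 PSUs each, so R = 4 and alpha_r = 1/2.
beta_full = [1.0, 2.0]
beta_reps = [[1.5, 2.0], [0.5, 2.5], [1.0, 1.5], [1.0, 2.0]]
alphas = [0.5, 0.5, 0.5, 0.5]

print(jackknife_covariance(beta_full, beta_reps, alphas))
print(jackknife_df(n_replicates=4, n_strata=2))
```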

If you specify the CENTER=REPLICATES method-option, then PROC SURVEYPHREG computes the covariance matrix of $\hat{\boldsymbol{\beta}}$ by

$$
\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \sum_{r=1}^{R} \alpha_r \left( \hat{\boldsymbol{\beta}}_r - \overline{\hat{\boldsymbol{\beta}}_r} \right) \left( \hat{\boldsymbol{\beta}}_r - \overline{\hat{\boldsymbol{\beta}}_r} \right)'
$$

where $\overline{\hat{\boldsymbol{\beta}}_r}$ is the average of the replicate estimates as follows:

$$
\overline{\hat{\boldsymbol{\beta}}_r} = \frac{1}{R} \sum_{r=1}^{R} \hat{\boldsymbol{\beta}}_r
$$
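The replicate-centered variant can be sketched in Python as follows. The only change from centering at the full-sample estimate is that deviations are taken from the average of the replicate estimates. The function names and the numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def replicate_mean(beta_reps):
    """Componentwise average of the replicate estimates."""
    n_reps = len(beta_reps)
    p = len(beta_reps[0])
    return [sum(beta_r[k] for beta_r in beta_reps) / n_reps for k in range(p)]


def jackknife_covariance_centered(beta_reps, alphas):
    """Sum of alpha_r * (beta_r - mean)(beta_r - mean)' over all replicates."""
    center = replicate_mean(beta_reps)
    p = len(center)
    cov = [[0.0] * p for _ in range(p)]
    for alpha, beta_r in zip(alphas, beta_reps):
        dev = [b_r - c for b_r, c in zip(beta_r, center)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += alpha * dev[a] * dev[b]
    return cov


# Made-up replicate estimates for R = 4 replicates with alpha_r = 1/2.
beta_reps = [[1.5, 2.0], [0.5, 2.5], [1.0, 1.5], [2.0, 2.0]]
alphas = [0.5, 0.5, 0.5, 0.5]

print(replicate_mean(beta_reps))
print(jackknife_covariance_centered(beta_reps, alphas))
```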

If one or more components of $\hat{\boldsymbol{\beta}}_r$ cannot be calculated for some replicates, then the variance estimator uses only the replicates for which the proportional hazards regression coefficients can be estimated. Estimability and nonconvergence are two common reasons why $\hat{\boldsymbol{\beta}}_r$ might not be available for a replicate sample even if $\hat{\boldsymbol{\beta}}$ is defined for the full sample. Let $R_a$ be the number of replicates where $\hat{\boldsymbol{\beta}}_r$ are available, and let $R - R_a$ be the number of replicates where $\hat{\boldsymbol{\beta}}_r$ are not available. Without loss of generality, assume that $\hat{\boldsymbol{\beta}}_r$ is available only for the first $R_a$ replicates; then the jackknife variance estimator is

$$
\hat{\mathbf{V}}(\hat{\boldsymbol{\beta}}) = \sum_{r=1}^{R_a} \alpha_r \left( \hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}} \right) \left( \hat{\boldsymbol{\beta}}_r - \hat{\boldsymbol{\beta}} \right)'
$$

with $R_a - H$ degrees of freedom, where $H$ is the number of strata. Alternatively, you can use the VADJUST=AVGREPSS option in the MODEL statement to use the average sum of squares for the invalid replicate samples. See Variance Adjustment Factors for details.
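The default treatment of unavailable replicates can be sketched in Python as follows. Marking an unavailable replicate with `None` is a convention assumed for this example, and the function name is hypothetical. The sketch covers only the default behavior of skipping such replicates; it does not implement the VADJUST=AVGREPSS adjustment.

```python
def jackknife_covariance_available(beta_full, beta_reps, alphas, n_strata):
    """Jackknife covariance that uses only the available replicates.

    beta_reps: list of coefficient vectors; None marks a replicate whose
    estimates could not be computed (an assumed convention for this sketch).
    Returns (covariance matrix, R_a, degrees of freedom R_a - H).
    """
    p = len(beta_full)
    cov = [[0.0] * p for _ in range(p)]
    n_available = 0
    for alpha, beta_r in zip(alphas, beta_reps):
        if beta_r is None:
            continue  # skip replicates without estimates
        n_available += 1
        dev = [b_r - b for b_r, b in zip(beta_r, beta_full)]
        for a in range(p):
            for b in range(p):
                cov[a][b] += alpha * dev[a] * dev[b]
    return cov, n_available, n_available - n_strata


# Made-up numbers: the second of R = 4 replicates failed, so R_a = 3.
beta_full = [1.0, 2.0]
beta_reps = [[1.5, 2.0], None, [1.0, 1.5], [1.0, 2.0]]
alphas = [0.5, 0.5, 0.5, 0.5]

cov, n_available, df = jackknife_covariance_available(
    beta_full, beta_reps, alphas, n_strata=2
)
print(cov, n_available, df)
```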

Last updated: December 09, 2022