The SURVEYSELECT Procedure

Specifying the Margin of Error

Instead of specifying the total sample size to allocate among the strata, you can specify the margin of error for the estimate of the overall mean from the stratified sample. Based on the requested allocation method and the stratum variances that you provide, PROC SURVEYSELECT computes the stratum sample sizes that are required to achieve this margin of error. You specify the margin of error in the MARGIN= option in the STRATA statement, and you provide stratum variances in the VAR= option. You can use the MARGIN= option with any allocation method (proportional, optimal, or Neyman) or with allocation proportions that you provide (ALLOC=(values) or ALLOC=SAS-data-set).

The margin of error $e$ is the half-width of the $100(1-\alpha)\%$ confidence interval for the overall mean based on the stratified sample,

$$e = z_{\alpha/2} \sqrt{\mathrm{Var}(\bar{y}_{str})}$$

where $\mathrm{Var}(\bar{y}_{str})$ is the variance of the estimate of the mean from the stratified sample and $z_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution. You can specify the value of $\alpha$ in the ALPHA= option in the STRATA statement. By default, PROC SURVEYSELECT uses a 95% confidence interval (ALPHA=0.05).
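As a small illustration of this definition, the following Python sketch computes the normal critical value and the resulting margin of error from a given variance of the stratified mean. It is not SAS code and does not reproduce the procedure's internals; the function names `z_critical` and `margin_of_error` are hypothetical, and the variance in the usage line is a made-up value.

```python
import math
from statistics import NormalDist


def z_critical(alpha: float = 0.05) -> float:
    """Return the 100(1 - alpha/2)th percentile of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def margin_of_error(var_mean: float, alpha: float = 0.05) -> float:
    """Half-width of the 100(1 - alpha)% interval: z * sqrt(variance)."""
    return z_critical(alpha) * math.sqrt(var_mean)


# Hypothetical variance of the stratified mean, for illustration only.
print(z_critical(0.05))
print(margin_of_error(0.25))
```

With the default ALPHA=0.05, the critical value is approximately 1.96, so the margin of error is roughly twice the standard error of the estimated mean.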

For the specified margin of error $e$, PROC SURVEYSELECT computes the target stratum sample sizes $n_h^*$ for without-replacement selection methods as

$$n_h^* = f_h^* \left( \sum_{i=1}^{H} \frac{N_i^2 S_i^2}{f_i^*} \right) \Bigg/ \left( \left( \frac{eN}{z_{\alpha/2}} \right)^2 + \sum_{i=1}^{H} N_i S_i^2 \right)$$

where $N_i$ is the number of sampling units in stratum $i$, $S_i^2$ is the variance within stratum $i$, $N$ is the total number of sampling units for all strata, and $H$ is the total number of strata.

The values of $f_h^*$ are the stratum allocation proportions, which PROC SURVEYSELECT computes according to the allocation method that you request. For more information, see the sections Proportional Allocation, Optimal Allocation, and Neyman Allocation.
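The without-replacement formula can be checked numerically. The sketch below is a plain Python transcription of the formula, not the procedure's own code. The function name `target_sizes_wor`, the three-stratum frame, and the choice of proportional allocation ($f_h^* = N_h / N$) are assumptions made for illustration. The formula follows from setting the stratified variance equal to $(e / z_{\alpha/2})^2$ with $n_h = f_h^* n$ and solving for the total sample size $n$.

```python
from statistics import NormalDist


def target_sizes_wor(N_h, S2_h, f_h, e, alpha=0.05):
    """Target (possibly fractional) stratum sizes, without replacement."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    N = sum(N_h)
    numer = sum(Ni**2 * S2i / fi for Ni, S2i, fi in zip(N_h, S2_h, f_h))
    denom = (e * N / z) ** 2 + sum(Ni * S2i for Ni, S2i in zip(N_h, S2_h))
    n_total = numer / denom  # total sample size implied by the margin
    return [fh * n_total for fh in f_h]


# Hypothetical frame: three strata with proportional allocation.
N_h = [500, 300, 200]
S2_h = [16.0, 25.0, 9.0]
f_h = [Nh / sum(N_h) for Nh in N_h]
targets = target_sizes_wor(N_h, S2_h, f_h, e=0.5)
print(targets)
```

Substituting the unrounded targets back into the stratified variance reproduces the requested margin, which is a quick sanity check on any reimplementation.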

For with-replacement selection methods, PROC SURVEYSELECT computes the target stratum sample sizes as

$$n_h^* = f_h^* \left( \sum_{i=1}^{H} \frac{N_i^2 S_i^2}{f_i^*} \right) \Bigg/ \left( \frac{eN}{z_{\alpha/2}} \right)^2$$
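For comparison, here is the same kind of Python sketch for the with-replacement case. As before, this is an illustrative transcription rather than the procedure's implementation; `target_sizes_wr` and the example frame are hypothetical. Because the denominator has no finite population term, the targets are larger than the without-replacement targets for the same inputs.

```python
from statistics import NormalDist


def target_sizes_wr(N_h, S2_h, f_h, e, alpha=0.05):
    """Target (possibly fractional) stratum sizes, with replacement."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    N = sum(N_h)
    numer = sum(Ni**2 * S2i / fi for Ni, S2i, fi in zip(N_h, S2_h, f_h))
    n_total = numer / (e * N / z) ** 2
    return [fh * n_total for fh in f_h]


# Hypothetical frame: three strata with proportional allocation.
N_h = [500, 300, 200]
S2_h = [16.0, 25.0, 9.0]
f_h = [Nh / sum(N_h) for Nh in N_h]
targets_wr = target_sizes_wr(N_h, S2_h, f_h, e=0.5)
print(targets_wr)
```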

For more information, see Lohr (2010, p. 91), Cochran (1977, Chapter 5), and Arkin (1984, Chapter 10).

The target sample size values $n_h^*$ might not be integers, but the stratum sample sizes are required to be integers. PROC SURVEYSELECT rounds all fractional target sample sizes up to integer sample sizes. If you specify a minimum stratum sample size $n_{\min}$ in the ALLOCMIN= option in the STRATA statement, then all stratum sample sizes $n_h$ are required to be at least $n_{\min}$.
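The rounding rule is simple enough to express in a few lines. The following Python sketch assumes that a minimum is applied after rounding up; the function name `round_up_sizes` and its `n_min` parameter (standing in for ALLOCMIN=) are hypothetical, and a value of 0 means that no minimum was requested in this sketch.

```python
import math


def round_up_sizes(targets, n_min=0):
    """Round fractional targets up, then enforce a minimum stratum size."""
    return [max(math.ceil(t), n_min) for t in targets]


# Hypothetical targets: the last stratum falls below the minimum of 2.
print(round_up_sizes([104.2, 63.0, 1.3], n_min=2))  # prints [105, 63, 2]
```

Both adjustments can only increase a stratum sample size, so they can only reduce the variance of the estimate.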

For without-replacement selection methods, a stratum sample size cannot exceed the number of units in the stratum. If a target stratum sample size does exceed the number of units in the stratum, the procedure sets $n_h = N_h$ for that stratum, removes the stratum from the variance computation (because it contributes nothing to the sampling error), revises the allocation proportions $f_h^*$ for the remaining strata, and computes the stratum sample sizes again. If a stratum sample size equals the number of units in its stratum, the procedure also removes that stratum from the variance computation and revises the sample sizes for the remaining strata. For more information, see Cochran (1977, p. 104) and Arkin (1984, p. 176).
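One way to picture this recomputation is as a loop that fixes completely sampled strata and reallocates among the rest. The Python sketch below is an interpretation of the paragraph above, not the procedure's documented algorithm. In particular, it assumes that the allocation proportions are revised by renormalizing them over the remaining strata, and the function name `allocate_with_caps` and the example frame (Neyman-style proportions, proportional to $N_h S_h$) are hypothetical.

```python
import math
from statistics import NormalDist


def allocate_with_caps(N_h, S2_h, f_h, e, alpha=0.05):
    """Sketch: iterate the without-replacement formula, capping n_h at N_h."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    N = sum(N_h)  # total population size stays fixed across iterations
    capped = set()  # strata that are sampled completely (n_h = N_h)
    targets = {}
    while len(capped) < len(N_h):
        active = [i for i in range(len(N_h)) if i not in capped]
        # Assumption: revise proportions by renormalizing over active strata.
        f_sum = sum(f_h[i] for i in active)
        f_act = {i: f_h[i] / f_sum for i in active}
        # Fully sampled strata add no variance, so sums use active strata only.
        numer = sum(N_h[i] ** 2 * S2_h[i] / f_act[i] for i in active)
        denom = (e * N / z) ** 2 + sum(N_h[i] * S2_h[i] for i in active)
        targets = {i: f_act[i] * numer / denom for i in active}
        newly_capped = [i for i in active if targets[i] >= N_h[i]]
        if not newly_capped:
            break
        capped.update(newly_capped)
    return [N_h[i] if i in capped else math.ceil(targets[i])
            for i in range(len(N_h))]


# Hypothetical frame: a small, highly variable stratum is sampled completely.
N_h = [20, 1000, 2000]
S2_h = [400.0, 4.0, 9.0]
weights = [Nh * math.sqrt(S2) for Nh, S2 in zip(N_h, S2_h)]
f_h = [w / sum(weights) for w in weights]
sizes = allocate_with_caps(N_h, S2_h, f_h, e=0.1)
print(sizes)
```

In this example the first stratum's initial target exceeds its 20 units, so it is taken in full and the remaining two strata absorb the sample size needed to reach the requested margin.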

When you specify the STATS option with the MARGIN= option in the STRATA statement, PROC SURVEYSELECT displays the expected margin of error for the sample allocation. The expected margin of error (for the overall mean based on the stratified sample) is computed from the stratum sizes ($N_i$), the stratum variances that you provide ($S_i^2$), and the allocated stratum sample sizes that the procedure computes ($n_i$). For without-replacement selection methods, the expected margin of error is

$$e = \frac{z_{\alpha/2}}{N} \sqrt{\sum_{i=1}^{H} \frac{N_i^2 S_i^2}{n_i} \left( 1 - \frac{n_i}{N_i} \right)}$$

For with-replacement selection methods, the expected margin of error is

$$e = \frac{z_{\alpha/2}}{N} \sqrt{\sum_{i=1}^{H} \frac{N_i^2 S_i^2}{n_i}}$$

The expected margin of error should be less than or equal to the value specified in the MARGIN= option. Any difference between the expected margin and the specified value is due to rounding the target stratum sample sizes up to integer values and increasing stratum sample sizes to equal the required minimum value (ALLOCMIN=).
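The expected margin of error can be recomputed from an allocation with a short Python sketch. This is an illustration under stated assumptions, not the procedure's code: the function name `expected_margin` is hypothetical, the without-replacement branch assumes the usual stratum-level finite population correction $1 - n_i / N_i$ (the form consistent with the target sample size formula), and the frame and sample sizes are made-up values chosen to sit slightly above the targets for a requested margin of 0.5.

```python
import math
from statistics import NormalDist


def expected_margin(N_h, S2_h, n_h, alpha=0.05, with_replacement=False):
    """Expected margin of error for the stratified mean, given an allocation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    N = sum(N_h)
    total = 0.0
    for Ni, S2i, ni in zip(N_h, S2_h, n_h):
        term = Ni**2 * S2i / ni
        if not with_replacement:
            term *= 1 - ni / Ni  # finite population correction (assumed form)
        total += term
    return (z / N) * math.sqrt(total)


# Hypothetical frame and rounded-up stratum sample sizes.
N_h = [500, 300, 200]
S2_h = [16.0, 25.0, 9.0]
n_h = [106, 64, 43]
print(expected_margin(N_h, S2_h, n_h))
print(expected_margin(N_h, S2_h, n_h, with_replacement=True))
```

For the same allocation, the with-replacement margin is the larger of the two, because it omits the finite population correction.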

Last updated: December 09, 2022