The SURVEYSELECT procedure provides a variety of methods for selecting probability-based random samples. The procedure can select a simple random sample or can sample according to a complex multistage sample design that includes stratification, clustering, and unequal probabilities of selection. When you use probability sampling, each unit in the survey population has a known, positive probability of selection. This property of probability sampling avoids selection bias and enables you to use statistical theory to make valid inferences from the sample to the survey population.
PROC SURVEYSELECT provides methods for both equal-probability sampling and probability proportional to size (PPS) sampling. Available equal-probability sampling methods include simple random sampling (without replacement) and unrestricted random sampling (with replacement) in addition to systematic, sequential, Bernoulli, and balanced bootstrap selection.
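The difference between simple random sampling (without replacement) and unrestricted random sampling (with replacement) can be illustrated outside SAS with a minimal Python sketch. It is not PROC SURVEYSELECT code and does not reproduce the procedure's random number streams; the function names, the population, and the seed value are assumptions chosen for illustration.

```python
import random

def simple_random_sample(units, n, seed):
    """Equal-probability selection without replacement:
    each unit can appear in the sample at most once."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def unrestricted_random_sample(units, n, seed):
    """Equal-probability selection with replacement:
    the same unit can be selected more than once."""
    rng = random.Random(seed)
    return rng.choices(units, k=n)

# Hypothetical survey population of 20 unit identifiers.
population = list(range(1, 21))

srs = simple_random_sample(population, 5, seed=1953)
urs = unrestricted_random_sample(population, 5, seed=1953)

print("without replacement:", srs)
print("with replacement:   ", urs)
```

In the sample drawn without replacement, all five identifiers are distinct by construction. In the sample drawn with replacement, repeats are possible, although a particular draw might not contain any.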
In PPS sampling, a unit’s selection probability is proportional to its size measure. PPS sampling is often used in cluster sampling, where you select clusters (groups of sampling units) of varying size in the first stage of selection. Available PPS methods include sampling without replacement, sampling with replacement, systematic sampling, and sequential sampling with minimum replacement. PROC SURVEYSELECT can apply these selection methods for stratified, clustered, and replicated sample designs.
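As a worked illustration of "proportional to its size measure", the following Python sketch computes the selection probability of each unit for a fixed-size PPS design without replacement, using the textbook relationship: probability = n × size / (total size). It is a sketch under that assumption, not the procedure's implementation; the function name and the cluster sizes are hypothetical.

```python
def pps_selection_probabilities(sizes, n):
    """Return selection probabilities proportional to the size measures
    for a sample of n units selected without replacement.

    Assumes the textbook form n * size / sum(sizes). A unit whose
    relative size exceeds 1/n would get a probability above 1, so the
    function rejects that case instead of silently truncating it.
    """
    total = sum(sizes)
    probs = [n * s / total for s in sizes]
    if any(p > 1 for p in probs):
        raise ValueError("a unit is too large for this sample size")
    return probs

# Hypothetical clusters with size measures (for example, households).
cluster_sizes = [50, 100, 150, 200]
probs = pps_selection_probabilities(cluster_sizes, n=2)

for size, p in zip(cluster_sizes, probs):
    print(f"size={size:3d}  selection probability={p:.2f}")
```

With these sizes, the largest cluster is four times as likely to be selected as the smallest one, and the probabilities sum to the sample size of 2.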
For stratified sampling, PROC SURVEYSELECT provides survey design methods to allocate the total sample size among the strata. Available allocation methods include proportional, Neyman, and optimal allocation. Optimal allocation maximizes the estimation precision within the available resources by taking into account stratum sizes, costs, and variances.
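The three allocation methods can be compared with a short Python sketch that uses the standard textbook formulas: proportional allocation weights each stratum by its size N_h, Neyman allocation by N_h × S_h (S_h is the stratum standard deviation), and optimal allocation by N_h × S_h / sqrt(c_h) (c_h is the per-unit cost). This is an illustration under those assumptions, not the procedure's code. It leaves the allocations fractional and does not attempt to reproduce how PROC SURVEYSELECT rounds them or enforces minimum stratum sample sizes. All names and numbers are hypothetical.

```python
import math

def allocate(total_n, weights):
    """Split total_n among strata in proportion to the given weights."""
    total = sum(weights)
    return [total_n * w / total for w in weights]

def proportional_allocation(total_n, sizes):
    # Weight = stratum size N_h.
    return allocate(total_n, sizes)

def neyman_allocation(total_n, sizes, std_devs):
    # Weight = N_h * S_h.
    return allocate(total_n, [N * S for N, S in zip(sizes, std_devs)])

def optimal_allocation(total_n, sizes, std_devs, costs):
    # Weight = N_h * S_h / sqrt(c_h).
    weights = [N * S / math.sqrt(c)
               for N, S, c in zip(sizes, std_devs, costs)]
    return allocate(total_n, weights)

# Hypothetical strata: sizes, standard deviations, per-unit costs.
sizes = [500, 300, 200]
std_devs = [10, 20, 40]
costs = [1, 4, 16]

prop = proportional_allocation(100, sizes)
neyman = neyman_allocation(100, sizes, std_devs)
optimal = optimal_allocation(100, sizes, std_devs, costs)

print("proportional:", [round(x, 1) for x in prop])
print("Neyman:      ", [round(x, 1) for x in neyman])
print("optimal:     ", [round(x, 1) for x in optimal])
```

In this example, Neyman allocation shifts sample toward the small but highly variable third stratum, and including costs shifts it back because that stratum is also the most expensive to sample.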
PROC SURVEYSELECT also provides random assignment (partitioning), which randomly assigns the units of the input data set to groups.
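Conceptually, random assignment shuffles the units and deals them into groups so that every unit lands in exactly one group. The following Python sketch shows one simple way to do this; it is an assumption-based illustration, not the algorithm the procedure uses, and the function name, unit labels, and seed are hypothetical.

```python
import random

def random_partition(units, k, seed):
    """Randomly assign every unit to exactly one of k groups.

    The units are shuffled and then dealt out in turn, so group sizes
    differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is unchanged
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

# Hypothetical data set of 10 units split into 3 groups.
units = [f"unit{i}" for i in range(1, 11)]
groups = random_partition(units, k=3, seed=1953)

for number, members in enumerate(groups, start=1):
    print(f"group {number}: {members}")
```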
For more information, see Chapter 124, The SURVEYSELECT Procedure.