The SURVEYSELECT Procedure

Stratified Sampling with Control Sorting

This example shows how to use PROC SURVEYSELECT to select a stratified random sample by implementing control sorting and systematic random sampling within strata. The sampling frame (list of customers) is stratified by the variable State, and the sampling units (customers) are sorted by the variables Type and Usage within each stratum (state). Customers are then selected by systematic random sampling within strata. Systematic sampling together with control sorting distributes the sample uniformly over the range of type and usage values within each state.

The following PROC SURVEYSELECT statements select a probability sample of customers from the Customers data set by using this design:

title1 'Customer Satisfaction Survey';
title2 'Stratified Sampling with Control Sorting';
proc surveyselect data=Customers method=sys rate=.02
                  seed=1234 out=SampleControl;
   strata State;
   control Type Usage;
run;

The STRATA statement names the stratification variable State. The CONTROL statement names the control variables Type and Usage. In the PROC SURVEYSELECT statement, the METHOD=SYS option requests systematic random sampling, and the RATE= option specifies a sampling rate of 2% for each stratum. The SEED= option specifies the initial seed for random number generation.

Figure 7 displays the output from PROC SURVEYSELECT, which summarizes the sample selection. A sample of 270 customers is selected by using systematic random sampling within strata that are determined by the variable State. The sampling frame Customers is sorted by the control variables Type and Usage within strata. Sorting is performed by using hierarchic serpentine sorting, which is the default type of sorting. For more information, see the section Sorting by CONTROL Variables.
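Serpentine sorting is only the default. As a hedged sketch, the following step requests nested sorting instead by using the SORT= option in the PROC SURVEYSELECT statement. The original example does not use this option, and the output data set name SampleNested is a hypothetical name chosen for illustration.

```sas
/* Same stratified systematic design, but with nested control sorting. */
/* SORT=NEST replaces the default SORT=SERP (hierarchic serpentine).   */
/* SampleNested is a hypothetical output data set name.                */
proc surveyselect data=Customers method=sys rate=.02
                  seed=1234 sort=nest out=SampleNested;
   strata State;
   control Type Usage;
run;
```

Because the ordering of units within each stratum differs between the two sorting types, the selected customers can differ from those in SampleControl even when the seed is the same.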

By default, the sorted data set replaces the input data set. To store the sorted input data in another data set, you can specify the OUTSORT= option. The output data set SampleControl contains the sample of customers.
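As a minimal sketch of the OUTSORT= option mentioned above, the following variation of the same step writes the sorted sampling frame to a separate data set. The data set name CustomersSorted is an assumption made for illustration and does not appear in the original example.

```sas
/* Same design as the original step.                                 */
/* OUTSORT= stores the sorted sampling frame in a separate data set. */
/* CustomersSorted is a hypothetical data set name.                  */
proc surveyselect data=Customers method=sys rate=.02
                  seed=1234 out=SampleControl
                  outsort=CustomersSorted;
   strata State;
   control Type Usage;
run;
```

With this specification, the sorted frame is expected to be written to CustomersSorted rather than over Customers, so Customers keeps its original order.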

Figure 7: Sample Selection Summary

Customer Satisfaction Survey
Stratified Sampling with Control Sorting

The SURVEYSELECT Procedure

Selection Method      Systematic Random Sampling
Strata Variable       State
Control Variables     Type
                      Usage
Control Sorting       Serpentine

Input Data Set           CUSTOMERS
Random Number Seed       1234
Stratum Sampling Rate    0.02
Number of Strata         4
Total Sample Size        270
Output Data Set          SAMPLECONTROL


Last updated: December 09, 2022