The following PROC SURVEYSELECT statements select a probability sample of customers from the Customers data set by using simple random sampling:
title1 'Customer Satisfaction Survey';
title2 'Simple Random Sampling';
proc surveyselect data=Customers method=srs n=100
                  out=SampleSRS;
run;
The PROC SURVEYSELECT statement invokes the procedure. The DATA= option names the SAS data set Customers as the input data set from which to select the sample. The METHOD=SRS option specifies simple random sampling as the sample selection method. In simple random sampling, each sampling unit (observation) has an equal probability of selection, and sampling is performed without replacement. (Without-replacement sampling means that a unit cannot be selected more than once.) The N= option specifies a sample size of 100 customers. The OUT= option stores the sample in the SAS data set named SampleSRS.
Figure 2 displays the output from PROC SURVEYSELECT, which summarizes the sample selection. The "Sample Selection Method" table shows that the selection method is simple random sampling. The "Sample Selection Summary" table shows that a sample of 100 customers is selected from the input data set Customers.
When you use simple random sampling (without stratification), all sampling units have the same selection probability. In this example, the selection probability for each customer is 0.007423, which is the sample size (100) divided by the population size (13,471). The sampling weight for each customer in the sample is 134.71, which is the inverse of the selection probability. If you specify the STATS option, PROC SURVEYSELECT includes the selection probabilities and sampling weights in the output data set. For more complex designs, PROC SURVEYSELECT includes this information in the output data set by default.
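As a minimal sketch of the STATS option described above, the following statements repeat the selection with STATS added and then print the two extra variables. The output data set name SampleSRSStats is a hypothetical name chosen for this illustration. The variable names SelectionProb and SamplingWeight are assumed to be the names that PROC SURVEYSELECT assigns. No SEED= value is given, so the customers selected will generally differ from those in this example, although the probability and weight should not.

```sas
/* Sketch only: SampleSRSStats is a hypothetical data set name */
proc surveyselect data=Customers method=srs n=100
                  stats out=SampleSRSStats;
run;

/* SelectionProb and SamplingWeight are assumed to be the variables that STATS adds */
proc print data=SampleSRSStats(obs=5);
   var CustomerID SelectionProb SamplingWeight;
run;
```

Because every customer has the same selection probability under this design, each printed row should show the same two values as the "Sample Selection Summary" table.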
The "Sample Selection Summary" table also displays the initial seed that PROC SURVEYSELECT uses for random number generation (39647). When you do not specify the SEED= option, PROC SURVEYSELECT uses the time of day from the computer’s clock to obtain an initial seed. To reproduce this sample, you can specify SEED=39647, provided that the input data set and the sample selection method are unchanged.
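For illustration, the selection could be repeated with the seed fixed as follows. This sketch assumes that the Customers data set has not changed since the original run. Apart from the added SEED= option, the statements are the same as those at the start of this example.

```sas
/* Fix the initial seed so that the same 100 customers are selected again */
proc surveyselect data=Customers method=srs n=100
                  seed=39647 out=SampleSRS;
run;
```

Specifying a seed is generally worthwhile whenever a sample might need to be regenerated or audited later.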
Figure 2: Sample Selection Summary
Customer Satisfaction Survey
Simple Random Sampling

| Selection Method | Simple Random Sampling |
|---|---|
| Input Data Set | CUSTOMERS |
| Random Number Seed | 39647 |
| Sample Size | 100 |
| Selection Probability | 0.007423 |
| Sampling Weight | 134.71 |
| Output Data Set | SAMPLESRS |
The sample of 100 customers is stored in the SAS data set SampleSRS. PROC SURVEYSELECT does not display this output data set. The following PROC PRINT statements display the first 20 observations of SampleSRS:
title1 'Customer Satisfaction Survey';
title2 'Sample of 100 Customers, Selected by SRS';
title3 '(First 20 Observations)';
proc print data=SampleSRS(obs=20);
run;
Figure 3 displays the first 20 observations of the output data set SampleSRS, which contains the sample of customers. This data set includes all variables in the input data set Customers. If you do not want to include all variables, you can use the ID statement to specify which variables to copy from the input data set to the output (sample) data set.
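As a hedged sketch of the ID statement, the following statements copy only two variables to the sample data set. The choice of CustomerID and State and the output data set name SampleSRSIds are assumptions made for this illustration. SEED=39647 is reused from this example so that the same customers should be selected.

```sas
/* Sketch only: SampleSRSIds is a hypothetical data set name */
proc surveyselect data=Customers method=srs n=100
                  seed=39647 out=SampleSRSIds;
   id CustomerID State;   /* copy only these variables to the sample */
run;
```

Limiting the copied variables in this way can keep the sample data set small when the input data set has many columns.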
Figure 3: Customer Sample (First 20 Observations)
Customer Satisfaction Survey
Sample of 100 Customers, Selected by SRS
(First 20 Observations)

| Obs | CustomerID | State | Type | Usage |
|---|---|---|---|---|
| 1 | 017-27-4096 | GA | New | 168 |
| 2 | 026-37-3895 | AL | New | 59 |
| 3 | 038-54-9276 | GA | New | 785 |
| 4 | 046-40-3131 | FL | New | 60 |
| 5 | 070-37-6924 | GA | New | 524 |
| 6 | 100-58-3342 | FL | New | 302 |
| 7 | 107-61-9029 | AL | New | 235 |
| 8 | 110-95-0432 | FL | New | 12 |
| 9 | 112-81-9251 | SC | New | 347 |
| 10 | 137-33-0478 | GA | New | 551 |
| 11 | 143-83-4677 | AL | New | 203 |
| 12 | 147-19-9164 | GA | New | 172 |
| 13 | 159-51-0606 | FL | New | 102 |
| 14 | 164-14-7799 | GA | Old | 388 |
| 15 | 165-05-7323 | SC | New | 606 |
| 16 | 174-69-3566 | AL | Old | 111 |
| 17 | 177-69-6934 | FL | New | 202 |
| 18 | 181-58-3508 | AL | Old | 261 |
| 19 | 207-41-8446 | AL | Old | 183 |
| 20 | 207-64-7308 | GA | New | 193 |