The SURVEYPHREG Procedure

DOMAIN Statement

DOMAIN variable <('formatted-level-value' …'formatted-level-value')> <variable <('formatted-level-value' …'formatted-level-value')>*variable <('formatted-level-value' …'formatted-level-value')>> ;

The DOMAIN statement requests analysis for domains (subpopulations), in addition to analysis for the entire study population. The DOMAIN statement names the variables that identify domains, which are called domain variables.

It is common practice to compute statistics for domains. The formation of these domains might not be known at the design stage. Therefore, the sample sizes for the domains are often random. Use a DOMAIN statement to incorporate this variability into the variance estimation.

Note that a DOMAIN statement is different from a BY statement. In a BY statement, you treat the sample sizes as fixed in each subpopulation, and you perform analysis within each BY group independently.

Use the DOMAIN statement on the entire data set to perform a domain analysis. Creating a new data set from a single domain and analyzing that with PROC SURVEYPHREG can yield inappropriate estimates of variance for domain statistics.
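
The following statements are a minimal sketch of a domain analysis on the full data set. The data set MySurvey and the variables Stratum, PSU, SampWt, Time, Censor, Treatment, Age, and Gender are hypothetical names chosen for illustration; Censor=0 is assumed to mark censored observations.

```sas
/* Hypothetical data set and variable names; adjust to your survey design. */
proc surveyphreg data=MySurvey;
   strata  Stratum;                          /* design strata                  */
   cluster PSU;                              /* primary sampling units         */
   weight  SampWt;                           /* sampling weights               */
   class   Treatment;                        /* Gender is NOT listed here      */
   model   Time*Censor(0) = Treatment Age;   /* 0 = censored (assumption)      */
   domain  Gender;                           /* one analysis per Gender level, */
                                             /* plus the full-sample analysis  */
run;
```

All observations remain in the input data set, so the procedure can account for the random domain sample sizes. Subsetting MySurvey to a single Gender level before calling the procedure would discard that information.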

A domain variable can be either character or numeric. The procedure treats domain variables as categorical variables. If a variable appears by itself in a DOMAIN statement, each level of this variable determines a domain in the study population. If two or more variables are joined by asterisks (*), then every possible combination of levels of these variables determines a domain. The procedure performs a descriptive analysis within each domain that is defined by the domain variables. Domain variables must not occur in the CLASS statement.

The formatted values of the domain variables determine the categorical variable levels. Thus, you can use formats to group values into levels. For more information, see the FORMAT procedure in the Base SAS Procedures Guide.
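
As an illustration of grouping values by using a format, the following sketch defines a hypothetical numeric format AgeGrp. and applies it to a hypothetical variable AgeAtEntry so that two age groups, rather than each distinct age, form the domains. The data set and the remaining variable names are likewise assumptions.

```sas
/* Hypothetical format that groups a numeric variable into two levels. */
proc format;
   value AgeGrp low -< 40  = 'Under 40'
                40  - high = '40 and over';
run;

proc surveyphreg data=MySurvey;              /* hypothetical data set          */
   strata  Stratum;
   cluster PSU;
   weight  SampWt;
   model   Time*Censor(0) = Treatment;       /* 0 = censored (assumption)      */
   format  AgeAtEntry AgeGrp.;               /* formatted values define levels */
   domain  AgeAtEntry;                       /* domains: 'Under 40' and        */
                                             /* '40 and over'                  */
run;
```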

By default, the SURVEYPHREG procedure performs analyses for all levels of domains that are formed by the variables in the DOMAIN statement. Optionally, you can specify domain analyses for particular levels of a DOMAIN variable by listing quoted formatted-level-values in parentheses after the variable name. You must enclose each formatted-level-value in single or double quotation marks. You can specify one or more levels of each variable; when you specify more than one level, separate the levels by a space or a comma. The following example requests domain analysis only for females within each race category:

domain Race*Gender('Female');

The following example requests domain analyses only for the White and Asian levels of Race, and separate domain analyses for each level of Gender:

domain Race('White','Asian') Gender;

If a domain variable appears more than once in any domain cross-classification but the specified levels for that domain variable are not the same, then PROC SURVEYPHREG includes all specified levels of that variable in the domain cross-classification.

In the following example, two different levels for Race are specified in two DOMAIN statements:

domain Race('White')*Gender;
domain Race('Asian')*Gender;

Thus, the preceding specification is equivalent to the following:

domain Race('Asian' 'White')*Gender;

However, if a domain variable appears more than once in cross-classifications but the levels for that domain variable are not specified in all cross-classifications, then PROC SURVEYPHREG includes only the specified levels.

In the following example, a level for Gender is specified in the first DOMAIN statement but no levels for Gender are specified in the second DOMAIN statement:

domain Race('White')*Gender('Female');
domain Race('Asian')*Gender;

Thus, the preceding specification is equivalent to the following:

domain Race('White' 'Asian')*Gender('Female');
Last updated: December 09, 2022