KEEP Statement

Specifies the variables to include in output SAS data sets.

Valid in: DATA step
Categories: CAS, Information
Type: Declarative

Syntax

KEEP variable-list;

Arguments

variable-list

specifies the names of the variables to write to the output data set.

Tip: List the variables in any form that SAS allows.
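
As an illustration of the variable-list forms mentioned in the tip, the following sketch uses a small hypothetical data set (WORK.SCORES and its variables are invented for this example) and keeps variables by numbered range, name prefix, name range, and special list. The SET statement is placed before each KEEP statement so that the variables are already known when the list is read.

```sas
/* Hypothetical input data, created only for this illustration. */
data work.scores;
   input name $ id score1-score3 grade $;
   datalines;
Ann 1 90 85 77 B
Bob 2 60 72 81 C
;
run;

/* Numbered range list: NAME plus SCORE1, SCORE2, SCORE3. */
data work.numbered;
   set work.scores;
   keep name score1-score3;
run;

/* Name prefix list: NAME plus every variable whose name begins with SCORE. */
data work.prefixed;
   set work.scores;
   keep name score:;
run;

/* Name range list: variables from NAME through SCORE1 in data set order
   (NAME, ID, SCORE1). */
data work.ranged;
   set work.scores;
   keep name--score1;
run;

/* Special list: all numeric variables defined so far in the step. */
data work.numeric_only;
   set work.scores;
   keep _numeric_;
run;
```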

Details

The KEEP statement causes a DATA step to write only the variables that you specify to one or more SAS data sets. The KEEP statement applies to all SAS data sets that are created within the same DATA step and can appear anywhere in the step. If no KEEP or DROP statement appears, all data sets that are created in the DATA step contain all variables.
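
The following sketch illustrates both points in the paragraph above: a single KEEP statement governs every data set named in the DATA statement, and its position in the step does not matter. The data set names WORK.HIGH and WORK.LOW, the variables, and the cutoff of 80 are assumptions made for this example.

```sas
data work.high work.low;
   input name $ score1 score2;
   avg = mean(score1, score2);
   /* Route each observation to one of the two output data sets. */
   if avg >= 80 then output work.high;
   else output work.low;
   /* KEEP appears after the OUTPUT statements but still applies to
      both WORK.HIGH and WORK.LOW. */
   keep name avg;
   datalines;
Ann 90 85
Bob 60 72
;
run;
```

Both output data sets contain only NAME and AVG; SCORE1 and SCORE2 are available during processing but are not written.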

If the same variable is listed on both the DROP and KEEP statements, DROP takes precedence over KEEP regardless of the order of the statements, and the variable is dropped.

Note: Do not use both the KEEP and DROP statements within the same DATA step.

Comparisons

  • The KEEP statement cannot be used in SAS PROC steps. The KEEP= data set option can.
  • The KEEP statement applies to all output data sets that are named in the DATA statement. To write different variables to different data sets, you must use the KEEP= data set option.
  • The DROP statement is a parallel statement that specifies variables to omit from the output data set.
  • The KEEP and DROP statements select variables to include in or exclude from output data sets. The subsetting IF statement selects observations.
  • Do not confuse the KEEP statement with the RETAIN statement. The RETAIN statement causes SAS to hold the value of a variable from one iteration of the DATA step to the next iteration. The KEEP statement does not affect the value of variables but specifies only which variables to include in any output data sets.
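
To illustrate the first two comparisons, the sketch below uses the KEEP= data set option to write a different set of variables to each output data set, and then uses KEEP= again in a PROC step, where the KEEP statement is not available. The data set and variable names are hypothetical.

```sas
/* Each output data set gets its own variable list through KEEP=. */
data work.summary(keep=name avg)
     work.detail(keep=name score1 score2);
   input name $ score1 score2;
   avg = mean(score1, score2);
   datalines;
Ann 90 85
Bob 60 72
;
run;

/* KEEP= on an input data set limits the variables that the procedure reads. */
proc print data=work.detail(keep=name score1);
run;
```

With no explicit OUTPUT statement, each iteration of the DATA step writes one observation to both data sets, so WORK.SUMMARY and WORK.DETAIL have the same observations but different variables.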

Examples

Example 1: KEEP Statement Basic Usage

These examples show the correct syntax for listing variables in the KEEP statement.

keep name address city state zip phone;
keep rep1-rep5;

Example 2: Keeping Variables in the Data Set

This example uses the KEEP statement to include only the variables NAME and AVG in the output data set. The variables SCORE1 through SCORE20, from which AVG is calculated, are not written to the data set AVERAGE.

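data average;
   /* Write only NAME and AVG to the output data set AVERAGE. */
   keep name avg;
   infile file-specification;
   input name $ score1-score20;
   /* AVG is computed from SCORE1-SCORE20, which are not written out. */
   avg=mean(of score1-score20);
run;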

Last updated: June 17, 2025