Naming Conventions for SAS and Spark
For general information, see SAS Names and Support for DBMS Names.
Current versions of Spark are case insensitive for comparisons, but case is preserved
for display purposes. By default, SAS converts table and column names to uppercase.
To preserve the case of identifiers, set the PRESERVE_COL_NAMES= and
PRESERVE_TAB_NAMES= options (shortcut PRESERVE_NAMES=). This usually is not required
unless the case must be preserved for display purposes.
- Spark currently does not permit a period (.) or colon (:) within a column name, and
a table name cannot begin with the underscore (_) character.
- A SAS name must be from 1 to 32 characters long. When Spark column names and table
names are 32 characters or fewer, SAS handles them seamlessly. When SAS reads Spark
column names that are longer than 32 characters, the generated SAS variable name is
truncated to 32 characters. Spark table names should be 32 characters or fewer
because SAS cannot truncate a table reference. If you already have a table name that
is longer than 32 characters, create a Spark table view or use the explicit SQL
feature of PROC SQL to access the table.
- If truncation would produce identical names, SAS generates a unique name instead.
- Further naming restrictions might apply based on the Spark distribution. For
example, some older Spark distributions do not support UTF-8 column names.
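The rules above can be checked before data is exposed to SAS. The following Python
sketch is illustrative only and is not part of SAS or Spark: the function names are
hypothetical, and the numeric-suffix scheme used to keep truncated names unique is an
assumption, because this topic does not describe how SAS actually generates unique
names.

```python
SAS_MAX_NAME_LEN = 32  # a SAS name must be from 1 to 32 characters long


def spark_column_name_problems(name):
    """Return a list of rule violations for a Spark column name (hypothetical helper)."""
    problems = []
    for ch in (".", ":"):
        if ch in name:
            problems.append(f"column name contains {ch!r}")
    return problems


def spark_table_name_problems(name):
    """Return a list of rule violations for a Spark table name (hypothetical helper)."""
    problems = []
    if name.startswith("_"):
        problems.append("table name begins with an underscore")
    if len(name) > SAS_MAX_NAME_LEN:
        problems.append("table name is longer than 32 characters")
    return problems


def to_sas_variable_names(spark_columns):
    """Truncate column names to 32 characters and keep the results unique.

    ASSUMPTION: uniqueness is enforced here by replacing trailing characters
    with a counter. The real SAS scheme may differ.
    """
    used = set()  # compared in uppercase to mimic case-insensitive matching
    result = []
    for col in spark_columns:
        candidate = col[:SAS_MAX_NAME_LEN]
        counter = 0
        while candidate.upper() in used:
            suffix = str(counter)
            candidate = col[:SAS_MAX_NAME_LEN - len(suffix)] + suffix
            counter += 1
        used.add(candidate.upper())
        result.append(candidate)
    return result
```

A check like this can flag a table name that needs a Spark view or explicit SQL, and
can preview which column names would collide after truncation.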
Last updated: February 3, 2026