SCANSTRINGCOLUMNS= LIBNAME Statement Option

Specifies whether to determine the maximum length of VARCHAR columns in a database table or a query.

Valid in: SAS/ACCESS LIBNAME statement
Category: Data Set Control
Alias: SCAN_STRING_COLUMNS=, SCAN_STRINGS=, SCANSTRINGS=
Default: NO
Data source: Google BigQuery, Microsoft SQL Server, ODBC, Snowflake
Notes: Support for this option was added in the April 2021 update for SAS/ACCESS on SAS 9.4M7 and SAS Viya 3.5. Support for Google BigQuery was extended to SAS 9.4M8 and SAS 9.4M9 with the release of SAS 9.4M9.
Support for Microsoft SQL Server and ODBC was added in SAS 9.4M9.
See: SCANSTRINGCOLUMNS= data set option

Syntax

SCANSTRINGCOLUMNS=YES | NO

Required Arguments

YES

specifies to scan all VARCHAR columns in a database table to determine the actual maximum length of the columns before loading data. If the VARCHAR columns have been created with the maximum VARCHAR precision, then using this option can reduce the size of the resulting table and can accelerate the loading process.

NO

specifies that VARCHAR columns are not scanned before loading data.

Details

Overview of SCANSTRINGCOLUMNS= Behavior

This option can be specified for any table, but the best performance improvement occurs for larger tables. A table is considered to be large if it contains a large number of rows, a large number of VARCHAR columns, or both. It might be preferable to specify the SCANSTRINGCOLUMNS= data set option for specific tables.

Here is a list of the data types that are typically scanned for the data sources that use this option:

  • Google BigQuery: STRING and BYTE columns
  • Microsoft SQL Server (with DataDirect drive): VARCHAR(max), NVARCHAR(max), VARBINARY(max)
  • ODBC: varies by data source. Here is a list of scanned data types for a sample of ODBC data sources:
    • ODBC to Dremio: VARCHAR and VARBINARY
    • ODBC to Microsoft SQL Server (with DataDirect driver): VARCHAR(max), NVARCHAR(max), VARBINARY(max)
    • ODBC to Oracle: BLOB, CLOB, NCLOB
  • Snowflake: VARCHAR columns

This option applies to any table that you access or to any query that you are running. For example, if you specify SCANSTRINGCOLUMNS=YES and call PROC PRINT for a table, then the scan runs for the table that you are printing. When you run a PROC SQL query, SCANSTRINGCOLUMNS=YES results in scans being done for any table in the query. However, for ODBC data sources, only implicit SQL queries use the SCANSTRINGCOLUMNMS= option. For more information, see the next topic.

ODBC: Use Implicit PROC SQL Queries for SCANSTRINGCOLUMNS=

For ODBC data sources, use implicit SQL queries with PROC SQL to benefit from using SCANSTRINGCOLUMNS=. Implicit queries result in SQL code that is generated by SAS. Here is an example of a query that returns a data set with data lengths that are based on an initial scan of the data:

libname a odbc dsn=dremio preserve_tab_names=yes schema="schema1" 
          scanStringColumns=yes;

proc sql;
create table work.goodsize as select from a.mydata;
quit;

In contrast, an explicit SQL query with PROC SQL results in a larger table, because the results use the maximum default length for long data types. An explicit query is one in which you provide data source specific SQL syntax in your PROC SQL call. Explicit queries typically include the CONNECT USING clause, such as in this query:

libname a odbc dsn=dremio preserve_tab_names=yes schema="schema1" scanStringColumns=yes;

proc sql;
connect using a;
create table work.bigsize as select * from connection to a (select * from <location>.mydata);
disconnect from a;
quit;
TipYou can set SASTRACE=',,,d' and SASTRACELOG=SASLOG in an OPTIONS statement to verify whether SAS generates SQL code (implicit SQL) or passes SQL code exactly as you provide it (explicit SQL).
Last updated: February 3, 2026