Introduction to SAS/ACCESS Interface to Impala

Overview

For available SAS/ACCESS features, see Impala supported features. For more information about Impala, see your Impala documentation.

SAS/ACCESS Interface to Impala includes SAS Data Connector to Impala. The data connector enables you to load large amounts of data into the CAS server for parallel processing. For more information, see the documentation for SAS Data Connector to Impala.

Impala Concepts

Cloudera Impala is an open-source, massively parallel processing (MPP) query engine that runs natively on Apache Hadoop. You can use it to issue SQL queries against data stored in HDFS and Apache HBase without moving or transforming the data. Like other SAS/ACCESS engines, SAS/ACCESS Interface to Impala lets you run SAS procedures against data that is stored in Impala and returns the results to SAS. You can use it to read data from and write data to Hadoop as if it were any other relational data source to which SAS can connect.
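As a rough illustration of running a SAS procedure against Impala data, the following sketch assigns a libref with the IMPALA engine and summarizes one column. The server name, credentials, schema, table (sales), and column (amount) are hypothetical placeholders, and the port shown is the commonly used Impala default. Adjust all of them for your site.

```sas
/* Minimal sketch. All connection values below are hypothetical placeholders. */
libname imp impala server="impala.example.com" port=21050
        user=myuser password="mypassword" schema="default";

/* Run a SAS procedure against an Impala table.        */
/* The table SALES and column AMOUNT are assumed names. */
proc means data=imp.sales n mean max;
   var amount;
run;

/* Release the connection when finished. */
libname imp clear;
```

Once the libref is assigned, tables in the Impala schema can be referenced with the usual two-level name (libref.table), as with other SAS/ACCESS engines.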

Configuring Impala ODBC Driver

If you are using SAS/ACCESS Interface to Impala to connect to an Impala server on a Cloudera cluster, you must set up the Cloudera Impala ODBC driver. For instructions, see the Cloudera driver documentation.

For the minimum required ODBC driver for Impala, see the system requirements for SAS 9.4 or SAS Viya at the SAS Install Center.
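After the driver is set up, one way to check the configuration is to connect through a configured ODBC data source. The sketch below assumes that a data source named impala_dsn has already been defined for the Cloudera Impala ODBC driver. The data source name, credentials, and table name are hypothetical, and the sketch assumes that your release supports the DSN= connection option for the IMPALA engine.

```sas
/* Minimal sketch. Assumes an ODBC data source named "impala_dsn" */
/* (hypothetical) is already configured for the Impala driver.     */
libname impdsn impala dsn="impala_dsn" user=myuser password="mypassword";

/* A simple query confirms that the connection works. */
/* The table SALES is an assumed name.                */
proc sql;
   select count(*) as row_count
      from impdsn.sales;
quit;

libname impdsn clear;
```

If the LIBNAME statement fails, the SAS log typically shows the error returned by the ODBC driver, which is a useful starting point for checking the driver configuration.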

High-speed data retrieval through bulk loading is available for Impala. For information about how to configure bulk loading, see Bulk Loading for Impala.

Last updated: February 3, 2026