Every column in a table has a name and a data type. The data type tells Impala how much physical storage to set aside for the column and the form in which the data is stored. This section includes information about Impala data types, null and default values, and data conversions.
SAS/ACCESS Interface to Impala does not directly support any data types that are not listed below. Any columns that use unsupported data types are read into SAS as character strings.
For more information about Impala data types and to determine which data types are available for your version of Impala, see your Impala documentation.
Here are the data types that the Impala engine supports.
| BIGINT | FLOAT |
| TIMESTAMP | INT |
| BOOLEAN | SMALLINT |
| DOUBLE | TINYINT |
| DECIMAL |
| CHARn | VARCHARn |
| STRING |
To use CHAR and VARCHAR, an Impala 2.0 server and the Cloudera 2.5.22 or higher ODBC client driver are required.
SAS has two fundamental data types, character and numeric. SAS character variables (columns) are of a fixed length with a maximum of 32,767 characters. SAS numeric variables are signed eight-byte, floating-point numbers. When SAS numerics are used in conjunction with SAS formats, they can represent a number of data types, including DATE, TIME, and DATETIME. For more information about SAS data types, see SAS Programmer’s Guide: Essentials.
This table shows the default SAS formats that are assigned to SAS variables that are created when SAS/ACCESS reads Impala table columns.
|
Impala Data Type |
SAS Data Type |
Default SAS Format |
|---|---|---|
|
CHAR |
character |
$255. |
|
STRING |
$32767. |
|
|
VARCHAR |
$32767. |
|
|
BOOLEAN |
numeric |
1. |
|
BIGINT |
20. |
|
|
DOUBLE |
none |
|
|
FLOAT |
none |
|
|
INT |
11. |
|
|
SMALLINT |
6. |
|
|
TINYINT |
4. |
Below are some issues that you might face when you convert data from Impala to SAS:
This table shows the Impala data types that are assigned when SAS/ACCESS Interface to Impala creates an Impala table.
|
SAS Data Type |
SAS Format |
Impala Data Type |
|---|---|---|
|
character |
$n. |
CHAR(n) |
|
$n. |
STRING |
|
|
$n. |
VARCHAR(n) |
|
|
numeric |
DATETIMEw.p |
TIMESTAMP |
|
DATEw. |
DATE |
|
|
TIMEw. |
TIMESTAMP1 |
|
|
1. to 2. |
TINYINT |
|
|
3. to 4. |
SMALLINT |
|
|
5. to 9. |
INT |
|
|
10. to 18. |
BIGINT |
|
|
other numeric formats |
DOUBLE |
|
| 1 Apache Impala does not support time values with no associated date. Time values with no date are passed to Impala with the CURRENT_DATE. | ||
Impala has a special value called NULL. An Impala NULL value represents an absence of information and is analogous to a SAS missing value. When SAS/ACCESS Interface to Impala reads an Impala NULL value, it interprets it as a SAS missing value. For more information about how SAS handles NULL values, see Potential Result Set Differences When Processing Null Data.