Lets SAS improve performance when a single Hive store table is queried.
| Valid in: | SAS/ACCESS LIBNAME statement |
|---|---|
| Category: | Data Access |
| Default: | NO |
| Data source: | Hadoop |
| See: | ANALYZE= data set option, READ_METHOD= LIBNAME option, READ_METHOD= data set option |
YES specifies that SAS might run an ANALYZE TABLE command to update table statistics. Current table statistics might improve SAS Read performance when a single table is queried. This operation is considered a hint, so if statistics are already up to date, SAS might not perform the operation. The format of the ANALYZE TABLE command is subject to change in future releases of SAS.
NO specifies that SAS does not perform an additional operation to update table statistics.
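As a minimal sketch of how this option might be set, the following LIBNAME statement requests the statistics update for a Hadoop library. The libref `hdp`, the server name, the port, the schema, and the table `sales` are hypothetical placeholders, not values from this documentation.

```sas
/* Hypothetical connection values: replace server, port, and schema with your own. */
libname hdp hadoop server="hive.example.com" port=10000 schema=default
        analyze=yes;   /* hint: SAS might run ANALYZE TABLE before reading */

/* A query against a single Hive table, the case this option targets. */
proc sql;
   select count(*) from hdp.sales;   /* "sales" is a hypothetical table */
quit;
```

Because the option is a hint, this code does not guarantee that an ANALYZE TABLE command is issued on any given read.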
Performance improves when Hive statistics are up-to-date. When the Hive Boolean variable hive.stats.autogather is set to TRUE, in most cases Hive automatically gathers statistics. This option can be useful when hive.stats.autogather is set to FALSE or when statistics are not being computed. Specifically, Hive statistics are not generated when loading a target table with a file format of TEXT. Users can check the state of Hive statistics using the Hive DESCRIBE FORMATTED command.
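One way to inspect the state of Hive statistics from a SAS session is to send the DESCRIBE FORMATTED command through explicit SQL pass-through. This is a sketch only: it assumes that your Hive server returns the command output as a result set that SAS can read, and the connection values and the table name `sales` are hypothetical.

```sas
proc sql;
   /* Hypothetical connection values: replace with your own. */
   connect to hadoop (server="hive.example.com" port=10000 schema=default);

   /* Hive runs the command; the table parameters in the output
      typically include statistics such as numRows when they exist. */
   select * from connection to hadoop
      (describe formatted sales);

   disconnect from hadoop;
quit;
```

If the output shows no row-count statistics for the table, that might indicate a case where ANALYZE=YES is worth trying.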