Passing Functions to the DBMS Using PROC SQL

Overview of Passing Functions to a DBMS

In most cases, SAS/ACCESS automatically passes down SAS SQL aggregate functions, such as MIN, MAX, AVG, MEAN, FREQ, N, SUM, and COUNT, to the DBMS. These are SQL ANSI-defined aggregate functions. They typically automatically pass down functions that have the same behavior.

Here is an example that shows how the AVG function is passed to Netezza.

libname x netezza database=TEST ...;
/* Create test data */
proc delete data=x.class; 
run;

data x.class;
   set sashelp.class;
run;

/* Call aggregate function AVG */
proc sql;
   select avg(age)
   from x.class;
quit;

PROC SQL generates this DBMS query:

select AVG(TXT_1."AGE") from TEST..class TXT_1

In some cases, such as when remerging data is required, only the SELECT statement is passed to the database. Additional calculations, including aggregate functions, are completed by SAS. In these cases, a note is printed to the SAS log when SASTRACE= is enabled.

PROC SQL also generates DBMS-specific queries when functions are present in the WHERE clause. Here is an example that shows how SAS/ACCESS translates the SAS UPCASE function into the Netezza UPPER function.

proc sql; 
   select avg(age) 
   from x.class 
   where upcase(sex) = 'F'; 
quit;

PROC SQL generates this DBMS query:

NETEZZA_15: Prepared: on connection 3
 select AVG(TXT_1."AGE") from TEST..class TXT_1 where upper(TXT_1."SEX") = 'F'

The functions that are passed to the DBMS vary based on the specific SAS/ACCESS product. See the “Passing SAS Functions” section for your SAS/ACCESS interface for a list of SAS functions and the DBMS functions that correspond to each one.

Using SQL_FUNCTIONS=ALL

By default, the DBMS functions that are called in a query that is generated by SAS produce the same results as the comparable function in SAS. However, there are some DBMS functions that produce slightly different results than those that a comparable SAS function generates. Because different results are produced, SAS does not call those functions by default. Sometimes the results that are generated by a DBMS function are very close to the results that would be generated in SAS. In those cases, it is preferable to pass the query to the DBMS and to call the DBMS function. Evaluate the differences in results for your data to make sure that results are acceptable for your needs.

To enable use of these functions, specify the SQL_FUNCTIONS=ALL LIBNAME option. See the documentation for your interface about passing SAS functions to determine which functions are passed down when you specify SQL_FUNCTIONS=ALL. Alternatively, you can generate the list of options that are passed to your DBMS by using the SQL_FUNCTIONS_COPY= LIBNAME option. Here is an example for a Netezza database:

/* generate data set WORK.SF, contains default SQL functions */
libname x netezza server='myserver.com' database=mydb user=myuser password=mypwd 
   SQL_FUNCTIONS_COPY=WORK.SF ;

/* generate data set WORK.SFALL, contains ALL target SQL functions */
libname x netezza server='myserver.com' database=mydb user=myuser password=mypwd 
   SQL_FUNCTIONS=ALL SQL_FUNCTIONS_COPY=WORK.SFALL ;

/* use PROC SQL to create SFDIFF containing just the non-default functions */ 
proc sql;
   create table sfdiff as
     select * from SFALL
     except all
     select * from SF;
quit;

It is also possible to add functions to the SQL dictionary for your interface. For more information, see SQL_FUNCTIONS= LIBNAME Statement Option.

Functions Where Results Might Vary: MOD Function

In general, SAS functions that are passed through to a DBMS yield the same results. That is, your results are the same whether you pass a function to SAS or to your DBMS. However, in some cases, SAS might yield a different result than you would obtain with the same function call to your DBMS.

The MOD function is one function that might give a different result depending on where the function runs. More specifically, when you pass non-integer arguments to the MOD function, the results might differ. The MOD function returns the remainder from the division of the first argument by the second argument. In SAS, both arguments can be non-integers, and the calculations are performed without altering the arguments. In some DBMSs, such as DB2 or PostgreSQL, non-integer arguments are truncated to the nearest integer before performing the division.

As a best practice, be sure to understand how the MOD function works on your DBMS or you might get unexpected results.

Processing the DISTINCT Operator

The DISTINCT operator specifies that only distinct values across all variables in the SELECT clause should be returned by a query. The DISTINCT operator can be inefficient when it is used to generate query results within SAS. Therefore, SAS always attempts to pass queries that use the DISTINCT operator to the DBMS.

To demonstrate the impact of using the DISTINCT operator, consider these queries.

libname x netezza server='myserver.com' database=mydb user=myuser 
   password=mypwd;

/* The LOG function is not passed to the DBMS */
proc sql;
   select log(age) from x.class;
quit;

This call to PROC SQL generates this SQL query to be passed to the database.

SELECT  "AGE" FROM TEST..class

Adding DISTINCT causes the LOG function to be passed to the DBMS. (For Netezza, the SAS LOG function is translated to the Netezza LN function.)

libname x netezza server='myserver.com' database=mydb user=myuser 
   password=mypwd;

/* The LOG function is not passed to the DBMS */
proc sql;
   select distinct log(age) from x.class;
quit;

This call to PROC SQL generates this SQL query to be passed to the database.

select distinct ln(TXT_1."AGE") from TEST..class TXT_1

Notice that in the first query, the LN function is not called. The first query requires a full table scan. Because each row is being examined, SAS completes the LOG computation as it processes each row.

In the second query, the DISTINCT operator causes the processing to be passed to the DBMS for optimal performance. Therefore, the LOG function (translated to the LN function for Netezza) is also passed to the DBMS.

Last updated: February 3, 2026