Autopartitioning for SAS/ACCESS Interface to ODBC is a modulo (MOD) function method. For general information about this feature, see Autopartitioning Techniques in SAS/ACCESS.
SAS/ACCESS Interface to ODBC places additional restrictions on the columns that you can use for the partitioning column during the autopartitioning phase. Here is how columns are partitioned.
The exception to the above rule is for Oracle SQL_DECIMAL columns. As long as the scale of the SQL_DECIMAL column is 0, you can use the column as the partitioning column.
If you select a nullable
column for autopartitioning, the OR<column-name>IS
NULL SQL statement is appended at the end of the SQL
code that is generated for the threaded Read. This ensures that any
possible NULL values are returned in the result set. Also, if the
column to be used for the partitioning is SQL_BIT, the number of threads
are automatically changed to two, regardless of how the DBSLICEPARM=
option is specified.
Autopartitioning does not select a column to be the partitioning column if it appears in the WHERE clause. For example, the following DATA step could not use a threaded Read to retrieve the data because all numeric columns in the table are in the WHERE clause.
data work.locemp;
set trlib.MYEMPS;
where EMPNUM<=30 and ISTENURE=0 and
SALARY<=35000 and NUMCLASS>2;
run;
SAS/ACCESS Interface to ODBC defaults to three threads when you use autopartitioning. However, do not specify a maximum number of threads in DBSLICEPARM= to use for the threaded Read.
You might achieve the best possible performance when using threaded Reads by specifying the DBSLICE= option for ODBC in your SAS operation. This is especially true if your DBMS supports multiple database partitions and provides a mechanism to allow connections to individual partitions. If your DBMS supports this concept, you can configure an ODBC data source for each partition. You can also use the DBSLICE= clause to specify both the data source and the WHERE clause for each partition, as shown in this example.
proc print data=trilib.MYEMPS(DBSLICE=(DSN1="EMPNUM BETWEEN 1 AND 33"
DSN2="EMPNUM BETWEEN 34 AND 66"
DSN3="EMPNUM BETWEEN 67 AND 100"));
run;
See your DBMS or ODBC driver documentation for more information about configuring for multiple partition access. You can also see Configuring Microsoft SQL Server Partitioned Views for Use with DBSLICE= for an example of configuring multiple partition access to a table.
Using the DATASOURCE= syntax is not required to use DBSLICE= with threaded Reads for the ODBC interface. The methods and examples described in DBSLICE= work well in cases where the table that you want to read is not stored in multiple partitions in your DBMS. These methods also give you flexibility in column selection. For example, if you know that the STATE column in your employee table contains only a few distinct values, you can customize your DBSLICE= clause accordingly.
datawork.locemp;
set trlib2.MYEMP(DBSLICE=("STATE='FL'" "STATE='GA'"
"STATE='SC'" "STATE='VA'" "STATE='NC'"));
where EMPNUM<=30 and ISTENURE=0 and SALARY<=35000 and NUMCLASS>2;
run;
Microsoft SQL Server implements multiple partitioning by creating a global view across multiple instances of a Microsoft SQL Server database. For this example, assume that Microsoft SQL Server has been installed on three separate machines (SERVER1, SERVER2, SERVER3). Three ODBC data sources (SSPART1, SSPART2, SSPART3) have been configured against these servers. Also, a linked server definition for each of these servers has been specified. This example uses SAS to create the tables and associated views, but you can create the tables and associated outside of the SAS environment.
data work.MYEMPS;
format HIREDATE mmddyy 0. SALARY 9.2
NUMCLASS 6. GENDER $1. STATE $2. EMPNUM 10.;
do EMPNUM=1 to 100;
morf=mod(EMPNUM,2)+1;
if(morf eq 1) then
GENDER='F';
else
GENDER='M';
SALARY=(ranuni(0)*5000);
HIREDATE=int(ranuni(13131)*3650);
whatstate=int(EMPNUM/5);
if(whatstate eq 1) then
STATE='FL';
if(whatstate eq 2) then
STATE='GA';
if(whatstate eq 3) then
STATE='SC';
if(whatstate eq 4) then
STATE='VA';
else
state='NC';
ISTENURE=mod(EMPNUM,2);
NUMCLASS=int(EMPNUM/5)+2;
output;
end;
run;
libname trlib odbc user=ssuser pw=sspwd dsn=sspart1;
proc datasets library=trlib;
delete MYEMPS1;run;
run;
data trlib.MYEMPS1(drop=morf whatstate
DBTYPE=(HIREDATE="datetime" SALARY="numeric(8,2)"
NUMCLASS="smallint" GENDER="char(1)" ISTENURE="bit" STATE="char(2)"
EMPNUM="int NOT NULL Primary Key CHECK (EMPNUM BETWEEN 0 AND 33)"));
set work.MYEMPS;
where (EMPNUM BETWEEN 0 AND 33);
run;
libname trlib odbc user=ssuer pw=sspwd dsn=sspart2;
proc datasets library=trlib;
delete MYEMPS2;run;
data trlib.MYEMPS2(drop=morf whatstate
DBTYPE=(HIREDATE="datetime" SALARY="numeric(8,2)"
NUMCLASS="smallint" GENDER="char(1)" ISTENURE="bit" STATE="char(2)"
EMPNUM="int NOT NULL Primary Key CHECK (EMPNUM BETWEEN 34 AND 66)"));
set work.MYEMPS;
where (EMPNUM BETWEEN 34 AND 66);
run;
libname trlib odbc user=ssuer pw=sspwd dsn=sspart3;
proc datasets library=trlib;
delete MYEMPS3;run;
data trlib.MYEMPS3(drop=morf whatstate
DBTYPE=(HIREDATE="datetime" SALARY="numeric(8,2)"
NUMCLASS="smallint" GENDER="char(1)" ISTENURE="bit" STATE="char(2)"
EMPNUM="int NOT NULL Primary Key CHECK (EMPNUM BETWEEN 67 AND 100)"));
set work.MYEMPS;
where (EMPNUM BETWEEN 67 AND 100);
run;
These table definitions also use CHECK constraints to enforce the distribution of the data on each of the subtables of the target view.
/*SERVER1,SSPART1*/
proc sql noerrorstop;
connect to odbc (UID=ssuser PWD=sspwd DSN=SSPART1);
execute (drop view MYEMPS) by odbc;
execute (create view MYEMPS AS
SELECT * FROM users.ssuser.MYEMPS1
UNION ALL
SELECT * FROM SERVER2.users.ssuser.MYEMPS2
UNION ALL
SELECT * FROM SERVER3.users.ssuser.MYEMPS3) by odbc;
quit;
/*SERVER2,SSPART2*/
proc sql noerrorstop;
connect to odbc (UID=ssuser PWD=sspwd DSN=SSPART2);
execute (drop view MYEMPS) by odbc;
execute (create view MYEMPS AS
SELECT * FROM users.ssuser.MYEMPS2
UNION ALL
SELECT * FROM SERVER1.users.ssuser.MYEMPS1
UNION ALL
SELECT * FROM SERVER3.users.ssuser.MYEMPS3) by odbc;
quit;
/*SERVER3,SSPART3*/
proc sql noerrorstop;
connect to odbc (UID=ssuser PWD=sspwd DSN=SSPART3);
execute (drop view MYEMPS) by odbc;
execute (create view MYEMPS AS
SELECT * FROM users.ssuser.MYEMPS3
UNION ALL
SELECT * FROM SERVER2.users.ssuser.MYEMPS2
UNION ALL
SELECT * FROM SERVER1.users.ssuser.MYEMPS1) by odbc;
quit;
This creates a global view that references the entire data set.
proc print data=trlib.MYEMPS(DBLICE=(sspart1="EMPNUM BETWEEN 1 AND 33"
sspart2="EMPNUM BETWEEN 34 AND 66"
sspart3="EMPNUM BETWEEN 67 AND 100"));
run;
The DBSLICE= option contains the Microsoft SQL Server partitioning information.
This configuration lets the ODBC interface access the data for the MYEMPS view directly from each subtable on the corresponding Microsoft SQL Server instance. The data is inserted directly into each subtable, but this process can also be accomplished by using the global view to divide up the data. For example, you can create empty tables and then create the view as seen in the example with the UNION ALL construct. You can then insert the data into the view MYEMPS. The CHECK constraints allow the Microsoft SQL Server query processor to determine which subtables should receive the data.
Other tuning options are available when you configure Microsoft SQL Server to use partitioned data. For more information, see the "Creating a Partitioned View" and "Using Partitioned Views" sections in Creating and Maintaining Databases (SQL Server 2000).