Uses one or more columns to distribute table rows across database segments.
| Valid in: | DATA and PROC steps (when accessing DBMS data using SAS/ACCESS software) |
|---|---|
| Category: | Data Set Control |
| Default: | RANDOMLY DISTRIBUTED |
| Data source: | Greenplum, HAWQ |
specifies a DBMS column name.
determines the column or set of columns that the Greenplum database uses to distribute table rows across database segments. When no distribution column is named, rows are distributed randomly across segments, which is known as round-robin distribution.
For uniform distribution, in which table records are stored evenly across the segments (machines) that are part of the database configuration, the distribution key should be as unique as possible.
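One way to judge whether a column is "as unique as possible" is to compare its number of distinct values with the total row count before loading the table. The following is a minimal sketch, not part of the option itself. WORK.SALES_SRC and its columns ID and REGION are hypothetical names created here only for illustration.

```sas
/* Hypothetical sample data: ID is unique per row, REGION has only four values. */
data work.sales_src;
   do id = 1 to 1000;
      region = mod(id, 4);
      output;
   end;
run;

/* A ratio near 1 suggests a good distribution key; a ratio near 0 suggests skew. */
proc sql;
   select count(*)                          as total_rows,
          count(distinct id)                as distinct_ids,
          count(distinct id) / count(*)     as id_ratio,
          count(distinct region)            as distinct_regions,
          count(distinct region) / count(*) as region_ratio
   from work.sales_src;
quit;
```

In this sketch, ID would be the better candidate because every row carries a different value, whereas distributing on REGION could place all rows on at most four segments.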
```sas
libname x greenplm user=myusr1 password=mypwd1 dsn=mysrv1;
data x.sales (dbtype=(id=int qty=int amt=int)
              distributed_by='distributed by (id)');
   id = 1;
   qty = 100;
   sales_date = '27Aug2009'd;
   amt = 20000;
run;
```
The DATA step creates the SALES table, with ID as the distribution key, by issuing this statement:
```sql
CREATE TABLE SALES
   (id int,
    qty int,
    sales_date double precision,
    amt int
   ) distributed by (id)
```
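Because the option value appears verbatim at the end of the generated CREATE TABLE statement in the example above, a composite distribution key can presumably be requested the same way. The sketch below assumes that behavior. The table name SALES2 and the two-column key are illustrative choices, and the connection values are the same placeholders used earlier.

```sas
libname x greenplm user=myusr1 password=mypwd1 dsn=mysrv1;

/* Assumption: the quoted clause is appended as-is to CREATE TABLE, */
/* so a two-column key is written as in Greenplum DDL.              */
data x.sales2 (dbtype=(id=int qty=int amt=int)
               distributed_by='distributed by (id, qty)');
   id = 1;
   qty = 100;
   sales_date = '27Aug2009'd;
   amt = 20000;
run;
```

A composite key can help when no single column is unique enough on its own, since Greenplum accepts more than one column in a DISTRIBUTED BY clause.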