Shared Concepts and Topics

Polynomial Effects

  • EFFECT name=POLYNOMIAL (var-list </ polynomial-options>);

  • EFFECT name=POLY (var-list </ polynomial-options>);

The variables in var-list must be numeric. A design matrix column is generated for each term of the specified polynomial. By default, each of these terms is treated as a separate effect for the purpose of model building. For example, the statements

proc glmselect;
   effect MyPoly = polynomial(x1-x3/degree=2);
   model y = MyPoly;
run;

yield the identical analysis to the statements

proc glmselect;
   model y = x1 x2 x3 x1*x1 x1*x2 x1*x3 x2*x2 x2*x3 x3*x3;
run;
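
The equivalence above can be checked with a small enumeration. The following Python sketch is only an illustration of how the terms of a polynomial effect can be listed; the function name `polynomial_terms` is hypothetical and is not part of SAS. It assumes that every product of the listed variables with total degree from 1 up to DEGREE= is generated.

```python
from itertools import combinations_with_replacement

def polynomial_terms(variables, degree):
    """Return every product of the variables with total degree 1..degree.

    Hypothetical helper for illustration only; not a SAS function.
    """
    terms = []
    for d in range(1, degree + 1):
        # Each combination with replacement is one monomial of total degree d.
        for combo in combinations_with_replacement(variables, d):
            terms.append("*".join(combo))
    return terms

terms = polynomial_terms(["x1", "x2", "x3"], degree=2)
print(" ".join(terms))
# x1 x2 x3 x1*x1 x1*x2 x1*x3 x2*x2 x2*x3 x3*x3
```

The nine terms printed match the effects listed in the second MODEL statement.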

You can specify the following polynomial-options after a slash (/):

DEGREE=n

specifies the degree of the polynomial. The degree must be a positive integer. The degree is typically a small integer, such as 1, 2, or 3. The default is DEGREE=1.

DETAILS

requests a table that shows the details of the specified polynomial, including the number of terms generated. If you also specify the STANDARDIZE option, then a table that shows the standardization details is also produced.

LABELSTYLE=(style-opts)
LABELSTYLE=style-opt

specifies how the terms in the polynomial are labeled. By default, powers are shown with ^ as the exponentiation operator and * as the multiplication operator. For example, a polynomial term such as $x_1^3 x_2 x_3^2$ is labeled x1^3*x2*x3^2. You can change the style of the label by using the following style-opts within parentheses. If you specify a single style-opt, then you can omit the enclosing parentheses.

EXPAND

specifies that each variable with an exponent greater than 1 be written as products of that variable. For example, the term $x_1^3 x_2 x_3^2$ receives the label x1*x1*x1*x2*x3*x3.

EXPONENT <=quoted string>

specifies that each variable with an exponent greater than 1 be written using exponential notation. By default, the symbol ^ is used as the exponentiation operator. If you supply the optional quoted string after an equal sign, then that string is used as the exponentiation operator. For example, if you specify

LABELSTYLE=(EXPONENT="**")

then the term $x_1^3 x_2 x_3^2$ receives the label x1**3*x2*x3**2.

INCLUDENAME

specifies that the name of the effect followed by an underscore be used as a prefix for term labels. For example, the following statement generates terms with labels MyPoly_x1 and MyPoly_x1^2:


EFFECT MyPoly=POLYNOMIAL(x1/degree=2 labelstyle=INCLUDENAME);

The INCLUDENAME option is ignored if you also specify the NOSEPARATE option in the EFFECT=POLYNOMIAL statement.

PRODUCTSYMBOL=NONE | quoted string

specifies that the supplied string be used as the product symbol. For example, the following statement generates terms with labels x1, x2, and x1 x2:


EFFECT MyPoly=POLYNOMIAL(x1 x2 / degree=2 mdegree=1
                                 labelstyle=(PRODUCTSYMBOL=" "));

If you specify PRODUCTSYMBOL=NONE, then the labels are formed by juxtaposing the constituent variable names.
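
The labeling rules described for the LABELSTYLE= style-opts can be sketched in a few lines of Python. This is an illustrative approximation, not SAS code: the function `term_label` and its parameter names are hypothetical, and it assumes a term is given as a mapping from variable name to exponent.

```python
def term_label(exponents, expand=False, exponent_symbol="^",
               product_symbol="*", effect_name=None):
    """Build a label for one polynomial term given {variable: exponent}.

    Hypothetical helper that mimics the documented label styles.
    """
    factors = []
    for var, power in exponents.items():
        if power == 0:
            continue  # the variable does not appear in this term
        if power == 1:
            factors.append(var)
        elif expand:
            # EXPAND: write the power as a repeated product
            factors.extend([var] * power)
        else:
            # EXPONENT: write the power with the exponentiation symbol
            factors.append(f"{var}{exponent_symbol}{power}")
    label = product_symbol.join(factors)
    if effect_name is not None:
        # INCLUDENAME: prefix with the effect name and an underscore
        label = f"{effect_name}_{label}"
    return label

term = {"x1": 3, "x2": 1, "x3": 2}
print(term_label(term))                                   # x1^3*x2*x3^2
print(term_label(term, expand=True))                      # x1*x1*x1*x2*x3*x3
print(term_label(term, exponent_symbol="**"))             # x1**3*x2*x3**2
print(term_label({"x1": 1, "x2": 1}, product_symbol=" ")) # x1 x2
print(term_label({"x1": 1, "x2": 1}, product_symbol=""))  # x1x2
print(term_label({"x1": 2}, effect_name="MyPoly"))        # MyPoly_x1^2
```

Passing an empty string as the product symbol plays the role of PRODUCTSYMBOL=NONE in this sketch.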

MDEGREE=n

specifies the maximum degree of any variable in a term of the polynomial. This degree must be a positive integer. The default is the degree of the specified polynomial. For example, the following statement generates the terms $x_1$, $x_2$, $x_1^2$, $x_1 x_2$, $x_2^2$, $x_1^2 x_2$, $x_1 x_2^2$, and $x_1^2 x_2^2$:


EFFECT MyPoly=POLYNOMIAL(x1 x2/degree=4 MDEGREE=2);
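
To see why exactly these eight terms appear, the following Python sketch enumerates every monomial of total degree at most 4 and keeps only those in which no variable has an exponent above 2. The function `polynomial_terms` and its `mdegree` parameter are hypothetical names chosen for this illustration; they are not part of SAS.

```python
from collections import Counter
from itertools import combinations_with_replacement

def polynomial_terms(variables, degree, mdegree=None):
    """List terms of total degree 1..degree with per-variable exponent <= mdegree.

    Hypothetical helper for illustration only; not a SAS function.
    """
    if mdegree is None:
        mdegree = degree  # documented default: the degree of the polynomial
    terms = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(variables, d):
            powers = Counter(combo)  # exponent of each variable in this term
            if max(powers.values()) <= mdegree:
                terms.append("*".join(
                    v if p == 1 else f"{v}^{p}" for v, p in powers.items()))
    return terms

terms = polynomial_terms(["x1", "x2"], degree=4, mdegree=2)
print(terms)
# ['x1', 'x2', 'x1^2', 'x1*x2', 'x2^2', 'x1^2*x2', 'x1*x2^2', 'x1^2*x2^2']
```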

NOSEPARATE

specifies that the polynomial be treated as a single effect with multiple degrees of freedom. The effect name that you specify is used as the constructed effect name, and the labels of the terms are used as labels of the corresponding parameters.

STANDARDIZE <(centerscale-opts)> <= standardize-opt>

specifies that the variables that define the polynomial be standardized. By default, the standardized variables receive prefix "s_" in the variable names.

You can use the following centerscale-opts to specify how the center and scale are estimated:

METHOD=MOMENTS

specifies that the center be estimated by the variable mean and the scale be estimated by the standard deviation. If a weight variable is specified using a WEIGHT statement, the observations with invalid weights are ignored when forming the mean and standard deviation, but the weights are otherwise not used. Only observations that are used in performing the analysis are used for the standardization.

METHOD=RANGE

specifies that the center be estimated by the midpoint of the variable range and the scale be estimated as half the variable range. Any observation that has a missing value for any regressor used in the model is ignored when computing the range of variables in a polynomial effect. Observations with valid regressor values but missing or invalid values of frequency variables, weight variables, or dependent variables are used in computing variable ranges. The default (if you do not specify the METHOD= suboption) is METHOD=RANGE.

METHOD=WMOMENTS

is the same as METHOD=MOMENTS except that weighted means and weighted standard deviations are used.

Let

\[
\begin{aligned}
n &= \text{number of observations used in the analysis} \\
w &= \text{weight variable} \\
f &= \text{frequency variable} \\
x &= \text{variable to be standardized} \\
x_{(n)} &= \max_{i=1}^{n} (x_i) \\
x_{(1)} &= \min_{i=1}^{n} (x_i) \\
F &= \text{sum of frequencies} \\
  &= \sum_{i=1}^{n} f_i \\
\mathrm{WF} &= \text{sum of weighted frequencies} \\
  &= \sum_{i=1}^{n} w_i f_i
\end{aligned}
\]

Table 13 shows how the center and scale are computed for each of the supported methods.

Table 13: Center and Scale Estimates by Method

Method   | Center | Scale
Range    | $(x_{(n)} + x_{(1)})/2$ | $(x_{(n)} - x_{(1)})/2$
Moments  | $\bar{x} = \sum_{i=1}^{n} f_i x_i / F$ | $\sqrt{\sum_{i=1}^{n} f_i (x_i - \bar{x})^2 / (F - 1)}$
WMoments | $\bar{x}_w = \sum_{i=1}^{n} w_i f_i x_i / \mathrm{WF}$ | $\sqrt{\sum_{i=1}^{n} w_i f_i (x_i - \bar{x}_w)^2 / (F - 1)}$
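
The formulas in Table 13 can be transcribed directly into code. The Python sketch below is a simplified illustration, not the SAS implementation: the function name `center_and_scale` is hypothetical, and it assumes that all observations passed in are valid (the handling of missing values and invalid weights described above is not reproduced).

```python
import math

def center_and_scale(x, f=None, w=None, method="RANGE"):
    """Compute (center, scale) following the formulas in Table 13.

    Hypothetical helper; f defaults to unit frequencies, w to unit weights.
    """
    n = len(x)
    f = [1] * n if f is None else f
    w = [1.0] * n if w is None else w
    F = sum(f)  # sum of frequencies

    if method == "RANGE":
        hi, lo = max(x), min(x)
        return (hi + lo) / 2, (hi - lo) / 2
    if method == "MOMENTS":
        weights, total = f, F
    elif method == "WMOMENTS":
        weights = [wi * fi for wi, fi in zip(w, f)]
        total = sum(weights)  # WF, the sum of weighted frequencies
    else:
        raise ValueError(f"unknown method: {method}")

    center = sum(wt * xi for wt, xi in zip(weights, x)) / total
    ss = sum(wt * (xi - center) ** 2 for wt, xi in zip(weights, x))
    # Both moment-based methods divide by F - 1, as in Table 13.
    return center, math.sqrt(ss / (F - 1))

x = [1.0, 2.0, 3.0, 4.0]
print(center_and_scale(x, method="RANGE"))    # (2.5, 1.5)
print(center_and_scale(x, method="MOMENTS"))  # center 2.5, scale sqrt(5/3)
print(center_and_scale(x, w=[1, 1, 1, 3], method="WMOMENTS"))  # center 3.0, scale sqrt(8/3)
```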


PREFIX=NONE | quoted-string

specifies the prefix that is added to the names of standardized variables when forming the term labels. If you omit this option, the default prefix is "s_". If you specify PREFIX=NONE, then standardized variables are not prefixed.

You can control whether the standardization is to center, scale, or both center and scale by specifying a standardize-opt:

CENTER

specifies that variables be centered but not scaled. For a variable x,

$\mathrm{s\_x} = x - \mathrm{center}$

CENTERSCALE

specifies that variables be centered and scaled. This is the default if you do not specify a standardize-opt. For a variable x,

$\mathrm{s\_x} = \dfrac{x - \mathrm{center}}{\mathrm{scale}}$

NONE

specifies that no standardization be performed.

SCALE

specifies that variables be scaled but not centered. For a variable x,

$\mathrm{s\_x} = \dfrac{x}{\mathrm{scale}}$
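
The four standardize-opts differ only in which of the two transformations they apply. As a rough illustration, the following Python sketch applies each option to a small sample, using a center of 2.5 and a scale of 1.5 (the values that METHOD=RANGE would give for the data 1, 2, 3, 4). The function name `standardize` is hypothetical and is not part of SAS.

```python
def standardize(x, center, scale, option="CENTERSCALE"):
    """Apply one standardize-opt to a list of values.

    Hypothetical helper for illustration only; not a SAS function.
    """
    if option == "CENTER":
        return [xi - center for xi in x]            # center only
    if option == "SCALE":
        return [xi / scale for xi in x]             # scale only
    if option == "CENTERSCALE":
        return [(xi - center) / scale for xi in x]  # center, then scale
    if option == "NONE":
        return list(x)                              # leave values unchanged
    raise ValueError(f"unknown standardize-opt: {option}")

x = [1.0, 2.0, 3.0, 4.0]
centered = standardize(x, center=2.5, scale=1.5, option="CENTER")
print(centered)  # [-1.5, -0.5, 0.5, 1.5]
both = standardize(x, center=2.5, scale=1.5)  # default: CENTERSCALE
print(both[0], both[-1])  # -1.0 1.0
```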

Last updated: December 09, 2022