The GAMPL Procedure

Thin-Plate Regression Splines

The GAMPL procedure uses thin-plate regression splines (Wood 2003) to construct spline basis expansions. The thin-plate regression splines are based on thin-plate smoothing splines (Duchon 1976, 1977). Compared to thin-plate smoothing splines, thin-plate regression splines produce fewer basis expansions and thus make direct fitting of generalized additive models possible.

Thin-Plate Smoothing Splines

Consider the problem of estimating a smooth function $f$ of $\mathbf{x}$ with $d$ covariates from $n$ observations. The model assumes

$$y_i = f(\mathbf{x}_i) + \epsilon_i, \quad i = 1, \ldots, n$$

A thin-plate smoothing spline then estimates the smooth function $f$ by minimizing the penalized least squares criterion:

$$\sum_{i=1}^{n} \left(y_i - f(\mathbf{x}_i)\right)^2 + \lambda J_{m,d}(f)$$

The penalty term $\lambda J_{m,d}(f)$ involves the functional $J_{m,d}(f)$, which measures the roughness of the estimate of $f$:

$$J_{m,d}(f) = \int \cdots \int \sum_{\alpha_1 + \cdots + \alpha_d = m} \frac{m!}{\alpha_1! \cdots \alpha_d!} \left( \frac{\partial^m f}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}} \right)^2 \mathrm{d}x_1 \cdots \mathrm{d}x_d$$

The parameter $m$ (which corresponds to the M= option for a spline effect) specifies the order of derivatives to which the penalty applies: function derivatives whose order is less than $m$ are not penalized. The relation $2m > d$ must be satisfied.

The penalty term also includes the smoothing parameter $\lambda \in [0, \infty)$, which controls the trade-off between the model's fidelity to the data and the smoothness of $f$. When $\lambda = 0$, the function estimate interpolates the data. When $\lambda \to \infty$, the function estimate becomes the least squares fit. Given the penalized least squares criterion and a fixed $\lambda$ value, you can express the estimate of the smooth function $f$ explicitly in the following form:

$$f_\lambda(\mathbf{x}) = \sum_{j=1}^{M} \theta_j \phi_j(\mathbf{x}) + \sum_{i=1}^{n} \delta_i\, \eta_{m,d}\left(\|\mathbf{x} - \mathbf{x}_i\|\right)$$

In the expression for $f_\lambda(\mathbf{x})$, $\delta_i$ and $\theta_j$ are coefficients to be estimated. The functions $\phi_j(\mathbf{x})$ correspond to unpenalized polynomials of $\mathbf{x}$ with degrees up to $m - 1$; the total number of these polynomials is $M = \binom{m+d-1}{d}$. The function $\eta_{m,d}$ models the nonlinearity beyond the polynomials and is a function of the Euclidean distance $r$ between any $\mathbf{x}$ value and an observed $\mathbf{x}_i$ value:

$$\eta_{m,d}(r) = \begin{cases} \dfrac{(-1)^{m+1+d/2}}{2^{2m-1}\,\pi^{d/2}\,(m-1)!\,(m-d/2)!}\; r^{2m-d}\log(r) & \text{if } d \text{ is even} \\[2ex] \dfrac{\Gamma(d/2-m)}{2^{2m}\,\pi^{d/2}\,(m-1)!}\; r^{2m-d} & \text{if } d \text{ is odd} \end{cases}$$

Define the penalty matrix $\mathbf{E}$ such that each entry $E_{ij} = \eta_{m,d}\left(\|\mathbf{x}_i - \mathbf{x}_j\|\right)$, let $\mathbf{y}$ be the vector of the response, let $\mathbf{T}$ be the matrix whose $i$th row contains the values $\phi_j(\mathbf{x}_i)$, and let $\boldsymbol{\theta}$ and $\boldsymbol{\delta}$ be the vectors of coefficients $\theta_j$ and $\delta_i$. Then you can obtain the function estimate $f_\lambda$ from the following minimization problem:

$$\min\; \|\mathbf{y} - \mathbf{T}\boldsymbol{\theta} - \mathbf{E}\boldsymbol{\delta}\|^2 + \lambda\, \boldsymbol{\delta}' \mathbf{E} \boldsymbol{\delta} \quad \text{subject to } \mathbf{T}'\boldsymbol{\delta} = \mathbf{0}$$

For more information about thin-plate smoothing splines, see Chapter 125, The TPSPLINE Procedure.

Low-Rank Approximation

Given the representation of the thin-plate smoothing spline, the estimate of $f$ involves as many parameters as there are unique data points. Solving for $(\boldsymbol{\theta}, \boldsymbol{\delta})$ with an optimal $\lambda$ becomes difficult for large problems.

Because the matrix $\mathbf{E}$ is symmetric and nonnegative definite, the eigendecomposition can be taken as $\mathbf{E} = \mathbf{U}\mathbf{D}\mathbf{U}'$, where $\mathbf{D}$ is the diagonal matrix of eigenvalues $d_i$ of $\mathbf{E}$, and $\mathbf{U}$ is the matrix of eigenvectors that correspond to $d_i$. The truncated eigendecomposition forms $\tilde{\mathbf{E}}_k$, which is an approximation to $\mathbf{E}$ such that

$$\tilde{\mathbf{E}}_k = \mathbf{U}_k \mathbf{D}_k \mathbf{U}_k'$$

where $\mathbf{D}_k$ is a diagonal matrix that contains the $k$ most extreme eigenvalues, in descending order of absolute value: $|\tilde{d}_1| > \cdots > |\tilde{d}_k|$. $\mathbf{U}_k$ is the matrix formed by the columns of eigenvectors that correspond to the eigenvalues in $\mathbf{D}_k$.

The approximation $\tilde{\mathbf{E}}_k$ not only reduces the dimension from the $n \times n$ of $\mathbf{E}$ to $n \times k$ but also is optimal in two senses. First, $\tilde{\mathbf{E}}_k$ minimizes the spectral norm $\|\mathbf{E} - \mathbf{F}_k\|_2$ over all rank-$k$ matrices $\mathbf{F}_k$. Second, $\tilde{\mathbf{E}}_k$ also minimizes the worst possible change that the eigenspace truncation introduces, as defined by

$$\max_{\boldsymbol{\delta} \neq \mathbf{0}} \frac{\boldsymbol{\delta}'(\mathbf{E} - \mathbf{G}_k)\boldsymbol{\delta}}{\|\boldsymbol{\delta}\|^2}$$

where $\mathbf{G}_k$ is formed by any $k$ eigenvalues and corresponding eigenvectors. For more information, see Wood (2003).

Now given $\mathbf{E} \approx \tilde{\mathbf{E}}_k$ and $\tilde{\mathbf{E}}_k = \mathbf{U}_k \mathbf{D}_k \mathbf{U}_k'$, and letting $\boldsymbol{\delta}_k = \mathbf{U}_k' \boldsymbol{\delta}$, the minimization problem becomes

$$\min\; \|\mathbf{y} - \mathbf{T}\boldsymbol{\theta} - \mathbf{U}_k \mathbf{D}_k \boldsymbol{\delta}_k\|^2 + \lambda\, \boldsymbol{\delta}_k' \mathbf{D}_k \boldsymbol{\delta}_k \quad \text{subject to } \mathbf{T}'\mathbf{U}_k \boldsymbol{\delta}_k = \mathbf{0}$$

You can turn the constrained optimization problem into an unconstrained one by using an orthogonal column basis $\mathbf{Z}$ for the null space of the constraint. One way to form $\mathbf{Z}$ is via the QR decomposition of $\mathbf{U}_k'\mathbf{T}$:

$$\mathbf{U}_k'\mathbf{T} = \begin{bmatrix} \mathbf{Q}_1 & \mathbf{Q}_2 \end{bmatrix} \begin{bmatrix} \mathbf{R} \\ \mathbf{0} \end{bmatrix}$$

Let $\mathbf{Z} = \mathbf{Q}_2$. Then it follows that

$$\mathbf{T}'\mathbf{U}_k\mathbf{Z} = \mathbf{R}'\mathbf{Q}_1'\mathbf{Q}_2 = \mathbf{0}$$

So for any $\boldsymbol{\delta}_k$ such that $\mathbf{T}'\mathbf{U}_k\boldsymbol{\delta}_k = \mathbf{0}$, there is some $\tilde{\boldsymbol{\delta}}$ with $\boldsymbol{\delta}_k = \mathbf{Z}\tilde{\boldsymbol{\delta}}$. Now the problem becomes the unconstrained optimization,

$$\min\; \|\mathbf{y} - \mathbf{T}\boldsymbol{\theta} - \mathbf{U}_k\mathbf{D}_k\mathbf{Z}\tilde{\boldsymbol{\delta}}\|^2 + \lambda\, \tilde{\boldsymbol{\delta}}'\mathbf{Z}'\mathbf{D}_k\mathbf{Z}\tilde{\boldsymbol{\delta}}$$

Let

$$\boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\theta} \\ \tilde{\boldsymbol{\delta}} \end{bmatrix}, \quad \mathbf{X} = [\mathbf{T} : \mathbf{U}_k\mathbf{D}_k\mathbf{Z}], \quad \text{and} \quad \mathbf{S} = \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{Z}'\mathbf{D}_k\mathbf{Z} \end{bmatrix}$$

Then the optimization simplifies to

$$\min_{\boldsymbol{\beta}}\; \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2 + \lambda\, \boldsymbol{\beta}' \mathbf{S} \boldsymbol{\beta}$$