The GAMPL Procedure

Fitting Algorithms

For models that assume a normally distributed response variable, you can minimize the model evaluation criteria directly by searching for optimal smoothing parameters. For models that have nonnormal response distributions, you cannot minimize the model evaluation criteria directly, because the statistics that they involve change between iterations. The GAMPL procedure enables you to use either of two fitting approaches to search for an optimal model: the outer iteration method and the performance iteration method. The outer iteration method modifies the model evaluation criteria so that a single global objective function can be minimized to find the best smoothing parameters. The performance iteration method instead minimizes a series of objective functions iteratively and obtains the optimal smoothing parameters at convergence. For large data sets, the performance iteration method usually converges faster than the outer iteration method.
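For orientation, one such criterion is the generalized cross validation (GCV) criterion; for a normally distributed response with hat matrix $\mathbf{H}_{\boldsymbol{\lambda}}$, it takes the familiar form (see Wood 2004)

$$\mathcal{V}_g(\boldsymbol{\lambda}) = \frac{n\,\lVert \mathbf{y} - \mathbf{H}_{\boldsymbol{\lambda}}\mathbf{y} \rVert^2}{\left(n - \operatorname{tr}(\mathbf{H}_{\boldsymbol{\lambda}})\right)^2}$$

The numerator $\lVert\mathbf{y} - \mathbf{H}_{\boldsymbol{\lambda}}\mathbf{y}\rVert^2$ is exactly the quantity that the outer iteration method replaces with the deviance, as described in the next subsection.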

Outer Iteration

The outer iteration method is outlined in Wood (2006). The method uses modified model evaluation criteria, which are defined as follows:

$$
\begin{aligned}
\mathcal{V}^o_g(\boldsymbol{\lambda}) &= \frac{n\,D_{\boldsymbol{\lambda}}(\boldsymbol{\mu})}{\left(n - \gamma\operatorname{tr}(\mathbf{H}_{\boldsymbol{\lambda}})\right)^2} \\
\mathcal{V}^o_u(\boldsymbol{\lambda}) &= \frac{D_{\boldsymbol{\lambda}}(\boldsymbol{\mu})}{n} - \frac{2}{n}\,\sigma^2\operatorname{tr}(\mathbf{I} - \gamma\mathbf{H}_{\boldsymbol{\lambda}}) + \sigma^2 \\
\mathcal{V}^o_a(\boldsymbol{\lambda}) &= \frac{D_{\boldsymbol{\lambda}}(\boldsymbol{\mu})}{n} + \frac{2\gamma\operatorname{tr}(\mathbf{H}_{\boldsymbol{\lambda}})\,P_{\boldsymbol{\lambda}}}{n\operatorname{tr}(\mathbf{I} - \mathbf{H}_{\boldsymbol{\lambda}})}
\end{aligned}
$$

By replacing $\lVert\mathbf{y} - \mathbf{H}_{\boldsymbol{\lambda}}\mathbf{y}\rVert^2$ with the model deviance $D_{\boldsymbol{\lambda}}(\boldsymbol{\mu})$, the modified model evaluation criteria depend on the smoothing parameters $\boldsymbol{\lambda}$ directly, so that the analytic gradient and Hessian are available in explicit form. The Pearson statistic $P_{\boldsymbol{\lambda}}$ is used in the GACV criterion $\mathcal{V}^o_a(\boldsymbol{\lambda})$ (Wood 2008). The algorithm for the outer iteration method is as follows:
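As a concrete illustration, the following Python sketch evaluates the three modified criteria from scalar summaries of a fit. All of the names (`n`, `deviance`, `tr_H`, and so on) are hypothetical inputs, not PROC GAMPL internals; the function simply restates the three formulas above.

```python
import numpy as np

def modified_criteria(n, deviance, tr_H, pearson, sigma2, gamma=1.0):
    """Evaluate the modified GCV, UBRE, and GACV criteria.

    n        : number of observations
    deviance : model deviance D_lambda(mu)
    tr_H     : trace of the hat matrix H_lambda
    pearson  : Pearson's statistic P_lambda (used by GACV)
    sigma2   : dispersion parameter sigma^2 (used by UBRE)
    gamma    : factor that inflates the effective degrees of freedom

    The identities tr(I - gamma*H) = n - gamma*tr(H) and
    tr(I - H) = n - tr(H) are used to avoid forming H explicitly.
    """
    V_g = n * deviance / (n - gamma * tr_H) ** 2                          # GCV
    V_u = deviance / n - 2.0 / n * sigma2 * (n - gamma * tr_H) + sigma2   # UBRE
    V_a = deviance / n + 2.0 * gamma * tr_H * pearson / (n * (n - tr_H))  # GACV
    return V_g, V_u, V_a
```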

  1. Initialize the smoothing parameters by taking one step of the performance iteration based on the adjusted response and adjusted weight, except for spline terms whose initial values are specified in the INITSMOOTH= option.

  2. Search for the best smoothing parameters by minimizing the modified model evaluation criteria. The optimization process stops when any of the convergence criteria that are specified in the SMOOTHOPTIONS option is met. At each optimization step:

    a. Initialize by setting the initial regression parameters $\boldsymbol{\beta} = \{g(\hat{\mathbf{y}}), 0, \ldots, 0\}'$. Set the initial dispersion parameter if necessary.

    b. Search for the best regression parameters $\boldsymbol{\beta}$ by minimizing the penalized deviance $D_p$ (or by maximizing the penalized likelihood $\ell_p$ for the negative binomial distribution). The optimization process stops when any of the convergence criteria that are specified in the PLIKEOPTIONS option is met.

    c. At convergence, evaluate the derivatives of the model evaluation criteria with respect to $\boldsymbol{\lambda}$ by using $\partial D_p/\partial\boldsymbol{\beta}$, $\partial^2 D_p/(\partial\boldsymbol{\beta}\,\partial\boldsymbol{\beta}')$, $\partial\boldsymbol{\beta}/\partial\lambda_j$, and $\partial^2\boldsymbol{\beta}/(\partial\lambda_j\,\partial\lambda_k)$.

Step 2b usually converges quickly because it is essentially penalized likelihood estimation, given that $D_p = 2\phi(\ell_{\max} - \ell) + \boldsymbol{\beta}'\mathbf{S}_{\boldsymbol{\lambda}}\boldsymbol{\beta}$. Step 2c involves extensive computation that uses the chain rule for derivatives. For more information about computing the derivatives of $\mathcal{V}^o_g$ and $\mathcal{V}^o_u$, see Wood (2008, 2011).
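To make the nested structure concrete, the following minimal Python sketch carries out the outer search for a Gaussian response, where the inner fit (step 2b) has a closed form; for nonnormal responses the inner fit would itself be an iterative penalized likelihood optimization. The polynomial basis, the difference penalty `S`, and the grid search over $\lambda$ (in place of the gradient-based optimization that the procedure uses) are all illustrative assumptions, not PROC GAMPL's implementation.

```python
import numpy as np

def fit_beta(X, y, S, lam):
    """Inner step (2b): penalized least squares fit for a fixed lambda."""
    XtX = X.T @ X
    beta = np.linalg.solve(XtX + lam * S, X.T @ y)
    # Hat matrix H_lambda maps y to the fitted values for this lambda.
    H = X @ np.linalg.solve(XtX + lam * S, X.T)
    return beta, H

def outer_iteration(X, y, S, lam_grid, gamma=1.0):
    """Outer loop: choose lambda by minimizing the GCV-type criterion.

    For a Gaussian response, the deviance D_lambda(mu) is the residual
    sum of squares, so V_g = n * D / (n - gamma * tr(H))**2.
    """
    n = len(y)
    best = None
    for lam in lam_grid:
        beta, H = fit_beta(X, y, S, lam)
        resid = y - X @ beta
        deviance = resid @ resid
        V_g = n * deviance / (n - gamma * np.trace(H)) ** 2
        if best is None or V_g < best[0]:
            best = (V_g, lam, beta)
    return best  # (criterion value, lambda, coefficients)

# Usage: a toy polynomial basis with a second-difference penalty.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 200)
X = np.vander(x, 8, increasing=True)       # simple polynomial basis
D2 = np.diff(np.eye(8), n=2, axis=0)       # second-difference penalty
S = D2.T @ D2
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)
V, lam, beta = outer_iteration(X, y, S, np.logspace(-8, 4, 25))
```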

Performance Iteration

The performance iteration method was proposed by Gu and Wahba (1991); Wood (2004) modified and stabilized the algorithm for fitting generalized additive models. The algorithm for the performance iteration method is as follows:

  1. Initialize the smoothing parameters $\boldsymbol{\lambda} = \{1, \ldots, 1\}$, except for spline terms whose initial values are specified in the INITSMOOTH= option. Set the initial regression parameters $\boldsymbol{\beta} = \{g(\bar{\mathbf{y}}), 0, \ldots, 0\}'$. Set the initial dispersion parameter if necessary.

  2. Repeat the following steps until any of these three conditions is met: (1) the absolute function change in penalized likelihood is sufficiently small; (2) the absolute relative function change in penalized likelihood is sufficiently small; (3) the number of iterations exceeds the maximum iteration limit.

    a. Form the adjusted response and adjusted weight from $\boldsymbol{\mu} = g^{-1}(\boldsymbol{\eta})$. For each observation,

       $$z_i = \eta_i + (y_i - \mu_i)/\mu_i' \qquad w_i = \omega_i\,{\mu_i'}^2/\nu_i$$
    b. Search for the best smoothing parameters for the current iteration, based on the valid adjusted response and adjusted weight values (a sketch of substeps a through d follows this list).

    c. Use the smoothing parameters to construct the linear predictor and the predicted means.

    d. Obtain an estimate of the dispersion parameter if necessary.
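The loop in step 2 is essentially a penalized form of iteratively reweighted least squares. The following Python sketch shows one pass through substeps a through d for a Poisson model with a log link, so that $\mu_i' = d\mu_i/d\eta_i = \mu_i$ and the variance function is $\nu_i = \mu_i$. The basis `X`, the penalty `S`, and the grid search for $\lambda$ are illustrative assumptions, not the procedure's internals.

```python
import numpy as np

def performance_step(X, y, S, beta, lam_grid, omega=1.0):
    """One performance iteration for a Poisson model with log link."""
    eta = X @ beta
    mu = np.exp(eta)

    # (a) adjusted response z_i and adjusted weight w_i
    z = eta + (y - mu) / mu      # z_i = eta_i + (y_i - mu_i)/mu_i'
    w = omega * mu ** 2 / mu     # w_i = omega_i * mu_i'^2 / nu_i = omega_i * mu_i

    # (b) pick lambda for this iteration by weighted GCV on (z, w)
    n = len(y)
    best = None
    for lam in lam_grid:
        XtWX = X.T @ (w[:, None] * X)
        A = np.linalg.solve(XtWX + lam * S, X.T * w)
        new_beta = A @ z
        H = X @ A                                  # weighted hat matrix
        rss = w @ (z - X @ new_beta) ** 2
        V_g = n * rss / (n - np.trace(H)) ** 2
        if best is None or V_g < best[0]:
            best = (V_g, lam, new_beta)

    # (c) construct the linear predictor and the predicted means;
    # (d) no dispersion estimate is needed for the Poisson family.
    _, lam, beta = best
    return beta, lam, np.exp(X @ beta)

# Usage sketch: initialize beta = {g(ybar), 0, ..., 0}' (g = log here)
# and repeat until the penalized likelihood stabilizes, e.g.:
#   beta = np.zeros(X.shape[1]); beta[0] = np.log(y.mean())
#   for _ in range(50):
#       beta, lam, mu_hat = performance_step(X, y, S, beta,
#                                            np.logspace(-6, 6, 25))
```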

In step 2b, you can use different optimization techniques to search for the best smoothing parameters. Newton-Raphson optimization is efficient at finding the optimal $\boldsymbol{\lambda}$ because the first- and second-order derivatives are available. For more information about computing the derivatives of $\mathcal{V}_g$ and $\mathcal{V}_u$ with respect to $\boldsymbol{\lambda}$, see Wood (2004).
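As a point of reference, a Newton-Raphson step for such a search takes the familiar form shown below; the log-scale parameterization $\rho_j = \log\lambda_j$, which keeps the smoothing parameters positive, is an assumption made here for illustration rather than a statement about the procedure:

$$\boldsymbol{\rho}^{(k+1)} = \boldsymbol{\rho}^{(k)} - \left(\nabla^2_{\boldsymbol{\rho}}\,\mathcal{V}\right)^{-1}\nabla_{\boldsymbol{\rho}}\,\mathcal{V}$$

where $\mathcal{V}$ denotes the criterion being minimized ($\mathcal{V}_g$ or $\mathcal{V}_u$), and the gradient and Hessian are evaluated at $\boldsymbol{\rho}^{(k)}$.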
