The NLIN Procedure

Example 88.4 Affecting Curvature through Parameterization

(View the complete code for this example.)

The work of Ratkowsky (1983, 1990) has brought into focus the importance of close-to-linear behavior of parameters in nonlinear regression models. The curvature in a nonlinear model consists of two components: the intrinsic curvature and the parameter-effects curvature. See the section Relative Curvature Measures of Nonlinearity for details. Intrinsic curvature expresses the degree to which the nonlinear model bends as values of the parameters change. This is not the same as the curviness of the model as a function of the covariates (the x variables). Intrinsic curvature is a function of the type of model you are fitting and the data. This curvature component cannot be affected by reparameterization of the model. According to Ratkowsky (1983), the intrinsic curvature component is typically smaller than the parameter-effects curvature, which can be affected by altering the parameterization of the model.

In models with low curvature, the nonlinear least squares parameter estimators behave similarly to least squares estimators in linear regression models, which have a number of desirable properties. If the model is correct, they are best linear unbiased estimators and are normally distributed if the model errors are normal (otherwise they are asymptotically normal). As you lower the curvature of a nonlinear model, you can expect that the parameter estimators approach the behavior of the linear regression model estimators: they behave "close to linear."

This example uses a simple data set and a commonly applied model for dose-response relationships to examine how the parameter-effects curvature can be reduced. The statistics by which an estimator’s behavior is judged are Box’s bias (Box 1971) and Hougaard’s measure of skewness (Hougaard 1982, 1985).

The log-logistic model

normal upper E left-bracket upper Y vertical-bar x right-bracket equals delta plus StartFraction alpha minus delta Over 1 plus gamma exp left-brace beta ln left-parenthesis x right-parenthesis right-brace EndFraction

is a popular model to express the response Y as a function of dose x. The response is bounded between the asymptotes alpha and delta. The term in the denominator governs the transition between the asymptotes and depends on two parameters, gamma and beta. The log-logistic model can be viewed as a member of a broader class of dose-response functions, those relying on switch-on or switch-off mechanisms (see, for example, Schabenberger and Pierce 2002, sec. 5.8.6). A switch function is usually a monotonic function upper S left-parenthesis x comma bold-italic theta right-parenthesis that takes values between 0 and 1. A switch-on function increases in x; a switch-off function decreases in x. In the log-logistic case, the function

upper S left-parenthesis x comma left-bracket beta comma gamma right-bracket right-parenthesis equals StartFraction 1 Over 1 plus gamma exp left-brace beta ln left-parenthesis x right-parenthesis right-brace EndFraction

is a switch-off function for beta greater-than 0 and a switch-on function for beta less-than 0. You can write general dose-response functions with asymptotes simply as

normal upper E left-bracket upper Y vertical-bar x right-bracket equals mu Subscript min Baseline plus left-parenthesis mu Subscript max Baseline minus mu Subscript min Baseline right-parenthesis upper S left-parenthesis x comma bold-italic theta right-parenthesis

The following DATA step creates a small data set from a dose-response experiment with response y:

data logistic;
   input dose y;
   logdose = log(dose);
   datalines;
0.009  106.56
0.035   94.12
0.07    89.76
0.15    60.21
0.20    39.95
0.28    21.88
0.50     7.46
;

A graph of these data is produced with the following statements:

proc sgplot data=logistic;
   scatter y=y x=dose;
   xaxis type=log logstyle=linear;
run;

Output 88.4.1: Observed Data in Dose-Response Experiment

 Observed Data in Dose-Response Experiment


When dose is expressed on the log scale, the sigmoidal shape of the dose-response relationship is clearly visible (Output 88.4.1). The log-logistic switching model in the preceding parameterization is fit with the following statements in the NLIN procedure:

proc nlin data=logistic bias hougaard nlinmeasures;
   parameters alpha=100 beta=3 gamma=300;
   delta = 0;
   Switch = 1/(1+gamma*exp(beta*log(dose)));
   model y = delta + (alpha - delta)*Switch;
run;

The lower asymptote delta is assumed to be 0 in this case. Since delta is not listed in the PARAMETERS statement and is assigned a value in the program, it is assumed to be constant. Note that the term Switch is the switch-off function in the log-logistic model. The BIAS and HOUGAARD options in the PROC NLIN statement request that Box’s bias, percentage bias, and Hougaard’s skewness measure be added to the table of parameter estimates, and the NLINMEASURES option requests that the global nonlinearity measures be produced.

The NLIN procedure converges after 10 iterations and achieves a residual mean squared error of 15.1869 (Figure 15). This value is not that important by itself, but it is worth noting since this model fit is compared to the fit with other parameterizations later on.

Figure 15: Iteration History and Analysis of Variance

The NLIN Procedure
Dependent Variable y
Method: Gauss-Newton

Iterative Phase
Iter alpha beta gamma Sum of Squares
0 100.0 3.0000 300.0 386.4
1 100.4 2.8011 162.8 129.1
2 100.8 2.6184 101.4 69.2710
3 101.3 2.4266 69.7579 68.2167
4 101.7 2.3790 69.0358 60.8223
5 101.8 2.3621 67.3709 60.7516
6 101.8 2.3582 67.0044 60.7477
7 101.8 2.3573 66.9150 60.7475
8 101.8 2.3571 66.8948 60.7475
9 101.8 2.3570 66.8902 60.7475
10 101.8 2.3570 66.8892 60.7475

NOTE: Convergence criterion met.


Note: An intercept was not specified for this model.

Source DF Sum of Squares Mean Square F Value Approx
Pr > F
Model 3 33965.4 11321.8 745.50 <.0001
Error 4 60.7475 15.1869    
Uncorrected Total 7 34026.1      


The table of parameter estimates displays the estimates of the three model parameters, their approximate standard errors, 95% confidence limits, Hougaard’s skewness measure, Box’s bias, and percentage bias (Figure 16). Parameters for which the skewness measure is less than 0.1 in absolute value and with percentage bias less than 1% exhibit very close-to-linear behavior, and skewness values less than 0.25 in absolute value indicate reasonably close-to-linear behavior (Ratkowsky 1990). According to these rules, the estimators ModifyingAbove beta With caret and ModifyingAbove gamma With caret suffer from substantial curvature. The estimator ModifyingAbove gamma With caret is especially "far-from-linear." Inferences that involve ModifyingAbove gamma With caret and rely on the reported standard errors or confidence limits (or both) for this parameter might be questionable.

Figure 16: Parameter Estimates, Hougaard’s Skewness, and Box’s Bias

Parameter Estimate Approx
Std Error
Approximate 95% Confidence
Limits
Skewness Bias Percent
Bias
alpha 101.8 3.0034 93.4751 110.2 0.1415 0.1512 0.15
beta 2.3570 0.2928 1.5440 3.1699 0.4987 0.0303 1.29
gamma 66.8892 31.6146 -20.8870 154.7 1.9200 10.9230 16.3


The related global nonlinearity measures output table (Figure 17) shows that both the maximum and RMS parameter-effects curvature are substantially larger than the critical curvature value recommended by Bates and Watts (1980). In contrast, the intrinsic curvatures of the model are less than the critical value. This implies that most of the nonlinearity can be removed by reparameterization.

Figure 17: Global Nonlinearity Measures

Global Nonlinearity Measures
Max Intrinsic Curvature 0.2397
RMS Intrinsic Curvature 0.1154
Max Parameter-Effects Curvature 4.0842
RMS Parameter-Effects Curvature 1.8198
Curvature Critical Value 0.3895
Raw Residual Variance 15.187
Projected Residual Variance 5.922


One method of reducing the parameter-effects curvature, and thereby reduce the bias and skewness of the parameter estimators, is to replace a parameter with its expected-value parameterization. Schabenberger et al. (1999) and Schabenberger and Pierce (2002, sec. 5.7.2) refer to this method as reparameterization through defining relationships. A defining relationship is obtained by equating the mean response at a chosen value of x (say, x Superscript asterisk) to the model:

normal upper E left-bracket upper Y vertical-bar x Superscript asterisk Baseline right-bracket equals delta plus StartFraction alpha minus delta Over 1 plus gamma exp left-brace beta ln left-parenthesis x Superscript asterisk Baseline right-parenthesis right-brace EndFraction

This equation is then solved for a parameter that is subsequently replaced in the original equation. This method is particularly useful if x Superscript asterisk has an interesting interpretation. For example, let lamda Subscript upper K denote the value that reduces the response by upper K times 100%,

normal upper E left-bracket upper Y vertical-bar lamda Subscript upper K Baseline right-bracket equals delta plus left-parenthesis StartFraction 100 minus upper K Over 100 EndFraction right-parenthesis left-parenthesis alpha minus delta right-parenthesis

Because gamma exhibits large bias and skewness, it is the target in the first round of reparameterization. Setting the expression for the conditional mean at lamda Subscript upper K equal to the mean function when x equals lamda Subscript upper K yields the following expression:

delta plus left-parenthesis StartFraction 100 minus upper K Over 100 EndFraction right-parenthesis left-parenthesis alpha minus delta right-parenthesis equals delta plus StartFraction alpha minus delta Over 1 plus gamma exp left-brace beta ln left-parenthesis lamda Subscript upper K Baseline right-parenthesis right-brace EndFraction

This expression is solved for gamma, and the result is substituted back into the model equation. This leads to a log-logistic model in which gamma is replaced by the parameter lamda Subscript upper K, the dose at which the response was reduced by upper K times 100%. The new model equation is

normal upper E left-bracket upper Y vertical-bar x right-bracket equals delta plus StartFraction alpha minus delta Over 1 plus upper K slash left-parenthesis 100 minus upper K right-parenthesis exp left-brace beta ln left-parenthesis x slash lamda Subscript upper K Baseline right-parenthesis right-brace EndFraction

A particularly interesting choice is K = 50, since lamda 50 is the dose at which the response is halved. In studies of mortality, this concentration is also known as the LD50. For the special case of lamda 50 the model equation becomes

normal upper E left-bracket upper Y vertical-bar x right-bracket equals delta plus StartFraction alpha minus delta Over 1 plus exp left-brace beta ln left-parenthesis x slash lamda 50 right-parenthesis right-brace EndFraction

You can fit the model in the LD50 parameterization with the following statements:

proc nlin data=logistic bias hougaard;
   parameters alpha=100 beta=3 LD50=0.15;
   delta = 0;
   Switch = 1/(1+exp(beta*log(dose/LD50)));
   model y = delta + (alpha - delta)*Switch;
   output out=nlinout pred=p lcl=lcl ucl=ucl;
run;

Partial results from this NLIN run are shown in Figure 18. The analysis of variance tables in Figure 15 and Figure 18 are identical. Changing the parameterization of a model does not affect the model fit. It does, however, affect the interpretation of the parameters and the statistical properties (close-to-linear behavior) of the parameter estimators. The skewness and bias measures of the parameter LD50 is considerably reduced compared to those values for the parameter gamma in the previous parameterization. Also, gamma has been replaced by a parameter with a useful interpretation, the dose that yields a 50% reduction in mean response. Also notice that the bias and skewness measures of alpha and beta are not affected by the gamma right-arrow LD50 reparameterization.

Figure 18: ANOVA Table and Parameter Estimates in LD50 Parameterization

The NLIN Procedure
Dependent Variable y
Method: Gauss-Newton

NOTE: Convergence criterion met.


Note: An intercept was not specified for this model.

Source DF Sum of Squares Mean Square F Value Approx
Pr > F
Model 3 33965.4 11321.8 745.50 <.0001
Error 4 60.7475 15.1869    
Uncorrected Total 7 34026.1      

Parameter Estimate Approx
Std Error
Approximate 95% Confidence
Limits
Skewness Bias Percent
Bias
alpha 101.8 3.0034 93.4752 110.2 0.1415 0.1512 0.15
beta 2.3570 0.2928 1.5440 3.1699 0.4987 0.0303 1.29
LD50 0.1681 0.00915 0.1427 0.1935 -0.0605 -0.00013 -0.08


To reduce the parameter-effects curvature of the beta parameter, you can use the technique of defining relationships again. This can be done generically, by solving

mu Superscript asterisk Baseline equals delta plus StartFraction alpha minus delta Over 1 plus exp left-brace beta ln left-parenthesis x slash lamda right-parenthesis right-brace EndFraction

for beta, treating mu Superscript asterisk as the new parameter (in lieu of beta), and choosing a value for x Superscript asterisk that leads to low skewness. This results in the expected-value parameterization of beta. Solving for beta yields

beta equals StartStartFraction log left-parenthesis StartFraction alpha minus mu Superscript asterisk Baseline Over mu Superscript asterisk Baseline minus delta EndFraction right-parenthesis OverOver log left-parenthesis x Superscript asterisk Baseline slash lamda right-parenthesis EndEndFraction

The interpretation of the parameter mu Superscript asterisk that replaces beta in the model equation is simple: it is the mean dose response when the dose is x Superscript asterisk. Fixing x Superscript asterisk Baseline equals 0.3, the following PROC NLIN statements fit this model:

proc nlin data=logistic bias hougaard nlinmeasures;
   parameters alpha=100 mustar=20 LD50=0.15;
   delta   = 0;
   xstar   = 0.3;
   beta    = log((alpha - mustar)/(mustar - delta)) / log(xstar/LD50);
   Switch  = 1/(1+exp(beta*log(dose/LD50)));
   model y = delta + (alpha - delta)*Switch;
   output out=nlinout pred=p lcl=lcl ucl=ucl;
run;

Note that the switch-off function continues to be written in terms of beta and the LD50. The only difference from the previous model is that beta is now expressed as a function of the parameter mu Superscript asterisk. Using expected-value parameterizations is a simple mechanism to lower the curvature in a model and to arrive at starting values. The starting value for mu Superscript asterisk can be gleaned from Output 88.4.1 at x = 0.3.

Figure 19 shows selected results from this NLIN run. The ANOVA table is again unaffected by the change in parameterization. The skewness for mu Superscript asterisk is significantly reduced in comparison to those of the beta parameter in the previous model (compare Figure 19 and Figure 18), while its bias remains on the same scale from Figure 18 to Figure 19. Also note the substantial reduction in the parameter-effects curvature values. As expected, the intrinsic curvature values remain intact.

Figure 19: ANOVA Table and Parameter Estimates in Expected-Value Parameterization

The NLIN Procedure
Dependent Variable y
Method: Gauss-Newton

NOTE: Convergence criterion met.


Note: An intercept was not specified for this model.

Source DF Sum of Squares Mean Square F Value Approx
Pr > F
Model 3 33965.4 11321.8 745.50 <.0001
Error 4 60.7475 15.1869    
Uncorrected Total 7 34026.1      

Parameter Estimate Approx
Std Error
Approximate 95% Confidence
Limits
Skewness Bias Percent
Bias
alpha 101.8 3.0034 93.4752 110.2 0.1415 0.1512 0.15
mustar 20.7073 2.6430 13.3693 28.0454 -0.0572 -0.0983 -0.47
LD50 0.1681 0.00915 0.1427 0.1935 -0.0605 -0.00013 -0.08

Global Nonlinearity Measures
Max Intrinsic Curvature 0.2397
RMS Intrinsic Curvature 0.1154
Max Parameter-Effects Curvature 0.2925
RMS Parameter-Effects Curvature 0.1500
Curvature Critical Value 0.3895
Raw Residual Variance 15.187
Projected Residual Variance 5.9219


Last updated: December 09, 2022