The HPLMIXED Procedure

Estimating Covariance Parameters in the Mixed Model

Estimation is more difficult in the mixed model than in the general linear model. Not only do you have $\boldsymbol{\beta}$ as in the general linear model, but you also have unknown parameters in $\boldsymbol{\gamma}$, $\mathbf{G}$, and $\mathbf{R}$. Least squares is no longer the best method. Generalized least squares (GLS) is more appropriate, minimizing

\[
(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'\,\mathbf{V}^{-1}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})
\]

However, GLS requires knowledge of $\mathbf{V}$ and therefore knowledge of $\mathbf{G}$ and $\mathbf{R}$. Lacking such information, one approach is to use estimated GLS, in which you insert some reasonable estimate for $\mathbf{V}$ into the minimization problem. The goal thus becomes finding reasonable estimates of $\mathbf{G}$ and $\mathbf{R}$.
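For a fixed $\mathbf{V}$, the GLS criterion above is minimized in closed form by

\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\,\mathbf{X}'\mathbf{V}^{-1}\mathbf{y}
\]

where $(\cdot)^{-}$ denotes a generalized inverse. Estimated GLS simply substitutes the estimate $\widehat{\mathbf{V}}$ for $\mathbf{V}$ in this expression.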

In many situations, the best approach is to use likelihood-based methods, exploiting the assumption that $\boldsymbol{\gamma}$ and $\boldsymbol{\epsilon}$ are normally distributed (Hartley and Rao 1967; Patterson and Thompson 1971; Harville 1977; Laird and Ware 1982; Jennrich and Schluchter 1986). PROC HPLMIXED implements two likelihood-based methods: maximum likelihood (ML) and restricted (residual) maximum likelihood (REML). A favorable theoretical property of ML and REML is that they accommodate data that are missing at random (Rubin 1976; Little 1995).

PROC HPLMIXED constructs an objective function associated with ML or REML and maximizes it over all unknown parameters. Using calculus, it is possible to reduce this maximization problem to one over only the parameters in $\mathbf{G}$ and $\mathbf{R}$. The corresponding log-likelihood functions are as follows:

\[
\begin{aligned}
\text{ML:}\quad & l(\mathbf{G},\mathbf{R}) = -\frac{1}{2}\log|\mathbf{V}| - \frac{1}{2}\mathbf{r}'\mathbf{V}^{-1}\mathbf{r} - \frac{n}{2}\log(2\pi) \\
\text{REML:}\quad & l_R(\mathbf{G},\mathbf{R}) = -\frac{1}{2}\log|\mathbf{V}| - \frac{1}{2}\log|\mathbf{X}'\mathbf{V}^{-1}\mathbf{X}| - \frac{1}{2}\mathbf{r}'\mathbf{V}^{-1}\mathbf{r} - \frac{n-p}{2}\log(2\pi)
\end{aligned}
\]

where $\mathbf{r} = \mathbf{y} - \mathbf{X}(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\mathbf{X}'\mathbf{V}^{-1}\mathbf{y}$ and $p$ is the rank of $\mathbf{X}$. By default, PROC HPLMIXED minimizes a normalized form of $-2$ times these functions by using a ridge-stabilized Newton-Raphson algorithm. Lindstrom and Bates (1988) provide reasons for preferring Newton-Raphson to the expectation-maximization (EM) algorithm described in Dempster, Laird, and Rubin (1977) and Laird, Lange, and Stram (1987), in addition to analytical details for implementing a QR-decomposition approach to the problem. Wolfinger, Tobias, and Sall (1994) present the sweep-based algorithms that are implemented in PROC HPLMIXED. You can change the optimization technique with the TECHNIQUE= option in the PROC HPLMIXED statement.
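As an illustration of the objective, the following PROC IML sketch evaluates $-2$ times the ML and REML log likelihoods for a small made-up example. The matrices X, y, and V are hypothetical inputs, the code assumes X has full column rank, and it implements the formulas above directly rather than the sweep-based algorithms that PROC HPLMIXED uses internally.

   proc iml;
      /* hypothetical inputs: a full-rank design matrix, a response,
         and an assumed covariance matrix V (all values made up) */
      X = {1 1, 1 2, 1 3, 1 4};
      y = {1.2, 1.9, 3.1, 3.9};
      V = I(4) + 0.5*J(4, 4, 1);
      n = nrow(X);
      p = ncol(X);                      /* rank of X; full column rank assumed */

      Vi   = inv(V);
      XtVi = X` * Vi;
      b    = inv(XtVi * X) * XtVi * y;  /* GLS solution for beta */
      r    = y - X * b;                 /* GLS residuals */

      neg2ML   = log(det(V)) + r` * Vi * r + n * log(2*constant('pi'));
      neg2REML = neg2ML + log(det(XtVi * X)) - p * log(2*constant('pi'));
      print neg2ML neg2REML;
   quit;

In PROC HPLMIXED itself you choose only how this objective is optimized. For example, the following step (with a hypothetical data set and hypothetical variable names) requests a quasi-Newton optimizer in place of the default ridge-stabilized Newton-Raphson, assuming the TECHNIQUE=QUANEW value that SAS optimization-based procedures commonly share:

   proc hplmixed data=mydata technique=quanew;
      class sub;
      model y = x;
      random intercept / subject=sub;
   run;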

One advantage of using the Newton-Raphson algorithm is that the second derivative matrix of the objective function evaluated at the optimum is available upon completion. Denoting this matrix $\mathbf{H}$, the asymptotic theory of maximum likelihood (Serfling 1980) shows that $2\mathbf{H}^{-1}$ is an asymptotic variance-covariance matrix of the estimated parameters of $\mathbf{G}$ and $\mathbf{R}$. Thus, tests and confidence intervals based on asymptotic normality can be obtained. However, these can be unreliable in small samples, especially for parameters such as variance components whose sampling distributions tend to be skewed to the right.
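To see where the factor of 2 comes from, recall that the objective being minimized is $-2l$, so that $\mathbf{H} = \nabla^2(-2l) = -2\,\nabla^2 l$; inverting the observed information then gives

\[
\left(-\nabla^2 l\right)^{-1} = \left(\tfrac{1}{2}\mathbf{H}\right)^{-1} = 2\,\mathbf{H}^{-1}
\]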

If a residual variance $\sigma^2$ is a part of your mixed model, it can usually be profiled out of the likelihood. This means solving analytically for the optimal $\sigma^2$ and plugging this expression back into the likelihood formula (Wolfinger, Tobias, and Sall 1994). This reduces the number of optimization parameters by one and can improve convergence properties. PROC HPLMIXED profiles the residual variance out of the log likelihood.
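Concretely, writing $\mathbf{V} = \sigma^2\mathbf{V}^*$, where $\mathbf{V}^*$ holds the remaining covariance parameters, and setting the derivative of each log likelihood with respect to $\sigma^2$ to zero yields the standard closed forms (see Wolfinger, Tobias, and Sall 1994):

\[
\hat{\sigma}^2_{\text{ML}} = \frac{\mathbf{r}'\mathbf{V}^{*\,-1}\mathbf{r}}{n},
\qquad
\hat{\sigma}^2_{\text{REML}} = \frac{\mathbf{r}'\mathbf{V}^{*\,-1}\mathbf{r}}{n-p}
\]

Note that $\mathbf{r}$ is unchanged by this substitution, because scaling $\mathbf{V}$ by a constant cancels in the expression $\mathbf{X}(\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-}\mathbf{X}'\mathbf{V}^{-1}\mathbf{y}$.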
