The GLIMMIX Procedure

Generalized Linear Models Theory

A generalized linear model consists of the following:

  • a linear predictor $\eta = \mathbf{x}'\boldsymbol{\beta}$

  • a monotonic mapping between the mean of the data and the linear predictor

  • a response distribution in the exponential family of distributions

A density or mass function in this family can be written as

$$f(y) = \exp\left\{ \frac{y\theta - b(\theta)}{\phi} + c(y, f(\phi)) \right\}$$

for some functions $b(\cdot)$ and $c(\cdot)$. The parameter $\theta$ is called the natural (canonical) parameter. The parameter $\phi$ is a scale parameter, and it is not present in all exponential family distributions. See Table 22 for a list of distributions for which $\phi \equiv 1$. In the case where observations are weighted, the scale parameter is replaced with $\phi/w$ in the preceding density (or mass function), where $w$ is the weight associated with the observation $y$.
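As an illustration that is not part of the original text, the Poisson mass function can be rearranged into this form. The symbol $\mu$ for the Poisson mean is notation assumed here; the identification of $\theta$, $b(\cdot)$, $\phi$, and $c(\cdot)$ follows from matching terms.

```latex
\begin{align*}
f(y) &= \frac{\mu^{y} e^{-\mu}}{y!}
      = \exp\left\{ \frac{y \log\mu - \mu}{1} - \log(y!) \right\}, \\
\theta &= \log\mu, \qquad
b(\theta) = e^{\theta}, \qquad
\phi = 1, \qquad
c(y, f(\phi)) = -\log(y!).
\end{align*}
```

In this example the scale parameter is fixed at 1, so the Poisson is one of the distributions for which $\phi \equiv 1$.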

The mean and variance of the data are related to the components of the density, $\mathrm{E}[Y] = \mu = b'(\theta)$, $\mathrm{Var}[Y] = \phi\, b''(\theta)$, where primes denote first and second derivatives. If you express $\theta$ as a function of $\mu$, the relationship is known as the natural link or the canonical link function. In other words, modeling data with a canonical link assumes that $\theta = \mathbf{x}'\boldsymbol{\beta}$; the effect contributions are additive on the canonical scale. The second derivative of $b(\cdot)$, expressed as a function of $\mu$, is the variance function of the generalized linear model, $a(\mu) = b''(\theta(\mu))$. Note that because of this relationship, the distribution determines the variance function and the canonical link function. You cannot, however, proceed in the opposite direction. If you provide a user-specified variance function, the GLIMMIX procedure assumes that only the first two moments of the response distribution are known. The full distribution of the data is then unknown and maximum likelihood estimation is not possible. Instead, the GLIMMIX procedure then estimates parameters by quasi-likelihood.
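The relationships among $b(\cdot)$, the mean, the variance function, and the canonical link can be checked numerically. The following Python sketch is illustrative only and is not GLIMMIX code: the function names are hypothetical, the derivatives are approximated by finite differences, and the two cumulant functions (Poisson and Bernoulli, both with $\phi = 1$) are standard textbook choices assumed here rather than taken from the text above.

```python
import math

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Cumulant functions b(theta) for two families with phi = 1 (assumed examples).
def b_poisson(theta):
    return math.exp(theta)

def b_bernoulli(theta):
    return math.log1p(math.exp(theta))

def mean_and_variance(b, theta, phi=1.0):
    """Return (E[Y], Var[Y]) = (b'(theta), phi * b''(theta))."""
    mu = derivative(b, theta)
    var = phi * second_derivative(b, theta)
    return mu, var

theta = 0.7  # arbitrary value of the natural parameter
mu_p, var_p = mean_and_variance(b_poisson, theta)
mu_b, var_b = mean_and_variance(b_bernoulli, theta)

# Variance functions a(mu): mu for Poisson, mu * (1 - mu) for Bernoulli.
print("Poisson:   mu =", mu_p, " Var =", var_p, " a(mu) =", mu_p)
print("Bernoulli: mu =", mu_b, " Var =", var_b, " a(mu) =", mu_b * (1.0 - mu_b))

# Canonical links recover theta from mu: log for Poisson, logit for Bernoulli.
print("log(mu)   =", math.log(mu_p))
print("logit(mu) =", math.log(mu_b / (1.0 - mu_b)))
```

In both cases the numerical second derivative agrees with the variance function evaluated at the mean, and applying the canonical link to the mean returns the chosen value of $\theta$, up to finite-difference error.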

Last updated: December 09, 2022