The GENMOD Procedure

Bayesian Analysis

In generalized linear models, the response has a probability distribution from a family of distributions of the exponential form. That is, the probability density of the response Y for continuous response variables, or the probability function for discrete responses, can be expressed as

f left-parenthesis y right-parenthesis equals exp left-brace StartFraction y theta minus b left-parenthesis theta right-parenthesis Over a left-parenthesis phi right-parenthesis EndFraction plus c left-parenthesis y comma phi right-parenthesis right-brace

for some functions a, b, and c that determine the specific distribution. The canonical parameters theta depend only on the means of the response mu Subscript i, which are related to the regression parameters beta through the link function g left-parenthesis mu Subscript i Baseline right-parenthesis equals x prime beta. The additional parameter phi is the dispersion parameter. The GENMOD procedure estimates the regression parameters and the scale parameter sigma equals phi Superscript one-half by maximum likelihood. However, the GENMOD procedure can also provide Bayesian estimates of the regression parameters and either the scale sigma, the dispersion phi, or the precision tau equals phi Superscript negative 1 by sampling from the posterior distribution. Except where noted, the following discussion applies to either sigma, phi, or tau, although phi is used to illustrate the formulas. Note that the Poisson and binomial distributions do not have a dispersion parameter, and the dispersion is considered to be fixed at phi equals 1. The ASSESS, CONTRAST, ESTIMATE, OUTPUT, and REPEATED statements, if specified, are ignored. Also ignored are the PLOTS= option in the PROC GENMOD statement and the following options in the MODEL statement: ALPHA=, CORRB, COVB, TYPE1, TYPE3, SCALE=DEVIANCE (DSCALE), SCALE=PEARSON (PSCALE), OBSTATS, RESIDUALS, XVARS, PREDICTED, DIAGNOSTICS, and SCALE= for Poisson and binomial distributions. The multinomial and zero-inflated Poisson distributions are not available for Bayesian analysis.

See the section Assessing Markov Chain Convergence in Chapter 8, Introduction to Bayesian Analysis Procedures, for information about assessing the convergence of the chain of posterior samples.

Several algorithms, specified with the SAMPLING= option in the BAYES statement, are available in GENMOD for drawing samples from the posterior distribution.

ARMS Algorithm for Gibbs Sampling

This section provides details for Bayesian analysis by Gibbs sampling in generalized linear models. See the section Gibbs Sampler in Chapter 8, Introduction to Bayesian Analysis Procedures, for a general discussion of Gibbs sampling. See Gilks, Richardson, and Spiegelhalter (1996) for a discussion of applications of Gibbs sampling to a number of different models, including generalized linear models.

Let bold-italic theta equals left-parenthesis theta 1 comma ellipsis comma theta Subscript k Baseline right-parenthesis prime be the parameter vector. For generalized linear models, the theta Subscript is are the regression coefficients beta Subscript is and the dispersion parameter phi. Let upper L left-parenthesis upper D vertical-bar bold-italic theta right-parenthesis be the likelihood function, where D is the observed data. Let pi left-parenthesis bold-italic theta right-parenthesis be the prior distribution. The full conditional distribution of left-bracket theta Subscript i Baseline vertical-bar theta Subscript j Baseline comma i not-equals j right-bracket is proportional to the joint distribution; that is,

pi left-parenthesis theta Subscript i Baseline vertical-bar theta Subscript j Baseline comma i not-equals j comma upper D right-parenthesis proportional-to upper L left-parenthesis upper D vertical-bar bold-italic theta right-parenthesis p left-parenthesis bold-italic theta right-parenthesis

For instance, the one-dimensional conditional distribution of theta 1 given theta Subscript j Baseline equals theta Subscript j Superscript asterisk Baseline comma 2 less-than-or-equal-to j less-than-or-equal-to k, is computed as

pi left-parenthesis theta 1 vertical-bar theta Subscript j Baseline equals theta Subscript j Superscript asterisk Baseline comma 2 less-than-or-equal-to j less-than-or-equal-to k comma upper D right-parenthesis equals upper L left-parenthesis upper D vertical-bar left-parenthesis bold-italic theta equals left-parenthesis theta 1 comma theta 2 Superscript asterisk Baseline comma ellipsis comma theta Subscript k Superscript asterisk Baseline right-parenthesis Superscript prime Baseline right-parenthesis p left-parenthesis bold-italic theta equals left-parenthesis theta 1 comma theta 2 Superscript asterisk Baseline comma ellipsis comma theta Subscript k Superscript asterisk Baseline right-parenthesis prime right-parenthesis

Suppose you have a set of arbitrary starting values StartSet theta 1 Superscript left-parenthesis 0 right-parenthesis Baseline comma ellipsis comma theta Subscript k Superscript left-parenthesis 0 right-parenthesis Baseline EndSet. Using the ARMS (adaptive rejection Metropolis sampling) algorithm (Gilks and Wild 1992; Gilks, Best, and Tan 1995), you can do the following:

  • draw theta 1 Superscript left-parenthesis 1 right-parenthesis from left-bracket theta 1 vertical-bar theta 2 Superscript left-parenthesis 0 right-parenthesis Baseline comma ellipsis comma theta Subscript k Superscript left-parenthesis 0 right-parenthesis Baseline right-bracket

  • draw theta 2 Superscript left-parenthesis 1 right-parenthesis from left-bracket theta 2 vertical-bar theta 1 Superscript left-parenthesis 1 right-parenthesis Baseline comma theta 3 Superscript left-parenthesis 0 right-parenthesis Baseline comma ellipsis comma theta Subscript k Superscript left-parenthesis 0 right-parenthesis Baseline right-bracket

  • ellipsis

  • draw theta Subscript k Superscript left-parenthesis 1 right-parenthesis from left-bracket theta Subscript k Baseline vertical-bar theta 1 Superscript left-parenthesis 1 right-parenthesis Baseline comma ellipsis comma theta Subscript k minus 1 Superscript left-parenthesis 1 right-parenthesis Baseline right-bracket

This completes one iteration of the Gibbs sampler. After one iteration, you have StartSet theta 1 Superscript left-parenthesis 1 right-parenthesis Baseline comma ellipsis comma theta Subscript k Superscript left-parenthesis 1 right-parenthesis Baseline EndSet. After n iterations, you have StartSet theta 1 Superscript left-parenthesis n right-parenthesis Baseline comma ellipsis comma theta Subscript k Superscript left-parenthesis n right-parenthesis Baseline EndSet. PROC GENMOD implements the ARMS algorithm provided by Gilks (2003) to draw a sample from a full conditional distribution. See the section Adaptive Rejection Sampling Algorithm in Chapter 8, Introduction to Bayesian Analysis Procedures, for more information about the ARMS algorithm.

Gamerman Algorithm

The Gamerman algorithm, unlike a Gibbs sampling algorithm, samples parameters from their multivariate posterior conditional distribution. The algorithm uses the structure of generalized linear models to efficiently sample from the posterior distribution of the model parameters. For a detailed description and explanation of the algorithm, see Gamerman (1997) and the section Gamerman Algorithm in Chapter 8, Introduction to Bayesian Analysis Procedures. The Gamerman algorithm is the default method used to sample from the posterior distribution. See any of the introductory references in Chapter 8, Introduction to Bayesian Analysis Procedures, for a discussion of conjugate prior distributions for a linear model with the normal distribution.

Independence Metropolis Algorithm

The independence Metropolis algorithm is another sampling algorithm that draws multivariate samples from the posterior distribution. See the section Independence Sampler in Chapter 8, Introduction to Bayesian Analysis Procedures, for more details.

Posterior Samples Output Data Set

You can output posterior samples into a SAS data set through ODS. The following SAS statement outputs the posterior samples into the SAS data set Post:

ODS OUTPUT POSTERIORSAMPLE=Post

You can alternatively create the SAS data set Post with the OUTPOST=Post option in the BAYES statement.

The data set also includes the variables LogPost and LogLike, which represent the log of the posterior likelihood and the log of the likelihood, respectively.

Priors for Model Parameters

The model parameters are the regression coefficients and the dispersion parameter (or the precision or scale), if the model has one. The priors for the dispersion parameter and the priors for the regression coefficients are assumed to be independent, while you can have a joint multivariate normal prior for the regression coefficients.

Dispersion, Precision, or Scale Parameter
Gamma Prior

The gamma distribution upper G left-parenthesis a comma b right-parenthesis has a probability density function

f left-parenthesis u right-parenthesis equals StartFraction b left-parenthesis b u right-parenthesis Superscript a minus 1 Baseline normal e Superscript minus b u Baseline Over normal upper Gamma left-parenthesis a right-parenthesis EndFraction comma u greater-than 0

where a is the shape parameter and b is the inverse-scale parameter. The mean is StartFraction a Over b EndFraction and the variance is StartFraction a Over b squared EndFraction.

Improper Prior

The joint prior density is given by

p left-parenthesis u right-parenthesis proportional-to u Superscript negative 1 Baseline comma u greater-than 0
Inverse Gamma Prior

The inverse gamma distribution normal upper I normal upper G left-parenthesis a comma b right-parenthesis has a probability density function

f left-parenthesis u right-parenthesis equals StartFraction b Superscript a Baseline Over normal upper Gamma left-parenthesis a right-parenthesis EndFraction u Superscript minus left-parenthesis a plus 1 right-parenthesis Baseline normal e Superscript negative b slash u Baseline comma u greater-than 0

where a is the shape parameter and b is the scale parameter. The mean is StartFraction b Over a minus 1 EndFraction if a greater-than 1, and the variance is StartFraction b squared Over left-parenthesis a minus 1 right-parenthesis squared left-parenthesis a minus 2 right-parenthesis EndFraction if a greater-than 2.

Regression Coefficients

Let bold-italic beta be the regression coefficients.

Jeffreys’ Prior

The joint prior density is given by

p left-parenthesis bold-italic beta right-parenthesis proportional-to StartAbsoluteValue bold upper I left-parenthesis bold-italic beta right-parenthesis EndAbsoluteValue Superscript one-half

where bold upper I left-parenthesis bold-italic beta right-parenthesis is the Fisher information matrix for the model. If the underlying model has a scale parameter (for example, a normal linear regression model), then the Fisher information matrix is computed with the scale parameter set to a fixed value of one.

If you specify the CONDITIONAL option, then Jeffreys’ prior, conditional on the current Markov chain value of the generalized linear model precision parameter tau, is given by

StartAbsoluteValue tau bold upper I left-parenthesis bold-italic beta right-parenthesis EndAbsoluteValue Superscript one-half

where tau is the model precision parameter.

See Ibrahim and Laud (1991) for a full discussion, with examples, of Jeffreys’ prior for generalized linear models.

Normal Prior

Assume bold-italic beta has a multivariate normal prior with mean vector bold-italic beta 0 and covariance matrix bold upper Sigma 0. The joint prior density is given by

p left-parenthesis bold-italic beta right-parenthesis proportional-to normal e Superscript minus one-half left-parenthesis bold-italic beta minus bold-italic beta 0 right-parenthesis prime bold upper Sigma 0 Super Superscript negative 1 Superscript left-parenthesis bold-italic beta minus bold-italic beta 0 right-parenthesis

If you specify the CONDITIONAL option, then, conditional on the current Markov chain value of the generalized linear model precision parameter tau, the joint prior density is given by

p left-parenthesis bold-italic beta right-parenthesis proportional-to normal e Superscript minus one-half left-parenthesis bold-italic beta minus bold-italic beta 0 right-parenthesis prime tau bold upper Sigma 0 Super Superscript negative 1 Superscript left-parenthesis bold-italic beta minus bold-italic beta 0 right-parenthesis
Uniform Prior

The joint prior density is given by

p left-parenthesis bold-italic beta right-parenthesis proportional-to 1

Deviance Information Criterion

Let theta Subscript i be the model parameters at iteration i of the Gibbs sampler and let LL(theta Subscript i) be the corresponding model log likelihood. PROC GENMOD computes the following fit statistics defined by Spiegelhalter et al. (2002):

  • Effective number of parameters:

    p Subscript upper D Baseline equals ModifyingAbove normal upper L normal upper L left-parenthesis theta right-parenthesis With bar minus normal upper L normal upper L left-parenthesis theta overbar right-parenthesis
  • Deviance information criterion (DIC):

    normal upper D normal upper I normal upper C equals ModifyingAbove normal upper L normal upper L left-parenthesis theta right-parenthesis With bar plus p Subscript upper D

where

StartLayout 1st Row 1st Column ModifyingAbove normal upper L normal upper L left-parenthesis theta right-parenthesis With bar 2nd Column equals 3rd Column StartFraction 1 Over n EndFraction sigma-summation Underscript i equals 1 Overscript n Endscripts normal upper L normal upper L left-parenthesis theta Subscript i Baseline right-parenthesis 2nd Row 1st Column theta overbar 2nd Column equals 3rd Column StartFraction 1 Over n EndFraction sigma-summation Underscript i equals 1 Overscript n Endscripts theta Subscript i EndLayout

PROC GENMOD uses the full log likelihoods defined in the section Log-Likelihood Functions, with all terms included, for computing the DIC.

Posterior Distribution

Denote the observed data by D.

The posterior distribution is

pi left-parenthesis bold-italic beta vertical-bar upper D right-parenthesis proportional-to upper L Subscript upper P Baseline left-parenthesis upper D vertical-bar bold-italic beta right-parenthesis p left-parenthesis bold-italic beta right-parenthesis

where upper L Subscript upper P Baseline left-parenthesis upper D vertical-bar bold-italic beta right-parenthesis is the likelihood function with regression coefficients bold-italic beta as parameters.

Starting Values of the Markov Chains

When the BAYES statement is specified, PROC GENMOD generates one Markov chain containing the approximate posterior samples of the model parameters. Additional chains are produced when the Gelman-Rubin diagnostics are requested. Starting values (or initial values) can be specified in the INITIAL= data set in the BAYES statement. If INITIAL= option is not specified, PROC GENMOD picks its own initial values for the chains.

Denote left-bracket x right-bracket as the integral value of x. Denote ModifyingAbove s With caret left-parenthesis upper X right-parenthesis as the estimated standard error of the estimator X.

Regression Coefficients

For the first chain that the summary statistics and regression diagnostics are based on, the default initial values are estimates of the mode of the posterior distribution. If the INITIALMLE option is specified, the initial values are the maximum likelihood estimates; that is,

beta Subscript i Superscript left-parenthesis 0 right-parenthesis Baseline equals ModifyingAbove beta With caret Subscript i

Initial values for the rth chain (r greater-than-or-equal-to 2) are given by

beta Subscript i Superscript left-parenthesis 0 right-parenthesis Baseline equals ModifyingAbove beta With caret Subscript i Baseline plus-or-minus left-parenthesis 2 plus left-bracket StartFraction r Over 2 EndFraction right-bracket right-parenthesis ModifyingAbove s With caret left-parenthesis ModifyingAbove beta With caret Subscript i Baseline right-parenthesis

with the plus sign for odd r and minus sign for even r.

Dispersion, Scale, or Precision Parameter lamda

Let lamda be the generalized linear model parameter you choose to sample, either the dispersion, scale, or precision parameter. Note that the Poisson and binomial distributions do not have this additional parameter.

For the first chain that the summary statistics and regression diagnostics are based on, the default initial values are estimates of the mode of the posterior distribution. If the INITIALMLE option is specified, the initial values are the maximum likelihood estimates; that is,

lamda Superscript left-parenthesis 0 right-parenthesis Baseline equals ModifyingAbove lamda With caret

The initial values of the rth chain (r greater-than-or-equal-to 2) are given by

lamda Superscript left-parenthesis 0 right-parenthesis Baseline equals ModifyingAbove lamda With caret normal e Superscript plus-or-minus left-parenthesis left-bracket StartFraction r Over 2 EndFraction right-bracket plus 2 right-parenthesis ModifyingAbove s With caret left-parenthesis ModifyingAbove lamda With caret right-parenthesis

with the plus sign for odd r and minus sign for even r.

OUTPOST= Output Data Set

The OUTPOST= data set contains the generated posterior samples. There are 3+n variables, where n is the number of model parameters. The variable Iteration represents the iteration number, the variable LogLike contains the log of the likelihood, and the variable LogPost contains the log of the posterior. The other n variables represent the draws of the Markov chain for the model parameters.

Last updated: December 09, 2022