Introduction to Bayesian Analysis Procedures

Bayesian Inference

Bayesian inference about $\theta$ is primarily based on the posterior distribution of $\theta$. There are various ways in which you can summarize this distribution. For example, you can report your findings through point estimates. You can also use the posterior distribution to construct hypothesis tests or probability statements.

Point Estimation and Estimation Error

Classical methods often report the maximum likelihood estimator (MLE) or the method of moments estimator (MOME) of a parameter. In contrast, Bayesian approaches often use the posterior mean, which is defined as

$$E(\theta \mid \mathbf{y}) = \int \theta \, p(\theta \mid \mathbf{y}) \, d\theta$$

Other commonly used posterior estimators include the posterior median, defined as

$$\theta : P(\theta \ge \text{median} \mid \mathbf{y}) = P(\theta \le \text{median} \mid \mathbf{y}) = \frac{1}{2}$$

and the posterior mode, defined as the value of $\theta$ that maximizes $p(\theta \mid \mathbf{y})$.

The variance of the posterior density (simply referred to as the posterior variance) describes the uncertainty in the parameter, which is a random variable in the Bayesian paradigm. A Bayesian analysis typically uses the posterior variance, or the posterior standard deviation, to characterize the dispersion of the parameter. In multidimensional models, covariance or correlation matrices are used.
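As an illustration of these point estimates, the following Python sketch compares exact and simulated summaries for a posterior whose form is known. The setup is an assumption made for the example, not something taken from the procedures: 7 successes in 20 Bernoulli trials with a uniform Beta(1, 1) prior, which gives a Beta(8, 14) posterior. The draws are independent, not MCMC output.

```python
import random
import statistics

# Hypothetical data: 7 successes in 20 trials, uniform Beta(1, 1) prior.
# Conjugacy gives a Beta(8, 14) posterior for theta.
a, b = 1 + 7, 1 + 13

# Exact posterior summaries from the known Beta(a, b) form.
exact_mean = a / (a + b)
exact_mode = (a - 1) / (a + b - 2)  # valid because a > 1 and b > 1
exact_var = a * b / ((a + b) ** 2 * (a + b + 1))

# Simulation-based summaries from independent posterior draws.
rng = random.Random(0)
draws = [rng.betavariate(a, b) for _ in range(50_000)]
sim_mean = statistics.fmean(draws)
sim_median = statistics.median(draws)
sim_sd = statistics.stdev(draws)

print(f"mean   exact={exact_mean:.4f}  simulated={sim_mean:.4f}")
print(f"median simulated={sim_median:.4f}")
print(f"mode   exact={exact_mode:.4f}")
print(f"sd     exact={exact_var ** 0.5:.4f}  simulated={sim_sd:.4f}")
```

The Beta median has no simple closed form, so it is estimated from the draws here. For this right-skewed posterior it falls between the mode and the mean.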

If you know the distributional form of the posterior density of interest, you can report the exact posterior point estimates. When models become too difficult to analyze analytically, you have to use simulation algorithms, such as the MCMC method, to obtain posterior estimates (see the section Markov Chain Monte Carlo Method). All of the Bayesian procedures rely on MCMC to obtain all posterior estimates. Because a simulation uses only a finite number of samples, it introduces an additional level of uncertainty into the estimates. The Monte Carlo standard error (MCSE), which is the standard error of the posterior mean estimate, measures the simulation accuracy. See the section Standard Error of the Mean Estimate for more information.

The posterior standard deviation and the MCSE are two completely different concepts: the posterior standard deviation describes the uncertainty in the parameter, while the MCSE describes only the uncertainty in the parameter estimate as a result of MCMC simulation. The posterior standard deviation is a function of the sample size in the data set, and the MCSE is a function of the number of iterations in the simulation.
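The difference between the two quantities can be seen in a small simulation. The sketch below assumes independent draws from a hypothetical Beta(8, 14) posterior, so the MCSE of the mean is simply the sample standard deviation divided by the square root of the number of draws. Real MCMC output is usually autocorrelated, and this simple formula would then understate the MCSE. The function name is illustrative only.

```python
import math
import random
import statistics

def posterior_sd_and_mcse(draws):
    """Return (posterior SD estimate, MCSE of the mean).

    Assumes the draws are independent; autocorrelated MCMC draws
    would need an adjustment that is not implemented here.
    """
    n = len(draws)
    sd = statistics.stdev(draws)
    return sd, sd / math.sqrt(n)

rng = random.Random(1)
small = [rng.betavariate(8, 14) for _ in range(1_000)]
large = [rng.betavariate(8, 14) for _ in range(100_000)]

sd_small, mcse_small = posterior_sd_and_mcse(small)
sd_large, mcse_large = posterior_sd_and_mcse(large)

print(f"n=1,000    sd={sd_small:.4f}  mcse={mcse_small:.5f}")
print(f"n=100,000  sd={sd_large:.4f}  mcse={mcse_large:.5f}")
```

Running 100 times as many iterations leaves the posterior standard deviation essentially unchanged but shrinks the MCSE by roughly a factor of 10.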

Hypothesis Testing

Suppose you have the following null and alternative hypotheses: $H_0$ is $\theta \in \Theta_0$ and $H_1$ is $\theta \in \Theta_0^c$, where $\Theta_0$ is a subset of the parameter space and $\Theta_0^c$ is its complement. Using the posterior distribution $\pi(\theta \mid \mathbf{y})$, you can compute the posterior probabilities $P(\theta \in \Theta_0 \mid \mathbf{y})$ and $P(\theta \in \Theta_0^c \mid \mathbf{y})$, that is, the probabilities that $H_0$ and $H_1$ are true, respectively. One way to perform a Bayesian hypothesis test is to accept the null hypothesis if $P(\theta \in \Theta_0 \mid \mathbf{y}) \ge P(\theta \in \Theta_0^c \mid \mathbf{y})$ and to accept the alternative otherwise. Another way is to accept the null hypothesis only if $P(\theta \in \Theta_0 \mid \mathbf{y})$ is greater than a predefined threshold, such as 0.75, to guard against falsely accepting the null hypothesis.
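Given posterior draws, both decision rules reduce to counting. The following sketch assumes a hypothetical Beta(8, 14) posterior and the hypotheses $H_0\colon \theta \le 0.5$ versus $H_1\colon \theta > 0.5$. The helper name and the choice of null region are illustrative.

```python
import random

def posterior_prob(draws, in_null):
    """Estimate P(theta in Theta_0 | y) as the fraction of draws in Theta_0."""
    return sum(1 for t in draws if in_null(t)) / len(draws)

rng = random.Random(2)
draws = [rng.betavariate(8, 14) for _ in range(50_000)]

# H0: theta <= 0.5, H1: theta > 0.5 (the complement).
p_h0 = posterior_prob(draws, lambda t: t <= 0.5)
p_h1 = 1.0 - p_h0

# Rule 1: accept H0 if it is at least as probable as H1.
accept_by_comparison = p_h0 >= p_h1
# Rule 2: accept H0 only if its probability exceeds a threshold.
accept_by_threshold = p_h0 > 0.75

print(f"P(H0 | y) = {p_h0:.3f}, P(H1 | y) = {p_h1:.3f}")
print(f"accept H0 by comparison: {accept_by_comparison}")
print(f"accept H0 by 0.75 threshold: {accept_by_threshold}")
```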

It is more difficult to carry out a point null hypothesis test in a Bayesian analysis. A point null hypothesis is a test of $H_0\colon \theta = \theta_0$ versus $H_1\colon \theta \ne \theta_0$. If the prior distribution $\pi(\theta)$ is a continuous density, then the posterior probability of the null hypothesis being true is 0, and there is no point in carrying out the test. One alternative is to restate the null as a small interval hypothesis: $\theta \in \Theta_0 = (\theta_0 - a, \theta_0 + a)$, where $a$ is a very small constant. The Bayesian paradigm can deal with an interval hypothesis more easily. Another approach is to give a mixture prior distribution to $\theta$ with a positive probability $p_0$ on $\theta_0$ and the density $(1 - p_0)\pi(\theta)$ on $\theta \ne \theta_0$. This prior ensures a nonzero posterior probability on $\theta_0$, and you can then make realistic probabilistic comparisons. For more detailed treatment of Bayesian hypothesis testing, see Berger (1985).
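The mixture-prior approach can be worked through in closed form for a simple case. The sketch below assumes binomial data ($y$ successes in $n$ trials) and takes $\pi(\theta)$ to be the uniform density on (0, 1), so the marginal likelihood under the alternative is $1/(n+1)$. The posterior probability of the null is then $p_0 f(y \mid \theta_0)$ divided by $p_0 f(y \mid \theta_0) + (1 - p_0)/(n+1)$. The function name and the data values are hypothetical.

```python
from math import comb

def posterior_prob_point_null(y, n, theta0, p0):
    """P(theta = theta0 | y) under a point-mass-plus-uniform mixture prior.

    Assumes y ~ Binomial(n, theta) and a uniform density for theta
    under the alternative.
    """
    # Likelihood of the data at the null value theta0.
    like_null = comb(n, y) * theta0 ** y * (1 - theta0) ** (n - y)
    # Marginal likelihood under the alternative: the binomial likelihood
    # integrated against the uniform density equals 1 / (n + 1).
    marginal_alt = 1.0 / (n + 1)
    numerator = p0 * like_null
    return numerator / (numerator + (1 - p0) * marginal_alt)

# Prior probability 0.5 on theta0 = 0.5, with two hypothetical data sets.
prob_moderate = posterior_prob_point_null(y=7, n=20, theta0=0.5, p0=0.5)
prob_extreme = posterior_prob_point_null(y=2, n=20, theta0=0.5, p0=0.5)

print(f"P(H0 | y=7)  = {prob_moderate:.3f}")
print(f"P(H0 | y=2)  = {prob_extreme:.4f}")
```

Unlike the continuous-prior case, the posterior probability of the point null is strictly positive, and it drops sharply when the data are far from what $\theta_0$ predicts.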

Interval Estimation

Bayesian set estimates are called credible sets, also known as credible intervals. They are analogous to the confidence intervals used in classical statistics. Given a posterior distribution $p(\theta \mid \mathbf{y})$, $A$ is a $100(1-\alpha)\%$ credible set for $\theta$ if

$$P(\theta \in A \mid \mathbf{y}) = \int_A p(\theta \mid \mathbf{y}) \, d\theta = 1 - \alpha$$

For example, you can construct a 95% credible set for $\theta$ by finding an interval, $A$, over which $\int_A p(\theta \mid \mathbf{y}) \, d\theta = 0.95$.

You can construct credible sets that have equal tails. A $100(1-\alpha)\%$ equal-tail interval corresponds to the $100(\alpha/2)$th and $100(1-\alpha/2)$th percentiles of the posterior distribution. Some statisticians prefer this interval because it is invariant under monotone transformations. Another frequently used Bayesian credible set is called the highest posterior density (HPD) interval.
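An equal-tail interval can be estimated from posterior draws by reading off empirical percentiles. The following sketch assumes independent draws from a hypothetical Beta(8, 14) posterior and uses a simple order-statistic rule for the percentiles; other percentile definitions would give slightly different endpoints. It also checks the invariance property by computing the interval for $\log \theta$.

```python
import math
import random

def equal_tail_interval(draws, alpha=0.05):
    """Return the alpha/2 and 1 - alpha/2 empirical quantiles of the draws."""
    s = sorted(draws)
    n = len(s)
    lower = s[math.floor((alpha / 2) * (n - 1))]
    upper = s[math.ceil((1 - alpha / 2) * (n - 1))]
    return lower, upper

rng = random.Random(3)
draws = [rng.betavariate(8, 14) for _ in range(50_000)]

lo, hi = equal_tail_interval(draws)
coverage = sum(1 for t in draws if lo <= t <= hi) / len(draws)

# Invariance: the interval for log(theta) is the log of the interval for theta.
log_lo, log_hi = equal_tail_interval([math.log(t) for t in draws])

print(f"95% equal-tail interval: ({lo:.3f}, {hi:.3f}), coverage={coverage:.3f}")
print(f"log interval: ({log_lo:.3f}, {log_hi:.3f})")
print(f"log of endpoints: ({math.log(lo):.3f}, {math.log(hi):.3f})")
```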

A $100(1-\alpha)\%$ HPD interval is a region that satisfies the following two conditions:

  1. The posterior probability of that region is $100(1-\alpha)\%$.

  2. The minimum density of any point within that region is equal to or larger than the density of any point outside that region.

The HPD interval is the region in which the bulk of the posterior density lies. Some statisticians prefer this interval because it is the smallest region that has the stated posterior probability.
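For a unimodal posterior, one common way to approximate the HPD interval from draws is to take the shortest window that contains the required fraction of the sorted draws. The sketch below uses that approach on a hypothetical right-skewed Beta(2, 10) posterior and compares the result with the equal-tail interval. It is an illustration under the unimodality assumption, not a description of how any particular procedure computes HPD intervals.

```python
import math
import random

def hpd_interval(draws, alpha=0.05):
    """Shortest interval containing at least (1 - alpha) of the draws.

    Assumes a unimodal posterior, so the HPD region is a single interval.
    """
    s = sorted(draws)
    n = len(s)
    k = math.ceil((1 - alpha) * n)  # number of draws the interval must hold
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]

rng = random.Random(4)
draws = [rng.betavariate(2, 10) for _ in range(20_000)]

hpd_lo, hpd_hi = hpd_interval(draws)

# Equal-tail interval from the same draws, for comparison.
s = sorted(draws)
et_lo = s[math.floor(0.025 * (len(s) - 1))]
et_hi = s[math.ceil(0.975 * (len(s) - 1))]

hpd_coverage = sum(1 for t in draws if hpd_lo <= t <= hpd_hi) / len(draws)

print(f"95% HPD:        ({hpd_lo:.3f}, {hpd_hi:.3f}), width={hpd_hi - hpd_lo:.3f}")
print(f"95% equal-tail: ({et_lo:.3f}, {et_hi:.3f}), width={et_hi - et_lo:.3f}")
```

For a skewed posterior such as this one, the HPD interval is shifted toward the mode and is no wider than the equal-tail interval. For a symmetric unimodal posterior the two intervals coincide.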

One major distinction between Bayesian and classical sets is their interpretation. The Bayesian probability reflects a person’s subjective beliefs. Following this approach, a statistician can make the claim that $\theta$ is inside a credible interval with measurable probability. This property is appealing because it enables you to make a direct probability statement about parameters. Many people find this concept to be a more natural way of understanding a probability interval, which is also easier to explain to nonstatisticians. A confidence interval, on the other hand, enables you to make a claim that the interval covers the true parameter. The interpretation reflects the uncertainty in the sampling procedure; a $100(1-\alpha)\%$ confidence interval asserts that, in the long run, $100(1-\alpha)\%$ of the realized confidence intervals cover the true parameter.

Last updated: December 09, 2022