Introduction to Statistical Modeling with SAS/STAT Software

Maximum Likelihood Estimation

To estimate the parameters in a linear model with mean function $\mathrm{E}[\mathbf{Y}] = \mathbf{X}\boldsymbol{\beta}$ by maximum likelihood, you need to specify the distribution of the response vector $\mathbf{Y}$. In the linear model with a continuous response variable, it is commonly assumed that the response is normally distributed. In that case, the estimation problem is completely defined by specifying the mean and variance of $\mathbf{Y}$ in addition to the normality assumption. The model can be written as $\mathbf{Y} \sim N(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I})$, where the notation $N(\mathbf{a}, \mathbf{V})$ indicates a multivariate normal distribution with mean vector $\mathbf{a}$ and variance matrix $\mathbf{V}$. The log likelihood for $\mathbf{Y}$ then can be written as

$$
l(\boldsymbol{\beta}, \sigma^2; \mathbf{y}) = -\frac{n}{2}\log\{2\pi\} - \frac{n}{2}\log\{\sigma^2\} - \frac{1}{2\sigma^2}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})
$$

This function is maximized in $\boldsymbol{\beta}$ when the sum of squares $(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$ is minimized. The maximum likelihood estimator of $\boldsymbol{\beta}$ is thus identical to the ordinary least squares estimator. To maximize $l(\boldsymbol{\beta}, \sigma^2; \mathbf{y})$ with respect to $\sigma^2$, note that

$$
\frac{\partial\, l(\boldsymbol{\beta}, \sigma^2; \mathbf{y})}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})
$$

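Setting this derivative to zero, solving for $\sigma^2$, and replacing $\boldsymbol{\beta}$ with its estimator $\widehat{\boldsymbol{\beta}}$ shows that the MLE of $\sigma^2$ is the estimator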

$$
\begin{aligned}
\widehat{\sigma}^2_M &= \frac{1}{n}(\mathbf{Y} - \mathbf{X}\widehat{\boldsymbol{\beta}})'(\mathbf{Y} - \mathbf{X}\widehat{\boldsymbol{\beta}}) \\
&= \mathrm{SSR}/n
\end{aligned}
$$
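The closed-form estimators can be checked numerically. The following is a minimal Python/NumPy sketch, not SAS code and not part of the original documentation. The function names `log_likelihood` and `normal_mle` and the simulated data are hypothetical and chosen only for illustration. The sketch assumes that the design matrix has full column rank.

```python
import numpy as np

def log_likelihood(beta, sigma2, y, X):
    """Normal log likelihood l(beta, sigma^2; y) for Y ~ N(X beta, sigma^2 I)."""
    n = y.shape[0]
    resid = y - X @ beta
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - 0.5 * (resid @ resid) / sigma2)

def normal_mle(y, X):
    """Return (beta_hat, sigma2_hat): the OLS solution and SSR / n."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares fit
    resid = y - X @ beta_hat
    ssr = resid @ resid                               # residual sum of squares
    return beta_hat, ssr / y.shape[0]

# Hypothetical toy data: intercept plus one regressor (illustration only)
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

beta_hat, sigma2_hat = normal_mle(y, X)
l_max = log_likelihood(beta_hat, sigma2_hat, y, X)

# Moving either parameter away from its estimate should not raise the log likelihood
print(l_max >= log_likelihood(beta_hat + 0.01, sigma2_hat, y, X))
print(l_max >= log_likelihood(beta_hat, 1.05 * sigma2_hat, y, X))
```

In this sketch, `beta_hat` is the ordinary least squares solution and `sigma2_hat` is the variance estimate SSR/n.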

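The estimator $\widehat{\sigma}^2_M$ is biased for $\sigma^2$. If $\mathbf{X}$ has rank $p$, then $\mathrm{E}[\widehat{\sigma}^2_M] = \frac{n-p}{n}\sigma^2$, so the bias $-p\sigma^2/n$ shrinks toward zero as $n$ increases.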
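The size of the bias can be illustrated without simulation. The sketch below relies on a standard result that is not stated in the text above: under the normal linear model, the expected residual sum of squares is $\sigma^2$ times the trace of $\mathbf{I} - \mathbf{H}$, where $\mathbf{H}$ is the projection onto the column space of $\mathbf{X}$, and that trace equals $n - p$ with $p$ the rank of $\mathbf{X}$. The function name `mle_variance_bias` and the simulated design matrices are hypothetical.

```python
import numpy as np

def mle_variance_bias(X, sigma2):
    """Exact bias of SSR / n as an estimator of sigma2.

    Assumes E[SSR] = sigma2 * trace(I - H), where H projects onto
    the column space of X.
    """
    n = X.shape[0]
    H = X @ np.linalg.pinv(X)   # orthogonal projector onto the columns of X
    expected_ssr = sigma2 * np.trace(np.eye(n) - H)
    return expected_ssr / n - sigma2

# Hypothetical designs with p = 3 columns and increasing sample size
rng = np.random.default_rng(1)
for n in (10, 100, 1000):
    X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    print(n, mle_variance_bias(X, sigma2=4.0))
```

With three columns and a variance of 4, the printed values should be close to $-12/n$, so the bias becomes negligible in large samples.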

Last updated: December 09, 2022