The QUANTREG Procedure

Quantile Regression as an Optimization Problem

The generic model for linear quantile regression is

$$Q_\tau(Y \mid X = \mathbf{x}) = Q_{Y \mid \mathbf{x}}(\tau) = \mathbf{x}'\boldsymbol{\beta}(\tau)$$

where $Y$ is the response random variable, $\mathbf{x}$ is the vector of explanatory covariates, $\boldsymbol{\beta}(\tau) = (\beta_1(\tau), \ldots, \beta_p(\tau))'$ is the $(p \times 1)$ vector of functional model parameters at the quantile level $\tau$, and $Q_{Y \mid \mathbf{x}}(\cdot)$ is the quantile function for $Y$ conditional on $X = \mathbf{x}$.

This generic model is compatible with the following linear model:

$$y_i = \mathbf{x}_i'\boldsymbol{\beta}(\tau) + \epsilon_i(\tau) \quad \text{for } i = 1, \ldots, n$$

where $y_i$ is the response value, $\mathbf{x}_i$ is the vector of explanatory covariates, and $\epsilon_i(\tau) = y_i - Q_{Y \mid \mathbf{x}_i}(\tau)$ is an unknown error.

$L_1$ regression, also known as median regression, is a natural extension of the sample median when the response is conditioned on the covariates. In $L_1$ regression, the least absolute residuals estimate $\hat{\boldsymbol{\beta}}_{\mathrm{LAR}}$, referred to as the $L_1$-norm estimate, is obtained as the solution of the following minimization problem:

$$\min_{\boldsymbol{\beta} \in \mathbf{R}^p} \sum_{i=1}^{n} \left| y_i - \mathbf{x}_i'\boldsymbol{\beta} \right|$$
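As a rough illustration of the minimization above, the following Python sketch fits a straight line (intercept plus one slope, so p = 2) by least absolute residuals. It relies on the standard linear-programming fact that, when the design has full column rank, some minimizer passes exactly through at least p data points, so it simply tries every line through two observations. The function names and the data are hypothetical, and this brute-force search is only an illustration; it is not meant to reflect how the procedure solves the problem.

```python
from itertools import combinations


def lad_objective(beta, xs, ys):
    """Sum of absolute residuals for the line y = beta[0] + beta[1] * x."""
    b0, b1 = beta
    return sum(abs(y - (b0 + b1 * x)) for x, y in zip(xs, ys))


def lad_fit_line(xs, ys):
    """Brute-force L1 fit: evaluate every line through two data points."""
    best_value, best_beta = None, None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = b0 + b1 * x
        b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b0 = ys[i] - b1 * xs[i]
        value = lad_objective((b0, b1), xs, ys)
        if best_value is None or value < best_value:
            best_value, best_beta = value, (b0, b1)
    return best_beta, best_value


# Hypothetical data: four points on y = 1 + 2x and one gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 30.0]

beta_lar, lar_value = lad_fit_line(xs, ys)
print(beta_lar, lar_value)
```

In this example the fitted line follows the four collinear points and leaves the whole misfit on the outlier, which is the robustness property usually associated with median regression.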

More generally, for quantile regression, Koenker and Bassett (1978) defined the $\tau$ regression quantile, $0 < \tau < 1$, as any solution to the following minimization problem:

$$\min_{\boldsymbol{\beta} \in \mathbf{R}^p} \left[ \sum_{i \in \{ i \,:\, y_i \ge \mathbf{x}_i'\boldsymbol{\beta} \}} \tau \left| y_i - \mathbf{x}_i'\boldsymbol{\beta} \right| + \sum_{i \in \{ i \,:\, y_i < \mathbf{x}_i'\boldsymbol{\beta} \}} (1 - \tau) \left| y_i - \mathbf{x}_i'\boldsymbol{\beta} \right| \right]$$
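The objective above can be evaluated directly for any candidate coefficient vector. The sketch below is a minimal Python version under illustrative assumptions: the names `check_loss` and `quantile_objective`, the small design matrix, and the candidate coefficients are all hypothetical. It charges nonnegative residuals at rate tau and negative residuals at rate 1 - tau, as in the two sums of the minimization problem.

```python
def check_loss(residual, tau):
    """Asymmetric absolute loss: tau * |r| if r >= 0, else (1 - tau) * |r|."""
    if residual >= 0:
        return tau * abs(residual)
    return (1.0 - tau) * abs(residual)


def quantile_objective(beta, rows, ys, tau):
    """Quantile regression objective at beta for the given quantile level tau."""
    total = 0.0
    for x, y in zip(rows, ys):
        fitted = sum(xj * bj for xj, bj in zip(x, beta))
        total += check_loss(y - fitted, tau)
    return total


# Hypothetical design: intercept column plus one covariate.
rows = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)]
ys = [0.0, 2.0, 1.0, 5.0]
beta = (0.0, 1.0)  # candidate coefficients; residuals are 0, 1, -1, 2

obj_median = quantile_objective(beta, rows, ys, 0.5)
obj_upper = quantile_objective(beta, rows, ys, 0.9)
l1_sum = sum(abs(y - sum(xj * bj for xj, bj in zip(x, beta)))
             for x, y in zip(rows, ys))
print(obj_median, obj_upper, l1_sum)
```

At tau = 1/2 the objective is exactly half the sum of absolute residuals, so both criteria have the same minimizers. For tau above 1/2, positive residuals cost more than negative ones, which pushes the fitted surface upward.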

The solution is denoted as $\hat{\boldsymbol{\beta}}(\tau)$, and the $L_1$-norm estimate corresponds to $\hat{\boldsymbol{\beta}}(1/2)$. The $\tau$ regression quantile is an extension of the $\tau$ sample quantile $\hat{\xi}(\tau)$, which can be formulated as the solution of

$$\min_{\xi \in \mathbf{R}} \left[ \sum_{i \in \{ i \,:\, y_i \ge \xi \}} \tau \left| y_i - \xi \right| + \sum_{i \in \{ i \,:\, y_i < \xi \}} (1 - \tau) \left| y_i - \xi \right| \right]$$
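To see that this one-dimensional problem does recover a sample quantile, the following Python sketch minimizes the objective by brute force. Because the objective is convex and piecewise linear with kinks only at the observations, a minimizer can always be found among the observed values, so the search is restricted to them. The function names and data are hypothetical, and the quantile levels are chosen so that the minimizer is unique. When tau times n is an integer the minimizer is generally not unique, which is why the text speaks of "any solution".

```python
def sample_quantile_objective(xi, ys, tau):
    """Objective of the sample-quantile minimization problem at xi."""
    return sum(tau * (y - xi) if y >= xi else (1.0 - tau) * (xi - y)
               for y in ys)


def sample_quantile(ys, tau):
    """Return a minimizer of the objective, searched over the observed values."""
    candidates = sorted(set(ys))
    return min(candidates,
               key=lambda xi: sample_quantile_objective(xi, ys, tau))


# Hypothetical sample; in sorted order it is 1, 3, 4, 7, 10.
ys = [7.0, 1.0, 4.0, 10.0, 3.0]

median = sample_quantile(ys, 0.5)
q75 = sample_quantile(ys, 0.75)
print(median, q75)
```

Replacing the scalar xi with a linear predictor in the covariates turns this location problem into the regression problem stated earlier.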

If you specify weights $w_i$, $i = 1, \ldots, n$, with the WEIGHT statement, weighted quantile regression is carried out by solving

$$\min_{\boldsymbol{\beta}_w \in \mathbf{R}^p} \left[ \sum_{i \in \{ i \,:\, y_i \ge \mathbf{x}_i'\boldsymbol{\beta}_w \}} w_i \tau \left| y_i - \mathbf{x}_i'\boldsymbol{\beta}_w \right| + \sum_{i \in \{ i \,:\, y_i < \mathbf{x}_i'\boldsymbol{\beta}_w \}} w_i (1 - \tau) \left| y_i - \mathbf{x}_i'\boldsymbol{\beta}_w \right| \right]$$
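The weighted objective differs from the unweighted one only in that each observation's loss is multiplied by its weight. The Python sketch below evaluates it for hypothetical data and integer weights, then checks the purely algebraic consequence that an integer weight contributes the same amount as repeating the observation that many times. The function name, data, and weights are assumptions made for illustration. The check concerns the objective function only and says nothing about how the procedure uses weights when it computes standard errors or other inferential quantities.

```python
def weighted_quantile_objective(beta, rows, ys, weights, tau):
    """Weighted quantile regression objective at beta for quantile level tau."""
    total = 0.0
    for x, y, w in zip(rows, ys, weights):
        residual = y - sum(xj * bj for xj, bj in zip(x, beta))
        rate = tau if residual >= 0 else 1.0 - tau
        total += w * rate * abs(residual)
    return total


# Hypothetical data: intercept column plus one covariate, integer weights.
rows = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
ys = [1.0, 0.0, 4.0]
weights = [1, 3, 2]
beta = (0.5, 1.0)  # candidate coefficients; residuals are 0.5, -1.5, 1.5
tau = 0.25

weighted = weighted_quantile_objective(beta, rows, ys, weights, tau)

# Same objective with each observation repeated w_i times and unit weights.
rep_rows = [x for x, w in zip(rows, weights) for _ in range(w)]
rep_ys = [y for y, w in zip(ys, weights) for _ in range(w)]
replicated = weighted_quantile_objective(
    beta, rep_rows, rep_ys, [1] * len(rep_ys), tau)
print(weighted, replicated)
```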

Weighted regression quantiles $\boldsymbol{\beta}_w$ can be used for L-estimation (Koenker and Zhao 1994).

Last updated: December 09, 2022