The HPQUANTSELECT Procedure

Statistical Tests for Significance Level

The HPQUANTSELECT procedure supports the significance level (SL) criterion for effect selection. Consider the general form of a linear quantile regression model:

\[ Q_Y(\tau \mid \mathbf{x}_1, \mathbf{x}_2) = \mathbf{x}_1' \boldsymbol{\beta}_1(\tau) + \mathbf{x}_2' \boldsymbol{\beta}_2(\tau) \]

At each step of an effect-selection process, a candidate effect can be represented as $\mathbf{x}_2$, and the significance level of the candidate effect can be calculated by testing the null hypothesis $H_0\colon \boldsymbol{\beta}_2(\tau) = \mathbf{0}$.

When you use SL as a criterion for effect selection, you can further use the TEST= option in the SELECTION statement to specify a statistical test method to compute the significance-level values as follows:

  • The TEST=WALD option specifies the Wald test. Let $\hat{\boldsymbol{\beta}}(\tau) = (\hat{\boldsymbol{\beta}}_1'(\tau), \hat{\boldsymbol{\beta}}_2'(\tau))'$ be the parameter estimates for the extended model, and denote the estimated covariance matrix of $\hat{\boldsymbol{\beta}}(\tau)$ as

    \[ \hat{\Sigma}(\tau) = \begin{bmatrix} \hat{\Sigma}_{11}(\tau) & \hat{\Sigma}_{12}(\tau) \\ \hat{\Sigma}_{21}(\tau) & \hat{\Sigma}_{22}(\tau) \end{bmatrix} \]

    where $\hat{\Sigma}_{22}(\tau)$ is the covariance matrix for $\hat{\boldsymbol{\beta}}_2(\tau)$. Then the Wald test score is defined as

    \[ \hat{\boldsymbol{\beta}}_2'(\tau) \, \hat{\Sigma}_{22}^{-1}(\tau) \, \hat{\boldsymbol{\beta}}_2(\tau) \]

    If you specify the SPARSITY(IID) option in the MODEL statement, $\hat{\Sigma}(\tau)$ is estimated under the iid errors assumption. Otherwise, $\hat{\Sigma}(\tau)$ is estimated by using non-iid settings. For more information about the linear model with iid errors and non-iid settings, see the section Quantile Regression.

  • The TEST=LR1 or TEST=LR2 option specifies the Type I or Type II quasi-likelihood ratio test, respectively. Under the iid assumption, Koenker and Machado (1999) propose two types of quasi-likelihood ratio tests for quantile regression, where the error distribution is flexible and is not limited to the asymmetric Laplace distribution. The Type I test score, LR1, is defined as

    \[ \mathrm{LR1} = \frac{2 \left( D_1(\tau) - D_2(\tau) \right)}{\tau (1 - \tau) \hat{s}} \]

    where $D_1(\tau) = \sum \rho_\tau \left( y_i - \mathbf{x}_{1i} \hat{\boldsymbol{\beta}}_{1_1}(\tau) \right)$ is the sum of check losses for the reduced model, $D_2(\tau) = \sum \rho_\tau \left( y_i - \mathbf{x}_{1i} \hat{\boldsymbol{\beta}}_{1_2}(\tau) - \mathbf{x}_{2i} \hat{\boldsymbol{\beta}}_2(\tau) \right)$ is the sum of check losses for the extended model, and $\hat{s}$ is the estimated sparsity function. Here $\hat{\boldsymbol{\beta}}_{1_1}(\tau)$ and $\hat{\boldsymbol{\beta}}_{1_2}(\tau)$ denote the estimates of $\boldsymbol{\beta}_1(\tau)$ in the reduced model and the extended model, respectively. The Type II test score, LR2, is defined as

    \[ \mathrm{LR2} = \frac{2 D_2(\tau) \left( \log(D_1(\tau)) - \log(D_2(\tau)) \right)}{\tau (1 - \tau) \hat{s}} \]
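The two quasi-likelihood ratio scores can be illustrated numerically. The following Python sketch is not part of the procedure and makes several assumptions: it uses the standard check loss $\rho_\tau(r) = r(\tau - I(r < 0))$, and the response values, the fitted values for both models, and the sparsity estimate `s_hat` are made-up numbers chosen only to show the arithmetic. The function names are hypothetical.

```python
import math


def check_loss(r, tau):
    """Standard check loss rho_tau(r) = r * (tau - 1{r < 0}) (assumed definition)."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def sum_check_loss(y, fitted, tau):
    """Sum of check losses of the residuals y_i - fitted_i."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))


def lr_scores(d1, d2, tau, s_hat):
    """Return (LR1, LR2) from the reduced-model loss d1 and extended-model loss d2."""
    denom = tau * (1.0 - tau) * s_hat
    lr1 = 2.0 * (d1 - d2) / denom
    lr2 = 2.0 * d2 * (math.log(d1) - math.log(d2)) / denom
    return lr1, lr2


# Illustrative (made-up) data at the median, tau = 0.5.
tau = 0.5
y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted_reduced = [3.0, 3.0, 3.0, 3.0, 3.0]    # hypothetical reduced-model fit
fitted_extended = [1.5, 2.0, 3.0, 4.0, 4.5]   # hypothetical extended-model fit
s_hat = 2.0                                   # hypothetical sparsity estimate

d1 = sum_check_loss(y, fitted_reduced, tau)
d2 = sum_check_loss(y, fitted_extended, tau)
lr1, lr2 = lr_scores(d1, d2, tau, s_hat)
print(d1, d2, lr1, lr2)
```

Both scores grow as the extended model reduces the total check loss relative to the reduced model, and both shrink as the sparsity estimate grows.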

Under the null hypothesis that the reduced model is the true model, the Wald score, LR1 score, and LR2 score all follow a $\chi^2$ distribution with degrees of freedom $\mathit{df} = \mathit{df}_2 - \mathit{df}_1$, where $\mathit{df}_1$ and $\mathit{df}_2$ are the degrees of freedom for the reduced model and the extended model, respectively.
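As a rough illustration of how a test score becomes a significance level, the following Python sketch evaluates the Wald quadratic form and the upper-tail chi-square probability using only the standard library. It is a sketch under stated assumptions, not the procedure's implementation: the helper names (`solve`, `wald_score`, `chi2_sf`) are hypothetical, the estimates and covariance matrix are made-up numbers, and the degrees of freedom are taken to be the number of parameters in the candidate effect.

```python
import math


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def wald_score(beta2, sigma22):
    """Quadratic form beta2' * inverse(sigma22) * beta2."""
    z = solve(sigma22, beta2)
    return sum(bi * zi for bi, zi in zip(beta2, z))


def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with positive integer df.

    Uses the recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with t = x / 2.
    """
    if x <= 0:
        return 1.0
    t = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)                 # Q(1, t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))      # Q(1/2, t)
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Made-up estimates for a candidate effect with two parameters.
beta2_hat = [2.0, 1.0]
sigma22_hat = [[1.0, 0.0], [0.0, 0.25]]

wald = wald_score(beta2_hat, sigma22_hat)
p_value = chi2_sf(wald, df=len(beta2_hat))  # df assumed equal to the effect's parameter count
print(wald, p_value)
```

The same `chi2_sf` helper applies to an LR1 or LR2 score, because all three scores are referred to the same chi-square distribution.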

When you use SL as a criterion for effect selection, the algorithm for estimating the sparsity function depends on whether an effect is being considered as an add candidate or a drop candidate. For testing an add candidate effect, the sparsity function, which is $s(\tau)$ under the iid error assumption or $s_i(\tau)$ for non-iid settings, is estimated on the reduced model, which does not include the add candidate effect. For testing a drop candidate effect, the sparsity function is estimated on the extended model, which still includes the drop candidate effect. These estimated sparsity function values are then used to compute LR1 or LR2 and the covariance matrix of the parameter estimates for the extended model. However, for the model that is selected at each step, the sparsity function that is used for the standard errors and confidence limits of the parameter estimates is estimated on that model itself, not on the model that was selected at the preceding step.

Because the null hypotheses usually do not hold, the SLENTRY and SLSTAY values cannot reliably be viewed as probabilities. One way to address this difficulty is to replace hypothesis testing as a means of selecting a model with information criteria or out-of-sample prediction criteria.

Table 6 provides formulas and definitions for these fit statistics.

Table 6: Formulas and Definitions for Model Fit Summary Statistics for Single Quantile Effect Selection

| Statistic | Definition or Formula |
|---|---|
| $n$ | Number of observations |
| $p$ | Number of parameters, including the intercept |
| $r_i(\tau)$ | Residual for the $i$th observation; $r_i(\tau) = y_i - \mathbf{x}_i \hat{\boldsymbol{\beta}}(\tau)$ |
| $D(\tau)$ | Total sum of check losses; $D(\tau) = \sum_{i=1}^{n} \rho_\tau(r_i)$. $D(\tau)$ is labeled as Objective Function in the "Fit Statistics" table. |
| $D_0(\tau)$ | Total sum of check losses for intercept-only model if the intercept is a forced-in effect; otherwise for empty model. |
| $\mathrm{ACL}(\tau)$ | Average check loss; $\mathrm{ACL}(\tau) = \dfrac{D(\tau)}{n}$ |
| $\mathrm{R1}(\tau)$ | Counterpart of linear regression R square for quantile regression; $\mathrm{R1}(\tau) = 1 - \dfrac{D(\tau)}{D_0(\tau)}$ |
| $\mathrm{ADJR1}(\tau)$ | Adjusted R1; $1 - \dfrac{(n-1) D(\tau)}{(n-p) D_0(\tau)}$ if intercept is a forced-in effect; otherwise $1 - \dfrac{n D(\tau)}{(n-p) D_0(\tau)}$. |
| $\mathrm{AIC}(\tau)$ | $2n \ln(\mathrm{ACL}(\tau)) + 2p$ |
| $\mathrm{AICC}(\tau)$ | $2n \ln(\mathrm{ACL}(\tau)) + \dfrac{2pn}{n-p-1}$ |
| $\mathrm{SBC}(\tau)$ | $2n \ln(\mathrm{ACL}(\tau)) + p \ln(n)$ |
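The formulas in Table 6 translate directly into a few lines of arithmetic. The following Python sketch evaluates them from a total check loss, a baseline check loss, the number of observations, and the number of parameters. The function name `fit_statistics`, its argument names, and the input values are hypothetical and serve only as an illustration.

```python
import math


def fit_statistics(d, d0, n, p, intercept_forced=True):
    """Evaluate the Table 6 statistics.

    d  : total sum of check losses D(tau) for the fitted model
    d0 : total sum of check losses D0(tau) for the baseline model
    n  : number of observations
    p  : number of parameters, including the intercept
    """
    acl = d / n
    r1 = 1.0 - d / d0
    # The adjusted R1 uses (n - 1) when the intercept is forced in, else n.
    scale = (n - 1) if intercept_forced else n
    adjr1 = 1.0 - scale * d / ((n - p) * d0)
    base = 2.0 * n * math.log(acl)
    return {
        "ACL": acl,
        "R1": r1,
        "ADJR1": adjr1,
        "AIC": base + 2.0 * p,
        "AICC": base + 2.0 * p * n / (n - p - 1),
        "SBC": base + p * math.log(n),
    }


# Made-up inputs: 100 observations, 5 parameters.
stats = fit_statistics(d=20.0, d0=50.0, n=100, p=5)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

AIC, AICC, and SBC share the term $2n \ln(\mathrm{ACL}(\tau))$ and differ only in the penalty on $p$, so comparing them across candidate models shows how strongly each criterion discourages additional parameters.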


The $\mathrm{ADJR1}(\tau)$ criterion is equivalent to the generalized approximate cross validation (GACV) criterion for quantile regression (Yuan 2006). The GACV criterion is defined as

\[ \mathrm{GACV}(\tau) = D(\tau) / (n - p) \]

which is proportional to $1 - \mathrm{ADJR1}(\tau)$.
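The proportionality can be checked by substituting the Table 6 definition of the adjusted R1. The following short derivation is a sketch that assumes the intercept is a forced-in effect:

```latex
\begin{aligned}
1 - \mathrm{ADJR1}(\tau)
  &= \frac{(n-1)\, D(\tau)}{(n-p)\, D_0(\tau)} \\
  &= \frac{n-1}{D_0(\tau)} \cdot \frac{D(\tau)}{n-p} \\
  &= \frac{n-1}{D_0(\tau)} \, \mathrm{GACV}(\tau)
\end{aligned}
```

When the intercept is not forced in, the same steps give the factor $n / D_0(\tau)$ instead. In either case the factor depends only on the number of observations and the baseline check loss, not on the candidate model, so the model that maximizes the adjusted R1 is also the model that minimizes GACV.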

Last updated: December 09, 2022