The FREQ Procedure

Jonckheere-Terpstra Test

The JT option in the TABLES statement provides the Jonckheere-Terpstra test, which is a nonparametric test for ordered differences among classes. It tests the null hypothesis that the distribution of the response variable does not differ among classes. It is designed to detect alternatives of ordered class differences, which can be expressed as $\tau_1 \le \tau_2 \le \cdots \le \tau_R$ (or $\tau_1 \ge \tau_2 \ge \cdots \ge \tau_R$), with at least one of the inequalities being strict, where $\tau_i$ denotes the effect of class $i$. For such ordered alternatives, the Jonckheere-Terpstra test can be preferable to tests of more general class difference alternatives, such as the Kruskal–Wallis test (produced by the WILCOXON option in the NPAR1WAY procedure). See Pirie (1983) and Hollander and Wolfe (1999) for more information about the Jonckheere-Terpstra test.

The Jonckheere-Terpstra test is appropriate for a two-way table in which an ordinal column variable represents the response. The row variable, which can be nominal or ordinal, represents the classification variable. The levels of the row variable should be ordered according to the ordering you want the test to detect. The order of variable levels is determined by the ORDER= option in the PROC FREQ statement. By default, ORDER=INTERNAL, which orders by unformatted values. If you specify ORDER=DATA, PROC FREQ orders values according to their order in the input data set. For more information about how to order variable levels, see the ORDER= option.

The Jonckheere-Terpstra test statistic is computed by first forming $R(R-1)/2$ Mann-Whitney counts $M_{i,i'}$, where $i < i'$, for pairs of rows in the contingency table,

$$
\begin{aligned}
M_{i,i'} = {} & \{\, \text{number of times } X_{i,j} < X_{i',j'},\; j = 1, \ldots, n_{i\cdot};\; j' = 1, \ldots, n_{i'\cdot} \,\} \\
& + \tfrac{1}{2}\, \{\, \text{number of times } X_{i,j} = X_{i',j'},\; j = 1, \ldots, n_{i\cdot};\; j' = 1, \ldots, n_{i'\cdot} \,\}
\end{aligned}
$$

where $X_{i,j}$ is response $j$ in row $i$. The Jonckheere-Terpstra test statistic is computed as

$$
J = \sum_{1 \le i < i' \le R} M_{i,i'}
$$
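The Mann-Whitney counts and the statistic J can be sketched in a few lines of Python. This is only an illustration of the definitions above, not PROC FREQ's implementation: the function names `mann_whitney_count` and `jt_statistic` and the example table are hypothetical. The sketch assumes the table is given as a list of rows of cell counts, with columns already in response order and rows in the order the test should detect.

```python
def mann_whitney_count(row_a, row_b):
    """Return the Mann-Whitney count for an ordered pair of rows.

    row_a is the earlier row and row_b the later row; each is a list of
    cell counts over the ordered response columns.
    """
    total = 0.0
    for j, count_a in enumerate(row_a):
        strictly_greater = sum(row_b[j + 1:])  # row_b responses in higher columns
        tied = row_b[j]                        # row_b responses in the same column
        # Each row_a response in column j is compared with every row_b response.
        total += count_a * (strictly_greater + 0.5 * tied)
    return total


def jt_statistic(table):
    """Sum the Mann-Whitney counts over all row pairs with i < i'."""
    n_rows = len(table)
    return sum(
        mann_whitney_count(table[i], table[k])
        for i in range(n_rows)
        for k in range(i + 1, n_rows)
    )


# Hypothetical 3x3 table: rows are ordered classes, columns are ordered responses.
example = [[2, 1, 0],
           [1, 1, 1],
           [0, 1, 2]]
print(jt_statistic(example))  # 21.5
```

Working from cell counts rather than individual responses avoids comparing every pair of observations: all responses in the same cell are tied, so each cell contributes its count times the number of later-row responses that are larger, plus half the number that are tied.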

This test rejects the null hypothesis of no difference among classes for large values of J. Asymptotic p-values for the Jonckheere-Terpstra test are obtained by using the normal approximation for the distribution of the standardized test statistic. The standardized test statistic is computed as

$$
J^* = \frac{J - \mathrm{E}_0(J)}{\sqrt{\mathrm{Var}_0(J)}}
$$

where $\mathrm{E}_0(J)$ and $\mathrm{Var}_0(J)$ are the expected value and variance of the test statistic under the null hypothesis,

$$
\begin{aligned}
\mathrm{E}_0(J) &= \Bigl( n^2 - \sum_i n_{i\cdot}^2 \Bigr) / 4 \\
\mathrm{Var}_0(J) &= A/72 + B / \bigl( 36\, n (n-1)(n-2) \bigr) + C / \bigl( 8\, n (n-1) \bigr)
\end{aligned}
$$

where

$$
\begin{aligned}
A &= n(n-1)(2n+5) - \sum_i n_{i\cdot}(n_{i\cdot}-1)(2n_{i\cdot}+5) - \sum_j n_{\cdot j}(n_{\cdot j}-1)(2n_{\cdot j}+5) \\
B &= \Bigl( \sum_i n_{i\cdot}(n_{i\cdot}-1)(n_{i\cdot}-2) \Bigr) \Bigl( \sum_j n_{\cdot j}(n_{\cdot j}-1)(n_{\cdot j}-2) \Bigr) \\
C &= \Bigl( \sum_i n_{i\cdot}(n_{i\cdot}-1) \Bigr) \Bigl( \sum_j n_{\cdot j}(n_{\cdot j}-1) \Bigr)
\end{aligned}
$$
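As a rough check on the null moments, the following Python sketch evaluates the expected value, the variance, and the standardized statistic from a table of counts. It assumes the usual contingency-table notation, in which the dotted subscripts denote row totals and column totals and n is the overall total; the function names `jt_null_moments` and `standardized_jt` and the example table are hypothetical and are not part of PROC FREQ.

```python
import math


def jt_null_moments(table):
    """Return (E0, Var0) of the Jonckheere-Terpstra statistic under the null."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n < 3:
        # The B term divides by n(n-1)(n-2), so this sketch requires n >= 3.
        raise ValueError("need at least 3 observations")

    e0 = (n ** 2 - sum(r ** 2 for r in row_totals)) / 4

    a = (n * (n - 1) * (2 * n + 5)
         - sum(r * (r - 1) * (2 * r + 5) for r in row_totals)
         - sum(c * (c - 1) * (2 * c + 5) for c in col_totals))
    b = (sum(r * (r - 1) * (r - 2) for r in row_totals)
         * sum(c * (c - 1) * (c - 2) for c in col_totals))
    c_term = (sum(r * (r - 1) for r in row_totals)
              * sum(c * (c - 1) for c in col_totals))

    var0 = a / 72 + b / (36 * n * (n - 1) * (n - 2)) + c_term / (8 * n * (n - 1))
    return e0, var0


def standardized_jt(j_stat, table):
    """Standardize J by its null mean and variance (assumes Var0 > 0)."""
    e0, var0 = jt_null_moments(table)
    return (j_stat - e0) / math.sqrt(var0)


# Hypothetical 3x3 table; J = 21.5 is the sum of its Mann-Whitney counts.
example = [[2, 1, 0],
           [1, 1, 1],
           [0, 1, 2]]
e0, var0 = jt_null_moments(example)
print(e0)  # 13.5
print(standardized_jt(21.5, example))
```

The column-total sums in A, B, and C are what adjust the variance for ties: every response in the same column is tied with the others in that column.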

PROC FREQ computes one-sided and two-sided p-values for the Jonckheere-Terpstra test. When the standardized test statistic is greater than its null hypothesis expected value of 0, PROC FREQ displays the right-sided p-value, which is the probability of a larger value of the statistic occurring under the null hypothesis. A small right-sided p-value supports the alternative hypothesis of increasing order from row 1 to row R. When the standardized test statistic is less than or equal to 0, PROC FREQ displays the left-sided p-value. A small left-sided p-value supports the alternative of decreasing order from row 1 to row R.

The one-sided p-value for the Jonckheere-Terpstra test, $P_1$, is computed as

$$
P_1 =
\begin{cases}
\mathrm{Prob}(Z > J^*) & \text{if } J^* > 0 \\
\mathrm{Prob}(Z < J^*) & \text{if } J^* \le 0
\end{cases}
$$

where Z has a standard normal distribution. The two-sided p-value, $P_2$, is computed as

$$
P_2 = \mathrm{Prob}(|Z| > |J^*|)
$$
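The two asymptotic p-values can be illustrated with a short Python sketch that uses the complementary error function from the standard library for the normal tail probability. The function names `upper_tail` and `jt_p_values` are hypothetical; this is a minimal illustration of the formulas above, not the procedure's own code.

```python
import math


def upper_tail(z):
    """Prob(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def jt_p_values(j_star):
    """Return (one-sided, two-sided) asymptotic p-values for a standardized J."""
    if j_star > 0:
        one_sided = upper_tail(j_star)    # right-sided: Prob(Z > J*)
    else:
        one_sided = upper_tail(-j_star)   # left-sided: Prob(Z < J*) = Prob(Z > -J*)
    two_sided = math.erfc(abs(j_star) / math.sqrt(2))  # Prob(|Z| > |J*|)
    return one_sided, two_sided


p1, p2 = jt_p_values(1.96)
print(p1, p2)
```

Because the standard normal distribution is symmetric, the two-sided p-value is twice the one-sided p-value, whichever side the one-sided value is taken from.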

PROC FREQ also provides exact p-values for the Jonckheere-Terpstra test. You can request the exact test by specifying the JT option in the EXACT statement. See the section Exact Statistics for more information.

Last updated: December 09, 2022