The NPAR1WAY Procedure

Simple Linear Rank Tests for Two-Sample Data

Statistics of the form

$$ S = \sum_{j=1}^{n} c_j \, a(R_j) $$

are called simple linear rank statistics, where

$$
\begin{array}{ll}
R_j & \text{is the rank of observation } j \\
a(R_j) & \text{is the score based on the rank of observation } j \\
c_j & \text{indicates the class of observation } j \\
n & \text{is the total number of observations}
\end{array}
$$

For two-sample data (where the observations are classified into two levels), PROC NPAR1WAY calculates simple linear rank statistics for the scores that you specify. The section Scores for Linear Rank and One-Way ANOVA Tests describes the available scores, which you can use to test for differences in location and differences in scale.

To compute the linear rank statistic S, PROC NPAR1WAY sums the scores of the observations in the smaller of the two samples. If both samples have the same number of observations, PROC NPAR1WAY sums those scores for the sample (level) that appears first in the displayed output. (By default, class levels are displayed in the order in which they appear in the input data set. If you specify the ORDER=FORMATTED option, class levels are displayed in order of their formatted value.)
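The selection rule above can be sketched in Python. This is an illustration only, not PROC NPAR1WAY's implementation. The function names `midranks` and `linear_rank_statistic` are hypothetical, the default score is the rank itself (a Wilcoxon-type score), and giving tied values their average rank is an assumption made for this sketch. Levels are taken in input order, which mirrors the default display order described above.

```python
def midranks(values):
    """Return 1-based ranks; tied values share their average rank (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def linear_rank_statistic(values, classes, score=lambda r: r):
    """Sum the scores a(R_j) over the smaller of two samples.

    When both samples have the same size, the level that appears first
    in the input is used.
    """
    levels = list(dict.fromkeys(classes))  # distinct levels, in input order
    if len(levels) != 2:
        raise ValueError("expected exactly two class levels")
    # min() returns the first minimal item, so size ties go to the first level.
    target = min(levels, key=classes.count)
    ranks = midranks(values)
    return sum(score(r) for r, c in zip(ranks, classes) if c == target)


values = [1.2, 3.4, 2.2, 5.0, 4.1]
classes = ["A", "B", "A", "B", "B"]
# Level A is the smaller sample and holds ranks 1 and 2.
print(linear_rank_statistic(values, classes))
```

Passing a different `score` function (for example, one that maps ranks to another score type) changes only the last line of `linear_rank_statistic`; the choice of which sample to sum stays the same.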

For each score that you specify, PROC NPAR1WAY computes an asymptotic test of the null hypothesis of no difference between the two classification levels. Exact tests are also available for these two-sample linear rank statistics. PROC NPAR1WAY computes exact tests for each score type that you specify in the EXACT statement. For more information, see the section Exact Tests.

To compute an asymptotic test for a linear rank sum statistic, PROC NPAR1WAY uses a standardized test statistic z, which has an asymptotic standard normal distribution under the null hypothesis. The standardized test statistic is computed as

$$ z = \left( S - \mathrm{E}_0(S) \right) / \sqrt{\mathrm{Var}_0(S)} $$

where $\mathrm{E}_0(S)$ is the expected value of S under the null hypothesis, and $\mathrm{Var}_0(S)$ is the variance under the null hypothesis. As shown in Randles and Wolfe (1979),

$$ \mathrm{E}_0(S) = \frac{n_1}{n} \sum_{j=1}^{n} a(R_j) $$

where $n_1$ is the number of observations in the first (smaller) class level (sample), $n_2$ is the number of observations in the other class level, and

$$ \mathrm{Var}_0(S) = \frac{n_1 n_2}{n (n - 1)} \sum_{j=1}^{n} \left( a(R_j) - \bar{a} \right)^2 $$

where $\bar{a}$ is the average score,

$$ \bar{a} = \frac{1}{n} \sum_{j=1}^{n} a(R_j) $$
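As a worked illustration of the null mean, null variance, and standardized statistic defined above, the following Python sketch evaluates them for a small data set. The function name `standardized_statistic` and the boolean list `in_sample` (standing in for the class indicator of each observation, true for the sample whose scores are summed) are hypothetical names introduced here. No continuity correction is applied in this sketch.

```python
import math


def standardized_statistic(scores, in_sample):
    """Return (S, E0, Var0, z) for scores a(R_j) and a two-level class indicator."""
    n = len(scores)
    n1 = sum(1 for flag in in_sample if flag)  # size of the summed sample
    n2 = n - n1                                # size of the other sample
    s = sum(a for a, flag in zip(scores, in_sample) if flag)
    a_bar = sum(scores) / n                    # average score
    e0 = n1 / n * sum(scores)                  # expected value under the null
    var0 = n1 * n2 / (n * (n - 1)) * sum((a - a_bar) ** 2 for a in scores)
    z = (s - e0) / math.sqrt(var0)
    return s, e0, var0, z


# Five observations with rank scores 1..5; the summed sample holds ranks 1 and 2.
scores = [1, 3, 2, 5, 4]
in_sample = [True, False, True, False, False]
s, e0, var0, z = standardized_statistic(scores, in_sample)
print(s, e0, var0, z)
```

For this input, S = 3, the null expectation is (2/5)(15) = 6, and the null variance is (2)(3)/((5)(4)) times 10, which is 3, so z is -3 divided by the square root of 3.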

Definition of p-Values

PROC NPAR1WAY computes one-sided and two-sided asymptotic p-values for each two-sample linear rank test. When the test statistic z is greater than its null hypothesis expected value of 0, PROC NPAR1WAY computes the right-sided p-value, which is the probability of a larger value of the statistic occurring under the null hypothesis. When the test statistic is less than or equal to 0, PROC NPAR1WAY computes the left-sided p-value, which is the probability of a smaller value of the statistic occurring under the null hypothesis. The one-sided p-value can be expressed as

$$
P_1(z) =
\begin{cases}
\mathrm{Prob}(Z > z) & \text{if } z > 0 \\
\mathrm{Prob}(Z < z) & \text{if } z \le 0
\end{cases}
$$

where Z has a standard normal distribution. The two-sided p-value is computed as

$$ P_2(z) = \mathrm{Prob}(|Z| > |z|) $$
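The two p-value definitions above can be evaluated with nothing more than the complementary error function. The sketch below is illustrative, and the function names `normal_upper_tail`, `one_sided_p`, and `two_sided_p` are hypothetical. It relies on the standard identity Prob(Z > x) = erfc(x / sqrt(2)) / 2 for a standard normal Z.

```python
import math


def normal_upper_tail(x):
    """Prob(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def one_sided_p(z):
    """Right-sided p-value when z > 0, left-sided p-value when z <= 0."""
    if z > 0:
        return normal_upper_tail(z)   # Prob(Z > z)
    return normal_upper_tail(-z)      # Prob(Z < z), by symmetry of the normal


def two_sided_p(z):
    """Prob(|Z| > |z|)."""
    return 2.0 * normal_upper_tail(abs(z))


z = -1.5
print(one_sided_p(z), two_sided_p(z))
```

Because the standard normal distribution is symmetric, the one-sided p-value never exceeds 0.5 under this definition, and the two-sided p-value is twice the one-sided value.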

Continuity Correction

PROC NPAR1WAY uses a continuity correction for the asymptotic two-sample Wilcoxon and Siegel-Tukey tests by default. You can remove the continuity correction by specifying the CORRECT=NO option. PROC NPAR1WAY incorporates the continuity correction when computing the standardized test statistic z: if the numerator $S - \mathrm{E}_0(S)$ is greater than 0, PROC NPAR1WAY subtracts 0.5 from it, and if the numerator is less than 0, PROC NPAR1WAY adds 0.5 to it. Some sources recommend a continuity correction for nonparametric tests that use a continuous distribution to approximate a discrete distribution. For more information, see Sheskin (1997).

If you specify CORRECT=NO, PROC NPAR1WAY does not use a continuity correction for any test.
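The correction rule described in this section amounts to moving the numerator of z by 0.5 toward zero. A minimal Python sketch follows. The function name `corrected_z` and its `correct` flag are hypothetical and only loosely mirror the CORRECT=NO option. Leaving a numerator of exactly 0 unchanged is an assumption, since the text covers only the positive and negative cases.

```python
import math


def corrected_z(s, e0, var0, correct=True):
    """Standardized statistic with an optional 0.5 continuity correction."""
    numerator = s - e0
    if correct:
        if numerator > 0:
            numerator -= 0.5   # positive numerator: subtract 0.5
        elif numerator < 0:
            numerator += 0.5   # negative numerator: add 0.5
        # A numerator of exactly 0 is left as is (assumption, see lead-in).
    return numerator / math.sqrt(var0)


# Hypothetical inputs: S = 3 with null mean 6 and null variance 3.
print(corrected_z(3, 6, 3))                  # numerator becomes -2.5
print(corrected_z(3, 6, 3, correct=False))   # numerator stays -3
```

The corrected statistic is never larger in absolute value than the uncorrected one, so the correction makes the asymptotic p-value at least as large.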

Last updated: December 09, 2022