The MULTTEST Procedure

Statistical Tests

The following section discusses the statistical tests performed in the MULTTEST procedure. For continuous data, a t test for the mean (MEAN) is available. For discrete variables, available tests are the Cochran-Armitage linear trend test (CA), the Freeman-Tukey double arcsine test (FT), the Peto mortality-prevalence test (PETO), and the Fisher exact test (FISHER).

Throughout this section, the discrete and continuous variables are denoted by $S_{vgsr}$ and $X_{vgsr}$, respectively, where v is the variable, g is the treatment group, s is the stratum, and r is the replication. Let $m_{vgs}$ denote the sample size for a binary variable v within group g and stratum s. A plus sign (+) subscript denotes summation over an index. Note that the tests are invariant to the location and scale of the contrast coefficients $t_g$.

Cochran-Armitage Linear Trend Test

The Cochran-Armitage linear trend test (Cochran 1954; Armitage 1955; Agresti 2002) is implemented by using a Z-score approximation, an exact permutation distribution, or a combination of both.

Z-Score Approximation

The pooled probability estimate for variable v and stratum s is

\[ p_{vs} = \frac{S_{v+s+}}{m_{v+s}} \]

The expected value (under constant within-stratum treatment probabilities) for variable v, group g, and stratum s is

\[ E_{vgs} = m_{vgs}\, p_{vs} \]

Letting $t_g$ denote the contrast trend coefficients specified by the CONTRAST statement, the test statistic for variable v has numerator

\[ N_v = \sum_s \sum_g t_g \left( S_{vgs+} - E_{vgs} \right) \]

The binomial variance estimate for this statistic is

\[ V_v = \sum_s p_{vs} (1 - p_{vs}) \sum_g m_{vgs} \left( t_g - \bar{t}_{vs} \right)^2 \]

where

\[ \bar{t}_{vs} = \sum_g \frac{m_{vgs}\, t_g}{m_{v+s}} \]

The hypergeometric variance estimate (the default) is

\[ V_v = \sum_s \left\{ \frac{m_{v+s}}{m_{v+s} - 1} \right\} p_{vs} (1 - p_{vs}) \sum_g m_{vgs} \left( t_g - \bar{t}_{vs} \right)^2 \]

For any stratum s with $m_{v+s} \le 1$, the contribution to the variance is taken to be zero.

PROC MULTTEST computes the Z-score statistic

\[ Z_v = \frac{N_v}{\sqrt{V_v}} \]

The p-value for this statistic comes from the standard normal distribution. Whenever a 0 is computed for the denominator, the p-value is set to 1. This p-value approximates the probability obtained from the exact permutation distribution, discussed in the following text.

The Z-score statistic can be continuity-corrected to better approximate the permutation distribution. With continuity correction c, the upper-tailed p-value is computed from

\[ Z_v = \frac{N_v - c}{\sqrt{V_v}} \]

For two-tailed, noncontinuity-corrected tests, PROC MULTTEST reports the p-value as $2 \min(p, 1 - p)$, where p is the upper-tailed p-value. The same formula holds for the continuity-corrected test, with the exception that when the noncontinuity-corrected Z and the continuity-corrected Z have opposite signs, the two-tailed p-value is 1.
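The upper-tailed Z-score computation described in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure's implementation: the function name `ca_z_score`, its argument layout (one list of per-group counts per stratum), and the choice to skip strata with at most one observation entirely are assumptions made for the example.

```python
import math

def ca_z_score(successes, sizes, scores, hypergeometric=True, c=0.0):
    """Cochran-Armitage Z-score sketch for one variable.

    successes[s][g] and sizes[s][g] hold the success count and sample size
    for stratum s and group g; scores[g] is the trend coefficient t_g.
    Returns (numerator, variance, z, upper-tailed p-value).
    """
    numerator = 0.0
    variance = 0.0
    for S_s, m_s in zip(successes, sizes):
        m_tot = sum(m_s)
        if m_tot <= 1:
            continue  # such a stratum contributes nothing (assumed skip)
        p = sum(S_s) / m_tot                      # pooled probability
        t_bar = sum(m * t for m, t in zip(m_s, scores)) / m_tot
        numerator += sum(t * (S - m * p) for t, S, m in zip(scores, S_s, m_s))
        spread = sum(m * (t - t_bar) ** 2 for m, t in zip(m_s, scores))
        factor = m_tot / (m_tot - 1) if hypergeometric else 1.0
        variance += factor * p * (1 - p) * spread
    if variance <= 0:
        return numerator, variance, None, 1.0     # zero denominator: p-value is 1
    z = (numerator - c) / math.sqrt(variance)
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))   # 1 - Phi(z)
    return numerator, variance, z, p_upper

n, v, z, p = ca_z_score([[1, 3, 5]], [[10, 10, 10]], [0, 1, 2])
print(n, v, z, p)
```

Passing `hypergeometric=False` switches to the binomial variance estimate, and a nonzero `c` applies a continuity correction to the numerator.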

When the PERMUTATION= option is specified and no STRATA variable is specified, PROC MULTTEST uses a continuity correction selected to optimally approximate the upper-tail probability of permutation distributions with smaller marginal totals (Westfall and Lin 1988). Otherwise, the continuity correction is specified by the CONTINUITY= option in the TEST statement.

The CA Z-score statistic is the Hoel-Walburg (Mantel-Haenszel) statistic reported by Dinse (1985).

Exact Permutation Test

When you use the PERMUTATION= option for CA in the TEST statement, PROC MULTTEST computes the exact permutation distribution of the trend score

\[ T_v = \sum_s \sum_g t_g S_{vgs+} \]

where the contrast trend coefficients must be integer valued. The observed value of this trend is compared to the permutation distribution to obtain the p-value

\[ p_v = \Pr\left( X \ge \text{observed } T_v \right) \]

where X is a random variable from the permutation distribution and where upper-tailed tests are requested. This probability can be viewed as a binomial probability, where the within-stratum probabilities are constant and where the probability is conditional with respect to the marginal totals $S_{v+s+}$. It also can be considered a rerandomization probability.

Because the computations can be quite time-consuming with large data sets, specifying the PERMUTATION=number option in the TEST statement limits the situations where PROC MULTTEST computes the exact permutation distribution. When marginal total success or total failure frequencies exceed number for a particular stratum, the permutation distribution is approximated by a continuity-corrected normal distribution. You should be cautious when using the PERMUTATION= option in conjunction with bootstrap resampling because the permutation distribution is recomputed for each bootstrap sample. This recomputation is not necessary with permutation resampling.

The permutation distribution is computed in two steps:

1. The permutation distributions of the trend scores are computed within each stratum.
2. The distributions are convolved to obtain the distribution of the total trend.

As long as the total success or failure frequency does not exceed number for any stratum, the computed distributions are exact. In other words, if $S_{v+s+} \le$ number or $(m_{v+s} - S_{v+s+}) \le$ number for all s, then the permutation trend distribution for variable v is computed exactly.

In step 1, the distribution of the within-stratum trend

\[ \sum_g t_g S_{vgs+} \]

is computed by using the multivariate hypergeometric distribution of the $S_{vgs+}$, provided number is not exceeded. This distribution can be written as

\[ \Pr\left( S_{v1s+}, S_{v2s+}, \ldots, S_{vGs+} \right) = \frac{\prod_{g=1}^{G} \binom{m_{vgs}}{S_{vgs+}}}{\binom{m_{v+s}}{S_{v+s+}}} \]

The distribution of the within-stratum trend is then computed by summing these probabilities over appropriate configurations. For further information about this technique, see Bickis and Krewski (1986) and Westfall and Lin (1988). In step 2, the exact convolution distribution is obtained for the trend statistic summed over all strata having totals that meet the threshold criterion. This distribution is obtained by applying the fast Fourier transform to the exact within-stratum distributions. A description of this general method can be found in Pagano and Tritchler (1983) and Good (1987).
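The two steps can be illustrated with a small Python sketch. It enumerates every configuration of group successes that matches the stratum margin, weights each by its multivariate hypergeometric probability, and then combines strata. Note the substitutions: the strata are combined by direct convolution rather than by the fast Fourier transform, and the function names (`stratum_trend_distribution`, `convolve`, `upper_tail`) are hypothetical. Brute-force enumeration is practical only for small strata.

```python
from collections import defaultdict
from itertools import product
from math import comb

def stratum_trend_distribution(sizes, total_successes, scores):
    """Exact distribution of sum_g t_g * S_g within one stratum,
    conditional on the total number of successes in that stratum."""
    denom = comb(sum(sizes), total_successes)
    dist = defaultdict(float)
    for config in product(*(range(m + 1) for m in sizes)):
        if sum(config) != total_successes:
            continue  # keep only configurations that match the margin
        ways = 1
        for m, k in zip(sizes, config):
            ways *= comb(m, k)
        trend = sum(t * k for t, k in zip(scores, config))
        dist[trend] += ways / denom
    return dict(dist)

def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent trend scores."""
    out = defaultdict(float)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] += pa * pb
    return dict(out)

def upper_tail(dist, observed):
    """Upper-tailed permutation p-value, Pr(X >= observed)."""
    return sum(p for value, p in dist.items() if value >= observed)

one = stratum_trend_distribution([2, 2], 2, [0, 1])
both = convolve(one, one)  # two strata with identical margins
print(upper_tail(one, 2), upper_tail(both, 3))
```

Because the trend coefficients are integers, the dictionary keys are exact integer trend values, so the tail comparison involves no rounding.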

The convolution distribution of the overall trend is then computed by convolving the exact distribution with the distribution of the continuity-corrected standard normal approximation. To be more specific, let $S_1$ denote the subset of stratum indices that satisfy the threshold criterion, and let $S_2$ denote the subset of indices that do not satisfy the criterion. Let $T_{v1}$ denote the combined trend statistic from the set $S_1$, which has an exact distribution obtained from Fourier analysis as previously outlined, and let $T_{v2}$ denote the combined trend statistic from the set $S_2$. Then the distribution of the overall trend $T_v = T_{v1} + T_{v2}$ is obtained by convolving the analytic distribution of $T_{v1}$ with the continuity-corrected normal approximation for $T_{v2}$. Using the notation from the section Z-Score Approximation, this convolution can be written as

\[ \Pr\left( T_{v1} + T_{v2} \ge u \right) = \sum_{u_1} \Pr\left( T_{v2} \ge u - u_1 \right) \Pr\left( T_{v1} = u_1 \right) \approx \sum_{u_1} \Pr(Z \ge z)\, \Pr\left( T_{v1} = u_1 \right) \]

where Z is a standard normal random variable, and

\[ z = \frac{1}{\sqrt{V_v}} \left( u - u_1 - \sum_{s \in S_2} p_{vs} \sum_g t_g m_{vgs} - c \right) \]

In this expression, the summation over s in $V_v$ is over $S_2$, and c is the continuity correction discussed under the Z-score approximation.

When a two-tailed test is requested, the expected trend is computed as

\[ E_v = \sum_s \sum_g t_g E_{vgs} \]

The two-tailed p-value is reported as the permutation tail probability for the observed trend $T_v$ plus the permutation tail probability for $2 E_v - T_v$, the reflected trend.

Freeman-Tukey Double Arcsine Test

For this test, the contrast trend coefficients $t_1, \ldots, t_G$ are centered to the values $c_1, \ldots, c_G$, where $c_g = t_g - \bar{t}$, $\bar{t} = \sum_g t_g / G$, and G is the number of groups. The numerator of this test statistic is

\[ N_v = \sum_s w_{vs} \sum_g c_g\, f\left( S_{vgs+}, m_{vgs} \right) \]

where the weights $w_{vs}$ take one of three forms, depending on the WEIGHT= option that you specify in the STRATA statement. The default value is the within-stratum sample size $m_{v+s}$, ensuring comparability with the ordinary CA trend statistic. WEIGHT=HARMONIC sets $w_{vs}$ equal to the harmonic mean

\[ \left[ \left( \sum_g \frac{1}{m_{vgs}} \right) \Big/ G^{*} \right]^{-1} \]

where $G^{*}$ is the number of nonmissing groups and the summation is over only the nonmissing elements. The harmonic means analysis places more weight on the smaller sample sizes than does the default sample size method, and is similar to a Type 2 analysis in PROC GLM. WEIGHT=EQUAL sets $w_{vs} = 1$ for all v and s, and is similar to a Type 3 analysis in PROC GLM.

The function $f(r, n)$ is the double arcsine transformation:

\[ f(r, n) = \arcsin\left( \sqrt{\frac{r}{n + 1}} \right) + \arcsin\left( \sqrt{\frac{r + 1}{n + 1}} \right) \]

The variance estimate is

\[ V_v = \sum_s w_{vs}^2 \sum_g \frac{c_g^2}{m_{vgs} + \frac{1}{2}} \]

The test statistic is

\[ Z_v = \frac{N_v}{\sqrt{V_v}} \]

The Freeman-Tukey transformation and its variance are described by Freeman and Tukey (1950) and Miller (1978). Since its variance is not weighted by the pooled probabilities, as is the CA test, the FT test can be more useful than the CA test for tests involving only a subset of the groups.
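A minimal Python sketch of the Freeman-Tukey statistic follows. It rests on stated assumptions: centering is taken to mean subtracting the unweighted mean of the coefficients, the statistic is taken to be the numerator divided by the square root of the variance estimate, and every group is treated as nonmissing when the variance is accumulated. The names `double_arcsine` and `ft_statistic` and the `weight` keyword values are hypothetical stand-ins for the WEIGHT= choices.

```python
import math

def double_arcsine(r, n):
    """Freeman-Tukey double arcsine transformation f(r, n)."""
    return (math.asin(math.sqrt(r / (n + 1)))
            + math.asin(math.sqrt((r + 1) / (n + 1))))

def ft_statistic(successes, sizes, scores, weight="sample"):
    """Freeman-Tukey trend statistic sketch.

    successes[s][g], sizes[s][g]: counts per stratum and group.
    weight: "sample" (stratum size), "harmonic", or "equal".
    """
    t_bar = sum(scores) / len(scores)
    c = [t - t_bar for t in scores]          # centered coefficients
    numerator = 0.0
    variance = 0.0
    for S_s, m_s in zip(successes, sizes):
        if weight == "harmonic":
            present = [m for m in m_s if m > 0]
            w = len(present) / sum(1.0 / m for m in present)
        elif weight == "equal":
            w = 1.0
        else:
            w = float(sum(m_s))
        numerator += w * sum(cg * double_arcsine(S, m)
                             for cg, S, m in zip(c, S_s, m_s))
        variance += w ** 2 * sum(cg ** 2 / (m + 0.5) for cg, m in zip(c, m_s))
    return numerator / math.sqrt(variance)

print(ft_statistic([[1, 8]], [[10, 10]], [0, 1]))
```

With a single stratum the weight cancels between numerator and denominator, so the three weighting choices differ only when several strata are combined.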

Peto Mortality-Prevalence Trend Test

The Peto test is a modified Cochran-Armitage procedure incorporating mortality and prevalence information. The Peto test is computed like two Cochran-Armitage Z-score approximations, one for prevalence and one for mortality (Peto et al. 1980). It represents a special case in PROC MULTTEST because the data structure requirements are different, and the resampling methods used for adjusting p-values are not valid. The TIME= option variable is required to specify "death" times or, more generally, times of occurrence. In addition, the test variables must assume one of the following three values:

0 = no occurrence
1 = incidental occurrence
2 = fatal occurrence

Use the TIME= option variable to define the mortality strata, and use the STRATA statement variable to define the prevalence strata.

In the following notation, the subscript v represents the variable, g represents the treatment group, s represents the stratum, and t represents the time. Recall that a plus sign in a subscript location denotes summation over that subscript.

Let $S_{vgs}^{P}$ be the number of incidental occurrences, and let $m_{vgs}^{P}$ be the total sample size for variable v in group g, stratum s, excluding fatal tumors.

Let $S_{vgt}^{F}$ be the number of fatal occurrences in time period t, and let $m_{vgt}^{F}$ be the number of patients alive at the end of time $t - 1$.

The pooled probability estimates are given by

\[ \begin{aligned} p_{vs}^{P} &= \frac{S_{v+s}^{P}}{m_{v+s}^{P}} \\ p_{vt}^{F} &= \frac{S_{v+t}^{F}}{m_{v+t}^{F}} \end{aligned} \]

The expected values are

\[ \begin{aligned} E_{vgs}^{P} &= m_{vgs}^{P}\, p_{vs}^{P} \\ E_{vgt}^{F} &= m_{vgt}^{F}\, p_{vt}^{F} \end{aligned} \]

Let $t_g$ denote a contrast trend coefficient, and define the numerator terms as follows:

\[ \begin{aligned} N_v^{P} &= \sum_s \sum_g t_g \left( S_{vgs}^{P} - E_{vgs}^{P} \right) \\ N_v^{F} &= \sum_t \sum_g t_g \left( S_{vgt}^{F} - E_{vgt}^{F} \right) \end{aligned} \]

Define the denominator variance terms by using the binomial variance:

\[ \begin{aligned} V_v^{P} &= \sum_s p_{vs}^{P} \left( 1 - p_{vs}^{P} \right) \left[ \left( \sum_g m_{vgs}^{P} t_g^2 \right) - \frac{1}{m_{v+s}^{P}} \left( \sum_g m_{vgs}^{P} t_g \right)^2 \right] \\ V_v^{F} &= \sum_t p_{vt}^{F} \left( 1 - p_{vt}^{F} \right) \left[ \left( \sum_g m_{vgt}^{F} t_g^2 \right) - \frac{1}{m_{v+t}^{F}} \left( \sum_g m_{vgt}^{F} t_g \right)^2 \right] \end{aligned} \]

The hypergeometric variances (the default) are calculated by weighting the within-strata variances as discussed in the section Z-Score Approximation.

The Peto statistic is computed as

\[ Z_v = \frac{N_v^{P} + N_v^{F} - c}{\sqrt{V_v^{P} + V_v^{F}}} \]

where c is a continuity correction. The p-value is determined from the standard normal distribution unless the PERMUTATION=number option is used. When you use the PERMUTATION= option for PETO in the TEST statement, PROC MULTTEST computes the "discrete approximation" permutation distribution described by Mantel (1980) and Soper and Tonkonoh (1993). Specifically, the permutation distribution of $\sum_s \sum_g t_g S_{vgs}^{P} + \sum_t \sum_g t_g S_{vgt}^{F}$ is computed, assuming that $\{S_{v1s}^{P}, \ldots, S_{vGs}^{P}\}$ and $\{S_{v1t}^{F}, \ldots, S_{vGt}^{F}\}$ are independent over all s and t. Note that the contrast trend coefficients must be integer valued. The p-values are exact under this independence assumption. However, the independence assumption is valid only asymptotically, which is why these p-values are called "approximate."

An exact permutation distribution is available only under the assumption of equal risk of censoring in all treatment groups; even then, computing this distribution can be cumbersome. Soper and Tonkonoh (1993) describe situations where the discrete approximation distribution closely fits the exact permutation distribution.
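The normal-approximation form of the Peto statistic can be sketched as two trend components that share one helper, one evaluated over prevalence strata and one over mortality time periods. The sketch below uses the binomial variance form only; the hypergeometric weighting and the permutation distribution are not shown. The function names and the list-of-lists data layout are assumptions made for illustration.

```python
import math

def trend_components(occurrences, at_risk, scores):
    """Return (N, V) for one component, using the binomial variance.

    occurrences[k][g] and at_risk[k][g] are indexed by stratum (prevalence)
    or time period (mortality) k and by group g.
    """
    N = 0.0
    V = 0.0
    for S_k, m_k in zip(occurrences, at_risk):
        m_tot = sum(m_k)
        if m_tot == 0:
            continue  # nothing at risk in this stratum or period
        p = sum(S_k) / m_tot
        N += sum(t * (S - m * p) for t, S, m in zip(scores, S_k, m_k))
        sum_mt = sum(m * t for m, t in zip(m_k, scores))
        sum_mt2 = sum(m * t * t for m, t in zip(m_k, scores))
        V += p * (1 - p) * (sum_mt2 - sum_mt ** 2 / m_tot)
    return N, V

def peto_z(incidental, size_p, fatal, size_f, scores, c=0.0):
    """Combine prevalence (P) and mortality (F) components into one Z."""
    n_p, v_p = trend_components(incidental, size_p, scores)
    n_f, v_f = trend_components(fatal, size_f, scores)
    total_v = v_p + v_f
    if total_v <= 0:
        return None  # zero denominator: no statistic is formed
    return (n_p + n_f - c) / math.sqrt(total_v)

print(peto_z([[1, 3, 5]], [[10, 10, 10]], [[1, 3, 5]], [[10, 10, 10]], [0, 1, 2]))
```

The bracketed variance term is algebraically the same weighted sum of squared deviations used in the Cochran-Armitage variance, which is why one helper serves both components.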

Fisher Exact Test

The CONTRAST statement in PROC MULTTEST enables you to compute Fisher exact tests for two-group comparisons. No stratification variable is allowed for this test. Note, however, that the FISHER exact test is a special case of the exact permutation tests performed by PROC MULTTEST and that these permutation tests allow a stratification variable. Recall that contrast coefficients can be –1, 0, or 1 for the Fisher test. The frequencies and sample sizes of the groups scored as –1 are combined, as are the frequencies and sample sizes of the groups scored as 1. Groups scored as 0 are excluded. The –1 group is then compared with the 1 group by using the Fisher exact test.

Letting x and m denote the frequency and sample size of the 1 group, and letting y and n denote those of the –1 group, the p-value is calculated as

\[ \Pr\left( X \ge x \mid X + Y = x + y \right) = \sum_{i=x}^{m} \frac{\binom{m}{i} \binom{n}{x + y - i}}{\binom{m + n}{x + y}} \]

where X and Y are independent binomially distributed random variables with sample sizes m and n and a common probability parameter. The hypergeometric distribution is used to determine the stated probability; Yates (1984) discusses this technique. PROC MULTTEST computes the two-tailed p-values by adding probabilities from both tails of the hypergeometric distribution. The first tail is from the observed x and y, and the other tail is chosen so that the resulting probability is as large as possible without exceeding the probability from the first tail. If the variable being tested has only one level, then the p-value is set to 1.
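The upper-tailed probability is a direct sum of hypergeometric terms, as the following Python sketch shows. Only the upper tail is implemented; the two-tailed rule described above is not. The function names are hypothetical.

```python
from math import comb

def hypergeom_pmf(i, x, y, m, n):
    """Pr(X = i | X + Y = x + y) for group sizes m and n."""
    k = x + y
    if i < 0 or i > m or k - i < 0 or k - i > n:
        return 0.0  # configuration is impossible given the margins
    return comb(m, i) * comb(n, k - i) / comb(m + n, k)

def fisher_upper(x, m, y, n):
    """Upper-tailed Fisher exact p-value, Pr(X >= x | X + Y = x + y)."""
    return sum(hypergeom_pmf(i, x, y, m, n) for i in range(x, m + 1))

# 3 of 3 successes in the "1" group versus 0 of 3 in the "-1" group
print(fisher_upper(3, 3, 0, 3))
```

In this example only one table is as extreme as the observed one, so the p-value is 1 divided by the number of ways to choose 3 successes from 6 observations.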

t Test for the Mean

For continuous variables, PROC MULTTEST automatically centers the contrast trend coefficients, as in the Freeman-Tukey test. These centered coefficients are then used to form a t statistic contrasting the within-group means. Let $n_{vgs}$ denote the sample size within group g and stratum s; it depends on variable v only when there are missing values. Determine the weights $w_{vs}$ as in the Freeman-Tukey test with $n_{vgs}$ replacing $m_{vgs}$. Define

\[ \bar{X}_{vgs+} = \frac{1}{n_{vgs}} \sum_r X_{vgsr} \]

as the sample mean within a group-and-stratum combination, and let $\mu_{vgs}$ denote the treatment means. Write the null hypothesis as

\[ \sum_s w_{vs} \sum_g c_g\, \mu_{vgs} = 0 \]

Also define

\[ s_v^2 = \frac{\sum_s \sum_g \sum_r \left( X_{vgsr} - \bar{X}_{vgs+} \right)^2}{\sum_s \sum_g \left( n_{vgs} - 1 \right)} \]

as the pooled sample variance.

Homogeneous Variance

Assuming constant variance for all group-and-stratum combinations, the t statistic for the mean is

\[ M_v = \frac{\sum_s w_{vs} \sum_g c_g\, \bar{X}_{vgs+}}{\sqrt{s_v^2 \left( \sum_s w_{vs}^2 \sum_g \frac{c_g^2}{n_{vgs}} \right)}} \]

Then under the null hypothesis and assuming normality, independence, and homoscedasticity, $M_v$ follows a t distribution with $\sum_s \sum_g (n_{vgs} - 1)$ degrees of freedom.

Whenever a denominator of 0 is computed, the p-value is set to 1. When missing data force $n_{vgs} = 0$, the contribution to the denominator of the pooled variance is 0 and not –1. This is also true for the degrees of freedom.
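The pooled-variance t statistic can be sketched in Python as follows. The sketch assumes the default weighting (stratum sample size), centers the coefficients by subtracting their unweighted mean, and skips empty cells so that they add 0 to the degrees of freedom. The function name `pooled_t` and the nested-list data layout are assumptions for illustration.

```python
import math

def pooled_t(data, scores):
    """Homogeneous-variance contrast t statistic sketch.

    data[s][g] is the list of observations for stratum s and group g;
    scores[g] is the trend coefficient t_g. Returns (t, degrees of freedom).
    """
    t_bar = sum(scores) / len(scores)
    c = [t - t_bar for t in scores]      # centered coefficients
    ss = 0.0          # pooled sum of squared deviations
    df = 0            # pooled degrees of freedom
    contrast = 0.0    # numerator of the statistic
    scale = 0.0       # sum of w^2 * c^2 / n
    for stratum in data:
        w = sum(len(obs) for obs in stratum)   # default weight (assumed)
        for cg, obs in zip(c, stratum):
            n = len(obs)
            if n == 0:
                continue  # an empty cell contributes 0, not -1
            cell_mean = sum(obs) / n
            ss += sum((x - cell_mean) ** 2 for x in obs)
            df += n - 1
            contrast += w * cg * cell_mean
            scale += w ** 2 * cg ** 2 / n
    if df == 0 or ss == 0 or scale == 0:
        return None, df   # zero denominator: p-value would be set to 1
    return contrast / math.sqrt(ss / df * scale), df

print(pooled_t([[[1, 2, 3], [2, 3, 4]]], [0, 1]))
```

For one stratum and two groups, this reduces to the ordinary two-sample pooled t statistic, which gives a convenient check.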

Heterogeneous Variance

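If you do not assume constant variance for all group-and-stratum combinations, then the approximate t test is

\[ M_v = \frac{\sum_s w_{vs} \sum_g c_g\, \bar{X}_{vgs+}}{\sqrt{\sum_s w_{vs}^2 \sum_g c_g^2 \frac{s_{vgs}^2}{n_{vgs}}}} \]

where $s_{vgs}^2$ denotes the sample variance within group g and stratum s.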

Under the null hypothesis and assuming normality and independence, the Satterthwaite (1946) approximation for the degrees of freedom of the t test is given by

\[ df_s = \frac{\left( \sum_s w_{vs}^2 \sum_g c_g^2 \frac{s_{vgs}^2}{n_{vgs}} \right)^2}{\sum_s \sum_g \frac{\left( w_{vs}^2 c_g^2 \frac{s_{vgs}^2}{n_{vgs}} \right)^2}{n_{vgs} - 1}} \]

under the restriction $1 \le df_s \le \sum_s \sum_g (n_{vgs} - 1)$.

Whenever a denominator of 0 for $M_v$ is computed, the p-value is set to 1. If the denominator for $df_s$ is computed as 0, then set . When missing data force $n_{vgs} \le 1$, that group-and-stratum combination does not contribute to the computation.
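The Satterthwaite degrees of freedom can be computed alongside an unpooled contrast statistic, as in the Python sketch below. Several points are assumptions: the statistic is formed with per-cell sample variances in place of the pooled variance, the default stratum-size weight is used, cells with fewer than two observations are skipped, and no bounds are applied to the resulting degrees of freedom. The function name `welch_contrast` is hypothetical.

```python
import math
from statistics import mean, variance

def welch_contrast(data, scores):
    """Heterogeneous-variance contrast sketch.

    data[s][g] is the list of observations for stratum s and group g.
    Returns (t statistic, Satterthwaite degrees of freedom).
    """
    t_bar = sum(scores) / len(scores)
    c = [t - t_bar for t in scores]          # centered coefficients
    contrast = 0.0
    terms = []   # (w^2 * c^2 * s^2 / n, n - 1) for each contributing cell
    for stratum in data:
        w = sum(len(obs) for obs in stratum)  # default weight (assumed)
        for cg, obs in zip(c, stratum):
            n = len(obs)
            if n <= 1:
                continue  # no variance estimate: cell does not contribute
            contrast += w * cg * mean(obs)
            terms.append((w ** 2 * cg ** 2 * variance(obs) / n, n - 1))
    total = sum(v for v, _ in terms)
    denom = sum(v ** 2 / d for v, d in terms)
    if total == 0 or denom == 0:
        return None, None   # degenerate case, handled separately
    return contrast / math.sqrt(total), total ** 2 / denom

print(welch_contrast([[[1, 2, 3], [2, 3, 4]]], [0, 1]))
```

When all cells have the same size and the same sample variance, the approximation returns the pooled degrees of freedom, which is the case checked in the example.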

Last updated: December 09, 2022