Exact statistics can be useful in situations where the asymptotic assumptions are not met and therefore the asymptotic p-values might not be close approximations for the true p-values. Standard asymptotic methods involve the assumption that the test statistic follows a particular distribution when the sample size is sufficiently large. When the sample size is not large, asymptotic results might not be valid. Asymptotic results might also be unreliable when the distribution of the data is sparse, skewed, or heavily tied. For more information, see Agresti (2007) and Bishop, Fienberg, and Holland (1975). Exact computations are based on the statistical theory of exact conditional inference for contingency tables, which is reviewed by Agresti (1992).
In addition to the computation of exact p-values, PROC FREQ provides the option to estimate exact p-values by Monte Carlo simulation. This can be useful for large problems where exact computations require a substantial amount of time and memory but asymptotic approximations might not be sufficient.
Exact p-values are available for many tests that PROC FREQ performs. For one-way tables, PROC FREQ provides exact p-values for the binomial proportion test, the chi-square goodness-of-fit test, and the likelihood ratio chi-square test. PROC FREQ also provides exact (Clopper-Pearson) confidence limits for the binomial proportion.
For two-way tables, PROC FREQ provides exact p-values for the following tests: Pearson chi-square test, likelihood ratio chi-square test, Mantel-Haenszel chi-square test, Fisher’s exact test, Jonckheere-Terpstra test, Cochran-Armitage test for trend, and the symmetry test. PROC FREQ also provides exact p-values for tests of the following statistics: Pearson correlation coefficient, Spearman correlation coefficient, Kendall’s tau-b, Stuart’s tau-c, Somers’ $D(C|R)$, Somers’ $D(R|C)$, simple kappa coefficient, and weighted kappa coefficient.
For $2 \times 2$ tables, PROC FREQ provides the exact McNemar’s test, exact confidence limits for the odds ratio, and Barnard’s unconditional exact test for the risk (proportion) difference. PROC FREQ also provides exact unconditional confidence limits for the risk (proportion) difference and for the relative risk (ratio of proportions). For stratified $2 \times 2$ tables, PROC FREQ provides Zelen’s exact test for equal odds ratios, exact confidence limits for the common odds ratio, and an exact test for the common odds ratio.
The following sections summarize the exact computational algorithms, define the exact p-values that PROC FREQ computes, discuss the computational resource requirements, and describe the Monte Carlo estimation option.
PROC FREQ computes exact p-values for general tables by using the network algorithm, which was developed by Mehta and Patel (1983). This algorithm provides a substantial advantage over direct enumeration, which can be very time-consuming and feasible only for small problems. See Agresti (1992) for a review of algorithms for computation of exact p-values, and see Mehta, Patel, and Tsiatis (1984) and Mehta, Patel, and Senchaudhuri (1991) for information about the performance of the network algorithm.
To implement the network algorithm, PROC FREQ defines a reference set from the input data. For most exact tests that PROC FREQ provides, the reference set includes all tables that have the same marginal row and column sums as the observed table. Corresponding to the reference set, the network algorithm forms a directed acyclic network consisting of nodes in a number of stages. A path through the network corresponds to a distinct table in the reference set. The distances between nodes are defined so that the total distance of a path through the network is the corresponding value of the test statistic. At each node, the algorithm computes the shortest and longest path distances for all the paths that pass through that node. For statistics that can be expressed as a linear combination of cell frequencies multiplied by increasing row and column scores, PROC FREQ computes shortest and longest path distances by using the algorithm of Agresti, Mehta, and Patel (1990). For statistics of other forms, PROC FREQ computes an upper bound for the longest path and a lower bound for the shortest path by following the approach of Valz and Thompson (1994).
The longest and shortest path distances (bounds) for a node are compared to the value of the test statistic to determine whether all paths through the node contribute to the p-value, no paths through the node contribute to the p-value, or neither of these situations occurs. If all paths through the node contribute, the p-value is incremented accordingly, and these paths are eliminated from further analysis. If no paths contribute, these paths are eliminated from further analysis. Otherwise, the algorithm continues to process this node and the associated paths. The algorithm finishes when all nodes have been accounted for.
PROC FREQ performs the network algorithm by using full numerical precision to represent all statistics, row and column scores, and other quantities in the computations. Although it is possible to use rounding to improve the speed and memory requirements of the algorithm, PROC FREQ does not use rounding because it might reduce the accuracy of the results.
For one-way tables, PROC FREQ computes the exact chi-square goodness-of-fit test by the method of Radlow and Alf (1975). PROC FREQ generates all possible one-way tables with the observed total sample size and number of categories. For each possible table, PROC FREQ compares its chi-square value with the value for the observed table. If the table’s chi-square value is greater than or equal to the observed chi-square, PROC FREQ increments the exact p-value by the probability of that table, which is calculated under the null hypothesis by using the multinomial frequency distribution. By default, the null hypothesis states that all categories have equal proportions. If you specify null hypothesis proportions or frequencies by using the TESTP= or TESTF= option in the TABLES statement, PROC FREQ calculates the exact chi-square test based on that null hypothesis.
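The enumeration described above can be sketched in a few lines of Python. This is only an illustration of the idea, not PROC FREQ's implementation: the function names (`compositions`, `multinomial_prob`, `exact_gof_pvalue`) and the floating-point tolerance `tol` used for the "greater than or equal to" comparison are assumptions introduced here, and the approach is practical only for small sample sizes and few categories.

```python
from math import factorial, prod

def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers that sums to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def multinomial_prob(counts, probs):
    """Multinomial probability of `counts` under null proportions `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))

def chi_square(counts, probs):
    """Pearson goodness-of-fit statistic for a one-way table."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))

def exact_gof_pvalue(observed, probs=None, tol=1e-9):
    """Sum the null probabilities of all tables at least as extreme as `observed`."""
    k, n = len(observed), sum(observed)
    if probs is None:
        probs = [1.0 / k] * k  # default null hypothesis: equal proportions
    x_obs = chi_square(observed, probs)
    return sum(
        multinomial_prob(table, probs)
        for table in compositions(n, k)
        if chi_square(table, probs) >= x_obs - tol  # tol is an assumption
    )

print(exact_gof_pvalue([3, 0, 0]))
```

For the table (3, 0, 0) with equal null proportions, only the three tables that put all observations in one category are as extreme as the observed one, so the sketch returns 3/27 = 1/9. Passing `probs` plays the role of specifying null proportions, loosely analogous to TESTP=.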
Other exact computations are described in sections about the individual statistics. For information about the computation of exact confidence limits and tests for the binomial proportion, see the section Binomial Proportion. For information about computation of exact confidence limits for the odds ratio, see the subsection "Exact Confidence Limits" in the section Confidence Limits for the Odds Ratio. For information about other exact computations, see the subsection "Exact Unconditional Confidence Limits" in the section Confidence Limits for the Risk Difference, the subsection "Exact Unconditional Confidence Limits" in the section Confidence Limits for the Relative Risk, and the sections Exact Symmetry Test, Exact Confidence Limits for the Common Odds Ratio, and Zelen’s Exact Test for Equal Odds Ratios.
For several tests in PROC FREQ, the test statistic is nonnegative, and large values of the test statistic indicate a departure from the null hypothesis. Such nondirectional tests include the Pearson chi-square, the likelihood ratio chi-square, the Mantel-Haenszel chi-square, Fisher’s exact test for tables larger than $2 \times 2$, McNemar’s test, the symmetry test, and the one-way chi-square goodness-of-fit test. The exact p-value for a nondirectional test is the sum of probabilities for those tables having a test statistic greater than or equal to the value of the observed test statistic.
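As a rough illustration of the nondirectional definition, the following Python sketch computes an exact Pearson chi-square p-value for a small two-way table by direct enumeration of every table that shares the observed row and column sums. Direct enumeration is the slow approach that the network algorithm is designed to avoid, so this is a teaching sketch only. The function names, the tolerance `tol`, and the use of the multiple hypergeometric probability for each table are assumptions made for the example; all margins are assumed to be nonzero.

```python
from math import factorial

def fill_row(row_sum, caps):
    """Yield every row that sums to `row_sum` with entries bounded by `caps`."""
    if len(caps) == 1:
        if row_sum <= caps[0]:
            yield (row_sum,)
        return
    for first in range(min(row_sum, caps[0]) + 1):
        for rest in fill_row(row_sum - first, caps[1:]):
            yield (first,) + rest

def tables_with_margins(row_sums, col_sums):
    """Yield every table that has the given row and column sums."""
    if len(row_sums) == 1:
        yield (tuple(col_sums),)  # the last row is forced by the column sums
        return
    for row in fill_row(row_sums[0], col_sums):
        remaining = tuple(c - x for c, x in zip(col_sums, row))
        for rest in tables_with_margins(row_sums[1:], remaining):
            yield (row,) + rest

def table_prob(table, row_sums, col_sums):
    """Multiple hypergeometric probability of a table, given its margins."""
    num = 1
    for margin in tuple(row_sums) + tuple(col_sums):
        num *= factorial(margin)
    den = factorial(sum(row_sums))
    for row in table:
        for x in row:
            den *= factorial(x)
    return num / den

def pearson(table, row_sums, col_sums):
    """Pearson chi-square statistic (margins must be nonzero)."""
    n = sum(row_sums)
    return sum(
        (x - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_sums)
        for x, c in zip(row, col_sums)
    )

def exact_pearson_pvalue(observed, tol=1e-9):
    rows = tuple(sum(r) for r in observed)
    cols = tuple(sum(c) for c in zip(*observed))
    x_obs = pearson(observed, rows, cols)
    return sum(
        table_prob(t, rows, cols)
        for t in tables_with_margins(rows, cols)
        if pearson(t, rows, cols) >= x_obs - tol
    )

print(exact_pearson_pvalue([[3, 0], [0, 3]]))
```

For the table with rows (3, 0) and (0, 3), the reference set has four tables; the two most extreme ones each have probability 1/20, so the result is approximately 0.1.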
There are other tests where it might be appropriate to test against either a one-sided or a two-sided alternative hypothesis. For example, when you test the null hypothesis that the true parameter value equals 0 ($H_0\colon \theta = 0$), the alternative of interest might be one-sided ($H_a\colon \theta < 0$, or $H_a\colon \theta > 0$) or two-sided ($H_a\colon \theta \ne 0$). Such tests include the Pearson correlation coefficient, Spearman correlation coefficient, Jonckheere-Terpstra test, Cochran-Armitage test for trend, simple kappa coefficient, and weighted kappa coefficient. For these tests, PROC FREQ displays the right-sided p-value when the observed value of the test statistic is greater than its expected value. The right-sided p-value is the sum of probabilities for those tables for which the test statistic is greater than or equal to the observed test statistic. Otherwise, when the observed test statistic is less than or equal to the expected value, PROC FREQ displays the left-sided p-value. The left-sided p-value is the sum of probabilities for those tables for which the test statistic is less than or equal to the one observed. The one-sided p-value $P_1$ can be expressed as
$$P_1 = \begin{cases} \mathrm{Prob}(\text{Test Statistic} \ge t) & \text{if } t > E_0(T) \\ \mathrm{Prob}(\text{Test Statistic} \le t) & \text{if } t \le E_0(T) \end{cases}$$
where $t$ is the observed value of the test statistic and $E_0(T)$ is the expected value of the test statistic under the null hypothesis. PROC FREQ computes the two-sided p-value as the sum of the one-sided p-value and the corresponding area in the opposite tail of the distribution of the statistic, equidistant from the expected value. The two-sided p-value $P_2$ can be expressed as
$$P_2 = \mathrm{Prob}\left( |\text{Test Statistic} - E_0(T)| \ge |t - E_0(T)| \right)$$
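To make the one-sided and two-sided definitions concrete, here is a small Python sketch that applies them to a discrete null distribution supplied as a dictionary. The distribution `dist` is a made-up example, not output from any PROC FREQ test, and the function names are hypothetical. The one-sided value uses the right tail when the observed statistic exceeds its null expectation and the left tail otherwise; the two-sided value sums the probabilities of all values at least as far from the expectation as the observed one. Exact fractions are used so that the equality comparisons are not affected by rounding.

```python
from fractions import Fraction

def expected_value(dist):
    """Null expectation of the statistic; `dist` maps value -> probability."""
    return sum(v * p for v, p in dist.items())

def one_sided_pvalue(dist, t):
    """Right-sided tail if t exceeds the expectation, else left-sided tail."""
    if t > expected_value(dist):
        return sum(p for v, p in dist.items() if v >= t)
    return sum(p for v, p in dist.items() if v <= t)

def two_sided_pvalue(dist, t):
    """Probability of values at least as far from the expectation as t."""
    e0 = expected_value(dist)
    return sum(p for v, p in dist.items() if abs(v - e0) >= abs(t - e0))

# Hypothetical null distribution with expectation 17/10.
dist = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(4, 10),
    3: Fraction(2, 10),
}

print(one_sided_pvalue(dist, 3))  # right-sided, because 3 > 17/10
print(two_sided_pvalue(dist, 3))  # adds the opposite tail at distance >= 13/10
```

With this distribution and an observed value of 3, the one-sided p-value is 1/5 and the two-sided p-value is 3/10, because the value 0 lies farther from the expectation than 3 does.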
If you specify the POINT option in the EXACT statement, PROC FREQ provides exact point probabilities for the exact tests. The exact point probability is the exact probability that the test statistic equals the observed value.
If you specify the MIDP option in the EXACT statement, PROC FREQ provides exact mid p-values. The exact mid p-value is defined as the exact p-value minus half the exact point probability, which equals the average of $\mathrm{Prob}(\text{Test Statistic} \ge t)$ and $\mathrm{Prob}(\text{Test Statistic} > t)$ for a right-sided test. The exact mid p-value is smaller and less conservative than the unadjusted exact p-value. For more information, see Agresti (2013, section 1.1.4) and Hirji (2006, sections 2.5 and 2.11.1).
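The relationship between the exact p-value, the point probability, and the mid p-value for a right-sided test can be checked with a short Python sketch. The null distribution `dist` is a made-up example and the function name `right_sided_summary` is hypothetical; the sketch only restates the definitions above in code.

```python
from fractions import Fraction

def right_sided_summary(dist, t):
    """Return (exact p-value, point probability, mid p-value) for a right-sided test.

    `dist` maps each possible value of the test statistic to its null probability.
    """
    p_ge = sum(p for v, p in dist.items() if v >= t)  # exact right-sided p-value
    point = dist.get(t, Fraction(0))                  # Prob(statistic == t)
    mid_p = p_ge - point / 2                          # p-value minus half the point probability
    return p_ge, point, mid_p

# Hypothetical null distribution of a test statistic.
dist = {
    0: Fraction(1, 10),
    1: Fraction(3, 10),
    2: Fraction(4, 10),
    3: Fraction(2, 10),
}

p_value, point, mid_p = right_sided_summary(dist, 2)
p_gt = p_value - point  # Prob(statistic > t)
print(p_value, point, mid_p, (p_value + p_gt) / 2)
```

For an observed value of 2, the exact p-value is 3/5, the point probability is 2/5, and the mid p-value is 2/5, which matches the average of 3/5 and 1/5.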
PROC FREQ uses relatively fast and efficient algorithms for exact computations. These algorithms, together with improvements in computing power, make it feasible to perform exact computations for data where previously only asymptotic methods could be applied. Nevertheless, depending on your available computing resources, exact computations for some very large problems might require a prohibitive amount of time and memory. For such large problems, consider whether exact methods are really needed or whether asymptotic methods might give results that are very close to the exact results while requiring much less computing time and memory. When asymptotic methods might not be sufficient for such large problems, consider using Monte Carlo estimation of exact p-values, which is described in the section Monte Carlo Estimation.
There is no formula that can predict in advance how much time and memory are needed to compute an exact p-value for a specific data set and test. The time and memory requirements depend on several factors, which include the following: the total number of observations, the number of rows and columns in the table, the particular arrangement of the observations into table cells, and the test to be performed. Generally, larger problems (in terms of total sample size, number of rows, and number of columns) tend to require more time and memory. For a fixed total sample size, time and memory requirements tend to increase as the number of rows and number of columns increase because of the corresponding increase in the number of reference set tables. For a fixed sample size, time and memory requirements also tend to increase as the marginal row and column totals become more homogeneous. For more information, see Agresti, Mehta, and Patel (1990) and Gail and Mantel (1977).
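One way to get a feel for these effects is to count the tables in the reference set for a few small sets of margins. The Python sketch below counts, by brute force, the tables that share given row and column sums. It is only an illustration: the function names are hypothetical, the counting is exponential in the worst case, and the size of the reference set is just one of the factors that influence the actual time and memory that an exact computation needs.

```python
def fill_row(row_sum, caps):
    """Yield every row that sums to `row_sum` with entries bounded by `caps`."""
    if len(caps) == 1:
        if row_sum <= caps[0]:
            yield (row_sum,)
        return
    for first in range(min(row_sum, caps[0]) + 1):
        for rest in fill_row(row_sum - first, caps[1:]):
            yield (first,) + rest

def count_tables(row_sums, col_sums):
    """Count the tables that have the given row and column sums."""
    if len(row_sums) == 1:
        return 1  # the last row is forced by the remaining column sums
    total = 0
    for row in fill_row(row_sums[0], col_sums):
        remaining = tuple(c - x for c, x in zip(col_sums, row))
        total += count_tables(row_sums[1:], remaining)
    return total

# All three examples have a total sample size of 6.
print(count_tables((5, 1), (3, 3)))     # unbalanced row sums
print(count_tables((3, 3), (3, 3)))     # homogeneous margins
print(count_tables((3, 3), (2, 2, 2)))  # one more column
```

With a total of 6 observations, the unbalanced $2 \times 2$ margins admit 2 tables, the homogeneous $2 \times 2$ margins admit 4, and the $2 \times 3$ layout admits 7, in line with the tendencies described above.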
While PROC FREQ is computing an exact p-value, you can terminate the computation by pressing the system interrupt key sequence and choosing to stop computations. For more information, see the SAS Companion for your system. After you terminate an exact computation, PROC FREQ completes all other remaining tasks. The procedure reports missing values for any exact p-values that were not computed before termination.
To limit the amount of time that PROC FREQ uses for exact computations, you can specify the MAXTIME= option in the EXACT statement. This option sets the maximum amount of clock time (in seconds) that PROC FREQ can use to compute an exact p-value. If PROC FREQ does not finish an exact computation in the time that you specify, the procedure terminates the computation and completes the remaining tasks.
When you specify the MC option in the EXACT statement, PROC FREQ computes Monte Carlo estimates of exact p-values. Monte Carlo estimation can be useful for large problems where exact computations require a substantial amount of time and memory but asymptotic approximations might not be sufficient. Monte Carlo estimates are available for all exact tests that PROC FREQ provides except the binomial proportion test and those tests that apply only to $2 \times 2$ or stratified $2 \times 2$ tables.
To describe the precision of a Monte Carlo estimate, PROC FREQ provides the asymptotic standard error and $100(1-\alpha)$% confidence limits. You can specify the confidence level $\alpha$ in the ALPHA= option in the EXACT statement; by default, ALPHA=0.01, which produces 99% confidence limits.
You can specify the number of Monte Carlo samples by using the N=n option in the EXACT statement. By default, PROC FREQ uses 10,000 samples to compute a Monte Carlo estimate. To improve the precision of the Monte Carlo estimates, you can specify a larger value of n; this increases the computation time because more samples are generated. To reduce the computation time, you can specify a smaller value of n.
PROC FREQ computes a Monte Carlo estimate of an exact p-value by generating a random sample of tables from the reference set for the exact test. For most exact tests that PROC FREQ provides, the reference set includes tables that have the same total sample size, row sums, and column sums as the observed table. (For the exact symmetry test, the reference set includes tables that have the same total sample size as the observed table and the same frequency sums of the off-diagonal table cell pairs.)
PROC FREQ generates a random sample of tables from the reference set by using the algorithm of Agresti, Wackerly, and Boyett (1979), which generates tables in proportion to their hypergeometric probabilities conditional on the marginal frequencies. For each sample table, PROC FREQ computes the value of the test statistic and compares it to the value of the test statistic for the observed table. To estimate a right-sided p-value, PROC FREQ counts all sample tables for which the test statistic is greater than or equal to the observed test statistic. The estimate of the p-value is the number of these tables divided by the total number of sample tables, which can be expressed as
$$\hat{P}_{MC} = m / n$$
where $m$ is the number of samples for which the test statistic is greater than or equal to $t$, $n$ is the total number of samples, and $t$ is the observed value of the test statistic.
PROC FREQ computes estimates of left-sided and two-sided exact p-values similarly. For left-sided exact p-values, PROC FREQ evaluates whether the sample test statistics are less than or equal to the observed test statistic. For two-sided exact p-values, PROC FREQ compares sample test statistics to the observed test statistic by using the definition of the two-sided p-value ($P_2$) for the test. For more information, see the section Definition of p-Values and descriptions of the individual tests.
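The Monte Carlo idea can be sketched in Python for the Pearson chi-square statistic. Note the substitution: instead of the algorithm of Agresti, Wackerly, and Boyett (1979), this sketch draws each table by randomly permuting column labels against fixed row labels, which is a simple way to sample tables with their hypergeometric probabilities conditional on the margins. The function names, the `seed` argument, and the tolerance `tol` are assumptions for the example, and all margins are assumed to be nonzero.

```python
import random

def sample_table(row_sums, col_sums, rng):
    """Draw one table with the given margins by shuffling column labels."""
    col_labels = [j for j, c in enumerate(col_sums) for _ in range(c)]
    rng.shuffle(col_labels)
    table = [[0] * len(col_sums) for _ in row_sums]
    pos = 0
    for i, r in enumerate(row_sums):
        for j in col_labels[pos:pos + r]:  # the next r observations fall in row i
            table[i][j] += 1
        pos += r
    return table

def pearson(table, row_sums, col_sums):
    """Pearson chi-square statistic (margins must be nonzero)."""
    n = sum(row_sums)
    return sum(
        (x - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_sums)
        for x, c in zip(row, col_sums)
    )

def monte_carlo_pvalue(observed, n_samples=10000, seed=0, tol=1e-9):
    """Return (estimate, m): the fraction and count of samples at least as extreme."""
    rows = [sum(r) for r in observed]
    cols = [sum(c) for c in zip(*observed)]
    x_obs = pearson(observed, rows, cols)
    rng = random.Random(seed)
    m = sum(
        1
        for _ in range(n_samples)
        if pearson(sample_table(rows, cols, rng), rows, cols) >= x_obs - tol
    )
    return m / n_samples, m

estimate, m = monte_carlo_pvalue([[3, 0], [0, 3]])
print(estimate, m)
```

For this small table the exact p-value is 0.1, so the estimate should land close to that value; the default of 10,000 samples mirrors the default number of samples mentioned above.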
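The variable $m$ has a binomial distribution with $n$ trials and success probability $p$. The asymptotic standard error of the Monte Carlo estimate $\hat{P}_{MC}$ is
$$\mathrm{se}(\hat{P}_{MC}) = \sqrt{\hat{P}_{MC}\,(1 - \hat{P}_{MC}) / (n - 1)}$$
PROC FREQ constructs asymptotic confidence limits for the exact p-value as
$$\hat{P}_{MC} \pm \left( z_{\alpha/2} \times \mathrm{se}(\hat{P}_{MC}) \right)$$
where $z_{\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the standard normal distribution and the confidence level $\alpha$ is determined by the ALPHA= option in the EXACT statement.
When the Monte Carlo estimate $\hat{P}_{MC}$ is 0, PROC FREQ computes confidence limits for the p-value as
$$\left( 0,\; 1 - \alpha^{1/n} \right)$$
When the Monte Carlo estimate $\hat{P}_{MC}$ is 1, PROC FREQ computes confidence limits for the p-value as
$$\left( \alpha^{1/n},\; 1 \right)$$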
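The precision calculations can be sketched as a small Python function. The sketch assumes the following forms: a standard error of sqrt(p(1 - p) / (n - 1)) for the estimate p = m/n, normal-approximation limits p ± z × se with z the 100(1 - α/2)th percentile of the standard normal distribution, and the limits (0, 1 - α^(1/n)) and (α^(1/n), 1) when the estimate is 0 or 1. The function name `mc_confidence_limits` is hypothetical, and the default `alpha=0.01` mirrors the default ALPHA= value described earlier.

```python
from math import sqrt
from statistics import NormalDist

def mc_confidence_limits(m, n, alpha=0.01):
    """Return (estimate, standard error, (lower, upper)) for a Monte Carlo p-value.

    m: number of sampled tables at least as extreme as the observed table
    n: total number of sampled tables (must be at least 2)
    """
    p_hat = m / n
    se = sqrt(p_hat * (1 - p_hat) / (n - 1))
    if m == 0:
        # Special case: the normal approximation would collapse to a point.
        return p_hat, se, (0.0, 1 - alpha ** (1 / n))
    if m == n:
        return p_hat, se, (alpha ** (1 / n), 1.0)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

print(mc_confidence_limits(1000, 10000))  # estimate 0.1 with 99% limits
print(mc_confidence_limits(0, 10000))     # estimate 0 uses the special case
```

With 1,000 extreme tables out of 10,000 samples, the standard error is about 0.003, which gives 99% limits of roughly 0.092 to 0.108. Because the standard error shrinks with the square root of the number of samples, quadrupling the number of samples roughly halves the width of the limits.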