The FREQ Procedure

Tests and Measures of Agreement

When you specify the AGREE option in the TABLES statement, PROC FREQ computes tests and measures of agreement for square tables (for which the number of rows equals the number of columns). By default, these statistics include McNemar’s test for $2 \times 2$ tables, Bowker’s symmetry test, the simple kappa coefficient, and the weighted kappa coefficient. For multiple strata ($n$-way tables, where $n > 2$), the AGREE option provides the overall simple and weighted kappa coefficients, in addition to tests for equal kappas (simple and weighted) among strata. For multiple strata of $2 \times 2$ tables, the AGREE option provides Cochran’s $Q$ test.

Optionally, PROC FREQ provides kappa tests and other agreement statistics. In addition to the asymptotic tests described in this section, PROC FREQ provides exact p-values for McNemar’s test, the simple kappa coefficient test, and the weighted kappa coefficient test. You can request these exact tests by specifying the corresponding options in the EXACT statement. For more information, see the section Exact Statistics.

The following sections provide the formulas that PROC FREQ uses to compute agreement statistics. For information about the use and interpretation of these statistics, see Agresti (2002, 2007); Fleiss, Levin, and Paik (2003); and the other references cited for each statistic.

McNemar’s Test

PROC FREQ computes McNemar’s test (McNemar 1947) for $2 \times 2$ tables when you specify the AGREE option. This test is appropriate when you are analyzing data from matched pairs of subjects with a dichotomous (yes-no) response. By default, the null hypothesis for McNemar’s test is marginal homogeneity, which can be expressed as $p_{1\cdot} = p_{\cdot 1}$; this is equivalent to a discordant proportion ratio ($p_{12} / p_{21}$) of 1. The corresponding test statistic is computed as

\[ Q_M = (n_{12} - n_{21})^2 / (n_{12} + n_{21}) \]

Under the null hypothesis, $Q_M$ has an asymptotic chi-square distribution with 1 degree of freedom.

Optionally, you can specify the null ratio of discordant proportions ($p_{12} / p_{21}$) by using the AGREE(MNULLRATIO=) option. When the null ratio is $r$, McNemar’s test is computed as

\[ Q_M(r) = (n_{12} - e_{12})^2 / e_{12} + (n_{21} - e_{21})^2 / e_{21} \]

where $e_{12} = D / (1 + 1/r)$, $e_{21} = D / (1 + r)$, and $D$ is the number of discordant pairs, $(n_{12} + n_{21})$. Under the null hypothesis, $Q_M(r)$ has an asymptotic chi-square distribution with 1 degree of freedom.

PROC FREQ also computes an exact p-value for McNemar’s test when you specify the MCNEM option in the EXACT statement.
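The asymptotic McNemar computation described above can be sketched in a few lines of Python. This is only an illustration of the formulas, not PROC FREQ code; the function name `mcnemar` and its `null_ratio` argument are hypothetical, and the p-value uses the identity that the upper tail of a 1-degree-of-freedom chi-square at $q$ equals `erfc(sqrt(q / 2))`.

```python
import math

def mcnemar(n12, n21, null_ratio=1.0):
    """Asymptotic McNemar statistic Q_M(r) and its 1-df chi-square p-value.

    Hypothetical helper: n12 and n21 are the discordant cell counts and
    null_ratio is the null value r of p12/p21 (r = 1 is marginal homogeneity).
    """
    d = n12 + n21                           # D, the number of discordant pairs
    if d == 0:
        raise ValueError("the statistic is undefined without discordant pairs")
    e12 = d / (1.0 + 1.0 / null_ratio)      # expected n12 under the null ratio
    e21 = d / (1.0 + null_ratio)            # expected n21 under the null ratio
    q = (n12 - e12) ** 2 / e12 + (n21 - e21) ** 2 / e21
    p_value = math.erfc(math.sqrt(q / 2.0)) # chi-square upper tail, 1 df
    return q, p_value

q, p_value = mcnemar(15, 5)                 # default null ratio r = 1
print(q, p_value)
```

With the default `null_ratio=1.0`, both expected counts equal $D/2$ and the statistic reduces to $(n_{12} - n_{21})^2 / (n_{12} + n_{21})$.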

Bowker’s Symmetry Test

The null hypothesis for Bowker’s symmetry test (Bowker 1948) is symmetric table-cell proportions, which can be expressed as $p_{ij} = p_{ji}$ for all off-diagonal pairs of table cells. For $2 \times 2$ tables, Bowker’s test is identical to McNemar’s test; therefore, PROC FREQ provides Bowker’s test only for square tables that are larger than $2 \times 2$.

Bowker’s symmetry test is computed as

\[ Q_B = \sum\sum_{i<j} (n_{ij} - n_{ji})^2 / (n_{ij} + n_{ji}) \]

For large samples, $Q_B$ has an asymptotic chi-square distribution with $R(R-1)/2$ degrees of freedom under the null hypothesis of symmetry, where $R$ is the dimension of the square, two-way table.

By default, the number of degrees of freedom for this test ($R(R-1)/2$) is the number of off-diagonal table-cell comparisons. You can specify the number of degrees of freedom in the AGREE(DFSYM=) option. Alternatively, you can specify the AGREE(DFSYM=ADJUST) option, which reduces the degrees of freedom by the number of off-diagonal table-cell pairs that have a total frequency of 0. For more information, see Hoenig, Morgan, and Brown (1995).
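As a rough illustration of the statistic and the two degrees-of-freedom conventions, the following Python sketch computes Bowker’s statistic from a square table of counts. The function name `bowker` and the `adjust_df` flag are hypothetical. Skipping pairs whose total frequency is 0 (so that they add nothing to the sum) is an assumption of this sketch, not a statement about how PROC FREQ handles them internally.

```python
def bowker(table, adjust_df=False):
    """Bowker symmetry statistic Q_B and degrees of freedom for a square table.

    Hypothetical helper: adjust_df=True mimics the idea behind DFSYM=ADJUST by
    dropping one degree of freedom per off-diagonal pair with total count 0.
    """
    r = len(table)
    q = 0.0
    df = r * (r - 1) // 2                   # default: number of off-diagonal pairs
    for i in range(r):
        for j in range(i + 1, r):
            total = table[i][j] + table[j][i]
            if total == 0:
                # Assumption: an empty pair contributes nothing to Q_B.
                if adjust_df:
                    df -= 1
                continue
            q += (table[i][j] - table[j][i]) ** 2 / total
    return q, df

ratings = [[10, 4, 0],
           [2, 12, 3],
           [0, 1, 8]]
print(bowker(ratings))                      # default degrees of freedom
print(bowker(ratings, adjust_df=True))      # adjusted for the empty (1, 3) pair
```

The statistic is the same in both calls; only the reference chi-square distribution changes.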

Exact Symmetry Test

When you specify the SYMMETRY option in the EXACT statement, PROC FREQ provides an exact symmetry test by using the method of Krauth (1973). This exact test is computed by conditioning on the observed frequency sums of the complementary off-diagonal table-cell pairs ($n_{ij} + n_{ji}$). PROC FREQ evaluates the symmetry test statistic for all tables in the reference set, which includes all possible tables in which the frequency sums of the off-diagonal table-cell pairs match the corresponding frequency sums in the observed table. The exact p-value is then computed as the sum of the table probabilities for those tables for which the symmetry test statistic is greater than or equal to the observed test statistic. The table probabilities are computed as products of $R(R-1)/2$ binomial probabilities (which correspond to the off-diagonal table-cell pairs in tables of dimension $R$) by using the binomial proportion 0.5 under the null hypothesis of symmetry. For more information, see the section Exact Statistics.

Alternatively, you can request a Monte Carlo estimate of the exact p-value by specifying the SYMMETRY option together with the MC computation-option in the EXACT statement. The Monte Carlo computation for the exact symmetry test is conditional on the same reference set that the exact test uses (tables in which the frequency sums of the off-diagonal table-cell pairs match the corresponding sums in the observed table). For more information, see the section Monte Carlo Estimation.
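The conditional reference set for the exact test is small enough to enumerate directly for modest tables. The sketch below is a brute-force illustration of that idea, not PROC FREQ’s algorithm: each off-diagonal pair with total $s$ is treated as a Binomial($s$, 0.5) count, every combination is enumerated, and the probabilities of tables whose statistic is at least the observed one are summed. The function name `exact_symmetry_pvalue` and the small comparison tolerance are assumptions of this sketch.

```python
import itertools
import math

def exact_symmetry_pvalue(table):
    """Exact symmetry p-value by full enumeration of the conditional reference set."""
    r = len(table)
    pairs = [(table[i][j], table[j][i]) for i in range(r) for j in range(i + 1, r)]
    # Pairs with total 0 are fixed at (0, 0) in every reference table; drop them.
    pairs = [(a, b) for a, b in pairs if a + b > 0]
    totals = [a + b for a, b in pairs]

    def statistic(upper_counts):
        # With n_ij = x and n_ji = s - x, the term (n_ij - n_ji)^2 / s is (2x - s)^2 / s.
        return sum((2 * x - s) ** 2 / s for x, s in zip(upper_counts, totals))

    observed = statistic([a for a, _ in pairs])
    p_value = 0.0
    for counts in itertools.product(*(range(s + 1) for s in totals)):
        if statistic(counts) >= observed - 1e-12:
            # Product of Binomial(s, 0.5) probabilities, one per off-diagonal pair.
            p_value += math.prod(math.comb(s, x) * 0.5 ** s
                                 for x, s in zip(counts, totals))
    return p_value

print(exact_symmetry_pvalue([[10, 4, 0], [2, 12, 3], [0, 1, 8]]))
```

The number of enumerated tables grows as the product of the pair totals plus one, which is why a Monte Carlo estimate becomes attractive for larger tables.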

Simple Kappa Coefficient

The simple kappa coefficient (Cohen 1960) is a measure of interrater agreement. PROC FREQ computes the simple kappa coefficient as

\[ \hat{\kappa} = (P_o - P_e) / (1 - P_e) \]

where $P_o = \sum_i p_{ii}$ and $P_e = \sum_i p_{i\cdot} \, p_{\cdot i}$. The component $P_o$ is the proportion of observed agreement, and the component $P_e$ represents the proportion of chance-expected agreement.

If the two response variables are viewed as two independent ratings of the n subjects, the kappa coefficient is +1 when there is complete agreement of the raters. When the observed agreement exceeds the chance-expected agreement, the kappa coefficient is positive, and its magnitude reflects the strength of agreement. When the observed agreement is less than the chance-expected agreement, the kappa coefficient is negative. The minimum value of kappa is between –1 and 0, depending on the marginal proportions of the table.

PROC FREQ computes the asymptotic variance of the simple kappa coefficient as

\[ \mathrm{Var}(\hat{\kappa}) = \frac{A + B - C}{(1 - P_e)^2 \, n} \]

where

\[
\begin{aligned}
A &= \sum_i p_{ii} \left( 1 - (p_{i\cdot} + p_{\cdot i})(1 - \hat{\kappa}) \right)^2 \\
B &= (1 - \hat{\kappa})^2 \sum\sum_{i \ne j} p_{ij} \, (p_{\cdot i} + p_{j\cdot})^2 \\
C &= \left( \hat{\kappa} - P_e (1 - \hat{\kappa}) \right)^2
\end{aligned}
\]

For more information, see Fleiss, Cohen, and Everitt (1969).

Confidence limits for the simple kappa coefficient are computed as

\[ \hat{\kappa} \pm \left( z_{\alpha/2} \times \sqrt{\mathrm{Var}(\hat{\kappa})} \right) \]

where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The value of $\alpha$ is determined by the ALPHA= option; by default ALPHA=0.05, which produces 95% confidence limits.
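The point estimate, asymptotic variance, and confidence limits for simple kappa can be traced through a short Python sketch. The function name `simple_kappa` and the example counts are hypothetical; the code only transcribes the formulas in this section and is not PROC FREQ’s implementation.

```python
from statistics import NormalDist

def simple_kappa(table, alpha=0.05):
    """Return (kappa, asymptotic variance, (lower, upper)) for a square table."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    p = [[c / n for c in counts] for counts in table]           # cell proportions
    row_p = [sum(p[i]) for i in range(r)]                        # p_i.
    col_p = [sum(p[i][j] for i in range(r)) for j in range(r)]   # p_.j
    po = sum(p[i][i] for i in range(r))                          # observed agreement
    pe = sum(row_p[i] * col_p[i] for i in range(r))              # chance agreement
    kappa = (po - pe) / (1 - pe)
    a = sum(p[i][i] * (1 - (row_p[i] + col_p[i]) * (1 - kappa)) ** 2
            for i in range(r))
    b = (1 - kappa) ** 2 * sum(p[i][j] * (col_p[i] + row_p[j]) ** 2
                               for i in range(r) for j in range(r) if i != j)
    c = (kappa - pe * (1 - kappa)) ** 2
    var = (a + b - c) / ((1 - pe) ** 2 * n)
    z = NormalDist().inv_cdf(1 - alpha / 2)                      # z_{alpha/2}
    half_width = z * var ** 0.5
    return kappa, var, (kappa - half_width, kappa + half_width)

kappa, var, limits = simple_kappa([[20, 5], [10, 15]])
print(kappa, var, limits)
```

For the example table, $P_o = 0.7$ and $P_e = 0.5$, so the estimate is $(0.7 - 0.5) / (1 - 0.5) = 0.4$.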

PROC FREQ provides an asymptotic test for the simple kappa coefficient. By default, the null hypothesis value of kappa is 0; alternatively, you can specify a nonzero null value of kappa (by using the AGREE(NULLKAPPA=) option in the TABLES statement). When the null value of kappa is nonzero, PROC FREQ computes the test statistic as

\[ z = (\hat{\kappa} - \kappa_0) / \sqrt{\mathrm{Var}(\hat{\kappa})} \]

where $\kappa_0$ is the null value that you specify and $\mathrm{Var}(\hat{\kappa})$ is the variance of the kappa coefficient.

When the null value of kappa is 0, PROC FREQ computes the test statistic as

\[ z = \hat{\kappa} / \sqrt{\mathrm{Var}_0(\hat{\kappa})} \]

where $\mathrm{Var}_0(\hat{\kappa})$ is the variance of the kappa coefficient under the null hypothesis (that kappa is 0) and is computed as

\[ \mathrm{Var}_0(\hat{\kappa}) = \frac{P_e + P_e^2 - \sum_i p_{i\cdot} \, p_{\cdot i} \, (p_{i\cdot} + p_{\cdot i})}{(1 - P_e)^2 \, n} \]

This test statistic has an asymptotic standard normal distribution under the null hypothesis. For more information, see Fleiss, Levin, and Paik (2003).
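A minimal Python sketch of the test for a null kappa of 0 follows. It assumes a square table of counts; the function name `kappa_null_test` is hypothetical, and the two-sided p-value is one common way to summarize the standard normal statistic rather than a claim about which p-values PROC FREQ displays.

```python
import math
from statistics import NormalDist

def kappa_null_test(table):
    """Return (z, null variance, two-sided p-value) for H0: kappa = 0."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    row_p = [sum(table[i]) / n for i in range(r)]                      # p_i.
    col_p = [sum(table[i][j] for i in range(r)) / n for j in range(r)] # p_.j
    po = sum(table[i][i] for i in range(r)) / n
    pe = sum(row_p[i] * col_p[i] for i in range(r))
    kappa = (po - pe) / (1 - pe)
    # Variance of kappa-hat under the null hypothesis that kappa is 0.
    correction = sum(row_p[i] * col_p[i] * (row_p[i] + col_p[i]) for i in range(r))
    var0 = (pe + pe ** 2 - correction) / ((1 - pe) ** 2 * n)
    z = kappa / math.sqrt(var0)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, var0, p_two_sided

print(kappa_null_test([[20, 5], [10, 15]]))
```

Note that the null variance differs from the variance used for confidence limits, so the test for a zero null value and the confidence interval need not agree exactly near the boundary.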

PROC FREQ also provides an exact test for the simple kappa coefficient. You can request the exact test by specifying the KAPPA or AGREE option in the EXACT statement. For more information, see the section Exact Statistics.

Kappa Details

When you specify the AGREE(KAPPADETAILS) option, PROC FREQ displays the "Kappa Details" table, which includes the observed agreement $P_o$, chance-expected agreement $P_e$, maximum kappa, and $B_n$ measure.

The maximum kappa, which is the maximum possible value of the kappa coefficient given the marginal proportions of the two-way table, is computed as

\[ \max(\kappa) = (\max(P_o) - P_e) / (1 - P_e) \]

where

\[ \max(P_o) = \left( \sum_i \min(n_{i\cdot}, n_{\cdot i}) \right) / n \]

The $B_n$ measure (Bangdiwala 1988; Bangdiwala et al. 2008) is computed as

\[ B_n = \left( \sum_i n_{ii}^2 \right) / \left( \sum_i n_{i\cdot} \, n_{\cdot i} \right) \]

For $2 \times 2$ tables, the "Kappa Details" table also includes the prevalence index and the bias index. The prevalence index is the absolute difference between the agreement proportions, $|p_{11} - p_{22}|$. The bias index is the absolute difference between the disagreement proportions, $|p_{12} - p_{21}|$. For more information, see Sim and Wright (2005) and Byrt, Bishop, and Carlin (1993).
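The quantities in the "Kappa Details" table can be illustrated with the following Python sketch. The function name `kappa_details` and the dictionary keys are hypothetical. The $B_n$ denominator here is the usual Bangdiwala form, a single sum of $n_{i\cdot} n_{\cdot i}$ over the diagonal index $i$; treat that reading as an assumption of this sketch.

```python
def kappa_details(table):
    """Observed and chance agreement, maximum kappa, B_n, and 2x2 indexes."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    row_n = [sum(table[i]) for i in range(r)]                        # n_i.
    col_n = [sum(table[i][j] for i in range(r)) for j in range(r)]   # n_.j
    po = sum(table[i][i] for i in range(r)) / n
    pe = sum(row_n[i] * col_n[i] for i in range(r)) / n ** 2
    max_po = sum(min(row_n[i], col_n[i]) for i in range(r)) / n
    details = {
        "po": po,
        "pe": pe,
        "max_kappa": (max_po - pe) / (1 - pe),
        # Assumed Bangdiwala form: single sum over i in the denominator.
        "b_n": sum(table[i][i] ** 2 for i in range(r))
               / sum(row_n[i] * col_n[i] for i in range(r)),
    }
    if r == 2:
        details["prevalence_index"] = abs(table[0][0] - table[1][1]) / n
        details["bias_index"] = abs(table[0][1] - table[1][0]) / n
    return details

print(kappa_details([[20, 5], [10, 15]]))
```

Because the marginal totals cap the diagonal counts, the maximum kappa is below 1 whenever the row and column margins differ.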

Weighted Kappa Coefficient

The weighted kappa coefficient is a generalization of the simple kappa coefficient that uses weights to quantify the relative differences between categories. For $2 \times 2$ tables, the weighted kappa coefficient is equivalent to the simple kappa coefficient; therefore, PROC FREQ displays the weighted kappa coefficient only for tables larger than $2 \times 2$. PROC FREQ computes the kappa weights from the column scores, by using either Cicchetti-Allison weights or Fleiss-Cohen weights, both of which are described in the section Kappa Weights. The kappa weights $w_{ij}$ are constructed so that $0 \le w_{ij} < 1$ for all $i \ne j$, $w_{ii} = 1$ for all $i$, and $w_{ij} = w_{ji}$. The weighted kappa coefficient is computed as

\[ \hat{\kappa}_w = (P_{o(w)} - P_{e(w)}) / (1 - P_{e(w)}) \]

where

\[
\begin{aligned}
P_{o(w)} &= \sum_i \sum_j w_{ij} \, p_{ij} \\
P_{e(w)} &= \sum_i \sum_j w_{ij} \, p_{i\cdot} \, p_{\cdot j}
\end{aligned}
\]

The component $P_{o(w)}$ is the proportion of observed (weighted) agreement, and the component $P_{e(w)}$ represents the proportion of chance-expected (weighted) agreement. When you specify the AGREE(WTKAPDETAILS) option, PROC FREQ displays these components in the "Weighted Kappa Details" table.

PROC FREQ computes the asymptotic variance of the weighted kappa coefficient as

\[ \mathrm{Var}(\hat{\kappa}_w) = \frac{\sum_i \sum_j p_{ij} \left( w_{ij} - (\bar{w}_{i\cdot} + \bar{w}_{\cdot j})(1 - \hat{\kappa}_w) \right)^2 - \left( \hat{\kappa}_w - P_{e(w)} (1 - \hat{\kappa}_w) \right)^2}{(1 - P_{e(w)})^2 \, n} \]

where

\[
\begin{aligned}
\bar{w}_{i\cdot} &= \sum_j p_{\cdot j} \, w_{ij} \\
\bar{w}_{\cdot j} &= \sum_i p_{i\cdot} \, w_{ij}
\end{aligned}
\]

For more information, see Fleiss, Cohen, and Everitt (1969).

Confidence limits for the weighted kappa coefficient are computed as

\[ \hat{\kappa}_w \pm \left( z_{\alpha/2} \times \sqrt{\mathrm{Var}(\hat{\kappa}_w)} \right) \]

where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The value of $\alpha$ is determined by the ALPHA= option; by default ALPHA=0.05, which produces 95% confidence limits.
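The weighted kappa estimate, its asymptotic variance, and the confidence limits can be sketched in Python as follows. The function name `weighted_kappa` is hypothetical, and the weight matrix is passed in directly rather than derived from column scores. The example weights are Cicchetti-Allison weights for three equally spaced scores, chosen only for illustration.

```python
from statistics import NormalDist

def weighted_kappa(table, w, alpha=0.05):
    """Return (weighted kappa, asymptotic variance, (lower, upper))."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    p = [[c / n for c in counts] for counts in table]
    row_p = [sum(p[i]) for i in range(r)]                        # p_i.
    col_p = [sum(p[i][j] for i in range(r)) for j in range(r)]   # p_.j
    cells = [(i, j) for i in range(r) for j in range(r)]
    po_w = sum(w[i][j] * p[i][j] for i, j in cells)              # weighted observed
    pe_w = sum(w[i][j] * row_p[i] * col_p[j] for i, j in cells)  # weighted chance
    kw = (po_w - pe_w) / (1 - pe_w)
    wbar_row = [sum(col_p[j] * w[i][j] for j in range(r)) for i in range(r)]
    wbar_col = [sum(row_p[i] * w[i][j] for i in range(r)) for j in range(r)]
    first = sum(p[i][j] * (w[i][j] - (wbar_row[i] + wbar_col[j]) * (1 - kw)) ** 2
                for i, j in cells)
    var = (first - (kw - pe_w * (1 - kw)) ** 2) / ((1 - pe_w) ** 2 * n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * var ** 0.5
    return kw, var, (kw - half_width, kw + half_width)

weights = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.5],
           [0.0, 0.5, 1.0]]
print(weighted_kappa([[10, 4, 0], [2, 12, 3], [0, 1, 8]], weights))
```

With identity weights (1 on the diagonal and 0 elsewhere), both the estimate and the variance reduce to the simple kappa formulas, which is a convenient check on an implementation.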

PROC FREQ provides an asymptotic test for the weighted kappa coefficient. By default, the null hypothesis value of weighted kappa is 0; alternatively, you can specify a nonzero null value of weighted kappa (by using the AGREE(NULLWTKAPPA=) option in the TABLES statement). When the null value of weighted kappa is nonzero, PROC FREQ computes the test statistic as

\[ z = (\hat{\kappa}_w - \kappa_{w(0)}) / \sqrt{\mathrm{Var}(\hat{\kappa}_w)} \]

where $\kappa_{w(0)}$ is the null value that you specify and $\mathrm{Var}(\hat{\kappa}_w)$ is the variance of the weighted kappa coefficient.

When the null value of weighted kappa is 0, PROC FREQ computes the test statistic as

\[ z = \hat{\kappa}_w / \sqrt{\mathrm{Var}_0(\hat{\kappa}_w)} \]

where $\mathrm{Var}_0(\hat{\kappa}_w)$ is the variance of the weighted kappa coefficient under the null hypothesis (that weighted kappa is 0) and is computed as

\[ \mathrm{Var}_0(\hat{\kappa}_w) = \frac{\sum_i \sum_j p_{i\cdot} \, p_{\cdot j} \left( w_{ij} - (\bar{w}_{i\cdot} + \bar{w}_{\cdot j}) \right)^2 - P_{e(w)}^2}{(1 - P_{e(w)})^2 \, n} \]

This test statistic has an asymptotic standard normal distribution under the null hypothesis. For more information, see Fleiss, Levin, and Paik (2003).
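The test for a null weighted kappa of 0 can be sketched the same way. The function name `weighted_kappa_null_test` is hypothetical, the weight matrix is supplied by the caller, and the two-sided p-value is included only as one way to read the standard normal statistic.

```python
import math
from statistics import NormalDist

def weighted_kappa_null_test(table, w):
    """Return (z, null variance, two-sided p-value) for H0: weighted kappa = 0."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    p = [[c / n for c in counts] for counts in table]
    row_p = [sum(p[i]) for i in range(r)]                        # p_i.
    col_p = [sum(p[i][j] for i in range(r)) for j in range(r)]   # p_.j
    cells = [(i, j) for i in range(r) for j in range(r)]
    po_w = sum(w[i][j] * p[i][j] for i, j in cells)
    pe_w = sum(w[i][j] * row_p[i] * col_p[j] for i, j in cells)
    kw = (po_w - pe_w) / (1 - pe_w)
    wbar_row = [sum(col_p[j] * w[i][j] for j in range(r)) for i in range(r)]
    wbar_col = [sum(row_p[i] * w[i][j] for i in range(r)) for j in range(r)]
    # Null variance: cells are weighted by the product of the margins, not p_ij.
    first = sum(row_p[i] * col_p[j] * (w[i][j] - (wbar_row[i] + wbar_col[j])) ** 2
                for i, j in cells)
    var0 = (first - pe_w ** 2) / ((1 - pe_w) ** 2 * n)
    z = kw / math.sqrt(var0)
    return z, var0, 2 * (1 - NormalDist().cdf(abs(z)))

identity = [[1, 0], [0, 1]]
print(weighted_kappa_null_test([[20, 5], [10, 15]], identity))
```

With identity weights, the null variance coincides with the simple kappa null variance, so the example reproduces the simple kappa test for the same table.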

PROC FREQ also provides an exact test for the weighted kappa coefficient. You can request the exact test by specifying the WTKAPPA or AGREE option in the EXACT statement. For more information, see the section Exact Statistics.

Kappa Weights

PROC FREQ computes kappa coefficient weights by using the column scores and one of the two available weight types. The column scores are determined by the SCORES= option in the TABLES statement. The two available types of kappa weights are Cicchetti-Allison and Fleiss-Cohen weights. By default, PROC FREQ uses Cicchetti-Allison weights. If you specify the AGREE(WT=FC) option, PROC FREQ uses Fleiss-Cohen weights to compute the weighted kappa coefficient.

PROC FREQ computes Cicchetti-Allison kappa coefficient weights as

\[ w_{ij} = 1 - \frac{|C_i - C_j|}{C_C - C_1} \]

where $C_i$ is the score for column $i$ and $C$ is the number of categories or columns. For more information, see Cicchetti and Allison (1971).

The SCORES= option in the TABLES statement determines the type of column scores used to compute the kappa weights (and other score-based statistics). By default, SCORES=TABLE. For more information, see the section Scores. For numeric variables, table scores are the values of the variable levels. You can assign numeric values to the levels in a way that reflects their level of similarity. For example, suppose you have four levels and order them according to similarity. If you assign them values of 0, 2, 4, and 10, the Cicchetti-Allison kappa weights take the following values: $w_{12} = 0.8$, $w_{13} = 0.6$, $w_{14} = 0$, $w_{23} = 0.8$, $w_{24} = 0.2$, and $w_{34} = 0.4$. Note that when there are only two categories (that is, $C = 2$), the weighted kappa coefficient is identical to the simple kappa coefficient.

If you specify the AGREE(WT=FC) option in the TABLES statement, PROC FREQ computes Fleiss-Cohen kappa coefficient weights as

\[ w_{ij} = 1 - \frac{(C_i - C_j)^2}{(C_C - C_1)^2} \]

For more information, see Fleiss and Cohen (1973).

For the preceding example, the Fleiss-Cohen kappa weights are $w_{12} = 0.96$, $w_{13} = 0.84$, $w_{14} = 0$, $w_{23} = 0.96$, $w_{24} = 0.36$, and $w_{34} = 0.64$.
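Both weight types can be reproduced for the example scores 0, 2, 4, and 10 with a short Python sketch. The function name `kappa_weights` and the `kind` argument ("CA" or "FC") are hypothetical, and the sketch assumes that the scores are supplied in increasing order so that the first and last entries play the roles of $C_1$ and $C_C$.

```python
def kappa_weights(scores, kind="CA"):
    """Cicchetti-Allison ("CA") or Fleiss-Cohen ("FC") weights from column scores."""
    if kind not in ("CA", "FC"):
        raise ValueError("kind must be 'CA' or 'FC'")
    span = scores[-1] - scores[0]           # C_C - C_1, assuming increasing scores
    if span == 0:
        raise ValueError("scores must not all be equal")

    def weight(ci, cj):
        distance = abs(ci - cj) / span
        # CA uses the absolute distance; FC uses the squared distance.
        return 1 - distance if kind == "CA" else 1 - distance ** 2

    return [[weight(ci, cj) for cj in scores] for ci in scores]

scores = [0, 2, 4, 10]
for kind in ("CA", "FC"):
    for row in kappa_weights(scores, kind):
        print(kind, [round(value, 2) for value in row])
```

Both matrices are symmetric with 1 on the diagonal; the Fleiss-Cohen weights penalize near misses less and distant disagreements relatively more than the Cicchetti-Allison weights.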

Prevalence-Adjusted Bias-Adjusted Kappa

When you specify the AGREE(PABAK) option, PROC FREQ provides the prevalence-adjusted bias-adjusted kappa coefficient (PABAK) (Byrt, Bishop, and Carlin 1993). This coefficient is computed as

\[ \hat{\kappa}_a = (P_o - 1/R) / (1 - 1/R) \]

where $P_o = \sum_i p_{ii}$ and $R$ is the dimension of the square, two-way table. The component $P_o$ is the proportion of observed agreement, and the component $1/R$ represents the chance-expected agreement. When the table is $2 \times 2$, $\hat{\kappa}_a = 2 P_o - 1$. For more information, see Sim and Wright (2005), Xie (2013), and Holley and Guilford (1964).

PROC FREQ computes the variance of the prevalence-adjusted bias-adjusted kappa as

\[ \mathrm{Var}(\hat{\kappa}_a) = \left( R / (R - 1) \right)^2 \left( P_o (1 - P_o) / n \right) \]

Confidence limits are computed as

\[ \hat{\kappa}_a \pm \left( z_{\alpha/2} \times \sqrt{\mathrm{Var}(\hat{\kappa}_a)} \right) \]

where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The value of $\alpha$ is determined by the ALPHA= option; by default ALPHA=0.05, which produces 95% confidence limits.
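Because PABAK depends on the table only through the observed agreement, the table dimension, and the sample size, the computation is short. The following Python sketch transcribes the formulas above; the function name `pabak` is hypothetical.

```python
from statistics import NormalDist

def pabak(table, alpha=0.05):
    """Return (PABAK, variance, (lower, upper)) for a square table of counts."""
    r = len(table)                                   # R, the table dimension
    n = sum(sum(counts) for counts in table)
    po = sum(table[i][i] for i in range(r)) / n      # observed agreement
    kappa_a = (po - 1 / r) / (1 - 1 / r)
    var = (r / (r - 1)) ** 2 * po * (1 - po) / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * var ** 0.5
    return kappa_a, var, (kappa_a - half_width, kappa_a + half_width)

print(pabak([[20, 5], [10, 15]]))
```

For this $2 \times 2$ example, $P_o = 0.7$, so the coefficient equals $2(0.7) - 1 = 0.4$.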

AC1 Agreement Coefficient

When you specify the AGREE(AC1) option, PROC FREQ provides Gwet’s first-order agreement coefficient, AC1 (Gwet 2008). This coefficient is computed as

\[ \hat{\gamma} = (P_o - P_{e(\gamma)}) / (1 - P_{e(\gamma)}) \]

where $P_o = \sum_i p_{ii}$, $P_{e(\gamma)} = \sum_i e_i (1 - e_i) / (R - 1)$, and $e_i = (p_{i\cdot} + p_{\cdot i}) / 2$. The component $P_o$ is the proportion of observed agreement, and the component $P_{e(\gamma)}$ represents the proportion of chance-expected agreement. For more information, see Xie (2013) and Blood and Spratt (2007).

PROC FREQ computes the variance of AC1 as

\[ \mathrm{Var}(\hat{\gamma}) = \frac{P_o (1 - P_o) - 4 (1 - \hat{\gamma}) A + 4 (1 - \hat{\gamma})^2 B}{n \, (1 - P_{e(\gamma)})^2} \]

where

\[
\begin{aligned}
A &= \sum_i p_{ii} (1 - e_i) / (R - 1) - P_o \, P_{e(\gamma)} \\
B &= \sum_i \sum_j p_{ij} \left( 1 - (e_i + e_j)/2 \right)^2 / (R - 1)^2 - P_{e(\gamma)}^2
\end{aligned}
\]

Confidence limits for AC1 are computed as

\[ \hat{\gamma} \pm \left( z_{\alpha/2} \times \sqrt{\mathrm{Var}(\hat{\gamma})} \right) \]

where $z_{\alpha/2}$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The value of $\alpha$ is determined by the ALPHA= option; by default ALPHA=0.05, which produces 95% confidence limits.
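The AC1 coefficient and its variance can be sketched in Python as follows. The function name `ac1` is hypothetical. The variance uses the factor $(1 - \hat{\gamma})^2$ on the $B$ term, following the form given by Gwet (2008); this is an illustration of the formulas, not PROC FREQ code.

```python
from statistics import NormalDist

def ac1(table, alpha=0.05):
    """Return (AC1, variance, (lower, upper)) for a square table of counts."""
    r = len(table)
    n = sum(sum(counts) for counts in table)
    p = [[c / n for c in counts] for counts in table]
    row_p = [sum(p[i]) for i in range(r)]
    col_p = [sum(p[i][j] for i in range(r)) for j in range(r)]
    e = [(row_p[i] + col_p[i]) / 2 for i in range(r)]            # e_i
    po = sum(p[i][i] for i in range(r))
    pe_g = sum(e[i] * (1 - e[i]) for i in range(r)) / (r - 1)    # P_e(gamma)
    gamma = (po - pe_g) / (1 - pe_g)
    a = sum(p[i][i] * (1 - e[i]) for i in range(r)) / (r - 1) - po * pe_g
    b = sum(p[i][j] * (1 - (e[i] + e[j]) / 2) ** 2
            for i in range(r) for j in range(r)) / (r - 1) ** 2 - pe_g ** 2
    var = (po * (1 - po) - 4 * (1 - gamma) * a + 4 * (1 - gamma) ** 2 * b) \
          / (n * (1 - pe_g) ** 2)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * var ** 0.5
    return gamma, var, (gamma - half_width, gamma + half_width)

print(ac1([[20, 5], [10, 15]]))
```

For a table with balanced margins such as this one, AC1 is close to simple kappa; the two coefficients diverge as the margins become more extreme.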

Overall Kappa Coefficient

When there are multiple strata, PROC FREQ combines the stratum-level estimates of kappa into an overall estimate of the supposed common value of kappa. Assume there are $q$ strata, indexed by $h = 1, 2, \ldots, q$, and let $\mathrm{Var}(\hat{\kappa}_h)$ denote the variance of $\hat{\kappa}_h$. The estimate of the overall kappa coefficient is computed as

\[ \hat{\kappa}_T = \sum_{h=1}^{q} \frac{\hat{\kappa}_h}{\mathrm{Var}(\hat{\kappa}_h)} \Big/ \sum_{h=1}^{q} \frac{1}{\mathrm{Var}(\hat{\kappa}_h)} \]

For more information, see Fleiss, Levin, and Paik (2003).

PROC FREQ computes an estimate of the overall weighted kappa in the same way.

Tests for Equal Kappa Coefficients

When there are multiple strata, the following chi-square statistic tests whether the stratum-level values of kappa are equal:

\[ Q_K = \sum_{h=1}^{q} (\hat{\kappa}_h - \hat{\kappa}_T)^2 / \mathrm{Var}(\hat{\kappa}_h) \]

Under the null hypothesis of equal kappas for the $q$ strata, $Q_K$ has an asymptotic chi-square distribution with $q - 1$ degrees of freedom. See Fleiss, Levin, and Paik (2003) for more information. PROC FREQ computes a test for equal weighted kappa coefficients in the same way.
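The overall kappa is an inverse-variance weighted mean of the stratum estimates, and the equal-kappa statistic measures how far the stratum estimates fall from that mean. The following Python sketch makes both steps concrete. The function name `overall_kappa_test` and the input values are hypothetical; the stratum kappas and variances are assumed to have been computed already.

```python
def overall_kappa_test(kappas, variances):
    """Return (overall kappa, Q_K, degrees of freedom) from stratum estimates."""
    if len(kappas) != len(variances) or len(kappas) < 2:
        raise ValueError("need matching kappas and variances for at least 2 strata")
    inverse_var = [1.0 / v for v in variances]
    # Inverse-variance weighted mean of the stratum-level kappas.
    kappa_t = sum(k * w for k, w in zip(kappas, inverse_var)) / sum(inverse_var)
    # Chi-square statistic for equality of the stratum-level kappas.
    q_k = sum((k - kappa_t) ** 2 / v for k, v in zip(kappas, variances))
    return kappa_t, q_k, len(kappas) - 1

print(overall_kappa_test([0.4, 0.6], [0.02, 0.02]))
```

Strata with smaller variances pull the overall estimate toward their own kappa, so a large, precise stratum can dominate the combined value.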

Cochran’s Q Test

Cochran’s $Q$ is computed for multiway tables when each variable has two levels, that is, for $2 \times 2 \times \cdots \times 2$ tables. Cochran’s $Q$ statistic is used to test the homogeneity of the one-dimensional margins. Let $m$ denote the number of variables and $N$ denote the total number of subjects. Cochran’s $Q$ statistic is computed as

\[ Q_C = m (m - 1) \left( \sum_{j=1}^{m} T_j^2 - T^2 / m \right) \Big/ \left( m T - \sum_{k=1}^{N} S_k^2 \right) \]

where $T_j$ is the number of positive responses for variable $j$, $T$ is the total number of positive responses over all variables, and $S_k$ is the number of positive responses for subject $k$. Under the null hypothesis, Cochran’s $Q$ has an asymptotic chi-square distribution with $m - 1$ degrees of freedom. For more information, see Cochran (1950). When there are only two binary response variables ($m = 2$), Cochran’s $Q$ simplifies to McNemar’s test. When there are more than two response categories, you can test for marginal homogeneity by using the repeated measures capabilities of the CATMOD procedure.
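Cochran’s $Q$ is easiest to follow from subject-level data. The Python sketch below assumes a hypothetical input layout, one list of $m$ binary (0 or 1) responses per subject, and a hypothetical function name `cochran_q`; it transcribes the formula above rather than reproducing PROC FREQ’s table-based computation.

```python
def cochran_q(responses):
    """Return (Q_C, degrees of freedom) from per-subject lists of 0/1 responses."""
    m = len(responses[0])                                        # number of variables
    t_j = [sum(subject[j] for subject in responses) for j in range(m)]
    t = sum(t_j)                                                 # all positive responses
    s_squared = sum(sum(subject) ** 2 for subject in responses)  # sum of S_k^2
    denominator = m * t - s_squared
    if denominator == 0:
        raise ValueError("Q is undefined when every subject answers uniformly")
    q = m * (m - 1) * (sum(x ** 2 for x in t_j) - t ** 2 / m) / denominator
    return q, m - 1

# Two variables: 2 concordant-positive, 3 (1,0), 1 (0,1), and 2 concordant-negative pairs.
pairs = [[1, 1]] * 2 + [[1, 0]] * 3 + [[0, 1]] * 1 + [[0, 0]] * 2
print(cochran_q(pairs))
```

In the two-variable example, the result matches McNemar’s statistic for the same data, $(3 - 1)^2 / (3 + 1) = 1$, which illustrates the $m = 2$ special case noted above.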

Tables with Zero-Weight Rows or Columns

The AGREE statistics are defined only for square tables, where the number of rows equals the number of columns; if a table is not square, PROC FREQ does not compute AGREE statistics for the table. In the kappa statistic framework, where two independent raters assign ratings to each of n subjects, suppose one of the raters does not use all possible r rating levels. If the corresponding table contains r rows but only r–1 columns, the table is not square and PROC FREQ does not compute AGREE statistics. To create a square table in this situation, you can use the ZEROS option in the WEIGHT statement, which includes zero-weight observations in the analysis. You can include zero-weight observations in the input data set to represent any rating levels that are not used by a rater, so that the input data set has at least one observation for each possible rater and rating combination. When you use this input data set and specify the ZEROS option, the analysis includes all rating levels (even when all levels are not actually assigned by both raters). The resulting table (of rater 1 by rater 2) is a square table, and AGREE statistics can be computed.

For more information, see the description of the ZEROS option in the WEIGHT statement. By default, PROC FREQ does not process observations that have weights of 0 because these observations do not contribute to the total frequency count, and because many of the tests and measures of association are undefined for tables that contain zero-weight rows or columns. However, kappa statistics are defined for tables that contain zero-weight rows or columns, and the ZEROS option enables you to input zero-weight observations and construct the tables needed to compute kappa statistics.

Last updated: December 09, 2022