The FREQ Procedure

Common Risk Difference

If you specify the COMMONRISKDIFF option in the TABLES statement, PROC FREQ provides estimates, confidence limits, and tests for the common (overall) risk difference for multiway $2 \times 2$ tables.

Mantel-Haenszel Confidence Limits and Test

PROC FREQ computes the Mantel-Haenszel estimate, confidence limits, and test for the common risk difference by using Mantel-Haenszel stratum weights (Mantel and Haenszel 1959) and the Sato variance estimator (Sato 1989). The Mantel-Haenszel estimate of the common risk difference is

$$\hat{d}_{\mathrm{MH}} = \sum_h \hat{d}_h \, w_h$$

where $\hat{d}_h$ is the risk difference in stratum $h$ and

$$w_h = \frac{n_{h1\cdot}\, n_{h2\cdot}}{n_h} \Bigg/ \sum_i \frac{n_{i1\cdot}\, n_{i2\cdot}}{n_i}$$

is the Mantel-Haenszel weight of stratum $h$. The column 1 risk difference in stratum ($2 \times 2$ table) $h$ is computed as

$$\hat{d}_h = \hat{p}_{h1} - \hat{p}_{h2} = \left( n_{h11} / n_{h1\cdot} \right) - \left( n_{h21} / n_{h2\cdot} \right)$$

where $\hat{p}_{h1}$ is the proportion of row 1 observations that are classified in column 1 and $\hat{p}_{h2}$ is the proportion of row 2 observations that are classified in column 1. The column 2 risk difference is computed in the same way. For more information, see Agresti (2013, p. 231).
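The estimate above can be sketched in a few lines of Python. This is only an illustration of the formulas, not PROC FREQ's implementation; the function name `mh_common_risk_diff` and the two strata of counts are hypothetical.

```python
# Hypothetical 2x2 strata given as (n11, n12, n21, n22); not data from the source.
strata = [(10, 20, 5, 25), (30, 30, 12, 28)]

def mh_common_risk_diff(strata):
    """Return (d_mh, stratum risk differences, Mantel-Haenszel weights)."""
    d, raw = [], []
    for n11, n12, n21, n22 in strata:
        n1, n2 = n11 + n12, n21 + n22        # row totals n_{h1.} and n_{h2.}
        d.append(n11 / n1 - n21 / n2)        # column 1 risk difference d_h
        raw.append(n1 * n2 / (n1 + n2))      # unnormalized weight n_{h1.} n_{h2.} / n_h
    total = sum(raw)
    w = [r / total for r in raw]             # weights are normalized to sum to 1
    d_mh = sum(dh * wh for dh, wh in zip(d, w))
    return d_mh, d, w

d_mh, d, w = mh_common_risk_diff(strata)
print(d_mh, d, w)
```

Because the weights are normalized, the estimate is a weighted average of the stratum risk differences and always lies between the smallest and largest of them.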

PROC FREQ computes the variance of $\hat{d}_{\mathrm{MH}}$ (Sato 1989) as

$$\hat{\sigma}^2(\hat{d}_{\mathrm{MH}}) = \left( \hat{d}_{\mathrm{MH}} \sum_h P_h + \sum_h Q_h \right) \Bigg/ \left( \sum_h n_{h1\cdot}\, n_{h2\cdot} / n_h \right)^2$$

where

$$P_h = \left( n_{h1\cdot}^2\, n_{h21} - n_{h2\cdot}^2\, n_{h11} + n_{h1\cdot}\, n_{h2\cdot} \left( n_{h2\cdot} - n_{h1\cdot} \right) / 2 \right) / n_h^2$$

$$Q_h = \left( n_{h11} \left( n_{h2\cdot} - n_{h21} \right) + n_{h21} \left( n_{h1\cdot} - n_{h11} \right) \right) / 2 n_h$$

The $100(1-\alpha)\%$ confidence limits for the common risk difference are

$$\hat{d}_{\mathrm{MH}} \pm \left( z_{\alpha/2} \times \hat{\sigma}(\hat{d}_{\mathrm{MH}}) \right)$$

If you specify the COMMONRISKDIFF(TEST=MH) option, PROC FREQ provides a Mantel-Haenszel test of the null hypothesis that the common risk difference is 0, which is computed as $z_{\mathrm{MH}} = \hat{d}_{\mathrm{MH}} / \hat{\sigma}(\hat{d}_{\mathrm{MH}})$. The two-sided p-value is $\mathrm{Prob}(|Z| > |z_{\mathrm{MH}}|)$, where $Z$ has a standard normal distribution.
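A minimal Python sketch of the Sato variance, the confidence limits, and the test follows. It only illustrates the formulas in this section; the function name `mh_sato` and the input counts are hypothetical, and the denominator $2 n_h$ in $Q_h$ is read as $2 \times n_h$.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical 2x2 strata given as (n11, n12, n21, n22); not data from the source.
strata = [(10, 20, 5, 25), (30, 30, 12, 28)]

def mh_sato(strata, alpha=0.05):
    """Return (d_mh, standard error, confidence limits, z statistic, p-value)."""
    P = Q = W = num = 0.0
    for n11, n12, n21, n22 in strata:
        n1, n2 = n11 + n12, n21 + n22
        n = n1 + n2
        W += n1 * n2 / n                                  # sum of n_{h1.} n_{h2.} / n_h
        num += (n11 / n1 - n21 / n2) * n1 * n2 / n        # weighted risk differences
        P += (n1**2 * n21 - n2**2 * n11 + n1 * n2 * (n2 - n1) / 2) / n**2
        Q += (n11 * (n2 - n21) + n21 * (n1 - n11)) / (2 * n)
    d_mh = num / W
    se = sqrt((d_mh * P + Q) / W**2)                      # Sato standard error
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    z = d_mh / se
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return d_mh, se, (d_mh - z_crit * se, d_mh + z_crit * se), z, p_value

d_mh, se, limits, z_mh, p_value = mh_sato(strata)
print(d_mh, se, limits, z_mh, p_value)
```

With a single stratum, the Sato variance reduces algebraically to $\hat{p}_{h1}(1-\hat{p}_{h1})/n_{h1\cdot} + \hat{p}_{h2}(1-\hat{p}_{h2})/n_{h2\cdot}$, which is a convenient sanity check on an implementation.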

Klingenberg Confidence Limits

Klingenberg confidence limits (Klingenberg 2014) for the Mantel-Haenszel common risk difference are based on inverting a test of homogeneity that uses the null form of the Sato variance estimator (Sato 1989). For performance evaluation of Klingenberg confidence limits, see Fisher (2015) and Klingenberg (2014).

The $100(1-\alpha)\%$ Klingenberg confidence limits for the common risk difference are

$$\hat{d}_{\mathrm{Mid}} \pm M_{\alpha/2}$$

where $M$ (margin of error) is computed as

$$M_{\alpha/2} = \sqrt{ \hat{d}_{\mathrm{Mid}}^{\,2} - \hat{d}_{\mathrm{MH}}^{\,2} + z_{\alpha/2}^2 \left( Q / W^2 \right) }$$

and the confidence interval midpoint is computed as

$$\hat{d}_{\mathrm{Mid}} = \hat{d}_{\mathrm{MH}} + 0.5\, z_{\alpha/2}^2 \left( P / W^2 \right)$$

The values $P$, $Q$, and $W$ are computed as

$$\begin{aligned} P &= \sum_h P_h \\ Q &= \sum_h Q_h \\ W &= \sum_h n_{h1\cdot}\, n_{h2\cdot} / n_h \end{aligned}$$

where $h$ denotes the stratum, and $P_h$ and $Q_h$ are defined in the section Mantel-Haenszel Confidence Limits and Test.
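The midpoint and margin of error can be sketched in Python as follows. This is an illustration under assumptions, not PROC FREQ's code: the function name `klingenberg_limits` and the counts are hypothetical, and the $P_h$ and $Q_h$ terms are written out as in the Mantel-Haenszel section. Rearranging the two formulas above shows that each limit $\delta$ satisfies $(\hat{d}_{\mathrm{MH}} - \delta)^2 = z_{\alpha/2}^2 (\delta P + Q) / W^2$, which gives a simple check on the result.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical 2x2 strata given as (n11, n12, n21, n22); not data from the source.
strata = [(10, 20, 5, 25), (30, 30, 12, 28)]

def klingenberg_limits(strata, alpha=0.05):
    """Return (lower, upper, (d_mh, P, Q, W, z squared))."""
    P = Q = W = num = 0.0
    for n11, n12, n21, n22 in strata:
        n1, n2 = n11 + n12, n21 + n22
        n = n1 + n2
        W += n1 * n2 / n
        num += (n11 / n1 - n21 / n2) * n1 * n2 / n
        P += (n1**2 * n21 - n2**2 * n11 + n1 * n2 * (n2 - n1) / 2) / n**2
        Q += (n11 * (n2 - n21) + n21 * (n1 - n11)) / (2 * n)
    d_mh = num / W
    z2 = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    d_mid = d_mh + 0.5 * z2 * P / W**2                    # interval midpoint
    margin = sqrt(d_mid**2 - d_mh**2 + z2 * Q / W**2)     # margin of error
    return d_mid - margin, d_mid + margin, (d_mh, P, Q, W, z2)

lower, upper, parts = klingenberg_limits(strata)
print(lower, upper)
```

Note that the interval is centered at the midpoint, not at the Mantel-Haenszel estimate, so it is generally not symmetric about the estimate.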

Minimum Risk Confidence Limits and Test

PROC FREQ computes the minimum risk estimate, confidence limits, and test for the common risk difference by using the method of Mehrotra and Railkar (2000). The stratum estimates are weighted by minimum risk weights, which minimize the mean square error of the estimate of the common risk difference. Minimum risk weights are designed to improve precision and reduce bias (compared to other weighting strategies) and can minimize the power loss that can occur when underlying assumptions are not met. For more information, see Mehrotra (2001) and Dmitrienko et al. (2005, section 1.3.3).

The minimum risk estimate of the common risk difference is

$$\hat{d}_{\mathrm{MR}} = \sum_h \hat{d}_h \, w_h^{*}$$

where $\hat{d}_h$ is the risk difference in stratum $h$ and $w_h^{*}$ is the minimum risk weight of stratum $h$ (which is described in the section Minimum Risk Weights). The variance of $\hat{d}_{\mathrm{MR}}$ is estimated by

$$\hat{V}(\hat{d}_{\mathrm{MR}}) = \sum_h w_h^{*2}\, \hat{V}_h$$

where $\hat{V}_h$ (the variance estimate of the stratum $h$ risk difference) is computed as

$$\hat{V}_h = \hat{p}_{h1} (1 - \hat{p}_{h1}) / n_{h1\cdot} + \hat{p}_{h2} (1 - \hat{p}_{h2}) / n_{h2\cdot}$$

The $100(1-\alpha)\%$ minimum risk confidence limits for the common risk difference are

$$\hat{d}_{\mathrm{MR}} \pm \left( c + z_{\alpha/2} \sqrt{ \hat{V}(\hat{d}_{\mathrm{MR}}) } \right)$$

where the continuity correction is

$$c = 0.1875 \Big/ \sum_h \left( n_{h1\cdot}\, n_{h2\cdot} / n_h \right)$$

The continuity correction is applied only when $c < |\hat{d}_{\mathrm{MR}}|$ (Fleiss, Levin, and Paik 2003). You can remove the continuity correction by specifying the COMMONRISKDIFF(CORRECT=NO) option.

By default, the minimum risk test is computed as

$$z_{\mathrm{MR}} = \left( \hat{d}_{\mathrm{MR}} \pm c \right) \Big/ \sqrt{ \hat{V}_0(\hat{d}_{\mathrm{MR}}) }$$

The continuity correction $c$ is subtracted from $\hat{d}_{\mathrm{MR}}$ if $\hat{d}_{\mathrm{MR}} > 0$ and added to $\hat{d}_{\mathrm{MR}}$ if $\hat{d}_{\mathrm{MR}} < 0$. The null variance of the common risk difference is estimated by

$$\hat{V}_0(\hat{d}_{\mathrm{MR}}) = \sum_h w_h^{*2}\, \hat{V}_0(\hat{d}_h)$$

where $\hat{V}_0(\hat{d}_h)$ (an estimate of the variance of the stratum $h$ risk difference under the null hypothesis) is

$$\hat{V}_0(\hat{d}_h) = \bar{p}_h (1 - \bar{p}_h) \left( 1 / n_{h1\cdot} + 1 / n_{h2\cdot} \right)$$

and

$$\bar{p}_h = \left( n_{h1\cdot}\, \hat{p}_{h1} + n_{h2\cdot}\, \hat{p}_{h2} \right) / \left( n_{h1\cdot} + n_{h2\cdot} \right)$$

The two-sided p-value is $\mathrm{Prob}(|Z| > |z_{\mathrm{MR}}|)$, where $Z$ has a standard normal distribution.

If you specify the VAR=SAMPLE option for COMMONRISKDIFF(TEST=MR), PROC FREQ uses the sample variance estimate $\hat{V}(\hat{d}_{\mathrm{MR}})$ instead of the null variance estimate $\hat{V}_0(\hat{d}_{\mathrm{MR}})$ in the denominator of the test statistic $z_{\mathrm{MR}}$. If you specify the COMMONRISKDIFF(CORRECT=NO) option, the continuity correction is not included in the test statistic.
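The confidence limits and test in this section can be sketched in Python once the stratum weights are available. The sketch below is illustrative only: the function name `min_risk_inference`, its `correct` and `null_var` arguments, and the counts are hypothetical, and the weights are taken as an input. The equal weights in the example are placeholders, not minimum risk weights. The sketch also assumes that the condition $c < |\hat{d}_{\mathrm{MR}}|$ governs the correction in the test statistic as well as in the confidence limits, which this section states explicitly only for the limits.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical 2x2 strata given as (n11, n12, n21, n22); not data from the source.
strata = [(10, 20, 5, 25), (30, 30, 12, 28)]

def min_risk_inference(strata, weights, alpha=0.05, correct=True, null_var=True):
    """Return (d_mr, confidence limits, z statistic, p-value) for given weights."""
    d_mr = v = v0 = W = 0.0
    for (n11, n12, n21, n22), w in zip(strata, weights):
        n1, n2 = n11 + n12, n21 + n22
        p1, p2 = n11 / n1, n21 / n2
        p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)           # pooled column 1 proportion
        d_mr += w * (p1 - p2)
        v += w**2 * (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)       # sample variance
        v0 += w**2 * p_bar * (1 - p_bar) * (1 / n1 + 1 / n2)        # null variance
        W += n1 * n2 / (n1 + n2)
    c = 0.1875 / W
    if not correct or c >= abs(d_mr):
        c = 0.0                        # assumption: same rule for limits and test
    norm = NormalDist()
    half_width = c + norm.inv_cdf(1 - alpha / 2) * sqrt(v)
    shifted = d_mr - c if d_mr > 0 else d_mr + c          # move the estimate toward 0
    z = shifted / sqrt(v0 if null_var else v)
    p_value = 2 * (1 - norm.cdf(abs(z)))
    return d_mr, (d_mr - half_width, d_mr + half_width), z, p_value

placeholder_weights = [0.5, 0.5]       # placeholders only, NOT minimum risk weights
result = min_risk_inference(strata, placeholder_weights)
print(result)
```

Passing `null_var=False` mirrors the VAR=SAMPLE behavior described above, and `correct=False` mirrors CORRECT=NO.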

Minimum Risk Weights

The estimate of the minimum risk weight for stratum h is defined by Mehrotra and Railkar (2000) as

$$w_h^{*} = \frac{\beta_h}{\sum_i \hat{V}_i^{-1}} - \left( \frac{\alpha_h\, \hat{V}_h^{-1}}{\sum_i \hat{V}_i^{-1} + \sum_i \alpha_i\, \hat{d}_i\, \hat{V}_i^{-1}} \right) \left( \frac{\sum_i \hat{d}_i\, \beta_i}{\sum_i \hat{V}_i^{-1}} \right)$$

where

$$\alpha_h = \hat{d}_h \sum_i \hat{V}_i^{-1} - \sum_i \hat{d}_i\, \hat{V}_i^{-1}$$

$$\beta_h = \hat{V}_h^{-1} \left( 1 + \alpha_h \sum_i f_i\, \hat{d}_i \right)$$

and $f_h$ is the fraction in stratum $h$

$$f_h = n_h \Big/ \sum_i n_i$$

All sums are over the $s$ strata ($2 \times 2$ tables) in the multiway table request, $\hat{d}_i$ denotes the risk difference estimate in stratum $i$, and $\hat{V}_i$ denotes the sample variance estimate of the risk difference in stratum $i$.
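The weight formula is easier to follow as code. The Python sketch below is a direct transcription of the expressions in this section, offered as an illustration and not as PROC FREQ's implementation; the function name `min_risk_weights` and the counts are hypothetical. It assumes every stratum has a nonzero sample variance, since the formula needs its inverse.

```python
# Hypothetical 2x2 strata given as (n11, n12, n21, n22); not data from the source.
strata = [(10, 20, 5, 25), (30, 30, 12, 28)]

def min_risk_weights(strata):
    """Return the minimum risk weights w*_h, one per stratum."""
    d, v_inv, n = [], [], []
    for n11, n12, n21, n22 in strata:
        n1, n2 = n11 + n12, n21 + n22
        p1, p2 = n11 / n1, n21 / n2
        d.append(p1 - p2)                                  # stratum risk difference
        # Inverse sample variance; fails if a stratum has zero variance.
        v_inv.append(1 / (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2))
        n.append(n1 + n2)
    sum_v_inv = sum(v_inv)
    sum_d_v_inv = sum(di * vi for di, vi in zip(d, v_inv))
    f = [nh / sum(n) for nh in n]                          # stratum fractions f_h
    sum_f_d = sum(fi * di for fi, di in zip(f, d))
    alpha = [dh * sum_v_inv - sum_d_v_inv for dh in d]
    beta = [vh * (1 + ah * sum_f_d) for vh, ah in zip(v_inv, alpha)]
    denom = sum_v_inv + sum(ai * di * vi for ai, di, vi in zip(alpha, d, v_inv))
    sum_d_beta = sum(di * bi for di, bi in zip(d, beta))
    return [bh / sum_v_inv - (ah * vh / denom) * (sum_d_beta / sum_v_inv)
            for bh, ah, vh in zip(beta, alpha, v_inv)]

weights = min_risk_weights(strata)
print(weights, sum(weights))
```

Two properties follow from the algebra and are useful checks: the weights sum to 1, and when all stratum risk differences are equal, every $\alpha_h$ is 0 and the weights reduce to inverse-variance weights.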

Summary Score Confidence Limits

PROC FREQ computes the summary score estimate of the common risk difference (Agresti 2013, p. 231) by using inverse-variance stratum weights and Miettinen-Nurminen (score) confidence limits for the stratum risk differences. For more information, see the section "Miettinen-Nurminen (Score) Confidence Limits."

The score confidence interval for the risk difference in stratum $h$ can be expressed as $\hat{d}'_h \pm z_{\alpha/2}\, s'_h$, where $\hat{d}'_h$ is the midpoint of the score confidence interval and $s'_h$ is the width of the confidence interval divided by $2 z_{\alpha/2}$. The summary score estimate of the common risk difference is computed as

$$\hat{d}_S = \sum_h \hat{d}'_h \, w'_h$$

where

$$w'_h = \left( 1 / {s'_h}^2 \right) \Big/ \sum_i \left( 1 / {s'_i}^2 \right)$$

The variance of $\hat{d}_S$ is computed as

$$\hat{\sigma}^2(\hat{d}_S) = 1 \Big/ \sum_h \left( 1 / {s'_h}^2 \right)$$

The $100(1-\alpha)\%$ summary score confidence limits for the common risk difference are

$$\hat{d}_S \pm \left( z_{\alpha/2} \times \hat{\sigma}(\hat{d}_S) \right)$$

If you specify the COMMONRISKDIFF(TEST=SCORE) option, PROC FREQ provides a summary score test of the null hypothesis that the common risk difference is 0. The test statistic is $z_S = \hat{d}_S / \hat{\sigma}(\hat{d}_S)$. The two-sided p-value is $\mathrm{Prob}(|Z| > |z_S|)$, where $Z$ has a standard normal distribution.
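The combination step can be sketched in Python as follows. The sketch takes the stratum score (Miettinen-Nurminen) limits as given inputs and does not compute them; the function name `summary_score` and the example limits are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

def summary_score(score_limits, alpha=0.05):
    """Combine stratum score limits [(lower_h, upper_h), ...] computed at level alpha.

    Returns (d_s, confidence limits, z statistic, p-value).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    mid = [(lo + hi) / 2 for lo, hi in score_limits]               # d'_h
    s = [(hi - lo) / (2 * z_crit) for lo, hi in score_limits]      # s'_h
    inv_var = [1 / si**2 for si in s]
    total = sum(inv_var)
    d_s = sum(m * iv for m, iv in zip(mid, inv_var)) / total       # weighted midpoints
    se = sqrt(1 / total)
    z_s = d_s / se
    p_value = 2 * (1 - norm.cdf(abs(z_s)))
    return d_s, (d_s - z_crit * se, d_s + z_crit * se), z_s, p_value

# Hypothetical stratum score limits; not output from PROC FREQ.
stratum_limits = [(-0.06, 0.38), (0.00, 0.38)]
print(summary_score(stratum_limits))
```

With a single stratum the weight is 1, so the summary interval reproduces that stratum's score interval.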

Stratified Newcombe Confidence Limits

PROC FREQ computes stratified Newcombe confidence limits for the common risk (proportion) difference by using the method of Yan and Su (2010). The stratified Newcombe confidence limits are constructed from stratified Wilson confidence limits for the common (overall) row proportions. By default, the strata are weighted by Mantel-Haenszel weights; if you specify the COMMONRISKDIFF(CL=NEWCOMBEMR) option, the strata are weighted by minimum risk weights.

PROC FREQ first computes individual Wilson confidence limits for the row proportions in each $2 \times 2$ table (stratum), as described in the section Wilson (Score) Confidence Limits. These stratum Wilson confidence limits are then combined to form stratified Wilson confidence limits for the overall row proportions by using stratum weights (either Mantel-Haenszel or minimum risk). The confidence levels of the stratum Wilson confidence limits are chosen so that the overall confidence coefficient (for the stratified Wilson confidence limits) is $100(1-\alpha)\%$ (Yan and Su 2010).

Denote the lower and upper stratified Wilson confidence limits for the common row 1 proportion as $L_1$ and $U_1$, respectively, and denote the lower and upper stratified Wilson confidence limits for the common row 2 proportion as $L_2$ and $U_2$, respectively. The $100(1-\alpha)\%$ stratified Newcombe confidence limits for the common risk (proportion) difference are

$$\begin{aligned} L &= \hat{d} - z_{\alpha/2} \sqrt{ \lambda_1 L_1 (1 - L_1) + \lambda_2 U_2 (1 - U_2) } \\ U &= \hat{d} + z_{\alpha/2} \sqrt{ \lambda_2 L_2 (1 - L_2) + \lambda_1 U_1 (1 - U_1) } \end{aligned}$$

where $\hat{d}$ is the weighted estimate of the common risk difference and

$$\begin{aligned} \lambda_1 &= \sum_h w_h^2 / n_{h1\cdot} \\ \lambda_2 &= \sum_h w_h^2 / n_{h2\cdot} \end{aligned}$$

By default, the strata are weighted by Mantel-Haenszel weights, which are defined as

$$w_h = \frac{n_{h1\cdot}\, n_{h2\cdot}}{n_h} \Bigg/ \sum_i \frac{n_{i1\cdot}\, n_{i2\cdot}}{n_i}$$

and the weighted estimate of the common risk difference is $\hat{d}_{\mathrm{MH}}$. For more information, see the section Mantel-Haenszel Confidence Limits and Test. Optionally, the strata are weighted by minimum risk weights, and the weighted estimate of the common risk difference is $\hat{d}_{\mathrm{MR}}$. For more information, see the section Minimum Risk Confidence Limits and Test.

When there is a single stratum, the stratified Newcombe confidence interval is equivalent to the (unstratified) Newcombe confidence interval. For more information, see the subsection "Newcombe Confidence Limits" in the section Confidence Limits for the Risk Difference. See also Kim and Won (2013).
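The final combination step can be sketched in Python. The sketch takes the stratified Wilson limits as inputs and does not implement the Yan and Su (2010) procedure for choosing the stratum confidence levels. The function names `wilson` and `stratified_newcombe` and the counts are hypothetical. The example uses a single stratum, where the ordinary Wilson limits can stand in for the stratified ones.

```python
from math import sqrt
from statistics import NormalDist

def wilson(x, n, z):
    """Wilson (score) limits for a binomial proportion x/n at critical value z."""
    p = x / n
    center = p + z * z / (2 * n)
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return (center - half) / denom, (center + half) / denom

def stratified_newcombe(d_hat, row1, row2, weights, n1, n2, alpha=0.05):
    """row1 = (L1, U1), row2 = (L2, U2); n1 and n2 list the stratum row totals."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    (L1, U1), (L2, U2) = row1, row2
    lam1 = sum(w * w / n for w, n in zip(weights, n1))
    lam2 = sum(w * w / n for w, n in zip(weights, n2))
    lower = d_hat - z_crit * sqrt(lam1 * L1 * (1 - L1) + lam2 * U2 * (1 - U2))
    upper = d_hat + z_crit * sqrt(lam2 * L2 * (1 - L2) + lam1 * U1 * (1 - U1))
    return lower, upper

# Single hypothetical stratum: 10 of 30 in row 1 and 5 of 30 in row 2.
z = NormalDist().inv_cdf(0.975)
row1, row2 = wilson(10, 30, z), wilson(5, 30, z)
lower, upper = stratified_newcombe(10 / 30 - 5 / 30, row1, row2, [1.0], [30], [30])
print(lower, upper)
```

In the single-stratum case the result agrees with the familiar Newcombe form $\hat{d} - \sqrt{(\hat{p}_1 - L_1)^2 + (U_2 - \hat{p}_2)^2}$ for the lower limit, because each Wilson limit $L$ satisfies $(\hat{p} - L)^2 = z_{\alpha/2}^2\, L(1 - L)/n$.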

Last updated: December 09, 2022