The FREQ Procedure

Odds Ratio and Relative Risks

Odds Ratio

The odds ratio is a useful measure of association for a variety of study designs. For a retrospective design called a case-control study, the odds ratio can be used to estimate the relative risk when the probability of positive response is small (Agresti 2002). In a case-control study, two independent samples are identified based on a binary (yes-no) response variable, and the conditional distribution of a binary explanatory variable is examined within fixed levels of the response variable. For more information, see Stokes, Davis, and Koch (2012), Agresti (2013), and Agresti (2007).

The odds of a positive response (column 1) in row 1 is $n_{11}/n_{12}$. Similarly, the odds of a positive response in row 2 is $n_{21}/n_{22}$. The odds ratio is formed as the ratio of the row 1 odds to the row 2 odds. The odds ratio for a $2 \times 2$ table is defined as

$$\mathrm{OR} = \frac{n_{11}/n_{12}}{n_{21}/n_{22}} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}$$

The odds ratio can be any nonnegative number. When the row and column variables are independent, the true value of the odds ratio is 1. An odds ratio greater than 1 indicates that the odds of a positive response are higher in row 1 than in row 2. An odds ratio less than 1 indicates that the odds of a positive response are higher in row 2. The strength of association increases as the deviation from 1 increases.

The transformation $G = (\mathrm{OR} - 1)/(\mathrm{OR} + 1)$ maps the odds ratio to the range $[-1, 1)$, where $G = 0$ when $\mathrm{OR} = 1$; $G = -1$ when $\mathrm{OR} = 0$; and $G$ approaches 1 as OR approaches infinity. $G$ is the gamma statistic, which PROC FREQ computes when you specify the MEASURES option.
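
As a small numeric illustration of the definitions above, the following Python sketch computes the sample odds ratio and the transformed value G from four cell counts. The function names and the example counts are hypothetical and are not part of PROC FREQ; returning infinity when only the denominator is 0 is a convention chosen for this sketch.

```python
import math

def odds_ratio(n11, n12, n21, n22):
    """Sample odds ratio (n11 * n22) / (n12 * n21) of a 2x2 table."""
    num, den = n11 * n22, n12 * n21
    if den == 0:
        # Convention for this sketch: infinite when only the denominator is 0,
        # undefined (NaN) when numerator and denominator are both 0.
        return math.inf if num > 0 else math.nan
    return num / den

def g_transform(or_value):
    """G = (OR - 1) / (OR + 1); tends to 1 as OR grows without bound."""
    if math.isinf(or_value):
        return 1.0
    return (or_value - 1) / (or_value + 1)

or_hat = odds_ratio(10, 20, 5, 25)   # hypothetical counts
g = g_transform(or_hat)
```

With these counts the row 1 odds are 10/20 and the row 2 odds are 5/25, so the odds ratio exceeds 1 and G is positive.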

Confidence Limits for the Odds Ratio

The following types of confidence limits are available for the odds ratio: exact, exact mid-p, likelihood ratio, score, Wald, and Wald modified.

Wald Confidence Limits
The asymptotic Wald confidence limits are based on a log transformation of the odds ratio (Woolf 1955; Haldane 1956). PROC FREQ computes the Wald confidence limits as

$$\left( \mathrm{OR} \times \exp(-z\sqrt{v}),\; \mathrm{OR} \times \exp(z\sqrt{v}) \right)$$

where

$$v = \mathrm{Var}(\log(\mathrm{OR})) = \frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}$$

and $z$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The confidence level $\alpha$ is determined by the ALPHA= option in the TABLES statement; by default, ALPHA=0.05, which produces 95% confidence limits for the odds ratio. If any of the four cell frequencies is 0, $v$ is undefined and the Wald confidence limits cannot be computed. For more information, see Agresti (2013, p. 70).
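
The Wald computation can be sketched in a few lines of Python. This is an illustration of the formulas above, not PROC FREQ's implementation; the function name `or_wald_cl` and the example counts are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

def or_wald_cl(n11, n12, n21, n22, alpha=0.05):
    """Wald limits for the odds ratio, built on the log scale."""
    if min(n11, n12, n21, n22) == 0:
        # v contains 1/n_ij, so it is undefined when any cell is empty.
        raise ValueError("Wald limits require four nonzero cell frequencies")
    or_hat = (n11 * n22) / (n12 * n21)
    v = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 100(1 - alpha/2)th percentile
    half_width = z * sqrt(v)
    return or_hat * exp(-half_width), or_hat * exp(half_width)

lower, upper = or_wald_cl(10, 20, 5, 25)   # hypothetical counts, OR = 2.5
```

Because the limits are symmetric on the log scale, their product equals the squared odds ratio estimate.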

Wald Modified Confidence Limits
PROC FREQ computes Wald modified confidence limits (Haldane 1956) for the odds ratio by replacing each $n_{ij}$ with $(n_{ij} + 0.5)$ in the estimate and variance as follows:

$$\mathrm{OR}_m = \frac{(n_{11} + 0.5)(n_{22} + 0.5)}{(n_{12} + 0.5)(n_{21} + 0.5)}$$

$$v = \mathrm{Var}(\log(\mathrm{OR}_m)) = \frac{1}{n_{11} + 0.5} + \frac{1}{n_{12} + 0.5} + \frac{1}{n_{21} + 0.5} + \frac{1}{n_{22} + 0.5}$$

The modified confidence limits are computed as

$$\left( \mathrm{OR}_m \times \exp(-z\sqrt{v}),\; \mathrm{OR}_m \times \exp(z\sqrt{v}) \right)$$

where $z$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. For more information, see Fleiss, Levin, and Paik (2003) and Agresti (2013).
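
The practical benefit of the 0.5 adjustment is that the limits stay defined when a cell is empty. A minimal Python sketch, with a hypothetical function name and hypothetical counts:

```python
from math import exp, sqrt
from statistics import NormalDist

def or_wald_modified_cl(n11, n12, n21, n22, alpha=0.05):
    """Wald modified estimate and limits: every cell count gets +0.5."""
    a, b, c, d = (x + 0.5 for x in (n11, n12, n21, n22))
    or_m = (a * d) / (b * c)
    v = 1 / a + 1 / b + 1 / c + 1 / d
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sqrt(v)
    return or_m, or_m * exp(-half_width), or_m * exp(half_width)

# A table with an empty (1,1) cell: the unmodified Wald variance is undefined here.
or_m, lower, upper = or_wald_modified_cl(0, 10, 5, 5)
```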

Score Confidence Limits
Score confidence limits for the odds ratio (Miettinen and Nurminen 1985) are computed by inverting score tests for the odds ratio. A score-based chi-square test statistic for the null hypothesis that the odds ratio is $\theta$ can be expressed as

$$Q(\theta) = \frac{\left\{ n_{1\cdot} (\hat{p}_1 - \tilde{p}_1) \right\}^2}{\left\{ n/(n-1) \right\} \left\{ \dfrac{1}{n_{1\cdot}\, \tilde{p}_1 (1 - \tilde{p}_1)} + \dfrac{1}{n_{2\cdot}\, \tilde{p}_2 (1 - \tilde{p}_2)} \right\}^{-1}}$$

where $\hat{p}_1$ is the observed row 1 risk ($n_{11}/n_{1\cdot}$), and $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of the row 1 and row 2 risks under the restriction that the odds ratio ($n_{11} n_{22} / n_{12} n_{21}$) is $\theta$. For more information, see Miettinen and Nurminen (1985) and Miettinen (1985, chapter 14).

The $100(1 - \alpha)\%$ score confidence interval for the odds ratio consists of all values of $\theta$ for which the test statistic $Q(\theta)$ falls in the acceptance region,

$$\left\{ \theta : Q(\theta) < \chi^2_{1,\alpha} \right\}$$

where $\chi^2_{1,\alpha}$ is the $100(1 - \alpha)$th percentile of the chi-square distribution with 1 degree of freedom. For more information about score confidence limits, see Agresti (2013).

By default, the score confidence limits include the bias correction factor $n/(n-1)$ in the denominator of $Q(\theta)$ (Miettinen and Nurminen 1985, p. 217). If you specify the CL=SCORE(CORRECT=NO) option, PROC FREQ does not include this factor in the computation.

The maximum likelihood estimates of $p_1$ and $p_2$, subject to the constraint that the odds ratio is $\theta$, are computed as

$$\tilde{p}_2 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad \tilde{p}_1 = \frac{\tilde{p}_2\, \theta}{1 + \tilde{p}_2 (\theta - 1)}$$

where

$$\begin{aligned} a &= n_{2\cdot} (\theta - 1) \\ b &= n_{1\cdot}\, \theta + n_{2\cdot} - n_{\cdot 1} (\theta - 1) \\ c &= -n_{\cdot 1} \end{aligned}$$

and $n_{\cdot 1} = n_{11} + n_{21}$ is the column 1 total. With these coefficients, setting $\theta$ to the observed odds ratio reproduces the observed risks $\hat{p}_1$ and $\hat{p}_2$.

For more information, see Miettinen and Nurminen (1985, pp. 217–218) and Miettinen (1985, chapter 14).
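
The inversion described in this subsection can be sketched in Python as follows. This is an illustrative sketch, not PROC FREQ's algorithm: the function names, the bracketing-plus-bisection root search, and the example counts are all assumptions. The coefficients b and c use the column 1 total (n11 + n21), which is the choice that makes the restricted estimates equal the observed risks when theta is the observed odds ratio. The chi-square percentile with 1 degree of freedom is obtained as the square of a standard normal percentile.

```python
from math import sqrt
from statistics import NormalDist

def constrained_risks(table, theta):
    """Estimates (p1, p2) of the row risks when the odds ratio is fixed at theta."""
    n11, n12, n21, n22 = table
    n1, n2, m1 = n11 + n12, n21 + n22, n11 + n21   # row totals, column 1 total
    if theta == 1:
        p2 = m1 / (n1 + n2)      # a = 0: the quadratic reduces to b*p2 + c = 0
        return p2, p2
    a = n2 * (theta - 1)
    b = n1 * theta + n2 - m1 * (theta - 1)
    c = -m1
    p2 = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
    return p2 * theta / (1 + p2 * (theta - 1)), p2

def score_q(table, theta, correct=True):
    """Score statistic Q(theta); correct=True keeps the n/(n-1) factor."""
    n11, n12, n21, n22 = table
    n1, n2 = n11 + n12, n21 + n22
    n = n1 + n2
    p1, p2 = constrained_risks(table, theta)
    info = 1 / (n1 * p1 * (1 - p1)) + 1 / (n2 * p2 * (1 - p2))
    q = (n1 * (n11 / n1 - p1)) ** 2 * info
    return q * (n - 1) / n if correct else q

def score_cl(table, alpha=0.05):
    """Invert Q(theta) < critical value; assumes four nonzero cells."""
    n11, n12, n21, n22 = table
    est = n11 * n22 / (n12 * n21)
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2   # chi-square(1) percentile
    limits = []
    for step in (0.5, 2.0):                # search below, then above, the estimate
        inner, outer = est, est * step
        while score_q(table, outer) < crit:            # expand until rejected
            inner, outer = outer, outer * step
        for _ in range(100):                           # bisection on the log scale
            mid = sqrt(inner * outer)
            if score_q(table, mid) < crit:
                inner = mid
            else:
                outer = mid
        limits.append(sqrt(inner * outer))
    return tuple(limits)

table = (10, 20, 5, 25)                    # hypothetical counts, OR = 2.5
lower, upper = score_cl(table)
```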

Likelihood Ratio Confidence Limits
Likelihood ratio (profile likelihood) confidence limits for the odds ratio are computed by inverting likelihood ratio tests. The likelihood ratio test statistic for the null hypothesis that the odds ratio is $\theta$ can be expressed as

$$G^2(\theta) = 2 \left( n_{11} \log\frac{\hat{p}_1}{\tilde{p}_1} + n_{12} \log\frac{1 - \hat{p}_1}{1 - \tilde{p}_1} + n_{21} \log\frac{\hat{p}_2}{\tilde{p}_2} + n_{22} \log\frac{1 - \hat{p}_2}{1 - \tilde{p}_2} \right)$$

where $\hat{p}_i$ is the observed row $i$ risk ($n_{i1}/n_{i\cdot}$) and $\tilde{p}_i$ is the maximum likelihood estimate of the row $i$ risk under the restriction that the odds ratio is $\theta$. The computation of the maximum likelihood estimates is described in the subsection "Score Confidence Limits" in this section. For more information, see Agresti (2013), Miettinen and Nurminen (1985), and Miettinen (1985, chapter 14).

The $100(1 - \alpha)\%$ likelihood ratio confidence interval for the odds ratio consists of all values of $\theta$ for which the test statistic $G^2(\theta)$ falls in the acceptance region,

$$\left\{ \theta : G^2(\theta) < \chi^2_{1,\alpha} \right\}$$

where $\chi^2_{1,\alpha}$ is the $100(1 - \alpha)$th percentile of the chi-square distribution with 1 degree of freedom.
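
A short Python sketch of the likelihood ratio statistic and the acceptance-region check follows. It is illustrative only: the function names and counts are hypothetical, the restricted estimates use the column 1 total (n11 + n21) in the quadratic coefficients (an assumption that reproduces the observed risks at the observed odds ratio), and terms with a zero cell count are dropped by the usual convention that 0 log 0 = 0.

```python
from math import log, sqrt
from statistics import NormalDist

def constrained_risks(table, theta):
    """Estimates (p1, p2) of the row risks when the odds ratio is fixed at theta."""
    n11, n12, n21, n22 = table
    n1, n2, m1 = n11 + n12, n21 + n22, n11 + n21
    if theta == 1:
        p2 = m1 / (n1 + n2)      # a = 0: the quadratic reduces to a linear equation
        return p2, p2
    a = n2 * (theta - 1)
    b = n1 * theta + n2 - m1 * (theta - 1)
    c = -m1
    p2 = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
    return p2 * theta / (1 + p2 * (theta - 1)), p2

def lr_g2(table, theta):
    """Likelihood ratio statistic G^2(theta) for the odds ratio."""
    n11, n12, n21, n22 = table
    n1, n2 = n11 + n12, n21 + n22
    p1_hat, p2_hat = n11 / n1, n21 / n2
    p1, p2 = constrained_risks(table, theta)
    terms = ((n11, p1_hat, p1), (n12, 1 - p1_hat, 1 - p1),
             (n21, p2_hat, p2), (n22, 1 - p2_hat, 1 - p2))
    return 2 * sum(k * log(obs / fit) for k, obs, fit in terms if k > 0)

table = (10, 20, 5, 25)                              # hypothetical counts
crit = NormalDist().inv_cdf(1 - 0.05 / 2) ** 2       # chi-square(1) percentile
one_is_inside = lr_g2(table, 1.0) < crit             # is theta = 1 in the interval?
```

The confidence limits themselves are the two values of theta at which `lr_g2` crosses `crit`, which can be located with any one-dimensional root finder.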

Exact Confidence Limits
PROC FREQ computes exact confidence limits for the odds ratio by inverting two one-sided (equal-tail) exact tests that are based on the noncentral hypergeometric distribution, where the distribution is conditional on the observed marginal totals of the $2 \times 2$ table. The exact confidence limits $\phi_1$ and $\phi_2$ are the solutions to the equations

$$\begin{aligned} \sum_{i=n_{11}}^{n_{\cdot 1}} f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_1) &= \alpha/2 \\ \sum_{i=0}^{n_{11}} f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_2) &= \alpha/2 \end{aligned}$$

where

$$f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi) = \binom{n_{1\cdot}}{i} \binom{n_{2\cdot}}{n_{\cdot 1} - i} \phi^i \Bigg/ \sum_{i=0}^{n_{\cdot 1}} \binom{n_{1\cdot}}{i} \binom{n_{2\cdot}}{n_{\cdot 1} - i} \phi^i$$

For more information, see Fleiss, Levin, and Paik (2003), Thomas (1971), and Gart (1971).

Because this is a discrete problem, the confidence coefficient for the exact confidence interval is not exactly $(1 - \alpha)$ but is at least $(1 - \alpha)$; thus, these confidence limits are conservative. For more information, see Agresti (1992).

When the odds ratio is 0, which occurs when either $n_{11} = 0$ or $n_{22} = 0$, PROC FREQ sets the lower exact confidence limit to 0 and determines the upper limit by using the level $\alpha$ (instead of $\alpha/2$). Similarly, when the odds ratio is infinity, which occurs when either $n_{12} = 0$ or $n_{21} = 0$, PROC FREQ sets the upper exact confidence limit to infinity and determines the lower limit by using level $\alpha$.
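
The exact limits, including the boundary rule for odds ratios of 0 or infinity, can be sketched in Python as below. The helper names, the doubling-then-bisection root search, and the example tables are assumptions made for illustration; PROC FREQ's own numerical method is not described here. The sketch detects the boundary cases by checking whether the observed (1,1) count sits at the smallest or largest value that the margins allow, which is equivalent to the zero-cell conditions stated above.

```python
from math import comb, inf, sqrt

def pmf(table, phi):
    """Noncentral hypergeometric probabilities of the (1,1) count given the margins."""
    n11, n12, n21, n22 = table
    n1, n2, m1 = n11 + n12, n21 + n22, n11 + n21
    support = range(max(0, m1 - n2), min(n1, m1) + 1)  # where both binomials are nonzero
    w = {i: comb(n1, i) * comb(n2, m1 - i) * phi ** i for i in support}
    total = sum(w.values())
    return {i: x / total for i, x in w.items()}

def upper_tail(table, phi):
    return sum(p for i, p in pmf(table, phi).items() if i >= table[0])

def lower_tail(table, phi):
    return sum(p for i, p in pmf(table, phi).items() if i <= table[0])

def solve_increasing(g, target):
    """Solve g(phi) = target for an increasing g by bisection on the log scale."""
    lo = hi = 1.0
    while g(lo) > target:
        lo /= 2
    while g(hi) < target:
        hi *= 2
    for _ in range(200):
        mid = sqrt(lo * hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return sqrt(lo * hi)

def exact_or_cl(table, alpha=0.05):
    support = sorted(pmf(table, 1.0))
    at_min = table[0] == support[0]     # n11 = 0 or n22 = 0: odds ratio is 0
    at_max = table[0] == support[-1]    # n12 = 0 or n21 = 0: odds ratio is infinite
    level = alpha if (at_min or at_max) else alpha / 2
    lower = 0.0 if at_min else solve_increasing(lambda f: upper_tail(table, f), level)
    upper = inf if at_max else solve_increasing(lambda f: -lower_tail(table, f), -level)
    return lower, upper

table = (10, 20, 5, 25)        # hypothetical counts, OR = 2.5
zero_table = (0, 10, 5, 5)     # hypothetical counts with n11 = 0
lower, upper = exact_or_cl(table)
zero_lower, zero_upper = exact_or_cl(zero_table)
```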

Exact Mid-p Confidence Limits
PROC FREQ computes exact mid-p confidence limits for the odds ratio by inverting two one-sided hypergeometric tests that include mid-p tail areas. The mid-p approach replaces the probability of the observed table with half of that probability in the hypergeometric probability sums, which are described in the subsection "Exact Confidence Limits" in this section. The exact mid-p confidence limits $\phi_1$ and $\phi_2$ are the solutions to the equations

$$\begin{aligned} \sum_{i=n_{11}+1}^{n_{\cdot 1}} f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_1) + \tfrac{1}{2}\, f(n_{11} : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_1) &= \alpha/2 \\ \sum_{i=0}^{n_{11}-1} f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_2) + \tfrac{1}{2}\, f(n_{11} : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi_2) &= \alpha/2 \end{aligned}$$

where

$$f(i : n_{\cdot 1}, n_{1\cdot}, n_{2\cdot}, \phi) = \binom{n_{1\cdot}}{i} \binom{n_{2\cdot}}{n_{\cdot 1} - i} \phi^i \Bigg/ \sum_{i=0}^{n_{\cdot 1}} \binom{n_{1\cdot}}{i} \binom{n_{2\cdot}}{n_{\cdot 1} - i} \phi^i$$

For more information, see Agresti (2013).

When the odds ratio is 0, which occurs when either $n_{11} = 0$ or $n_{22} = 0$, PROC FREQ sets the lower exact confidence limit to 0 and determines the upper limit by using the level $\alpha$ (instead of $\alpha/2$). Similarly, when the odds ratio is infinity, which occurs when either $n_{12} = 0$ or $n_{21} = 0$, PROC FREQ sets the upper exact confidence limit to infinity and determines the lower limit by using level $\alpha$.
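
To show the effect of the mid-p adjustment, the following Python sketch computes conditional limits with and without halving the probability of the observed table. Names, the root search, and the counts are hypothetical, and the sketch omits the special handling for odds ratios of 0 or infinity, so it assumes the observed (1,1) count is strictly inside the range allowed by the margins.

```python
from math import comb, sqrt

def tails(table, phi, midp):
    """Upper and lower tail areas at odds ratio phi; midp halves the observed term."""
    n11, n12, n21, n22 = table
    n1, n2, m1 = n11 + n12, n21 + n22, n11 + n21
    w = {i: comb(n1, i) * comb(n2, m1 - i) * phi ** i
         for i in range(max(0, m1 - n2), min(n1, m1) + 1)}
    total = sum(w.values())
    point = (0.5 if midp else 1.0) * w[n11] / total
    upper = sum(x for i, x in w.items() if i > n11) / total + point
    lower = sum(x for i, x in w.items() if i < n11) / total + point
    return upper, lower

def solve_increasing(g, target):
    """Solve g(phi) = target for an increasing g by bisection on the log scale."""
    lo = hi = 1.0
    while g(lo) > target:
        lo /= 2
    while g(hi) < target:
        hi *= 2
    for _ in range(200):
        mid = sqrt(lo * hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return sqrt(lo * hi)

def conditional_or_cl(table, alpha=0.05, midp=True):
    lower = solve_increasing(lambda f: tails(table, f, midp)[0], alpha / 2)
    upper = solve_increasing(lambda f: -tails(table, f, midp)[1], -alpha / 2)
    return lower, upper

table = (10, 20, 5, 25)                          # hypothetical counts
midp_cl = conditional_or_cl(table)               # mid-p limits
exact_cl = conditional_or_cl(table, midp=False)  # unadjusted exact limits
```

Because each mid-p tail area is smaller than the corresponding exact tail area, the mid-p interval lies inside the exact interval.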

Relative Risks

Relative risks are useful measures in cohort (prospective) study designs, where two samples are identified based on the presence or absence of an explanatory factor. The two samples are observed in future time for the binary (yes-no) response variable under study. Relative risks are also useful in cross-sectional studies, where two variables are observed simultaneously. For more information, see Stokes, Davis, and Koch (2012) and Agresti (2007).

The relative risk is the ratio of the row 1 risk to the row 2 risk in a $2 \times 2$ table. The column 1 risk in row 1 is the proportion of row 1 observations that are classified in column 1, which can be expressed as

$$p_1 = n_{11} / n_{1\cdot}$$

Similarly, the column 1 risk in row 2 is

$$p_2 = n_{21} / n_{2\cdot}$$

The column 1 relative risk is defined as

$$R = p_1 / p_2$$

A relative risk greater than 1 indicates that the probability of positive response is greater in row 1 than in row 2. Similarly, a relative risk less than 1 indicates that the probability of positive response is less in row 1 than in row 2. The strength of association increases as the deviation from 1 increases.

Confidence Limits for the Relative Risk

PROC FREQ provides the following types of confidence limits for the relative risk: exact unconditional, likelihood ratio, score, Wald, and Wald modified.

Wald Confidence Limits
The asymptotic Wald confidence limits are based on a log transformation of the relative risk. PROC FREQ computes the Wald confidence limits for the column 1 relative risk as

$$\left( \hat{r} \times \exp(-z\sqrt{v}),\; \hat{r} \times \exp(z\sqrt{v}) \right)$$

where $\hat{r}$ is the observed value of the relative risk, $\hat{p}_1 / \hat{p}_2$, and

$$v = \mathrm{Var}(\log(\hat{r})) = \frac{1}{n_{11}} + \frac{1}{n_{21}} - \frac{1}{n_{1\cdot}} - \frac{1}{n_{2\cdot}}$$

and $z$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The confidence level $\alpha$ is determined by the ALPHA= option in the TABLES statement; by default, ALPHA=0.05, which produces 95% confidence limits. If either cell frequency $n_{11}$ or $n_{21}$ is 0, $v$ is undefined and the Wald confidence limits cannot be computed.

PROC FREQ computes the confidence limits for the column 2 relative risk in the same way.
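
The column 1 computation can be sketched in Python as follows. The function name and the counts are hypothetical; the sketch only mirrors the formulas in this subsection.

```python
from math import exp, sqrt
from statistics import NormalDist

def rr_wald_cl(n11, n12, n21, n22, alpha=0.05):
    """Column 1 relative risk with Wald limits built on the log scale."""
    if n11 == 0 or n21 == 0:
        # v contains 1/n11 and 1/n21, so it is undefined in this case.
        raise ValueError("Wald limits require n11 > 0 and n21 > 0")
    n1, n2 = n11 + n12, n21 + n22
    r_hat = (n11 / n1) / (n21 / n2)
    v = 1 / n11 + 1 / n21 - 1 / n1 - 1 / n2
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sqrt(v)
    return r_hat, r_hat * exp(-half_width), r_hat * exp(half_width)

r_hat, lower, upper = rr_wald_cl(10, 20, 5, 25)   # hypothetical counts
```

For the column 2 relative risk, the same function can be called with the two columns swapped, for example `rr_wald_cl(20, 10, 25, 5)`.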

Wald Modified Confidence Limits
PROC FREQ computes Wald modified confidence limits (Haldane 1956) for the relative risk by replacing each $n_{ij}$ with $(n_{ij} + 0.5)$ and each $n_{i\cdot}$ with $(n_{i\cdot} + 0.5)$ in the estimate and variance as follows:

$$\hat{r}_m = \frac{(n_{11} + 0.5) / (n_{1\cdot} + 0.5)}{(n_{21} + 0.5) / (n_{2\cdot} + 0.5)}$$

$$v = \mathrm{Var}(\log(\hat{r}_m)) = \frac{1}{n_{11} + 0.5} + \frac{1}{n_{21} + 0.5} - \frac{1}{n_{1\cdot} + 0.5} - \frac{1}{n_{2\cdot} + 0.5}$$

The confidence limits are computed as

$$\left( \hat{r}_m \times \exp(-z\sqrt{v}),\; \hat{r}_m \times \exp(z\sqrt{v}) \right)$$

where $z$ is the $100(1 - \alpha/2)$th percentile of the standard normal distribution. For more information, see Fleiss, Levin, and Paik (2003) and Agresti (2013).

Score Confidence Limits
Score confidence limits (Miettinen and Nurminen 1985; Farrington and Manning 1990) are computed by inverting score tests for the relative risk. A score-based chi-square test statistic for the null hypothesis that the relative risk is $r_0$ can be expressed as

$$Q(r_0) = \frac{(\hat{p}_1 - r_0\, \hat{p}_2)^2}{\widetilde{\mathrm{Var}}(r_0)}$$

where $\hat{p}_1$ and $\hat{p}_2$ are the observed row 1 and row 2 risks (proportions), respectively, and

$$\widetilde{\mathrm{Var}}(r_0) = \frac{n}{n-1} \left( \frac{\tilde{p}_1 (1 - \tilde{p}_1)}{n_{1\cdot}} + r_0^2\, \frac{\tilde{p}_2 (1 - \tilde{p}_2)}{n_{2\cdot}} \right)$$

Here $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of $p_1$ and $p_2$, respectively, under the null hypothesis that the relative risk is $r_0$. For more information, see Miettinen and Nurminen (1985) and Miettinen (1985, chapter 13).

The $100(1 - \alpha)\%$ score confidence interval for the relative risk consists of all values of $r_0$ for which the test statistic $Q(r_0)$ falls in the acceptance region,

$$\left\{ r_0 : Q(r_0) < \chi^2_{1,\alpha} \right\}$$

where $\chi^2_{1,\alpha}$ is the $100(1 - \alpha)$th percentile of the chi-square distribution with 1 degree of freedom. For more information, see Agresti (2013).

By default, the score confidence limits include the bias correction factor $n/(n-1)$ in the denominator of $Q(r_0)$ (Miettinen and Nurminen 1985, p. 217). If you specify the CL=SCORE(CORRECT=NO) option, PROC FREQ does not include this factor in the computation.

The maximum likelihood estimates of $p_1$ and $p_2$, subject to the constraint that the relative risk is $r_0$, are computed as

$$\tilde{p}_1 = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad \tilde{p}_2 = \tilde{p}_1 / r_0$$

where

$$\begin{aligned} a &= 1 + \theta \\ b &= -\left( r_0 (1 + \theta\, \hat{p}_2) + \theta + \hat{p}_1 \right) \\ c &= r_0 (\hat{p}_1 + \theta\, \hat{p}_2) \\ \theta &= n_{2\cdot} / n_{1\cdot} \end{aligned}$$

For more information, see Farrington and Manning (1990, p. 1454) and Miettinen and Nurminen (1985, p. 217).
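
The following Python sketch puts the pieces of this subsection together: the restricted estimates, the statistic Q(r0), and a numerical inversion. The function names, the doubling-then-bisection search, and the counts are hypothetical choices for illustration, not PROC FREQ's implementation. The chi-square percentile with 1 degree of freedom is computed as the square of a standard normal percentile.

```python
from math import sqrt
from statistics import NormalDist

def constrained_risks_rr(table, r0):
    """Estimates (p1, p2) of the row risks when the relative risk is fixed at r0."""
    n11, n12, n21, n22 = table
    n1, n2 = n11 + n12, n21 + n22
    p1_hat, p2_hat = n11 / n1, n21 / n2
    theta = n2 / n1
    a = 1 + theta
    b = -(r0 * (1 + theta * p2_hat) + theta + p1_hat)
    c = r0 * (p1_hat + theta * p2_hat)
    p1 = (-b - sqrt(b * b - 4 * a * c)) / (2 * a)
    return p1, p1 / r0

def score_q_rr(table, r0, correct=True):
    """Score statistic Q(r0); correct=True keeps the n/(n-1) factor."""
    n11, n12, n21, n22 = table
    n1, n2 = n11 + n12, n21 + n22
    n = n1 + n2
    p1, p2 = constrained_risks_rr(table, r0)
    var = p1 * (1 - p1) / n1 + r0 ** 2 * p2 * (1 - p2) / n2
    if correct:
        var *= n / (n - 1)
    return (n11 / n1 - r0 * n21 / n2) ** 2 / var

def score_cl_rr(table, alpha=0.05):
    """Invert Q(r0) < critical value; assumes nonzero n11 and n21."""
    n11, n12, n21, n22 = table
    est = (n11 / (n11 + n12)) / (n21 / (n21 + n22))
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    limits = []
    for step in (0.5, 2.0):                # search below, then above, the estimate
        inner, outer = est, est * step
        while score_q_rr(table, outer) < crit:
            inner, outer = outer, outer * step
        for _ in range(100):               # bisection on the log scale
            mid = sqrt(inner * outer)
            if score_q_rr(table, mid) < crit:
                inner = mid
            else:
                outer = mid
        limits.append(sqrt(inner * outer))
    return tuple(limits)

table = (10, 20, 5, 25)                    # hypothetical counts, relative risk = 2
lower, upper = score_cl_rr(table)
```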

Likelihood Ratio Confidence Limits
Likelihood ratio (profile likelihood) confidence limits for the relative risk are computed by inverting likelihood ratio tests. The likelihood ratio test statistic for the null hypothesis that the relative risk is $r_0$ can be expressed as

$$G^2(r_0) = 2 \left( n_{11} \log\frac{\hat{p}_1}{\tilde{p}_1} + n_{12} \log\frac{1 - \hat{p}_1}{1 - \tilde{p}_1} + n_{21} \log\frac{\hat{p}_2}{\tilde{p}_2} + n_{22} \log\frac{1 - \hat{p}_2}{1 - \tilde{p}_2} \right)$$

where $\hat{p}_i$ is the observed row $i$ risk ($n_{i1}/n_{i\cdot}$) and $\tilde{p}_i$ is the maximum likelihood estimate of the row $i$ risk under the restriction that the relative risk is $r_0$. Expressions for the maximum likelihood estimates $\tilde{p}_1$ and $\tilde{p}_2$ are given in the subsection "Score Confidence Limits" in this section. For more information, see Miettinen and Nurminen (1985) and Miettinen (1985, chapter 13).

The $100(1 - \alpha)\%$ likelihood ratio confidence interval for the relative risk consists of all values of $r_0$ for which the test statistic $G^2(r_0)$ falls in the acceptance region,

$$\left\{ r_0 : G^2(r_0) < \chi^2_{1,\alpha} \right\}$$

where $\chi^2_{1,\alpha}$ is the $100(1 - \alpha)$th percentile of the chi-square distribution with 1 degree of freedom.

Exact Unconditional Confidence Limits
If you specify the RELRISK option in the EXACT statement, PROC FREQ provides exact unconditional confidence limits for the relative risk. The exact unconditional approach fixes the row margins of the $2 \times 2$ table and eliminates the nuisance parameter $p_2$ by using the maximum p-value (worst-case scenario) over all possible values of $p_2$ (Santner and Snell 1980). The conditional approach, which is described in the section Exact Statistics, does not apply to the relative risk because of the nuisance parameter (Agresti 1992).

By default, PROC FREQ computes the confidence limits by the tail method, which inverts two separate one-sided exact tests of the relative risk, where the tests are based on the score statistic (Chan and Zhang 1999). The size of each one-sided exact test is at most $\alpha/2$, and the confidence coefficient is at least $(1 - \alpha)$. If you specify the RELRISK(METHOD=NOSCORE) option in the EXACT statement, PROC FREQ computes the confidence limits by inverting two separate one-sided exact tests that are based on the unstandardized relative risk. If you specify the RELRISK(METHOD=SCORE2) option in the EXACT statement, PROC FREQ computes the confidence limits by inverting a single two-sided exact test that is based on the score statistic (Agresti and Min 2001).

PROC FREQ uses the relative risk score statistic (or the modified form of the unstandardized relative risk) to compute the exact confidence limits as described in the subsection "Exact Unconditional Confidence Limits" in the section Confidence Limits for the Risk Difference.

The score statistic is a less discrete statistic than the unstandardized relative risk and produces less conservative confidence limits (Agresti and Min 2001). For more information, see Santner et al. (2007). The relative risk score statistic (Miettinen and Nurminen 1985; Farrington and Manning 1990) is computed as

$$z(r_0) = (\hat{p}_1 - r_0\, \hat{p}_2) / \mathrm{se}(r_0)$$

where

$$\mathrm{se}(r_0) = \sqrt{ \tilde{p}_1 (1 - \tilde{p}_1) / n_{1\cdot} + r_0^2\, \tilde{p}_2 (1 - \tilde{p}_2) / n_{2\cdot} }$$

and $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of $p_1$ and $p_2$ under the restriction that the relative risk is $r_0$. Expressions for the maximum likelihood estimates $\tilde{p}_1$ and $\tilde{p}_2$ are given in the subsection "Score Confidence Limits" in this section. For more information, see Farrington and Manning (1990, p. 1454) and Miettinen and Nurminen (1985, p. 217).

When the confidence limits are computed by using the unstandardized relative risk as the test statistic (METHOD=NOSCORE), PROC FREQ uses a modified form of the relative risk to ensure that the statistic is defined when there are zero-frequency table cells. The modified form adds 0.5 to the table cell and row frequencies (Gart and Nam 1988) and is computed as

$$\hat{r}_m = \frac{(n_{11} + 0.5) / (n_{1\cdot} + 0.5)}{(n_{21} + 0.5) / (n_{2\cdot} + 0.5)}$$

For more information, see the subsection "Wald Modified Confidence Limits" in this section.

Relative Risk Tests

PROC FREQ provides tests of equality, noninferiority, superiority, and equivalence for the relative risk. The following analysis methods are available: Wald (which is based on a log transformation), Wald modified, score, and likelihood ratio. You can specify the method by using the METHOD= relrisk-option; by default, PROC FREQ provides Wald tests.

Equality Test
An equality test for the relative risk can be expressed as

$$H_0 : R = r_0$$

versus the alternative

$$H_a : R \ne r_0$$

where $R = p_1 / p_2$ denotes the relative risk (for column 1 or column 2) and $r_0$ denotes the null value. You can specify a null value by using the EQUAL(NULL=) relrisk-option; by default, the null value is 1.

The test statistic is computed by the method that you specify; by default, PROC FREQ uses the Wald test. For information about test statistic computation, see the subsections "Wald Test," "Wald Modified Test," "Farrington-Manning (Score) Test," and "Likelihood Ratio Test" in this section.

For the Wald and score methods, the test statistics $z$ have standard normal distributions under the null hypothesis. For the likelihood ratio test, the test statistic $G^2$ has a chi-square distribution with 1 degree of freedom under the null hypothesis.

When the test statistic $z$ is greater than 0, PROC FREQ displays the right-sided p-value, which is the probability of a larger value occurring under the null hypothesis. Otherwise, the one-sided p-value is the left-sided p-value, which is the probability of a smaller value occurring under the null hypothesis. The one-sided p-value can be expressed as

$$P_1 = \begin{cases} \mathrm{Prob}(Z > z) & \text{if } z > 0 \\ \mathrm{Prob}(Z < z) & \text{if } z \le 0 \end{cases}$$

where $Z$ has a standard normal distribution. The two-sided p-value is computed as $P_2 = 2 \times P_1$.
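
The p-value rule above translates directly into code. In this Python sketch the function name is hypothetical, and the statistic z is taken as given, because its computation depends on the method that you choose.

```python
from statistics import NormalDist

def equality_p_values(z):
    """One-sided and two-sided p-values for a standard normal test statistic z."""
    std_normal = NormalDist()
    if z > 0:
        p_one = 1 - std_normal.cdf(z)   # right-sided: Prob(Z > z)
    else:
        p_one = std_normal.cdf(z)       # left-sided: Prob(Z < z)
    return p_one, 2 * p_one

p_one, p_two = equality_p_values(1.96)
```

The one-sided value never exceeds 0.5, so the two-sided value never exceeds 1.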

Noninferiority Test
A noninferiority test for the relative risk can be expressed as

upper H 0 colon upper R less-than-or-equal-to delta

versus the alternative

upper H Subscript a Baseline colon upper R greater-than delta

where upper R equals p 1 slash p 2 denotes the relative risk (for column 1 or column 2) and delta denotes the noninferiority margin (limit). You can specify the margin by using the MARGIN= relrisk-option; by default, the noninferiority margin is 0.8. The noninferiority margin for a relative risk test should be less than 1. Rejection of the null hypothesis indicates that the row 1 risk is not inferior to the row 2 risk. For more information, see Chow, Shao, and Wang (2008).

The test statistic $z$ is computed by the method that you specify, using the noninferiority margin (limit) as the null value of the relative risk. For information about test statistic computation, see the subsections "Wald Test," "Wald Modified Test," "Farrington-Manning (Score) Test," and "Likelihood Ratio Test" in this section. Under the null hypothesis, the test statistic has a standard normal distribution. The p-value for the noninferiority test is the right-sided p-value (the probability that $Z > z$).

As part of the noninferiority analysis, PROC FREQ also provides confidence limits for the relative risk. The confidence coefficient is $100(1 - 2\alpha)\%$ (Schuirmann 1999). The confidence level $\alpha$ is determined by the ALPHA= option in the TABLES statement; by default, ALPHA=0.05, which produces 90% confidence limits for the noninferiority analysis. You can compare the confidence limits to the value of the noninferiority limit $\delta$.

Superiority Test
A superiority test for the relative risk can be expressed as

$$H_0\colon R \le \delta$$

versus the alternative

$$H_a\colon R > \delta$$

where $R = p_1 / p_2$ denotes the relative risk (for column 1 or column 2) and $\delta$ denotes the superiority margin (limit). You can specify the margin by using the MARGIN= relrisk-option; by default, the superiority margin is 1.25. The superiority margin for a relative risk test should be greater than 1. Rejection of the null hypothesis indicates that the row 1 risk is superior to the row 2 risk. For more information, see Chow, Shao, and Wang (2008).

The test statistic $z$ is computed by using the superiority margin (limit) as the null value of the relative risk. Under the null hypothesis, the test statistic has a standard normal distribution. The p-value for the superiority test is the right-sided p-value (the probability that $Z > z$).

The computations for the superiority analysis are the same as the computations for the noninferiority analysis, which are described in the subsection "Noninferiority Test" in this section.

Equivalence Test
An equivalence test for the relative risk can be expressed as

$$H_0\colon R \le \delta_L \ \text{ or } \ R \ge \delta_U$$

versus the alternative

$$H_a\colon \delta_L < R < \delta_U$$

where $\delta_L$ is the lower margin and $\delta_U$ is the upper margin. Rejection of the null hypothesis indicates that the two risks are equivalent. For more information, see Chow, Shao, and Wang (2008).

You can specify the margins by using the MARGIN= relrisk-option; by default, the lower margin is 0.8 and the upper margin is 1.25. If you specify a single margin value, PROC FREQ uses this value as the lower margin for the equivalence test and computes the upper margin as the inverse of the lower margin.

PROC FREQ computes two one-sided tests (TOST) for equivalence analysis (Schuirmann 1987), which include a right-sided test for the lower margin $\delta_L$ and a left-sided test for the upper margin $\delta_U$. The lower test statistic uses the lower margin as the null relative risk value, and the p-value is the right-sided probability ($Z > z_L$). The upper test statistic uses the upper margin as the null value, and the p-value is the left-sided probability ($Z < z_U$). The overall p-value is taken to be the larger of the two p-values for the lower and upper tests.

The test statistics are computed by the method that you specify. For more information about the test statistic computation, see the subsections "Wald Test," "Wald Modified Test," "Farrington-Manning (Score) Test," and "Likelihood Ratio Test" in this section.

As part of the equivalence analysis, PROC FREQ also provides confidence limits for the relative risk. The confidence coefficient is $100(1 - 2\alpha)\%$ (Schuirmann 1999). The confidence level $\alpha$ is determined by the ALPHA= option in the TABLES statement; by default, ALPHA=0.05, which produces 90% confidence limits for the equivalence analysis. You can compare the confidence limits to the equivalence limits, which are $\delta_L$ and $\delta_U$.
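The two one-sided tests can be illustrated with a short Python sketch. It assumes that the lower and upper statistics (`z_lower`, `z_upper`) have already been computed at the margins by one of the methods in this section. The function name `tost_p_values`, the returned dictionary keys, and the rule that compares the overall p-value with `alpha` are illustrative assumptions, not PROC FREQ output.

```python
from statistics import NormalDist


def tost_p_values(z_lower, z_upper, alpha=0.05):
    """Combine the lower and upper one-sided tests (TOST) for equivalence.

    z_lower: statistic computed with the lower margin as the null value.
    z_upper: statistic computed with the upper margin as the null value.
    """
    std_normal = NormalDist()
    p_lower = 1.0 - std_normal.cdf(z_lower)  # right-sided: Prob(Z > z_L)
    p_upper = std_normal.cdf(z_upper)        # left-sided:  Prob(Z < z_U)
    p_overall = max(p_lower, p_upper)        # larger of the two p-values
    return {
        "p_lower": p_lower,
        "p_upper": p_upper,
        "p_overall": p_overall,
        # Assumed decision rule: declare equivalence when p_overall < alpha.
        "equivalent": p_overall < alpha,
        # Confidence coefficient 100(1 - 2*alpha)% used with the analysis.
        "conf_percent": 100.0 * (1.0 - 2.0 * alpha),
    }
```

With the default `alpha=0.05`, the `conf_percent` entry is 90, which matches the 90% confidence limits described above.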

Wald Test
The Wald test statistic (which is based on a log transformation of the relative risk) is computed as $z(r_0) = (\log(\hat{r}) - \log(r_0)) / \sqrt{v}$, where $\hat{r}$ is the relative risk estimate ($\hat{p}_1 / \hat{p}_2$), $r_0$ is the null value of the relative risk, and

$$v = \mathrm{Var}(\log(\hat{r})) = 1/n_{11} + 1/n_{21} - 1/n_{1\cdot} - 1/n_{2\cdot}$$

The null value is determined by the type of test (equality, noninferiority, superiority, or equivalence) and the null or margin values that you specify. The side of the p-value and the interpretation of the test are also determined by the type of test; for more information, see the subsections "Equality Test," "Noninferiority Test," "Superiority Test," and "Equivalence Test" in this section.
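As a rough illustration of the Wald statistic, the following Python sketch evaluates $z(r_0)$ directly from the table counts. The function name `wald_z` and its argument names are hypothetical. The sketch assumes that `n11` and `n21` are positive, because the log and the variance are otherwise undefined.

```python
import math


def wald_z(n11, n1_total, n21, n2_total, r0=1.0):
    """Wald statistic for the relative risk on the log scale.

    n11, n21: column 1 counts in rows 1 and 2 (assumed > 0).
    n1_total, n2_total: row totals n1. and n2.
    r0: null value of the relative risk (a margin for the other test types).
    """
    r_hat = (n11 / n1_total) / (n21 / n2_total)  # p1_hat / p2_hat
    v = 1.0 / n11 + 1.0 / n21 - 1.0 / n1_total - 1.0 / n2_total
    return (math.log(r_hat) - math.log(r0)) / math.sqrt(v)
```

For a noninferiority analysis, for instance, the same function would be called with `r0` set to the margin, and the right-sided p-value would then be taken from the result.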

Wald Modified Test
The Wald modified test statistic is computed by replacing the $n_{ij}$ with $(n_{ij} + 0.5)$ and the $n_{i\cdot}$ with $(n_{i\cdot} + 0.5)$ in the relative risk estimate and variance. The test statistic is computed as $z(r_0) = (\log(\hat{r}_m) - \log(r_0)) / \sqrt{v}$, where $r_0$ is the null value of the relative risk,

$$\hat{r}_m = \frac{(n_{11} + 0.5) / (n_{1\cdot} + 0.5)}{(n_{21} + 0.5) / (n_{2\cdot} + 0.5)}$$

$$v = \mathrm{Var}(\log(\hat{r}_m)) = 1/(n_{11} + 0.5) + 1/(n_{21} + 0.5) - 1/(n_{1\cdot} + 0.5) - 1/(n_{2\cdot} + 0.5)$$

The null value is determined by the type of test (equality, noninferiority, superiority, or equivalence) and the null or margin values that you specify. The side of the p-value and the interpretation of the test are also determined by the type of test; for more information, see the subsections "Equality Test," "Noninferiority Test," "Superiority Test," and "Equivalence Test" in this section.
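A minimal Python sketch of the modified statistic follows. It applies the 0.5 adjustment to the cell counts and row totals exactly as in the formulas above; the function name `wald_modified_z` is hypothetical. One consequence of the adjustment is that the statistic remains defined when a column 1 cell count is 0.

```python
import math


def wald_modified_z(n11, n1_total, n21, n2_total, r0=1.0):
    """Wald modified statistic: add 0.5 to each count and row total."""
    a1, t1 = n11 + 0.5, n1_total + 0.5  # adjusted row 1 count and total
    a2, t2 = n21 + 0.5, n2_total + 0.5  # adjusted row 2 count and total
    r_hat_m = (a1 / t1) / (a2 / t2)     # modified relative risk estimate
    v = 1.0 / a1 + 1.0 / a2 - 1.0 / t1 - 1.0 / t2
    return (math.log(r_hat_m) - math.log(r0)) / math.sqrt(v)
```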

Farrington-Manning (Score) Test
The relative risk score test statistic (Miettinen and Nurminen 1985; Farrington and Manning 1990) for the null value $r_0$ is computed as

$$z(r_0) = (\hat{p}_1 - r_0 \hat{p}_2) / \mathrm{se}(r_0)$$

where

$$\mathrm{se}(r_0) = \sqrt{\tilde{p}_1 (1 - \tilde{p}_1) / n_{1\cdot} + r_0^2 \, \tilde{p}_2 (1 - \tilde{p}_2) / n_{2\cdot}}$$

where $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of $p_1$ and $p_2$ under the null value $r_0$. Expressions for the maximum likelihood estimates $\tilde{p}_1$ and $\tilde{p}_2$ are given in the subsection "Score Confidence Limits" in this section.

The null value is determined by the type of test (equality, noninferiority, superiority, or equivalence) and the null or margin values that you specify. The side of the p-value and the interpretation of the test are also determined by the type of test; for more information, see the subsections "Equality Test," "Noninferiority Test," "Superiority Test," and "Equivalence Test" in this section.
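The score statistic can be sketched in Python as follows. The expressions for the constrained estimates are not reproduced in this part of the section, so the sketch assumes the usual closed form obtained by maximizing the two-binomial likelihood subject to $p_1 = r_0 p_2$ (the smaller root of a quadratic in $p_2$). Treat the helper `constrained_mle` as an assumption to be checked against the "Score Confidence Limits" subsection; both function names are hypothetical.

```python
import math


def constrained_mle(n11, n1_total, n21, n2_total, r0):
    """Assumed closed form for the MLEs of (p1, p2) under p1 = r0 * p2."""
    n = n1_total + n2_total
    a = n * r0
    b = -(n1_total * r0 + n11 + n2_total + n21 * r0)
    c = n11 + n21
    p2 = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # smaller root
    return r0 * p2, p2


def score_z(n11, n1_total, n21, n2_total, r0=1.0):
    """Score statistic z(r0) = (p1_hat - r0 * p2_hat) / se(r0)."""
    p1_hat = n11 / n1_total
    p2_hat = n21 / n2_total
    p1_t, p2_t = constrained_mle(n11, n1_total, n21, n2_total, r0)
    se = math.sqrt(p1_t * (1.0 - p1_t) / n1_total
                   + r0 ** 2 * p2_t * (1.0 - p2_t) / n2_total)
    return (p1_hat - r0 * p2_hat) / se
```

When `r0` is 1, the constrained estimates reduce to the pooled proportion, which gives a quick sanity check on the helper.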

Likelihood Ratio Test
The likelihood ratio statistic for the null relative risk value $r_0$ is computed as

$$G^2(r_0) = 2 \left( n_{11} \log(\hat{p}_1 / \tilde{p}_1) + n_{12} \log((1 - \hat{p}_1) / (1 - \tilde{p}_1)) + n_{21} \log(\hat{p}_2 / \tilde{p}_2) + n_{22} \log((1 - \hat{p}_2) / (1 - \tilde{p}_2)) \right)$$

where $\tilde{p}_1$ and $\tilde{p}_2$ are the maximum likelihood estimates of $p_1$ and $p_2$ under the null value $r_0$. Expressions for the maximum likelihood estimates $\tilde{p}_1$ and $\tilde{p}_2$ are given in the subsection "Score Confidence Limits" in this section. For more information, see Miettinen and Nurminen (1985) and Miettinen (1985, chapter 13).

PROC FREQ computes the likelihood ratio test statistic $z(r_0)$ for the noninferiority, superiority, and equivalence tests as the signed square root $\sqrt{G^2(r_0)}$, where the sign is positive if the estimate is greater than or equal to the null value ($\hat{r} \ge r_0$) and negative otherwise ($\hat{r} < r_0$).

The null value is determined by the type of test (equality, noninferiority, superiority, or equivalence) and the null or margin values that you specify. The side of the p-value and the interpretation of the test are also determined by the type of test; for more information, see the subsections "Equality Test," "Noninferiority Test," "Superiority Test," and "Equivalence Test" in this section.
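The likelihood ratio computation can be sketched in Python as shown below. To keep the example short, the constrained estimates $\tilde{p}_1$ and $\tilde{p}_2$ are passed in as arguments and are assumed to have been computed elsewhere. The function names are hypothetical, and treating a term with a zero count as 0 is an assumption of this sketch rather than documented PROC FREQ behavior.

```python
import math


def likelihood_ratio_g2(n11, n12, n21, n22, p1_tilde, p2_tilde):
    """G^2(r0) from the 2x2 counts and the constrained MLEs under r0."""
    p1_hat = n11 / (n11 + n12)
    p2_hat = n21 / (n21 + n22)

    def term(count, unrestricted, restricted):
        # Assumption: a cell with count 0 contributes nothing to the sum.
        if count == 0:
            return 0.0
        return count * math.log(unrestricted / restricted)

    total = (term(n11, p1_hat, p1_tilde)
             + term(n12, 1.0 - p1_hat, 1.0 - p1_tilde)
             + term(n21, p2_hat, p2_tilde)
             + term(n22, 1.0 - p2_hat, 1.0 - p2_tilde))
    return 2.0 * total


def signed_root(g2, r_hat, r0):
    """Signed square root of G^2: positive if r_hat >= r0, else negative."""
    root = math.sqrt(max(g2, 0.0))  # guard against tiny negative rounding
    return root if r_hat >= r0 else -root
```

For the table with rows (30, 70) and (20, 80) and a null value of 1, both constrained estimates equal the pooled proportion 0.25, so `likelihood_ratio_g2(30, 70, 20, 80, 0.25, 0.25)` can be passed to `signed_root` with `r_hat=1.5` and `r0=1.0` to obtain a positive statistic.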

Last updated: December 09, 2022