The SURVEYFREQ Procedure

Odds Ratio and Relative Risks

The OR option provides estimates of the odds ratio, the column 1 relative risk, and the column 2 relative risk for $2 \times 2$ tables, together with their confidence limits.

Odds Ratio

For a $2 \times 2$ table, the odds of a positive (column 1) response in row 1 is $N_{11}/N_{12}$. Similarly, the odds of a positive response in row 2 is $N_{21}/N_{22}$. The odds ratio is formed as the ratio of the row 1 odds to the row 2 odds. The estimate of the odds ratio is computed as

\[
\widehat{\mathrm{OR}} = \frac{\hat{N}_{11} / \hat{N}_{12}}{\hat{N}_{21} / \hat{N}_{22}} = \frac{\hat{N}_{11}\,\hat{N}_{22}}{\hat{N}_{12}\,\hat{N}_{21}}
\]

The value of the odds ratio can be any nonnegative number. When the row and column variables are independent, the true value of the odds ratio is 1. An odds ratio greater than 1 indicates that the odds of a positive response are higher in row 1 than in row 2. An odds ratio less than 1 indicates that the odds of a positive response are higher in row 2 than in row 1. The strength of association increases with the deviation from 1. For more information, see Stokes, Davis, and Koch (2000) and Agresti (2007).
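The odds ratio estimate can be illustrated with a few lines of Python. This is only a sketch: the function name `odds_ratio` and the cell totals are hypothetical and stand in for the weighted cell totals that the procedure estimates from the survey data.

```python
def odds_ratio(n11, n12, n21, n22):
    """Estimated odds ratio from the four estimated cell totals of a 2x2 table."""
    if n12 == 0 or n21 == 0:
        raise ValueError("the odds ratio is undefined when N12 or N21 is zero")
    return (n11 * n22) / (n12 * n21)

# Hypothetical estimated cell totals (illustrative values, not from any real survey).
n11, n12, n21, n22 = 120.0, 80.0, 60.0, 140.0

row1_odds = n11 / n12          # odds of a column 1 response in row 1
row2_odds = n21 / n22          # odds of a column 1 response in row 2
or_hat = odds_ratio(n11, n12, n21, n22)

print(row1_odds, row2_odds, or_hat)
```

With these illustrative totals the estimate is greater than 1, so the odds of a column 1 response are higher in row 1 than in row 2.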

PROC SURVEYFREQ constructs confidence limits for the odds ratio by using the log transform. The $100(1-\alpha)\%$ confidence limits for the odds ratio are computed as

\[
\left( \widehat{\mathrm{OR}} \times \exp\!\left( -t_{df,\,\alpha/2} \sqrt{v} \right),\;\; \widehat{\mathrm{OR}} \times \exp\!\left( t_{df,\,\alpha/2} \sqrt{v} \right) \right)
\]

where

\[
v = \widehat{\mathrm{Var}}\!\left( \log \widehat{\mathrm{OR}} \right) = \widehat{\mathrm{Var}}\!\left( \widehat{\mathrm{OR}} \right) / \widehat{\mathrm{OR}}^{2}
\]

is the estimate of the variance of the log odds ratio and $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the $t$ distribution with $df$ degrees of freedom. (For more information, see the section Degrees of Freedom.) The value of the confidence coefficient $\alpha$ is determined by the ALPHA= option; by default, ALPHA=0.05, which produces 95% confidence limits.
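The log-transform limits can be sketched as follows. The helper name `log_transform_limits` and all input values are assumptions made for illustration. In particular, the critical value is passed in as a placeholder rather than computed, because it depends on the degrees of freedom of the design.

```python
import math

def log_transform_limits(estimate, variance, t_crit):
    """Confidence limits for a ratio estimate, built on the log scale.

    estimate : point estimate (for example, the estimated odds ratio)
    variance : estimated variance of the estimate itself, not of its log
    t_crit   : critical value t_{df, alpha/2}, supplied by the caller
    """
    v = variance / estimate ** 2            # variance of log(estimate)
    half_width = t_crit * math.sqrt(v)      # half-width on the log scale
    return estimate * math.exp(-half_width), estimate * math.exp(half_width)

# Hypothetical inputs; none of these values come from the documentation.
or_hat = 3.5
var_or = 0.49
t_crit = 2.0   # placeholder; the actual value comes from the t distribution

lower, upper = log_transform_limits(or_hat, var_or, t_crit)
print(lower, upper)
```

Because the interval is symmetric on the log scale, the limits are not symmetric around the estimate itself: the product of the two limits equals the square of the estimate, and the lower limit is always positive.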

PROC SURVEYFREQ estimates the variance of the odds ratio by using the variance estimation method that you request. If you request a replication variance estimation method (bootstrap, BRR, jackknife, or replicate weights), PROC SURVEYFREQ estimates the variance of the odds ratio as described in the section Replication Variance Estimation. The default variance estimation method is Taylor series.

By using Taylor series linearization, the variance estimate for the odds ratio can be expressed as

\[
\widehat{\mathrm{Var}}\!\left( \widehat{\mathrm{OR}} \right) = \hat{\mathbf{D}} \, \hat{\mathbf{V}}(\hat{\mathbf{N}}) \, \hat{\mathbf{D}}'
\]

where $\hat{\mathbf{V}}(\hat{\mathbf{N}})$ is the covariance matrix of the estimates of the cell totals $\hat{\mathbf{N}}$,

\[
\hat{\mathbf{N}} = \left( \hat{N}_{11},\, \hat{N}_{12},\, \hat{N}_{21},\, \hat{N}_{22} \right)
\]

and $\hat{\mathbf{D}}$ is an array that contains the partial derivatives of the odds ratio with respect to the elements of $\hat{\mathbf{N}}$. The section Covariances of Frequency Estimates describes the computation of $\hat{\mathbf{V}}(\hat{\mathbf{N}})$. The array $\hat{\mathbf{D}}$ is computed as

\[
\hat{\mathbf{D}} = \left( \frac{\hat{N}_{22}}{\hat{N}_{12}\hat{N}_{21}},\;\; -\frac{\hat{N}_{11}\hat{N}_{22}}{\hat{N}_{21}\hat{N}_{12}^{2}},\;\; -\frac{\hat{N}_{11}\hat{N}_{22}}{\hat{N}_{12}\hat{N}_{21}^{2}},\;\; \frac{\hat{N}_{11}}{\hat{N}_{12}\hat{N}_{21}} \right)
\]

For more information, see Wolter (1985, pp. 239–242).
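The Taylor series variance is a quadratic form in the vector of partial derivatives, which the following sketch evaluates with plain Python lists. The function names, the cell totals, and the covariance matrix are all hypothetical. A real covariance matrix would come from the design-based computation described in the section Covariances of Frequency Estimates.

```python
def or_gradient(n11, n12, n21, n22):
    """Partial derivatives of the odds ratio with respect to (N11, N12, N21, N22)."""
    return [
        n22 / (n12 * n21),
        -n11 * n22 / (n21 * n12 ** 2),
        -n11 * n22 / (n12 * n21 ** 2),
        n11 / (n12 * n21),
    ]

def quadratic_form(d, v):
    """Compute d V d' for a row vector d and a square matrix v (list of rows)."""
    size = len(d)
    return sum(d[i] * v[i][j] * d[j] for i in range(size) for j in range(size))

# Hypothetical estimated cell totals and covariance matrix (illustrative only).
n = (120.0, 80.0, 60.0, 140.0)
cov = [
    [90.0, -20.0, 5.0, -5.0],
    [-20.0, 70.0, -5.0, 5.0],
    [5.0, -5.0, 60.0, -25.0],
    [-5.0, 5.0, -25.0, 95.0],
]

d = or_gradient(*n)
var_or = quadratic_form(d, cov)                 # variance of the odds ratio
or_hat = n[0] * n[3] / (n[1] * n[2])
var_log_or = var_or / or_hat ** 2               # variance of the log odds ratio

print(var_or, var_log_or)
```

Dividing each derivative by the odds ratio gives $\pm 1/\hat{N}_{ij}$, so when the covariance matrix is diagonal the variance of the log odds ratio reduces to the sum of $\widehat{\mathrm{Var}}(\hat{N}_{ij})/\hat{N}_{ij}^{2}$ over the four cells.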

Relative Risks

For a $2 \times 2$ table, the column 1 relative risk is the ratio of the column 1 risks for row 1 to row 2. As described in the section Risks and Risk Difference, the column 1 risk for row 1 is the proportion of row 1 observations classified in column 1, and the column 1 risk for row 2 is the proportion of row 2 observations classified in column 1. The estimate of the column 1 relative risk is computed as

\[
\widehat{\mathrm{RR}}_{1} = \frac{\hat{N}_{11} / \hat{N}_{1\cdot}}{\hat{N}_{21} / \hat{N}_{2\cdot}}
\]

Similarly, the estimate of the column 2 relative risk is computed as

\[
\widehat{\mathrm{RR}}_{2} = \frac{\hat{N}_{12} / \hat{N}_{1\cdot}}{\hat{N}_{22} / \hat{N}_{2\cdot}}
\]

A relative risk greater than 1 indicates that the probability of a positive response is greater in row 1 than in row 2. Similarly, a relative risk less than 1 indicates that the probability of a positive response is less in row 1 than in row 2. The strength of association increases with the deviation from 1. For more information, see Stokes, Davis, and Koch (2000) and Agresti (2007).
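Both relative risk estimates can be computed from the same four cell totals, as in the sketch below. The function name `relative_risks` and the totals are hypothetical. The sketch also assumes that each estimated row total is the sum of the two estimated cell totals in that row.

```python
def relative_risks(n11, n12, n21, n22):
    """Column 1 and column 2 relative risks from estimated cell totals."""
    row1_total = n11 + n12     # assumed estimate of the row 1 total
    row2_total = n21 + n22     # assumed estimate of the row 2 total
    rr1 = (n11 / row1_total) / (n21 / row2_total)
    rr2 = (n12 / row1_total) / (n22 / row2_total)
    return rr1, rr2

# Hypothetical estimated cell totals (illustrative values only).
n11, n12, n21, n22 = 120.0, 80.0, 60.0, 140.0
rr1, rr2 = relative_risks(n11, n12, n21, n22)

# The row totals cancel in the ratio, leaving the odds ratio.
or_from_rr = rr1 / rr2

print(rr1, rr2, or_from_rr)
```

Under that assumption, the ratio of the column 1 relative risk to the column 2 relative risk equals the odds ratio, which ties the three statistics that the OR option reports to one another.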

PROC SURVEYFREQ constructs confidence limits for the relative risk by using the log transform, which is similar to the odds ratio computations described previously. The $100(1-\alpha)\%$ confidence limits for the column 1 relative risk are computed as

\[
\left( \widehat{\mathrm{RR}}_{1} \times \exp\!\left( -t_{df,\,\alpha/2} \sqrt{v} \right),\;\; \widehat{\mathrm{RR}}_{1} \times \exp\!\left( t_{df,\,\alpha/2} \sqrt{v} \right) \right)
\]

where

\[
v = \widehat{\mathrm{Var}}\!\left( \log \widehat{\mathrm{RR}}_{1} \right) = \widehat{\mathrm{Var}}\!\left( \widehat{\mathrm{RR}}_{1} \right) / \widehat{\mathrm{RR}}_{1}^{2}
\]

is the estimate of the variance of the log column 1 relative risk and $t_{df,\,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the $t$ distribution with $df$ degrees of freedom. (For more information, see the section Degrees of Freedom.) The value of the confidence coefficient $\alpha$ is determined by the ALPHA= option; by default, ALPHA=0.05, which produces 95% confidence limits.

PROC SURVEYFREQ estimates the variance of the relative risk by using the variance estimation method that you request. If you request a replication variance estimation method (bootstrap, BRR, jackknife, or replicate weights), PROC SURVEYFREQ estimates the variance of the relative risk as described in the section Replication Variance Estimation. The default variance estimation method is Taylor series.

By using Taylor series linearization, the variance estimate for the column 1 relative risk can be expressed as

\[
\widehat{\mathrm{Var}}\!\left( \widehat{\mathrm{RR}}_{1} \right) = \hat{\mathbf{D}} \, \hat{\mathbf{V}}(\hat{\mathbf{X}}) \, \hat{\mathbf{D}}'
\]

where $\hat{\mathbf{V}}(\hat{\mathbf{X}})$ is the covariance matrix of $\hat{\mathbf{X}}$,

\[
\hat{\mathbf{X}} = \left( \hat{N}_{11},\, \hat{N}_{1\cdot},\, \hat{N}_{21},\, \hat{N}_{2\cdot} \right)
\]

and $\hat{\mathbf{D}}$ is an array that contains the partial derivatives of the column 1 relative risk with respect to the elements of $\hat{\mathbf{X}}$,

\[
\hat{\mathbf{D}} = \left( \frac{\hat{N}_{2\cdot}}{\hat{N}_{21}\hat{N}_{1\cdot}},\;\; -\frac{\hat{N}_{11}\hat{N}_{2\cdot}}{\hat{N}_{21}\hat{N}_{1\cdot}^{2}},\;\; -\frac{\hat{N}_{11}\hat{N}_{2\cdot}}{\hat{N}_{1\cdot}\hat{N}_{21}^{2}},\;\; \frac{\hat{N}_{11}}{\hat{N}_{21}\hat{N}_{1\cdot}} \right)
\]

For more information, see Wolter (1985, pp. 239–242).
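The column 1 relative risk computation can be sketched from the variance through to the confidence limits. Everything named here is an assumption for illustration: the helper functions, the cell covariance matrix, and the placeholder critical value. The sketch obtains the covariance matrix of $(\hat{N}_{11}, \hat{N}_{1\cdot}, \hat{N}_{21}, \hat{N}_{2\cdot})$ by a linear transformation of a hypothetical cell covariance matrix, which is one way to build a consistent example and is not necessarily how the procedure computes it internally.

```python
import math

def rr1_gradient(n11, n1dot, n21, n2dot):
    """Partial derivatives of RR1 with respect to (N11, N1., N21, N2.)."""
    return [
        n2dot / (n21 * n1dot),
        -n11 * n2dot / (n21 * n1dot ** 2),
        -n11 * n2dot / (n1dot * n21 ** 2),
        n11 / (n21 * n1dot),
    ]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def quadratic_form(d, v):
    """Compute d V d' for a row vector d and a square matrix v."""
    size = len(d)
    return sum(d[i] * v[i][j] * d[j] for i in range(size) for j in range(size))

# Hypothetical covariance matrix of the cell totals (N11, N12, N21, N22).
cov_cells = [
    [90.0, -20.0, 5.0, -5.0],
    [-20.0, 70.0, -5.0, 5.0],
    [5.0, -5.0, 60.0, -25.0],
    [-5.0, 5.0, -25.0, 95.0],
]

# X = A N maps the cell totals to (N11, N1., N21, N2.), so Cov(X) = A Cov(N) A'.
a = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
a_t = [list(col) for col in zip(*a)]
cov_x = matmul(matmul(a, cov_cells), a_t)

x = (120.0, 200.0, 60.0, 200.0)          # hypothetical (N11, N1., N21, N2.)
rr1 = (x[0] / x[1]) / (x[2] / x[3])
d = rr1_gradient(*x)
var_rr1 = quadratic_form(d, cov_x)       # Taylor series variance of RR1

v = var_rr1 / rr1 ** 2                   # variance of log(RR1)
t_crit = 2.0                             # placeholder critical value
half_width = t_crit * math.sqrt(v)
lower, upper = rr1 * math.exp(-half_width), rr1 * math.exp(half_width)

print(rr1, var_rr1, lower, upper)
```

The same steps would apply to the column 2 relative risk after replacing the column 1 cell totals with the column 2 cell totals.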

Confidence limits for the column 2 relative risk are computed similarly.

Last updated: December 09, 2022