The SURVEYFREQ Procedure

Risks and Risk Difference

The RISK option provides estimates of risks (binomial proportions) and risk differences for $2 \times 2$ tables, together with their standard errors and confidence limits. Risk statistics include the row 1 risk, row 2 risk, overall risk, and risk difference.

When you specify the RISK option, PROC SURVEYFREQ provides both column 1 and column 2 risks by default. You can request only column 1 (or only column 2) risks by specifying the RISK(COLUMN=) option. To display only the risk difference (and suppress display of the row 1, row 2, and overall risks), you can specify the RISKDIFF(ONLY) option.

The column 1 risk for row 1 is the row 1 proportion for table cell (1,1). The column 1 risk estimate is computed as the ratio of the estimated total for table cell (1,1) to the estimated total for row 1,

$$\hat{P}_{11}^{(1)} = \hat{N}_{11} / \hat{N}_{1\cdot}$$

where the total estimates are computed as described in the section Totals. The column 1 risk for row 2 is the row 2 proportion for table cell (2,1), which is estimated as

$$\hat{P}_{21}^{(2)} = \hat{N}_{21} / \hat{N}_{2\cdot}$$

The overall column 1 risk is the overall proportion in column 1, and its estimate is computed as

$$\hat{P}_{\cdot 1} = \hat{N}_{\cdot 1} / \hat{N}$$

The column 2 risk estimates are computed similarly.
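As an illustration only, the following Python sketch computes the three column 1 risk estimates from the four estimated cell totals of a table with two rows and two columns. The function name and the numeric totals are hypothetical, and the sketch assumes that each row total is the sum of its two cell totals. It is not the procedure's own implementation.

```python
def column1_risks(n11, n12, n21, n22):
    """Return (row 1 risk, row 2 risk, overall risk) for column 1.

    The arguments are estimated (weighted) cell totals of a 2x2 table.
    """
    row1_total = n11 + n12            # estimated total for row 1
    row2_total = n21 + n22            # estimated total for row 2
    overall_total = row1_total + row2_total
    row1_risk = n11 / row1_total      # cell (1,1) total over row 1 total
    row2_risk = n21 / row2_total      # cell (2,1) total over row 2 total
    overall_risk = (n11 + n21) / overall_total  # column 1 total over overall total
    return row1_risk, row2_risk, overall_risk


# Hypothetical estimated totals, chosen only for illustration.
risks = column1_risks(n11=30.0, n12=70.0, n21=20.0, n22=180.0)
```

With these made-up totals, the row 1 risk is 30/100, the row 2 risk is 20/200, and the overall column 1 risk is 50/300.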

The row 1 and row 2 risks are the same as the row proportions for a $2 \times 2$ table, and their variances are computed as described in the section Row and Column Proportions. The overall risk is the overall proportion in the column, and its variance computation is described in the section Proportions. Confidence limits for the column 1 risk for row 1 are computed as

$$\hat{P}_{11}^{(1)} \pm \left( t_{df,\alpha/2} \times \mathrm{StdErr}(\hat{P}_{11}^{(1)}) \right)$$

where $\mathrm{StdErr}(\hat{P}_{11}^{(1)})$ is the standard error of the risk estimate and $t_{df,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom. (For more information, see the section Degrees of Freedom.) The value of the confidence coefficient is determined by the ALPHA= option; by default, ALPHA=0.05, which produces 95% confidence limits. Confidence limits for the other risks are computed similarly.
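The confidence-limit arithmetic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the t critical value is passed in as a number rather than derived from the degrees of freedom, because the Python standard library has no t-distribution quantile function. The estimate, standard error, and critical value below are made up.

```python
def confidence_limits(estimate, std_err, t_crit):
    """Return (lower, upper) limits: estimate +/- t_crit * std_err.

    t_crit stands for the upper alpha/2 critical value of the t
    distribution with the design's degrees of freedom; here it is
    supplied by the caller (an assumption of this sketch).
    """
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width


# Hypothetical inputs: risk estimate 0.30, standard error 0.04,
# and an illustrative critical value of 2.0.
lower, upper = confidence_limits(estimate=0.30, std_err=0.04, t_crit=2.0)
```

The same function applies to any of the risks or to a risk difference, because only the estimate and its standard error change.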

The risk difference is defined as the row 1 risk minus the row 2 risk. The estimate of the column 1 risk difference is computed as

$$\widehat{\mathrm{RD}}_1 = \hat{P}_{11}^{(1)} - \hat{P}_{21}^{(2)}$$

The column 2 risk difference is computed similarly.

PROC SURVEYFREQ estimates the variance of the risk difference by using the variance estimation method that you request. If you request a replication method (bootstrap, BRR, jackknife, or replicate weights), the procedure estimates the variance as described in the section Replication Variance Estimation. By default, PROC SURVEYFREQ estimates the variance by using the Taylor series method.

By using Taylor series linearization, the variance estimate for the column 1 risk difference can be expressed as

$$\widehat{\mathrm{Var}}(\widehat{\mathrm{RD}}_1) = \hat{\mathbf{D}} \, \hat{\mathbf{V}}(\hat{\mathbf{X}}) \, \hat{\mathbf{D}}'$$

where $\hat{\mathbf{V}}(\hat{\mathbf{X}})$ is the covariance matrix of $\hat{\mathbf{X}}$,

$$\hat{\mathbf{X}} = \left( \hat{N}_{11}, \hat{N}_{1\cdot}, \hat{N}_{21}, \hat{N}_{2\cdot} \right)$$

and $\hat{\mathbf{D}}$ is an array that contains the partial derivatives of the risk difference with respect to the elements of $\hat{\mathbf{X}}$,

$$\hat{\mathbf{D}} = \left( 1 / \hat{N}_{1\cdot}, \; -\hat{N}_{11} / \hat{N}_{1\cdot}^{2}, \; -1 / \hat{N}_{2\cdot}, \; \hat{N}_{21} / \hat{N}_{2\cdot}^{2} \right)$$

For more information, see Wolter (1985, pp. 239–242). The variance estimate for the column 2 risk difference is computed similarly.
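To make the linearization concrete, here is a small Python sketch that evaluates the quadratic form for a made-up vector of estimated totals and a made-up covariance matrix. The function names and all numbers are hypothetical. The derivative vector is obtained by differentiating the risk difference (row 1 risk minus row 2 risk) with respect to each total. In practice the covariance matrix would come from the design-based variance estimation for the totals, which this sketch does not attempt.

```python
def risk_difference(x):
    """Column 1 risk difference from x = (n11, n1_dot, n21, n2_dot)."""
    n11, n1_dot, n21, n2_dot = x
    return n11 / n1_dot - n21 / n2_dot


def taylor_variance(x, cov):
    """Linearized variance d * cov * d' of the risk difference.

    x   = (n11, n1_dot, n21, n2_dot), the estimated totals
    cov = 4x4 covariance matrix of x, given as nested lists
    """
    n11, n1_dot, n21, n2_dot = x
    # Partial derivatives of n11/n1_dot - n21/n2_dot w.r.t. each element of x.
    d = (
        1.0 / n1_dot,
        -n11 / n1_dot ** 2,
        -1.0 / n2_dot,
        n21 / n2_dot ** 2,
    )
    # Quadratic form: sum over i, j of d[i] * cov[i][j] * d[j].
    return sum(d[i] * cov[i][j] * d[j] for i in range(4) for j in range(4))


# Hypothetical totals and covariance matrix, for illustration only.
x_hat = (30.0, 100.0, 20.0, 200.0)
cov_hat = [
    [25.0, 40.0, 0.0, 0.0],
    [40.0, 100.0, 0.0, 0.0],
    [0.0, 0.0, 16.0, 60.0],
    [0.0, 0.0, 60.0, 400.0],
]
rd_1 = risk_difference(x_hat)
var_rd_1 = taylor_variance(x_hat, cov_hat)
```

With these inputs the risk difference is 0.3 - 0.1 = 0.2 and the linearized variance works out to 0.0012. The positive covariance between each cell total and its row total reduces the variance relative to treating the totals as uncorrelated.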

The standard error of the column 1 risk difference is

$$\mathrm{StdErr}(\widehat{\mathrm{RD}}_1) = \sqrt{\widehat{\mathrm{Var}}(\widehat{\mathrm{RD}}_1)}$$

Confidence limits for the column 1 risk difference are computed as

$$\widehat{\mathrm{RD}}_1 \pm \left( t_{df,\alpha/2} \times \mathrm{StdErr}(\widehat{\mathrm{RD}}_1) \right)$$

where $t_{df,\alpha/2}$ is the $100(1-\alpha/2)$th percentile of the t distribution with df degrees of freedom. (For more information, see the section Degrees of Freedom.) The value of the confidence coefficient is determined by the ALPHA= option; by default, ALPHA=0.05, which produces 95% confidence limits. Confidence limits for the column 2 risk difference are computed in the same way.

Last updated: December 09, 2022