The SURVEYFREQ Procedure

Rao-Scott Chi-Square Test

The Rao-Scott chi-square test is a design-adjusted version of the Pearson chi-square test, which involves differences between observed and expected frequencies. For information about design-adjusted chi-square tests, see Lohr (2010, Section 10.3.2), Rao and Scott (1981), Rao and Scott (1984), Rao and Scott (1987), and Thomas, Singh, and Roberts (1996).

PROC SURVEYFREQ provides a first-order Rao-Scott chi-square test by default. If you specify the CHISQ(SECONDORDER) option, PROC SURVEYFREQ provides a second-order (Satterthwaite) Rao-Scott chi-square test. The first-order design correction depends only on the design effects of the table cell proportion estimates and, for two-way tables, the design effects of the marginal proportion estimates. The second-order design correction requires computation of the full covariance matrix of the proportion estimates. The second-order test requires more computational resources than the first-order test, but it can provide some performance advantages (for Type I error and power), particularly when the design effects are variable (Thomas and Rao 1987; Rao and Thomas 1989).

One-Way Tables

For one-way tables, the CHISQ option provides a Rao-Scott (design-based) goodness-of-fit test. By default, this is a test for the null hypothesis of equal proportions. If you specify null hypothesis proportions in the TESTP= option, the goodness-of-fit test uses the specified proportions.

First-Order Test

The first-order Rao-Scott chi-square statistic for the goodness-of-fit test is computed as

$$Q_{RS1} = Q_P / D$$

where $Q_P$ is the Pearson chi-square based on the estimated totals and $D$ is the first-order design correction described in the section First-Order Design Correction. For more information, see Rao and Scott (1979), Rao and Scott (1981), and Rao and Scott (1984).

For a one-way table with C levels, the Pearson chi-square is computed as

$$Q_P = \left( n / \hat{N} \right) \sum_c \left( \hat{N}_c - E_c \right)^2 / E_c$$

where $n$ is the sample size, $\hat{N}$ is the estimated overall total, $\hat{N}_c$ is the estimated total for level c, and $E_c$ is the expected total for level c under the null hypothesis. For the null hypothesis of equal proportions, the expected total for each level is

$$E_c = \hat{N} / C$$

For specified null proportions, the expected total for level c is

$$E_c = \hat{N} \times P_c^0$$

where $P_c^0$ is the null proportion that you specify for level c.
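The Pearson chi-square for a one-way table can be sketched in a few lines of Python. This is an illustration only, not the procedure's implementation: the function name `pearson_one_way` and the example totals are hypothetical, and the sketch assumes that the estimated overall total is the sum of the estimated level totals.

```python
def pearson_one_way(totals, n, null_props=None):
    """Pearson chi-square Q_P for a one-way table of estimated totals.

    totals     -- estimated (weighted) total for each level
    n          -- sample size (number of observations)
    null_props -- null proportions per level; None means equal proportions
    """
    n_hat = sum(totals)  # assumed: overall total is the sum of the level totals
    c_levels = len(totals)
    if null_props is None:
        # Equal proportions: each expected total is N_hat / C
        null_props = [1.0 / c_levels] * c_levels
    expected = [n_hat * p0 for p0 in null_props]
    return (n / n_hat) * sum((t - e) ** 2 / e for t, e in zip(totals, expected))


# Hypothetical weighted totals for a three-level variable, sample size 200
q_p = pearson_one_way([3000.0, 4000.0, 5000.0], n=200)
print(round(q_p, 4))
```

The factor n divided by the estimated overall total rescales the statistic from the weighted-total scale back to the sample-size scale before any design correction is applied.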

Under the null hypothesis, the first-order Rao-Scott chi-square $Q_{RS1}$ approximately follows a chi-square distribution with $(C - 1)$ degrees of freedom. A better approximation can be obtained by the F statistic,

$$F_1 = Q_{RS1} / (C - 1)$$

which has an F distribution with $(C - 1)$ and $\kappa (C - 1)$ degrees of freedom under the null hypothesis (Thomas and Rao 1984, 1987). The value of $\kappa$ is the degrees of freedom for the variance estimator. The degrees of freedom computation depends on the sample design and the variance estimation method. For more information, see the section Degrees of Freedom.
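As a small arithmetic sketch of the first-order test for a one-way table, the following Python function takes a Pearson chi-square, a design correction, the number of levels, and the variance-estimator degrees of freedom, and returns the two statistics with the F degrees of freedom. The function name and the input values are hypothetical; p-values are left out because they require a chi-square or F distribution function from a statistics library.

```python
def first_order_rao_scott(q_p, d, c_levels, kappa):
    """Return (Q_RS1, F, numerator df, denominator df) for a one-way table.

    q_p      -- Pearson chi-square based on the estimated totals
    d        -- first-order design correction
    c_levels -- number of table levels C
    kappa    -- degrees of freedom of the variance estimator
    """
    q_rs1 = q_p / d              # design-adjusted chi-square
    df_num = c_levels - 1        # chi-square df, also the F numerator df
    f_stat = q_rs1 / df_num      # F statistic
    df_den = kappa * df_num      # F denominator df
    return q_rs1, f_stat, df_num, df_den


# Hypothetical inputs: Q_P = 8, D = 2, three levels, kappa = 10
print(first_order_rao_scott(8.0, 2.0, 3, 10))  # (4.0, 2.0, 2, 20)
```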

First-Order Design Correction

By default for one-way tables, the first-order design correction is computed from the proportion estimates as

$$D = \sum_c \left( 1 - \hat{P}_c \right) \mathrm{Deff}(\hat{P}_c) / (C - 1)$$

where

$$\begin{aligned}
\mathrm{Deff}(\hat{P}_c) &= \widehat{\mathrm{Var}}(\hat{P}_c) / \widehat{\mathrm{Var}}_{\mathrm{srs}}(\hat{P}_c) \\
&= \widehat{\mathrm{Var}}(\hat{P}_c) / \left( (1 - f) \, \hat{P}_c \left( 1 - \hat{P}_c \right) / (n - 1) \right)
\end{aligned}$$

as described in the section Design Effect. $\hat{P}_c$ is the proportion estimate for level c, $\widehat{\mathrm{Var}}(\hat{P}_c)$ is the variance of the estimate, $f$ is the overall sampling fraction, and $n$ is the number of observations in the sample. The factor $(1 - f)$ is included only for Taylor series variance estimation (VARMETHOD=TAYLOR) when you specify the RATE= or TOTAL= option. For more information, see the section Design Effect.
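The default one-way design correction can be sketched in Python as follows. The function names and example values are hypothetical, and the design-based variance estimates are taken as given inputs (in practice they come from the variance estimation method in use). Passing `f=0.0` corresponds to omitting the (1 - f) factor.

```python
def design_effect(p_hat, var_hat, n, f=0.0):
    """Design effect of one proportion estimate: design variance / srs variance."""
    var_srs = (1.0 - f) * p_hat * (1.0 - p_hat) / (n - 1)
    return var_hat / var_srs


def first_order_correction(p_hats, var_hats, n, f=0.0):
    """First-order design correction D for a one-way table with C levels."""
    c_levels = len(p_hats)
    weighted = sum(
        (1.0 - p) * design_effect(p, v, n, f) for p, v in zip(p_hats, var_hats)
    )
    return weighted / (c_levels - 1)


# Hypothetical example: every design variance is twice its srs counterpart
p_hats = [0.2, 0.3, 0.5]
n = 101
var_hats = [2.0 * p * (1.0 - p) / (n - 1) for p in p_hats]
print(first_order_correction(p_hats, var_hats, n))  # approximately 2
```

Because the weights (1 - P_c) sum to C - 1, the correction equals 1 when every design effect is 1, so the Rao-Scott statistic then reduces to the Pearson chi-square.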

If you specify the CHISQ(MODIFIED) or LRCHISQ(MODIFIED) option, the design correction is computed by using null hypothesis proportions instead of proportion estimates. By default, null hypothesis proportions are equal proportions for all levels of the one-way table. Alternatively, you can specify null proportion values in the TESTP= option. The modified design correction $D_0$ is computed from null hypothesis proportions as

$$D_0 = \sum_c \left( 1 - P_c^0 \right) \mathrm{Deff}_0(\hat{P}_c) / (C - 1)$$

where

$$\begin{aligned}
\mathrm{Deff}_0(\hat{P}_c) &= \widehat{\mathrm{Var}}(\hat{P}_c) / \widehat{\mathrm{Var}}_{\mathrm{srs}}(P_c^0) \\
&= \widehat{\mathrm{Var}}(\hat{P}_c) / \left( (1 - f) \, P_c^0 \left( 1 - P_c^0 \right) / (n - 1) \right)
\end{aligned}$$

The null hypothesis proportion $P_c^0$ is $1 / C$ for equal proportions (the default), or $P_c^0$ is the null proportion that you specify for level c if you use the TESTP= option.

Second-Order Test

The second-order (Satterthwaite) Rao-Scott chi-square statistic for the goodness-of-fit test is computed as

$$Q_{RS2} = Q_{RS1} / \left( 1 + \hat{a}^2 \right)$$

where $Q_{RS1}$ is the first-order Rao-Scott chi-square statistic described in the section First-Order Test and $\hat{a}^2$ is the second-order design correction described in the section Second-Order Design Correction. For more information, see Rao and Scott (1979), Rao and Scott (1981), and Rao and Thomas (1989).

Under the null hypothesis, the second-order Rao-Scott chi-square $Q_{RS2}$ approximately follows a chi-square distribution with $(C - 1) / \left( 1 + \hat{a}^2 \right)$ degrees of freedom. The corresponding F statistic is

$$F_{RS2} = Q_{RS2} \left( 1 + \hat{a}^2 \right) / (C - 1)$$

which has an F distribution with $(C - 1) / \left( 1 + \hat{a}^2 \right)$ and $\kappa (C - 1) / \left( 1 + \hat{a}^2 \right)$ degrees of freedom under the null hypothesis (Thomas and Rao 1984, 1987). The value of $\kappa$ is the degrees of freedom for the variance estimator. The degrees of freedom computation depends on the sample design and the variance estimation method. For more information, see the section Degrees of Freedom.

Second-Order Design Correction

The second-order (Satterthwaite) design correction for one-way tables is computed from the eigenvalues of the estimated design effects matrix $\hat{\boldsymbol{\Delta}}$, which are known as generalized design effects. The design effects matrix is computed as

$$\hat{\boldsymbol{\Delta}} = \frac{n - 1}{1 - f} \left( \mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}})^{-1} \, \widehat{\mathbf{Cov}}(\hat{\mathbf{P}}) \right)$$

where $\mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}})$ is the covariance under multinomial sampling (srs with replacement) and $\widehat{\mathbf{Cov}}(\hat{\mathbf{P}})$ is the covariance matrix of the first $(C - 1)$ proportion estimates. For more information, see Rao and Scott (1979), Rao and Scott (1981), and Rao and Thomas (1989).

By default, the srs covariance matrix is computed from the proportion estimates as

$$\mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}}) = \mathrm{Diag}(\hat{\mathbf{P}}) - \hat{\mathbf{P}} \hat{\mathbf{P}}'$$

where $\hat{\mathbf{P}}$ is an array of $(C - 1)$ proportion estimates. If you specify the CHISQ(MODIFIED) or LRCHISQ(MODIFIED) option, the srs covariance matrix is computed from the null hypothesis proportions $\mathbf{P}_0$ as

$$\mathbf{Cov}_{\mathrm{srs}}(\mathbf{P}_0) = \mathrm{Diag}(\mathbf{P}_0) - \mathbf{P}_0 \mathbf{P}_0'$$

where $\mathbf{P}_0$ is an array of $(C - 1)$ null hypothesis proportions. The null hypothesis proportions equal $1 / C$ by default. If you use the TESTP= option to specify null hypothesis proportions, $\mathbf{P}_0$ is an array of $(C - 1)$ proportions that you specify.

The second-order design correction is computed as

$$\hat{a}^2 = \left( \sum_{c=1}^{C-1} d_c^2 / \left( (C - 1) \, \bar{d}^2 \right) \right) - 1$$

where $d_c$ are the eigenvalues of the design effects matrix $\hat{\boldsymbol{\Delta}}$ and $\bar{d}$ is the average of the eigenvalues.
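The second-order correction is the squared coefficient of variation of the generalized design effects, so it is zero when all eigenvalues are equal and grows as they become more variable. The following Python sketch (hypothetical function names) computes it either from a list of eigenvalues or directly from a design effects matrix. The matrix version relies on the identities that the trace of a matrix is the sum of its eigenvalues and the trace of its square is the sum of their squares, which avoids an eigenvalue routine.

```python
def second_order_correction(eigenvalues):
    """a-hat squared from the eigenvalues of the design effects matrix."""
    k = len(eigenvalues)
    d_bar = sum(eigenvalues) / k  # average eigenvalue
    return sum(d * d for d in eigenvalues) / (k * d_bar ** 2) - 1.0


def second_order_correction_from_matrix(delta):
    """Same quantity computed from a square matrix stored as nested lists."""
    k = len(delta)
    trace = sum(delta[i][i] for i in range(k))  # sum of eigenvalues
    # trace of delta squared = sum of squared eigenvalues
    trace_sq = sum(delta[i][j] * delta[j][i] for i in range(k) for j in range(k))
    return k * trace_sq / trace ** 2 - 1.0


print(second_order_correction([1.0, 3.0]))                             # 0.25
print(second_order_correction_from_matrix([[2.0, 1.0], [1.0, 2.0]]))   # 0.25
```

The second example matrix has eigenvalues 1 and 3, so both calls describe the same set of generalized design effects.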

Two-Way Tables

For two-way tables, the CHISQ option provides a Rao-Scott (design-based) test of association between the row and column variables. PROC SURVEYFREQ provides a first-order Rao-Scott chi-square test by default. If you specify the CHISQ(SECONDORDER) option, PROC SURVEYFREQ provides a second-order (Satterthwaite) Rao-Scott chi-square test.

First-Order Test

The first-order Rao-Scott chi-square statistic is computed as

$$Q_{RS1} = Q_P / D$$

where $Q_P$ is the Pearson chi-square based on the estimated totals and $D$ is the design correction described in the section First-Order Design Correction. For more information, see Rao and Scott (1979), Rao and Scott (1984), and Rao and Scott (1987).

For a two-way table with R rows and C columns, the Pearson chi-square is computed as

$$Q_P = \left( n / \hat{N} \right) \sum_r \sum_c \left( \hat{N}_{rc} - E_{rc} \right)^2 / E_{rc}$$

where $n$ is the sample size, $\hat{N}$ is the estimated overall total, $\hat{N}_{rc}$ is the estimated total for table cell (r, c), and $E_{rc}$ is the expected total for table cell (r, c) under the null hypothesis of no association,

$$E_{rc} = \hat{N}_{r \cdot} \, \hat{N}_{\cdot c} / \hat{N}$$
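A minimal Python sketch of the two-way Pearson chi-square follows. The function name and the example cell totals are hypothetical, and the sketch assumes that the row, column, and overall totals are obtained by summing the estimated cell totals.

```python
def pearson_two_way(cell_totals, n):
    """Pearson chi-square Q_P for an R x C table of estimated cell totals.

    cell_totals -- nested list, one inner list of estimated totals per row
    n           -- sample size (number of observations)
    """
    row_totals = [sum(row) for row in cell_totals]
    col_totals = [sum(col) for col in zip(*cell_totals)]
    n_hat = sum(row_totals)  # estimated overall total
    q = 0.0
    for r, row in enumerate(cell_totals):
        for c, n_rc in enumerate(row):
            # Expected total under no association
            e_rc = row_totals[r] * col_totals[c] / n_hat
            q += (n_rc - e_rc) ** 2 / e_rc
    return (n / n_hat) * q


# Hypothetical weighted cell totals for a 2 x 2 table, sample size 80
print(pearson_two_way([[3000.0, 1000.0], [1000.0, 3000.0]], n=80))
```

When the cell totals factor exactly into row and column margins, every expected total equals its estimated total and the statistic is zero.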

Under the null hypothesis of no association, the first-order Rao-Scott chi-square $Q_{RS1}$ approximately follows a chi-square distribution with $(R - 1)(C - 1)$ degrees of freedom. A better approximation can be obtained by the F statistic,

$$F_1 = Q_{RS1} / \left( (R - 1)(C - 1) \right)$$

which has an F distribution with $(R - 1)(C - 1)$ and $\kappa (R - 1)(C - 1)$ degrees of freedom under the null hypothesis (Thomas and Rao 1984, 1987). The value of $\kappa$ is the degrees of freedom for the variance estimator. The degrees of freedom computation depends on the sample design and the variance estimation method. For more information, see the section Degrees of Freedom.

First-Order Design Correction

By default for a first-order test, PROC SURVEYFREQ computes the design correction from proportion estimates. If you specify the CHISQ(MODIFIED) or LRCHISQ(MODIFIED) option for a first-order test, the procedure computes the design correction from null hypothesis proportions.

Second-order tests, which you request by specifying the CHISQ(SECONDORDER) or LRCHISQ(SECONDORDER) option, are computed by applying both first-order and second-order design corrections to the weighted chi-square statistic. For second-order tests for two-way tables, PROC SURVEYFREQ always uses null hypothesis proportions to compute both the first-order and second-order design corrections.

The first-order design correction D that is based on proportion estimates is computed as

$$\begin{aligned}
D = {} & \Bigl( \sum_r \sum_c \left( 1 - \hat{P}_{rc} \right) \mathrm{Deff}(\hat{P}_{rc}) - \sum_r \left( 1 - \hat{P}_{r \cdot} \right) \mathrm{Deff}(\hat{P}_{r \cdot}) \\
& - \sum_c \left( 1 - \hat{P}_{\cdot c} \right) \mathrm{Deff}(\hat{P}_{\cdot c}) \Bigr) / \left( (R - 1)(C - 1) \right)
\end{aligned}$$

where

$$\begin{aligned}
\mathrm{Deff}(\hat{P}_{rc}) &= \widehat{\mathrm{Var}}(\hat{P}_{rc}) / \widehat{\mathrm{Var}}_{\mathrm{srs}}(\hat{P}_{rc}) \\
&= \widehat{\mathrm{Var}}(\hat{P}_{rc}) / \left( (1 - f) \, \hat{P}_{rc} \left( 1 - \hat{P}_{rc} \right) / (n - 1) \right)
\end{aligned}$$

as described in the section Design Effect. $\hat{P}_{rc}$ is the estimate of the proportion in table cell (r, c), $\widehat{\mathrm{Var}}(\hat{P}_{rc})$ is the variance of the estimate, $f$ is the overall sampling fraction, and $n$ is the number of observations in the sample. The factor $(1 - f)$ is included only for Taylor series variance estimation (VARMETHOD=TAYLOR) when you specify the RATE= or TOTAL= option. For more information, see the section Design Effect.

The design effects for the estimate of the proportion in row r and the estimate of the proportion in column c ($\mathrm{Deff}(\hat{P}_{r \cdot})$ and $\mathrm{Deff}(\hat{P}_{\cdot c})$, respectively) are computed in the same way.
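The two-way correction based on proportion estimates combines cell, row, and column design effects. The Python sketch below uses a hypothetical function name and takes the design effects as given inputs, with the row and column proportions obtained by summing the cell proportion estimates.

```python
def two_way_first_order_correction(cell_props, cell_deffs, row_deffs, col_deffs):
    """First-order design correction D for an R x C table.

    cell_props -- nested list of cell proportion estimates (rows of the table)
    cell_deffs -- nested list of cell design effects, same shape as cell_props
    row_deffs  -- design effects of the R row proportion estimates
    col_deffs  -- design effects of the C column proportion estimates
    """
    n_rows, n_cols = len(cell_props), len(cell_props[0])
    row_props = [sum(row) for row in cell_props]
    col_props = [sum(col) for col in zip(*cell_props)]
    cell_term = sum(
        (1.0 - cell_props[r][c]) * cell_deffs[r][c]
        for r in range(n_rows)
        for c in range(n_cols)
    )
    row_term = sum((1.0 - p) * d for p, d in zip(row_props, row_deffs))
    col_term = sum((1.0 - p) * d for p, d in zip(col_props, col_deffs))
    return (cell_term - row_term - col_term) / ((n_rows - 1) * (n_cols - 1))


# Hypothetical 2 x 2 table in which every design effect equals 2
props = [[0.1, 0.2], [0.3, 0.4]]
twos = [[2.0, 2.0], [2.0, 2.0]]
print(two_way_first_order_correction(props, twos, [2.0, 2.0], [2.0, 2.0]))  # approximately 2
```

With all design effects equal to 1, the three sums reduce to RC - 1, R - 1, and C - 1, so the correction equals 1 and the adjusted statistic matches the Pearson chi-square.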

If you specify the CHISQ(MODIFIED) or LRCHISQ(MODIFIED) option for a first-order Rao-Scott test, or if you request a second-order test for a two-way table (CHISQ(SECONDORDER) or LRCHISQ(SECONDORDER)), the procedure computes the design correction from the null hypothesis cell proportions instead of the estimated cell proportions. For two-way tables, the null hypothesis cell proportions are computed as the products of the corresponding row and column proportion estimates. The modified design correction $D_0$ (based on null hypothesis proportions) is computed as

$$\begin{aligned}
D_0 = {} & \Bigl( \sum_r \sum_c \left( 1 - P_{rc}^0 \right) \mathrm{Deff}_0(\hat{P}_{rc}) - \sum_r \left( 1 - \hat{P}_{r \cdot} \right) \mathrm{Deff}(\hat{P}_{r \cdot}) \\
& - \sum_c \left( 1 - \hat{P}_{\cdot c} \right) \mathrm{Deff}(\hat{P}_{\cdot c}) \Bigr) / \left( (R - 1)(C - 1) \right)
\end{aligned}$$

where

$$P_{rc}^0 = \hat{P}_{r \cdot} \times \hat{P}_{\cdot c}$$

and

$$\begin{aligned}
\mathrm{Deff}_0(\hat{P}_{rc}) &= \widehat{\mathrm{Var}}(\hat{P}_{rc}) / \mathrm{Var}_{\mathrm{srs}}(P_{rc}^0) \\
&= \widehat{\mathrm{Var}}(\hat{P}_{rc}) / \left( (1 - f) \, P_{rc}^0 \left( 1 - P_{rc}^0 \right) / (n - 1) \right)
\end{aligned}$$

Second-Order Test

The second-order (Satterthwaite) Rao-Scott chi-square statistic for two-way tables is computed as

$$Q_{RS2} = Q_{RS1} / \left( 1 + \hat{a}^2 \right)$$

where $Q_{RS1}$ is the first-order Rao-Scott chi-square statistic described in the section First-Order Test and $\hat{a}^2$ is the second-order design correction described in the section Second-Order Design Correction. For more information, see Rao and Scott (1979), Rao and Scott (1981), and Rao and Thomas (1989).

Under the null hypothesis, the second-order Rao-Scott chi-square $Q_{RS2}$ approximately follows a chi-square distribution with $(R - 1)(C - 1) / \left( 1 + \hat{a}^2 \right)$ degrees of freedom. The corresponding F statistic is

$$F_{RS2} = Q_{RS2} \left( 1 + \hat{a}^2 \right) / \left( (R - 1)(C - 1) \right)$$

which has an F distribution with $(R - 1)(C - 1) / \left( 1 + \hat{a}^2 \right)$ and $\kappa (R - 1)(C - 1) / \left( 1 + \hat{a}^2 \right)$ degrees of freedom under the null hypothesis (Thomas and Rao 1984, 1987). The value of $\kappa$ is the degrees of freedom for the variance estimator. The degrees of freedom computation depends on the sample design and the variance estimation method. For more information, see the section Degrees of Freedom.
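The arithmetic of the second-order test for a two-way table can be sketched as follows in Python. The function name and the input values are hypothetical. The sketch only evaluates the statistics and the (generally fractional) degrees of freedom; p-values would need chi-square and F distribution functions that accept non-integer degrees of freedom.

```python
def second_order_rao_scott(q_rs1, a_sq, n_rows, n_cols, kappa):
    """Return (Q_RS2, F_RS2, numerator df, denominator df) for an R x C table.

    q_rs1  -- first-order Rao-Scott chi-square
    a_sq   -- second-order design correction (a-hat squared)
    kappa  -- degrees of freedom of the variance estimator
    """
    k = (n_rows - 1) * (n_cols - 1)
    q_rs2 = q_rs1 / (1.0 + a_sq)          # second-order chi-square
    df_num = k / (1.0 + a_sq)             # Satterthwaite-adjusted df
    f_stat = q_rs2 * (1.0 + a_sq) / k     # equivalently q_rs2 / df_num
    df_den = kappa * df_num
    return q_rs2, f_stat, df_num, df_den


# Hypothetical inputs: Q_RS1 = 6, a-hat squared = 0.25, 3 x 3 table, kappa = 20
print(second_order_rao_scott(6.0, 0.25, 3, 3, 20))
```

Because the F statistic divides the second-order chi-square by its adjusted degrees of freedom, its value coincides with the first-order F statistic; only the reference degrees of freedom change.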

Second-Order Design Correction

The second-order (Satterthwaite) design correction for two-way tables is computed from the eigenvalues of the estimated design effects matrix $\hat{\boldsymbol{\Delta}}$, which are known as generalized design effects. The design effects matrix is defined as

$$\hat{\boldsymbol{\Delta}} = \frac{n - 1}{1 - f} \left( \mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}})^{-1} \, \mathbf{H} \, \widehat{\mathbf{Cov}}(\hat{\mathbf{P}}) \, \mathbf{H}' \right)$$

where $\widehat{\mathbf{Cov}}(\hat{\mathbf{P}})$ is the covariance matrix of the $R \times C$ proportion estimates and $\mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}})$ is the covariance under multinomial sampling (srs with replacement). For more information, see Rao and Scott (1979), Rao and Scott (1981), and Rao and Thomas (1989).

The second-order design correction is computed from the design effects matrix $\hat{\boldsymbol{\Delta}}$ as

$$\hat{a}^2 = \left( \sum_{i=1}^{K} d_i^2 / \left( K \bar{d}^2 \right) \right) - 1$$

where $K = (R - 1)(C - 1)$, $d_i$ are the eigenvalues of $\hat{\boldsymbol{\Delta}}$, and $\bar{d}$ is the average eigenvalue.

The srs covariance matrix is computed as

$$\mathbf{Cov}_{\mathrm{srs}}(\hat{\mathbf{P}}) = \hat{\mathbf{P}}_r \otimes \hat{\mathbf{P}}_c$$

where $\hat{\mathbf{P}}_r$ is an $(R - 1) \times (R - 1)$ matrix that is constructed from the array of $(R - 1)$ row proportion estimates $\hat{\mathbf{p}}_r$ as

$$\hat{\mathbf{P}}_r = \mathrm{Diag}(\hat{\mathbf{p}}_r) - \hat{\mathbf{p}}_r \hat{\mathbf{p}}_r'$$

Similarly, $\hat{\mathbf{P}}_c$ is a $(C - 1) \times (C - 1)$ matrix that is constructed from the array of $(C - 1)$ column proportion estimates $\hat{\mathbf{p}}_c$ as

$$\hat{\mathbf{P}}_c = \mathrm{Diag}(\hat{\mathbf{p}}_c) - \hat{\mathbf{p}}_c \hat{\mathbf{p}}_c'$$

The $(R - 1)(C - 1) \times RC$ matrix $\mathbf{H}$ is computed as

$$\mathbf{H} = \mathbf{J}_r \otimes \mathbf{J}_c - \left( \hat{\mathbf{p}}_r \mathbf{1}_r' \right) \otimes \mathbf{J}_c - \mathbf{J}_r \otimes \left( \hat{\mathbf{p}}_c \mathbf{1}_c' \right)$$

where $\mathbf{J}_r = \left( \mathbf{I}_{(R-1)} \mid \mathbf{0} \right)$, $\mathbf{J}_c = \left( \mathbf{I}_{(C-1)} \mid \mathbf{0} \right)$, $\mathbf{1}_r$ is an $(R \times 1)$ array of ones, and $\mathbf{1}_c$ is a $(C \times 1)$ array of ones. For more information, see Rao and Scott (1979, p. 61).
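The Kronecker-product construction of H can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure's code: all function names are hypothetical, matrices are stored as nested lists, the third term is taken to use the column proportions and the column array of ones (the only reading for which the dimensions conform), and the cells are ordered row by row.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def identity_with_zero_column(k):
    """J = (I_k | 0): the k x k identity followed by one column of zeros."""
    return [[1.0 if i == j else 0.0 for j in range(k + 1)] for i in range(k)]


def times_ones_row(p, width):
    """p 1': a len(p) x width matrix whose row i repeats p[i]."""
    return [[p_i] * width for p_i in p]


def build_h(row_props, col_props):
    """H for full marginal proportion arrays of length R and C.

    The last row and column proportions are dropped to form the
    (R - 1) and (C - 1) arrays used in the formula.
    """
    n_rows, n_cols = len(row_props), len(col_props)
    j_r = identity_with_zero_column(n_rows - 1)
    j_c = identity_with_zero_column(n_cols - 1)
    first = kron(j_r, j_c)
    second = kron(times_ones_row(row_props[:-1], n_rows), j_c)
    third = kron(j_r, times_ones_row(col_props[:-1], n_cols))
    return [
        [x - y - z for x, y, z in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(first, second, third)
    ]


# Hypothetical margins for a 2 x 3 table
h = build_h([0.4, 0.6], [0.2, 0.3, 0.5])
print(len(h), len(h[0]))  # 2 6
```

The result has (R - 1)(C - 1) rows and RC columns, so pre- and post-multiplying the RC x RC covariance matrix of the cell proportion estimates by H and its transpose yields a square matrix of order (R - 1)(C - 1).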

Last updated: December 09, 2022