The GENMOD Procedure

Example 51.9 Assessment of a Marginal Model for Dependent Data

This example illustrates the use of cumulative residuals to assess the adequacy of a marginal model for dependent data fit by generalized estimating equations (GEEs). The assessment methods are applied to CD4 count data from an AIDS clinical trial reported by Fischl, Richman, and Hansen (1990) and reanalyzed by Lin, Wei, and Ying (2002). The study randomly assigned 360 HIV patients to the drug AZT and 351 patients to placebo. CD4 counts were measured repeatedly over the course of the study. The data used here are the 4328 measurements taken in the first 40 weeks of the study.

The analysis focuses on the time trend of the response. The first model considered is

$$E(y_{ik}) = \beta_0 + \beta_1 T_{ik} + \beta_2 T_{ik}^2 + \beta_3 R_i T_{ik} + \beta_4 R_i T_{ik}^2$$

where $T_{ik}$ is the time (in weeks) of the $k$th measurement on the $i$th patient, $y_{ik}$ is the CD4 count at $T_{ik}$ for the $i$th patient, and $R_i$ is the indicator of AZT for the $i$th patient. Normal errors and an independent working correlation are assumed.
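
Because the AZT indicator takes only the values 0 and 1, the model implies one quadratic mean curve for each treatment arm. The following is a worked restatement of the model equation, not additional output from the procedure:

```latex
\begin{align*}
\text{Placebo } (R_i = 0):\quad & E(y_{ik}) = \beta_0 + \beta_1 T_{ik} + \beta_2 T_{ik}^2 \\
\text{AZT } (R_i = 1):\quad & E(y_{ik}) = \beta_0 + (\beta_1 + \beta_3)\, T_{ik} + (\beta_2 + \beta_4)\, T_{ik}^2
\end{align*}
```

Both arms share the intercept $\beta_0$, so $\beta_3$ and $\beta_4$ alone carry the difference in time trend between AZT and placebo.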

The following statements create the SAS data set cd4:

data cd4;
   input Id Y Time Time2 TrtTime TrtTime2;
   /* Cubic terms, used by the second model in this example */
   Time3 = Time2 * Time;
   TrtTime3 = TrtTime2 * Time;
   datalines;
1     264.00024       -0.28571        0.08163       -0.28571        0.08163
1     175.00070        4.14286       17.16327        4.14286       17.16327
1     306.00150        8.14286       66.30612        8.14286       66.30612
1     331.99835       12.14286      147.44898       12.14286      147.44898
1     309.99929       16.14286      260.59184       16.14286      260.59184
1     185.00077       28.71429      824.51020       28.71429      824.51020
1     175.00070       40.14286     1611.44898       40.14286     1611.44898
2     574.99998       -0.57143        0.32653        0.00000        0.00000

   ... more lines ...   

711     363.99859        8.14286       66.30612        8.14286       66.30612
711     488.00224       12.14286      147.44898       12.14286      147.44898
711     240.00026       18.14286      329.16327       18.14286      329.16327
;

The following SAS statements fit the preceding model, create the cumulative residual plot in Output 51.9.1, and compute a p-value for the model.

To request these graphs, ODS Graphics must be enabled and you must specify the ASSESS statement. For general information about ODS Graphics, see Chapter 24, Statistical Graphics Using ODS. For specific information about the graphics available in the GENMOD procedure, see the section ODS Graphics.

Here, the SAS data set variables Time, Time2, TrtTime, and TrtTime2 correspond to $T_{ik}$, $T_{ik}^2$, $R_i T_{ik}$, and $R_i T_{ik}^2$, respectively. The variable Id identifies individual patients.

ods graphics on;

proc genmod data=cd4;
   class Id;
   model Y = Time Time2 TrtTime TrtTime2;
   repeated sub=Id;
   assess var=(Time) / resample
                       seed=603708000;
run;

Output 51.9.1: Cumulative Residual Plot for Quadratic Time Fit

The cumulative residual plot in Output 51.9.1 displays cumulative residuals versus time for the model and 20 simulated realizations. The associated p-value, also shown in Output 51.9.1, is 0.18. These results indicate that a more satisfactory model might be possible. The observed cumulative residual pattern most resembles plot (c) in Output 51.8.6, suggesting cubic time trends.
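
The quadratic fit above leaves several options at their defaults. As a sketch, the same request can be written with those choices spelled out. This assumes that the defaults in your SAS release are a normal distribution with an identity link, an independent working correlation, 20 plotted paths, and 1,000 resamples for the p-value. Check the MODEL, REPEATED, and ASSESS statement documentation before relying on it.

```sas
/* Sketch only: the quadratic fit with the assumed defaults made explicit */
proc genmod data=cd4;
   class Id;
   model Y = Time Time2 TrtTime TrtTime2 / dist=normal link=identity;
   repeated subject=Id / type=ind;      /* independent working correlation */
   assess var=(Time) / resample=1000    /* simulated paths behind the p-value (assumed default) */
                       npaths=20        /* simulated paths drawn on the plot */
                       seed=603708000;  /* same seed as the example, for reproducibility */
run;
```

Stating the options explicitly makes the assumptions described for this model (normal errors, independent working correlation) visible in the code, and it makes clear where the 20 simulated realizations in the plot come from.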

The following SAS statements fit the model, create the plot in Output 51.9.2, and compute a p-value for a model with the additional terms $T_{ik}^3$ and $R_i T_{ik}^3$:

proc genmod data=cd4;
   class Id;
   model Y = Time Time2 Time3 TrtTime TrtTime2 TrtTime3;
   repeated sub=Id;
   assess var=(Time) / resample
                       seed=603708000;
run;

Output 51.9.2: Cumulative Residual Plot for Cubic Time Fit

The observed cumulative residual pattern appears more typical of the simulated realizations, and the p-value is 0.45, indicating that the model with cubic time trends is more appropriate.
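
For reference, the cubic model fit by the second PROC GENMOD step can be written out in full. The coefficient numbering here is an assumption that follows the order of the effects in the MODEL statement (Time, Time2, Time3, TrtTime, TrtTime2, TrtTime3). It therefore differs from the numbering used for the quadratic model, where $\beta_3$ and $\beta_4$ multiply the treatment terms.

```latex
\[
E(y_{ik}) = \beta_0 + \beta_1 T_{ik} + \beta_2 T_{ik}^2 + \beta_3 T_{ik}^3
          + \beta_4 R_i T_{ik} + \beta_5 R_i T_{ik}^2 + \beta_6 R_i T_{ik}^3
\]
```

The data set variables Time3 and TrtTime3 correspond to $T_{ik}^3$ and $R_i T_{ik}^3$, respectively.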

Last updated: December 09, 2022