Statistical Graphics Using ODS

Scoring

Many procedures score observations that have missing values or have zero, missing, or invalid weights or frequencies. If the independent variables are all valid, then the procedure can compute predicted values. When the independent and dependent variables are all valid, then the procedure can also compute residuals. The following steps illustrate by using two simple regression models:

data class;
   set sashelp.class(rename=(height=Height1)) nobs=n;
   output;
   if _n_ = n then do;
      call missing(name, sex, height1);
      do age = 10 to 17; output; end;
   end;
run;

proc reg data=class;
   model height1 = age;
   output out=p1 p=p1y r=r1y;
run;

data class;
   f = 1;
   set sashelp.class(rename=(height=Height2)) nobs=n;
   output;
   if _n_ = n then do;
      call missing(f, name, sex);
      do age = 10 to 17; output; end;
   end;
run;

proc reg data=class;
   freq f;
   model height2 = age;
   output out=p2 p=p2y r=r2y;
run;

data all; merge p1 p2; run;

proc print data=all;
   var f name sex age height: p: r:;
   format p1y p2y r1y r2y 6.3;
run;

The results are displayed in Output 24.6.29.

Output 24.6.29: Scoring in a Simple Regression Model

Obs f Name Sex Age Height1 Height2 p1y p2y r1y r2y
1 1 Alfred M 14 69.0 69.0 64.244 64.244 4.756 4.756
2 1 Alice F 13 56.5 56.5 61.457 61.457 -4.957 -4.957
3 1 Barbara F 13 65.3 65.3 61.457 61.457 3.843 3.843
4 1 Carol F 14 62.8 62.8 64.244 64.244 -1.444 -1.444
5 1 Henry M 14 63.5 63.5 64.244 64.244 -0.744 -0.744
6 1 James M 12 57.3 57.3 58.670 58.670 -1.370 -1.370
7 1 Jane F 12 59.8 59.8 58.670 58.670 1.130 1.130
8 1 Janet F 15 62.5 62.5 67.031 67.031 -4.531 -4.531
9 1 Jeffrey M 13 62.5 62.5 61.457 61.457 1.043 1.043
10 1 John M 12 59.0 59.0 58.670 58.670 0.330 0.330
11 1 Joyce F 11 51.3 51.3 55.882 55.882 -4.582 -4.582
12 1 Judy F 14 64.3 64.3 64.244 64.244 0.056 0.056
13 1 Louise F 12 56.3 56.3 58.670 58.670 -2.370 -2.370
14 1 Mary F 15 66.5 66.5 67.031 67.031 -0.531 -0.531
15 1 Philip M 16 72.0 72.0 69.818 69.818 2.182 2.182
16 1 Robert M 12 64.8 64.8 58.670 58.670 6.130 6.130
17 1 Ronald M 15 67.0 67.0 67.031 67.031 -0.031 -0.031
18 1 Thomas M 11 57.5 57.5 55.882 55.882 1.618 1.618
19 1 William M 15 66.5 66.5 67.031 67.031 -0.531 -0.531
20 .     10 . 66.5 53.095 53.095 . 13.405
21 .     11 . 66.5 55.882 55.882 . 10.618
22 .     12 . 66.5 58.670 58.670 . 7.830
23 .     13 . 66.5 61.457 61.457 . 5.043
24 .     14 . 66.5 64.244 64.244 . 2.256
25 .     15 . 66.5 67.031 67.031 . -0.531
26 .     16 . 66.5 69.818 69.818 . -3.318
27 .     17 . 66.5 72.605 72.605 . -6.105


The first model excludes observations that have missing values. The second model excludes observations that have missing frequencies. The first model uses the variable Height1, which has missing values in the extra observations that are to be scored. The predicted values and residual for the first model are displayed in the variables p1y and r1y, respectively. The second model uses the variable Height2, which has no missing values. The predicted values and residual for the second model are displayed in the variables p1y and r1y, respectively. The predicted values match for both models. The residuals match for the observations that do not have a missing height. For both models, the predicted values for the scored observations match the predicted values for the analysis observations that have the same age. Additionally, the procedure creates predicted values for heights that are not in the data set.

The following steps create observations for each fuel type that are to be scored:

proc means min max data=sashelp.gas;
   class fuel;
   var eqratio;
   output out=m(where=(_type_ eq 1 and _stat_ in ('MIN', 'MAX')));
run;

proc transpose data=m out=m2(drop=_:);
   var eqratio;
   by fuel;
   id _stat_;
run;

data score(drop=min max);
   set m2;
   do eqratio = min to max by (max - min) / 200; output; end;
run;

The following steps concatenate those observations to the data set and use PROC GLIMMIX to score them:

data gas;
   set sashelp.gas(where=(n(nox))) score;
run;

proc glimmix data=gas;
   effect spl = spline(eqratio / naturalcubic knotmethod=equal(5));
   class fuel;
   model nox = spl | fuel;
   output out=scored(where=(nmiss(nox))) pred=py;
run;

The next step displays the scores:

proc sgplot data=scored;
   series y=py x=eqratio / group=fuel;
run;

Output 24.6.30: Scored Observations

Scored Observations


The following steps score the same observations by using the PLM procedure:

proc glimmix data=sashelp.gas;
   effect spl = spline(eqratio / naturalcubic knotmethod=equal(5));
   class fuel;
   model nox = spl | fuel;
   store SplineModel;
run;

proc plm restore=SplineModel;
   score data=score out=scored2 predicted=py;
run;

proc sgplot data=scored2;
   series y=py x=eqratio / group=fuel;
run;

The scores are displayed in Output 24.6.31. For more information about PROC PLM, see Chapter 94, The PLM Procedure.

Output 24.6.31: Observations Scored by PROC PLM

Observations Scored by PROC PLM


Last updated: December 09, 2022