This example creates a classification tree model to determine important variables (parameters) during the manufacture of a semiconductor device. Some of the variables that are involved in the manufacturing process are as follows: gTemp is the growth temperature of substrate, aTemp is the anneal temperature, Rot is rotation speed, Dopant is the atom that is used during device growth, and Usable indicates whether the device is usable.
The following statements create a data set named MBE_Data, which contains measurements for 20 devices:
data MBE_Data;
label gTemp = 'Growth Temperature of Substrate';
label aTemp = 'Anneal Temperature';
label Rot = 'Rotation Speed';
label Dopant = 'Dopant Atom';
label Usable = 'Experiment Could Be Performed';
input gTemp aTemp Rot Dopant $ Usable $;
datalines;
384.614 633.172 1.01933 C Unusable
363.874 512.942 0.72057 C Unusable
397.395 671.179 0.90419 C Unusable
389.962 653.940 1.01417 C Unusable
387.763 612.545 1.00417 C Unusable
394.206 617.021 1.07188 Si Usable
387.135 616.035 0.94740 Si Usable
428.783 745.345 0.99087 Si Unusable
399.365 600.932 1.23307 Si Unusable
455.502 648.821 1.01703 Si Unusable
387.362 697.589 1.01623 Ge Usable
408.872 640.406 0.94543 Ge Usable
407.734 628.196 1.05137 Ge Usable
417.343 612.328 1.03960 Ge Usable
482.539 669.392 0.84249 Ge Unusable
367.116 564.246 0.99642 Sn Unusable
398.594 733.839 1.08744 Sn Unusable
378.032 619.561 1.06137 Sn Usable
357.544 606.871 0.85205 Sn Unusable
384.578 635.858 1.12215 Sn Unusable
;
The following statements create the tree model:
proc hpsplit data=MBE_Data maxdepth=6;
class Usable Dopant;
model Usable = gTemp aTemp Rot Dopant;
prune none;
run;
Output 68.5.1 shows the "Variable Importance" table.
Output 68.5.1: Variable Importance
| Variable | Variable Label | Relative (Training) | Importance (Training) | Count |
|---|---|---|---|---|
| gTemp | Growth Temperature of Substrate | 1.0000 | 2.1022 | 2 |
| Dopant | Dopant Atom | 0.7522 | 1.5811 | 1 |
| aTemp | Anneal Temperature | 0.6228 | 1.3093 | 1 |
| Rot | Rotation Speed | 0.3250 | 0.6831 | 1 |
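This table shows that the predictor gTemp has the largest importance value (relative importance 1.0000), followed by Dopant (0.7522), aTemp (0.6228), and Rot (0.3250). In this model, the growth temperature of the substrate is therefore the most important variable in determining whether a device is usable.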
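To work with the importance values in later steps rather than only reading them from the displayed table, one option is to save them to a data set. The following statements are a minimal sketch that assumes the IMPORTANCE= option of the OUTPUT statement in PROC HPSPLIT is available in your release. The data set name VarImp is a hypothetical name chosen for illustration, and the column names in that data set are whatever the procedure writes.

```sas
/* Sketch: refit the same tree and save variable importance to a data set. */
/* VarImp is a hypothetical data set name, not part of the original example. */
proc hpsplit data=MBE_Data maxdepth=6;
   class Usable Dopant;
   model Usable = gTemp aTemp Rot Dopant;
   prune none;
   output importance=VarImp;   /* assumes IMPORTANCE= is supported */
run;

/* Print the saved importance values for inspection. */
proc print data=VarImp;
run;
```

The printed data set should contain the same information as the "Variable Importance" table, in a form that can be sorted, merged, or plotted.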