The MI Procedure

TRANSFORM Statement

  • TRANSFORM transform (variables</ options>)<…transform (variables</ options>)> ;

The TRANSFORM statement lists the transformations and the variables to which each transformation is applied. The options supply additional information that a transformation needs, such as the constants set by the C= and LAMBDA= options.

The MI procedure assumes that the data are from a multivariate normal distribution when either the regression method or the MCMC method is used. When some variables in a data set are clearly non-normal, it is useful to transform these variables to conform to the multivariate normality assumption. With a TRANSFORM statement, variables are transformed before the imputation process, and these transformed variable values are displayed in all of the results. When you specify an OUT= option, the variable values are back-transformed to create the imputed data set.

The following transformations can be used in the TRANSFORM statement:

BOXCOX

specifies the Box-Cox transformation of variables. The variable Y is transformed to $\frac{(Y + c)^{\lambda} - 1}{\lambda}$, where c is a constant such that each value of $Y + c$ must be positive. If the specified constant $\lambda = 0$, the logarithmic transformation is used.

EXP

specifies the exponential transformation of variables. The variable Y is transformed to $e^{(Y + c)}$, where c is a constant.

LOG

specifies the logarithmic transformation of variables. The variable Y is transformed to $\log(Y + c)$, where c is a constant such that each value of $Y + c$ must be positive.

LOGIT

specifies the logit transformation of variables. The variable Y is transformed to $\log\left(\frac{Y/c}{1 - Y/c}\right)$, where the constant $c > 0$ and the values of $Y/c$ must be between 0 and 1.

POWER

specifies the power transformation of variables. The variable Y is transformed to $(Y + c)^{\lambda}$, where c is a constant such that each value of $Y + c$ must be positive and the constant $\lambda \neq 0$.
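Because the OUT= data set contains back-transformed values, it can help to see the inverse of each transformation next to its definition. The following is a sketch obtained by solving each definition above for Y; it is not quoted from SAS documentation. The symbol Z, which denotes the transformed value, is notation introduced here for illustration. The Box-Cox inverse assumes $\lambda Z + 1 > 0$, and the power inverse assumes $Z > 0$.

```latex
\[
\begin{aligned}
\text{BOXCOX } (\lambda \neq 0):\quad & Z = \frac{(Y + c)^{\lambda} - 1}{\lambda}
  &&\Longrightarrow\quad Y = (\lambda Z + 1)^{1/\lambda} - c \\
\text{BOXCOX } (\lambda = 0),\ \text{LOG}:\quad & Z = \log(Y + c)
  &&\Longrightarrow\quad Y = e^{Z} - c \\
\text{EXP}:\quad & Z = e^{(Y + c)}
  &&\Longrightarrow\quad Y = \log(Z) - c \\
\text{LOGIT}:\quad & Z = \log\left(\frac{Y/c}{1 - Y/c}\right)
  &&\Longrightarrow\quad Y = \frac{c\, e^{Z}}{1 + e^{Z}} \\
\text{POWER}:\quad & Z = (Y + c)^{\lambda}
  &&\Longrightarrow\quad Y = Z^{1/\lambda} - c
\end{aligned}
\]
```

For instance, with C=1 and LAMBDA=.5 in the power transformation, a transformed value Z = 3 maps back to Y = 3^2 - 1 = 8.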

The following options provide the constant c and $\lambda$ values in the transformations.

C=number

specifies the c value in the transformation. The default is c = 1 for the logit transformation and c = 0 for all other transformations.

LAMBDA=number

specifies the $\lambda$ value in the power and Box-Cox transformations. You must specify the $\lambda$ value for these two transformations.

For example, the following statement requests that the variables $\log(\mathrm{y1})$, a logarithmic transformation for the variable y1, and $\sqrt{\mathrm{y2} + 1}$, a power transformation for the variable y2, be used in the imputation:

   transform log(y1) power(y2/c=1 lambda=.5);

If the MU0= option is used to specify a parameter value $\boldsymbol{\mu}_0$ for a transformed variable, the same transformation for the variable is also applied to its corresponding MU0= value in the t test. Otherwise, $\boldsymbol{\mu}_0 = 0$ is used for the transformed variable. See Example 82.10 for an example that uses the TRANSFORM statement.
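To show the TRANSFORM statement in the context of a complete step, here is a minimal sketch. The data set names (work.study, work.study_mi), the simulated variables y1, y2, and y3, the seeds, the number of imputations, and the MU0= values are all assumptions made for illustration. None of them come from the documentation above or from Example 82.10.

```sas
/* Hypothetical data: y1 and y2 are right-skewed, y3 is roughly normal. */
/* About 15% of the y1 and y2 values are set to missing.                */
data work.study;
   call streaminit(20221209);
   do id = 1 to 200;
      y1 = exp(rand('normal', 1, 0.5));       /* log(y1) is roughly normal    */
      y2 = rand('normal', 3, 0.5)**2 - 1;     /* sqrt(y2+1) is roughly normal */
      y3 = rand('normal', 50, 10);
      if rand('uniform') < 0.15 then y1 = .;
      if rand('uniform') < 0.15 then y2 = .;
      output;
   end;
run;

/* Impute on the transformed scale. The MU0= values are listed in the   */
/* order of the VAR statement and are given on the original scale.      */
proc mi data=work.study seed=12345 nimpute=5
        mu0=3 8 50 out=work.study_mi;
   transform log(y1) power(y2 / c=1 lambda=.5);
   mcmc;
   var y1 y2 y3;
run;
```

Under the rule described above, the t tests in this sketch would compare the mean of the transformed y1 with log(3) and the mean of the transformed y2 with (8 + 1)^.5 = 3, while y3 is tested against 50 unchanged. The values of y1 and y2 in work.study_mi are on the original scale, because the OUT= data set is back-transformed.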

Last updated: December 09, 2022