The BCHOICE Procedure

PREDDIST Statement

PREDDIST OUTPRED=SAS-data-set <COVARIATES=SAS-data-set>;

The PREDDIST statement creates a new SAS data set that contains random samples from the posterior predictive distribution of the choice probabilities. It enables you to get the expected choice probabilities of all the alternatives in a choice set.

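The posterior predictive distribution is the distribution of unobserved observations (predictions) conditional on the observed data. Let $\mathbf{Y}$ be the observed data, $\mathbf{X}$ be the covariates, $\boldsymbol{\theta}$ be the parameter, and $\mathbf{Y}_{\text{pred}}$ be the unobserved data. The posterior predictive distribution is defined as follows:

$$p(\mathbf{Y}_{\text{pred}} \mid \mathbf{Y}, \mathbf{X}) = \int p(\mathbf{Y}_{\text{pred}} \mid \boldsymbol{\theta}, \mathbf{Y}, \mathbf{X}) \, p(\boldsymbol{\theta} \mid \mathbf{Y}, \mathbf{X}) \, d\boldsymbol{\theta}$$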

Assuming that the observed and unobserved data are conditionally independent given $\boldsymbol{\theta}$, the posterior predictive distribution can be further simplified as follows:

$$p(\mathbf{Y}_{\text{pred}} \mid \mathbf{Y}, \mathbf{X}) = \int p(\mathbf{Y}_{\text{pred}} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta} \mid \mathbf{Y}, \mathbf{X}) \, d\boldsymbol{\theta}$$

The posterior predictive distribution is an integral of the likelihood function $p(\mathbf{Y}_{\text{pred}} \mid \boldsymbol{\theta})$ with respect to the posterior distribution $p(\boldsymbol{\theta} \mid \mathbf{Y}, \mathbf{X})$. The PREDDIST statement generates samples from a posterior predictive distribution based on draws from the posterior distribution of $\boldsymbol{\theta}$.
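
As a rough illustration of how posterior draws turn into predicted choice probabilities, the integral can be approximated by averaging over draws. The sketch below assumes $T$ posterior draws $\boldsymbol{\theta}^{(t)}$ and, purely for illustration, a standard multinomial logit model in which alternative $j$ in a choice set of $J$ alternatives has covariate vector $\mathbf{x}_j$. The symbols $T$, $J$, $\mathbf{x}_j$, and $\pi_j$ are notation introduced here and are not taken from the procedure itself; other model types would use a different expression for $\pi_j^{(t)}$.

```latex
% One sample of the choice probability per posterior draw (logit form assumed)
\pi_j^{(t)} = \frac{\exp\!\left(\mathbf{x}_j^{\top} \boldsymbol{\theta}^{(t)}\right)}
                   {\sum_{k=1}^{J} \exp\!\left(\mathbf{x}_k^{\top} \boldsymbol{\theta}^{(t)}\right)},
\qquad t = 1, \dots, T

% Averaging the samples estimates the expected choice probability
\mathrm{E}\!\left[\pi_j \mid \mathbf{Y}, \mathbf{X}\right] \approx \frac{1}{T} \sum_{t=1}^{T} \pi_j^{(t)}
```

Under these assumptions, the $\pi_j^{(t)}$ values play the role of the samples of the choice probabilities, and their average over $t$ plays the role of the expected choice probability of alternative $j$.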

You can specify the following options:

COVARIATES=SAS-data-set: names the SAS data set that contains the sets of explanatory variable values for which the predictions are established. This data set must contain the same variables that are used in the model. If you omit the COVARIATES= option, the DATA= data set that is specified in the PROC BCHOICE statement is used instead.
NALTER=n | NALTERNATIVE=n: specifies the number of alternatives in a choice set in the COVARIATES= data set. All choice sets in the data must have the same number of alternatives. You must specify this option if you specify a COVARIATES= data set.
OUTPRED=SAS-data-set: creates an output data set that contains the samples from the posterior predictive distribution of the choice probability that each alternative is chosen from a choice set. The observations in the output data set are in the order of either the COVARIATES= data set or the DATA= data set that is specified in the PROC BCHOICE statement. However, multithreading and data deletion might cause the order to change.
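
The options above might be combined as in the following minimal sketch. The data set names (Train, NewProfiles, Pred) and the variable names (Choice, Brand, Price, Subj, Set) are hypothetical and chosen only for illustration. The sketch also assumes that NewProfiles holds choice sets of exactly three alternatives each and that the model is specified with the CHOICESET= option in the MODEL statement. Check the PROC BCHOICE and MODEL statement documentation for the options that apply to your model.

```sas
/* Hypothetical example: all data set and variable names are assumptions */
proc bchoice data=Train seed=123 nmc=10000;
   class Brand Subj Set;
   /* Subj and Set together identify a choice set in the training data */
   model Choice = Brand Price / choiceset=(Subj Set);
   /* Predict for the profiles in NewProfiles, three alternatives per choice set */
   preddist covariates=NewProfiles nalter=3 outpred=Pred;
run;

/* Inspect the first few posterior predictive samples */
proc print data=Pred(obs=5);
run;
```

If you omit COVARIATES= and NALTER=, the predictions are made for the choice sets in the DATA= data set instead.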

Last updated: December 09, 2022