The BCHOICE Procedure

MODEL Statement

  • MODEL response <(response-options)> = <fixed-effects> </ model-options>;

The MODEL statement is required; it defines the response (dependent) variable and the fixed effects. When the response is binary, the response variable indicates the chosen alternative in a choice set by the value 1 and the unchosen alternatives by the value 0. In a MaxDiff choice experiment, a respondent picks one best item and one worst item in each choice set, so the response takes one of three valid integer values: 1 for the best alternative, –1 for the worst alternative, and 0 for the ones in the middle. In the allocation choice setting, the response takes a percentage of preference for each alternative, and the percentages in each choice set sum to 100%. Any value of the response variable that is invalid, including missing, is replaced by zero. The fixed-effects determine the $\mathbf{X}$ matrix of the model. The specification of effects is the same as in the GLIMMIX and MIXED procedures, where you do not specify random effects in the MODEL statement.

Table 3 summarizes the options available in the MODEL statement. These are subsequently discussed in detail in alphabetical order by option category.

Table 3: MODEL Statement Options

Option          Description
ALTERIDX=       Specifies the order of all alternatives in a choice set
CHOICESET=      Specifies the variables for defining a choice set
CHOICETYPE=     Specifies what values the choice response variable takes
COEFFPRIOR=     Specifies the prior of the regression coefficients
COVPRIOR=       Specifies the prior of the covariance parameter for a probit model
COVTYPE=        Specifies the structure of the covariance matrix of the error difference for a probit model
INIT=           Controls the generation of initial values of the regression coefficients
LAMBDAPRIOR=    Specifies the prior of the log-sum coefficients for a nested logit model
NEST=           Defines the nonoverlapping nests for a nested logit model
SAMELAMBDA      Constrains the log-sum coefficients to be the same for all the nests in a nested logit model
TYPE=           Specifies the type of the model


ALTERIDX=variable

specifies the order of all alternatives in a choice set for a nested logit or probit model. Specify this option to make sure the alternatives are matched in the same order across all choice sets. When you do not specify this option, PROC BCHOICE assumes that the alternatives already appear in the same order in every choice set. The variable should take increasing positive integers starting from 1 within each choice set. This option applies only to a nested logit or probit model; PROC BCHOICE ignores it for a logit model.

CHOICESET=(variables)

specifies one or more variables for defining the choice sets. You must specify how the choice sets are constructed, and you can use more than one variable. PROC BCHOICE does not sort by the values of the choice set variable; rather, it considers the data to be from a new choice set whenever the value of the choice set variable changes from the previous observation.
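The grouping rule above can be illustrated with a small sketch. The data set `Choices`, its variables (`Subj`, `Set`, `Brand`, `Price`, `Choice`), and the values of the SEED= and NMC= options in the PROC BCHOICE statement are hypothetical and chosen only for illustration; listing the choice set variables in the CLASS statement is likewise an assumption about typical usage. A real analysis would need far more choice sets than shown here.

```sas
/* Hypothetical data: 2 respondents x 2 choice sets x 3 alternatives. */
/* Rows of the same choice set are adjacent, and each set has exactly */
/* one chosen alternative (Choice=1).                                 */
data Choices;
   input Subj Set Brand $ Price Choice;
   datalines;
1 1 A 1.99 0
1 1 B 2.49 1
1 1 C 2.99 0
1 2 A 2.49 0
1 2 B 2.99 0
1 2 C 1.99 1
2 1 A 1.99 1
2 1 B 2.49 0
2 1 C 2.99 0
2 2 A 2.49 0
2 2 B 2.99 1
2 2 C 1.99 0
;
run;

/* PROC BCHOICE does not sort, so make sure that the rows of each */
/* choice set are contiguous before fitting the model.            */
proc sort data=Choices;
   by Subj Set;
run;

/* A new choice set starts whenever the (Subj, Set) pair changes. */
proc bchoice data=Choices seed=1 nmc=5000;
   class Brand Subj Set;
   model Choice = Brand Price / choiceset=(Subj Set);
run;
```

In this layout, specifying CHOICESET=(Subj) alone would merge both sets of a respondent into a single six-row choice set with two chosen alternatives, which is why both variables are listed.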

CHOICETYPE=choice-type

specifies what values the choice outcome variable takes. You can specify the following choice-types:

BINARY

specifies that the choice response variable has a binary integer value: 1 for a chosen alternative and 0 for an unchosen alternative. Each choice set has at least two alternatives and only one chosen alternative.

MAXDIFF

specifies that the choice response variable takes one of three possible integer values: 1 for the best alternative, –1 for the worst alternative, and 0 for the ones in the middle. Each respondent is shown a set of items to evaluate and chooses one best item and one worst item in each choice set.

ALLOCATION <(WEIGHT=n | variable)>

specifies that the choice response variable takes a percentage of preference for each alternative and that the sum of percentages in each choice set is 100%. You can specify the following suboption:

WEIGHT=n

specifies a positive integer to indicate how many times the allocated percentages of each choice set in the data should be applied. For example, if the respondent were asked to imagine the next 20 purchase occasions and to allocate a percentage of those 20 purchases for each alternative, you would use a weight of 20. The same weight is used for all the choice sets in the data. By default, WEIGHT=10.

WEIGHT=variable

specifies a variable from the DATA= data set to indicate how many times the allocated percentages of a choice set should be counted. This suboption allows varying weights among the choice sets. The variable takes nonmissing positive integers. This variable should be the same within each choice set, because the weight is supposed to be applied at the choice set level and is not specific to any alternative in a choice set.

By default, CHOICETYPE=BINARY.
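As a minimal sketch of the allocation setting, the following step applies the weight of 20 from the purchase-occasion example above. The data set `Alloc` and its variables (`Share`, `Brand`, `Price`, `Subj`, `Set`, `NOccasions`) are hypothetical, as are the SEED= and NMC= values; `Share` is assumed to hold the allocated percentages, which sum to 100% within each choice set.

```sas
/* Same weight for every choice set: each allocation counts 20 times. */
proc bchoice data=Alloc seed=1 nmc=5000;
   class Brand Subj Set;
   model Share = Brand Price / choiceset=(Subj Set)
                               choicetype=allocation(weight=20);
run;

/* Varying weights: NOccasions (hypothetical) is a positive integer */
/* that is constant within each choice set.                         */
proc bchoice data=Alloc seed=1 nmc=5000;
   class Brand Subj Set;
   model Share = Brand Price / choiceset=(Subj Set)
                               choicetype=allocation(weight=NOccasions);
run;
```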

COEFFPRIOR=NORMAL <(options)>
CPRIOR=NORMAL <(options)>

specifies the prior distribution for the regression coefficients. The default is the normal prior $N(\mathbf{0}, 10^2 \mathbf{I})$, where $\mathbf{I}$ is the identity matrix. You can specify the following options, enclosed in parentheses:

INPUT=SAS-data-set

specifies a SAS data set that contains the mean and covariance information of the normal prior. The data set must have a _TYPE_ variable to represent the type of each observation and a variable for each regression coefficient. If the data set also contains a _NAME_ variable, the values of this variable are used to identify the covariances for the _TYPE_='COV' observations; otherwise, the _TYPE_='COV' observations are assumed to be in the same order as the explanatory variables in the MODEL statement. PROC BCHOICE reads the mean vector from the observation for which _TYPE_='MEAN' and reads the covariance matrix from observations for which _TYPE_='COV'. For an independent normal prior, the variances can be specified with _TYPE_='VAR'; alternatively, the precisions (inverse of the variances) can be specified with _TYPE_='PRECISION'.

VAR <=c>

specifies the normal prior $N(\mathbf{0}, c\mathbf{I})$, where $\mathbf{I}$ is the identity matrix and $c$ is a scalar.
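The two ways of setting the coefficient prior might look as follows. This is a sketch under stated assumptions: the data set `Choices`, the continuous covariates `Price` and `Size`, the prior values, and the SEED= and NMC= values are all hypothetical, and the example assumes that the coefficient of a continuous covariate is identified in the INPUT= data set by a variable of the same name.

```sas
/* Independent normal prior given by a mean row and a variance row. */
/* One variable per regression coefficient, plus _TYPE_.            */
data Prior;
   input _TYPE_ $ Price Size;
   datalines;
MEAN 0 0
VAR  4 25
;
run;

proc bchoice data=Choices seed=1 nmc=5000;
   class Subj Set;
   model Choice = Price Size / choiceset=(Subj Set)
                               coeffprior=normal(input=Prior);
run;

/* Alternatively, a common prior variance c=25 for all coefficients. */
proc bchoice data=Choices seed=1 nmc=5000;
   class Subj Set;
   model Choice = Price Size / choiceset=(Subj Set)
                               coeffprior=normal(var=25);
run;
```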

COVPRIOR=IWISHART <(options)>

specifies an inverse Wishart prior distribution, IWISHART(a,b), for the covariance matrix for the vector of error differences. For models that do not have a covariance matrix for the error differences (the logit and nested logit models), this option is ignored.

You can specify the following options, enclosed in parentheses:

DF=a

specifies the degrees of freedom of the inverse Wishart distribution. The default is the number of alternatives in the choice set plus 2, which is equivalent to the dimension of the covariance matrix of the error differences plus 3.

SCALE=b

specifies $b\mathbf{I}$ for the scale parameter of the inverse Wishart distribution, where $\mathbf{I}$ is the identity matrix. The default is the number of alternatives in the choice set plus 2.

COVTYPE=UN | VC

specifies the covariance structure of the error difference vector for a probit model. COVTYPE=VC (variance components) models a different variance component for each error difference term. COVTYPE=UN (unstructured) specifies a fully unstructured covariance matrix, which accommodates any pattern of correlation in addition to fitting a different variance component for each error difference term.
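The probit-specific options can be combined as in the following sketch. The data set `Trips`, its variables (`Choice`, `Mode`, `TravelTime`, `Subj`, `AltNum`), and the SEED= and NMC= values are hypothetical. The example assumes four alternatives per choice set, so DF=6 and SCALE=6 merely spell out the documented defaults (number of alternatives plus 2); separating the two suboptions by a blank is an assumption about the syntax.

```sas
/* Probit model with an unstructured covariance matrix for the    */
/* error differences. AltNum (hypothetical) numbers alternatives  */
/* 1, 2, 3, 4 within each choice set.                             */
proc bchoice data=Trips seed=1 nmc=10000;
   class Mode Subj;
   model Choice = Mode TravelTime / choiceset=(Subj)
                                    type=probit
                                    alteridx=AltNum
                                    covtype=un
                                    covprior=iwishart(df=6 scale=6);
run;
```

Replacing COVTYPE=UN with COVTYPE=VC would request the variance components structure instead.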

INIT=keyword-list | (numeric-list)
INITIAL=keyword-list | (numeric-list)

specifies options for generating the initial values of the coefficient parameters that are specified as fixed-effects in the MODEL statement. By default, INIT=POSTMODE for logit models and INIT=PRIORMODE for probit models. You can specify the following keywords:

LIST=numeric-list

assigns the numbers to be the initial values of the fixed effects in the corresponding list order. The length of the list should be the same as the number of fixed effects. For example, the following statement assigns the values 1, 2, and 3 to the first, second, and third coefficients in the model and prints the table of initial values:

model y = x / choiceset=(ID Index) init=(list=(1 2 3) pinit);

If the length of the list is less than the number of fixed effects, each remaining parameter is assigned its default initial value (for example, the mode of the posterior density for a logit model). If the length of the list is greater than the number of fixed effects, the extra values are ignored.

PINIT

tabulates initial values for the fixed effects. (By default, PROC BCHOICE does not display the initial values.)

POSTMODE

uses the mode of the posterior density as the initial value of the parameter, if you do not provide one. If the mode does not exist or if it is on the boundary of the support of the density, the mean value is used. If you specify POSTMODE for a probit model, where the posterior density is difficult to obtain, PROC BCHOICE resets it to PRIORMODE.

PRIORMODE

uses the mode of the prior density as the initial value of the parameter.

LAMBDAPRIOR=SEMIFLAT <(options)>
LPRIOR=SEMIFLAT <(options)>

specifies a semi-flat prior distribution (Lahiri and Gao 2002) for the log-sum coefficient, $\lambda$, for each nest in a nested logit model. For models that are not nested logit, this option is ignored.

You can specify the following option, enclosed in parentheses:

PHI=a

specifies the parameter $\phi$ of the semi-flat prior. By default, $\phi = 0.8$.

NEST=(numeric-list)

defines the nonoverlapping nests for a nested logit model. For a nested logit model, you must specify the nests for all the alternatives in the choice set; otherwise, the standard logit model is assumed. The number of values in the list should match the number of alternatives in the choice sets, and each value identifies the nest to which the corresponding alternative belongs. For example, NEST=(1 2 1 1 2) places the first, third, and fourth alternatives in the first nest and the second and fifth alternatives in the second nest. Currently, this option can accommodate only two-level nested logit models.

SAMELAMBDA

constrains the log-sum coefficients to be the same for all the nests in a nested logit model.
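A nested logit specification that uses the NEST=(1 2 1 1 2) layout described above might be sketched as follows. The data set `Trips`, its variables (`Choice`, `Mode`, `TravelTime`, `Subj`, `AltNum`), the value PHI=0.5, and the SEED= and NMC= values are hypothetical; the example assumes five alternatives per choice set.

```sas
/* Two-level nested logit: alternatives 1, 3, 4 form nest 1 and     */
/* alternatives 2, 5 form nest 2. AltNum (hypothetical) fixes the   */
/* order of the alternatives so that the NEST= list lines up with   */
/* them in every choice set.                                        */
proc bchoice data=Trips seed=1 nmc=10000;
   class Mode Subj;
   model Choice = Mode TravelTime / choiceset=(Subj)
                                    type=nlogit
                                    alteridx=AltNum
                                    nest=(1 2 1 1 2)
                                    samelambda
                                    lambdaprior=semiflat(phi=0.5);
run;
```

Omitting SAMELAMBDA would let each nest have its own log-sum coefficient.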

TYPE=keyword

specifies the type of the model. You can specify the following keywords:

LOGIT

specifies a standard logit model.

NLOGIT

specifies a nested logit model. If you do not also specify the NEST= option to define the nests, this option is ignored, and a standard logit model is fit.

PROBIT

specifies a probit model.

Last updated: December 09, 2022