The MCMC Procedure

Access Lag and Lead Variables

(View the complete code for this example.)

There are two types of random variables in PROC MCMC that are indexed: the response (MODEL statement) is indexed by observations, and the random effect (RANDOM statement) is indexed by the SUBJECT= option variable. As the procedure steps through the input data set, the response or the random-effects symbols are filled with values of the current observation or the random-effects parameters in the current subject. Often you might want to access lag or lead variables across an index. For example, the likelihood function for y Subscript i can depend on y Subscript i minus 1 in an autoregressive model, or the prior distribution for mu Subscript j can depend on mu Subscript k, where k not-equals j, in a dynamic linear model. In these situations, you can use the following rules to construct symbols to access values from other observations or subjects:

  • rv.Li: the ith lag of the variable rv. This looks back to the past.

  • rv.Ni: the ith lead of the variable rv. This looks forward to the future.

The construction is allowed for random variables that are associated with an index, either a response variable or a random-effects variable. You concatenate the variable name, a dot, either the letter L (for “lag”) or the letter N (for “next”), and a lag or a lead number. PROC MCMC resolves these variables according to the indices that are associated with the random variable, with respect to the current observation.

For example, the following RANDOM statement specifies a first-order Markov dependence in the random effect mu that is indexed by the subject variable time:

random mu ~ normal(mu.l1,sd=1) subject=time;

This corresponds to the prior

mu Subscript t Baseline tilde normal left-parenthesis mu Subscript t minus 1 Baseline comma sd equals 1 right-parenthesis

At each observation, PROC MCMC fills in the symbol mu with the random-effects parameter mu Subscript t that belongs to the current cluster t. To fill in the symbol mu.l1, the procedure looks back and finds a lag-1 random-effects parameter, mu Subscript t minus 1, from the last cluster t-1. As the procedure moves forward in the input data set, these two symbols are constantly updated, as appropriate.

When the index is out of range, such as t–1 when t is 1, PROC MCMC fills in the missing state from the ICOND= option in either the MODEL or RANDOM statement. The following example illustrates how PROC MCMC fills in the values of these lag and lead variables as it steps through the data set.

Assume that the random effect mu has five levels, indexed by sans-serif s sans-serif u sans-serif b equals StartSet a comma b comma c comma d comma e EndSet. The model contains two lag variables and one lead variable (mu.l1, mu.l2, and mu.n2):

mn = (mu.l1 + mu.l2 + mu.n2) / 3
random mu ~ normal(mn, sd=1) subject=time icond=(alpha beta gamma kappa);

In this setup, instead of a list of five random-effects parameters that the variable mu can be assigned values to, there is now a list of nine variables for the variables mu, mu.l1, mu.l2, and mu.n2. The list lines up in the following order:

sans-serif a sans-serif l sans-serif p sans-serif h sans-serif a comma sans-serif b sans-serif e sans-serif t sans-serif a comma mu Subscript a Baseline comma mu Subscript b Baseline comma mu Subscript c Baseline comma mu Subscript d Baseline comma mu Subscript e Baseline comma sans-serif g sans-serif a sans-serif m sans-serif m sans-serif a comma sans-serif k sans-serif a sans-serif p sans-serif p sans-serif a

PROC MCMC finds relevant symbol values according to this list, as the procedure steps through different subject cluster. The process is illustrated in Table 46.

Table 46: Processing Lag and Lead Variables

sub mu mu.l2 mu.l1 mu.n2
a mu Subscript a alpha beta mu Subscript c
b mu Subscript b beta mu Subscript a mu Subscript d
c mu Subscript c mu Subscript a mu Subscript b mu Subscript e
d mu Subscript d mu Subscript b mu Subscript c gamma
e mu Subscript e mu Subscript c mu Subscript d kappa


For observations in cluster a, PROC MCMC sets the random-effects variable mu to mu Subscript a, looks back two lags and fills in mu.l2 with alpha, looks back one lag and fills in mu.l1 with beta, and looks forward two leads and fills in mu.n2 with mu Subscript c. As the procedure moves to observations in cluster b, mu becomes mu Subscript b, mu.l2 becomes beta, mu.l1 becomes mu Subscript a, and mu.n2 becomes mu Subscript d. For observations in the last cluster, cluster e, mu becomes mu Subscript e, mu.l2 becomes mu Subscript c, mu.l1 becomes mu Subscript d, and mu.n2 is filled with kappa.

The following example fits a simple first-order dynamic linear model, in which the data set contains a time index and the response variable y:

data dlm;
   input time y;
   datalines;
1   1.353412529
2   4.840739953
3   1.604892523
4   6.8947921
5   3.509644288
6   4.020173553
7   3.842884451
8   4.49057276
9   2.204570502
10   4.007351323
11   2.005515044
12   2.781756057
;

You can fit the following model to the data:

StartLayout 1st Row 1st Column upper Y Subscript t 2nd Column tilde 3rd Column normal left-parenthesis mu Subscript t Baseline comma var equals sigma Subscript y Superscript 2 Baseline right-parenthesis 2nd Row 1st Column mu Subscript t 2nd Column tilde 3rd Column normal left-parenthesis mu Subscript t minus 1 Baseline comma var equals sigma Subscript mu Superscript 2 Baseline right-parenthesis 3rd Row 1st Column mu 0 2nd Column equals 3rd Column alpha 4th Row 1st Column alpha 2nd Column tilde 3rd Column normal left-parenthesis 0 comma var equals 10 right-parenthesis 5th Row 1st Column sigma Subscript y Superscript 2 Baseline comma sigma Subscript mu Superscript 2 Baseline 2nd Column tilde 3rd Column igamma left-parenthesis shape equals 3 comma scale equals 2 right-parenthesis EndLayout

The following PROC MCMC statements estimate parameters from this dynamic linear model:

proc mcmc data=dlm outpost=dlmO nmc=20000 seed=23;
   ods select PostSumInt;
   parms alpha 0;
   parms var_y 1 var_mu 1;
   prior alpha ~ n(0, sd=10);
   prior var_y var_mu ~ igamma(shape=3, scale=2);
   random mu ~ n(mu.l1,var=var_mu) s=time icond=(alpha) monitor=(mu);
   model y~n(mu, var=var_y);
run;

The key component is the mu.l1 specification in the RANDOM statement, where the prior for mu depends on its lag-1 value. The ICOND= option specifies the initial condition of mu and assigns it to be alpha, which is a parameter in the model.

Figure 16 lists the estimated posterior statistics for the parameters.

Figure 16: Posterior Summary Statistics of the Dynamic Linear Model

The MCMC Procedure

Posterior Summaries and Intervals
Parameter N Mean Standard
Deviation
95% HPD Interval
alpha 20000 2.6498 1.2924 -0.0555 5.0366
var_y 20000 1.7400 0.8337 0.5651 3.3498
var_mu 20000 0.8299 0.5720 0.2109 1.9582
mu_1 20000 2.6899 0.9253 0.8380 4.5390
mu_2 20000 3.4175 0.7622 1.9336 4.9541
mu_3 20000 3.3820 0.7431 1.8543 4.7950
mu_4 20000 4.3364 0.8297 2.7406 6.0198
mu_5 20000 3.9308 0.7194 2.4967 5.3248
mu_6 20000 3.8642 0.7348 2.4117 5.3832
mu_7 20000 3.7716 0.7399 2.3019 5.1525
mu_8 20000 3.6754 0.7316 2.1683 5.0342
mu_9 20000 3.1796 0.7209 1.7485 4.6257
mu_10 20000 3.2237 0.7355 1.7802 4.6253
mu_11 20000 2.8074 0.7839 1.2985 4.3701
mu_12 20000 2.8006 0.8758 1.1286 4.5473


Last updated: December 09, 2022