The SIM2D Procedure

Theoretical Development

It is a simple matter to produce an upper N left-parenthesis 0 comma 1 right-parenthesis random number, and by stacking k upper N left-parenthesis 0 comma 1 right-parenthesis random numbers in a column vector, you can obtain a vector with independent standard normal components bold upper W tilde upper N Subscript k Baseline left-parenthesis bold 0 comma bold upper I right-parenthesis. The meaning of the terms independence and randomness in the context of a deterministic algorithm required for the generation of these numbers is subtle; see Knuth (1981, Chapter 3) for details.

Rather than bold upper W tilde upper N Subscript k Baseline left-parenthesis bold 0 comma bold upper I right-parenthesis, what is required is the generation of a vector bold upper Z tilde upper N Subscript k Baseline left-parenthesis bold 0 comma bold upper C right-parenthesis—that is,

bold upper Z equals Start 4 By 1 Matrix 1st Row  upper Z 1 2nd Row  upper Z 2 3rd Row  vertical-ellipsis 4th Row  upper Z Subscript k EndMatrix

with covariance matrix

bold upper C equals Start 4 By 4 Matrix 1st Row 1st Column upper C 11 2nd Column upper C 12 3rd Column midline-horizontal-ellipsis 4th Column upper C Subscript 1 k Baseline 2nd Row 1st Column upper C 21 2nd Column upper C 22 3rd Column midline-horizontal-ellipsis 4th Column upper C Subscript 2 k Baseline 3rd Row 1st Column Blank 2nd Column down-right-diagonal-ellipsis 3rd Column Blank 4th Column Blank 4th Row 1st Column upper C Subscript k Baseline 1 Baseline 2nd Column upper C Subscript k Baseline 2 Baseline 3rd Column midline-horizontal-ellipsis 4th Column upper C Subscript k k EndMatrix

If the covariance matrix is symmetric and positive definite, it has a Cholesky root bold upper L such that bold upper C can be factored as

bold upper C equals bold upper L bold upper L prime

where bold upper L is lower triangular. See Ralston and Rabinowitz (1978, Chapter 9, Section 3-3) for details. This vector bold upper Z can be generated by the transformation bold upper Z equals bold upper L bold upper W. Here is where the assumption of a Gaussian SRF is crucial. When bold upper W tilde upper N Subscript k Baseline left-parenthesis bold 0 comma bold upper I right-parenthesis, then bold upper Z equals bold upper L bold upper W is also Gaussian. The mean of bold upper Z is

upper E left-parenthesis bold upper Z right-parenthesis equals bold upper L left-parenthesis upper E left-parenthesis bold upper W right-parenthesis right-parenthesis equals bold 0

and the variance is

Var left-parenthesis bold upper Z right-parenthesis equals Var left-parenthesis bold upper L bold upper W right-parenthesis equals upper E left-parenthesis bold upper L bold upper W bold upper W prime bold upper L Superscript prime Baseline right-parenthesis equals bold upper L upper E left-parenthesis bold upper W bold upper W Superscript prime Baseline right-parenthesis bold upper L Superscript prime Baseline equals bold upper L bold upper L prime equals bold upper C

Consider now an SRF upper Z left-parenthesis bold-italic s right-parenthesis comma bold-italic s element-of upper D subset-of script upper R squared, with spatial covariance function upper C left-parenthesis bold-italic h right-parenthesis. Fix locations bold-italic s 1 comma bold-italic s 2 comma ellipsis comma bold-italic s Subscript k Baseline, and let bold upper Z denote the random vector

bold upper Z equals Start 4 By 1 Matrix 1st Row  upper Z left-parenthesis bold-italic s 1 right-parenthesis 2nd Row  upper Z left-parenthesis bold-italic s 2 right-parenthesis 3rd Row  vertical-ellipsis 4th Row  upper Z left-parenthesis bold-italic s Subscript k Baseline right-parenthesis EndMatrix

with corresponding covariance matrix

bold upper C Subscript z Baseline equals Start 4 By 4 Matrix 1st Row 1st Column upper C left-parenthesis bold 0 right-parenthesis 2nd Column upper C left-parenthesis bold-italic s 1 minus bold-italic s 2 right-parenthesis 3rd Column midline-horizontal-ellipsis 4th Column upper C left-parenthesis bold-italic s 1 minus bold-italic s Subscript k Baseline right-parenthesis 2nd Row 1st Column upper C left-parenthesis bold-italic s 2 minus bold-italic s 1 right-parenthesis 2nd Column upper C left-parenthesis bold 0 right-parenthesis 3rd Column midline-horizontal-ellipsis 4th Column upper C left-parenthesis bold-italic s 2 minus bold-italic s Subscript k Baseline right-parenthesis 3rd Row 1st Column Blank 2nd Column down-right-diagonal-ellipsis 3rd Column Blank 4th Column Blank 4th Row 1st Column upper C left-parenthesis bold-italic s Subscript k Baseline minus bold-italic s 1 right-parenthesis 2nd Column upper C left-parenthesis bold-italic s Subscript k Baseline minus bold-italic s 2 right-parenthesis 3rd Column midline-horizontal-ellipsis 4th Column upper C left-parenthesis bold 0 right-parenthesis EndMatrix

Since this covariance matrix is symmetric and positive definite, it has a Cholesky root, and the upper Z left-parenthesis bold-italic s Subscript i Baseline right-parenthesis comma i equals 1 comma ellipsis comma k, can be simulated as described previously. This is how the SIM2D procedure implements unconditional simulation in the zero-mean case. More generally,

upper Z left-parenthesis bold-italic s right-parenthesis equals mu left-parenthesis bold-italic s right-parenthesis plus epsilon left-parenthesis bold-italic s right-parenthesis

where mu left-parenthesis bold-italic s right-parenthesis is a quadratic form in the coordinates bold-italic s equals left-parenthesis x comma y right-parenthesis and the epsilon left-parenthesis bold-italic s right-parenthesis is an SRF that has the same covariance matrix bold upper C Subscript z as previously. In this case, the mu left-parenthesis bold-italic s Subscript i Baseline right-parenthesis comma i equals 1 comma ellipsis comma k, is computed once and added to the simulated vector epsilon left-parenthesis bold-italic s Subscript i Baseline right-parenthesis comma i equals 1 comma ellipsis comma k, for each realization.

For a conditional simulation, this distribution of

bold upper Z equals Start 4 By 1 Matrix 1st Row  upper Z left-parenthesis bold-italic s 1 right-parenthesis 2nd Row  upper Z left-parenthesis bold-italic s 2 right-parenthesis 3rd Row  vertical-ellipsis 4th Row  upper Z left-parenthesis bold-italic s Subscript k Baseline right-parenthesis EndMatrix

must be conditioned on the observed data. The relevant general result concerning conditional distributions of multivariate normal random variables is the following. Let bold upper X tilde upper N Subscript m Baseline left-parenthesis bold-italic mu comma bold upper Sigma right-parenthesis, where

bold upper X equals StartBinomialOrMatrix bold upper X 1 Choose bold upper X 2 EndBinomialOrMatrix
bold-italic mu equals StartBinomialOrMatrix bold-italic mu 1 Choose bold-italic mu 2 EndBinomialOrMatrix
bold upper Sigma equals Start 2 By 2 Matrix 1st Row 1st Column bold upper Sigma 11 2nd Column bold upper Sigma 12 2nd Row 1st Column bold upper Sigma 21 2nd Column bold upper Sigma 22 EndMatrix

The subvector bold upper X 1 is k times 1, bold upper X 2 is n times 1, bold upper Sigma 11 is k times k, bold upper Sigma 22 is n times n, and bold upper Sigma 12 equals bold upper Sigma prime 21 is k times n, with k plus n equals m. The full vector bold upper X is partitioned into two subvectors, bold upper X 1 and bold upper X 2, and bold upper Sigma is similarly partitioned into covariances and cross covariances.

With this notation, the distribution of bold upper X 1 conditioned on bold upper X 2 equals bold-italic x 2 is upper N Subscript k Baseline left-parenthesis bold-italic mu overTilde comma bold upper Sigma overTilde right-parenthesis, with

bold-italic mu overTilde equals bold-italic mu 1 plus bold upper Sigma 12 bold upper Sigma 22 Superscript negative 1 Baseline left-parenthesis bold-italic x 2 minus bold-italic mu 2 right-parenthesis

and

bold upper Sigma overTilde equals bold upper Sigma 11 minus bold upper Sigma 12 bold upper Sigma 22 Superscript negative 1 Baseline bold upper Sigma 21

See Searle (1971, pp. 46–47) for details. The correspondence with the conditional spatial simulation problem is as follows. Let the coordinates of the observed data points be denoted bold-italic s overTilde Subscript 1 Baseline comma bold-italic s overTilde Subscript 2 Baseline comma ellipsis comma bold-italic s overTilde Subscript n Baseline, with values z overTilde Subscript 1 Baseline comma z overTilde Subscript 2 Baseline comma ellipsis comma z overTilde Subscript n Baseline. Let bold upper Z overTilde denote the random vector

bold upper Z overTilde equals Start 4 By 1 Matrix 1st Row  upper Z left-parenthesis bold-italic s overTilde Subscript 1 Baseline right-parenthesis 2nd Row  upper Z left-parenthesis bold-italic s overTilde Subscript 2 Baseline right-parenthesis 3rd Row  vertical-ellipsis 4th Row  upper Z left-parenthesis bold-italic s overTilde Subscript n Baseline right-parenthesis EndMatrix

The random vector bold upper Z overTilde corresponds to bold upper X 2, while bold upper Z corresponds to bold upper X 1. Then left-parenthesis bold upper Z bar bold upper Z overTilde equals bold z overTilde right-parenthesis tilde upper N Subscript k Baseline left-parenthesis bold-italic mu overTilde comma bold upper C overTilde right-parenthesis as in the previous distribution. The matrix

bold upper C overTilde equals bold upper C 11 minus bold upper C 12 bold upper C 22 Superscript negative 1 Baseline bold upper C 21

is again positive definite, so a Cholesky factorization can be performed.

The dimension n for bold upper Z overTilde is simply the number of nonmissing observations for the VAR= variable; the values z overTilde Subscript 1 Baseline comma z overTilde Subscript 2 Baseline comma ellipsis comma z overTilde Subscript n Baseline are the values of this variable. The coordinates bold-italic s overTilde Subscript 1 Baseline comma bold-italic s overTilde Subscript 2 Baseline comma ellipsis comma bold-italic s overTilde Subscript n Baseline are also found in the DATA= data set, with the variables that correspond to the x and y coordinates identified in the COORDINATES statement. Note: All VAR= variables use the same set of conditioning coordinates; this fixes the matrix bold upper C 22 for all simulations.

The dimension k for bold upper Z is the number of grid points specified in the GRID statement. Since there is a single GRID statement, this fixes the matrix bold upper C 11 for all simulations. Similarly, bold upper C 12 is fixed.

The Cholesky factorization bold upper C overTilde equals bold upper L bold upper L prime is computed once, as is the mean correction

bold-italic mu overTilde equals bold-italic mu 1 plus bold upper C 12 bold upper C 22 Superscript negative 1 Baseline left-parenthesis bold-italic x 2 minus bold-italic mu 2 right-parenthesis

The means bold-italic mu 1 and bold-italic mu 2 are computed using the grid coordinates bold-italic s 1 comma bold-italic s 2 comma ellipsis comma bold-italic s Subscript k Baseline, the data coordinates bold-italic s overTilde Subscript 1 Baseline comma bold-italic s overTilde Subscript 2 Baseline comma ellipsis comma bold-italic s overTilde Subscript n Baseline, and the quadratic form specification from the MEAN statement. The simulation is now performed exactly as in the unconditional case. A k times 1 vector of independent standard upper N left-parenthesis 0 comma 1 right-parenthesis random variables is generated and multiplied by bold upper L, and bold-italic mu overTilde is added to the transformed vector. This is repeated N times, where N is the value specified for the NR= option.

Last updated: December 09, 2022