The SIM2D Procedure

Theoretical Development

It is a simple matter to produce an $N(0,1)$ random number, and by stacking $k$ $N(0,1)$ random numbers in a column vector, you can obtain a vector with independent standard normal components, $\mathbf{W} \sim N_k(\mathbf{0}, \mathbf{I})$. The meaning of the terms independence and randomness in the context of a deterministic algorithm required for the generation of these numbers is subtle; see Knuth (1981, Chapter 3) for details.

Rather than $\mathbf{W} \sim N_k(\mathbf{0}, \mathbf{I})$, what is required is the generation of a vector $\mathbf{Z} \sim N_k(\mathbf{0}, \mathbf{C})$; that is,

\[
\mathbf{Z} = \begin{bmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_k \end{bmatrix}
\]

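with covariance matrix

\[
\mathbf{C} = \begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{1k} \\
C_{21} & C_{22} & \cdots & C_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
C_{k1} & C_{k2} & \cdots & C_{kk}
\end{bmatrix}
\]

If the covariance matrix is symmetric and positive definite, it has a Cholesky root $\mathbf{L}$ such that $\mathbf{C}$ can be factored as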

\[
\mathbf{C} = \mathbf{L}\mathbf{L}'
\]

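where $\mathbf{L}$ is lower triangular. See Ralston and Rabinowitz (1978, Chapter 9, Section 3-3) for details. This vector $\mathbf{Z}$ can be generated by the transformation $\mathbf{Z} = \mathbf{L}\mathbf{W}$. Here is where the assumption of a Gaussian spatial random field (SRF) is crucial. When $\mathbf{W} \sim N_k(\mathbf{0}, \mathbf{I})$, then $\mathbf{Z} = \mathbf{L}\mathbf{W}$ is also Gaussian. The mean of $\mathbf{Z}$ is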

\[
E(\mathbf{Z}) = \mathbf{L}\,(E(\mathbf{W})) = \mathbf{0}
\]

and the variance is

\[
\mathrm{Var}(\mathbf{Z}) = \mathrm{Var}(\mathbf{L}\mathbf{W}) = E(\mathbf{L}\mathbf{W}\mathbf{W}'\mathbf{L}') = \mathbf{L}\,E(\mathbf{W}\mathbf{W}')\,\mathbf{L}' = \mathbf{L}\mathbf{L}' = \mathbf{C}
\]
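The transformation above can be checked numerically. The following is a minimal Python sketch, not the procedure's own implementation: the 3 x 3 covariance matrix, the seed, and the helper names `cholesky` and `simulate` are all made up for illustration. It factors C, generates many vectors Z = LW, and compares their empirical covariance with C.

```python
import math
import random

def cholesky(c):
    """Return lower-triangular L with L L' = c, for symmetric positive definite c."""
    k = len(c)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                d = c[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def simulate(low, rng):
    """Draw W with independent N(0,1) components and return Z = L W."""
    k = len(low)
    w = [rng.gauss(0.0, 1.0) for _ in range(k)]
    return [sum(low[i][j] * w[j] for j in range(i + 1)) for i in range(k)]

# Illustrative covariance matrix (an assumption of this example).
C = [[4.0, 2.0, 0.6],
     [2.0, 2.0, 0.5],
     [0.6, 0.5, 1.0]]
L = cholesky(C)

rng = random.Random(12345)
draws = [simulate(L, rng) for _ in range(20000)]

# Empirical covariance of the draws; the mean is known to be zero.
emp = [[sum(z[i] * z[j] for z in draws) / len(draws) for j in range(3)]
       for i in range(3)]
```

With enough draws, the entries of `emp` should settle near the corresponding entries of `C`, which is the numerical counterpart of Var(Z) = LL' = C.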

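Consider now an SRF $Z(\boldsymbol{s})$, with spatial covariance function $C(\boldsymbol{h})$. Fix locations $\boldsymbol{s}_1, \boldsymbol{s}_2, \ldots, \boldsymbol{s}_k$, and let $\mathbf{Z}$ denote the random vector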

\[
\mathbf{Z} = \begin{bmatrix} Z(\boldsymbol{s}_1) \\ Z(\boldsymbol{s}_2) \\ \vdots \\ Z(\boldsymbol{s}_k) \end{bmatrix}
\]

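with corresponding covariance matrix

\[
\mathbf{C}_z = \begin{bmatrix}
C(\mathbf{0}) & C(\boldsymbol{s}_1 - \boldsymbol{s}_2) & \cdots & C(\boldsymbol{s}_1 - \boldsymbol{s}_k) \\
C(\boldsymbol{s}_2 - \boldsymbol{s}_1) & C(\mathbf{0}) & \cdots & C(\boldsymbol{s}_2 - \boldsymbol{s}_k) \\
\vdots & \vdots & \ddots & \vdots \\
C(\boldsymbol{s}_k - \boldsymbol{s}_1) & C(\boldsymbol{s}_k - \boldsymbol{s}_2) & \cdots & C(\mathbf{0})
\end{bmatrix}
\]

Since this covariance matrix is symmetric and positive definite, it has a Cholesky root, and the $Z(\boldsymbol{s}_i)$, $i = 1, \ldots, k$, can be simulated as described previously. This is how the SIM2D procedure implements unconditional simulation in the zero-mean case. More generally,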

\[
Z(\boldsymbol{s}) = \mu(\boldsymbol{s}) + \varepsilon(\boldsymbol{s})
\]

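where $\mu(\boldsymbol{s})$ is a quadratic form in the coordinates $\boldsymbol{s} = (x, y)$ and $\varepsilon(\boldsymbol{s})$ is an SRF that has the same covariance matrix as previously. In this case, the mean vector $\mu(\boldsymbol{s}_i)$, $i = 1, \ldots, k$, is computed once and added to the simulated vector $\varepsilon(\boldsymbol{s}_i)$, $i = 1, \ldots, k$, for each realization.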
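The unconditional case with a mean can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the exponential covariance function and its parameters, the coefficient ordering of the quadratic mean, the 3 x 3 grid, and the helper names. The sketch builds the covariance matrix from the covariance function, factors it once, computes the mean vector once, and then adds that mean to each zero-mean realization.

```python
import math
import random

def exp_cov(h, sill=2.0, scale=3.0):
    """Hypothetical isotropic exponential covariance: sill * exp(-h / scale)."""
    return sill * math.exp(-h / scale)

def covariance_matrix(points, cov):
    """Matrix with entries cov(|s_i - s_j|) for 2-D points s_1, ..., s_k."""
    return [[cov(math.hypot(p[0] - q[0], p[1] - q[1])) for q in points]
            for p in points]

def cholesky(c):
    """Lower-triangular L with L L' = c (c symmetric positive definite)."""
    k = len(c)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def quadratic_mean(s, coef):
    """mu(s) = b0 + b1*x + b2*y + b3*x^2 + b4*y^2 + b5*x*y for s = (x, y)."""
    x, y = s
    b0, b1, b2, b3, b4, b5 = coef
    return b0 + b1 * x + b2 * y + b3 * x * x + b4 * y * y + b5 * x * y

# Illustrative coefficients and a 3 x 3 grid of locations (k = 9).
coef = (10.0, 0.5, -0.25, 0.01, 0.02, 0.0)
points = [(float(x), float(y)) for x in range(3) for y in range(3)]

# One-time work: covariance matrix, its Cholesky root, and the mean vector.
C_z = covariance_matrix(points, exp_cov)
L = cholesky(C_z)
mu = [quadratic_mean(s, coef) for s in points]

def realization(rng):
    """Return mu + L W, where W has independent N(0,1) components."""
    k = len(points)
    w = [rng.gauss(0.0, 1.0) for _ in range(k)]
    eps = [sum(L[i][j] * w[j] for j in range(i + 1)) for i in range(k)]
    return [m + e for m, e in zip(mu, eps)]

rng = random.Random(2022)
sims = [realization(rng) for _ in range(100)]
```

Only the draw of W and the matrix-vector product are repeated per realization; the factorization and the mean vector are reused.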

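For a conditional simulation, this distribution of

\[
\mathbf{Z} = \begin{bmatrix} Z(\boldsymbol{s}_1) \\ Z(\boldsymbol{s}_2) \\ \vdots \\ Z(\boldsymbol{s}_k) \end{bmatrix}
\]

must be conditioned on the observed data. The relevant general result concerning conditional distributions of multivariate normal random variables is the following. Let $\mathbf{X} \sim N_m(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, where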

\[
\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}
\]

\[
\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}
\]

\[
\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}
\]

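The subvector $\mathbf{X}_1$ is $k \times 1$, $\mathbf{X}_2$ is $n \times 1$, $\boldsymbol{\Sigma}_{11}$ is $k \times k$, $\boldsymbol{\Sigma}_{22}$ is $n \times n$, and $\boldsymbol{\Sigma}_{12} = \boldsymbol{\Sigma}_{21}'$ is $k \times n$, with $k + n = m$. The full vector $\mathbf{X}$ is partitioned into two subvectors, $\mathbf{X}_1$ and $\mathbf{X}_2$, and $\boldsymbol{\Sigma}$ is similarly partitioned into covariances and cross covariances.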

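With this notation, the distribution of $\mathbf{X}_1$ conditioned on $\mathbf{X}_2 = \boldsymbol{x}_2$ is $N_k(\tilde{\boldsymbol{\mu}}, \tilde{\boldsymbol{\Sigma}})$, with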

\[
\tilde{\boldsymbol{\mu}} = \boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12} \boldsymbol{\Sigma}_{22}^{-1} (\boldsymbol{x}_2 - \boldsymbol{\mu}_2)
\]

and

\[
\tilde{\boldsymbol{\Sigma}} = \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12} \boldsymbol{\Sigma}_{22}^{-1} \boldsymbol{\Sigma}_{21}
\]
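As a quick check of these two formulas, consider the smallest possible case, with k = n = 1, so that every block is a scalar. The numbers below are invented purely for illustration.

```latex
\begin{aligned}
&\mu_1 = 1,\quad \mu_2 = 2,\quad \Sigma_{11} = 4,\quad \Sigma_{12} = \Sigma_{21} = 1.2,\quad \Sigma_{22} = 1,\quad x_2 = 3 \\
&\tilde{\mu} = \mu_1 + \Sigma_{12}\,\Sigma_{22}^{-1}\,(x_2 - \mu_2) = 1 + 1.2 \cdot 1 \cdot (3 - 2) = 2.2 \\
&\tilde{\Sigma} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21} = 4 - 1.2 \cdot 1 \cdot 1.2 = 2.56
\end{aligned}
```

Observing a value of the second component above its mean pulls the conditional mean of the first component upward, because the two are positively correlated, and the conditional variance 2.56 is smaller than the unconditional variance 4.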

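See Searle (1971, pp. 46–47) for details. The correspondence with the conditional spatial simulation problem is as follows. Let the coordinates of the observed data points be denoted $\tilde{\boldsymbol{s}}_1, \tilde{\boldsymbol{s}}_2, \ldots, \tilde{\boldsymbol{s}}_n$, with values $\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_n$. Let $\tilde{\mathbf{Z}}$ denote the random vector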

\[
\tilde{\mathbf{Z}} = \begin{bmatrix} Z(\tilde{\boldsymbol{s}}_1) \\ Z(\tilde{\boldsymbol{s}}_2) \\ \vdots \\ Z(\tilde{\boldsymbol{s}}_n) \end{bmatrix}
\]

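The random vector $\tilde{\mathbf{Z}}$ corresponds to $\mathbf{X}_2$, while $\mathbf{Z}$ corresponds to $\mathbf{X}_1$. Then $(\mathbf{Z} \mid \tilde{\mathbf{Z}} = \tilde{\boldsymbol{z}}) \sim N_k(\tilde{\boldsymbol{\mu}}, \tilde{\mathbf{C}})$ as in the previous distribution. The matrix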

\[
\tilde{\mathbf{C}} = \mathbf{C}_{11} - \mathbf{C}_{12} \mathbf{C}_{22}^{-1} \mathbf{C}_{21}
\]

is again positive definite, so a Cholesky factorization can be performed.

The dimension $n$ for $\tilde{\mathbf{Z}}$ is simply the number of nonmissing observations for the VAR= variable; the values $\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_n$ are the values of this variable. The coordinates $\tilde{\boldsymbol{s}}_1, \tilde{\boldsymbol{s}}_2, \ldots, \tilde{\boldsymbol{s}}_n$ are also found in the DATA= data set, with the variables that correspond to the x and y coordinates identified in the COORDINATES statement. Note: All VAR= variables use the same set of conditioning coordinates; this fixes the matrix $\mathbf{C}_{22}$ for all simulations.

The dimension $k$ for $\mathbf{Z}$ is the number of grid points specified in the GRID statement. Since there is a single GRID statement, this fixes the matrix $\mathbf{C}_{11}$ for all simulations. Similarly, $\mathbf{C}_{12}$ is fixed.

The Cholesky factorization $\tilde{\mathbf{C}} = \mathbf{L}\mathbf{L}'$ is computed once, as is the mean correction

\[
\tilde{\boldsymbol{\mu}} = \boldsymbol{\mu}_1 + \mathbf{C}_{12} \mathbf{C}_{22}^{-1} (\boldsymbol{x}_2 - \boldsymbol{\mu}_2)
\]

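The means $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ are computed using the grid coordinates $\boldsymbol{s}_1, \boldsymbol{s}_2, \ldots, \boldsymbol{s}_k$, the data coordinates $\tilde{\boldsymbol{s}}_1, \tilde{\boldsymbol{s}}_2, \ldots, \tilde{\boldsymbol{s}}_n$, and the quadratic form specification from the MEAN statement. The simulation is now performed exactly as in the unconditional case. A $k \times 1$ vector of independent standard $N(0,1)$ random variables is generated and multiplied by $\mathbf{L}$, and $\tilde{\boldsymbol{\mu}}$ is added to the transformed vector. This is repeated $N$ times, where $N$ is the value specified for the NR= option.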
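The conditional procedure described above can be sketched end to end in Python. This is an illustration under stated assumptions rather than the SIM2D implementation: the exponential covariance function, the constant mean standing in for the quadratic form, the grid and data coordinates, the data values, and all helper names are hypothetical. The one-time setup computes the conditional mean and the Cholesky root of the conditional covariance; each realization then needs only a fresh standard normal vector.

```python
import math
import random

def exp_cov(h, sill=2.0, scale=3.0):
    """Hypothetical isotropic exponential covariance: sill * exp(-h / scale)."""
    return sill * math.exp(-h / scale)

def cov_block(a, b):
    """Matrix of exp_cov(|a_i - b_j|) for two lists of 2-D points."""
    return [[exp_cov(math.hypot(p[0] - q[0], p[1] - q[1])) for q in b] for p in a]

def cholesky(c):
    """Lower-triangular L with L L' = c (c symmetric positive definite)."""
    k = len(c)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def solve_spd(low, rhs):
    """Solve (L L') x = rhs by forward, then backward, substitution."""
    n = len(low)
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(low[i][j] * y[j] for j in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[j][i] * x[j] for j in range(i + 1, n))) / low[i][i]
    return x

def conditional_setup(grid, data, values, mean):
    """One-time work: return (mu_tilde, C_tilde, L) with L L' = C_tilde."""
    c11, c12, c22 = cov_block(grid, grid), cov_block(grid, data), cov_block(data, data)
    l22 = cholesky(c22)
    resid = [v - mean(s) for v, s in zip(values, data)]    # x2 - mu2
    alpha = solve_spd(l22, resid)                          # C22^{-1} (x2 - mu2)
    mu_t = [mean(s) + sum(c * a for c, a in zip(row, alpha))
            for s, row in zip(grid, c12)]
    b = [solve_spd(l22, row) for row in c12]               # b[j] = C22^{-1} C21[:, j]
    k, n = len(grid), len(data)
    c_t = [[c11[i][j] - sum(c12[i][p] * b[j][p] for p in range(n))
            for j in range(k)] for i in range(k)]
    return mu_t, c_t, cholesky(c_t)

def simulate(mu_t, low, rng):
    """Return mu_tilde + L W, where W has independent N(0,1) components."""
    k = len(low)
    w = [rng.gauss(0.0, 1.0) for _ in range(k)]
    return [mu_t[i] + sum(low[i][j] * w[j] for j in range(i + 1)) for i in range(k)]

def const_mean(s):
    """Stand-in for the quadratic mean; constant to keep the sketch short."""
    return 10.0

grid = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]   # k = 4 grid points
data = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]               # n = 3 observations
values = [11.0, 9.0, 10.5]
NR = 200                                                  # number of realizations

mu_t, C_t, L = conditional_setup(grid, data, values, const_mean)
rng = random.Random(42)
sims = [simulate(mu_t, L, rng) for _ in range(NR)]
```

The sketch solves against the Cholesky root of C22 instead of forming an explicit inverse, which is a common numerical choice and not something the text above prescribes. Grid points that coincide with data locations are avoided here, since the conditional variance at such a point is zero and the factorization would break down.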

Last updated: December 09, 2022