The VARIOGRAM Procedure

Theoretical and Computational Details of the Semivariogram

Let $\{Z(\boldsymbol{s}),\ \boldsymbol{s} \in D \subset \mathcal{R}^2\}$ be a spatial random field (SRF) with $n$ measured values $z_i = Z(\boldsymbol{s}_i)$ at respective locations $\boldsymbol{s}_i$, $i = 1, \ldots, n$. You use the VARIOGRAM procedure because you want to gain insight into the spatial continuity and structure of $Z(\boldsymbol{s})$. A good measure of the spatial continuity of $Z(\boldsymbol{s})$ is defined by means of the variance of the difference $Z(\boldsymbol{s}_i) - Z(\boldsymbol{s}_j)$, where $\boldsymbol{s}_i$ and $\boldsymbol{s}_j$ are locations in $D$. Specifically, if you consider $\boldsymbol{s}_i$ and $\boldsymbol{s}_j$ to be separated by a spatial increment $\boldsymbol{h} = \boldsymbol{s}_j - \boldsymbol{s}_i$, then the variance function based on the increments $\boldsymbol{h}$ is independent of the actual locations $\boldsymbol{s}_i$, $\boldsymbol{s}_j$. Most commonly, the continuity measure used in practice is one half of this variance, better known as the semivariance function,

$$\gamma_z(\boldsymbol{h}) = \frac{1}{2} \mathrm{Var}\left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]$$

or, equivalently,

$$\gamma_z(\boldsymbol{h}) = \frac{1}{2} \left( \mathrm{E}\left\{ \left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]^2 \right\} - \left\{ \mathrm{E}\left[ Z(\boldsymbol{s}+\boldsymbol{h}) \right] - \mathrm{E}\left[ Z(\boldsymbol{s}) \right] \right\}^2 \right)$$

The plot of semivariance as a function of $\boldsymbol{h}$ is the semivariogram. You might also commonly see the term semivariogram used instead of the term semivariance.

Assume that the SRF $Z(\boldsymbol{s})$ is free of nonrandom (or systematic) surface trends. Then, the expected value $\mathrm{E}[Z(\boldsymbol{s})]$ of $Z(\boldsymbol{s})$ is a constant for all $\boldsymbol{s} \in \mathcal{R}^2$, and the semivariance expression is simplified to the following:

$$\gamma_z(\boldsymbol{h}) = \frac{1}{2} \mathrm{E}\left\{ \left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]^2 \right\}$$
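
The simplification follows in one step from the general expression for the semivariance. As a short worked derivation, write $m$ for the constant mean (a symbol introduced here only for illustration):

```latex
\begin{aligned}
\gamma_z(\boldsymbol{h})
  &= \frac{1}{2}\left( \mathrm{E}\left\{ \left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]^2 \right\}
     - \left\{ \mathrm{E}\left[ Z(\boldsymbol{s}+\boldsymbol{h}) \right] - \mathrm{E}\left[ Z(\boldsymbol{s}) \right] \right\}^2 \right) \\
  &= \frac{1}{2}\left( \mathrm{E}\left\{ \left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]^2 \right\}
     - (m - m)^2 \right) \\
  &= \frac{1}{2}\, \mathrm{E}\left\{ \left[ Z(\boldsymbol{s}+\boldsymbol{h}) - Z(\boldsymbol{s}) \right]^2 \right\}
\end{aligned}
```

The second term vanishes only because the mean is the same at $\boldsymbol{s}$ and $\boldsymbol{s}+\boldsymbol{h}$. With a surface trend the two expectations differ, and the term does not drop out.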

Given the preceding assumption, you can compute an estimate $\hat{\gamma}_z(\boldsymbol{h})$ of the semivariance $\gamma_z(\boldsymbol{h})$ from a finite set of points in a practical way by using the formula

$$\hat{\gamma}_z(\boldsymbol{h}) = \frac{1}{2\,|N(\boldsymbol{h})|} \sum_{N(\boldsymbol{h})} \left[ Z(\boldsymbol{s}_i) - Z(\boldsymbol{s}_j) \right]^2$$

where the sets $N(\boldsymbol{h})$ contain all the neighboring pairs at distance $\boldsymbol{h}$,

$$N(\boldsymbol{h}) = \{\, i, j : \boldsymbol{s}_i - \boldsymbol{s}_j = \boldsymbol{h} \,\}$$

and $|N(\boldsymbol{h})|$ is the number of such pairs $(i, j)$.

The expression for $\hat{\gamma}_z(\boldsymbol{h})$ is called the empirical semivariance (Matheron 1963). This is the quantity that PROC VARIOGRAM computes, and its corresponding plot is the empirical semivariogram.
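
To make the estimator concrete, the following Python sketch evaluates the classical formula on a handful of points. It is an illustration only, not PROC VARIOGRAM's implementation. The function name `empirical_semivariance` and the `tol` parameter are hypothetical. The sketch also assumes that pairs are matched on separation distance alone (omnidirectional, within a tolerance) rather than on the exact vector $\boldsymbol{h}$, and that each unordered pair is counted once.

```python
from itertools import combinations
from math import hypot


def empirical_semivariance(coords, values, lag, tol):
    """Classical (Matheron) estimate for one lag.

    Assumption: a pair belongs to the lag when its separation distance is
    within `tol` of `lag`; direction is ignored.
    Returns (gamma_hat, pair_count); gamma_hat is None if no pair qualifies.
    """
    total = 0.0
    count = 0
    for i, j in combinations(range(len(values)), 2):
        dx = coords[i][0] - coords[j][0]
        dy = coords[i][1] - coords[j][1]
        if abs(hypot(dx, dy) - lag) <= tol:
            total += (values[i] - values[j]) ** 2  # squared increment
            count += 1
    if count == 0:
        return None, 0
    return total / (2 * count), count


# Made-up data: four points on the corners of a unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 3.0, 2.0, 6.0]

gamma_1, n_1 = empirical_semivariance(coords, values, lag=1.0, tol=0.1)
print(gamma_1, n_1)  # 3.75 4
```

The four sides of the square give squared differences 4, 1, 9, and 16, so the estimate at lag 1 is 30 / (2 × 4) = 3.75.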

The empirical semivariance $\hat{\gamma}_z(\boldsymbol{h})$ is also referred to as classical. This name distinguishes it from the robust semivariance estimate $\bar{\gamma}_z(\boldsymbol{h})$ and the corresponding robust semivariogram. The robust semivariance was introduced by Cressie and Hawkins (1980) to weaken the effect that outliers in the observations might have on the semivariance. It is described by Cressie (1993, p. 75) as

$$\bar{\gamma}_z(\boldsymbol{h}) = \frac{\Psi^4(\boldsymbol{h})}{2\left[ 0.457 + 0.494 / |N(\boldsymbol{h})| \right]}$$

In the preceding expression the parameter $\Psi(\boldsymbol{h})$ is defined as

$$\Psi(\boldsymbol{h}) = \frac{1}{|N(\boldsymbol{h})|} \sum_{(i, j) \in N(\boldsymbol{h})} \left| Z(\boldsymbol{s}_i) - Z(\boldsymbol{s}_j) \right|^{1/2}$$
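
As a rough sketch of the robust estimator, the following Python function averages the square roots of the absolute differences, raises that average to the fourth power, and applies the bias correction from the formula above. The function name `robust_semivariance` and the `tol` parameter are hypothetical. As a simplifying assumption, pairs are selected by separation distance within a tolerance rather than by the exact vector $\boldsymbol{h}$. This is not PROC VARIOGRAM's implementation.

```python
from itertools import combinations
from math import hypot, sqrt


def robust_semivariance(coords, values, lag, tol):
    """Cressie-Hawkins style robust estimate for one lag.

    Assumption: a pair belongs to the lag when its separation distance is
    within `tol` of `lag`; direction is ignored.
    Returns (gamma_bar, pair_count); gamma_bar is None if no pair qualifies.
    """
    roots = []
    for i, j in combinations(range(len(values)), 2):
        dx = coords[i][0] - coords[j][0]
        dy = coords[i][1] - coords[j][1]
        if abs(hypot(dx, dy) - lag) <= tol:
            # Square root of the absolute increment damps large differences.
            roots.append(sqrt(abs(values[i] - values[j])))
    n_pairs = len(roots)
    if n_pairs == 0:
        return None, 0
    psi = sum(roots) / n_pairs
    return psi ** 4 / (2 * (0.457 + 0.494 / n_pairs)), n_pairs


# Made-up data: four points on the corners of a unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [1.0, 3.0, 2.0, 6.0]
gamma_bar, n_pairs = robust_semivariance(coords, values, lag=1.0, tol=0.1)
```

Because the differences enter through their square roots before averaging, a single very large difference contributes far less to $\Psi(\boldsymbol{h})$ than it does to the sum of squared differences in the classical estimate.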

According to Cressie (1985), the estimate $\hat{\gamma}_z(\boldsymbol{h})$ has approximate variance

$$\mathrm{Var}\left[ \hat{\gamma}_z(\boldsymbol{h}) \right] \simeq \frac{2 \left[ \gamma_z(\boldsymbol{h}) \right]^2}{|N(\boldsymbol{h})|}$$

This approximation is possible by assuming $Z(\boldsymbol{s})$ to be a Gaussian SRF, and by further assuming the squared differences in empirical semivariances to be uncorrelated for different distances $\boldsymbol{h}$. Typically, semivariance estimates are correlated because of the underlying spatial correlation among the observations, and also because the same observation pairs might be used for the estimation of more than one semivariogram point, as described in the following subsections. Despite these restrictive assumptions, the approximate variance gives an indication of the variability of the semivariance estimate and enables fitting of a theoretical model to the empirical semivariance; see the section Theoretical Semivariogram Model Fitting for more details about the fitting process.
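
The approximate variance formula is simple enough to evaluate by hand. The short Python sketch below shows how it behaves as the pair count changes. The function name and the lag-class numbers are hypothetical. In practice the theoretical semivariance is unknown, so supplying a semivariance value here (for example, an empirical or fitted one) is an assumption of this sketch.

```python
def approx_semivariance_variance(gamma, n_pairs):
    """Approximate variance 2 * gamma**2 / n_pairs of a semivariance estimate."""
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    return 2.0 * gamma ** 2 / n_pairs


# Hypothetical lag classes as (semivariance value, number of pairs).
lag_classes = [(3.75, 4), (3.75, 40), (6.5, 2)]
variances = [approx_semivariance_variance(g, n) for g, n in lag_classes]
print(variances)  # [7.03125, 0.703125, 42.25]
```

For a fixed semivariance, ten times as many pairs gives one tenth of the approximate variance. Lag classes that have few pairs, or a large semivariance, are therefore the least precisely estimated.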

Note: If your data include a surface trend, then the empirical semivariance $\hat{\gamma}_z(\boldsymbol{h})$ is not an estimate of the theoretical semivariance function $\gamma_z(\boldsymbol{h})$. Instead of the variance of the spatial increments, it represents a different quantity known as pseudo-semivariance, and its corresponding plot is a pseudo-semivariogram. In principle, pseudo-semivariograms do not provide measures of the spatial continuity. They can thus lead to misinterpretations of the spatial structure of $Z(\boldsymbol{s})$, and are consequently unsuitable for the purpose of spatial prediction. For further information, see the detailed discussion in the section Empirical Semivariograms and Surface Trends. Under certain conditions you might be able to gain some insight about the spatial continuity with a pseudo-semivariogram. This case is presented in Analysis without Surface Trend Removal.
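
A small numerical illustration shows why a trend distorts the estimate. The Python sketch below applies the classical formula to made-up data that consist of a pure linear trend with no random component, at equally spaced points along a line. The function name `transect_semivariance`, the slope, and the one-dimensional layout are all assumptions chosen for simplicity.

```python
def transect_semivariance(values, lag):
    """Classical empirical semivariance for equally spaced points on a line.

    `lag` is a whole number of grid steps; every pair that is exactly `lag`
    steps apart contributes one squared difference.
    """
    diffs = [values[i + lag] - values[i] for i in range(len(values) - lag)]
    return sum(d * d for d in diffs) / (2 * len(diffs))


slope = 2.0
trend_only = [slope * x for x in range(10)]  # deterministic trend, no noise

pseudo = [transect_semivariance(trend_only, h) for h in (1, 2, 3)]
print(pseudo)  # [2.0, 8.0, 18.0]
```

Each value equals `0.5 * (slope * h) ** 2`: the result grows quadratically with the lag and reflects only the trend, even though these data contain no random spatial variation at all.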

Last updated: December 09, 2022