The CALIS Procedure

Assessment of Fit

In PROC CALIS, there are three main tools for assessing model fit:

  • residuals for the fitted means or covariances

  • overall model fit indices

  • squared multiple correlations and determination coefficients

This section contains a collection of formulas for these assessment tools. The following notation is used:

  • N for the total sample size

  • k for the total number of independent groups in the analysis

  • p for the number of manifest variables

  • t for the number of parameters to estimate

  • $\boldsymbol{\Theta}$ for the t-vector of parameters, $\hat{\boldsymbol{\Theta}}$ for the estimated parameters

  • $\mathbf{S} = (s_{ij})$ for the $p \times p$ input covariance or correlation matrix

  • $\bar{\mathbf{x}} = (\bar{x}_i)$ for the p-vector of sample means

  • $\hat{\boldsymbol{\Sigma}} = \boldsymbol{\Sigma}(\hat{\boldsymbol{\Theta}}) = (\hat{\sigma}_{ij})$ for the predicted covariance or correlation matrix

  • $\hat{\boldsymbol{\mu}} = (\hat{\mu}_i)$ for the predicted mean vector

  • $\delta$ for indicating the modeling of the mean structures ($\delta = 1$ when mean structures are modeled, $\delta = 0$ otherwise)

  • $\mathbf{W}$ for the weight matrix

  • $f_{\min}$ for the minimized function value of the fitted model

  • $d_{\min}$ for the degrees of freedom of the fitted model

In multiple-group analyses, subscripts are used to distinguish independent groups or samples. For example, $N_1, N_2, \ldots, N_r, \ldots, N_k$ denote the sample sizes for k groups. Similarly, notation such as $p_r$, $\mathbf{S}_r$, $\bar{\mathbf{x}}_r$, $\hat{\boldsymbol{\Sigma}}_r$, $\hat{\boldsymbol{\mu}}_r$, $\delta_r$, and $\mathbf{W}_r$ is used for multiple-group situations.

Residuals in the Moment Matrices

Residuals indicate how well each entry or element in the mean or covariance matrix is fitted. Large residuals indicate bad fit.

PROC CALIS computes four types of residuals and writes them to the OUTSTAT= data set when requested.

  • raw residuals

    \[ s_{ij} - \hat{\sigma}_{ij}, \qquad \bar{x}_i - \hat{\mu}_i \]

    for the covariance and mean residuals, respectively. The raw residuals are displayed whenever the PALL, PRINT, or RESIDUAL option is specified.

  • variance standardized residuals

    \[ \frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{s_{ii} s_{jj}}}, \qquad \frac{\bar{x}_i - \hat{\mu}_i}{\sqrt{s_{ii}}} \]

    for the covariance and mean residuals, respectively. The variance standardized residuals are displayed when you specify one of the following:

    The variance standardized residuals are equal to those computed by the EQS 3 program (Bentler 1995).

  • asymptotically standardized residuals

    \[ \frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{v_{ij,ij}}}, \qquad \frac{\bar{x}_i - \hat{\mu}_i}{\sqrt{u_{ii}}} \]

    for the covariance and mean residuals, respectively; with

    \[ v_{ij,ij} = \left( \hat{\boldsymbol{\Gamma}}_1 - \mathbf{J}_1 \widehat{\mathrm{Cov}}(\hat{\boldsymbol{\Theta}}) \mathbf{J}_1' \right)_{ij,ij} \]
    \[ u_{ii} = \left( \hat{\boldsymbol{\Gamma}}_2 - \mathbf{J}_2 \widehat{\mathrm{Cov}}(\hat{\boldsymbol{\Theta}}) \mathbf{J}_2' \right)_{ii} \]

    where $\hat{\boldsymbol{\Gamma}}_1$ is the $p^2 \times p^2$ estimated asymptotic covariance matrix of sample covariances, $\hat{\boldsymbol{\Gamma}}_2$ is the $p \times p$ estimated asymptotic covariance matrix of sample means, $\mathbf{J}_1$ is the $p^2 \times t$ Jacobian matrix $d\boldsymbol{\Sigma}/d\boldsymbol{\Theta}$, $\mathbf{J}_2$ is the $p \times t$ Jacobian matrix $d\boldsymbol{\mu}/d\boldsymbol{\Theta}$, and $\widehat{\mathrm{Cov}}(\hat{\boldsymbol{\Theta}})$ is the $t \times t$ estimated covariance matrix of parameter estimates, all evaluated at the sample moments and estimated parameter values. See the next section for the definitions of $\hat{\boldsymbol{\Gamma}}_1$ and $\hat{\boldsymbol{\Gamma}}_2$. Asymptotically standardized residuals are displayed when one of the following conditions is met:

    • The PALL, the PRINT, or the RESIDUAL option is specified, and METHOD=ML, METHOD=GLS, or METHOD=WLS, and the expensive information and Jacobian matrices are computed for some other reason.

    • RESIDUAL=ASYSTAND is specified.

    The asymptotically standardized residuals are equal to those computed by the LISREL 7 program (Jöreskog and Sörbom 1988) except for the denominator in the definition of matrix $\hat{\boldsymbol{\Gamma}}_1$.

  • normalized residuals

    \[ \frac{s_{ij} - \hat{\sigma}_{ij}}{\sqrt{(\hat{\boldsymbol{\Gamma}}_1)_{ij,ij}}}, \qquad \frac{\bar{x}_i - \hat{\mu}_i}{\sqrt{(\hat{\boldsymbol{\Gamma}}_2)_{ii}}} \]

    for the covariance and mean residuals, respectively; with $\hat{\boldsymbol{\Gamma}}_1$ as the $p^2 \times p^2$ estimated asymptotic covariance matrix of sample covariances; and $\hat{\boldsymbol{\Gamma}}_2$ as the $p \times p$ estimated asymptotic covariance matrix of sample means.

    Diagonal elements of $\hat{\boldsymbol{\Gamma}}_1$ and $\hat{\boldsymbol{\Gamma}}_2$ are defined for the following methods:

    • GLS: $(\hat{\boldsymbol{\Gamma}}_1)_{ij,ij} = \frac{1}{N-1} (s_{ii} s_{jj} + s_{ij}^2)$ and $(\hat{\boldsymbol{\Gamma}}_2)_{ii} = \frac{1}{N-1} s_{ii}$

    • ML: $(\hat{\boldsymbol{\Gamma}}_1)_{ij,ij} = \frac{1}{N-1} (\hat{\sigma}_{ii} \hat{\sigma}_{jj} + \hat{\sigma}_{ij}^2)$ and $(\hat{\boldsymbol{\Gamma}}_2)_{ii} = \frac{1}{N-1} \hat{\sigma}_{ii}$

    • WLS: $(\hat{\boldsymbol{\Gamma}}_1)_{ij,ij} = \frac{1}{N-1} W_{ij,ij}$ and $(\hat{\boldsymbol{\Gamma}}_2)_{ii} = \frac{1}{N-1} s_{ii}$

    where $\mathbf{W}$ in the WLS method is the weight matrix for the second-order moments.

    Normalized residuals are displayed when one of the following conditions is met:

    The normalized residuals are equal to those computed by the LISREL VI program (Jöreskog and Sörbom 1985) except for the definition of the denominator in computing matrix $\hat{\boldsymbol{\Gamma}}_1$.

For estimation methods that are not "best" generalized least squares estimators (Browne 1982, 1984), such as METHOD=NONE, METHOD=ULS, or METHOD=DWLS, the assumption of an asymptotic covariance matrix $\boldsymbol{\Gamma}_1$ of sample covariances does not seem to be appropriate. In this case, the normalized residuals should be replaced by the more relaxed variance standardized residuals. Computation of asymptotically standardized residuals requires computing the Jacobian and information matrices. This is computationally very expensive and is done only if the Jacobian matrix has to be computed for some other reasons, that is, if at least one of the following items is true:

Since normalized residuals use an overestimate of the asymptotic covariance matrix of residuals (the diagonals of $\boldsymbol{\Gamma}_1$ and $\boldsymbol{\Gamma}_2$), the normalized residuals cannot be greater in absolute value than the asymptotically standardized residuals (which use the diagonal of the form $\boldsymbol{\Gamma} - \mathbf{J} \widehat{\mathrm{Cov}}(\hat{\boldsymbol{\Theta}}) \mathbf{J}'$).

Together with the residual matrices, the values of the average residual, the average off-diagonal residual, and the rank order of the largest values are displayed. The distributions of the normalized and standardized residuals are displayed also.
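
The covariance residual definitions above can be illustrated with a minimal Python sketch. It is not PROC CALIS code: the function name `residuals` and the example matrices are hypothetical, the normalized residuals use the ML definition of the diagonal of $\hat{\boldsymbol{\Gamma}}_1$, and the asymptotically standardized residuals are left out because they need the Jacobian and parameter covariance matrices.

```python
import math

def residuals(S, Sigma, N):
    """Raw, variance standardized, and normalized (ML) covariance residuals.

    S and Sigma are p x p nested lists holding the sample and predicted
    covariance matrices; N is the sample size. Hypothetical helper.
    """
    p = len(S)
    # Raw residuals: s_ij - sigma_hat_ij
    raw = [[S[i][j] - Sigma[i][j] for j in range(p)] for i in range(p)]
    # Variance standardized: divide by sqrt(s_ii * s_jj)
    varstd = [[raw[i][j] / math.sqrt(S[i][i] * S[j][j]) for j in range(p)]
              for i in range(p)]
    # Normalized, ML case: (Gamma_1)_{ij,ij} = (sig_ii*sig_jj + sig_ij^2)/(N-1)
    norm = [[raw[i][j] / math.sqrt(
                 (Sigma[i][i] * Sigma[j][j] + Sigma[i][j] ** 2) / (N - 1))
             for j in range(p)] for i in range(p)]
    return raw, varstd, norm

# Made-up two-variable example
S = [[4.0, 1.2], [1.2, 9.0]]
Sigma = [[4.0, 1.0], [1.0, 9.0]]
raw, varstd, norm = residuals(S, Sigma, N=101)
```

In this example only the off-diagonal element is misfitted, so the diagonal residuals are zero under all three definitions.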

Overall Model Fit Indices

Instead of assessing the model fit by looking at a number of residuals of the fitted moments, an overall model fit index measures model fit by a single number. Although an overall model fit index is precise and easy to use, there are indeed many choices of overall fit indices. Unfortunately, researchers do not always have a consensus on the best set of indices to use in all occasions.

PROC CALIS produces a large number of overall model fit indices in the fit summary table. If you prefer to display only a subset of these fit indices, you can use the ONLIST(ONLY)= option of the FITINDEX statement to customize the fit summary table.

Fit indices are classified into three classes in the fit summary table of PROC CALIS:

  • absolute or stand-alone indices

  • parsimony indices

  • incremental indices

Absolute or Stand-Alone Indices

These indices are constructed so that they measure model fit without comparing with a baseline model and without taking the model complexity into account. They measure the absolute fit of the model.

  • fit function or discrepancy function

    The fit function or discrepancy function F is minimized during the optimization. See the section Estimation Criteria for definitions of various discrepancy functions available in PROC CALIS. For a multiple-group analysis, the fit function can be written as a weighted average of discrepancy functions for k independent groups as:

    \[ F = \sum_{r=1}^{k} a_r F_r \]

    where $a_r = \frac{N_r - 1}{N - k}$ and $F_r$ are the group weight and the discrepancy function for the rth group, respectively. Notice that although the groups are assumed to be independent in the model, in general the $F_r$’s are not independent when F is being minimized. The reason is that the $F_r$’s might have shared parameters in $\boldsymbol{\Theta}$ during estimation.

    The minimized function value of F will be denoted as $f_{\min}$, which is always nonnegative, with small values indicating good fit.

  • $\chi^2$ test statistic

    For ML, GLS, and WLS estimation, the overall $\chi^2$ measure for testing model fit is

    \[ \chi^2 = (N - k) \, f_{\min} \]

    where $f_{\min}$ is the function value at the minimum, N is the total sample size, and k is the number of independent groups. The associated degrees of freedom is denoted by $d_{\min}$.

    For ML estimation, this gives the likelihood ratio test statistic of the specified structural model in the null hypothesis against an unconstrained saturated model in the alternative hypothesis. The $\chi^2$ test is valid only if the observations are independent and identically distributed, the analysis is based on the unstandardized sample covariance matrix $\mathbf{S}$, and the sample size N is sufficiently large (Browne 1982; Bollen 1989b; Jöreskog and Sörbom 1985). For ML and GLS estimates, the variables must also have an approximately multivariate normal distribution.

    In the output fit summary table of PROC CALIS, the notation "Prob > Chi-Square" means "the probability of obtaining a greater $\chi^2$ value than the observed value under the null hypothesis." This probability is also known as the p-value of the chi-square test statistic.

  • Satorra-Bentler scaled $\chi^2$ value (Satorra and Bentler 1994)

    For MLSB estimation, the baseline and target model $\chi^2$ values are adjusted by the formula

    \[ \chi^2_{SB} = \frac{\chi^2}{\tau / d} \]

    where d is the degrees of freedom of the baseline or target model and $\tau$ is a quantity that must be estimated in practice. Raw data are necessary for computing the estimate of $\tau$. Both d and $\tau$ are usually different for the baseline and target models. See Satorra and Bentler (1994) for detailed formulas.

    When you specify METHOD=MLSB, PROC CALIS displays the scaled chi-squares for the baseline and target models. In addition, various fit indices are computed based on the scaled chi-squares instead of the regular versions. If the formulas for the fit indices involve the fit function values of the baseline and target models, the scaled versions of these function values are used instead.

  • adjusted $\chi^2$ value (Browne 1982)

    If the variables are p-variate elliptic rather than normal and have significant amounts of multivariate kurtosis (leptokurtic or platykurtic), the $\chi^2$ value can be adjusted to

    \[ \chi^2_{\mathrm{ell}} = \frac{\chi^2}{\eta_2} \]

    where $\eta_2$ is the multivariate relative kurtosis coefficient.

  • Z-test (Wilson and Hilferty 1931)

    The Z-test of Wilson and Hilferty assumes a p-variate normal distribution,

    \[ Z = \frac{\sqrt[3]{\dfrac{\chi^2}{d}} - \left( 1 - \dfrac{2}{9d} \right)}{\sqrt{\dfrac{2}{9d}}} \]

    where d is the degrees of freedom of the model. See McArdle (1988) and Bishop, Fienberg, and Holland (1975, p. 527) for an application of the Z-test.

  • critical N index (Hoelter 1983)

    The critical N (Hoelter 1983) is defined as

    \[ \mathrm{CN} = \mathrm{int}\left( \frac{\chi^2_{\mathrm{crit}}}{f_{\min}} \right) \]

    where $\chi^2_{\mathrm{crit}}$ is the critical chi-square value for the given d degrees of freedom and probability $\alpha = 0.05$, and int() takes the integer part of the expression. See Bollen (1989b, p. 277). Conceptually, the CN value is the largest number of observations that could still make the chi-square model fit statistic insignificant if it were to apply to the actual sample fit function value $f_{\min}$. Hoelter (1983) suggests that CN should be at least 200; however, Bollen (1989b) notes that the CN value might lead to an overly pessimistic assessment of fit for small samples.

    Note that when you have a perfect model fit for your data (that is, $f_{\min} = 0$) or a zero degree of freedom for your model (that is, d = 0), CN is not computable.
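
As a rough illustration of the $\chi^2$ statistic, the Wilson-Hilferty Z, and the critical N defined above, the following Python sketch evaluates the three formulas for made-up inputs. The function names are hypothetical, and the critical chi-square value is passed in by the caller (18.307 is the 0.05 upper critical value for 10 degrees of freedom) rather than computed.

```python
import math

def chi_square(f_min, N, k=1):
    """Model chi-square: (N - k) * f_min."""
    return (N - k) * f_min

def wilson_hilferty_z(chi2, d):
    """Wilson-Hilferty Z for a chi-square value with d degrees of freedom."""
    c = 2.0 / (9.0 * d)
    return ((chi2 / d) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)

def critical_n(chi2_crit, f_min):
    """Hoelter's CN = int(chi2_crit / f_min); None when f_min is zero."""
    if f_min <= 0:
        return None  # perfect fit: CN is not computable
    return int(chi2_crit / f_min)

# Made-up single-group example: N = 201, f_min = 0.15, d = 10
chi2 = chi_square(f_min=0.15, N=201)
z = wilson_hilferty_z(chi2, d=10)
cn = critical_n(chi2_crit=18.307, f_min=0.15)
```

With these inputs the chi-square is about 30 and CN is 122, below Hoelter's suggested threshold of 200.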

  • root mean square residual (RMR)

    For a single-group analysis, the RMR is the root of the mean squared residuals:

    \[ \mathrm{RMR} = \sqrt{ \frac{1}{b} \left[ \sum_{i}^{p} \sum_{j}^{i} (s_{ij} - \hat{\sigma}_{ij})^2 + \delta \sum_{i}^{p} (\bar{x}_i - \hat{\mu}_i)^2 \right] } \]

    where

    \[ b = \frac{p (p + 1 + 2\delta)}{2} \]

    is the number of distinct elements in the covariance matrix and in the mean vector (if modeled).

    For multiple-group analysis, PROC CALIS uses the following formula for the overall RMR:

    \[ \text{overall RMR} = \sqrt{ \sum_{r=1}^{k} \frac{w_r}{\sum_{r=1}^{k} w_r} \left[ \sum_{i}^{p} \sum_{j}^{i} (s_{ij} - \hat{\sigma}_{ij})^2 + \delta \sum_{i}^{p} (\bar{x}_i - \hat{\mu}_i)^2 \right] } \]

    where

    \[ w_r = \frac{N_r - 1}{N - k} \, b_r \]

    is the weight for the squared residuals of the rth group. Hence, the weight $w_r$ is the product of group size weight $\frac{N_r - 1}{N - k}$ and the number of distinct moments $b_r$ in the rth group.

  • standardized root mean square residual (SRMR)

    For a single-group analysis, the SRMR is the root of the mean of the standardized squared residuals:

    \[ \mathrm{SRMR} = \sqrt{ \frac{1}{b} \left[ \sum_{i}^{p} \sum_{j}^{i} \frac{(s_{ij} - \hat{\sigma}_{ij})^2}{s_{ii} s_{jj}} + \delta \sum_{i}^{p} \frac{(\bar{x}_i - \hat{\mu}_i)^2}{s_{ii}} \right] } \]

    where b is the number of distinct elements in the covariance matrix and in the mean vector (if modeled). The formula for b is defined exactly the same way as it appears in the formula for RMR.

    Similar to the calculation of the overall RMR, an overall measure of SRMR in a multiple-group analysis is a weighted average of the standardized squared residuals of the groups. That is,

    \[ \text{overall SRMR} = \sqrt{ \sum_{r=1}^{k} \frac{w_r}{\sum_{r=1}^{k} w_r} \left[ \sum_{i}^{p} \sum_{j}^{i} \frac{(s_{ij} - \hat{\sigma}_{ij})^2}{s_{ii} s_{jj}} + \delta \sum_{i}^{p} \frac{(\bar{x}_i - \hat{\mu}_i)^2}{s_{ii}} \right] } \]

    where $w_r$ is the weight for the squared residuals of the rth group. The formula for $w_r$ is defined exactly the same way as it appears in the formula for the overall RMR.
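
The single-group RMR and SRMR formulas above can be sketched in Python as follows. This is an illustration with a hypothetical function name and made-up matrices, not the PROC CALIS implementation; the mean structure is treated as modeled ($\delta = 1$) only when mean vectors are supplied.

```python
import math

def rmr_srmr(S, Sigma, xbar=None, mu=None):
    """Single-group RMR and SRMR from nested-list covariance matrices."""
    p = len(S)
    delta = 1 if xbar is not None else 0
    b = p * (p + 1 + 2 * delta) / 2  # number of distinct moments
    ss_raw = 0.0
    ss_std = 0.0
    for i in range(p):
        for j in range(i + 1):  # lower triangle, diagonal included
            r = S[i][j] - Sigma[i][j]
            ss_raw += r * r
            ss_std += r * r / (S[i][i] * S[j][j])
        if delta:
            m = xbar[i] - mu[i]
            ss_raw += m * m
            ss_std += m * m / S[i][i]
    return math.sqrt(ss_raw / b), math.sqrt(ss_std / b)

# Made-up two-variable example without mean structures (b = 3)
S = [[4.0, 1.2], [1.2, 9.0]]
Sigma = [[4.0, 1.0], [1.0, 9.0]]
rmr, srmr = rmr_srmr(S, Sigma)
```

Supplying mean vectors changes b from 3 to 5 in this example, so a perfect mean fit lowers both indices even though the covariance residuals are unchanged.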

  • goodness-of-fit index (GFI)

    For a single-group analysis, the goodness-of-fit index for the ULS, GLS, and ML estimation methods is:

    \[ \mathrm{GFI} = 1 - \frac{\mathrm{Tr}\left( (\mathbf{W}^{-1} (\mathbf{S} - \hat{\boldsymbol{\Sigma}}))^2 \right) + \delta (\bar{\mathbf{x}} - \hat{\boldsymbol{\mu}})' \mathbf{W}^{-1} (\bar{\mathbf{x}} - \hat{\boldsymbol{\mu}})}{\mathrm{Tr}\left( (\mathbf{W}^{-1} \mathbf{S})^2 \right) + \delta \bar{\mathbf{x}}' \mathbf{W}^{-1} \bar{\mathbf{x}}} \]

    with $\mathbf{W} = \mathbf{I}$ for ULS, $\mathbf{W} = \mathbf{S}$ for GLS, and $\mathbf{W} = \hat{\boldsymbol{\Sigma}}$ for ML. For WLS and DWLS estimation,

    \[ \mathrm{GFI} = 1 - \frac{(\mathbf{u} - \hat{\boldsymbol{\eta}})' \mathbf{W}^{-1} (\mathbf{u} - \hat{\boldsymbol{\eta}})}{\mathbf{u}' \mathbf{W}^{-1} \mathbf{u}} \]

    where $\mathbf{u}$ is the vector of observed moments and $\hat{\boldsymbol{\eta}}$ is the vector of fitted moments. When the mean structures are modeled, vectors $\mathbf{u}$ and $\hat{\boldsymbol{\eta}}$ contain all the nonredundant elements $\mathrm{vecs}(\mathbf{S})$ in the covariance matrix and all the means. That is,

    \[ \mathbf{u} = (\mathrm{vecs}'(\mathbf{S}), \bar{\mathbf{x}}')', \qquad \hat{\boldsymbol{\eta}} = (\mathrm{vecs}'(\hat{\boldsymbol{\Sigma}}), \hat{\boldsymbol{\mu}}')' \]

    and the symmetric weight matrix $\mathbf{W}$ is of dimension $p(p+3)/2$. When the mean structures are not modeled, vectors $\mathbf{u}$ and $\hat{\boldsymbol{\eta}}$ contain all the nonredundant elements $\mathrm{vecs}(\mathbf{S})$ in the covariance matrix only. That is,

    \[ \mathbf{u} = \mathrm{vecs}(\mathbf{S}), \qquad \hat{\boldsymbol{\eta}} = \mathrm{vecs}(\hat{\boldsymbol{\Sigma}}) \]

    and the symmetric weight matrix $\mathbf{W}$ is of dimension $p(p+1)/2$. In addition, for the DWLS estimation, $\mathbf{W}$ is a diagonal matrix.

    For a constant weight matrix bold upper W, the goodness-of-fit index is 1 minus the ratio of the minimum function value and the function value before any model has been fitted. The GFI should be between 0 and 1. The data probably do not fit the model if the GFI is negative or much greater than 1.

    For a multiple-group analysis, individual $\mathrm{GFI}_r$’s are computed for groups. The overall measure is a weighted average of individual $\mathrm{GFI}_r$’s, using weight $a_r = \frac{N_r - 1}{N - k}$. That is,

    \[ \text{overall GFI} = \sum_{r=1}^{k} a_r \, \mathrm{GFI}_r \]
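
To make the GFI definitions concrete, here is a minimal Python sketch covering only the simplest case, ULS ($\mathbf{W} = \mathbf{I}$) without mean structures, together with the weighted average used for the overall GFI. The function names and inputs are hypothetical; the GLS, ML, WLS, and DWLS cases would additionally need the inverse of the weight matrix.

```python
def gfi_uls(S, Sigma):
    """GFI with W = I and delta = 0: 1 - Tr((S - Sigma)^2) / Tr(S^2)."""
    p = len(S)
    # For a symmetric matrix A, Tr(A^2) is the sum of all squared elements.
    num = sum((S[i][j] - Sigma[i][j]) ** 2 for i in range(p) for j in range(p))
    den = sum(S[i][j] ** 2 for i in range(p) for j in range(p))
    return 1.0 - num / den

def overall_gfi(gfis, group_sizes):
    """Weighted average of group GFIs with a_r = (N_r - 1) / (N - k)."""
    N = sum(group_sizes)
    k = len(group_sizes)
    return sum((n - 1) / (N - k) * g for g, n in zip(gfis, group_sizes))

# Made-up inputs
S = [[4.0, 1.2], [1.2, 9.0]]
Sigma = [[4.0, 1.0], [1.0, 9.0]]
g = gfi_uls(S, Sigma)
g_all = overall_gfi([0.95, 0.90], [101, 51])
```

Because the weights $a_r$ sum to 1, the overall value always lies between the smallest and largest group GFI.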

Parsimony Indices

These indices are constructed so that the model complexity is taken into account when assessing model fit. In general, models with more parameters (fewer degrees of freedom) are penalized.

  • adjusted goodness-of-fit index (AGFI)

    The AGFI is the GFI adjusted for the degrees of freedom d of the model,

    \[ \mathrm{AGFI} = 1 - \frac{c}{d} (1 - \mathrm{GFI}) \]

    where

    \[ c = \sum_{r=1}^{k} \frac{p_r (p_r + 1 + 2\delta_r)}{2} \]

    computes the total number of elements in the covariance matrices and mean vectors for modeling. For single-group analyses, the AGFI corresponds to the GFI in replacing the total sum of squares by the mean sum of squares.

    Caution:

    • Large p and small d can result in a negative AGFI. For example, GFI=0.90, p=19, and d=2 result in an AGFI of –8.5.

    • AGFI is not defined for a saturated model, due to division by d = 0.

    • AGFI is not sensitive to losses in d.

    The AGFI should be between 0 and 1. The data probably do not fit the model if the AGFI is negative or much greater than 1. For more information, see Mulaik et al. (1989).

  • parsimonious goodness-of-fit index (PGFI)

    The PGFI (Mulaik et al. 1989) is a modification of the GFI that takes the parsimony of the model into account:

    \[ \mathrm{PGFI} = \frac{d_{\min}}{d_0} \, \mathrm{GFI} \]

    where $d_{\min}$ is the model degrees of freedom and $d_0$ is the degrees of freedom for the independence model. See the section Incremental Indices for the definition of independence model. The PGFI uses the same parsimonious factor as the parsimonious normed Bentler-Bonett index (James, Mulaik, and Brett 1982).
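
The AGFI and PGFI adjustments above amount to simple arithmetic, which the following Python sketch reproduces. The function names are hypothetical and the PGFI inputs are made up; the first call repeats the cautionary AGFI example from the text (GFI=0.90, p=19, d=2).

```python
def agfi(gfi, d, p_list, delta_list=None):
    """AGFI = 1 - (c / d) * (1 - GFI); None for a saturated model (d = 0)."""
    if d == 0:
        return None
    if delta_list is None:
        delta_list = [0] * len(p_list)  # assume no mean structures
    # c: total number of distinct moments over all groups
    c = sum(p * (p + 1 + 2 * dl) / 2 for p, dl in zip(p_list, delta_list))
    return 1.0 - (c / d) * (1.0 - gfi)

def pgfi(gfi, d_min, d0):
    """PGFI = (d_min / d0) * GFI."""
    return d_min / d0 * gfi

a = agfi(gfi=0.90, d=2, p_list=[19])    # c = 190, so AGFI is about -8.5
pg = pgfi(gfi=0.90, d_min=24, d0=36)    # made-up degrees of freedom
```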

  • RMSEA index (Steiger and Lind 1980; Steiger 1998)

    The root mean square error of approximation (RMSEA) coefficient is:

    \[ \epsilon = \sqrt{k} \sqrt{ \max\left( \frac{f_{\min}}{d_{\min}} - \frac{1}{N - k},\ 0 \right) } \]

    The lower and upper limits of the $100(1 - \alpha)\%$ confidence interval are computed using the cumulative distribution function of the noncentral chi-square distribution $\Phi(x \mid \lambda, d)$. With $x = (N - k) f_{\min}$, $\lambda_L$ satisfying $\Phi(x \mid \lambda_L, d_{\min}) = 1 - \frac{\alpha}{2}$, and $\lambda_U$ satisfying $\Phi(x \mid \lambda_U, d_{\min}) = \frac{\alpha}{2}$:

    \[ (\epsilon_{\alpha_L}; \epsilon_{\alpha_U}) = \left( \sqrt{k} \sqrt{\frac{\lambda_L}{(N - k) d_{\min}}};\ \sqrt{k} \sqrt{\frac{\lambda_U}{(N - k) d_{\min}}} \right) \]

    See Browne and Du Toit (1992) for more details. The size of the confidence interval can be set by the option ALPHARMS=$\alpha$, $0 \le \alpha \le 1$. The default is $\alpha = 0.1$, which corresponds to the 90% confidence interval for the RMSEA.

  • probability for test of close fit (Browne and Cudeck 1993)

    The traditional exact $\chi^2$ test hypothesis $H_0\colon \epsilon = 0$ is replaced by the null hypothesis of close fit $H_0\colon \epsilon \le 0.05$ and the exceedance probability P is computed as:

    \[ P = 1 - \Phi(x \mid \lambda^*, d_{\min}) \]

    where $x = (N - k) f_{\min}$ and $\lambda^* = 0.05^2 (N - k) \, d_{\min} / k$. The null hypothesis of close fit is rejected if P is smaller than a prespecified level (for example, P < 0.05).
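
The RMSEA point estimate and the noncentrality parameter $\lambda^*$ of the close-fit test can be evaluated with a few lines of Python. This is a sketch with hypothetical function names and made-up inputs; the confidence limits and the probability P itself require a noncentral chi-square distribution function, which is left out here.

```python
import math

def rmsea(f_min, d_min, N, k=1):
    """RMSEA: sqrt(k) * sqrt(max(f_min/d_min - 1/(N - k), 0))."""
    return math.sqrt(k) * math.sqrt(max(f_min / d_min - 1.0 / (N - k), 0.0))

def close_fit_noncentrality(d_min, N, k=1, eps0=0.05):
    """lambda* = eps0^2 * (N - k) * d_min / k for H0: epsilon <= eps0."""
    return eps0 ** 2 * (N - k) * d_min / k

# Made-up single-group example: N = 201, f_min = 0.15, d_min = 10
eps = rmsea(f_min=0.15, d_min=10, N=201)        # about 0.1
lam = close_fit_noncentrality(d_min=10, N=201)  # about 5
```

The max() in the formula truncates the estimate at zero whenever $f_{\min}/d_{\min}$ falls below $1/(N-k)$, for example with f_min=0.01 in the setting above.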

  • ECVI: expected cross-validation index (Browne and Cudeck 1993)

    The following formulas for ECVI are limited to the case of single-sample analysis without mean structures and with either the GLS, ML, or WLS estimation method. For other cases, ECVI is not defined in PROC CALIS. For GLS and WLS, the estimator c of the ECVI is linearly related to AIC, Akaike’s Information Criterion (Akaike 1974, 1987):

    \[ c = f_{\min} + \frac{2t}{N - 1} \]

    For ML estimation, $c_{ML}$ is used:

    \[ c_{ML} = f_{\min} + \frac{2t}{N - p - 2} \]

    For GLS and WLS, the confidence interval $(c_L; c_U)$ for ECVI is computed using the cumulative distribution function $\Phi(x \mid \lambda, d_{\min})$ of the noncentral chi-square distribution,

    \[ (c_L; c_U) = \left( \frac{\lambda_L + p(p+1)/2 + t}{N - 1};\ \frac{\lambda_U + p(p+1)/2 + t}{N - 1} \right) \]

    with $x = (N - 1) f_{\min}$, $\Phi(x \mid \lambda_U, d_{\min}) = \frac{\alpha}{2}$, and $\Phi(x \mid \lambda_L, d_{\min}) = 1 - \frac{\alpha}{2}$.

    For ML, the confidence interval left-parenthesis c Subscript upper L Superscript asterisk Baseline semicolon c Subscript upper U Superscript asterisk Baseline right-parenthesis for ECVI is:

    left-parenthesis c Subscript upper L Superscript asterisk Baseline semicolon c Subscript upper U Superscript asterisk Baseline right-parenthesis equals left-parenthesis StartFraction lamda Subscript upper L Superscript asterisk Baseline plus p left-parenthesis p plus 1 right-parenthesis slash 2 plus t Over upper N minus p minus 2 EndFraction semicolon StartFraction lamda Subscript upper U Superscript asterisk Baseline plus p left-parenthesis p plus 1 right-parenthesis slash 2 plus t Over upper N minus p minus 2 EndFraction right-parenthesis

    where x equals left-parenthesis upper N minus p minus 2 right-parenthesis f Subscript m i n, normal upper Phi left-parenthesis x vertical-bar lamda Subscript upper U Superscript asterisk Baseline comma d Subscript m i n Baseline right-parenthesis equals StartFraction alpha Over 2 EndFraction and normal upper Phi left-parenthesis x vertical-bar lamda Subscript upper L Superscript asterisk Baseline comma d Subscript m i n Baseline right-parenthesis equals 1 minus StartFraction alpha Over 2 EndFraction. See Browne and Cudeck (1993). The size of the confidence interval can be set by the option ALPHAECV=alpha, 0 less-than-or-equal-to alpha less-than-or-equal-to 1. The default is alpha equals 0.1, which corresponds to the 90% confidence interval for the ECVI.

  • Akaike’s information criterion (AIC) (Akaike 1974, 1987)

    This is a criterion for selecting the best model among a number of candidate models. The model that yields the smallest value of AIC is considered the best.

    AIC equals h plus 2 t

    where h is –2 times the likelihood function value for the FIML method or the chi squared value for other estimation methods.

  • consistent Akaike’s information criterion (CAIC) (Bozdogan 1987)

    This is another criterion, similar to AIC, for selecting the best model among alternatives. The model that yields the smallest value of CAIC is considered the best. Some researchers prefer CAIC to AIC or the chi squared test.

    CAIC equals h plus left-parenthesis ln left-parenthesis upper N right-parenthesis plus 1 right-parenthesis t

    where h is –2 times the likelihood function value for the FIML method or the chi squared value for other estimation methods. Note that N includes the incomplete observations for the FIML method, whereas it includes only the complete observations for other estimation methods.

  • Schwarz’s Bayesian criterion (SBC) (Schwarz 1978; Sclove 1987)

    This is another criterion, similar to AIC, for selecting the best model. The model that yields the smallest value of SBC is considered the best. Some researchers prefer SBC to AIC or the chi squared test.

    SBC equals h plus ln left-parenthesis upper N right-parenthesis t

    where h is –2 times the likelihood function value for the FIML method or the chi squared value for other estimation methods. Note that N includes the incomplete observations for the FIML method, whereas it includes only the complete observations for other estimation methods.

  • McDonald’s measure of centrality (McDonald and Marsh 1988)

    CENT equals exp left-parenthesis minus StartFraction left-parenthesis chi squared minus d Subscript m i n Baseline right-parenthesis Over 2 upper N EndFraction right-parenthesis
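
The point estimates in this list are simple arithmetic on the fit summary, which the following Python sketch illustrates. It is not PROC CALIS code: the function and argument names are hypothetical, and the sketch assumes that you already have f_min, h (the chi-square value or the FIML function value), t, N, p, and d_min from a fitted model. The confidence intervals for ECVI and the probability of close fit are omitted because they require the noncentral chi-square distribution function.

```python
import math


def ecvi_point_estimate(f_min, t, n, p, method="GLS"):
    """ECVI point estimate for a single-sample model without mean structures.

    Uses c = f_min + 2t/(N - 1) for GLS and WLS, and
    c_ML = f_min + 2t/(N - p - 2) for ML.
    """
    method = method.upper()
    if method in ("GLS", "WLS"):
        return f_min + 2 * t / (n - 1)
    if method == "ML":
        return f_min + 2 * t / (n - p - 2)
    raise ValueError("ECVI is defined here only for GLS, ML, or WLS")


def information_criteria(h, t, n):
    """Return AIC, CAIC, and SBC for fit value h, t parameters, sample size n."""
    return {
        "AIC": h + 2 * t,
        "CAIC": h + (math.log(n) + 1) * t,
        "SBC": h + math.log(n) * t,
    }


def mcdonald_centrality(chi_square, d_min, n):
    """McDonald's measure of centrality: exp(-(chi-square - d_min) / (2N))."""
    return math.exp(-(chi_square - d_min) / (2 * n))


# Hypothetical fit summary, for illustration only.
criteria = information_criteria(h=11.94, t=13, n=200)
ecvi_gls = ecvi_point_estimate(f_min=0.06, t=13, n=200, p=6, method="GLS")
ecvi_ml = ecvi_point_estimate(f_min=0.06, t=13, n=200, p=6, method="ML")
cent = mcdonald_centrality(chi_square=11.94, d_min=8, n=200)
```

Because CAIC and SBC differ only by the extra term t, the two criteria always rank models with the same number of parameters identically.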

Incremental Indices

These indices assess model fit through comparison with a baseline model. The baseline model is usually the independence model, in which all covariances among manifest variables are assumed to be zero. The only parameters in the independence model are the diagonal elements of the covariance matrix. If modeled, the mean structures are saturated in the independence model. For multiple-group analysis, the overall independence model consists of component independence models for each group.

In the following, let f 0 and d 0 denote the minimized discrepancy function value and the associated degrees of freedom, respectively, for the independence model; and f Subscript m i n and d Subscript m i n denote the minimized discrepancy function value and the associated degrees of freedom, respectively, for the model being fitted in the null hypothesis.

  • Bentler comparative fit index (Bentler 1995)

    CFI equals 1 minus StartFraction max left-parenthesis left-parenthesis upper N minus k right-parenthesis f Subscript m i n Baseline minus d Subscript m i n Baseline comma 0 right-parenthesis Over max left-parenthesis left-parenthesis upper N minus k right-parenthesis f Subscript m i n Baseline minus d Subscript m i n Baseline comma max left-parenthesis left-parenthesis upper N minus k right-parenthesis f 0 minus d 0 comma 0 right-parenthesis right-parenthesis EndFraction

  • Bentler-Bonett normed fit index (NFI) (Bentler and Bonett 1980)

    normal upper Delta equals StartFraction f 0 minus f Subscript m i n Baseline Over f 0 EndFraction

    Mulaik et al. (1989) recommend the parsimonious weighted form called parsimonious normed fit index (PNFI) (James, Mulaik, and Brett 1982).

  • Bentler-Bonett nonnormed coefficient (Bentler and Bonett 1980)

    rho equals StartFraction f 0 slash d 0 minus f Subscript m i n Baseline slash d Subscript m i n Baseline Over f 0 slash d 0 minus 1 slash left-parenthesis upper N minus k right-parenthesis EndFraction

    See Tucker and Lewis (1973).

  • normed index bold-italic rho 1 (Bollen 1986)

    rho 1 equals StartFraction f 0 slash d 0 minus f Subscript m i n Baseline slash d Subscript m i n Baseline Over f 0 slash d 0 EndFraction

    rho 1 is always less than or equal to 1; rho 1 less-than 0 is unlikely in practice. See the discussion in Bollen (1989a).

  • nonnormed index bold upper Delta 2 (Bollen 1989a)

    normal upper Delta 2 equals StartStartFraction f 0 minus f Subscript m i n Baseline OverOver f 0 minus StartFraction d Subscript m i n Baseline Over left-parenthesis upper N minus k right-parenthesis EndFraction EndEndFraction

    is a modification of Bentler and Bonett’s normal upper Delta that uses d and "lessens the dependence" on N. See the discussion in Bollen (1989b). normal upper Delta 2 is identical to the IFI2 index of Mulaik et al. (1989).

  • parsimonious normed fit index (James, Mulaik, and Brett 1982)

    The PNFI is a modification of Bentler-Bonett’s normed fit index that takes parsimony of the model into account,

    PNFI equals StartFraction d Subscript m i n Baseline Over d 0 EndFraction StartFraction left-parenthesis f 0 minus f Subscript m i n Baseline right-parenthesis Over f 0 EndFraction

    The PNFI uses the same parsimonious factor as the parsimonious GFI of Mulaik et al. (1989).
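
The incremental indices above all derive from the same four quantities (f_min, d_min, f_0, d_0) together with N and k. The following Python sketch, with hypothetical function and argument names, evaluates them side by side. It assumes that d_min and d_0 are positive and that f_0 is positive. Returning a CFI of 1 when the denominator is zero is an assumption of this sketch, not documented PROC CALIS behavior.

```python
def incremental_fit_indices(f_min, d_min, f0, d0, n, k=1):
    """Evaluate the incremental fit indices from fitted and baseline models.

    f_min, d_min: function value and degrees of freedom of the fitted model.
    f0, d0: function value and degrees of freedom of the independence model.
    n: total sample size; k: number of independent groups.
    """
    m = n - k  # multiplier that turns a function value into a chi-square

    # CFI: ratio of truncated noncentrality estimates.
    numerator = max(m * f_min - d_min, 0.0)
    denominator = max(m * f_min - d_min, max(m * f0 - d0, 0.0))
    # Assumption of this sketch: treat a zero denominator as perfect fit.
    cfi = 1.0 - numerator / denominator if denominator > 0 else 1.0

    nfi = (f0 - f_min) / f0                                # Bentler-Bonett NFI
    nnfi = (f0 / d0 - f_min / d_min) / (f0 / d0 - 1 / m)   # nonnormed coefficient
    rho1 = (f0 / d0 - f_min / d_min) / (f0 / d0)           # Bollen normed index
    delta2 = (f0 - f_min) / (f0 - d_min / m)               # Bollen nonnormed index
    pnfi = (d_min / d0) * nfi                              # parsimonious NFI

    return {
        "CFI": cfi,
        "NFI": nfi,
        "NNFI": nnfi,
        "RHO1": rho1,
        "DELTA2": delta2,
        "PNFI": pnfi,
    }


# Hypothetical values, for illustration only.
indices = incremental_fit_indices(f_min=0.06, d_min=8, f0=2.5, d0=15, n=200)
```

The PNFI is the NFI multiplied by the parsimony ratio d_min / d_0, so it can never exceed the NFI when the fitted model has no more degrees of freedom than the baseline model.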

Fit Indices and Estimation Methods

Note that not all fit indices are reasonable or appropriate for all estimation methods set by the METHOD= option of the PROC CALIS statement. The availability of fit indices is summarized as follows:

  • Adjusted (elliptic) chi-square and its probability are available only for METHOD=ML or GLS and with the presence of raw data input.

  • For METHOD=ULS or DWLS, probability of the chi-square value, RMSEA and its confidence intervals, probability of close fit, ECVI and its confidence intervals, critical N index, Z-test, AIC, CAIC, SBC, and measure of centrality are not appropriate and therefore not displayed.

Individual Fit Indices for Multiple Groups

When you compare the fits of individual groups in a multiple-group analysis, you can examine the residuals of the groups to gauge which group is fitted better than the others. While examining residuals is useful for locating specific areas of inadequate fit, summary measures such as fit indices for individual groups are more convenient for overall comparisons among groups.

Although the overall fit function is a weighted sum of individual fit functions for groups, these individual functions are not statistically independent. Therefore, in general you cannot partition the degrees of freedom or chi squared value according to the groups. This eliminates the possibility of breaking down those fit indices that are functions of degrees of freedom or chi squared for group comparison purposes. Bearing this fact in mind, PROC CALIS computes only a limited number of descriptive fit indices for individual groups.

  • fit function

    The overall fit function is:

    upper F equals sigma-summation Underscript r equals 1 Overscript k Endscripts a Subscript r Baseline upper F Subscript r

    where a Subscript r Baseline equals StartFraction left-parenthesis upper N Subscript r Baseline minus 1 right-parenthesis Over left-parenthesis upper N minus k right-parenthesis EndFraction and upper F Subscript r are the group weight and the discrepancy function for group r, respectively. The value of the unweighted fit function upper F Subscript r for the rth group is denoted by:

    f Subscript r

    This f Subscript r value provides a measure of fit in the rth group without taking the sample size into account. The larger the f Subscript r, the worse the fit for the group.

  • percentage contribution to the chi-square

    The percentage contribution of group r to the chi-square is:

    percentage contribution equals a Subscript r Baseline f Subscript r Baseline slash f Subscript m i n Baseline times 100 percent-sign

    where f Subscript r is the value of upper F Subscript r with F minimized at the value f Subscript m i n. This percentage value provides a descriptive measure of fit of the moments in group r, weighted by its sample size. The group with the largest percentage contribution accounts for the most lack of fit in the overall model.

  • root mean square residual (RMR)

    For the rth group, the total number of moments being modeled is:

    g equals StartFraction p Subscript r Baseline left-parenthesis p Subscript r Baseline plus 1 plus 2 delta Subscript r Baseline right-parenthesis Over 2 EndFraction

    where p Subscript r is the number of variables and delta Subscript r is the indicator variable of the mean structures in the rth group. The root mean square residual for the rth group is:

    RMR Subscript r Baseline equals StartRoot StartFraction 1 Over g EndFraction left-bracket sigma-summation Underscript i Overscript p Subscript r Baseline Endscripts sigma-summation Underscript j Overscript i Endscripts left-parenthesis left-bracket bold upper S Subscript r Baseline right-bracket Subscript i j Baseline minus left-bracket ModifyingAbove bold upper Sigma With caret Subscript r Baseline right-bracket Subscript i j Baseline right-parenthesis squared plus delta Subscript r Baseline sigma-summation Underscript i Overscript p Subscript r Baseline Endscripts left-parenthesis left-bracket bold x overbar Subscript r Baseline right-bracket Subscript i Baseline minus left-bracket ModifyingAbove bold-italic mu With caret Subscript r Baseline right-bracket Subscript i Baseline right-parenthesis squared right-bracket EndRoot

  • standardized root mean square residual (SRMR)

    For the rth group, the standardized root mean square residual is:

    SRMR equals StartRoot StartFraction 1 Over g EndFraction left-bracket sigma-summation Underscript i Overscript p Subscript r Baseline Endscripts sigma-summation Underscript j Overscript i Endscripts StartFraction left-parenthesis left-bracket bold upper S Subscript r Baseline right-bracket Subscript i j Baseline minus left-bracket ModifyingAbove bold upper Sigma With caret Subscript r Baseline right-bracket Subscript i j Baseline right-parenthesis squared Over left-bracket bold upper S Subscript r Baseline right-bracket Subscript i i Baseline left-bracket bold upper S Subscript r Baseline right-bracket Subscript j j Baseline EndFraction plus delta Subscript r Baseline sigma-summation Underscript i Overscript p Subscript r Baseline Endscripts StartFraction left-parenthesis left-bracket bold x overbar Subscript r Baseline right-bracket Subscript i Baseline minus left-bracket ModifyingAbove bold-italic mu With caret Subscript r Baseline right-bracket Subscript i Baseline right-parenthesis squared Over left-bracket bold upper S Subscript r Baseline right-bracket Subscript i i Baseline EndFraction right-bracket EndRoot

  • goodness-of-fit index (GFI)

    For the ULS, GLS, and ML estimation, the goodness-of-fit index (GFI) for the rth group is:

    normal upper G normal upper F normal upper I equals 1 minus StartFraction normal upper T normal r left-parenthesis left-parenthesis bold upper W Subscript r Superscript negative 1 Baseline left-parenthesis bold upper S Subscript r Baseline minus ModifyingAbove bold upper Sigma Subscript r Baseline With caret right-parenthesis right-parenthesis squared right-parenthesis plus delta Subscript r Baseline left-parenthesis bold x overbar Subscript r Baseline minus ModifyingAbove bold-italic mu Subscript r Baseline With caret right-parenthesis prime bold upper W Subscript r Superscript negative 1 Baseline left-parenthesis bold x overbar Subscript r Baseline minus ModifyingAbove bold-italic mu Subscript r Baseline With caret right-parenthesis Over normal upper T normal r left-parenthesis left-parenthesis bold upper W Subscript r Superscript negative 1 Baseline bold upper S Subscript r Baseline right-parenthesis squared right-parenthesis plus delta Subscript r Baseline bold x overbar prime Subscript r Baseline bold upper W Subscript r Superscript negative 1 Baseline bold x overbar Subscript r Baseline EndFraction

    with bold upper W Subscript r Baseline equals bold upper I for ULS, bold upper W Subscript r Baseline equals bold upper S Subscript r for GLS, and bold upper W Subscript r Baseline equals ModifyingAbove bold upper Sigma Subscript r Baseline With caret for ML. For the WLS and DWLS estimation,

    normal upper G normal upper F normal upper I equals 1 minus StartFraction left-parenthesis bold u Subscript r Baseline minus ModifyingAbove bold-italic eta With caret Subscript r Baseline right-parenthesis prime bold upper W Subscript r Superscript negative 1 Baseline left-parenthesis bold u Subscript r Baseline minus ModifyingAbove bold-italic eta Subscript r Baseline With caret right-parenthesis Over bold u prime Subscript r Baseline bold upper W Subscript r Superscript negative 1 Baseline bold u Subscript r Baseline EndFraction

    where bold u Subscript r is the vector of observed moments and ModifyingAbove bold-italic eta With caret Subscript r is the vector of fitted moments for the rth group (r equals 1 comma ellipsis comma k).

    When the mean structures are modeled, vectors bold u Subscript r and ModifyingAbove bold-italic eta Subscript r Baseline With caret contain all the nonredundant elements normal v normal e normal c normal s left-parenthesis bold upper S Subscript r Baseline right-parenthesis in the covariance matrix and all the means, and bold upper W Subscript r is the weight matrix for covariances and means. When the mean structures are not modeled, bold u Subscript r, ModifyingAbove bold-italic eta Subscript r Baseline With caret, and bold upper W Subscript r contain elements pertaining to the covariances only. The formulas presented here are the same as those for the single-group GFI, except that the subscript r is added to denote individual group measures.

  • Bentler-Bonett normed fit index (NFI)

    For the rth group, the Bentler-Bonett NFI is:

    normal upper Delta Subscript r Baseline equals StartFraction f Subscript 0 r Baseline minus f Subscript r Baseline Over f Subscript 0 r Baseline EndFraction

    where f Subscript 0 r is the function value for fitting the independence model to the rth group. The larger the value of normal upper Delta Subscript r, the better the fit for the group. The formula is the same as that for the overall Bentler-Bonett NFI, except that the subscript r is added to denote individual group measures.
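
As an illustration of the group-level measures above, the following Python sketch computes the group weights a_r, the percentage contributions to the chi-square, and the RMR and SRMR for one group. The function names and the nested-list matrix representation are assumptions made for this example. The sketch takes f_min as the weighted sum of the group values f_r, which follows from the overall fit function given earlier in this list.

```python
import math


def group_weights(group_sizes):
    """Group weights a_r = (N_r - 1) / (N - k)."""
    n_total = sum(group_sizes)
    k = len(group_sizes)
    return [(n_r - 1) / (n_total - k) for n_r in group_sizes]


def percentage_contributions(group_sizes, group_f):
    """Percentage contribution of each group: 100 * a_r * f_r / f_min."""
    weights = group_weights(group_sizes)
    f_min = sum(a_r * f_r for a_r, f_r in zip(weights, group_f))
    return [100.0 * a_r * f_r / f_min for a_r, f_r in zip(weights, group_f)]


def group_rmr(s, sigma, xbar=None, mu=None, standardized=False):
    """RMR (or SRMR when standardized=True) for one group.

    s, sigma: observed and predicted covariance matrices as nested lists.
    xbar, mu: observed and predicted mean vectors; pass both to include
    the mean structures (delta_r = 1), or neither (delta_r = 0).
    """
    p = len(s)
    delta = 1 if (xbar is not None and mu is not None) else 0
    g = p * (p + 1 + 2 * delta) / 2  # number of moments being modeled

    total = 0.0
    for i in range(p):
        for j in range(i + 1):  # lower triangle, including the diagonal
            squared = (s[i][j] - sigma[i][j]) ** 2
            if standardized:
                squared /= s[i][i] * s[j][j]
            total += squared
    if delta:
        for i in range(p):
            squared = (xbar[i] - mu[i]) ** 2
            if standardized:
                squared /= s[i][i]
            total += squared
    return math.sqrt(total / g)


# Hypothetical two-group example, for illustration only.
contributions = percentage_contributions([101, 201], [0.10, 0.04])
s_obs = [[4.0, 1.0], [1.0, 9.0]]
s_hat = [[4.0, 0.4], [0.4, 9.0]]
rmr_value = group_rmr(s_obs, s_hat)
srmr_value = group_rmr(s_obs, s_hat, standardized=True)
```

In this example the smaller group has the larger f_r and also the larger percentage contribution, even though its weight a_r is only half that of the larger group.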

Squared Multiple Correlations and Determination Coefficients

In this section, squared multiple correlations for endogenous variables are defined. Squared multiple correlations are computed for all five estimation methods: ULS, GLS, ML, WLS, and DWLS. These coefficients are also computed as in the LISREL VI program of Jöreskog and Sörbom (1985). The DETAE, DETSE, and DETMV determination coefficients are intended to be multivariate generalizations of the squared multiple correlations for different subsets of variables. These coefficients are displayed only when you specify the PDETERM option.

  • bold upper R squared values corresponding to endogenous variables

    upper R squared equals 1 minus StartFraction ModifyingAbove Evar With caret left-parenthesis y right-parenthesis Over ModifyingAbove Var With caret left-parenthesis y right-parenthesis EndFraction

    where y denotes an endogenous variable, ModifyingAbove Var With caret left-parenthesis y right-parenthesis denotes its variance, and ModifyingAbove Evar With caret left-parenthesis y right-parenthesis denotes its error (or unsystematic) variance. The variance and error variance are estimated under the model.

  • total determination of all equations

    DETAE equals 1 minus StartFraction StartAbsoluteValue ModifyingAbove Ecov With caret left-parenthesis bold y comma bold-italic eta right-parenthesis EndAbsoluteValue Over StartAbsoluteValue ModifyingAbove Cov With caret left-parenthesis bold y comma bold-italic eta right-parenthesis EndAbsoluteValue EndFraction

    where the bold y vector denotes all manifest dependent variables, the bold-italic eta vector denotes all latent dependent variables, ModifyingAbove Cov With caret left-parenthesis bold y comma bold-italic eta right-parenthesis denotes the covariance matrix of bold y and bold-italic eta, and ModifyingAbove Ecov With caret left-parenthesis bold y comma bold-italic eta right-parenthesis denotes the error covariance matrix of bold y and bold-italic eta. The covariance matrices are estimated under the model.

  • total determination of latent equations

    DETSE equals 1 minus StartFraction StartAbsoluteValue ModifyingAbove Ecov With caret left-parenthesis bold-italic eta right-parenthesis EndAbsoluteValue Over StartAbsoluteValue ModifyingAbove Cov With caret left-parenthesis bold-italic eta right-parenthesis EndAbsoluteValue EndFraction

    where the bold-italic eta vector denotes all latent dependent variables, ModifyingAbove Cov With caret left-parenthesis bold-italic eta right-parenthesis denotes the covariance matrix of bold-italic eta, and ModifyingAbove Ecov With caret left-parenthesis bold-italic eta right-parenthesis denotes the error covariance matrix of bold-italic eta. The covariance matrices are estimated under the model.

  • total determination of the manifest equations

    DETMV equals 1 minus StartFraction StartAbsoluteValue ModifyingAbove Ecov With caret left-parenthesis bold y right-parenthesis EndAbsoluteValue Over StartAbsoluteValue ModifyingAbove Cov With caret left-parenthesis bold y right-parenthesis EndAbsoluteValue EndFraction

    where the bold y vector denotes all manifest dependent variables, ModifyingAbove Cov With caret left-parenthesis bold y right-parenthesis denotes the covariance matrix of bold y, ModifyingAbove Ecov With caret left-parenthesis bold y right-parenthesis denotes the error covariance matrix of bold y, and StartAbsoluteValue bold upper A EndAbsoluteValue denotes the determinant of matrix bold upper A. All the covariance matrices in the formula are estimated under the model.
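
The squared multiple correlation and the three determination coefficients share one form: 1 minus the ratio of an error (co)variance to a total (co)variance, with determinants replacing variances in the multivariate case. The following Python sketch shows that shared form. The function names are hypothetical, the matrices are small nested lists, and the determinant routine is a plain Gaussian elimination written for this example rather than the method PROC CALIS uses.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(value) for value in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col + 1, n):
                a[row][c] -= factor * a[col][c]
    return det


def r_squared(error_variance, variance):
    """Squared multiple correlation: 1 - Evar(y) / Var(y)."""
    return 1.0 - error_variance / variance


def determination_coefficient(error_cov, cov):
    """Determination coefficient: 1 - |Ecov| / |Cov|."""
    return 1.0 - determinant(error_cov) / determinant(cov)


# Hypothetical model-implied matrices for two dependent variables.
error_cov = [[1.0, 0.0], [0.0, 2.0]]
cov = [[4.0, 1.0], [1.0, 5.0]]
r2_first = r_squared(error_cov[0][0], cov[0][0])
det_coef = determination_coefficient(error_cov, cov)
```

With a single dependent variable, both matrices reduce to scalars and the determination coefficient equals the squared multiple correlation.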

You can also use the DETERM statement to request the computations of determination coefficients for any subsets of dependent variables.

Last updated: December 09, 2022