This paper aims to clarify the effects of the scaling method on the discrepancy function of the metric measurement invariance (MI) model. We provide examples and a formal account showing that neither the choice of the scaling method in general nor the choice of a particular referent indicator affects the value of the discrepancy function. Consequently, the test statistic is not affected by the scaling method either. The results rely on an appropriate application of the scaling restrictions, which can be phrased as a simple rule: "Apply the scaling restriction in one group only!" We develop formulas to calculate the degrees of freedom of χ²-difference tests comparing metric models to the corresponding configural model. Our findings show that it is impossible to test the invariance of the estimated loading of exactly one indicator, because metric MI models aimed at doing so are actually equivalent to the configural model.

In this section, we introduce the notation we use throughout this article. Basically, we consider measurement models or confirmatory factor analysis models in multiple group settings of the form

Let

Under the standard assumptions for confirmatory factor analysis models (e.g.,

where

As is common in statistics, we distinguish between population models and estimated models (cf.,

holds within a dgp, then this dgp fulfills full metric invariance. In contrast, a dgp fulfills

for all groups

Following the above-mentioned distinction between population and estimated models, the estimated model is stipulated by a researcher. We assume that the estimated model is also a confirmatory factor analysis model of the same structure. The estimated model implied-covariance matrix is

The parameter vector

The invariance condition for

For

for all groups

In the following, we will always explicitly use either the term full metric MI model or partial metric MI model. If we want to make assertions that refer to both types of models, we use the term

To actually estimate the model, we have to apply a scaling method. It is important to note that the restrictions of the various scaling methods are applied to the

To refer to any of the elements of

We call a model as in

In the following, when estimating a metric MI model, we assume that configural invariance is given. Furthermore, we assume that the scaled model is estimated by minimizing a discrepancy function

As the distinction among a dgp, unscaled estimated models, and scaled estimated models is not common in the literature, we want to explain the relations between these models with the help of ^{5}

With this, we mean the sequence of tests as outlined in the introduction.

and afterward, the researcher arrives at a conclusion about the substantive hypothesis. The figure illustrates two issues: Firstly, the model used by the researcher is a distinct entity from the dgp. Secondly, the researcher starts with invariance conditions in the unscaled model, to which the scaling methods are then added. Thus, the scaling of the model is not related to the dgp. We will now move on to our first example.

In the following example, we present a simple two-group case which demonstrates that, regardless of the scaling method, the same value of the discrepancy function results when a full metric MI model is estimated. The aim of this section is to provide an intuitive and heuristic understanding, both of the equivalence of the scaling methods for the full metric MI model and of the concept of change of scale. A change of scale makes it possible to convert the estimates obtained under a certain scaling method to those obtained under any other scaling method, without re-estimating the model. This concept is vital to the more formal account in the next section.
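The idea of a change of scale can be previewed with a minimal numerical sketch. It assumes a single group and a one-factor model whose implied covariance matrix is the latent variance times the outer product of the loadings plus a diagonal residual matrix; all numbers, the helper `implied_cov`, and the variable names are hypothetical and not taken from the article's example or R code.

```python
import numpy as np

def implied_cov(loadings, factor_var, resid_vars):
    """One-factor model-implied covariance: phi * lambda lambda' + diag(theta)."""
    lam = np.asarray(loadings, dtype=float)
    return factor_var * np.outer(lam, lam) + np.diag(resid_vars)

# Hypothetical estimates (illustration only), first loading fixed to 1
lam = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
phi = 2.0
theta = np.full(6, 0.5)

# Change of scale: multiply the loadings by c, divide the latent variance by c**2
c = np.sqrt(phi)  # chosen so that the rescaled latent variance equals 1
lam_new, phi_new = c * lam, phi / c**2

# The implied covariance matrix is the same before and after the change of scale
same = np.allclose(implied_cov(lam, phi, theta),
                   implied_cov(lam_new, phi_new, theta))
print(same, phi_new)
```

Because the implied covariance matrix is unchanged, any discrepancy function that depends on the parameters only through this matrix takes the same value for both parameter sets.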

We consider a model with only one latent variable with six indicators in two groups,

This is a

We now take the role of the researcher from

To scale the model, we apply each scaling method from the set ^{6}

We only consider the case when the scaling restrictions are set in the reference group. The case when the scaling restrictions are set in the focal group, which means a role swap of both groups, can be easily implemented in the R code.

, i.e.,

Notably, the invariance condition in

For the

As stated in the introduction, this scaling restriction is equivalent to the restriction that the average of the estimated loadings equals 1, i.e.,

Again, the invariance conditions cause the propagation of the scaling restriction to the focal group.

The most important observation from these scaling examples is that the scaling restrictions are applied in one group only. As we have seen, a common feature of all scaling methods is the propagation of the scaling restrictions into the focal group via the invariance conditions. Because of its importance, we want to set up this rule in colloquial terms, which is

Finally, for the

In contrast to the previous scaling methods, this restriction does not directly propagate to the focal group. However, the restriction on the estimated variance in the reference group scales the estimated loadings in this group indirectly. Due to the invariance conditions, this restriction is then propagated to the focal group, thus overall acting as a scaling restriction.
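How the three kinds of scaling restriction relate to one another can be sketched as follows. The sketch assumes a one-factor full metric MI model with shared loadings, a change of scale that multiplies the loadings by a constant `c` and divides every group's latent variance by `c**2`, and hypothetical starting values; the function names `scale_constant` and `rescale` are illustrative and do not come from the article.

```python
import numpy as np

def scale_constant(method, shared_loadings, ref_factor_var, referent=0):
    """Constant c that moves the current estimates onto the requested scale."""
    lam = np.asarray(shared_loadings, dtype=float)
    if method == "FM":   # the referent indicator's loading becomes 1
        return 1.0 / lam[referent]
    if method == "EC":   # the average loading becomes 1
        return 1.0 / lam.mean()
    if method == "RG":   # the latent variance in the reference group becomes 1
        return np.sqrt(ref_factor_var)
    raise ValueError(f"unknown scaling method: {method}")

def rescale(shared_loadings, factor_vars, c):
    """Apply the same constant in all groups, so the loadings stay invariant."""
    lam = np.asarray(shared_loadings, dtype=float)
    return c * lam, [phi / c**2 for phi in factor_vars]

# Hypothetical estimates with the first loading fixed to 1; reference group first
lam_fm = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
phis_fm = [1.44, 2.25]

lam_rg, phis_rg = rescale(lam_fm, phis_fm, scale_constant("RG", lam_fm, phis_fm[0]))
lam_ec, phis_ec = rescale(lam_fm, phis_fm, scale_constant("EC", lam_fm, phis_fm[0]))

print(phis_rg)        # the reference group's variance is 1, the focal group's is not
print(lam_ec.mean())  # the average loading is 1
```

In this sketch the restriction on the reference group's variance reaches the focal group only through the shared loadings: the focal group's latent variance remains a free quantity and is merely expressed on the new scale.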

We now estimate the full metric MI model for ten different generated samples, using all scaling methods, i.e., the six different variants of the FM scaling as well as the EC and RG scaling method. We use the ML estimator. The sample size per group is

Regarding the degrees of freedom, there are
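The degrees of freedom for a model of this size can be recounted with a short parameter-counting sketch. It assumes one latent variable with six indicators in two groups, a discrepancy function fitted to the covariance matrices only (or a saturated mean structure), and exactly one scaling restriction per model or per group as indicated in the comments; the function names are hypothetical.

```python
def n_moments(p, groups):
    """Number of distinct variances and covariances across all groups."""
    return groups * p * (p + 1) // 2

def free_configural(p, groups):
    """Per group: p loadings + 1 latent variance + p residual variances - 1 scaling restriction."""
    return groups * (p + 1 + p - 1)

def free_full_metric(p, groups):
    """p shared loadings + one latent variance and p residual variances per group,
    minus a single scaling restriction applied in one group only."""
    return p + groups + groups * p - 1

p, groups = 6, 2
df_configural = n_moments(p, groups) - free_configural(p, groups)
df_metric = n_moments(p, groups) - free_full_metric(p, groups)
print(df_configural, df_metric, df_metric - df_configural)  # 18 23 5
```

Under these assumptions the count does not involve the scaling method at all: whichever restriction is chosen, it removes exactly one free parameter from the full metric MI model.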

The results are presented in

Sample | EC | RG | ||||||
---|---|---|---|---|---|---|---|---|

1 | ||||||||

2 | ||||||||

3 | ||||||||

4 | ||||||||

5 | ||||||||

6 | ||||||||

7 | ||||||||

8 | ||||||||

9 | ||||||||

10 |

Loadings | EC | RG | ||||||
---|---|---|---|---|---|---|---|---|

This result is in line with

Now, we turn to a second observation: Parameters estimated under one scaling method can be converted to those estimated under another scaling method (

and those estimated under

Now, we choose

The choice of the constant exemplifies the mechanism of the conversion: It yields an “estimated” latent variance in the reference group of

To convert from

Denoting the average of the estimated loadings under the ^{7}

The exact number is

Obviously, for our example with one latent variable, a conversion according to

The type of relation in

A general version of a change of scale for the model in

In the following section, we turn over to the formal side of this exemplary consideration and provide a general proof that each scaling method results in the same value of the discrepancy function. As we will see, the idea of a change of scale and the idea of the propagation of the scaling restriction over the groups via the invariance conditions will be essential in proving the equivalence of the scaling methods.
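For orientation, the change of scale can be written in matrix form as a sketch. The symbols are assumptions introduced here for illustration, since they need not match the article's notation: $\Lambda_g$, $\Phi_g$, and $\Theta_g$ denote the loading matrix, the latent covariance matrix, and the residual covariance matrix in group $g$, and $C$ is a diagonal matrix with one nonzero scaling constant per latent variable.

```latex
\begin{align*}
\Sigma_g &= \Lambda_g \Phi_g \Lambda_g^{\top} + \Theta_g, \qquad g = 1, \dots, G, \\
\tilde{\Lambda}_g &= \Lambda_g C, \qquad
\tilde{\Phi}_g = C^{-1} \Phi_g C^{-1}, \qquad
C = \operatorname{diag}(c_1, \dots, c_m), \; c_k \neq 0, \\
\tilde{\Lambda}_g \tilde{\Phi}_g \tilde{\Lambda}_g^{\top}
  &= \Lambda_g C \, C^{-1} \Phi_g C^{-1} \, C \Lambda_g^{\top}
   = \Lambda_g \Phi_g \Lambda_g^{\top}.
\end{align*}
```

Since the same $C$ is used in every group, equal loading matrices remain equal after the change of scale, and the implied covariance matrices $\Sigma_g$ are unchanged.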

The considerations laid out above lead to the following proposition.

If the full metric MI model is estimated by minimizing a discrepancy function, then the resulting optimal values of the discrepancy function as well as the estimated model-implied covariance matrices do not depend on the particular method used for scaling the MI model.

The outline of the proof is as follows: First of all, we will explain why the minimum of the discrepancy function, taken over all parameters fulfilling full metric MI, coincides, for every scaling method, with the discrepancy function’s minimum taken over all parameters simultaneously fulfilling full metric MI as well as the restrictions stemming from the corresponding scaling method. After having established this fact, it will be obvious that the resulting optimal discrepancy value does not depend on the particular scaling method, as the optimal values for different scaling methods all take the same value (namely, the discrepancy function’s minimum taken over all parameters fulfilling full metric MI). As a by-product of proving the invariance of the optimal discrepancy value, the proof will also show that the estimated model-implied covariance matrices do not depend on the method used for scaling the MI model.
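The key step of this outline can be illustrated numerically: every parameter point that fulfills full metric MI has a rescaled counterpart that also fulfills a given scaling restriction and yields the same discrepancy value. The sketch below assumes the ML discrepancy function, a sample-size weighted sum over groups, a one-factor model, and randomly generated sample covariance matrices; none of the numbers or function names stem from the article.

```python
import numpy as np

def f_ml(S, Sigma):
    """ML discrepancy for one group: log|Sigma| + tr(S Sigma^-1) - log|S| - p."""
    p = S.shape[0]
    logdet_sigma = np.linalg.slogdet(Sigma)[1]
    logdet_s = np.linalg.slogdet(S)[1]
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def total_discrepancy(S_list, n_list, lam, phis, thetas):
    """Weighted sum of group discrepancies for a one-factor full metric MI model."""
    N = sum(n_list)
    total = 0.0
    for S, n, phi, theta in zip(S_list, n_list, phis, thetas):
        Sigma = phi * np.outer(lam, lam) + np.diag(theta)
        total += n / N * f_ml(S, Sigma)
    return total

# Hypothetical sample covariance matrices for two groups with six indicators
rng = np.random.default_rng(0)
n_list = [200, 200]
S_list = [np.cov(rng.standard_normal((n, 6)), rowvar=False) for n in n_list]

# An arbitrary parameter point fulfilling full metric MI (shared loadings)
lam = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
phis = [1.44, 2.25]
thetas = [np.full(6, 0.5), np.full(6, 0.6)]

# Its counterpart on the scale where the average loading equals 1
c = 1.0 / lam.mean()
f_original = total_discrepancy(S_list, n_list, lam, phis, thetas)
f_rescaled = total_discrepancy(S_list, n_list, c * lam,
                               [phi / c**2 for phi in phis], thetas)
print(np.isclose(f_original, f_rescaled))
```

Since this holds for any parameter point, restricting the minimization to points that satisfy the scaling restriction cannot change the minimum.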

To follow the outline described above, we denote by

For the ^{8}

Latent variances, being the special case of covariances with

For the

For the

In all three cases, the change of scale can be described by

it follows that, in all groups, the estimated model-implied covariance matrices are identical for ^{9}

If the estimate under the scaling method is uniquely defined, then

If the full metric MI model is estimated by minimizing a discrepancy function, then its

This immediately follows from Proposition 1, as all these quantities are calculated using the full metric MI model’s likelihood value, which does not depend on the particular method used for scaling the model.

The results of the

This immediately follows from the preceding corollary and the well-known fact that the corresponding quantities for the configural model do not depend on the particular method used for scaling the configural model, either.

Concerning the

The degrees of freedom of the

Corollary 3 can easily be adapted to partial metric MI models. In this case, the overall number of invariant indicators will be denoted by

The degrees of freedom of the

The preceding corollary implies a special case which deserves particular attention: the case of

A partial metric MI model in which only one indicator per latent variable is presumed to be invariant is equivalent to the configural model.
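The equivalence can be made plausible with a small sketch. It assumes a one-factor model in two groups and hypothetical configural estimates: a change of scale in the second group makes the loading of the single indicator presumed invariant equal to its value in the first group, without altering the second group's implied covariance matrix. The variable names and numbers are illustrative only.

```python
import numpy as np

def implied_cov(lam, phi, theta):
    """One-factor model-implied covariance matrix."""
    lam = np.asarray(lam, dtype=float)
    return phi * np.outer(lam, lam) + np.diag(theta)

# Hypothetical configural estimates, each group scaled on its own
lam1 = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
lam2, phi2 = np.array([1.0, 0.5, 1.4, 0.6, 0.9, 1.3]), 0.8
theta2 = np.full(6, 0.5)

j = 2                    # index of the only indicator presumed invariant
c = lam1[j] / lam2[j]    # requires a nonzero loading in the second group
lam2_new, phi2_new = c * lam2, phi2 / c**2

loading_matches = np.isclose(lam2_new[j], lam1[j])
fit_unchanged = np.allclose(implied_cov(lam2, phi2, theta2),
                            implied_cov(lam2_new, phi2_new, theta2))
print(loading_matches, fit_unchanged)
```

Every configural solution can thus be rewritten so that it satisfies the single invariance condition, which is why such a condition places no testable restriction on the implied covariance matrices.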

Up to this point, we have mainly considered full metric MI models. In this section, building on the corollaries developed in the preceding section, we want to present a further example, in which we look at partial metric MI models. The example consists of three scenarios, A, B, and C. Each of these scenarios is further divided into two settings. The dgp stays the same as in the previous example; please see the model given in

We used the

Again, we only consider the case when the scaling restrictions are set in the reference group.

Once again, we consider every indicator as a potential RI, i.e., we look at^{12}

This refers to the function measEq.syntax in the semTools package.

Setting 1 | ||
---|---|---|

Invariance condition | ||

Scaling | ||

Setting 2 | ||

Invariance condition | ||

Scaling | ||

Estimating the models produces the results shown in

Setting | Discrepancy function values | |||||||
---|---|---|---|---|---|---|---|---|

1 | ||||||||

2 | ||||||||

In order to get around the problem arising in Scenario A, one might be tempted to ignore the rule of applying scaling restrictions in only one group and apply them in both groups. Doing so, there would be one degree of freedom for the ^{13}

Such a situation may happen when there is a configural model, which has a scaling restriction in each group, and a researcher sets an invariance condition on an estimated loading falsely believing that this
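What a model with a referent indicator fixed in both groups actually constrains can be sketched as follows, under the assumption of a one-factor model in two groups. If the referent loading is fixed to 1 in both groups and one further loading is set equal across groups, the implied covariance matrices can only reflect that the ratio of these two loadings is the same in both groups. The loadings below are hypothetical, and the function `ratio_is_invariant` is an illustrative helper rather than part of the article's code.

```python
import numpy as np

# Hypothetical loadings; only the third indicator (index 2) differs across groups
lam_ref = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
lam_foc = np.array([1.0, 0.8, 0.6, 0.9, 1.1, 0.7])

def ratio_is_invariant(k, ri):
    """Check whether the ratio lambda_k / lambda_ri is equal in both groups."""
    return bool(np.isclose(lam_ref[k] / lam_ref[ri], lam_foc[k] / lam_foc[ri]))

k = 1  # indicator whose loading is set equal across groups
holds_with_first_ri = ratio_is_invariant(k, ri=0)  # referent with equal loadings
holds_with_third_ri = ratio_is_invariant(k, ri=2)  # referent with unequal loadings
print(holds_with_first_ri, holds_with_third_ri)    # True False
```

In this sketch the verdict about the same indicator depends on which referent indicator is fixed in both groups, because a different ratio is being constrained each time.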

Setting 1 | ||
---|---|---|

Invariance condition | ||

Scaling | ||

Setting 2 | ||

Invariance condition | ||

Scaling | ||

Setting | Discrepancy function values | |||||||
---|---|---|---|---|---|---|---|---|

1 | ||||||||

2 | ||||||||

To do so, let us have a look at the situation in Setting 1 and the use of

Scenario B demonstrates another aspect, too. There may be conditions under which identical values of the discrepancy function emerge, even though the model is not scaled according to the rule of setting scaling restrictions in one group only, as it was the case for the ^{14}

The reasons for this are specific to the

For scaling, we use all possible versions of the

For the

Setting 1 | ||
---|---|---|

Invariance conditions | ||

Scaling | ||

Setting 2 | ||

Invariance conditions | ||

Scaling | ||

Finally, for the

The results are presented in

Setting | Discrepancy function values | |||||||||
---|---|---|---|---|---|---|---|---|---|---|

1 | ||||||||||

2 | ||||||||||

In the following section, we will provide the formal proof that for partial metric MI models, the scaling method does not affect the value of the discrepancy function, as long as scaling restrictions are applied in one group only.
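Before turning to the proof, the role of the one-group rule in partial metric MI models can be illustrated with a sketch. It assumes a one-factor model in two groups in which the first four loadings are invariant and the last two are group-specific; the estimates are hypothetical. The scale is changed so that a non-invariant indicator serves as referent in the reference group only, using the same constant in both groups.

```python
import numpy as np

invariant = [0, 1, 2, 3]  # indices of indicators with invariant loadings
lam_ref = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])
lam_foc = np.array([1.0, 0.8, 1.2, 0.9, 0.6, 1.0])
phi_ref, phi_foc = 1.44, 2.25

ri = 4                    # new referent indicator, not invariant
c = 1.0 / lam_ref[ri]     # fixes the referent loading to 1 in the reference group

# Same constant in both groups: invariance conditions stay intact
new_ref, new_foc = c * lam_ref, c * lam_foc
new_phi_ref, new_phi_foc = phi_ref / c**2, phi_foc / c**2
still_invariant = np.allclose(new_ref[invariant], new_foc[invariant])

# Forcing the referent loading to 1 in the focal group as well needs another constant
c_foc = 1.0 / lam_foc[ri]
forced_foc = c_foc * lam_foc
broken = not np.allclose(new_ref[invariant], forced_foc[invariant])

print(still_invariant, new_foc[ri], broken)
```

With the restriction placed in one group only, the focal group's referent loading stays free. Fixing it in the focal group as well cannot be achieved by a change of scale that keeps the invariant loadings equal, so it amounts to an additional restriction.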

The following proposition formalizes the results of the previous section.

If a partial metric MI model is estimated by minimizing a discrepancy function, then the resulting optimal values of the discrepancy function as well as the estimated model-implied covariance matrices do not depend on the particular method used for scaling the partial metric MI model, as long as the scaling method is applied in only one group,

The proof is essentially the same as the one for Proposition 1. The only detail that needs to be given precisely is how exactly to construct the changes of scale that transform a given parameter

For the

For the

For the

If a partial metric MI model is estimated by minimizing a discrepancy function, then its

This immediately follows from Proposition 2, as all these quantities are calculated using the partial metric MI model’s likelihood value, which does not depend on the particular method used for scaling the model.

The results of the

This immediately follows from the preceding corollary and the well-known fact that the corresponding quantities for the configural model do not depend on the particular method used for scaling the configural model, either.

In this paper, our goal was to clarify the impact of the various scaling methods on the estimation results for metric measurement invariance models. To this end, we addressed both full and partial metric MI models, and the results were laid out by means of worked examples as well as theoretical results with formal proofs. A first important insight of the paper is that scaling restrictions for metric MI models must be placed in one group only, which is of particular importance for partial metric MI models if the

For partial metric MI models, the issues discussed in the literature regarding scaling and in particular RI selection originate from scaling restrictions being set in all groups, instead of in one group only. For instance, when using the ^{15}

The semTools package’s help page for the function measEq.syntax correspondingly warns users that the RI’s loadings cannot be freed.

Thus, choosing different RIs leads to different sets of indicators whose loadings’ invariance is under examination. Consequently, the results differ, a phenomenon known as constraint interaction. These problems, however, can easily be avoided by setting scaling restrictions properly, i.e., by placing them in one group only. Apart from scaling in all groups instead of in only one, some of the concerns regarding the RI selection as well as concerns regarding other scaling methods, e.g., that the RG method implies an invariance assumption about the latent variances in the reference group (cf.,

One of the surprising results of this paper is that it is impossible to test the invariance or non-invariance of

With regard to the number of degrees of freedom for full and partial metric MI models, we provided formulas for calculating these easily. Given the findings of

The corollaries concerning the degrees of freedom also point at another issue. For the FM scaling method, ^{16}

Concerning the fourth and last question, all scaling methods have the same mechanism: the scaling restriction is only applied in the reference group and propagated by means of the MI conditions to the other groups. Thus, there are again no differences between the various scaling methods in this regard. In particular, this mechanism is the foundation for the,

To sum up, there are no potential issues concerning the choice of the RI when using the

At this point, we want to emphasize that our consideration refers primarily to the discrepancy function and, in turn, to the LR-test in the form of the

We want to note that some parts of the results we presented are already present in the current literature. For instance,

Finally, we want to note that we also did not turn our attention to factors that potentially have effects on metric MI tests, e.g., the size of the manifest variables’ residual variances. For instance, the example of

χ²-test statistic of the metric invariance model

The R code to replicate the results of the current study, models and scenarios, as well as the supporting information, are freely available and can be found in the

The supplementary materials provided are the R code to replicate the results of the current study, models and scenarios, and supporting information (see

The authors have no funding to report.

The authors have declared that no competing interests exist.

The authors are grateful for valuable remarks from Sandra Baar, Martin Becker, Martin Klein, Mireille Soliman, and Holger Steinmetz (in alphabetical order).