
Methods to assess measurement invariance in constructs have received much attention, as invariance is critical for accurate group comparisons. Less attention has been given to the identification and correction of the sources of non-invariance in predictive equations. This work developed correction factors for structural intercept and slope bias in common regression equations to address calls in the literature to revive test bias research. We demonstrated the correction factors in regression analyses within the context of a large international dataset containing 68 countries and regions (groups). A mathematics achievement score was predicted by a math self-efficacy score, which exhibited a lack of invariance across groups. The proposed correction factors significantly reduced structural intercept and slope bias across groups. The impact of the correction factors was greatest for groups with the largest amount of bias. Implications for both practice and methodological extensions are discussed.


Measurement invariance (MI) is an essential component for constructing a validity argument and a precursor to score use. Validity evidence related to associations with external variables is especially important to support inferences across groups. MI effects are documented for mean differences and group comparisons (

Item and factor structure differences may be present as a result of cultural differences (

Factor invariance can be examined at the factor structure level through multigroup confirmatory factor analysis (MGCFA;

Several options exist when analyzing data that lack FI. Researchers can ignore the problem or estimate non-equivalence models (

Assumptions about regression (structural) slope and intercept bias in predictive equations have been questioned.

Latent variables and factor scores are commonly used in predictive equations. Unfortunately, the predictive invariance of such equations can be compromised by the use of assessments that lack invariance across groups (

Detection of invariance in a model has received much attention (

Numerous combinations of noninvariance can occur among measurement intercepts and slopes for both criterion and predictor variables. These combinations, along with differences in group population means, can mask predictive bias (

We developed two correction factors (CF), one for intercept and one for slope in a common regression equation, to adjust bias between groups that originated from a lack of FI at the item level. These CFs were designed to (a) adjust bias

The primary benefits of this method are (a) a more objective approach to adjusting predictive bias that precludes the need for subjective decisions, (b) no reliance on fit statistics and thresholds (i.e., Chen's 2007 fit criteria) for models and data that likely do not match the narrow simulated characteristics of the models from which the fit criteria were obtained, (c) a quantification of predictive bias in terms of the specific values of the slope and intercept CFs for each group, (d) the use of a single predictive model (with CFs as additional inputs), and (e) the ability to test for statistically significant differences between the CFs of different groups.

We hypothesized that the CFs would allow for greater predictive accuracy compared to prediction without the corrections. We also hypothesized that predictive bias would be controlled to a greater extent in groups where FI was not present in the predictor variable than in their counterparts where FI was present.

Our overall procedure was to first examine the FI of a construct at the configural, metric, and scalar levels. Second, we constructed our CFs. Third, we tested the CFs in a regression equation. We used Mplus (Version 7.4) for all analyses (

We used the 2012 Programme for International Student Assessment (PISA) student dataset (

We selected a set of variables to construct a latent variable regression model to illustrate the CF method. The PISA math achievement score (PVMATH, α = 0.91) was the dependent variable and mathematics self-efficacy, an 8-item scale (α = 0.84), was the independent variable. The PISA math achievement score was selected as the criterion to minimize any possible confounding effects that might result from the use of a psychological construct that also suffered from a lack of FI. Furthermore, regardless of whether a single factor or two factors underlie the criterion and predictor, there is no difference in the relationship between predictive slope invariance and predictor FI (

Given the negative effect of different reference and target group sample size proportions on predictive intercept bias (

We examined the mathematics self-efficacy items for metric and scalar invariance between each country and a 1-factor baseline model established by the reference group. Every country was compared with the reference group using progressively constrained models (
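Progressively constrained nested models of this kind are typically compared with a chi-square difference (likelihood-ratio) test. The sketch below shows that comparison in generic form; the fit statistics are hypothetical values, not results from this study, and the function name is our own.

```python
from scipy.stats import chi2

def chisq_diff_test(chisq_constrained, df_constrained, chisq_free, df_free):
    """Chi-square difference test between nested MGCFA models,
    e.g., a metric (constrained) model vs. a configural (free) model.
    A small p-value suggests the added invariance constraints worsen fit."""
    d_chisq = chisq_constrained - chisq_free
    d_df = df_constrained - df_free
    p_value = chi2.sf(d_chisq, d_df)  # upper-tail probability
    return d_chisq, d_df, p_value

# hypothetical fit statistics for one country compared with the reference group
d_chisq, d_df, p = chisq_diff_test(chisq_constrained=312.4, df_constrained=180,
                                   chisq_free=290.1, df_free=172)
```

In practice, the constrained and free chi-square values would come from the latent variable software's fit output for each pair of adjacent invariance levels.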

The structural intercept correction factor was created by first calculating an estimate of the predictive intercept bias for each group in our dataset using an equation from

where the G-subscripted terms denote group-level quantities (e.g., criterion means, standard deviations, and predictor-criterion correlations, with their reference-group counterparts subscripted rG) and b_{0} is the common regression line intercept.

The correction factor can then be used in a predictive equation such as a common regression line for all groups in the dataset. Hence, the new common regression line carries an adjusted intercept, denoted by the subscript adj.
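The article's intercept bias estimate is derived from measurement-level quantities in an equation not reproduced above. Purely as an illustrative stand-in, the sketch below estimates a group's intercept bias as its mean residual around the common regression line; the function name and the residual-based shortcut are our own assumptions.

```python
import numpy as np

def intercept_cf(y_group, x_group, b0, b1):
    """Illustrative intercept correction factor for one group: the mean
    residual of the group's scores around the common regression line.
    (The article derives its estimate from group means, reliabilities,
    and correlations rather than raw residuals.)
    A negative value means the common line over-predicts for the group."""
    x_group = np.asarray(x_group, dtype=float)
    y_group = np.asarray(y_group, dtype=float)
    return float(np.mean(y_group - (b0 + b1 * x_group)))

# a group whose scores sit 0.5 below the common line y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x - 0.5
cf = intercept_cf(y, x, b0=1.0, b1=2.0)
```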

The structural slope correction factor was calculated as a function of the ratio of the target and reference group predictor factor variances (

where the subscripted terms denote the group-specific predictor factor variance quantities entering the ratio.
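Because the exact functional form of the slope correction is not reproduced above, the sketch below uses a plain ratio of the target and reference groups' predictor factor variances, which the text identifies as the key ingredient; the specific formula and function names are illustrative assumptions only.

```python
def slope_cf(var_pred_target, var_pred_reference):
    """Illustrative slope correction factor: ratio of the target group's
    predictor factor variance to the reference group's. The article's
    exact functional form may differ from this simple ratio."""
    return var_pred_target / var_pred_reference

def adjust_slope(b1_common, cf):
    """Rescale the common regression slope for a target group."""
    return b1_common * cf

# a target group with 1.5x the reference group's predictor factor variance
cf = slope_cf(var_pred_target=1.5, var_pred_reference=1.0)
b1_target = adjust_slope(0.8, cf)
```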

Following recommendations by

The common regression model was then modified using the correction factors (
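A minimal sketch of a common regression line modified by group-specific correction factors follows. The additive intercept and multiplicative slope composition, and all numeric values, are illustrative assumptions rather than the article's exact equation.

```python
import numpy as np

def predict_corrected(x, b0, b1, cf_intercept, cf_slope):
    """Common regression prediction with group-specific correction
    factors (ASSUMED additive for the intercept, multiplicative for
    the slope, purely for illustration)."""
    x = np.asarray(x, dtype=float)
    return (b0 + cf_intercept) + (b1 * cf_slope) * x

# hypothetical common-line coefficients and one group's correction factors
y_hat = predict_corrected([0.0, 1.0], b0=0.5, b1=0.8,
                          cf_intercept=-0.14, cf_slope=0.95)
```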

Metric FI held for the mathematics self-efficacy items across all countries, providing no guidance for where predictive slope bias would eventually be found between our common regression and the individual country regressions. A lack of scalar FI appeared in 32 countries. We would expect, therefore, that countries exhibiting a lack of scalar invariance would also exhibit greater predictive intercept bias than the countries where scalar FI held. MGCFA model results are presented in

Table S1 in the

The average magnitude of the predictive (structural) slope bias (Table S1, Column 2 in

The average magnitude of

All calculations for the results in Table S1 are provided in

Disaggregating results by each correction factor, the intercept correction factor reduced structural intercept bias by anywhere from 1% to nearly 100%. Figure S1 in the Supplementary Materials shows the relationship between the original amount of structural intercept bias and the reduction achieved by the correction (R^{2} = 78%). For every percent increase in structural intercept bias, the correction factor reduced bias by 4.4%. The countries where the structural intercept correction performed poorly (e.g., Liechtenstein) displayed individual regression line slopes that deviated from the common regression line slope much more than countries where the intercept correction factor did perform well.

Focusing on the contribution of the structural slope correction factor, the structural slope correction reduced predictive bias in 70% of the countries beyond the reduction obtained with the structural intercept correction alone. Figure S2 in

As with our examination of the intercept correction factor, we examined the slope correction with cases where the

| Breakdown of averages | Original unadjusted predictive bias (%) | Predictive bias after intercept adjustment (%) | Predictive bias after intercept and slope adjustment (%) |
|---|---|---|---|
| Average predictive bias across all 68 countries | 7.79 | 4.19 | 3.22 |
| Average across countries with original bias LESS THAN 7% | 4.31 | 3.61 | 3.06 |
| Average across countries with original bias GREATER THAN 7% | 12.48 | 4.97 | 3.44 |

We note that for the countries and regions where a lack of FI was identified, the amount of bias reduction achieved by the correction factors was strongly related to the original amount of predictive bias (R^{2} = 50%). For every percent increase in predictive bias, the correction factors reduced bias by 36%.

The results for Brazil, Macao, and Liechtenstein are outlined below to illustrate examples of how the correction factors work at the group level. First, the common regression equation calculated across all countries is shown in

The intercept and slope adjustment values for Brazil are -0.1426 and 0.4569, respectively. Here b_{0} and b_{1} are the coefficients of the common regression line calculated across all groups. All calculations are performed using the unrounded values (see

Inclusion of the coefficients from the common regression line into

Similar calculations for group members of Macao and Liechtenstein produce

The panels shown in

This study established and demonstrated two correction factors for predictive bias found in common regression lines estimated using group data from a large-scale international dataset with a lack of FI in the predictor variable. The aim was to provide researchers with a third option for accounting for bias, beyond ignoring it or using partial invariance models, one that minimizes the need to make judgements about the presence and degree of measurement noninvariance. The method we propose, while applicable in its current form, is not necessarily intended to be a definitive answer to this complex problem, but a new approach, open to further development, that synthesizes a wide variety of information (e.g.

The correction factors produced large and consistent adjustments to

The performance of the correction factors also suggests that over-correction for bias is fairly minimal and limited to cases where predictive bias is less than 5% (Figure S2 in

The use of the correction factors addresses standards (e.g.,

Both correction factors are easy to create and do not require advanced statistical knowledge to implement. Factor scores from latent variables, reliability estimates, sample proportions, and factor variances are all easily obtained from software packages capable of latent variable estimation. The correction factors should be helpful in large cross-cultural or international datasets where the number of groups is large enough that the use of group indicator variables, or partial invariance techniques, is impractical. We further recommend that researchers use MGCFA and established threshold-based invariance measures (e.g.
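As one example of how readily such inputs can be computed, scale reliability (Cronbach's alpha, reported earlier as 0.91 and 0.84 for the two scales) can be obtained directly from an item-score matrix. The sketch below is a generic implementation, not code from the study.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    scores = np.asarray(item_scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)   # sample variance per item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# perfectly consistent items yield alpha = 1
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```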

The implications of the statistical bias (approximately 4%) in

Given our data source, we had to assume that the criterion was free from measurement problems. While this is a reasonable assumption given the variable, we could not verify it. The effects of a lack of FI on both the predictor and the criterion while using correction factors deserve attention.

To address such limitations, a series of simulation studies is recommended. First, a diverse set of conditions should manipulate the elements used in our correction factors, in addition to other influences (e.g., restriction of range), to understand how the correction factors function under a range of conditions where bias is known and varied, especially intercept and slope bias. Second, such work can help determine whether the adjustments over- or under-correct for bias and whether reference and focal groups are misrepresented by such corrections. Third, exploration of different reference group selection criteria is needed, such as comparing our random selection method with the use of an established group (e.g., country or language group) as the reference. Fourth and finally, comparisons of this method with other methods such as the Alignment Method, and recent work using regularized nonlinear multigroup factor analysis for invariance (e.g.,

Thus, given the legal and practical implications of adjusting data and the unknown aspects of the adjustments, we cannot recommend the method for use without further evaluation. However, we do encourage continued research in this area to build a stronger analytical and empirical connection between measurement issues and predictive bias.

A lack of measurement invariance, specifically FI, in predictor variables can have a cascading effect on predictive equations, resulting in differential prediction or test bias. This can have meaningful implications for individuals and groups, with some being unfairly favored over others. Our proposed correction factors have the potential to extend well beyond educationally related variables in cross-cultural settings. These methods may be useful for a variety of group comparisons and the inferences they are meant to support.

The authors have no funding to report.

The authors have declared that no competing interests exist.

The authors have no support to report.