
Two of the most important extensions of the basic regression model are moderated effects (due to interactions) and mediated effects (i.e. indirect effects). Combinations of these effects may also be present. In this work, an important, yet missing combination is presented that can determine whether a moderating effect itself is mediated by another variable. This ‘indirect moderation’ model can be assessed by a four-step decision tree which guides the user through the necessary regression analyses to infer or refute indirect moderation. A simulation experiment shows how the method works under some basic scenarios.

Linear regression is the most common version of regression analysis (and the only one this report will focus on). In its simplest form, it can be used to estimate the relationship between a dependent variable (also called target, or outcome) and a predictor variable (also called independent variable, or feature; the term covariates is often used to indicate control variables). A regression model is a supervised prediction model, which estimates the relationship between some predictor variable(s) X and a target variable Y. The resulting regression model can be used to predict unknown Y values from observed X observations.

$$Y = \beta_{10} + \beta_{1X}X + \varepsilon_{1}$$

which can be conceptually represented as a path model, shown in

Before discussing some extensions of the regression model, please be aware that throughout this paper the error terms (ε) for the endogenous variables (i.e., variables which have an arrow pointing towards them) are not depicted in the figures. In case of multiple predictors, a path model should specify correlations between the exogenous variables (i.e., those which do not have an arrow pointing towards them). For simplicity, these correlations will not be depicted in the figures in this paper. Finally, the regression equations in this manuscript all include an intercept term for completeness, even when it can be assumed to be zero (for example after all variables are mean centred or standardized).

Mediation occurs when a variable

The most common method to assess mediation, by

$$Y = \beta_{10} + \beta_{1X}X + \varepsilon_{1}$$

$$M = \beta_{20} + \beta_{2X}X + \varepsilon_{2}$$

$$Y = \beta_{30} + \beta_{3X}X + \beta_{3M}M + \varepsilon_{3}$$
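As a reminder of how the three regressions are linked, the second equation can be substituted into the third. This is a standard population-level identity for linear models, written here in the notation of the equations above; it is a sketch added for orientation, not a step of the testing procedure itself.

$$\begin{aligned}
Y &= \beta_{30} + \beta_{3X}X + \beta_{3M}\left[\beta_{20} + \beta_{2X}X + \varepsilon_{2}\right] + \varepsilon_{3} \\
  &= \left(\beta_{30} + \beta_{3M}\beta_{20}\right) + \left(\beta_{3X} + \beta_{2X}\beta_{3M}\right)X + \left(\beta_{3M}\varepsilon_{2} + \varepsilon_{3}\right)
\end{aligned}$$

Comparing the coefficient of X with the first regression gives $\beta_{1X} = \beta_{3X} + \beta_{2X}\beta_{3M}$: the total effect is the sum of the direct effect and the indirect effect through M.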

Mediation is determined when four conditions are met (

The effect of X on Y in

X has a significant effect on the mediator M (i.e.,

M has a significant effect on Y when controlling for X (i.e.,

The direct effect of X on Y in

In complex models, with multiple mediators,

In moderation, or interaction, the strength of the relationship between two variables depends on the value of a third variable (

The general approach in moderation analysis to what is commonly called linear-by-linear interaction (

$$Y = \beta_{40} + \beta_{4X}X + \beta_{4Z}Z + \beta_{4(ZX)}ZX + \varepsilon_{4}$$

or

$$Y = \left[\beta_{40} + \beta_{4Z}Z\right] + \left[\beta_{4X} + \beta_{4(ZX)}Z\right]X + \varepsilon_{4}$$

By rewriting

An important note here is that the conceptual representation of the moderation model in

When doing moderation analysis, the ‘main effects’ take on a specific meaning. If a moderating effect is present, the parameters b_{4X} and b_{4Z} should only be interpreted as restricted conditional effects. That is, b_{4X} is the expected difference in Y between two observations that differ by one unit on X and that both score zero on Z. When the data are mean centred, the estimate b_{4X} is the expected difference in Y between two cases with a one-unit difference in X and mean scores on Z. In many cases, therefore, the specific value of a main effect is not very informative.

Apart from the added value of interpretation, mean centring can also decrease the correlation of lower order terms with their product-terms, thus decreasing non-essential multicollinearity (
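The effect of mean centring on non-essential multicollinearity can be illustrated with a small simulation. The sketch below is not part of the original study: the variable names, the sample size, and the choice of independent normal scores with mean 5 are assumptions made purely for illustration.

```python
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(1)
n = 5000
# Hypothetical raw scores with non-zero means (assumed, not from the paper).
x = [rng.gauss(5.0, 1.0) for _ in range(n)]
z = [rng.gauss(5.0, 1.0) for _ in range(n)]

# Correlation between the lower-order term X and the raw product XZ.
r_raw = pearson(x, [xi * zi for xi, zi in zip(x, z)])

# Same correlation after mean centring both variables.
mean_x, mean_z = sum(x) / n, sum(z) / n
xc = [xi - mean_x for xi in x]
zc = [zi - mean_z for zi in z]
r_centred = pearson(xc, [xi * zi for xi, zi in zip(xc, zc)])

print(f"corr(X, XZ) raw: {r_raw:.2f}; after centring: {r_centred:.2f}")
```

With raw scores the product term is strongly correlated with X simply because both variables have non-zero means; after centring that correlation is close to zero in this setup.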

Current integrated models of mediation and moderation in the literature begin with a mediation model, where effects from and/or to the mediator are moderated by a fourth variable (

One case of the moderated mediation model has, rather confusingly, been called ‘mediated moderation’ (

The ‘mediated moderation’ model is assessed by checking whether a basic moderation model (

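$$Y = \beta_{40} + \beta_{4X}X + \beta_{4Z}Z + \beta_{4(ZX)}ZX + \varepsilon_{4}$$

$$M = \beta_{50} + \beta_{5X}X + \beta_{5Z}Z + \beta_{5(ZX)}ZX + \varepsilon_{5}$$

$$Y = \beta_{60} + \beta_{6X}X + \beta_{6M}M + \beta_{6Z}Z + \beta_{6(ZX)}ZX + \varepsilon_{6}$$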

which correspond to the path model depicted in

The conditions for first-stage moderated mediation (a.k.a., ‘Mediated Moderation’) are (

1) b_{5(ZX)} is significant.

2) b_{6(ZX)} is smaller in absolute value than b_{4(ZX)}.

A more stringent method would be to test for the difference in the moderating effect between

The missing link in the regression framework is a model in which moderation is indirect (rather than an indirect effect being moderated). To avoid further confusion, the model will be referred to as ‘indirect moderation’.

Logically incorporating mediation analysis in a moderation model means that assessment of indirect moderation requires three regression models. The letter W is used instead of M to indicate the functional difference of the variable in other models.

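$$Y = \beta_{40} + \beta_{4X}X + \beta_{4Z}Z + \beta_{4(ZX)}ZX + \varepsilon_{4}$$

$$Y = \beta_{70} + \beta_{7X}X + \beta_{7Z}Z + \beta_{7(ZX)}ZX + \beta_{7W}W + \beta_{7(WX)}WX + \varepsilon_{7}$$

or

$$Y = \beta_{70} + \left[\beta_{7Z}Z + \beta_{7W}W\right] + \left[\beta_{7X} + \beta_{7(ZX)}Z + \beta_{7(WX)}W\right]X + \varepsilon_{7}$$

$$W = \beta_{80} + \beta_{8Z}Z + \beta_{8X}X + \varepsilon_{8}$$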

The path model of the indirect moderation model, corresponding to the

Following the same logic of testing mediation and moderation, indirect moderation is inferred if:

there is a moderating effect of Z without considering W (i.e., β_{4(ZX)} ≠ 0),

the moderating effect of Z is not present when W is included as a moderator (i.e., β_{7(ZX)} = 0),

Z has an effect on W (i.e., β_{8Z} ≠ 0), and

W moderates the effect of X on Y (i.e., β_{7(WX)} ≠ 0).

The method described above is used to detect pure indirect moderation. Impure indirect moderation would have the looser restriction |β_{7(ZX)}| < |β_{4(ZX)}|. One idea is to adapt the method by

Since the model without W is nested in the model with W (i.e., it is obtained when β_{7W} = 0 and β_{7(WX)} = 0), one might want to use a test for model comparison (e.g., an R^{2}-change test). However, when the moderating effect of Z (β_{7(ZX)}) becomes non-significant after the inclusion of the W terms, it may be that no more variance is explained, but that it is explained in a better way.

There are several situations, based on the indirect moderation model, where W can be a mediator of the moderating effect of Z, all of which are based on the three characteristics that Z has an effect on W, Z is a moderator of the effect of X on Y when W is not taken into account, and the moderating effect of Z is weaker when W is taken into account. The discussion here will, however, be restricted to linear-by-linear moderation of continuous variables and the situation in which W is a pure mediator of the moderating effect of Z (i.e., Z does not have an additional moderating effect in the population).

When using this decision-tree one will obviously decide ‘no’ in some cases. However, all is not lost when some criteria for indirect moderation are not met. When no moderating effect of Z is found in Step 1, one can stop the analysis for indirect moderation. When W is not a significant moderator in the next step, one could repeat Step 2 with another potentially confounding moderator. Note that the choice of these confounders should be based on theoretical rather than statistical arguments.

When another moderator is found in Step 3 of the decision-tree and the moderating effect of Z remains significant, one may conclude a multiple moderator model and again undertake the prescribed steps to assess impure (or partial) indirect moderation. Such models may be investigated further, but doing so is beyond the scope of the current paper.

If, however, Z does fail to be a moderator when W is included as a moderator but there is no effect of Z on W, one will conclude that there is a spurious (non-authentic) moderating effect of Z. Not all these possibilities have been explicitly recorded. However, the reader might be able to infer the quantitative data for some of these alternatives from the results described below and by running the simulation with different criteria (syntax for the study below is provided as supplemental material).
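The branching described in the preceding paragraphs can be summarised as a small function. This is a minimal sketch rather than the authors' implementation: the function name, the argument names, and the returned labels are hypothetical, and the ordering of the four steps is inferred from the text and the result tables (Step 1: Z moderates without W; Step 2: W moderates; Step 3: Z no longer moderates once W is included; Step 4: Z affects W).

```python
def indirect_moderation_decision(z_moderates_alone, w_moderates,
                                 z_moderates_with_w, z_affects_w):
    """Walk the four steps; each argument is a bool from a significance test.

    The names and returned labels are illustrative assumptions.
    """
    # Step 1: is beta_4(ZX) significant in the model without W?
    if not z_moderates_alone:
        return "stop: no moderating effect of Z"
    # Step 2: is beta_7(WX) significant in the model with both moderators?
    if not w_moderates:
        return "W is not a moderator: consider another candidate"
    # Step 3: has beta_7(ZX) become non-significant now that W is included?
    if z_moderates_with_w:
        return "multiple moderators: Z and W both moderate"
    # Step 4: is beta_8Z significant, i.e. does Z have an effect on W?
    if not z_affects_w:
        return "spurious moderating effect of Z"
    return "indirect moderation"


# Example: Z moderates alone, W moderates, Z drops out, and Z predicts W.
print(indirect_moderation_decision(True, True, False, True))
```

Passing the outcomes of the significance tests as booleans keeps the sketch independent of any particular regression routine.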

To study the behaviour of the decision-tree, data was generated and analysed using the open-source programming software ‘R’ (

$$Y = \beta_{X}X + \beta_{Z}Z + \beta_{W}W + \beta_{WX}WX + \varepsilon_{Y}, \quad \text{with } \varepsilon_{Y} \sim \text{Normal}(0, 1)$$
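To make the data-generating model concrete, the following sketch draws one large data set and fits the Step 1 regression, in which Z is used as the moderator and W is left out. It is written in Python with the standard library only, whereas the study itself used R. The main-effect values, the seed, the sample size, and the way X, Z and W are drawn (standard normal, X independent of Z and W) are all assumptions for illustration. Under those assumptions the population coefficient of ZX in the Step 1 model equals β_{WX} ρ_{ZW}.

```python
import random


def ols(columns, y):
    """OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    design = [[1.0] * len(y)] + columns
    k = len(design)
    # Augmented matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(design[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * u for v, u in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def generate(n, beta_wx, rho_zw, rng, b_x=0.3, b_z=0.3, b_w=0.3):
    """Draw one data set; the main-effect values b_x, b_z, b_w are hypothetical."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [rng.gauss(0, 1) for _ in range(n)]
    # W is standard normal with corr(Z, W) = rho_zw (assumed construction).
    w = [rho_zw * zi + (1 - rho_zw ** 2) ** 0.5 * rng.gauss(0, 1) for zi in z]
    y = [b_x * xi + b_z * zi + b_w * wi + beta_wx * wi * xi + rng.gauss(0, 1)
         for xi, zi, wi in zip(x, z, w)]
    return x, z, w, y


rng = random.Random(2024)
beta_wx, rho_zw = -0.4, 0.6
x, z, w, y = generate(20000, beta_wx, rho_zw, rng)
zx = [xi * zi for xi, zi in zip(x, z)]
# Step 1 model: Z is (wrongly) used as the moderator and W is left out.
b0, b_x_hat, b_z_hat, zx_estimate = ols([x, z, zx], y)
print(f"estimated ZX coefficient: {zx_estimate:.3f}; "
      f"beta_WX * rho_ZW = {beta_wx * rho_zw:.3f}")
```

The estimate for ZX is clearly non-zero even though Z has no moderating effect of its own in the generating model, which is the mechanism behind the inflated Step 1 rejection rates.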

The relationships between the three continuous variables were manipulated to determine the effects on the regression parameter estimates of interest in the decision-tree. For datasets with sample sizes of

For each of these 2 × 5 × 3 = 30 conditions, 10,000 random data sets were generated and analysed with the four-step approach described above. The results of these analyses are reported in two ways. Firstly, the results are given as marginal proportions of ‘yes’ answers to each step in the decision tree. This allows for detailed evaluation of Type I error rates and of the power to detect certain effects. Secondly, the results are provided in conjunctional form. That is, it is evaluated how often the steps of the decision-tree were successively answered with ‘yes’. After four successive ‘yes’ answers, indirect moderation is inferred.
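The difference between the marginal and the conjunctional summaries can be sketched as follows. The function name and the four toy outcomes are hypothetical; each tuple stands for the yes/no answers to Steps 1 to 4 for one simulated data set, where a ‘yes’ in Step 3 is taken to mean that the moderating effect of Z is no longer significant.

```python
def summarise(outcomes):
    """Return (marginal, conjunctional) proportions of 'yes' per step.

    outcomes: list of 4-tuples of bools, one tuple per simulated data set.
    """
    n = len(outcomes)
    # Marginal: each step is counted on its own.
    marginal = [sum(o[step] for o in outcomes) / n for step in range(4)]
    # Conjunctional: a step counts only if all earlier steps were also 'yes'.
    conjunctional = [sum(all(o[:step + 1]) for o in outcomes) / n
                     for step in range(4)]
    return marginal, conjunctional


# Four toy replications (made-up values, not simulation results).
toy = [
    (True, True, True, True),
    (True, False, True, True),
    (False, True, True, False),
    (True, True, True, False),
]
marginal, conjunctional = summarise(toy)
print(marginal)
print(conjunctional)
```

The last conjunctional entry is the proportion of data sets for which indirect moderation would be inferred.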

It is important to, a priori, determine relationships between the parameters, based on the relationships between the variables. Since the method starts with inclusion of the wrong moderator, it is helpful to investigate how the model parameters are related. Specifically, how other factors influence the parameter β_{4ZX} in

where β_{WX} is the moderating effect of W in the true model, σ represents the standard deviation of the variable noted in the subscript and ρ refers to a specific element from the correlation matrix:

For completeness: the mathematical result only holds when applied to population data and under the assumption that W is the true moderator, while Z is treated as moderator instead and all variables are normally distributed.

moderating effect of W increases,

variance of W increases,

variance of Z decreases,

correlation between Z and W increases,

correlation between Z and X decreases,

product of the correlations between (W and Z) and (Z and X) increases

It is a straightforward prediction that the Type I error rate for the regression parameter for the moderating effect of Z will be inflated (i.e., a significant moderating effect of Z will be found more than 5% of the time) in the study described above as a function of the moderating effect of W and the correlation ρ_{ZW}. This prediction is justified since all other variables from

| Sample size | β_{WX} | ρ_{ZW} | Step 1 | Step 2 | Step 3 | Step 4 | Indirect Moderation |
|---|---|---|---|---|---|---|---|
| 100 | -0.4 | 0 | 0.118 | 0.961 | 0.948 | 0.052 | 0.010 |
| 100 | -0.4 | 0.3 | 0.327 | 0.934 | 0.952 | 0.870 | 0.271 |
| 100 | -0.4 | 0.6 | 0.661 | 0.834 | 0.948 | 1.000 | 0.539 |
| 100 | -0.2 | 0 | 0.071 | 0.500 | 0.949 | 0.052 | 0.002 |
| 100 | -0.2 | 0.3 | 0.130 | 0.454 | 0.951 | 0.871 | 0.049 |
| 100 | -0.2 | 0.6 | 0.262 | 0.334 | 0.950 | 1.000 | 0.088 |
| 100 | 0 | 0 | 0.053 | 0.047 | 0.946 | 0.053 | 0.000 |
| 100 | 0 | 0.3 | 0.048 | 0.053 | 0.953 | 0.865 | 0.001 |
| 100 | 0 | 0.6 | 0.049 | 0.050 | 0.951 | 1.000 | 0.001 |
| 100 | 0.2 | 0 | 0.065 | 0.501 | 0.953 | 0.050 | 0.003 |
| 100 | 0.2 | 0.3 | 0.133 | 0.454 | 0.949 | 0.862 | 0.050 |
| 100 | 0.2 | 0.6 | 0.254 | 0.326 | 0.948 | 1.000 | 0.083 |
| 100 | 0.4 | 0 | 0.118 | 0.955 | 0.951 | 0.053 | 0.010 |
| 100 | 0.4 | 0.3 | 0.317 | 0.937 | 0.952 | 0.862 | 0.264 |
| 100 | 0.4 | 0.6 | 0.668 | 0.828 | 0.948 | 1.000 | 0.542 |
| 250 | -0.4 | 0 | 0.173 | 1.000 | 0.948 | 0.051 | 0.012 |
| 250 | -0.4 | 0.3 | 0.635 | 1.000 | 0.950 | 0.999 | 0.609 |
| 250 | -0.4 | 0.6 | 0.964 | 0.997 | 0.947 | 1.000 | 0.919 |
| 250 | -0.2 | 0 | 0.084 | 0.895 | 0.954 | 0.048 | 0.004 |
| 250 | -0.2 | 0.3 | 0.252 | 0.843 | 0.950 | 0.998 | 0.197 |
| 250 | -0.2 | 0.6 | 0.562 | 0.697 | 0.950 | 1.000 | 0.387 |
| 250 | 0 | 0 | 0.051 | 0.050 | 0.951 | 0.050 | 0.000 |
| 250 | 0 | 0.3 | 0.053 | 0.049 | 0.948 | 0.999 | 0.001 |
| 250 | 0 | 0.6 | 0.051 | 0.052 | 0.951 | 1.000 | 0.001 |
| 250 | 0.2 | 0 | 0.084 | 0.891 | 0.951 | 0.050 | 0.004 |
| 250 | 0.2 | 0.3 | 0.251 | 0.848 | 0.951 | 0.998 | 0.202 |
| 250 | 0.2 | 0.6 | 0.553 | 0.699 | 0.951 | 1.000 | 0.384 |
| 250 | 0.4 | 0 | 0.169 | 1.000 | 0.949 | 0.049 | 0.011 |
| 250 | 0.4 | 0.3 | 0.641 | 1.000 | 0.951 | 0.999 | 0.611 |
| 250 | 0.4 | 0.6 | 0.963 | 0.997 | 0.949 | 1.000 | 0.920 |

The results are in line with the expectation that, depending on the correlation between the proposed and the true moderator, the probability of making a Type I error for the wrong moderator becomes greatly inflated. Importantly, the Type I error rate is a function of the strength of the moderating effect of W (even though that predictor is not included in the regression model).

The results also show a detrimental effect of including a correlated non-moderating variable as a moderator on the probability of finding a significant result for the true moderator. That is, the Type II error rate increases with the correlation between W and Z. Next to the mathematical proof, this behaviour can be seen in the second column of the primary results. The number of significant results diminishes when the correlation between W and Z increases, keeping all other things constant.

When sample size increases, the probability of finding a significant result for the moderating effect of Z also increases, so that more Type I errors are made. The overall probability of finding a significant result for the moderating effect of W also increases with sample size. As far as these results go, the effect of including both Z and W as moderators on the probability of finding a significant result for the moderating effect of Z does not depend on sample size. For the effect of Z on W, the same parameter behaviour was observed for both sample sizes, although the larger sample size yields more significant results overall. Hence, not surprisingly, the probability of correctly inferring indirect moderation increases with the sample size.

The entire column belonging to Step 3 shows that the Type I error rate is nominal in all conditions. The powerful implication is that, regardless of whether a variable (in this case Z) is wrongly classified as a moderator at first, including the true moderator will control for the faulty result. Covariates should therefore also be included in regression models to control for spurious moderation, and not only for spurious main effects.

Instead of evaluating the steps separately,

| Sample size | β_{WX} | ρ_{ZW} | Step 1 | Step 2 | Step 3 | Step 4 | Indirect Moderation |
|---|---|---|---|---|---|---|---|
| 100 | -0.4 | 0 | 0.118 | 0.113 | 0.087 | 0.010 | 0.010 |
| 100 | -0.4 | 0.3 | 0.327 | 0.306 | 0.285 | 0.271 | 0.271 |
| 100 | -0.4 | 0.6 | 0.661 | 0.551 | 0.539 | 0.539 | 0.539 |
| 100 | -0.2 | 0 | 0.071 | 0.037 | 0.023 | 0.002 | 0.002 |
| 100 | -0.2 | 0.3 | 0.130 | 0.058 | 0.052 | 0.049 | 0.049 |
| 100 | -0.2 | 0.6 | 0.262 | 0.089 | 0.088 | 0.088 | 0.088 |
| 100 | 0 | 0 | 0.053 | 0.002 | 0.001 | 0.000 | 0.000 |
| 100 | 0 | 0.3 | 0.048 | 0.001 | 0.001 | 0.001 | 0.001 |
| 100 | 0 | 0.6 | 0.049 | 0.002 | 0.001 | 0.001 | 0.001 |
| 100 | 0.2 | 0 | 0.065 | 0.034 | 0.022 | 0.003 | 0.003 |
| 100 | 0.2 | 0.3 | 0.133 | 0.061 | 0.054 | 0.050 | 0.050 |
| 100 | 0.2 | 0.6 | 0.254 | 0.084 | 0.083 | 0.083 | 0.083 |
| 100 | 0.4 | 0 | 0.118 | 0.113 | 0.088 | 0.010 | 0.010 |
| 100 | 0.4 | 0.3 | 0.317 | 0.299 | 0.281 | 0.264 | 0.264 |
| 100 | 0.4 | 0.6 | 0.668 | 0.554 | 0.542 | 0.542 | 0.542 |
| 250 | -0.4 | 0 | 0.173 | 0.173 | 0.144 | 0.012 | 0.012 |
| 250 | -0.4 | 0.3 | 0.635 | 0.634 | 0.609 | 0.609 | 0.609 |
| 250 | -0.4 | 0.6 | 0.964 | 0.962 | 0.919 | 0.919 | 0.919 |
| 250 | -0.2 | 0 | 0.084 | 0.075 | 0.051 | 0.004 | 0.004 |
| 250 | -0.2 | 0.3 | 0.252 | 0.213 | 0.197 | 0.197 | 0.197 |
| 250 | -0.2 | 0.6 | 0.562 | 0.390 | 0.387 | 0.387 | 0.387 |
| 250 | 0 | 0 | 0.051 | 0.002 | 0.001 | 0.000 | 0.000 |
| 250 | 0 | 0.3 | 0.053 | 0.002 | 0.001 | 0.001 | 0.001 |
| 250 | 0 | 0.6 | 0.051 | 0.003 | 0.001 | 0.001 | 0.001 |
| 250 | 0.2 | 0 | 0.084 | 0.075 | 0.049 | 0.004 | 0.004 |
| 250 | 0.2 | 0.3 | 0.251 | 0.217 | 0.202 | 0.202 | 0.202 |
| 250 | 0.2 | 0.6 | 0.553 | 0.388 | 0.384 | 0.384 | 0.384 |
| 250 | 0.4 | 0 | 0.169 | 0.169 | 0.142 | 0.011 | 0.011 |
| 250 | 0.4 | 0.3 | 0.641 | 0.641 | 0.612 | 0.611 | 0.611 |
| 250 | 0.4 | 0.6 | 0.963 | 0.960 | 0.920 | 0.920 | 0.920 |

Experience has shown that researchers tend to perceive the indirect moderation model as being the same as the

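$$Y_{BK} = \beta_{Y0} + \beta_{YM}\beta_{M0} + \left(\beta_{YX} + \beta_{YM}\beta_{MX}\right)X + \left(\beta_{YM}\beta_{MZ}\right)Z + \left(\beta_{YM}\beta_{MZX}\right)ZX + \left(\beta_{YM}\right)\varepsilon_{M} + \varepsilon_{Y}$$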

and the regression for predicting Y in the indirect moderation model can be written as:

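$$Y_{IndMo} = \beta_{Y0} + \beta_{YW}\beta_{W0} + \left(\beta_{YX} + \beta_{YWX}\beta_{W0}\right)X + \left(\varepsilon_{W}\beta_{YWX}\right)X + \left(\beta_{YZ} + \beta_{YW}\beta_{WZ}\right)Z + \left(\beta_{YWX}\beta_{WZ}\right)ZX + \left(\beta_{YW}\right)\varepsilon_{W} + \varepsilon_{Y}$$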

The models for predicting Y are not identical and one crucial difference is that the indirect moderation model includes a random effect of X. That is, the regression parameter of X depends on the random component ε_{W} (

To test whether the models are statistically equivalent, and thus whether they can answer different research questions, both models were fitted using AMOS 18 to data generated with β_{WX} = -0.2 and ρ_{ZW} = 0.3. If the models are equivalent, the same fit statistics would be observed for both models.

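The results of these analyses show different fit for the two models: χ^{2} = 0.966 with df = 2 (p = 0.617) for the indirect moderation model, and χ^{2} = 11.662 with df = 2 (p = 0.003) for the Baron and Kenny model.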

| Model | Statistical: χ^{2} | Statistical: df | Statistical: p | Descriptive: AGFI | Descriptive: TLI |
|---|---|---|---|---|---|
| Indirect moderation | 0.966 | 2 | 0.617 | 0.986 | 1.029 |
| Baron and Kenny model | 11.662 | 2 | 0.003 | 0.842 | 0.730 |

The SEM analysis shows that the models were statistically non-equivalent and may be used to answer different research questions. As was argued in the beginning, the indirect moderation model is not just a special case of moderated mediation as is the Baron and Kenny model.
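The reported probabilities can be checked by hand. The sketch below takes the value 2 reported next to each χ^{2} as the degrees of freedom, which is an interpretation of the fit table and not a statement from the authors. For 2 degrees of freedom the upper-tail probability of the chi-square distribution has the closed form exp(-x/2); the function name is hypothetical.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


p_indirect_moderation = chi2_upper_tail_df2(0.966)
p_baron_kenny = chi2_upper_tail_df2(11.662)
print(f"indirect moderation: p = {p_indirect_moderation:.3f}")
print(f"Baron and Kenny:     p = {p_baron_kenny:.3f}")
```

Both values round to the probabilities in the fit table, which supports reading the unlabelled columns as df and p.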

In this tutorial, the indirect moderation model was presented, which can be used to analyse processes where a variable moderates an effect through another variable. A simulation study showed that the Type I error rate in moderation analysis can be high when a variable that is merely related to the true moderator is itself included as a moderator. The solution is simple: control for spurious moderators by including control variables. When one of the two proposed variables is not a moderator, this will be found in Step 3 of the decision tree. Also, the inclusion of two suspected moderator variables has been shown to be a robust way of determining which one is the true moderator. Future work may investigate the robustness of the methodology under different conditions (e.g., with multiple control moderators simultaneously).

There was a strong decrease in power to detect a true moderator W in the presence of a wrongly included moderator. Even low to moderate correlations between variables can have detrimental effects on the reliability of the regression estimates for the moderators. This effect should be carefully considered in research applications, as low to moderate correlations often exist between many variables in the social and behavioural sciences.

It is extremely important for any researcher using moderators in regression analysis to be aware of the pitfalls of including the wrong moderator in a model. Researchers can gain much in research validity if they include covariates not only as main effects, but also as possible confounding moderators. More research is necessary to investigate the behaviour of moderator parameters in multiple moderator models, but the methodology presented here may be an important step towards answering new research questions.

Derivations for comparing the regression models. W and M are assumed to essentially be the same variable but play distinct roles in each model.

$$M = \beta_{M0} + \beta_{MZ}Z + \beta_{MX}X + \beta_{MZX}ZX + \varepsilon_{M}$$

$$Y_{BK} = \beta_{Y0} + \beta_{YX}X + \beta_{YM}M + \varepsilon_{Y}$$

Substituting for M gives

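$$\begin{aligned}
Y_{BK} &= \beta_{Y0} + \beta_{YX}X + \beta_{YM}\left[\beta_{M0} + \beta_{MZ}Z + \beta_{MX}X + \beta_{MZX}ZX + \varepsilon_{M}\right] + \varepsilon_{Y} \\
       &= \beta_{Y0} + \beta_{YX}X + \beta_{YM}\beta_{M0} + \left(\beta_{YM}\beta_{MZ}\right)Z + \left(\beta_{YM}\beta_{MX}\right)X + \left(\beta_{YM}\beta_{MZX}\right)ZX + \left(\beta_{YM}\right)\varepsilon_{M} + \varepsilon_{Y}
\end{aligned}$$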

Rewriting this expression gives

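$$Y_{BK} = \beta_{Y0} + \beta_{YM}\beta_{M0} + \left(\beta_{YX} + \beta_{YM}\beta_{MX}\right)X + \left(\beta_{YM}\beta_{MZ}\right)Z + \left(\beta_{YM}\beta_{MZX}\right)ZX + \left(\beta_{YM}\right)\varepsilon_{M} + \varepsilon_{Y}$$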

$$W = \beta_{W0} + \beta_{WZ}Z + \varepsilon_{W}$$

$$Y_{IndMo} = \beta_{Y0} + \beta_{YX}X + \beta_{YZ}Z + \beta_{YW}W + \beta_{YWX}WX + \varepsilon_{Y}$$

Substituting for W gives

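$$\begin{aligned}
Y_{IndMo} &= \beta_{Y0} + \beta_{YX}X + \beta_{YZ}Z + \beta_{YW}\left[\beta_{W0} + \beta_{WZ}Z + \varepsilon_{W}\right] + \beta_{YWX}X\left[\beta_{W0} + \beta_{WZ}Z + \varepsilon_{W}\right] + \varepsilon_{Y} \\
          &= \beta_{Y0} + \beta_{YX}X + \beta_{YZ}Z + \beta_{YW}\beta_{W0} + \left(\beta_{YW}\beta_{WZ}\right)Z + \left(\beta_{YW}\right)\varepsilon_{W} + \left(\beta_{YWX}\beta_{W0}\right)X + \left(\beta_{YWX}\beta_{WZ}\right)XZ + \varepsilon_{W}\left(\beta_{YWX}X\right) + \varepsilon_{Y}
\end{aligned}$$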

Rewriting the expression gives

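$$Y_{IndMo} = \beta_{Y0} + \beta_{YW}\beta_{W0} + \left(\beta_{YX} + \beta_{YWX}\beta_{W0}\right)X + \left(\varepsilon_{W}\beta_{YWX}\right)X + \left(\beta_{YZ} + \beta_{YW}\beta_{WZ}\right)Z + \left(\beta_{YWX}\beta_{WZ}\right)ZX + \left(\beta_{YW}\right)\varepsilon_{W} + \varepsilon_{Y}$$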

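$$Y_{BK} = \beta_{Y0} + \beta_{YM}\beta_{M0} + \left(\beta_{YX} + \beta_{YM}\beta_{MX}\right)X + \left(\beta_{YM}\beta_{MZ}\right)Z + \left(\beta_{YM}\beta_{MZX}\right)ZX + \left(\beta_{YM}\right)\varepsilon_{M} + \varepsilon_{Y}$$

$$Y_{IndMo} = \beta_{Y0} + \beta_{YW}\beta_{W0} + \left(\beta_{YX} + \beta_{YWX}\beta_{W0} + \varepsilon_{W}\beta_{YWX}\right)X + \left(\beta_{YZ} + \beta_{YW}\beta_{WZ}\right)Z + \left(\beta_{YWX}\beta_{WZ}\right)ZX + \left(\beta_{YW}\right)\varepsilon_{W} + \varepsilon_{Y}$$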
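One way to see the practical consequence of the random X coefficient is to compare the variances of the composite error terms in the two final expressions. The following is a sketch under added assumptions that are not stated in the derivation above: the error terms are taken to be independent of each other and of X and Z, and their variances are written as $\sigma^{2}_{\varepsilon_{M}}$, $\sigma^{2}_{\varepsilon_{W}}$ and $\sigma^{2}_{\varepsilon_{Y}}$ (notation introduced here).

$$\begin{aligned}
\operatorname{Var}\left(Y_{BK} \mid X, Z\right) &= \beta_{YM}^{2}\,\sigma^{2}_{\varepsilon_{M}} + \sigma^{2}_{\varepsilon_{Y}} \\
\operatorname{Var}\left(Y_{IndMo} \mid X, Z\right) &= \left(\beta_{YW} + \beta_{YWX}X\right)^{2}\sigma^{2}_{\varepsilon_{W}} + \sigma^{2}_{\varepsilon_{Y}}
\end{aligned}$$

Under these assumptions the residual variance is constant in the Baron and Kenny model but changes with X in the indirect moderation model whenever $\beta_{YWX} \neq 0$, so the two models imply different covariance structures.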

The authors have declared that they have no conflicts of interest to disclose.

The authors have no funding to report.

Data is freely available at

The supplementary material provided is the R code used in the research and can be accessed in the

The authors have no additional (i.e., non-financial) support to report.