The fixed-effects (FE) model has been widely used in longitudinal social science research. The promise of the FE model is that by focusing on within-group (or within-person) variation over time, omitted variable bias can be substantially reduced or eliminated. Variation between groups may instead reflect the impact of omitted variables, whereas focusing on within-group variation removes those potential confounds from consideration. Yet the fixed-effects approach rests on the key identifying assumption that the unobserved confounders are time-invariant, i.e., that they vary only between groups and not within groups over time (Hill et al., 2020; McNeish & Hamaker, 2020).
In longitudinal observational studies, however, it may be unrealistic to argue that unmeasured confounders are time-invariant while the independent variable itself changes over time; hence, the assumption of no unobserved time-varying confounding is often untenable (Clare et al., 2019). Importantly, factors that change over time and influence both the outcome and the independent variable are often unknown, or go unmeasured because of cost or time constraints (Nielsen et al., 2019). For example, time-varying confounders such as peer/teacher relationships, physical activity, and mental health conditions may simultaneously influence educational outcomes and childhood obesity, biasing estimates if omitted (Förster et al., 2023). Even when some of these confounders are measured, additional unobserved ones are likely to remain omitted from the dataset.
In this paper, we propose the Time-Varying Confounding Structural Equation Model (TVC-SEM), a simple and flexible approach for testing the robustness of causal effects estimated with traditional fixed-effects models by relaxing the assumption of no time-varying confounding. The method builds on the traditional latent variable approach for estimating the fixed-effects model within the structural equation modeling framework, where time-invariant unobservables are incorporated into the model as a “phantom” latent variable with no observed indicators (Allison, 2009; Bollen & Brand, 2010; Finkel, 1995; Harring et al., 2017).
Our model extends earlier work by Dormann (2001) and Dormann and Zapf (2002). It posits a latent autoregressive variable that represents the combined influence of both time-invariant and time-varying unobservables and that is linked to the independent and dependent variables at each wave. Using a Monte Carlo simulation, we show that the latent TVC-SEM approach provides less biased estimates of direct effects than several variants of the traditional fixed-effects model whenever there is more than minimal time-varying confounding in the data-generating process. Then, analyzing data from the Early Childhood Longitudinal Studies Kindergarten cohort (ECLS-K) and the Rural Substance Abuse and Violence Project (RSVP) in the US, we show that TVC-SEM yields more conservative estimates of causal direct effects than FE-SEMs, consistent with the presence of unmeasured time-varying confounders. TVC-SEM thus offers a practical and accessible robustness check for fixed-effects estimation when the assumption of no time-varying confounding may be violated.
In the following sections, we first briefly introduce the fixed-effects model and discuss the assumption of no unobserved time-varying confounding. Then, we propose our TVC-SEM approach and simulation settings, and present and discuss the results. We call for extending the model to include dynamic, reciprocal, and non-linear effects in causal processes (Allison et al., 2017; Zyphur et al., 2020).
Alternative SEM Models of Unobserved Confounding
From Fixed-Effects Regression to Fixed-Effects Structural Equation Model (FE-SEM)
Fixed-effects (FE) regression isolates within-person changes over time to estimate how these changes relate to variations in a given outcome. By using individual-specific data transformations — such as demeaning (subtracting each individual’s own average), first-differencing (subtracting previous observations), or individual dummy variables — FE regression effectively removes any stable personal characteristics (e.g., family background or genetic traits) that might confound the estimation of causal effects through omitted variable bias (Angrist & Pischke, 2009). Importantly, these stable, potentially confounding factors may be observed or they may be unobserved — that is, not measured, not included in the dataset being analyzed, or indeed, possibly not known to the researcher. Due to its capacity to control for these time-invariant confounders, FE regression has become a foundational analytical tool in longitudinal research across the social sciences, public health, and other fields.
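To make the demeaning logic concrete, the following is a minimal simulation sketch in Python (standard library only). The data-generating values (a loading of 0.8 of the stable trait on X, unit error variances, a true effect of 0.4) and all function names are illustrative assumptions, not quantities taken from any study cited here. It contrasts a pooled OLS slope with the within (demeaned) slope when a stable unobserved trait affects both X and Y.

```python
import random
from statistics import mean


def simulate_panel(n_units=500, n_waves=4, beta=0.4, seed=1):
    """Simulate Y_it = beta * X_it + U_i + e_it, where X_it also depends on U_i.

    All numeric settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        u = rng.gauss(0, 1)  # stable, unobserved trait of unit i
        xs = [0.8 * u + rng.gauss(0, 1) for _ in range(n_waves)]
        ys = [beta * x + u + rng.gauss(0, 1) for x in xs]
        panel.append((xs, ys))
    return panel


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pooled_slope(panel):
    """Naive estimate: stack all unit-waves and ignore the panel structure."""
    x = [v for xs, _ in panel for v in xs]
    y = [v for _, ys in panel for v in ys]
    return ols_slope(x, y)


def within_slope(panel):
    """FE estimate: subtract each unit's own mean before regressing."""
    x, y = [], []
    for xs, ys in panel:
        mx, my = mean(xs), mean(ys)
        x.extend(v - mx for v in xs)
        y.extend(v - my for v in ys)
    return ols_slope(x, y)


panel = simulate_panel()
naive_estimate = pooled_slope(panel)  # absorbs the effect of U_i
fe_estimate = within_slope(panel)     # U_i cancels out of the deviations
```

Under these assumed settings, the pooled slope should land well above the true value of 0.4, while the within slope should be close to it, because U_i is constant within a unit and drops out of the deviations from the unit mean.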
The core idea of FE regression can also be expressed within a Structural Equation Modeling (SEM) framework. Rindskopf (1984) initially showed that stable, unobserved factors (often referred to as “unobserved heterogeneity”) could be represented as a latent (or phantom) variable. Building on this idea, Allison (2009) and Bollen and Brand (2010) further extended the approach specifically for longitudinal data analysis, creating the Fixed-effects Structural Equation Model (FE-SEM), as illustrated in Figure 1. The figure is a path diagram with four time points, where Ui represents the unobserved time-invariant factors for each unit i (i.e., “unobserved heterogeneity” or individual-specific unit effects) and Xit is an independent variable that changes over time for each unit.
Figure 1
FE-SEM Framework
The causal processes in the figure can be expressed in the following equation:
$$Y_{it} = \beta X_{it} + U_i + \varepsilon_{it} \tag{1}$$

Specifically, Yit is a function of Xit, with the latent Ui representing the set of stable unobservables that are constant for each unit i; εit is the idiosyncratic error term, which varies across units and time. The latent (or phantom) variable Ui serves as a proxy for all time-invariant omitted confounders (Allison, 2009; Finkel, 1995) and is therefore allowed to covary with Xit, which distinguishes the fixed-effects model from the traditional random-effects approach (Bollen & Brand, 2010). In the traditional FE model, it is usually assumed that the effect of X on Y (β) is constant over time, so that the estimate is an average across time points. The model additionally assumes that residuals are independent, that there is no autoregressive parameter of Yit, that there are no reverse effects of Yit-1 on Xit, and that the variables contain no measurement error.
FE-SEM offers key advantages over traditional FE regression, including model fit statistics (e.g., chi-square, RMSEA) and efficient handling of missing data using maximum likelihood methods. It also allows tests for residual autocorrelation and allows the effect of X on Y to vary over time. Further, if X or Y is measured by multiple items, we can directly model measurement errors of each latent construct (Allison, 2009; Allison et al., 2017; Bollen & Brand, 2010). Finally, with multiwave data it is possible to relax the initial assumption that the effects of the time-invariant Ui in Equation (1) are equal (and set arbitrarily to 1).
Conceptually, a time-invariant characteristic such as Ui (e.g., sex or race/ethnicity) may still exert time-varying influences on outcomes. By this we mean that the slope linking Ui to Xit or Yit can change over time even though Ui itself is constant. For example, if women’s political knowledge or academic performance increases more sharply over time than men’s, the effect of sex on the level of knowledge or achievement necessarily differs between waves — even though sex itself does not change. In this sense, the slope associated with Ui can vary across time, reflecting contextual or developmental shifts in how a time-invariant factor relates to a time-varying outcome (see also Ren & Allison, 2025). The FE-SEM framework is flexible enough to accommodate such scenarios (Allison et al., 2017; Zyphur et al., 2020).
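One way to write the relaxation just described, offered as a sketch rather than the authors' own notation, is to attach a wave-specific coefficient to Ui. The symbols β (effect of X on Y) and γt (effect of Ui at wave t) are labels assumed here for illustration, and fixing one coefficient to 1 is a common scaling convention assumed for the sketch.

```latex
% Sketch: time-invariant U_i with wave-specific effects \gamma_t (assumed notation)
Y_{it} = \beta X_{it} + \gamma_t U_i + \varepsilon_{it}, \qquad \gamma_1 = 1 \text{ for scaling}

% Demeaning within unit i no longer removes U_i unless \gamma_t is constant:
Y_{it} - \bar{Y}_i = \beta\,(X_{it} - \bar{X}_i) + (\gamma_t - \bar{\gamma})\,U_i + (\varepsilon_{it} - \bar{\varepsilon}_i)
```

When γt is the same at every wave, the middle term vanishes and the expression reduces to the usual within transformation; otherwise a residual dependence on Ui remains, which is why a specification with wave-specific effects is needed in that case.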
Unobserved Time-Varying Confounding in Longitudinal Observational Studies
As noted, the critical assumption in FE-SEM is that the potential unobserved confounders (Ui) are stable over time. For example, estimating the effects of voluntary group memberships on social trust may be confounded by unobserved stable personality attributes such as extroversion that are related both to an individual’s propensity to join groups and to trust other people (Finkel, 2002). But the assumption that all potential unobserved confounders are time-invariant is almost certainly unrealistic.
For example, in investigating the effects of obesity status on academic achievement among school-age children, unobserved aspects of the child’s home environment may be essentially stable. But as children move from kindergarten to elementary school, their peer and teacher relationships, and their participation in physical or cultural activities will also change, and to the extent that these factors are related to both health and education outcomes (Carbonaro & Maloney, 2019; Umberson & Karas Montez, 2010), these factors may represent unobserved time-varying confounders in the causal process. In that case, the conventional fixed-effects models will yield biased estimates.
We display a model with unobserved time-invariant and time-varying confounders in Figure 2, and in Equation (2) below. To simplify exposition, we consider only two time points and assume contemporaneous effects as in the traditional fixed-effects panel model.
Figure 2 depicts a new variable representing time-varying confounding (Zit) along with the effects of the time-invariant Ui. Since the time-varying Zit confounds the relationship between X and Y at each time point, above and beyond the influences of Ui, adjustment for Zit is necessary in order to estimate the causal direct effect of X on Y. However, in observational longitudinal studies, it is challenging to identify and control for a set of all potential confounders at each time point.1 The latent variable approach that we propose here is particularly useful for estimating causal direct effects of a time-varying exposure in multiwave longitudinal data with few observed time-varying confounders.
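The consequence of an unobserved time-varying confounder for the within estimator can be illustrated with a small simulation. The sketch below (Python, standard library only) is an illustration under assumed values: the loadings of 0.8 and 0.5, the unit variances, and the function name are hypothetical choices, and Z is generated as a stationary AR(1) process whose stability is controlled by `rho`.

```python
import math
import random
from statistics import mean


def within_slope_with_tvc(rho, n_units=500, n_waves=4, beta=0.4, seed=7):
    """Within (FE) slope when an AR(1) confounder Z_it affects X_it and Y_it.

    rho is the wave-to-wave stability of Z; rho = 1 makes Z time-invariant.
    Loadings (0.8, 0.5) and unit variances are illustrative assumptions.
    """
    rng = random.Random(seed)
    shock_sd = math.sqrt(1 - rho ** 2)  # keeps Var(Z_it) = 1 at every wave
    num = den = 0.0
    for _ in range(n_units):
        u = rng.gauss(0, 1)  # time-invariant confounder
        z = rng.gauss(0, 1)  # time-varying confounder at wave 1
        xs, ys = [], []
        for t in range(n_waves):
            if t > 0:
                z = rho * z + shock_sd * rng.gauss(0, 1)
            x = 0.8 * u + 0.5 * z + rng.gauss(0, 1)
            y = beta * x + u + 0.5 * z + rng.gauss(0, 1)
            xs.append(x)
            ys.append(y)
        mx, my = mean(xs), mean(ys)  # demeaning removes U_i but not Z_it
        num += sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        den += sum((a - mx) ** 2 for a in xs)
    return num / den


# Within slope at three assumed stability levels of the confounder.
fe_by_stability = {rho: within_slope_with_tvc(rho) for rho in (0.0, 0.5, 1.0)}
```

With these assumed values, the within slope should be near the true 0.4 only when `rho = 1`, i.e., when Z is effectively time-invariant, and the upward bias should grow as Z becomes less stable across waves.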
Figure 2
Unobserved Time-Varying Confounding
2
The Time-Varying Confounding Structural Equation Model (TVC-SEM)
A straightforward alternative way to include the effects of a set of time-varying unobservable confounders is suggested by Dormann (2001), who builds on the early common factor models for assessing “spuriousness” in cross-lagged panel models (Finkel, 1995; Kenny, 1975). In these measurement-like models, X and Y are treated as indicators of a latent construct which accounts for their observed association at given points in time. Our TVC-SEM can be viewed as a dynamic common-factor specification in which the latent factor plays the same conceptual role that Dormann’s (2001) time-varying common factor did — namely, accounting for any “third-variable” association between X and Y — but is now allowed to evolve autoregressively across waves to summarily capture both time-varying and time-invariant unobserved confounding. In this way, we are bringing back the common-factor idea but repurposing it as a robustness test that allows for unmeasured time-varying confounding in modern causal inference parlance. We show the time-varying confounding SEM (TVC-SEM) for the four-wave case in Figure 3 and Equation (3) below.
Figure 3
An Alternative Latent Variable Approach for Controlling Unobserved Time-Varying Confounding (Error Covariances Are Assumed Between the Xit)
3
The model contains a time-varying latent Zit that has varying direct effects on Xit and Yit at each wave. However, as opposed to Figure 2 and its accompanying Equation (2), the effect of the time-varying Zit subsumes the effects of both unobserved time-varying and time-invariant confounding; that is, the effects of Zit in Figure 3 represent the effects of both Ui and Zit in Figure 2. This is the key idea that allows us to identify the model, since a model with two separate latent variables, one for time-invariant and one for time-varying confounding, is not identifiable. Most importantly, the autoregressive parameters of Zit (Zit-1 → Zit) capture the stability of unobserved confounding and thus provide an estimate of the relative prevalence of stable versus time-varying confounders in the causal process. The model thus provides estimates of the effect of Xit on Yit controlling for possible unobserved time-invariant and time-varying confounding that may exist in the process.2
At first glance, the TVC-SEM resembles a traditional FE-SEM model with autocorrelated (AR(1)) disturbances, where Zit proxies for both time-invariant confounding (Ui) and the AR(1) process of the disturbances in Yit. However, in contrast to the fixed-effects AR(1) model, we directly model the effect of the latent variable Zit on Xit as well, thus accounting for the confounding of the causal effect of Xit on Yit that is due to their joint association with Zit (and Ui as well). In this sense, the FE-SEM model with AR(1) disturbances, like the other FE-SEM variants, falls short in controlling for unobserved time-varying confounding.3
In the model proposed by Dormann, which incorporates autoregressive effects (often referred to as “dynamic” effects) of both the independent variable (Xit) and the dependent variable (Yit), identifying the model requires the inclusion of three endogenous variables. Additionally, to ensure model identification, there must be equality constraints on the autoregressive parameters as well as stability in the variance of the latent variable (Zit) over time.
In our case, the traditional fixed-effects model does not contain dynamic components for either X or Y. This allows for a simpler version of the Dormann model, where only two endogenous variables are needed, provided there are at least three waves of data collection. Consequently, we are not required to impose homogeneity restrictions on the variances of the time-varying latent variable. This adaptation follows the practices of traditional fixed-effects SEMs (Bollen & Brand, 2010).
As can be seen from the coefficient from Zit to Yit in Figure 3, however, we fix these paths at 1 for model identification (as in Bollen & Brand, 2010), which also provides a reference against which to gauge the time-varying effects of Zit at each wave. We also allow the Xit to correlate across waves to account for remaining associations among the predictors, as in Figure 1 (for the model specification, see SM 1).
All of the model parameters can be freely estimated without additional constraint subject to the following assumptions regarding Zt and its relationship with Xt and Yt (for notational brevity, in what follows we suppress the individual index i):
Dynamic structure. Zt follows a first-order Markov/AR (1) process across waves; we do not include higher-order lags of Zt unless explicitly stated.
Contemporaneous effects only. The effects of Zt on Xt and Yt are contemporaneous, with no additional lagged effects of Zt on later values of Xt and Yt.
Conditional exogeneity. Conditional on prior observed variables and Zt, the structural disturbances of Xt and Yt are uncorrelated with Zt (i.e., no residual–Zt correlation).
Functional form and support. Effects are linear and additive in the baseline specification, and Zt is not perfectly collinear with Xt or Yt.
As in the traditional FE model, we also assume that residuals are independent, there is no autoregressive parameter of Yt, no reverse causal effects of Yt-1 on Xt, and no measurement errors in the variables.
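Putting the description of Figure 3 and the assumptions above together, the TVC-SEM can be sketched as the following system. This is a hedged reconstruction from the text, not the authors' Equation (3): the symbols β, λt, ρt, δt, εt, and ζt are labels assumed here, and the Z→Y paths are taken to be fixed at 1 as the text describes.

```latex
% Sketch of the TVC-SEM structure (unit index i suppressed; notation assumed)
\begin{aligned}
Y_t &= \beta X_t + 1 \cdot Z_t + \varepsilon_t, & t &= 1, \dots, T \\
X_t &= \lambda_t Z_t + \delta_t, & \operatorname{Cov}(\delta_s, \delta_t) &\neq 0 \text{ allowed for } s \neq t \\
Z_t &= \rho_t Z_{t-1} + \zeta_t, & t &= 2, \dots, T
\end{aligned}
```

In this reading, values of ρt near 1 would indicate that the unobserved confounding is largely stable, whereas smaller values would point to confounding that changes from wave to wave.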
Using a Monte Carlo simulation and a real data example, we test whether the TVC-SEM model with time-varying Zt provides less biased estimates of causal direct effects than the traditional FE-SEM models as well as a naïve model without any of the FE corrections for stable unobserved confounding variables. We show that TVC-SEM can replicate FE-SEM estimates under many conditions; more importantly, it provides substantially less biased estimates than FE-SEM whenever time-varying confounding is present, thus making the model an appropriate robustness test for the presence of time-varying confounding and the possible biases that would result in causal estimates from traditional fixed-effects models.
Simulation Tests
We generate our data by following Allison et al. (2017) and Leszczensky and Wolbring (2022). We generate a model with two random variables X and Y, where X has a causal direct effect on Y, with an additional unobserved time-invariant variable having (initially) constant effects on both X and Y. We also generate a time-varying variable which has possible time-varying effects on both X and Y. We then build our data based on the following equation:
$$Y_{it} = \beta X_{it} + \gamma U_i + \delta_t Z_{it} + \varepsilon_{it} \tag{4}$$

where β is the constant direct effect of X on Y, which is of primary interest; γ is the (initially) time-invariant effect of the stable unit-specific variable Ui; and δt is the time-varying effect of Zit, the unobserved time-varying confounder. The time-specific disturbances, εit, are each assumed to be standard normal, independent of all exogenous variables, and without an autoregressive component.
Extending Equation (4) to the full data-generating system, we manipulate three types of unobserved confounding:
Time-invariant effects of Ui only.
Time-varying effects of Ui.
Time-varying effects of both Ui and Zit.
Figure 4 (a, b, c) illustrates the data-generating models of the three conditions (for simplicity, only two time points are illustrated). We explore how each model responds to the various simulation settings shown in Table 1. Our aim is to test which model(s) recover the true value of the direct effect of X on Y, which is set to .4 in all conditions, in the presence of different levels of unobserved confounding in the population. We begin with four-wave data. Note that the numbers in bold in Table 1 are values for the baseline model. Each parameter is then varied while keeping all other parameters at their baseline values (as in Allison et al., 2017). For simulating the conditions of time-varying effects of unobserved confounders (as in Figures 4b and c), we vary the influences of Ui on Xit and Yit at each wave with incremental/decremental changes from the lowest/highest values of the Ui effects, and also vary the effects of Zit on Xit and Yit, as well as the stability of Zit; the symbol Δ indicates a per-wave linear step.
Figure 4
Data-Generating Models for Unobserved Confounding
Table 1
Components of the Simulation and Parameter Values
| Concept | Values |
|---|---|
| Sample size | 500 |
| Number of waves | 4 |
| Direct effect of X on Y (Xit → Yit) | .4 |
| Stability of Zit (Zit-1 → Zit) | ∆ -.2; ∆ .2; .2; .5; 1 |
| Time-invariant effects of Ui | 0; .2; .5; .8 |
| Time-varying effects of Ui on Xit | .8; 1 |
| | .6; .8 |
| | .4; .6 |
| | .2; .4 |
| Time-varying effects of Ui on Yit | .3; .7 |
| | .5; .9 |
| | -.1; .3 |
| | .1; .5 |
| Time-varying effects of Zit | ∆ .2/.2; ∆ .2/.3; ∆ -.2/-.3; ∆ .2/-.3 |
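As a rough illustration of how a design of this kind can be generated, the sketch below draws one replication for each of the three confounding conditions. It is written in Python with the standard library rather than Stata, and every default and loading in it is a placeholder assumption; it does not reproduce the baseline values of Table 1 or the authors' code. The function and argument names (`generate_panel`, `linear_steps`, `u_on_x`, and so on) are hypothetical.

```python
import random


def linear_steps(start, step, n_waves):
    """Per-wave values that change by a constant step (the Δ notation)."""
    return [start + step * t for t in range(n_waves)]


def generate_panel(n_units=500, n_waves=4, beta=0.4,
                   u_on_x=None, u_on_y=None, z_on_x=None, z_on_y=None,
                   z_stability=0.5, seed=2024):
    """Draw one replication of Y_it = beta*X_it + (U effect) + (Z effect) + e_it.

    Each *_on_* argument is a list with one loading per wave. Defaults are
    placeholders, not the baseline values of Table 1.
    """
    rng = random.Random(seed)
    u_on_x = u_on_x or [0.5] * n_waves
    u_on_y = u_on_y or [0.5] * n_waves
    z_on_x = z_on_x or [0.0] * n_waves
    z_on_y = z_on_y or [0.0] * n_waves
    rows = []
    for i in range(n_units):
        u = rng.gauss(0, 1)  # stable unit-specific confounder
        z = rng.gauss(0, 1)  # time-varying confounder at wave 1
        for t in range(n_waves):
            if t > 0:
                # AR(1) update with a standard normal shock (assumed form).
                z = z_stability * z + rng.gauss(0, 1)
            x = u_on_x[t] * u + z_on_x[t] * z + rng.gauss(0, 1)
            y = beta * x + u_on_y[t] * u + z_on_y[t] * z + rng.gauss(0, 1)
            rows.append({"id": i, "wave": t + 1, "x": x, "y": y})
    return rows


# Condition 1: time-invariant effects of U only.
cond1 = generate_panel()
# Condition 2: effects of U that change linearly across waves.
cond2 = generate_panel(u_on_x=linear_steps(0.2, 0.2, 4),
                       u_on_y=linear_steps(0.3, 0.2, 4))
# Condition 3: time-varying effects of both U and Z.
cond3 = generate_panel(u_on_x=linear_steps(0.2, 0.2, 4),
                       u_on_y=linear_steps(0.3, 0.2, 4),
                       z_on_x=linear_steps(0.2, 0.2, 4),
                       z_on_y=linear_steps(0.2, 0.3, 4))
```

Each call returns long-format rows (one per unit-wave), which could then be reshaped to wide format for an SEM program. Repeating the call with different seeds would give the Monte Carlo replications.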
The performance of five estimation models is compared:
A “naïve” SEM with no unobserved confounding at all, either time-invariant or time-varying (Naïve SEM).
The FE-SEM with time-invariant effects of the stable Ui term (FE-SEM1).
The FE-SEM with time-varying effects of Ui (FE-SEM2).
The FE-SEM with an AR (1) disturbance (FE-SEM3).4
The TVC-SEM model with a time-varying Zit which subsumes both unobserved time-invariant and time-varying confounders.
We generate the data using Stata 17 (n = 500, t = 4, and 500 iterations) and estimate each model using Mplus 8.0 with the maximum likelihood estimator. Our aim is to compare how accurately the different latent variable models recover the true parameters under various conditions. In line with the approaches of Leszczensky and Wolbring (2022) and Vaisey and Miles (2017), we primarily focus on the estimated coefficients. Additionally, we report relative bias and SE/SD ratios, with a ratio of one indicating accurate standard error estimation. We do not report model fit indices (e.g., CFI, RMSEA, TLI), as they do not reliably indicate true models in our simulations (Bollen & Pearl, 2013). We report model convergence rates in each figure and table.
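For readers who want to compute the same summaries from their own replications, a minimal sketch follows. It assumes the conventional definitions: relative bias as the mean error divided by the true value, and the SE/SD ratio as the average model-based standard error divided by the empirical standard deviation of the estimates. The function name and the toy numbers are made up for illustration and are not results from the simulations reported here.

```python
from statistics import mean, stdev


def summarize_replications(estimates, std_errors, true_value, n_attempted):
    """Summarize Monte Carlo output for one model under one condition.

    estimates / std_errors hold results from converged replications only;
    n_attempted is the number of replications that were run.
    """
    mean_estimate = mean(estimates)
    return {
        "mean_estimate": mean_estimate,
        # Relative bias: average error as a share of the true effect.
        "relative_bias": (mean_estimate - true_value) / true_value,
        # SE/SD ratio: average model-based SE over the empirical SD of the
        # estimates; values near 1 suggest the SEs are well calibrated.
        "se_sd_ratio": mean(std_errors) / stdev(estimates),
        "convergence_rate": len(estimates) / n_attempted,
    }


# Toy input (made-up numbers, not results from the paper).
toy = summarize_replications(
    estimates=[0.60, 0.68, 0.64, 0.66, 0.62],
    std_errors=[0.03, 0.03, 0.03, 0.03, 0.03],
    true_value=0.4,
    n_attempted=5,
)
```

In this toy input the mean estimate is .64 against a true value of .4, so the relative bias is 60%, and the SE/SD ratio falls slightly below one, meaning the standard errors would be read as mildly too small.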
We then compare the models using two data sets: the Early Childhood Longitudinal Studies Kindergarten cohort (ECLS-K) and the Rural Substance Abuse and Violence Project (RSVP), both from the US. The ECLS-K follows the kindergarten cohort of 2010–2011 through the 2015–2016 school year, providing a comprehensive picture of children’s academic development. We investigate the link between body mass index (BMI) and math achievement (n = 18,170). Previous studies report a significant negative association between BMI (or overweight status) and academic achievement, especially for girls, though rigorous longitudinal evidence remains lacking, due largely to the problem of unobserved confounding (Santana et al., 2017). Here we use the standardized composite BMI, calculated from composite weight and height, and math item response theory scores from Grades 1 to 4, and run each model separately for boys and girls.
The RSVP is a longitudinal study (2001–2004) of criminal victimization and offending among a panel of 7th grade students from the state of Kentucky (n = 3,968). It is often hypothesized that victimization contributes to subsequent offending by creating negative affect such as anger. Yet previous studies report mixed findings on this hypothesized positive relationship, partly because of unobserved confounding or differing time specifications of key variables (Ousey et al., 2011). Here we investigate the link between victimization (i.e., experience as a victim of physical assault or robbery) and offending (i.e., self-reported delinquent behaviors, e.g., robbery, theft, or assault) across four waves.5
Results
Condition #1: Time-Invariant Confounding Only With Time-Invariant Causal Effects (Figure 4a)
Figure 5a illustrates the results from a simulation where there is no time-varying confounding from Zit at all, and where there are varied magnitudes of confounding from time-invariant Ui. The effects of Ui in this simulation are also constrained to be equal over time. This simulation corresponds exactly to the assumptions underlying the traditional fixed-effects model, and thus should be the most favorable condition for FE-SEM1 (the corresponding tables to Figure 5 are illustrated in Supplementary Material (SM) 4–5).
Figure 5
Simulation Results Under Varying Levels of Time-Invariant and Varying Effects of Ui
Note. % denotes model convergence rates. The x-axis for Figure 5b lists discrete simulation scenarios; points are connected only for visual purposes.
The zero value of the Ui effects (in Figure 5a) reflects the condition of no unobserved heterogeneity that relates to both X and Y. Under this condition, the naïve SEM and the variants of the FE-SEM model, i.e., with time-invariant and time-varying effects of Ui, provide unbiased estimates, while the TVC-SEM model with Zit provides a biased estimate (.13) along with a very high rate of convergence failure (85%) and a poor SE/SD ratio (.54) in SM 4. This occurs because TVC-SEM attempts to capture both (non-existent) Ui and Zit effects in the simulated data. Likewise, while FE-SEM1 and FE-SEM3 do not show any convergence issues, FE-SEM2, which introduces time-varying effects of the latent Ui, also has some degree of convergence failure when the true Ui effects are null (Ui = 0, 37%) to small (Ui = .2, 11%).
As the influence of Ui increases, FE-SEM1-3 recover the true causal effect, while the naïve SEM produces upwardly biased estimates, with bias of up to 60%. Importantly, TVC-SEM also recovers the true value without serious convergence failure whenever the effects of Ui are at least .3 (to capture the degree of bias in detail, Figure 5a reports results in increments of .1 in the Ui effect). This is an important result: TVC-SEM with Zit can recover causal effects as well as all FE-SEMs even when all confounders are time-invariant, so long as the Ui effects are at least of relatively modest magnitude (≥ .3). Only when the true population model has Ui effects that are null to small in magnitude and has no time-varying confounding at all are traditional FE-SEMs better able to recover the causal direct effect of interest.
Condition #2: Time-Invariant Confounding With Time-Varying Causal Effects (Figure 4b)
In the previous section, we assumed that the Ui effects are constant over time. As noted above, however, even fixed unobserved characteristics may have differential effects on X and Y over time (as, for example, in Ren & Allison, 2025). Such a model would indicate that the stable unobservables influence both the level of and the change in the outcome over time. We thus vary the influences of Ui on X and Y at each wave with incremental/decremental changes from the lowest/highest values of unobserved heterogeneity in Table 1.
As seen in Figure 5b (with full results in SM 5), in the presence of time-varying effects of Ui, even FE-SEM1 and FE-SEM3 produce biased estimates, with bias of up to 23%. As expected, FE-SEM2, which allows for time-varying effects of Ui, successfully recovers the true values. The TVC-SEM with time-varying effects of Zit, however, also performs very well compared to the naïve SEM, FE-SEM1, and FE-SEM3, with only minor convergence failure (6%). In other words, when stable unobservables have varying effects on outcomes over time, TVC-SEM shows less bias than both the traditional FE-SEM1, which assumes time-invariant effects of Ui, and FE-SEM3 with an AR(1) disturbance, and it performs nearly as well as the model that conforms to the true data-generating process in the population.
Condition #3: Time-Invariant Ui and Time-Varying Zit, Each With Time-Varying Causal Effects (Figure 4c)
We model this condition by varying the effects of Zit on Xit and Yit at different waves, fixing all other parameters at their baseline values. The results are shown in Figure 6 (also in SM 6). In this scenario, all of the FE-SEM models perform extremely poorly, with biases of up to 55%.6 Among the five models, TVC-SEM provides by far the best estimates in the presence of time-varying confounding effects of Ui and Zit; it does so without convergence failures and regardless of the amount of (non-zero) time-varying unobserved confounding that exists in the true population.7
Figure 6
Simulation Results Under Varying Levels of Time-Varying Effects of Ui and Zit
Note. The x-axis lists distinct design conditions; points are connected only for visual purposes.
We further test the five models by varying the autoregressive parameters of Zit (i.e., its stability). The results, reported in SM 3, show broadly similar patterns: TVC-SEM with time-varying Zit tends to provide less biased estimates under most conditions. When the stability of Zit is low (.2) or fluctuates across waves (∆ .2), however, the SEM with time-varying Zit yields more biased estimates. Even so, in every case its bias is smaller than that of any alternative model.
It is also worth noting that TVC-SEM generally produced larger (more conservative) standard errors than the FE-based alternatives, reflecting its richer parameterization (e.g., wave-specific Z→X and Z→Y paths and serial covariance terms). Accordingly, we position TVC-SEM primarily as a bias-robustness tool, while FE models may be preferable when efficiency is paramount and their assumptions are credible, i.e., when only time-invariant confounders are relevant to the causal process (see the conclusion for further discussion).
Empirical Examples
We now apply the five models (the naïve SEM, the three FE-SEM variants, and TVC-SEM) to the ECLS-K and RSVP data across four waves.8 Using the ECLS-K, we first investigate the link between child BMI and math achievement separately for boys and girls (from Grades 1 to 4). The model fit indices are reported only for reference.
In Table 2, the naïve SEM (M1) shows a significant negative association between BMI and math scores for boys (-.06) and girls (-.05) at p < .001. Notably, in the traditional FE-SEM1 (M2), the estimated direct effects shrink to about half of the coefficients in the naïve model. For girls, the effect is reduced further and is no longer statistically significant in FE-SEM2 (M3), which allows for time-varying effects of the stable unobservables. FE-SEM3 with an AR(1) disturbance (M4) yields the same estimates as FE-SEM2, though model fit is slightly improved.
Table 2
Results from the ECLS-K
| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| Model | Naïve SEM | FE-SEM1 | FE-SEM2 | FE-SEM3 | TVC-SEM |
| Boys (n = 8,080) | | | | | |
| χ² (df) | 108.46 (15) | 930.48 (16) | 832.30 (13) | 157.01 (15) | 55.01 (10) |
| CFI | .99 | .98 | .98 | .99 | .99 |
| TLI | .99 | .97 | .97 | .99 | .99 |
| RMSEA | .03 | .08 | .09 | .03 | .02 |
| BMI → Math | -.06*** (.01) | -.03*** (.01) | -.03*** (.01) | -.03*** (.01) | .01 (.02) |
| 95% BC CI | [-.07 to -.04] | [-.05 to -.02] | [-.05 to -.02] | [-.05 to -.02] | [-.03 to .06] |
| Girls (n = 7,740) | | | | | |
| χ² (df) | 125.29 (15) | 694.59 (16) | 581.49 (13) | 144.17 (15) | 37.91 (10) |
| CFI | .99 | .98 | .98 | .99 | .99 |
| TLI | .99 | .97 | .97 | .99 | .99 |
| RMSEA | .03 | .07 | .08 | .03 | .02 |
| BMI → Math | -.05*** (.01) | -.02* (.01) | -.01 (.01) | -.01 (.01) | .02 (.03) |
| 95% BC CI | [-.06 to -.04] | [-.03 to .00] | [-.03 to .01] | [-.03 to .00] | [-.03 to .08] |
Note. Variables are standardized. Sample sizes are rounded per NCES guidelines. 95% BC CI are presented below each estimate. TVC-SEMs were estimated without autoregressive equality constraints on Zit.
*p < .05. **p < .01. ***p < .001.
The drop in coefficient magnitudes is predictable because, while omitted stable confounders such as family/school SES (within a relatively short interval) are likely to be positively (+) correlated with math achievement, they tend to be negatively (-) correlated with BMI or overweight status. Since there is an open negative (-) backdoor path between BMI and math scores, M1 is likely to produce a downward bias relative to FE-SEMs.
Under the assumption that some time-varying unobserved confounding exists, M5 shows that the BMI-math relationships found in M1 and M2 shrink and lose statistical significance for both boys and girls. Moreover, a cross-model Wald test (Clogg et al., 1995) comparing FE-SEM1 (M2) and TVC-SEM (M5) for boys shows that the .04 difference in effects is statistically significant; given the larger standard errors in the TVC-SEM model, this kind of comparison is perhaps a more important indication of the possible impact of time-varying unobserved confounding than the examination of the TVC-SEM model on its own. Yet the corresponding cross-model Wald test for girls shows a statistically insignificant change (Δ = .04) in the causal impact of BMI on math achievement.
In Table 3, we present the year-to-year relationship between crime victimization and offending from the RSVP data. The positive association between victimization and offending decreases from .36 in naïve M1 to .26 in FE-SEMs and further to .18 in TVC-SEM — a 50% reduction in magnitude. In contrast to the ECLS-K results, the causal effect of interest is statistically significant in both the FE-SEM and TVC-SEM model estimations, but the cross-model test comparing the TVC-SEM with FE-SEM estimates nevertheless shows a significant reduction in the effect, once the possibility of time-varying unobserved confounding is allowed.
Table 3
Results from the RSVP
| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| Model | Naïve SEM | FE-SEM1 | FE-SEM2 | FE-SEM3 | TVC-SEM |
| χ² (df) | 260.93 (15) | 101.29 (16) | 80.52 (13) | 19.18 (15) | 12.86 (10) |
| CFI | .95 | .98 | .99 | .99 | .99 |
| TLI | .92 | .97 | .98 | .99 | .99 |
| RMSEA | .06 | .04 | .04 | .01 | .01 |
| Victimization → Offending | .36*** (.01) | .26*** (.01) | .26*** (.01) | .26*** (.01) | .18*** (.02) |
| 95% BC CI | [.33 to .38] | [.24 to .29] | [.24 to .29] | [.23 to .29] | [.13 to .22] |
Note. 95% BC CIs are presented below each estimate. N = 3,965. TVC-SEMs were estimated without equality constraints on the autoregressive effects of the latent time-varying confounder.
*p < .05. **p < .01. ***p < .001.
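To make the logic of comparing coefficients across models concrete, the following minimal Python sketch computes a Wald-type z statistic for the difference between two estimates, together with the proportional reduction in magnitude. It is an illustration only and not the procedure used in the analyses above: the function names are hypothetical, and the covariance between the two estimates (`cov12`) is not reported in Table 3, so the example sets it to zero. Because both models are fit to the same sample, that independence assumption is unlikely to hold, and the resulting statistic should not be read as a reproduction of the Clogg et al. (1995) test.

```python
import math


def wald_z_difference(b1, se1, b2, se2, cov12=0.0):
    """Wald-type z test for the difference b1 - b2.

    cov12 is the covariance between the two estimates. It defaults to 0,
    which is an ASSUMPTION made here for illustration; estimates from two
    models fit to the same sample are generally correlated.
    """
    variance = se1 ** 2 + se2 ** 2 - 2.0 * cov12
    if variance <= 0:
        raise ValueError("Implied variance of the difference must be positive.")
    diff = b1 - b2
    z = diff / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, z, p


def percent_reduction(b_from, b_to):
    """Percentage reduction in magnitude when moving from b_from to b_to."""
    return 100.0 * (b_from - b_to) / b_from


# Values taken from Table 3 (FE-SEM1 vs. TVC-SEM; naive M1 vs. TVC-SEM).
diff, z, p = wald_z_difference(0.26, 0.01, 0.18, 0.02)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p:.4f}")
print(f"reduction from M1 to M5 = {percent_reduction(0.36, 0.18):.0f}%")
```

If the covariance between the two estimates is available (for example, from a bootstrap in which both models are refit to each resample), it can be passed as `cov12`.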
Discussion
The widely used fixed-effects model is based on the assumption that unobserved heterogeneity is time-invariant, which is unlikely to hold in many observational longitudinal studies. In the present study, we incorporate the early idea of Dormann (2001) into the fixed-effects model framework. The resultant model, which we call TVC-SEM, includes a latent time-varying variable that captures the combination of time-invariant and time-varying factors that may confound estimates of causal effects in longitudinal models. The model is straightforward to specify, with one additional latent variable affecting Xit and Yit at each wave and with autoregressive effects from wave to wave. It is identifiable with at least three waves of observation and, in our simulations, consistently yielded less biased estimates than any of the variants of FE-SEM in the presence of time-varying confounding. Further, it provides more conservative estimates of causal direct effects in two analyses of observational longitudinal studies.
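The motivating problem can be illustrated with a small simulation. The sketch below, written in plain Python, generates three-wave panel data with a time-invariant confounder and an autoregressive time-varying confounder, and then applies the classic within (person-demeaned) estimator as a stand-in for the fixed-effects approach. It does not implement TVC-SEM or FE-SEM, and all parameter values, variable names, and the data-generating process are assumptions chosen for illustration rather than the design of our simulations.

```python
import random


def simulate_panel(n, waves, beta, gamma, delta, rho, seed=0):
    """Simulate X and Y for n persons over several waves.

    alpha: time-invariant confounder affecting both X and Y.
    u: time-varying confounder following a stationary AR(1) process.
    gamma, delta: (hypothetical) effects of u on X and Y, respectively.
    beta: true contemporaneous effect of X on Y.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        alpha = rng.gauss(0, 1)
        u = rng.gauss(0, 1)
        x_i, y_i = [], []
        for t in range(waves):
            if t > 0:
                # AR(1) update that keeps Var(u) = 1 at every wave.
                u = rho * u + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
            x = alpha + gamma * u + rng.gauss(0, 1)
            y = beta * x + alpha + delta * u + rng.gauss(0, 1)
            x_i.append(x)
            y_i.append(y)
        xs.append(x_i)
        ys.append(y_i)
    return xs, ys


def within_estimate(xs, ys):
    """Within (fixed-effects) slope: regress person-demeaned Y on person-demeaned X."""
    num = den = 0.0
    for x_i, y_i in zip(xs, ys):
        x_bar = sum(x_i) / len(x_i)
        y_bar = sum(y_i) / len(y_i)
        for x, y in zip(x_i, y_i):
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


TRUE_BETA = 0.2
# Only time-invariant confounding: u has no effect on X or Y.
fe_clean = within_estimate(*simulate_panel(5000, 3, TRUE_BETA, 0.0, 0.0, 0.5))
# Time-invariant plus time-varying confounding.
fe_tvc = within_estimate(*simulate_panel(5000, 3, TRUE_BETA, 0.5, 0.5, 0.5))
print(f"true effect: {TRUE_BETA}, FE (no TVC): {fe_clean:.3f}, FE (with TVC): {fe_tvc:.3f}")
```

Under these assumed values, demeaning removes the time-invariant confounder but not the within-person variation in the time-varying confounder, so the within estimate should track the true effect only in the first scenario.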
Our approach has several notable strengths. First, the analytic model is relatively simple and straightforward compared to other sophisticated methods for addressing time-varying confounding (e.g., G-estimation or IPTW; Platt et al., 2009). Rather than proposing TVC-SEM as a general-purpose estimator for unmeasured time-varying confounding, though, we present it as a diagnostic tool to probe the sensitivity of FE-based estimates under more flexible assumptions. Unlike Dormann’s original study, it does not require a third variable for model identification, nor does it necessitate repeatedly measured potential confounders as required in IPTW approaches. This makes TVC-SEM particularly valuable in observational longitudinal studies where only a few time-varying confounders are observed.
Further, the flexibility of SEM allows researchers to test total effects and reverse causality and to conduct robustness or sensitivity checks against various types of potential bias. Observed time-varying covariates can also be incorporated with equality constraints, as in FE-SEM. The ability to account for autocorrelated residuals, missing data, and measurement error is an additional benefit that TVC-SEM, like other models in the structural equation tradition, can offer. Our study thus contributes to previous research that highlights the general flexibility and capability of SEM-based panel models using a latent variable approach (Allison, 2009; Allison et al., 2017; Bollen & Brand, 2010; Finkel, 1995; Zyphur et al., 2020).
One limitation of TVC-SEM is that in the absence of any unobserved confounding (that is, neither time-invariant nor time-varying), it performed relatively poorly, with a higher rate of convergence failure. Moreover, TVC-SEM underperformed most FE models when time-invariant confounding was relatively modest and no time-varying confounding was present. These results clarify the boundary conditions of our approach: while the TVC-SEM model produces unbiased estimates under many conditions of unobserved confounding, it is unequivocally superior to FE models only when some degree of time-varying confounding exists. These findings suggest that TVC-SEM is not intended as a replacement for FE-based models, but rather as a complementary approach for evaluating potential biases in FE estimates when researchers anticipate both time-invariant and time-varying unobserved confounding.
We thus recommend that researchers initially assess the potential structure of unobserved confounding within their specific research context and hypotheses, drawing on subject-matter expertise. For example, understanding the nuances of how childhood obesity impacts educational outcomes can inform the anticipation and modeling of potential unobserved confounders over time. This subject-matter insight is crucial for hypothesizing about the nature and evolution of such confounders.
In scenarios characterized by both time-invariant and time-varying confounding, traditional FE-SEMs are likely to yield substantially larger biases than TVC-SEM. Yet TVC-SEM is typically a less efficient estimator, similar to instrumental variables (Martens et al., 2006). Therefore, we advise researchers to focus on both the direction and magnitude of direct effects when comparing TVC-SEM with variants of FE-SEM; if our model yields significantly different effects, this would be strong evidence that time-varying unobserved factors are potentially confounding the causal effects. Conversely, if our model yields effects that differ from those of FE-SEM but the differences are not statistically significant, the conclusion becomes less definitive. In all cases, the direction and magnitude of coefficients and their differences across models should be evaluated and considered as part of the total evidence for the existence and magnitude of causal effects. In this way, TVC-SEM is best viewed as a complementary robustness test for FE-SEM rather than a universal alternative.
In terms of limitations, the TVC-SEM we developed here is built on the assumption of contemporaneous effects of X on Y and on the absence of dynamic autoregressive effects between X and Y from wave to wave. In addition, the TVC-SEM shares the same functional form assumptions, such as linearity and additivity, as the FE-SEM. And while TVC-SEM outperforms traditional fixed-effects models in most cases, it does not always precisely recover true effects. Further, the patterns of unobserved time-varying confounding examined in this study are not exhaustive; in observational studies, emerging patterns of time-varying confounding are complex and somewhat unpredictable due to uncontrollable factors (Platt et al., 2009). Yet researchers will be able to account for such biases by adjusting the covariance structures between and in our simulations.
The proposed TVC-SEM is built on the tradition of FE-models. Future studies should extend the approach to incorporate cross-lagged reciprocal dynamics and dynamic autoregressive effects of both Xit and Yit over time, and assess the performance of TVC-SEM relative to alternative panel models (e.g., Allison et al., 2017; Bollen & Brand, 2010; Zyphur et al., 2020). For instance, Kenny and McCoach (2025) introduce a Latent Time-Varying Covariate (LTVC) approach that tests whether a latent time-varying covariate can explain the observed X–Y covariation in two-variable cross-lagged multi-wave designs, with identification under stationarity constraints. The two frameworks can be seen as complementary: TVC-SEM is FE-oriented and offers an estimation-centered sensitivity analysis (recovering within-person direct effects while explicitly modeling a latent time-varying confounder), whereas LTVC-CLPM is a cross-lagged panel model-based diagnostic test. Future work should also compare TVC-SEM with causal frameworks designed for time-varying confounding under different assumptions (e.g., instrumental variables; marginal structural models using IPTW or g-computation).