Original Article

Controlling for Time-Varying Confounding in the Longitudinal Fixed-Effects Model: A Latent Variable Approach

Baeksan Yu1, Steven Finkel2

Methodology, 2026, Vol. 22(1), 27–51, https://doi.org/10.5964/meth.16875

Received: 2025-02-02. Accepted: 2025-12-14. Published (VoR): 2026-03-27. Corrected (CVoR): 2026-04-02.

Handling Editor: Shahab Jolani, Maastricht University, Maastricht, the Netherlands

Corresponding Author: Baeksan Yu, Department of Education, Gwangju National University of Education, 55 Pilmun-daero, Buk-gu, Gwangju 61204, Republic of Korea. E-mail: yu.baeksan@gmail.com

Supplementary Materials: Code, Data, Materials [see Index of Supplementary Materials]

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License, CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Fixed-effects regression models are commonly used in longitudinal studies as a means to estimate causal effects while controlling for unobserved time-invariant confounders. However, unobserved time-varying confounding remains potentially problematic, and identifying and measuring such confounders can be resource-intensive and costly. We propose the Time-Varying Confounding Structural Equation Model (TVC-SEM), a simple longitudinal model that builds on previous “common factor” models and which can serve as a robustness check for the assumption of no unobserved time-varying confounding in the fixed-effects approach. We posit a model with a latent autoregressive variable Zit, which represents the combined influence of both time-invariant and time-varying unobservables, and which is linked to the independent and dependent variables over time. Through Monte Carlo simulations and analyses of data from the Early Childhood Longitudinal Studies Kindergarten cohort (ECLS-K) and the Rural Substance Abuse and Violence Project (RSVP), we show that, under most conditions, TVC-SEM provides less biased estimates than several variants of the traditional fixed-effects model. Our proposed approach offers applied researchers a practical check for gauging the extent to which the fixed-effects assumption of no time-varying confounding may produce bias in the estimation of causal effects.

Keywords: time-varying confounding, structural equation model, latent variable, fixed effects, panel data

The fixed-effects (FE) model has been widely used in longitudinal social science research. The promise of the FE model is that by focusing on within-group (or person) variation over time, omitted variable bias can be substantially reduced or eliminated. Variation between groups may instead reflect the impact of omitted variables, whereas focusing on within-group variation removes those potential confounds from consideration. Yet the fixed-effects approach rests on the key identifying assumption that the unobserved confounders are time-invariant, i.e., varying only between groups and not within groups over time (Hill et al., 2020; McNeish & Hamaker, 2020).

In longitudinal observational studies, however, it may be unrealistic to argue that unmeasured confounders are time-invariant while the independent variable itself changes over time; hence, the assumption of no unobserved time-varying confounding is often untenable (Clare et al., 2019). Importantly, factors that change over time and influence both the outcome and the independent variable are often unknown or unmeasured due to costs or time constraints (Nielsen et al., 2019). For example, time-varying confounders such as peer/teacher relationships, physical activity, and mental health conditions may simultaneously influence educational outcomes and childhood obesity, biasing estimates if omitted (Förster et al., 2023). Even when some of these confounders can be measured, additional unobserved ones are likely to be omitted from datasets.

In this paper, we propose the Time-Varying Confounding Structural Equation Model (TVC-SEM), a simple and flexible approach for testing the robustness of causal effects estimated with traditional fixed-effects models by relaxing the assumption of no time-varying confounding. The method builds on the traditional latent variable approach for estimating the fixed-effects model within the structural equation modeling framework, where time-invariant unobservables are incorporated into the model as a “phantom” latent variable with no observed indicators (Allison, 2009; Bollen & Brand, 2010; Finkel, 1995; Harring et al., 2017).

Our model extends earlier work by Dormann (2001) and Dormann and Zapf (2002) and posits a latent autoregressive variable Zit which represents the combined influence of both time-invariant and time-varying unobservables, and then links Zit to the independent and dependent variables over time. Using a Monte Carlo simulation, we show that the latent TVC-SEM approach provides less biased direct estimates than several variants of the traditional fixed-effects model whenever there is more than minimal time-varying confounding in the data-generating process. Then, analyzing actual data from the Early Childhood Longitudinal Studies Kindergarten cohort (ECLS-K) and the Rural Substance Abuse and Violence Project (RSVP) in the US, we show that TVC-SEM yields more conservative causal direct effects than FE-SEMs, reflecting unmeasured time-varying confounders. TVC-SEM thus offers a practical and accessible robustness check for fixed-effects estimation in the presence of violations of the assumption of no time-varying confounders.

In the following sections, we first briefly introduce the fixed-effects model and discuss the assumption of no unobserved time-varying confounding. Then, we propose our TVC-SEM approach and simulation settings, and present and discuss the results. We call for extending the model to include dynamic, reciprocal, and non-linear effects in causal processes (Allison et al., 2017; Zyphur et al., 2020).

Alternative SEM Models of Unobserved Confounding

From Fixed-Effects Regression to Fixed-Effects Structural Equation Model (FE-SEM)

Fixed-effects (FE) regression isolates within-person changes over time to estimate how these changes relate to variations in a given outcome. By using individual-specific data transformations — such as demeaning (subtracting each individual’s own average), first-differencing (subtracting previous observations), or individual dummy variables — FE regression effectively removes any stable personal characteristics (e.g., family background or genetic traits) that might confound the estimation of causal effects through omitted variable bias (Angrist & Pischke, 2009). Importantly, these stable, potentially confounding factors may be observed or they may be unobserved — that is, not measured, not included in the dataset being analyzed, or indeed, possibly not known to the researcher. Due to its capacity to control for these time-invariant confounders, FE regression has become a foundational analytical tool in longitudinal research across the social sciences, public health, and other fields.
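The transformations described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' code: the sample size, the coefficient of 0.5 on the stable unit effect, and the helper name `slope` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_waves, beta1 = 500, 4, 0.4  # illustrative values (assumed)

# Stable unit effect U_i that drives both X and Y (time-invariant confounder).
u = rng.normal(size=(n_units, 1))
x = 0.5 * u + rng.normal(size=(n_units, n_waves))
y = beta1 * x + 0.5 * u + rng.normal(size=(n_units, n_waves))

def slope(x_arr, y_arr):
    """OLS slope of y on x (with intercept) over all unit-wave observations."""
    xc = x_arr.ravel() - x_arr.mean()
    yc = y_arr.ravel() - y_arr.mean()
    return float(xc @ yc / (xc @ xc))

# Pooled OLS ignores U_i, so the slope absorbs the confounding.
b_pooled = slope(x, y)

# Demeaning: subtract each unit's own mean across waves, which removes U_i.
b_within = slope(x - x.mean(axis=1, keepdims=True),
                 y - y.mean(axis=1, keepdims=True))

# First-differencing: subtract the previous wave, which also removes U_i.
b_fd = slope(np.diff(x, axis=1), np.diff(y, axis=1))

print(f"pooled={b_pooled:.3f} within={b_within:.3f} first-diff={b_fd:.3f}")
```

In this setup the pooled slope should sit well above the true value of 0.4, while the demeaned and first-differenced slopes should land close to it, because both transformations cancel anything that is constant within a unit.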

The core idea of FE regression can also be expressed within a Structural Equation Modeling (SEM) framework. Rindskopf (1984) initially showed that stable, unobserved factors (often referred to as “unobserved heterogeneity”) could be represented as a latent (or phantom) variable. Building on this idea, Allison (2009) and Bollen and Brand (2010) further extended the approach specifically for longitudinal data analysis, creating the Fixed-effects Structural Equation Model (FE-SEM), as illustrated in Figure 1. The figure is a path diagram with four time points, where Ui represents the unobserved time-invariant factors for each unit i (i.e., “unobserved heterogeneity” or individual-specific unit effects); Xit is an independent variable that changes over time for each unit.

Figure 1. FE-SEM Framework

The causal processes in the figure can be expressed in the following equation:

$$Y_{it} = \alpha + \beta_1 X_{it} + U_i + \varepsilon_{it}. \tag{1}$$

Specifically, Yit is a function of Xit, with the latent Ui representing the set of stable unobservables that are constant for each unit i, and εit is the idiosyncratic error term that varies across units and time. The latent (or phantom) variable Ui serves as a proxy for all time-invariant omitted confounders (Allison, 2009; Finkel, 1995) and is therefore allowed to covary with Xit, which distinguishes the fixed-effects model from the traditional random-effects approach (Bollen & Brand, 2010). In the traditional FE model, it is usually assumed that the effect of X on Y (β1) is constant over time, so that the estimate is an average effect across time points. The model additionally assumes that residuals are independent and that there is no autoregressive effect of Yit, no reverse effect of Yit-1 on Xit, and no measurement error in the variables.

FE-SEM offers key advantages over traditional regression, including model fit statistics (e.g., chi-square, RMSEA) and efficient handling of missing data using maximum likelihood methods. It also allows tests for residual autocorrelation and allows β1 to vary over time. Further, if X or Y is measured by multiple items, we can directly model measurement errors of each latent construct (Allison, 2009; Allison et al., 2017; Bollen & Brand, 2010). Finally, with multiwave data it is possible to relax the initial assumption that the effects of the time-invariant Ui in Equation (1) are equal (and set arbitrarily to 1).

Conceptually, a time-invariant characteristic such as Ui (e.g., sex or race/ethnicity) may still exert time-varying influences on outcomes. By this we mean that the slope linking Ui to Xit or Yit can change over time even though Ui itself is constant. For example, if women’s political knowledge or academic performance increases more sharply over time than men’s, the effect of sex on the level of knowledge or achievement necessarily differs between waves — even though sex itself does not change. In this sense, the slope associated with Ui can vary across time, reflecting contextual or developmental shifts in how a time-invariant factor relates to a time-varying outcome (see also Ren & Allison, 2025). The FE-SEM framework is flexible enough to accommodate such scenarios (Allison et al., 2017; Zyphur et al., 2020).
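One way to write the time-varying slope described above is sketched below. The notation is an assumption introduced here for illustration: the wave-specific loading $\lambda_t$ does not appear in the source, and fixing $\lambda_1 = 1$ is only one common way to set the scale of the latent variable.

```latex
\begin{aligned}
Y_{it} &= \alpha + \beta_1 X_{it} + \lambda_t U_i + \varepsilon_{it},
  \qquad t = 1, \dots, T, \\
\lambda_1 &= 1 \quad \text{(scale normalization, assumed)} .
\end{aligned}
```

Setting $\lambda_t = 1$ at every wave gives back the constant-effect specification, in which $U_i$ shifts the level of $Y_{it}$ by the same amount at each time point.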

Unobserved Time-Varying Confounding in Longitudinal Observational Studies

As noted, the critical assumption in FE-SEM is that the potential unobserved confounders (Ui) are stable over time. For example, estimating the effects of voluntary group memberships on social trust may be confounded by unobserved stable personality attributes such as extroversion that are related both to an individual’s propensity to join groups and to trust other people (Finkel, 2002). But the assumption that all potential unobserved confounders are time-invariant is almost certainly unrealistic.

For example, in investigating the effects of obesity status on academic achievement among school-age children, unobserved aspects of the child’s home environment may be essentially stable. But as children move from kindergarten to elementary school, their peer and teacher relationships, and their participation in physical or cultural activities will also change, and to the extent that these factors are related to both health and education outcomes (Carbonaro & Maloney, 2019; Umberson & Karas Montez, 2010), these factors may represent unobserved time-varying confounders in the causal process. In that case, the conventional fixed-effects models will yield biased estimates.

We display a model with unobserved time-invariant and time-varying confounders in Figure 2, and in Equation (2) below. To simplify exposition, we consider only two time points and assume contemporaneous effects as in the traditional fixed-effects panel model.

Figure 2 depicts a new variable representing time-varying confounding (Zit) along with the effects of the time-invariant Ui. Since the time-varying Zit confounds the relationship between X and Y at each time point, above and beyond the influence of Ui, adjusting for Zit is necessary to estimate the causal direct effect of X on Y (β1). However, in observational longitudinal studies, it is challenging to identify and control for the full set of potential confounders at each time point.1 The latent variable approach that we propose here is particularly useful for estimating the causal direct effects of a time-varying exposure in multiwave longitudinal data with few observed time-varying confounders.

Figure 2. Unobserved Time-Varying Confounding

$$Y_{it} = \alpha + \beta_1 X_{it} + \delta_t Z_{it} + U_i + \varepsilon_{it}. \tag{2}$$
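To see why an omitted time-varying confounder matters for the within estimator, the sketch below simulates data of this general form and compares a fixed-effects estimate that ignores Zit with an infeasible "oracle" estimate that controls for it. All numeric values (a constant δ of 0.6, a stability of 0.5, the loadings on X) and the helper names are assumptions chosen for illustration, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_waves = 500, 4
beta1, delta, rho = 0.4, 0.6, 0.5  # delta and rho are illustrative assumptions

u = rng.normal(size=n_units)  # stable unit effect U_i

# Z_it: stationary AR(1) confounder with unit variance at every wave.
z = np.empty((n_units, n_waves))
z[:, 0] = rng.normal(size=n_units)
for t in range(1, n_waves):
    z[:, t] = rho * z[:, t - 1] + np.sqrt(1 - rho**2) * rng.normal(size=n_units)

# Both U_i and Z_it feed into X and Y.
x = 0.5 * u[:, None] + 0.5 * z + rng.normal(size=(n_units, n_waves))
y = beta1 * x + delta * z + u[:, None] + rng.normal(size=(n_units, n_waves))

def demean(a):
    """Subtract each unit's mean across waves (the within transformation)."""
    return a - a.mean(axis=1, keepdims=True)

def within_coefs(y_arr, *regressors):
    """Least-squares coefficients after demeaning every variable by unit."""
    design = np.column_stack([demean(r).ravel() for r in regressors])
    coefs, *_ = np.linalg.lstsq(design, demean(y_arr).ravel(), rcond=None)
    return coefs

b_fe = within_coefs(y, x)[0]         # Z omitted: within estimate stays biased
b_oracle = within_coefs(y, x, z)[0]  # Z observed (not possible in practice)

print(f"FE without Z: {b_fe:.3f}   FE with Z: {b_oracle:.3f}   true: {beta1}")
```

Demeaning removes U_i but leaves the within-unit movement of Zit in both X and Y, so the estimate that omits Z should remain above 0.4 in this setup.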

The Time-Varying Confounding Structural Equation Model (TVC-SEM)

A straightforward alternative way to include the effects of a set of time-varying unobservable confounders is suggested by Dormann (2001), who builds on the early common factor models for assessing “spuriousness” in cross-lagged panel models (Finkel, 1995; Kenny, 1975). In these measurement-like models, X and Y are treated as indicators of a latent construct which accounts for their observed association at given points in time. Our TVC-SEM can be viewed as a dynamic common-factor specification in which the latent factor plays the same conceptual role that Dormann’s (2001) time-varying common factor did — namely, accounting for any “third-variable” association between X and Y — but is now allowed to evolve autoregressively across waves to summarily capture both time-varying and time-invariant unobserved confounding. In this way, we are bringing back the common-factor idea but repurposing it as a robustness test that allows for unmeasured time-varying confounding in modern causal inference parlance. We show the time-varying confounding SEM (TVC-SEM) for the four-wave case in Figure 3 and Equation (3) below.

Figure 3. An Alternative Latent Variable Approach for Controlling Unobserved Time-Varying Confounding (Error Covariances Are Assumed Between Xit)

$$Y_{it} = \alpha + \beta_1 X_{it} + \gamma_t Z_{it} + \varepsilon_{it}. \tag{3}$$

The model contains a time-varying latent Zit that has varying direct effects on Xit and Yit at each wave. However, as opposed to Figure 2 and its accompanying Equation (2), the γt effect of time-varying Zit subsumes the effects of both unobserved time-varying and time-invariant confounding; that is, the effects of Zit in Figure 3 represent the effects of both Ui and Zit in Figure 2. This is the key idea of the approach that allows us to identify the model, since models with two latent variables that represent time-invariant and time-varying latent variables respectively are not identifiable. Most importantly, the autoregressive parameters of Zit (Zit-1 → Zit) capture the stability of unobserved confounding and thus provide an estimate of the relative prevalence of stable versus time-varying confounders in the causal process. The model thus provides estimates of the effect of Xit on Yit controlling for possible unobserved time-invariant and time-varying confounding that may exist in the process.2

At first glance, the TVC-SEM resembles a traditional FE-SEM with autocorrelated (AR(1)) disturbances, where Zit proxies for both time-invariant confounding (Ui) and the AR(1) process of the disturbances in Yit. However, in contrast to the fixed-effects AR(1) model, we directly model the effect of the latent variable Zit on Xit as well, thus accounting for the confounding of the causal effect of Xit on Yit that is due to their joint association with Zit (and Ui as well). In this sense, the FE-SEM with AR(1) disturbances, like the other FE-SEM variants, falls short in controlling for unobserved time-varying confounding.3

In the model proposed by Dormann, which incorporates autoregressive effects (often referred to as “dynamic” effects) of both the independent variable (Xit) and the dependent variable (Yit), identifying the model requires the inclusion of three endogenous variables. Additionally, to ensure model identification, there must be equality constraints on the autoregressive parameters as well as stability in the variance of the latent variable (Zit) over time.

In our case, the traditional fixed-effects model does not contain dynamic components for either X or Y. This allows for a simpler version of the Dormann model, where only two endogenous variables are needed, provided there are at least three waves of data collection. Consequently, we are not required to impose homogeneity restrictions on the variances of the time-varying latent variable Zit. This adaptation follows the practices of traditional fixed-effects SEMs (Bollen & Brand, 2010).

As can be seen from the coefficients from Zit to Yit in Figure 3, however, we fix these paths to 1 for model identification (as in Bollen & Brand, 2010), which also provides a reference for gauging the time-varying effects of Zit at each wave. We also specify correlations among the Xit across waves to account for remaining associations among the predictors, as shown in Figure 1 (for the model specification see SM 1).

All of the model parameters can be freely estimated without additional constraint subject to the following assumptions regarding Zt and its relationship with Xt and Yt (for notational brevity, in what follows we suppress the individual index i):

  1. Dynamic structure. Zt follows a first-order Markov/AR(1) process across waves; we do not include higher-order lags of Zt unless explicitly stated.

  2. Contemporaneous effects only. The effects of Zt on Xt and Yt are contemporaneous, with no additional lagged effects of Zt on later values of Xt and Yt.

  3. Conditional exogeneity. Conditional on prior observed variables and Zt, the structural disturbances εtX and εtY are uncorrelated with Zt (i.e., no residual–Zt correlation).

  4. Functional form and support. Effects are linear and additive in the baseline specification, and Zt is not perfectly collinear with Xt or Yt.

As in the traditional FE model, we also assume that residuals are independent and that there is no autoregressive parameter of Yt, no reverse causal effect of Yt-1 on Xt, and no measurement error in the variables.
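Taken together, the assumptions above suggest a system of roughly the following form. This is a sketch in notation introduced here for illustration: the symbols $\rho_t$, $\lambda_t$, $\mu_t$, and $\zeta_t$ are assumptions, as is the exact form of the equation for $X_t$, which the text describes only verbally.

```latex
\begin{aligned}
Z_t &= \rho_t Z_{t-1} + \zeta_t, \qquad t = 2, \dots, T
  && \text{(first-order dynamics of the latent confounder)} \\
X_t &= \mu_t + \lambda_t Z_t + \varepsilon^{X}_t
  && \text{(contemporaneous effect of } Z_t \text{ on } X_t\text{)} \\
Y_t &= \alpha + \beta_1 X_t + \gamma_t Z_t + \varepsilon^{Y}_t
  && \text{(contemporaneous effects of } X_t \text{ and } Z_t \text{ on } Y_t\text{)}
\end{aligned}
```

Under this reading, fixing the paths from $Z_t$ to $Y_t$ at 1 for identification corresponds to setting $\gamma_t = 1$ at each wave, with the wave-specific $\lambda_t$ and the variance of $Z_t$ left free.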

Using a Monte Carlo simulation and a real data example, we test whether the TVC-SEM model with time-varying Zit provides less biased estimates of causal direct effects than the traditional FE-SEM models as well as a naïve model without any of the FE corrections for stable unobserved confounding variables. We show that TVC-SEM can replicate FE-SEM estimates under many conditions; more importantly, it provides substantially less biased estimates than FE-SEM whenever time-varying confounding is present, thus making the model an appropriate robustness test for the presence of time-varying confounding and the possible biases that would result in causal estimates from traditional fixed-effects models.

Simulation Tests

We generate our data by following Allison et al. (2017) and Leszczensky and Wolbring (2022). We generate a model with two random variables X and Y, where X has a causal direct effect on Y, with an additional unobserved time-invariant variable Ui having (initially) constant effects on both X and Y. We also generate a time-varying variable Zit which has possible time-varying effects on both X and Y. We then build our data based on the following equation:

$$Y_{it} = \beta_1 X_{it} + \beta_2 U_i + \beta_3 Z_{it} + \varepsilon_{yit}, \tag{4}$$

where β1 is the constant direct effect of X on Y, which is of primary interest; β2 is the (initially) time-invariant effect of the stable unit-specific Ui variable, and β3 is the time-varying effect of Zit, the unobserved time-varying confounder. The time-specific disturbances, εyit, are assumed to be each standard normal, independent of all exogenous variables and with no autoregressive component.

Extending Equation (4) to the full data-generating system, we manipulate three types of unobserved confounding:

  1. Time-invariant effects of Ui only.

  2. Time-varying effects of Ui.

  3. Time-varying effects of both Ui and Zit.

Figure 4 (a, b, c) illustrates the data-generating models of the three conditions (for simplicity, only two time points are illustrated). We explore how each model responds to various simulation settings as shown in Table 1. Our aim is to test which model(s) recovers the true value of β1, which is set to .4 in all conditions, in the presence of the different levels of unobserved confounding in the population. We begin with four-wave data. Note that the numbers in bold in Table 1 are values for the baseline model. Each parameter is then varied while keeping all other parameters at their baseline values (as in Allison et al., 2017). For simulating the conditions of time-varying effects of unobserved confounders (as in Figures 4b and c), we vary the influences of Ui on Xit and Yit at each wave with incremental/decremental changes from the lowest/highest values of β2, and also vary the effect of Zit on Xit and Yit, as well as the stability of Zit; the symbol Δ indicates a per-wave linear step.
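A data-generating process of the kind described for the third condition might be sketched as follows. This is not the authors' Stata code. The function name `simulate_panel`, the equation used for Xit, the unit-variance innovations of Zit, and the choice of which listed values to use as defaults are all assumptions made for illustration.

```python
import numpy as np

def simulate_panel(n_units=500, beta1=0.4, beta2=0.5, rho=0.5,
                   beta3=(0.8, 0.6, 0.4, 0.2), beta4=(0.3, 0.5, -0.1, 0.1),
                   seed=0):
    """Draw one panel with a stable U_i and an AR(1) confounder Z_it.

    beta3 / beta4 hold the wave-specific effects of Z_it on Y_it / X_it.
    Using beta2 for the effect of U_i on X as well as on Y is an assumption.
    """
    rng = np.random.default_rng(seed)
    beta3, beta4 = np.asarray(beta3), np.asarray(beta4)
    n_waves = beta3.size

    u = rng.normal(size=(n_units, 1))  # time-invariant U_i

    # Z_it = rho * Z_i,t-1 + innovation (unit-variance innovations assumed).
    z = np.empty((n_units, n_waves))
    z[:, 0] = rng.normal(size=n_units)
    for t in range(1, n_waves):
        z[:, t] = rho * z[:, t - 1] + rng.normal(size=n_units)

    # Wave-specific loadings broadcast across the columns (waves) of z.
    x = beta2 * u + beta4 * z + rng.normal(size=(n_units, n_waves))
    y = beta1 * x + beta2 * u + beta3 * z + rng.normal(size=(n_units, n_waves))
    return x, y

x, y = simulate_panel()

# Naive pooled slope that ignores both U_i and Z_it.
xc, yc = x.ravel() - x.mean(), y.ravel() - y.mean()
naive_slope = float(xc @ yc / (xc @ xc))
print(f"naive pooled slope: {naive_slope:.3f} (true beta1 = 0.4)")
```

Repeating the draw with different seeds and fitting each candidate model to every replication would give the kind of Monte Carlo comparison the section describes; only the naive slope is computed here to keep the sketch short.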

Figure 4. Data-Generating Models for Unobserved Confounding

Table 1

Components of the Simulation and Parameter Values

| Concept | Values |
|---|---|
| Sample size | 500 |
| Number of waves | 4 |
| Xit → Yit (β1) | .4 |
| Stability of Zit (Zit-1 → Zit) | ∆ -.2; ∆ .2; .2; .5; 1 |
| Time-invariant effects of Ui (β2) | 0; .2; .5; .8 |
| Time-varying effects of Zit on Yit (β3) | |
| Zi1 → Yi1 | .8; 1 |
| Zi2 → Yi2 | .6; .8 |
| Zi3 → Yi3 | .4; .6 |
| Zi4 → Yi4 | .2; .4 |
| Time-varying effects of Zit on Xit (β4) | |
| Zi1 → Xi1 | .3; .7 |
| Zi2 → Xi2 | .5; .9 |
| Zi3 → Xi3 | -.1; .3 |
| Zi4 → Xi4 | .1; .5 |
| Time-varying effects of Ui (β5) | ∆ .2/.2; ∆ .2/.3; ∆ -.2/-.3; ∆ .2/-.3 |

The performance of five estimation models is compared:

  1. A “naïve” SEM with no unobserved confounding at all, either time-invariant or time-varying (Naïve SEM).

  2. The FE-SEM with time-invariant effects of the stable Ui term (FE-SEM1).

  3. The FE-SEM with time-varying effects of Ui (FE-SEM2).

  4. The FE-SEM with an AR(1) disturbance (FE-SEM3).4

  5. The TVC-SEM model with a time-varying Zit which subsumes both unobserved time-invariant and time-varying confounders.

We generate the data using Stata 17 (n = 500, t = 4, and 500 iterations) and estimate each model in Mplus 8.0 with the maximum likelihood estimator. Our aim is to compare how accurately the different latent variable models recover the true parameters under various conditions. In line with the approaches of Leszczensky and Wolbring (2022) and Vaisey and Miles (2017), we focus primarily on the estimated coefficients. Additionally, we report relative bias and SE/SD ratios, with a ratio of one indicating accurate standard error estimation. We do not report model fit indices (e.g., CFI, RMSEA, TLI), as they do not reliably indicate true models in our simulations (Bollen & Pearl, 2013). We report model convergence rates in each figure and table.
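The two summary measures mentioned above can be computed from a set of replications as in the sketch below. The definitions used here are the conventional ones (mean estimate minus true value, divided by the true value; mean estimated standard error divided by the empirical standard deviation of the estimates) and are assumed, not confirmed, to match the authors' calculations. The function name and the toy numbers are hypothetical.

```python
import numpy as np

def summarize_replications(estimates, std_errors, true_value):
    """Summarize Monte Carlo replications of a single coefficient.

    relative_bias: (mean estimate - true value) / true value
    se_sd_ratio:   mean estimated SE / empirical SD of the estimates
    """
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    mean_estimate = estimates.mean()
    return {
        "mean_estimate": float(mean_estimate),
        "relative_bias": float((mean_estimate - true_value) / true_value),
        "se_sd_ratio": float(std_errors.mean() / estimates.std(ddof=1)),
    }

# Toy replications (made-up numbers) for a true coefficient of .4.
summary = summarize_replications(
    estimates=[0.62, 0.66, 0.64],
    std_errors=[0.02, 0.02, 0.02],
    true_value=0.4,
)
print(summary)
```

A ratio below one means the model's reported standard errors understate the actual sampling variability of the estimates across replications, and a ratio above one means they overstate it.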

We then compare the models using two data sets from the US: the Early Childhood Longitudinal Studies Kindergarten cohort (ECLS-K) and the Rural Substance Abuse and Violence Project (RSVP). The ECLS-K follows the kindergarten cohort of 2010–2011 through the 2015–2016 school year, providing a comprehensive picture of children’s academic development. We investigate the link between body mass index (BMI) and math achievement (n = 18,170). Previous studies report a significant negative association between BMI (or overweight status) and academic achievement, especially for girls, though rigorous longitudinal evidence remains lacking, due largely to the problem of unobserved confounding (Santana et al., 2017). Here we use the standardized composite BMI (calculated from composite weight and height) and math item response theory scores from Grades 1 to 4, and run each model separately for boys and girls.

The RSVP is a longitudinal study (2001–2004) of criminal victimization and offending among a panel of 7th grade students from the state of Kentucky (n = 3,968). It is often hypothesized that victimization contributes to subsequent offending by creating negative affect such as anger. Yet previous studies report mixed findings on this hypothesized positive relationship, partly because of unobserved confounding or different time specifications of key variables (Ousey et al., 2011). Here we investigate the link between victimization (i.e., experience as a victim of physical assault or robbery) and offending (i.e., self-reported delinquent behaviors, e.g., robbery, theft, or assault) across four waves.5

Results

Condition #1: Time-Invariant Confounding Only With Time-Invariant Causal Effects (Figure 4a)

Figure 5a illustrates the results from a simulation where there is no time-varying confounding from Zit at all, and where there are varied magnitudes of confounding from time-invariant Ui. The effects of Ui in this simulation are also constrained to be equal over time. This simulation corresponds exactly to the assumptions underlying the traditional fixed-effects model, and thus should be the most favorable condition for FE-SEM1 (the corresponding tables to Figure 5 are illustrated in Supplementary Material (SM) 4–5).

Figure 5. Simulation Results Under Varying Levels of Time-Invariant and Varying Effects of Ui

Note. % denotes model convergence rates. The x-axis for Figure 5b lists discrete simulation scenarios; points are connected only for visual purposes.

The zero value of Ui effects (in Figure 5a) reflects the condition of no unobserved heterogeneity that relates to both X and Y. Under such conditions, naïve SEM and the variants of the FE-SEM model, i.e., with time-invariant and time-varying effects of Ui, provide unbiased estimates, while the TVC-SEM model with Zit provides a biased estimate (.13) along with a very high rate of convergence failure (85%) and a poor SE/SD ratio (.54; see SM 4). This occurs because TVC-SEM attempts to capture Ui and Zit effects that do not exist in the simulated data. Likewise, while FE-SEM1 and FE-SEM3 do not show any convergence issues, FE-SEM2, which introduces time-varying effects of the latent Ui, also has some degree of convergence failure when the true Ui effects are null (Ui = 0, 37%) to small (Ui = .2, 11%).

As the influence of Ui increases, FE-SEM1-3 recover the exact causal effect, while naïve SEM produces upwardly biased estimates of up to 60%. Importantly, TVC-SEM also recovers true values without a serious convergence failure whenever the effects of Ui exceed .3 (to capture the degree of bias in detail, we illustrate findings with an increase of one decimal unit on Ui in Figure 5a). This is an important result: TVC-SEM with Zit can recover causal effects as well as all FE-SEMs even when all confounders are time-invariant, so long as the Ui effects are at least of relatively modest magnitude (≥ .3). Only when the true population model has effects of Ui which are null to small in magnitude and the model has no time-varying confounding at all are traditional FE-SEMs better able to recover the causal direct effect of interest.

Condition #2: Time-Invariant Confounding With Time-Varying Causal Effects (Figure 4b)

In the previous section, we assumed that Ui effects are constant over time. As noted above, however, even fixed unobserved characteristics may have differential effects on X and Y over time (as, for example, in Ren & Allison, 2025). Such a model would indicate that the stable unobservables influence both the level and the change in the outcome over time. We thus vary the influences of Ui on X and Y at each wave with incremental/decremental changes from the lowest/highest values of unobserved heterogeneity in Table 1.

As seen in Figure 5b (full results in SM 5), in the presence of time-varying effects of Ui, even FE-SEM1 and FE-SEM3 produce biased estimates, with bias of up to 23%. As expected, FE-SEM2, which allows for time-varying effects of Ui, successfully recovers the true values. The TVC-SEM with time-varying effects of Zit, however, also performs very well compared to the naïve SEM, FE-SEM1, and FE-SEM3, with only minor convergence failure (6%). In other words, when stable unobservables have varying effects on outcomes over time, TVC-SEM shows less bias than both the traditional FE-SEM1, which assumes time-invariant effects of Ui, and FE-SEM3 with an AR(1) disturbance, and it performs nearly as well as the model that conforms to the true data-generating process in the population.

Condition #3: Time-Invariant Ui and Time-Varying Zit, Each With Time-Varying Causal Effects (Figure 4c)

We model this condition by varying the effect of Zit on Xit and Yit at different waves, fixing all other parameters at their baseline values. The results are shown in Figure 6 (also in SM 6). In this scenario, all of the FE-SEM models perform extremely poorly, with biases up to 55%.6 Among the five models, TVC-SEM provides by far the best estimates in the presence of time-varying confounding effects of Zit and Ui, does so without convergence failures, and does so regardless of the amount of (non-zero) time-varying unobserved confounding which exists in the true population.7

Figure 6. Simulation Results Under Varying Levels of Time-Varying Effects of Ui and Zit

Note. The x-axis lists distinct design conditions; points are connected only for visual purposes.

We further test the five models by varying the autoregressive parameters of Zit (i.e., its stability). The results, illustrated in SM 3, show a broadly similar pattern: TVC-SEM with time-varying Zit tends to provide less biased estimates under most conditions. When the stability of Zit is very low (.2) or fluctuates across waves (∆ .2), TVC-SEM yields more biased estimates than it does otherwise, but in every case its bias is smaller than that of any of the alternative models.

It is also worth noting that TVC-SEM generally produced larger (more conservative) standard errors than FE-based alternatives, reflecting its richer parameterization (e.g., wave-specific Z→X and Z→Y paths and serial covariance terms). Accordingly, we position TVC-SEM primarily as a bias-robustness tool, while FE models may be preferable when efficiency is paramount and when their assumptions are credible, i.e., when only time-invariant confounders are relevant to the causal processes (see the conclusion for further discussion).

Empirical Examples

We now apply the five models (the naïve SEM, the three FE-SEM variants, and TVC-SEM) to the ECLS-K and RSVP data across four waves.8 Using the ECLS-K, we first investigate the link between child BMI and math achievement separately for boys and girls (from Grades 1 to 4). The model fit indices are reported only for reference.

In Table 2, naïve SEM (M1) shows a significant negative association between BMI and math scores for boys (-.06) and girls (-.05) at p < .001. Notably, in the traditional FE-SEM1 (M2) the estimated direct effects shrink to about half the size of the coefficients in the naïve model. For girls, the effects are reduced further and are no longer statistically significant in FE-SEM2 (M3), which allows for time-varying effects of the stable unobservables. FE-SEM3 with an AR(1) disturbance (M4) yields the same estimates as FE-SEM2, though model fit is slightly improved.

Table 2

Results from the ECLS-K

| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| Model | Naïve SEM | FE-SEM1 | FE-SEM2 | FE-SEM3 | TVC-SEM |
| Boys (8,080) | | | | | |
| Model fit: χ² | 108.46 (15) | 930.48 (16) | 832.30 (13) | 157.01 (15) | 55.01 (10) |
| Model fit: CFI | .99 | .98 | .98 | .99 | .99 |
| Model fit: TLI | .99 | .97 | .97 | .99 | .99 |
| Model fit: RMSEA | .03 | .08 | .09 | .03 | .02 |
| BMI → Math | -.06*** (.01) | -.03*** (.01) | -.03*** (.01) | -.03*** (.01) | .01 (.02) |
| 95% BC CI | [-.07 to -.04] | [-.05 to -.02] | [-.05 to -.02] | [-.05 to -.02] | [-.03 to .06] |
| Girls (7,740) | | | | | |
| Model fit: χ² | 125.29 (15) | 694.59 (16) | 581.49 (13) | 144.17 (15) | 37.91 (10) |
| Model fit: CFI | .99 | .98 | .98 | .99 | .99 |
| Model fit: TLI | .99 | .97 | .97 | .99 | .99 |
| Model fit: RMSEA | .03 | .07 | .08 | .03 | .02 |
| BMI → Math | -.05*** (.01) | -.02* (.01) | -.01 (.01) | -.01 (.01) | .02 (.03) |
| 95% BC CI | [-.06 to -.04] | [-.03 to .00] | [-.03 to .01] | [-.03 to .00] | [-.03 to .08] |

Note. Variables are standardized. Sample sizes are rounded per NCES guidelines. 95% BC CI are presented below each estimate. TVC-SEMs were estimated without autoregressive equality constraints on Zit.

*p < .05. **p < .01. ***p < .001.
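
Assuming that "BC" in the table denotes a bias-corrected bootstrap interval, the following Python sketch shows roughly how such an interval is formed for a single coefficient. It is not the Mplus routine used for the reported models: the toy estimator (a simple regression slope), the function names `ols_slope` and `bc_bootstrap_ci`, and the resampling settings are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def bc_bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Bias-corrected percentile bootstrap interval for the slope (illustrative)."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = ols_slope(xs, ys)

    # Resample cases with replacement and re-estimate the slope each time.
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    draws.sort()

    # Bias-correction constant: where the sample estimate sits among the draws.
    share_below = sum(d < estimate for d in draws) / n_boot
    share_below = min(max(share_below, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    normal = NormalDist()
    z0 = normal.inv_cdf(share_below)

    def draw_at(prob):
        """Sorted bootstrap draw closest to the requested cumulative probability."""
        return draws[min(n_boot - 1, max(0, round(prob * (n_boot - 1))))]

    # Shift the nominal percentile points by 2 * z0 before reading off the draws.
    lower = draw_at(normal.cdf(2 * z0 + normal.inv_cdf(alpha / 2)))
    upper = draw_at(normal.cdf(2 * z0 + normal.inv_cdf(1 - alpha / 2)))
    return estimate, lower, upper


if __name__ == "__main__":
    rng = random.Random(42)
    xs = [rng.gauss(0, 1) for _ in range(200)]      # hypothetical predictor
    ys = [0.5 * x + rng.gauss(0, 1) for x in xs]    # hypothetical outcome
    print(bc_bootstrap_ci(xs, ys))
```

When the bootstrap distribution is centered on the sample estimate, z0 is close to zero and the interval reduces to the ordinary percentile interval; the correction matters only when the two diverge.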

The drop in coefficient magnitudes is predictable because, while omitted stable confounders such as family/school SES (within a relatively short interval) are likely to be positively (+) correlated with math achievement, they tend to be negatively (-) correlated with BMI or overweight status. Since there is an open negative (-) backdoor path between BMI and math scores, M1 is likely to produce a downward bias relative to FE-SEMs.

Under the assumption that some time-varying unobserved confounding exists, M5 shows that the BMI-math relationships found in M1 and M2 drop in magnitude and lose statistical significance for both boys and girls. Moreover, a cross-model Wald test (Clogg et al., 1995) comparing FE-SEM1 (M2) and TVC-SEM (M5) for boys shows that the .04 difference in effects is statistically significant; given the larger standard errors in the TVC-SEM model, this kind of comparison is perhaps a more important indication of the possible impact of time-varying unobserved confounding than the examination of the TVC-SEM model on its own. The corresponding cross-model Wald test for girls, however, shows a statistically nonsignificant change (Δ = .04) in the causal impact of BMI on math achievement.
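
The logic of a cross-model comparison can be sketched as a generic Wald-type z test on the difference between two coefficients. The Python snippet below is a minimal illustration, not the estimator of Clogg et al. (1995) and not the computation behind the results reported here. The function name `coef_difference_test`, the covariance argument `cov_ab`, and the example numbers are hypothetical. Because both models are fit to the same sample, the two estimates are generally correlated, so the covariance would have to be obtained separately (for instance from a joint bootstrap); leaving it at zero treats the estimates as independent.

```python
from math import sqrt
from statistics import NormalDist


def coef_difference_test(b_a, se_a, b_b, se_b, cov_ab=0.0):
    """Wald-type z test for the difference between two coefficient estimates.

    cov_ab is the sampling covariance between the two estimates. It is an
    input the analyst must supply; 0.0 assumes independent estimates.
    """
    diff = b_b - b_a
    var_diff = se_a ** 2 + se_b ** 2 - 2 * cov_ab
    if var_diff <= 0:
        raise ValueError("Implied variance of the difference is not positive.")
    z = diff / sqrt(var_diff)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_two_sided


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the mechanics.
    diff, z, p = coef_difference_test(-0.30, 0.05, -0.10, 0.08, cov_ab=0.002)
    print(round(diff, 2), round(z, 2), round(p, 4))
```

A positive covariance between the two estimates shrinks the variance of their difference, which is why a difference can be significant even when the less efficient model's own confidence interval is wide.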

In Table 3, we present the year-to-year relationship between crime victimization and offending from the RSVP data. The positive association between victimization and offending decreases from .36 in the naïve model (M1) to .26 in the FE-SEMs and further to .18 in TVC-SEM, a 50% reduction in magnitude relative to the naïve estimate. In contrast to the ECLS-K results, the causal effect of interest is statistically significant in both the FE-SEM and TVC-SEM estimations, but the cross-model test comparing the TVC-SEM and FE-SEM estimates nevertheless shows a significant reduction in the effect once the possibility of time-varying unobserved confounding is allowed.
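
The size of these reductions depends on the baseline chosen. A short Python check, using only the point estimates reported for the RSVP models (the helper name `percent_reduction` is an illustrative assumption), makes the arithmetic explicit:

```python
def percent_reduction(baseline, adjusted):
    """Percentage by which `adjusted` is smaller in magnitude than `baseline`."""
    return 100.0 * (abs(baseline) - abs(adjusted)) / abs(baseline)


naive, fe, tvc = 0.36, 0.26, 0.18  # point estimates for M1, M2-M4, and M5

print(round(percent_reduction(naive, fe), 1))   # naive SEM -> FE-SEM: 27.8
print(round(percent_reduction(naive, tvc), 1))  # naive SEM -> TVC-SEM: 50.0
print(round(percent_reduction(fe, tvc), 1))     # FE-SEM -> TVC-SEM: 30.8
```

Relative to the FE-SEM estimate rather than the naïve one, the TVC-SEM coefficient is thus about 31% smaller.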

Table 3

Results from the RSVP

| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| Model | Naïve SEM | FE-SEM1 | FE-SEM2 | FE-SEM3 | TVC-SEM |
| χ2 | 260.93 (15) | 101.29 (16) | 80.52 (13) | 19.18 (15) | 12.86 (10) |
| CFI | .95 | .98 | .99 | .99 | .99 |
| TLI | .92 | .97 | .98 | .99 | .99 |
| RMSEA | .06 | .04 | .04 | .01 | .01 |
| Victimization → Offending | .36*** (.01) | .26*** (.01) | .26*** (.01) | .26*** (.01) | .18*** (.02) |
| 95% BC CI | [.33 to .38] | [.24 to .29] | [.24 to .29] | [.23 to .29] | [.13 to .22] |

Note. 95% BC CI are presented below each estimate. N = 3,965. TVC-SEMs were estimated without autoregressive equality constraints on Zit.

*p < .05. **p < .01. ***p < .001.

Discussion

The widely used fixed-effects model is based on the assumption that unobserved heterogeneity is time-invariant, which is unlikely to obtain in many observational longitudinal studies. In the present study, we incorporate the early idea of Dormann (2001) into the fixed-effects model framework. The resultant model, which we call TVC-SEM, includes a latent time-varying Zit that serves as a combination of the time-invariant and time-varying factors that may confound estimates of causal effects in longitudinal models. The model is straightforward to specify, with one additional latent variable affecting Xit and Yit at each wave and with autoregressive effects from wave to wave. It is identifiable with at least three waves of observation and, in our simulations, consistently yielded less biased estimates than any of the variants of FE-SEM in the presence of time-varying confounding. Further, it provides more conservative estimates of causal direct effects in two analyses of observational longitudinal studies.
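
Why the fixed-effects assumption matters can be illustrated with a toy data-generating process in Python. The sketch below is not the simulation design reported in this article: the parameter values, the stationary AR(1) form for Zit, and the function names are assumptions chosen purely for illustration. It generates a panel in which an unobserved autoregressive Zit affects both Xit and Yit, then compares a pooled regression slope with a within-unit (fixed-effects) slope. It does not estimate TVC-SEM itself, which requires SEM software.

```python
import random


def simulate_panel(n_units=3000, n_waves=4, beta=0.3, rho=0.5,
                   z_to_x=0.5, z_to_y=0.5, seed=1):
    """Panel of (x, y) pairs per unit, confounded by an unobserved AR(1) z."""
    rng = random.Random(seed)
    z_sd0 = (1.0 / (1.0 - rho ** 2)) ** 0.5  # stationary start; needs |rho| < 1
    panel = []
    for _ in range(n_units):
        z = rng.gauss(0.0, z_sd0)
        rows = []
        for _ in range(n_waves):
            x = z_to_x * z + rng.gauss(0.0, 1.0)
            y = beta * x + z_to_y * z + rng.gauss(0.0, 1.0)
            rows.append((x, y))
            z = rho * z + rng.gauss(0.0, 1.0)  # z drifts from wave to wave
        panel.append(rows)
    return panel


def slope(pairs):
    """Least-squares slope of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


def pooled_slope(panel):
    """Naive estimate: ignores the panel structure entirely."""
    return slope([pair for rows in panel for pair in rows])


def within_slope(panel):
    """Fixed-effects estimate: removes each unit's mean before regressing."""
    demeaned = []
    for rows in panel:
        mean_x = sum(x for x, _ in rows) / len(rows)
        mean_y = sum(y for _, y in rows) / len(rows)
        demeaned.extend((x - mean_x, y - mean_y) for x, y in rows)
    return slope(demeaned)


if __name__ == "__main__":
    panel = simulate_panel()
    print("true effect:", 0.3)
    print("pooled slope:", round(pooled_slope(panel), 3))
    print("within slope:", round(within_slope(panel), 3))
```

Under these assumed settings, both slopes should land well above the true effect of .3 in large samples (roughly .55 pooled and .48 within). Demeaning removes the stable part of Zit but not its wave-to-wave movement, which is the residual bias a model with a latent time-varying confounder is intended to address.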

Our approach has several notable strengths. First, the analytic model is relatively simple and straightforward compared to other sophisticated methods for addressing time-varying confounding (e.g., G-estimation or IPTW; Platt et al., 2009). Rather than proposing TVC-SEM as a general-purpose estimator for unmeasured time-varying confounding, though, we present it as a diagnostic tool to probe the sensitivity of FE-based estimates under more flexible assumptions. Unlike Dormann’s original study, it does not require a third variable for model identification, nor does it necessitate repeatedly measured potential confounders as required in IPTW approaches. This makes TVC-SEM particularly valuable in observational longitudinal studies where only a few time-varying confounders are observed.

Further, the flexibility of SEM allows researchers to test total effects, reverse causality, and conduct robustness or sensitivity checks against various types of potential bias. Observed time-varying covariates can also be incorporated with equality constraints, as in FE-SEM. Accounting for autocorrelated residuals, missing data, and measurement errors are additional benefits that the TVC-SEM, like other models in the structural equation tradition, can offer. Our study thus contributes to previous research that highlights the general flexibility and capability of SEM-based panel models using a latent variable approach (Allison, 2009; Allison et al., 2017; Bollen & Brand, 2010; Finkel, 1995; Zyphur et al., 2020).

One limitation of TVC-SEM is that in the absence of any unobserved confounding (neither time-invariant nor time-varying), it performed relatively poorly, with a higher rate of convergence failure. Moreover, TVC-SEM underperformed most FE models when time-invariant confounding was relatively modest and no time-varying confounding was present. These results clarify the boundary conditions of our approach: while the TVC-SEM model produces unbiased effects under many conditions of unobserved confounding, it is unequivocally superior to FE models only when some degree of time-varying confounding exists. These findings suggest that TVC-SEM is not intended as a replacement for FE-based models, but rather as a complementary approach to evaluate potential biases in FE estimates when researchers anticipate both time-invariant and time-varying unobserved confounding.

We thus recommend that researchers initially assess the potential structure of unobserved confounding within their specific research context and hypotheses, drawing on subject-matter expertise. For example, understanding the nuances of how childhood obesity impacts educational outcomes can inform the anticipation and modeling of potential unobserved confounders over time. This subject-matter insight is crucial for hypothesizing about the nature and evolution of such confounders.

In scenarios commonly characterized by both time-invariant and time-varying confounding, traditional FE-SEMs are likely to yield significantly larger biases relative to TVC-SEM. Yet, TVC-SEM is typically a less efficient estimator, similar to instrumental variables (Martens et al., 2006). Therefore, we advise researchers to focus on both the direction and magnitude of direct effects when comparing TVC-SEM with variants of FE-SEM; if our model yields significantly different effects, this would be strong evidence that time-varying unobserved factors are potentially confounding the causal effects. Conversely, if our model yields different effects from FE-SEM but with the differences being statistically insignificant, the conclusion becomes less definitive. In all cases, the direction and magnitude of coefficients and their differences across models should be evaluated and considered as part of the total evidence for the existence and magnitude of causal effects. In this way, TVC-SEM is best viewed as a complementary robustness test for FE-SEM rather than a universal alternative.

In terms of limitations, the TVC-SEM we developed here is built on the assumption of contemporaneous effects of X on Y, and on the lack of dynamic autoregressive effects between X and Y from wave to wave. Also, the TVC-SEM shares the same functional form assumptions, such as linearity or additivity, as the FE-SEM. And while TVC-SEM outperforms traditional fixed-effects models in most cases, it does not always precisely recover true effects. Further, the patterns of unobserved time-varying confounding examined in this study are not exhaustive; in observational studies, emerging patterns of time-varying confounding are complex and somewhat unpredictable due to uncontrollable factors (Platt et al., 2009). Yet researchers will be able to account for such biases by adjusting the covariance structures between Ui and Zit in our simulations.

The proposed TVC-SEM is built on the tradition of FE-models. Future studies should extend the approach to incorporate cross-lagged reciprocal dynamics and dynamic autoregressive effects of both Xit and Yit over time, and assess the performance of TVC-SEM relative to alternative panel models (e.g., Allison et al., 2017; Bollen & Brand, 2010; Zyphur et al., 2020). For instance, Kenny and McCoach (2025) introduce a Latent Time-Varying Covariate (LTVC) approach that tests whether a latent time-varying covariate can explain the observed X–Y covariation in two-variable cross-lagged multi-wave designs, with identification under stationarity constraints. The two frameworks can be seen as complementary: TVC-SEM is FE-oriented and offers an estimation-centered sensitivity analysis (recovering within-person direct effects while explicitly modeling a latent time-varying confounder), whereas LTVC-CLPM is a cross-lagged panel model-based diagnostic test. Future work should also compare TVC-SEM with causal frameworks designed for time-varying confounding under different assumptions (e.g., instrumental variables; marginal structural models using IPTW or g-computation).

Notes

1) An alternative method for dealing with time-varying confounding is an instrumental variable (IV) analysis, where an outside “instrument” serves as a proxy for an explanatory variable whose effects may be confounded by omitted variable bias. In observational studies, however, it can be challenging to identify suitable IVs that meet the assumptions of the model (Sajons, 2020). Methods like inverse-probability-of-treatment weighting (IPTW) and G-computation require consistent measurement of confounders and are thus limited in their applicability for controlling for unobserved time-varying confounding (Daniel et al., 2013).

2) Our model could be conceptualized as having both “between” effects from the unobservables (what Ui would be from Figure 2) and “within” effects from Zit, with these effects being permitted to differ. To the extent that the autoregressive effect of Zit is 1, this would imply that only the “between” effect of the stable unobservables is relevant, whereas an autoregressive effect of Zit of 0 would imply a model with only “within” effects. To the extent that the Zit are relevant but not perfectly stable, the autoregressive effect of Zit would fall between 0 and 1, and the estimated effect of Zit would be a mixture of the unobserved confounders’ within and between effects.

3) In the following parts of the paper, we compare the performance of the TVC-SEM with the AR (1) model, as well as the other variants of the FE-SEM model we have presented thus far.

4) FE-SEM3 extends the basic fixed-effects framework by adding a first-order autocorrelation term — that is, it lets any unexplained change in one wave carry over to the next. In practice, this means it picks up short-run momentum in the outcome that remains after we control for observed variables and stable personal traits. By modelling this spill-over, FE-SEM3 may yield more accurate coefficients and standard errors while still purging all time-invariant confounding.

5) The public ECLS-K data were obtained from the National Center for Education Statistics (2019), and the RSVP data were accessed from Statistical Horizons (n.d.).

6) We also observe that FE-SEM2 with time-varying effects of Ui sometimes produces more upwardly biased estimates than FE-SEM1, especially in more extreme conditions (see also bolded values in SM 2).

7) We further examined the case in which Ui has time-invariant effects and Zit has time-varying effects; SM 7 shows similar patterns.

8) We provide the Mplus syntax used for each model in SM 8. To guard against potential violations of the normality assumption in the proposed models, bootstrap methods are preferred.
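
The limiting cases described in Note 2 can be made concrete with a small calculation. Assuming, for illustration only, that Zit follows a stationary AR(1) process with constant variance (an assumption of this sketch, as is the function name `within_share`), the Python snippet below computes the share of the variance of Zit that lies within units across a given number of waves. A value of 1 for the autoregressive parameter is read here as Zit being constant over time within each unit.

```python
def within_share(rho, n_waves):
    """Share of Var(Z) that is within-unit for an AR(1) with constant variance.

    Uses Corr(Z_s, Z_t) = rho ** |s - t|; the between-unit share is the
    variance of the unit mean relative to the variance of Z.
    """
    total_corr = sum(rho ** abs(s - t)
                     for s in range(n_waves) for t in range(n_waves))
    between = total_corr / n_waves ** 2
    return 1.0 - between


for rho in (0.0, 0.5, 1.0):
    print(rho, within_share(rho, n_waves=4))
# 0.0 -> 0.75      (no stability: mostly within-unit variation)
# 0.5 -> 0.484375  (a mixture of within and between variation)
# 1.0 -> 0.0       (perfect stability: purely between-unit variation)
```

Even with no stability at all, the within share with four waves is .75 rather than 1, because the unit mean of a finite number of waves still absorbs part of the variation.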

Funding

The authors have no funding to report.

Acknowledgments

We thank the Editor, the Associate Editor, and the anonymous reviewers for their helpful comments on an earlier version of this manuscript. An earlier version of this work was recognized at the 2025 Annual Meeting of the American Educational Research Association (AERA), Division H.

Competing Interests

The authors have declared that no competing interests exist.

Publisher Note

This Corrected Version of Record (CVoR) differs from the original Version of Record (VoR) published on March 27, 2026. Due to a publisher error, the following changes needed to be made: The two publicly available datasets ECLS-K and RSVP were added to the reference section and the links were updated to point directly to these sources. Additionally, footnote 5 on page 38 was modified to incorporate these changes.

Data Availability

This study used publicly available secondary datasets. The public ECLS-K data were obtained from the National Center for Education Statistics (2019), and the RSVP data were accessed from Statistical Horizons (n.d.). Supplementary materials, including the model specification of TVC-SEM, study simulation results, and syntax for simulated data and models, are available as online supplementary materials accompanying this article (Yu & Finkel, 2026).

Supplementary Materials

| Type of supplementary materials | Availability/Access |
|---|---|
| Data | |
| ECLS-K dataset (publicly available) | National Center for Education Statistics (2019) |
| RSVP dataset (publicly available) | Statistical Horizons (n.d.) |
| Code | |
| Syntax for simulated data and models. | Yu and Finkel (2026) |
| Material | |
| Model specification of TVC-SEM. | Yu and Finkel (2026) |
| Study simulation results. | Yu and Finkel (2026) |
| Syntax for simulated data and models. | Yu and Finkel (2026) |
| Study/Analysis preregistration | |
| The study was not preregistered. | |
| Other | |
| No other materials available. | |

References

  • Allison, P. D. (2009). Fixed effects regression models. SAGE Publications.

  • Allison, P. D., Williams, R., & Moral-Benito, E. (2017). Maximum likelihood for cross-lagged panel models with fixed effects. Socius: Sociological Research for a Dynamic World, 3. https://doi.org/10.1177/2378023117710578

  • Angrist, J. D., & Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist's companion. Princeton University Press.

  • Bollen, K. A., & Brand, J. E. (2010). A general panel model with random and fixed effects: A structural equations approach. Social Forces, 89(1), 1-34. https://doi.org/10.1353/sof.2010.0072

  • Bollen, K. A., & Pearl, J. (2013). Eight myths about causality and structural equation models. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 301–328). Springer.

  • Carbonaro, W., & Maloney, E. (2019). Extracurricular activities and student outcomes in elementary and middle school: Causal effects or self-selection? Socius: Sociological Research for a Dynamic World, 5. https://doi.org/10.1177/2378023119845496

  • Clare, P. J., Dobbins, T. A., & Mattick, R. P. (2019). Causal models adjusting for time-varying confounding — A systematic review of the literature. International Journal of Epidemiology, 48(1), 254-265. https://doi.org/10.1093/ije/dyy218

  • Clogg, C. C., Petkova, E., & Haritou, A. (1995). Statistical methods for comparing regression coefficients between models. American Journal of Sociology, 100(5), 1261-1293. https://doi.org/10.1086/230638

  • Daniel, R. M., Cousens, S. N., De Stavola, B. L., Kenward, M. G., & Sterne, J. A. C. (2013). Methods for dealing with time-dependent confounding. Statistics in Medicine, 32(9), 1584-1618. https://doi.org/10.1002/sim.5686

  • Dormann, C. (2001). Modeling unmeasured third variables in longitudinal studies. Structural Equation Modeling, 8(4), 575-598. https://doi.org/10.1207/S15328007SEM0804_04

  • Dormann, C., & Zapf, D. (2002). Social stressors at work, irritation, and depressive symptoms: Accounting for unmeasured third variables in a multi‐wave study. Journal of Occupational and Organizational Psychology, 75(1), 33-58. https://doi.org/10.1348/096317902167630

  • Finkel, S. E. (1995). Causal analysis with panel data. SAGE Publications.

  • Finkel, S. E. (2002). Civic education and the mobilization of political participation in developing democracies. Journal of Politics, 64(4), 994-1020. https://doi.org/10.1111/1468-2508.00160

  • Förster, L.-J., Vogel, M., Stein, R., Hilbert, A., Breinker, J. L., Böttcher, M., Kiess, W., & Poulain, T. (2023). Mental health in children and adolescents with overweight or obesity. BMC Public Health, 23(1), Article 135. https://doi.org/10.1186/s12889-023-15032-z

  • Harring, J. R., McNeish, D. M., & Hancock, G. R. (2017). Using phantom variables in structural equation modeling to assess model sensitivity to external misspecification. Psychological Methods, 22(4), 616-631. https://doi.org/10.1037/met0000103

  • Hill, T. D., Davis, A. P., Roos, J. M., & French, M. T. (2020). Limitations of fixed-effects models for panel data. Sociological Perspectives, 63(3), 357-369. https://doi.org/10.1177/0731121419863785

  • Kenny, D. A. (1975). Cross-lagged panel correlation: A test for spuriousness. Psychological Bulletin, 82(6), 887-903. https://doi.org/10.1037/0033-2909.82.6.887

  • Kenny, D. A., & McCoach, D. B. (2025). Ruling out latent time-varying confounders in two-variable multi-wave studies. Multivariate Behavioral Research, 60(5), 973-989. https://doi.org/10.1080/00273171.2025.2503829

  • Leszczensky, L., & Wolbring, T. (2022). How to deal with reverse causality using panel data? Recommendations for researchers based on a simulation study. Sociological Methods & Research, 51(2), 837-865. https://doi.org/10.1177/0049124119882473

  • Martens, E. P., Pestman, W. R., de Boer, A., Belitser, S. V., & Klungel, O. H. (2006). Instrumental variables: Application and limitations. Epidemiology, 17(3), 260-267. https://doi.org/10.1097/01.ede.0000215160.88317.cb

  • McNeish, D., & Hamaker, E. L. (2020). A primer on two-level dynamic structural equation models for intensive longitudinal data in Mplus. Psychological Methods, 25(5), 610-635. https://doi.org/10.1037/met0000250

  • National Center for Education Statistics. (2019). Early Childhood Longitudinal Study, Kindergarten Class of 2010–11 (ECLS-K:2011), kindergarten–fifth grade public-use file (NCES 2019-050) [Data set]. U.S. Department of Education. https://nces.ed.gov/ecls/dataproducts.asp

  • Nielsen, R. O., Bertelsen, M. L., Ramskov, D., Møller, M., Hulme, A., Theisen, D., Finch, C. F., Fortington, L. V., Mansournia, M. A., & Parner, E. T. (2019). Time-to-event analysis for sports injury research part 1: Time-varying exposures. British Journal of Sports Medicine, 53(1), 61-68. https://doi.org/10.1136/bjsports-2018-099408

  • Ousey, G. C., Wilcox, P., & Fisher, B. S. (2011). Something old, something new: Revisiting competing hypotheses of the victimization-offending relationship among adolescents. Journal of Quantitative Criminology, 27(1), 53-84. https://doi.org/10.1007/s10940-010-9099-1

  • Platt, R. W., Schisterman, E. F., & Cole, S. R. (2009). Time-modified confounding. American Journal of Epidemiology, 170(6), 687-694. https://doi.org/10.1093/aje/kwp175

  • Ren, C., & Allison, P. (2025). Time-invariant variables’ time-varying effects: Misinterpretations of the fixed-effects model in ascriptive inequality research. Sociology Compass, 19(9), Article e70113. https://doi.org/10.1111/soc4.70113

  • Rindskopf, D. (1984). Using phantom and imaginary latent variables to parameterize constraints in linear structural models. Psychometrika, 49(1), 37-47. https://doi.org/10.1007/BF02294204

  • Sajons, G. B. (2020). Estimating the causal effect of measured endogenous variables: A tutorial on experimentally randomized instrumental variables. Leadership Quarterly, 31(5), Article 101348. https://doi.org/10.1016/j.leaqua.2019.101348

  • Santana, C. C. A., Hill, J. O., Azevedo, L. B., Gunnarsdottir, T., & Prado, W. L. (2017). The association between obesity and academic performance in youth: A systematic review. Obesity Reviews, 18(10), 1191-1199. https://doi.org/10.1111/obr.12582

  • Statistical Horizons. (n.d.). Ousey [Data set]. https://statisticalhorizons.com/resources/data-sets

  • Umberson, D., & Karas Montez, J. (2010). Social relationships and health: A flashpoint for health policy. Journal of Health and Social Behavior, 51(Suppl), S54-S66. https://doi.org/10.1177/0022146510383501

  • Vaisey, S., & Miles, A. (2017). What you can — and can’t — do with three-wave panel data. Sociological Methods & Research, 46(1), 44-67. https://doi.org/10.1177/0049124114547769

  • Yu, B., & Finkel, S. (2026). Supplementary materials to "Controlling for time-varying confounding in the longitudinal fixed-effects model: A latent variable approach" [Supplemental procedures and analyses]. PsychOpen GOLD. https://doi.org/10.23668/psycharchives.21788

  • Zyphur, M. J., Voelkle, M. C., Tay, L., Allison, P. D., Preacher, K. J., Zhang, Z., Hamaker, E. L., Shamsollahi, A., Pierides, D. C., Koval, P., & Diener, E. (2020). From data to causes II: Comparing approaches to panel data analysis. Organizational Research Methods, 23(4), 688-716. https://doi.org/10.1177/1094428119847280