Tutorial

Bridging the Gap: Introducing Joint Models for Longitudinal and Time-to-Event Data in the Social Sciences

Sophie Potts¹,², Anja Rappl³, Karin Kurz²,⁴, Elisabeth Bergherr¹,²

[1] Chair of Spatial Data Science and Statistical Learning, University of Göttingen, Göttingen, Germany. [2] Campus Institute Data Science (CIDAS), University of Göttingen, Göttingen, Germany. [3] Chair of Biometry and Epidemiology, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany. [4] Chair of Sociology / Social Stratification, University of Göttingen, Göttingen, Germany.

Methodology, 2026, Vol. 22(1), 77–108, https://doi.org/10.5964/meth.18465

Received: 2025-06-23. Accepted: 2026-01-13. Published (VoR): 2026-03-27.

Handling Editor: Levente Littvay, HUN-REN Centre for Social Sciences, Budapest, Hungary

Corresponding Author: Sophie Potts, E-mail: sophie.potts@uni-goettingen.de

Supplementary Materials: Code, Data, Preregistration [see Index of Supplementary Materials]

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License, CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

In time-to-event analyses in social sciences, there often exist endogenous time-varying variables, where the event status is correlated with the trajectory of the covariate itself. Ignoring this endogeneity will result in biased estimates. In the field of biostatistics this issue is tackled by estimating a joint model for longitudinal and time-to-event data as it handles endogenous covariates properly. This method is underused in the social sciences even though it is very useful to model longitudinal and time-to-event processes appropriately. Therefore, this paper provides a gentle introduction to the method of joint models and highlights its advantages for social science research questions. We demonstrate its usage on an example on marital satisfaction and marriage dissolution and compare the results with classical approaches such as a time-to-event model with a time-varying covariate. In addition to demonstrating the method, our results contribute to the understanding of the relationship between marriage satisfaction, marriage dissolution and other covariates.

Keywords: joint models, longitudinal data, time-to-event data, marriage dissolution, relationship satisfaction

Research questions about the risk of experiencing an event, together with suitable data sets, are frequently found in social science research, e.g., in family formation (Kingsley, 2018; Kurz et al., 2006), educational attainment (Ameri et al., 2016), recidivism (Skardhamar & Telle, 2012) or reemployment (Hägglund & Bächmann, 2017). They are commonly addressed with hazard models from the field of time-to-event analysis. These research questions, including the examples above, often involve time-varying covariates (TVCs), which make it possible to model the impact of covariates that change over time. The classical approach to including TVCs in time-to-event models assumes that the value does not change between observations and hence carries the last observation forward until the next observation. We call this the last value carried forward (LVCF) assumption. It makes it possible to obtain individual-specific predictions for each time point during the individual observation period, yet it relies on the questionable assumption that the value does not change between observation times. We refer to this modelling strategy as the TVC approach. Furthermore, this strategy only yields appropriate estimation results when the TVC is exogenous, i.e., independent of the time-to-event outcome (event happened vs. censored). The exogeneity assumption does not hold in cases of anticipatory effects of the event or when the trajectory of the TVC is highly correlated with the outcome (more on the definition of exogeneity and endogeneity in the section Types of Time-Varying Covariates). Both assumptions, LVCF and exogeneity, are particularly questionable for the frequently changing and self-reported covariates that are common in the social sciences when individual trajectories of TVCs and person-related events are of interest.

In these cases, a joint model for longitudinal and time-to-event data (Wulfsohn & Tsiatis, 1997) is an appropriate estimation routine. It combines a longitudinal estimation procedure for the TVC with a classical time-to-event model. Including the estimates of the longitudinal model in the time-to-event model links the two into a joint model. The regression coefficients of the two linked submodels are estimated simultaneously. The joint model therefore makes it possible to model the relationship between an endogenous covariate and the risk of an event appropriately and represents a useful tool for investigating complex social research questions. Shifting the focus to the longitudinal part, a joint model can be used to model missing-not-at-random dropout from a longitudinal study (see also the subsection A Different Perspective on Joint Models). Furthermore, the link between the two models can be adapted to more complicated relationships (more on this in the Summary & Conclusion section).

Joint models are standard tools in biostatistics, e.g., to investigate the relationship between the trajectory of a biomarker in blood cells and all-cause mortality (Núñez et al., 2014) or recurrence of cancer (Ferrer et al., 2016), but do not yet belong to the standard toolkit of social science researchers (Cremers et al., 2021). In order to increase their usage in social science applications, a low-threshold introduction to the method is needed. Therefore, this tutorial paper aims to guide the reader through a social science application of a joint model. As an illustrative application, we model marital satisfaction and the risk of marriage dissolution using a German panel data base (Huinink et al., 2011).

Approaches to modelling two processes simultaneously, as well as to decomposing effects, already exist in the social science literature, e.g., simultaneous equation models (Lillard, 1993), sometimes referred to as multi-process models (Mikolai & Kulu, 2018), or structural equation models (Rosen-Grandon et al., 2004). These target related yet different problems (latent variables and model accuracy for structural equation models; common unobserved heterogeneity and reciprocal causality for simultaneous equation models and multi-process models). The trajectory of a longitudinal variable together with its influence on a time-to-event outcome is best modelled by a joint model for longitudinal and time-to-event data.

The following section reviews the definition of endogeneity and exogeneity in order to lay the foundation for identifying endogenous covariates properly. The Joint Models for Longitudinal and Time-to-Event Data section introduces the method of joint models for longitudinal and time-to-event data and its properties using the example of marital satisfaction and the risk of marriage dissolution. After a short review of the literature on marital satisfaction and dissolution, the Application: Marriage Satisfaction and Time to Marriage Dissolution section describes the data base and the model specification for the application and discusses the estimation results. In the Comparison to Other Modelling Approaches subsection, the results are compared to other modelling strategies (classical TVC approach, two-stage model). The tutorial concludes with a summary and a discussion of possible extensions as well as problems of the application. The example is executed using the software R and can be reproduced using the web supplementary material of this tutorial (see Potts et al., 2026).¹

Types of Time-Varying Covariates

Whether a classical TVC approach models the data appropriately depends crucially on the type of the time-varying covariate. Using the TVC approach in time-to-event models does not pose problems when the modelled covariate is exogenous. In contrast, endogenous variables are problematic in classical hazard models, as they result in biased estimates, and one needs to exercise caution when making causal statements (Box-Steffensmeier & Jones, 2004). Thus, researchers should consider using a joint model when exogeneity is questionable. We therefore briefly review the definition of exogeneity and endogeneity in time-to-event models before describing the method.

Kalbfleisch and Prentice (2002) divide exogenous time-varying variables into two subcategories: defined and ancillary TVCs. Defined ones have a predetermined path in advance for all subjects of the study, e.g., historical period, cohort or age of the individual. In contrast, an ancillary TVC is the result of a stochastic process, which is external to the observation unit such as population level characteristics, e.g., unemployment rate in an economy (Yamaguchi, 1991). However, such exogenous covariates may be rather rare in micro-level analyses in social science research, since many time-varying covariates describe individual-specific changes.

Endogenous time-varying variables are also categorized into two different sub-types (Kalbfleisch & Prentice, 2002): state dependent and rate dependent TVCs. The former comprises variables whose path is not independent of the state of the outcome variable. Consequently, they result in different paths of the TVC depending on the respective time-to-event outcome (e.g., marital satisfaction trajectories of still married vs. separated). “In other words, the value of the time-dependent covariate carries information about the state of the dependent process” (Blossfeld & Rohwer, 2001, p. 132).

Rate dependent covariates are directly correlated with the hazard rate of an event such that not only the trajectory of the TVC correlates with the outcome but the estimated risk of having an event influences the trajectory as well. This can be for example due to anticipation of the event, e.g., the effect of the anticipation of divorce on the working behaviour of women (Poortman, 2005). Other examples of possible endogenous covariates in social sciences include the number of failed/passed credits regarding the event of student drop-out or the amount of working hours and the timing of births.

Joint Models for Longitudinal and Time-to-Event Data

In this section, we start with an illustrative example and point out the shortcomings of classical time-to-event models when dealing with endogenous time-varying covariates. This is followed by an explanation of the method of joint models for longitudinal and time-to-event data and their advantages.

An Illustrative Example: Time-to-Event Model and Joint Model in Comparison

In the example we are interested in, namely the risk of marriage dissolution (event) and how covariates influence this risk, we may include covariates that change over time, such as subjective marital satisfaction (TVC). Research interest lies in the influence of the trajectory of marital satisfaction on the risk of marriage dissolution. Assume marital satisfaction has been captured multiple times over the years and information on the start and possible end of the marriage is available. Figure 1 serves as an illustration for the fictional example using one single individual: the upper trajectory depicts his/her marital satisfaction measure, where the points correspond to the measurements in time. The measurement points are connected via the smooth trajectory function (solid line). The lower part represents the estimated hazard rate for marriage dissolution over time. In this example, higher values of satisfaction (upper panel) go hand in hand with smaller estimated hazard rates, i.e., a lower risk of ending the relationship (lower panel).

Figure 1

Scheme of Two Related Processes and Possible Modelling Strategies

Note. Upper panel: Measurements of the time-varying covariate (longitudinal process). Lower panel: Risk of having an event (time-to-event process). In a joint model the influence of the trajectory in the upper panel on the risk is estimated by an association parameter α and both processes can be modelled as functions of (shared) covariates.

In order to include the time-varying covariate marital satisfaction in a time-to-event model, one could use the classical TVC approach. This approach raises several problems, since it assumes that the respective variable (1) does not change between the observation times and (2) is exogenous. These two assumptions are particularly questionable for volatile and self-reported variables. The first assumption would result in a step function between the measurements (dashed line in the upper panel of Figure 1) instead of the smooth path. Especially for infrequently measured variables with long periods between two measurements, this approach may model the trajectory inappropriately.
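The step function implied by the LVCF assumption can be sketched in a few lines of base R. The measurement times and satisfaction values below are made up for illustration, and the linear interpolation is only a simple stand-in for a path that changes between measurements; it is not the smooth trajectory a joint model would estimate.

```r
# Hypothetical measurements for one person (illustrative values only)
obs_time     <- c(0, 12, 24, 36)   # months since the start of the marriage
satisfaction <- c(9, 8, 5, 3)      # answers on a 0-10 scale

# LVCF: hold each value constant until the next measurement (f = 0 uses the left value)
lvcf <- approxfun(obs_time, satisfaction, method = "constant", f = 0, rule = 2)

# Simple alternative for comparison: piecewise-linear path between measurements
linear <- approxfun(obs_time, satisfaction, method = "linear", rule = 2)

# Evaluate both paths halfway between the measurements
query         <- c(6, 18, 30)
lvcf_values   <- lvcf(query)
linear_values <- linear(query)
print(data.frame(month = query, lvcf = lvcf_values, linear = linear_values))
```

The longer the gap between two measurements and the steeper the change, the further the two paths drift apart, which is the concern raised for infrequently measured variables.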

Additionally, marital satisfaction neither evolves from a stochastic process that is external to the individual under study (ancillary TVC) nor can be calculated as a defined covariate. It is therefore an endogenous TVC and violates the second assumption. Figure 2 may serve as an indicator for state dependence of marital satisfaction: it shows the smoothed average trajectory of marital satisfaction grouped by marital status, i.e., (a) persons still in the relationship and (b) persons who ended the relationship with their spouse. The fact that the trajectories (for both men and women) differ significantly between the marital status groups is a strong indicator for a state-dependent TVC, and thus a modelling technique that can appropriately address this endogeneity has to be applied. Arguably, rate dependence may also apply in this example, as Clark et al. (2008) found a strong anticipatory effect of the life event divorce on life satisfaction for men and women. It seems plausible that a similar effect exists for marital satisfaction. Besides the proper estimation in the presence of endogenous time-varying covariates, an additional difference between the joint model and the classical time-to-event model is the inclusion of predictors for modelling the trajectory of the TVC. The basic TVC approach does not make it possible to investigate the predictors of the TVC (Figure 1: Covariates 1, 2 and 3 would not be taken into account for the upper panel).

Figure 2

Estimated Average Trajectory of Relationship Satisfaction of Persons by Event Status and Sex

Note. Non-linear smoother by sex (dark: men, light: women). Based on the illustration by Crowther et al. (2013).

A two-stage approach can be applied to first model the TVC as a function of covariates in order to include their effects and to overcome the problem of LVCF. However, it has some unfavourable statistical properties, mainly arising from the independent estimation of the two submodels. The subsequent time-to-event model treats the prediction of the previously fitted longitudinal outcome as if it were estimated without any uncertainty. This results in an underestimation of the standard errors for the TVC, so that its effect cannot be interpreted appropriately in terms of statistical inference. For an in-depth investigation of the consequences of using the aforementioned approaches instead of a joint model, see Sweeting and Thompson (2011).

Compared to these two approaches (TVC approach, two-stage approach) the joint model for longitudinal and time-to-event data is advantageous as it:

Can handle endogenous TVCs in time-to-event models.
Allows for proper inference in the presence of endogenous covariates due to the simultaneous estimation routine.
Allows for changes in the TVC between observation points.
Deals with informative drop-out in longitudinal studies (see the A Different Perspective on Joint Models subsection).

A main drawback of joint models is the computational effort during the fitting procedure. In contrast, two-stage models are less computationally demanding. Therefore, some research focuses on combining the advantages of the two methods — the unbiased estimates from the joint model and the fast estimation of the two-stage approach (Leiva-Yamaguchi & Alvares, 2020).

Method

Joint models (Faucett & Thomas, 1996; Rizopoulos, 2012; Wulfsohn & Tsiatis, 1997) overcome the above-mentioned shortcomings, as they allow for joint modelling of a repeatedly measured outcome alongside the risk of having an event of interest. Rather than using the observed values of the TVC directly, the joint model considers the repeatedly measured TVC as the result of a longitudinal process subject to its own model (marital satisfaction, Figure 1: upper part). This longitudinal process is combined with the related time-to-event process (risk of marriage dissolution, Figure 1: lower part).

The two submodels are described for the example in Figure 1 before merging them to the joint model. For a more general notation we refer the reader to Hickey et al. (2016).

First, the time-varying covariate is modeled via an appropriate model, in general a linear mixed model (LMM)², allowing for intra-individual variance along the time axis captured by random intercepts (b_0i) and possibly random slopes b_1i (commonly used on time as a covariate). The model corresponding to Figure 1 can be written as

$$y_i(t) = \underbrace{(\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t + \beta_2 x_{i1} + \beta_3 x_{i2} + \beta_4 x_{i3}(t)}_{m_i(t)} + \varepsilon_i(t) \tag{1}$$

with ε_i(t) ∼ N (0, σ²) where i indicates the individual and t the time point of the measurement. In vectorized form the model can be rewritten as

$$y_i(t) = x_{i_{\mathrm{long}}}(t)' \beta + z_i(t)' b_i + \varepsilon_i(t) \tag{2}$$

where $x_{i_{\mathrm{long}}}(t)'$ is a row vector with all covariate values and a leading 1 for the intercept for person i at time t, and z_i(t)′ = (1, t) holds the covariate values of the random effects, in this case a random intercept and a random slope on time t. This model makes it possible to incorporate covariates as predictors of the values of the TVC y_i(t). The covariates may be time-constant or time-varying, with β being the vector of regression coefficients. As in a classical LMM, the random intercepts and random slopes are assumed to stem from a multivariate normal distribution, b_i ∼ N(0, Q).
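To make the structure of the longitudinal submodel concrete, the following base R sketch simulates trajectories from a linear mixed model with a random intercept and a random slope on time. All parameter values (fixed effects, the covariance matrix Q, the residual standard deviation) are hypothetical and chosen only for illustration; further covariates are left out for brevity.

```r
set.seed(1)

n     <- 200                      # number of persons (hypothetical)
times <- seq(0, 1, by = 0.25)     # measurement times on a [0, 1] scale
beta  <- c(8, -1.5)               # fixed intercept and slope (hypothetical)
sigma <- 0.5                      # residual standard deviation (hypothetical)

# Covariance matrix Q of the random intercept b_0i and random slope b_1i
sd_b0 <- 1; sd_b1 <- 0.8; rho <- -0.3
Q <- matrix(c(sd_b0^2,             rho * sd_b0 * sd_b1,
              rho * sd_b0 * sd_b1, sd_b1^2), nrow = 2)

# Draw b_i ~ N(0, Q): standard normal draws times the Cholesky factor of Q
b <- matrix(rnorm(2 * n), nrow = n) %*% chol(Q)

# Long format: one row per person and measurement time
sim <- expand.grid(t = times, id = seq_len(n))

# True trajectory m_i(t) and observed value y_i(t) = m_i(t) + error
sim$m <- (beta[1] + b[sim$id, 1]) + (beta[2] + b[sim$id, 2]) * sim$t
sim$y <- sim$m + rnorm(nrow(sim), mean = 0, sd = sigma)

head(sim)
```

Each person receives an own intercept and slope, so the simulated trajectories differ in both level and trend. A data frame in this long format is the layout expected by mixed-model routines.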

The second related process is a time-to-event model,³ which is used to model the risk of having an event over time. The general form is a proportional hazards model, which consists of a baseline hazard h₀(t) scaled by a covariate part. The corresponding equation to Figure 1 is given by

$$h_i(t) = h_0(t) \exp[\gamma_0 + \gamma_1 x_{i3}(t) + \gamma_2 x_{i4}(t) + \gamma_3 x_{i5}] \tag{3}$$

It can be rewritten in vectorized form as

$$h_i(t) = h_0(t) \exp[x_{i_{\mathrm{surv}}}(t)' \gamma] \tag{4}$$

The modelled hazard function h_i(t) states the instantaneous risk of person i experiencing an event at time t (i.e., a marriage dissolution). This model also contains covariates (Figure 1: Covariates 3–5), which may be exogenous time-varying or time-constant, and a vector of coefficients γ.

In order to include marital satisfaction as an endogenous covariate in the model, a joint model links the value of the TVC process m_i(t) to the time-to-event model by incorporating it as a predictor, and it estimates the coefficients of both submodels jointly:

$$h(t \mid M_i(t), x_i) = h_0(t) \exp[x_{i_{\mathrm{surv}}}(t)' \gamma + \alpha\, m_i(t)] \tag{5}$$

The coefficient α is called the association parameter. In contrast to the two-stage approach, all coefficients (β, b, γ, α) are estimated simultaneously, such that all uncertainty is included in the estimation procedure. Estimating the value of marital satisfaction $\hat{m}_i(t)$ involves all available observations of the person, such that the hazard in a joint model at time t implicitly also depends on the covariate history M_i(t).
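How the association parameter translates a trajectory into a hazard can be sketched numerically. The example below plugs hypothetical values into Equation (5): a constant baseline hazard, a linearly declining trajectory m_i(t), a fixed covariate contribution and a negative α are all assumptions made for illustration. The survival probability is obtained from the standard relation S(t) = exp(−∫ h(u) du), with the integral taken from 0 to t.

```r
# Hypothetical ingredients of Equation (5) (illustrative values only)
h0      <- function(t) rep(0.1, length(t))  # constant baseline hazard
m_i     <- function(t) 8 - 2 * t            # declining satisfaction trajectory
alpha   <- -0.4                             # association parameter
gamma_x <- 0.2                              # covariate contribution x' gamma

# Hazard of person i at time t given the trajectory value m_i(t)
hazard <- function(t) h0(t) * exp(gamma_x + alpha * m_i(t))

# Cumulative hazard by numerical integration, then the survival probability
cumhaz <- function(t) integrate(hazard, lower = 0, upper = t)$value
surv   <- function(t) exp(-cumhaz(t))

hazard(c(0, 0.5, 1))   # hazard rises as satisfaction falls (alpha < 0)
surv(1)                # probability of no event up to t = 1
```

With a negative α, a falling satisfaction trajectory raises the hazard over time, which matches the pattern sketched in Figure 1.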

By including a covariate in both submodels, e.g., x_i3(t) in Equation (5), its direct and indirect effects on the risk of having an event can be separated. Consider a variable that has a strong influence on the TVC and, in addition, a smaller but significant impact on survival. By estimating a separate $\hat{\beta}$ coefficient, a $\hat{\gamma}$ coefficient and the association parameter $\hat{\alpha}$, we decompose the total effect as $\hat{\alpha} \hat{\beta} + \hat{\gamma}$. Here, $\hat{\alpha} \hat{\beta}$ is the effect of the covariate mediated via the trajectory, with $\hat{\beta}$ capturing its effect on the TVC, and $\hat{\gamma}$ represents the direct effect on the risk of having an event. This decomposition is especially helpful for understanding the effect pathway of the respective covariate.
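As a worked illustration of this decomposition, assume that the covariate $x_{i3}(t)$ enters the longitudinal submodel with coefficient $\beta_4$ as in Equation (1) and the time-to-event submodel with coefficient $\gamma_1$ as in Equation (3). Substituting $m_i(t)$ into the log of Equation (5) and collecting the terms in $x_{i3}(t)$ then gives

$$
\begin{aligned}
\log h(t \mid M_i(t), x_i) &= \log h_0(t) + \gamma_1 x_{i3}(t) + \alpha\,[\,\dots + \beta_4 x_{i3}(t)\,] + \dots \\
&= \log h_0(t) + (\gamma_1 + \alpha \beta_4)\, x_{i3}(t) + \text{(terms not involving } x_{i3}(t)\text{)}
\end{aligned}
$$

Under these assumptions, a one-unit increase in $x_{i3}(t)$, with the random effects and all other covariates held fixed, multiplies the hazard by $\exp(\gamma_1 + \alpha \beta_4)$. The factor $\exp(\alpha \beta_4)$ is the part transmitted through the trajectory and $\exp(\gamma_1)$ is the direct part.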

The coefficients can be estimated using a maximum likelihood approach (expectation-maximization algorithm) or Bayesian methods (Markov chain Monte Carlo sampling). The different estimation strategies for joint models are presented and compared by Rappl et al. (2021). Joint models for longitudinal and time-to-event data are implemented in several R packages (Philipson et al., 2018; Rizopoulos, 2010) and are also available in Stata (Crowther, 2020; Crowther et al., 2013). Furthermore, joint models allow for individual-specific predictions, as they control for individual characteristics via the random effects in the longitudinal model.

A Different Perspective on Joint Models

In the previous section, the joint model was presented with particular emphasis on the time-to-event submodel. However, there is also a body of literature focusing on the longitudinal submodel where the time-to-event submodel is used to model informative censoring from the longitudinal study (Asar et al., 2015; Hogan & Laird, 1997; Vonesh et al., 2005; Wu & Carroll, 1988). The utilisation of a joint model may be advantageous for example in situations involving group comparisons of trajectories in the presence of censoring or missing data mechanisms related to the trajectories. One approach to account for the informative censoring is to utilise an association structure based on the random effects of the longitudinal submodel to be included in the survival submodel. This strand of literature refers to joint models as shared parameter models and contributed to the strand of literature on misspecified regression models, e.g., ignored measurement error in covariates or ignored random effects (see e.g., Bound et al., 2001; Molenberghs & Verbeke, 2001).

Application: Marriage Satisfaction and Time to Marriage Dissolution

In order to demonstrate the use of joint models in the social sciences, the relationship between satisfaction with the marriage and the time to marriage dissolution is investigated. There is a large body of literature on marital satisfaction, predictors of marriage dissolution/divorce and the interrelations of the two. Some studies focus on the general development of marital satisfaction throughout the marriage (e.g., Lorber et al., 2014; Williamson & Lavner, 2019), others investigate predictors of marital satisfaction (e.g., Elmslie & Tebaldi, 2014; Huss & Pollmann-Schult, 2019). Some include marital satisfaction as a mediator between the risk of marriage dissolution and other effects in regression models, e.g., personality traits (Solomon & Jackson, 2014) or household work (Frisco & Williams, 2003). There are cross-sectional and longitudinal studies, which exploit the longitudinal structure of the data to different degrees (two time points vs. whole trajectory). Different methods have been applied to investigate the effect of marital satisfaction on marriage dissolution. The latent growth curve approach of Lorber et al. (2014), for example, indicates that the trajectory of marital satisfaction throughout the period of marriage should be modelled at the individual level. Most empirical studies fit separate models to male and female respondents, since the determinants of marriage dissolution and marital satisfaction differ between sexes. For a review of theoretical models of the evolution of marital satisfaction, see Caughlin and Huston (2006). Following the large body of literature, we chose the most common covariates for marital satisfaction; our selection largely matches the findings of the meta-analysis of Karney and Bradbury (1995).

We would like to highlight the paper of Frisco and Williams (2003), as it analyses the relationship between the two outcomes of interest in a regression. Their focus is to determine the influence of household work (in)equity on the odds of divorce and the possible mediating effect of marital satisfaction. Without considering individual trajectories, they find a small mediating effect of marital satisfaction but still report a significant direct positive effect of an unfairly high household workload on the odds of divorce eight years later for women. The study is based on measurements at two points in time.

To the best of our knowledge, no one has yet used a joint model for longitudinal and time-to-event data to investigate the relationship between marital satisfaction and time to marriage dissolution. As this model type makes it possible to exploit the whole richness of the data, i.e., the longitudinal character of the data as well as the information on the timing of an event, we believe that it is highly suitable for generating more well-founded answers to the questions “What determines marital satisfaction?”, “What determines marriage dissolution?” and “How does marital satisfaction mediate influences on the hazard of marriage dissolution?” with respect to the joint evolution of both processes over time.

Data Set

The analysis is based on the German pairfam data (Brüderl et al., 2023). Pairfam (“Panel Analysis of Intimate Relationships and Family Dynamics”) is a longitudinal study with 14 annual waves that sheds light on changes in family and relationship structures. It started in 2008 with over 12,000 respondents. An additional sample of 1,489 East German anchor persons (“DemoDiff”) is merged into the data base as a supplement. A detailed description of the pairfam study can be found in Huinink et al. (2011). The relationship biographies of the respondents as well as the annual questionnaire on satisfaction with the relationship can be used to build a joint model.

The final sample consists of all persons in the pairfam data who were in their first marriage over the course of at least three interviews. This restriction has been made for two reasons: First, several measurements of the longitudinal variable on marriage satisfaction are needed for a proper analysis. Second, using only the first marriage of a person is based on previous empirical findings that relationship stability differs between the first and following marriages and that a selection bias may be present (Jensen et al., 2016), which might also skew the results of the model. This leaves us with a final sample size of N = 3,616 persons/marriages, of which 247 (≈ 7%) reported an end of this relationship during the observation period (number of events). We did not take the actual month of divorce as the event time but the stated end of the relationship (marriage dissolution).

Marital satisfaction is measured as the answer on an 11-point scale to the question “All in all, how satisfied are you with your relationship?” Some exemplary trajectories of marital satisfaction are depicted in Figure 3. Note that marriages which started before the first interview such as the person in Panel C in Figure 3, are not left censored for the time-to-event model since we know the start of their marriage. They just start at a different point in time with time-dependent information. In order to allow users to reproduce the analysis, a synthesized data set can be found in the web supplementary material at Potts et al. (2026).⁴

Figure 3

Example Trajectories of Relationship Satisfaction From Pairfam Participants

Note. Vertical dashed lines indicate the time of marriage dissolution.

Table 1 summarises the time-constant variables used from the pairfam data that are included in the models. Relationship duration at marriage, age at marriage and premarital cohabitation are included as time-constant covariates, and years of education, personal net income, amount of household work, labor force status, children (presence of a preschool child and number of children under 18 in the respondent's household) and gender role attitudes are included as time-dependent exogenous variables. The variable on the division of household work is a weighted sum index resulting from five survey items. They are measured on a 5-point scale with endpoints indicating that the respondent does all the household work (high values) or the partner does all the household work (low values); the index is thus a relative measure between the spouses. Gender role attitudes are also captured by a sum index over three items. Note that all metric variables were z-score standardized for model estimation purposes and that time was rescaled from months to the interval [0, 1], with 1 being the latest time point measured in any marriage in the sample. Further note that for some of the covariates labelled as exogenous, the exogeneity assumption can also be called into question, as they may also respond to couple separation (e.g., labor market status, number of children). However, for the sake of simplicity, we focus on one single endogenous covariate in this tutorial (for possible extensions see the Summary & Conclusion section).
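The two preprocessing steps mentioned above, z-score standardization and rescaling of time to [0, 1], can be sketched as follows. The toy data frame and the column `months` are hypothetical; `incnet` merely borrows the name used for net income in the model code, and time is assumed to be counted from the start of the marriage so that dividing by the overall maximum maps it to [0, 1].

```r
# Hypothetical long-format toy data (illustrative values only)
df <- data.frame(
  id     = c(1, 1, 2, 2, 2),
  months = c(0, 12, 6, 18, 30),               # months since the start of the marriage
  incnet = c(1500, 1800, 2200, 2100, 2600)    # personal net income
)

# z-score standardization: subtract the mean, divide by the standard deviation
df$incnet_z <- as.numeric(scale(df$incnet))

# Rescale time so that the latest time point in the whole sample equals 1
df$t <- df$months / max(df$months)

df
```

On this scale, a coefficient of a standardized covariate refers to a change of one standard deviation, and a coefficient on t refers to the whole observed time span rather than to a single month.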

Table 1

Description of the Final Sample

	Descriptive Statistics
Variable	Women	Men
Number of persons	2,141	1,475
Number of events	161 (7.5%)	86 (5.8%)
Number of observations per person	7.6 (6)	7.5 (7)
Age at marriage	27.0 (26)	29.2 (29)
Relationship duration at marriage in months	64.2 (54)	64.9 (54)
Premarital cohabitation	Yes: 1,800 (84%)	Yes: 1,235 (84%)
	No: 341 (16%)	No: 240 (16%)

Note. For the covariates, means (medians) of the non-standardized time-constant variables are presented.

Model Specification

Based on the reviewed literature on marital satisfaction and marital dissolution, two joint models will be fitted separately for men and women. They are identical in terms of included covariates, method and association structure.

The longitudinal model on satisfaction with the marriage will be modelled by an LMM including a random intercept and a random slope term for the duration of marriage (t). We included a random slope since intercepts and slopes differ between individuals (see Figure 3 for some example trajectories). The above mentioned variables are taken into account as covariates to model the longitudinal variable properly. Hereby, i is an indicator for the person and j denotes the measurement at time-point j:

$$
\begin{aligned}
\hat{m}_{ij} ={} & (\hat{\beta}_0 + \hat{b}_{0i}) + (\hat{\beta}_1 + \hat{b}_{1i})\, t_{ij} + \hat{\beta}_2\, t_{ij}^2 \\
& + \hat{\beta}_3\, \text{age at marriage}_i + \hat{\beta}_4\, \text{relationship duration at marriage}_i \\
& + \hat{\beta}_5\, \text{premarital cohabitation}_i + \hat{\beta}_6\, \text{years of education}_i \\
& + \hat{\beta}_7\, \text{net income}_{ij} + \hat{\beta}_8\, \text{shared work}_{ij} \\
& + \hat{\beta}_9\, \text{gender attitudes}_{ij} + \hat{\beta}_{10}\, \text{preschoolchild}_{ij} \\
& + \hat{\beta}_{11}\, (\text{nchild} = 1)_{ij} + \hat{\beta}_{12}\, (\text{nchild} = 2)_{ij} + \hat{\beta}_{13}\, (\text{nchild} = \text{more})_{ij} \\
& + \hat{\beta}_{14}\, (\text{lfs} = \text{not working})_{ij} + \hat{\beta}_{15}\, (\text{lfs} = \text{part-time})_{ij} + \hat{\beta}_{16}\, (\text{lfs} = \text{other})_{ij}
\end{aligned}
\tag{6}
$$

The time-to-event model for time to marriage dissolution is modelled jointly with the longitudinal model from Equation (6), where the estimated values of satisfaction explain the risk of marriage dissolution. Specifically, the time-to-event model includes a B-spline approximation of the baseline hazard h₀(t).
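The idea of a B-spline approximation of the baseline hazard can be sketched with the splines package that ships with R. This is an illustration only: the coefficients `kappa` are hypothetical, the number of basis functions is arbitrary, and placing the spline on the log baseline hazard is an assumption made here so that h₀(t) stays positive. It is not meant to reproduce the internals of any fitting routine.

```r
library(splines)

# Time grid on the rescaled [0, 1] axis
t_grid <- seq(0, 1, length.out = 101)

# Cubic B-spline basis with 5 basis functions (one interior knot)
B <- bs(t_grid, df = 5, intercept = TRUE)

# Hypothetical spline coefficients for the log baseline hazard
kappa <- c(-3, -2.5, -2, -2.2, -2.8)

# log h0(t) as a weighted sum of basis functions; exponentiate to get h0(t)
log_h0 <- as.numeric(B %*% kappa)
h0     <- exp(log_h0)

range(h0)
```

Because the baseline hazard is a smooth function of a handful of coefficients, its shape is estimated from the data rather than fixed in advance, while the number of parameters stays small.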

Equation (7) shows the final representation of our main joint model on marital satisfaction and the risk of marriage dissolution.

$$
\begin{aligned}
\hat{h}(t \mid M_i(t), x_{i_{\mathrm{surv}}}) = h_0(t) \exp[\, & \hat{\gamma}_1\, \text{premarital cohabitation}_i \\
& + \hat{\gamma}_2\, \text{age at marriage}_i + \hat{\gamma}_3\, \text{relationship duration at marriage}_i \\
& + \hat{\gamma}_4\, \text{net income}_i(t) + \hat{\gamma}_5\, \text{years of education}_i + \hat{\gamma}_6\, \text{shared work}_i(t) \\
& + \hat{\gamma}_7\, \text{gender attitudes}_i(t) + \hat{\gamma}_8\, \text{preschoolchild}_i(t) \\
& + \hat{\gamma}_9\, (\text{nchild} = 1)_i(t) + \hat{\gamma}_{10}\, (\text{nchild} = 2)_i(t) + \hat{\gamma}_{11}\, (\text{nchild} = \text{more})_i(t) \\
& + \hat{\gamma}_{12}\, (\text{lfs} = \text{not working})_i(t) + \hat{\gamma}_{13}\, (\text{lfs} = \text{part-time})_i(t) + \hat{\gamma}_{14}\, (\text{lfs} = \text{other})_i(t) \\
& + \hat{\alpha}\, \hat{m}_i(t) \,]
\end{aligned}
\tag{7}
$$

Implementation

To implement the above model we use the package $JM$ (Rizopoulos, 2010) for $R$ (Version 4.4.0) (R Core Team, 2024). $JM$ combines two models that are each built with their own $R$ package. The longitudinal model is constructed using $lme()$ from the $nlme$ package (Pinheiro et al., 2023) and therefore requires the usual data structure in long format, where each individual spans several rows, one per observation time point, each holding the covariate values measured at that time point.

The time-to-event model is fitted using $coxph()$ from $survival$ (Therneau, 2024). When further exogenous TVCs are used, the underlying data set follows the long format start-stop-event logic, i.e., several rows per individual, each indicating the currently measured values. If the model contains no other exogenous TVCs, the data set for the time-to-event model reduces to one row per individual and supplements the long format data set for the longitudinal submodel (see Rizopoulos (2012)). Since our model includes other exogenous TVCs (e.g., labor force status), one single data set in the classical start-stop-event logic is used.
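To make the start-stop-event logic concrete, the following sketch builds a small data set in base R. All values, the helper vectors `end_time` and `event`, and the column `sat` are invented for illustration; only the column names `t`, `t1`, `t2`, `status` and `id` mirror those used in the model code of this tutorial.

```r
# Hypothetical toy data: two persons, one row per interview (all values invented)
long <- data.frame(
  id  = c(1, 1, 1, 2, 2),
  t   = c(0, 1, 2, 0, 1),        # marriage duration at the interview
  sat = c(9, 8, 6, 10, 9),       # marital satisfaction at the interview
  lfs = c("full-time", "part-time", "part-time", "full-time", "full-time")
)
end_time <- c(`1` = 2.5, `2` = 2)  # time of dissolution or censoring per person
event    <- c(`1` = 1,   `2` = 0)  # 1 = marriage dissolved, 0 = censored

# each interval starts at the interview time ...
long$t1 <- long$t
# ... and stops at the person's next interview, or at the end of follow-up
next_t <- ave(long$t, long$id, FUN = function(x) c(x[-1], NA))
last   <- is.na(next_t)
long$t2 <- ifelse(last, end_time[as.character(long$id)], next_t)
# the event indicator can only be 1 in a person's last interval
long$status <- ifelse(last, event[as.character(long$id)], 0)

print(long)
```

In such a layout the same data frame can serve both submodels: the longitudinal model reads `sat` and `t`, while the time-to-event model reads the intervals `(t1, t2]` together with `status`.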

Both models serve as inputs for the final $jointModel()$ command. For further information on the (optional) arguments in the $jointModel()$ function, we refer to Rizopoulos (2010). $R$ -code and a synthesized data set to replicate the example can be found in the web appendix of this paper (see Potts et al., 2026).

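# load the required packages
	library(nlme)      # lme()
	library(survival)  # coxph(), Surv()
	library(JM)        # jointModel()

# time-to-event model in start-stop-event format;
# x = TRUE keeps the design matrix, which jointModel() requires
	modsurv_female <- coxph(Surv(time = t1, time2 = t2, event = status) ~
	  yeduc + ageatm + preschoolchild + nchild + premarcohab +
	  sw_weight + incnet + relduratmar + lfs_rec + genderatt_s,
	  data = df_female, x = TRUE, model = TRUE, cluster = id)

# longitudinal model with random intercept and random slope for time
	modlong_female <- lme(sat31 ~ t + I(t^2) + sw_weight + ageatm +
	  preschoolchild + nchild + premarcohab + yeduc + incnet +
	  relduratmar + lfs_rec + genderatt_s,
	  data = df_female, random = ~ t | id)

# joint model for longitudinal and time-to-event data;
# "spline-PH-GH": B-spline baseline hazard, Gauss-Hermite quadrature
	modjoint_female <- jointModel(modlong_female, modsurv_female,
	  timeVar = "t", method = "spline-PH-GH",
	  control = list(verbose = TRUE, iter.EM = 100))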

Estimation Results

Separate models were fitted for men and women. This section starts with the estimation results of the joint model for women.

The longitudinal submodel estimates the relationship between the covariates and marital satisfaction (outcome) (Table 2, left side). The estimated effect of time is U-shaped (negative linear term, positive quadratic term). The number of years of education shows a positive, significant effect on marital satisfaction. Some other covariates show negative, linear effects on satisfaction with the relationship: higher age at marriage is associated with significantly lower marital satisfaction, and higher personal net income shows a weaker negative association (p = 0.0651). Women with children (compared to childless women) also report lower satisfaction. There is an additional, weaker negative effect if preschool children are present in the household (p = 0.0924). The index for the division of household work reveals a negative and statistically significant effect, i.e., women who reported doing more of the household work are less satisfied with their relationship during marriage. The relationship duration at the time of marriage and gender role attitudes show insignificant coefficients in this model.

Next, we examine the time-to-event submodel for women (Table 2, right side), which includes the properly modelled endogenous variable marital satisfaction. This model indicates which variables still have a direct effect on the risk of marital dissolution when controlling for marital satisfaction. Looking at the association parameter (last row in Table 2), we observe the expected strong negative effect of marital satisfaction on the risk of marriage dissolution: the higher the current value of satisfaction with the relationship, the lower the risk of marriage dissolution. Besides this relationship, there are only a few significant effects in the time-to-event submodel. More highly educated women and women who had been in a longer relationship with their partner before marriage have a lower risk of marriage dissolution. As an example of the decomposition of effects, we focus on the variable of relative household work in the following.

In contrast to other research results (e.g., Frisco & Williams, 2003)⁵, the relative load of household work done by a person has no statistically significant direct effect on the risk of marriage dissolution for women (p = 0.1694). There still remains the indirect effect via the mediator of marital satisfaction: a higher share of household work done by the female respondent results in a significantly lower marital satisfaction, which in turn results in a significantly higher risk of marriage dissolution. The estimated total effect of the index on the division of household work on the risk of marriage dissolution adds up to −0.5552 × −0.1369 + 0.1089 ≈ 0.1849.
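The total effect reported above can be traced back to Equations (6) and (7). As a sketch, substitute the longitudinal predictor into the log-hazard and consider a one-unit change in the household work index at a given time t, holding all other terms fixed (a simplifying assumption made here for illustration); the index is written as "shared work" as in the equations, and the labels "direct" and "indirect" are added for readability:

\begin{array}{l}
\log \hat{h}(t \mid M_{i}(t), x_{i_{\text{surv}}}) = \log h_{0}(t) + \dots + \hat{γ}_{6} \, \text{shared work}_{i}(t) + \hat{α} \, \hat{m}_{i}(t), \\
\hat{m}_{i}(t) = \dots + \hat{β}_{8} \, \text{shared work}_{i}(t) + \dots \\
\Rightarrow \; \partial \log \hat{h} / \partial \, \text{shared work} = \underbrace{\hat{γ}_{6}}_{\text{direct}} + \underbrace{\hat{α} \, \hat{β}_{8}}_{\text{indirect}} = 0.1089 + (-0.5552)(-0.1369) \approx 0.1089 + 0.0760 = 0.1849
\end{array}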

Table 2

Model for Women: Regression Coefficients of the Joint Model for Longitudinal and Time-to-Event Data

	Longitudinal submodel			Time-to-event submodel
	(Marital satisfaction)			(Risk of marriage dissolution)
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
(Intercept)	8.7403	0.1054	0.0000
Time	-3.0387	0.4023	0.0000
Time²	2.0987	0.5022	0.0000
Relative load of household work	-0.1369	0.0191	0.0000	0.1089	0.0793	0.1694
Premarital cohabitation^a : yes	-0.1036	0.0788	0.1885	0.1297	0.2225	0.5599
Age at marriage	-0.1820	0.0361	0.0000	-0.1546	0.1070	0.1483
Preschool child(ren) in hh^a : yes	-0.0693	0.0412	0.0924	-0.3286	0.2121	0.1213
Number of children in hh^b : 1	-0.2520	0.0648	0.0001	0.0893	0.2969	0.7637
Number of children in hh^b : 2	-0.2594	0.0729	0.0004	0.0508	0.3082	0.8691
Number of children in hh^b : more	-0.2182	0.0921	0.0178	0.0799	0.3649	0.8266
Years of education	0.0672	0.0310	0.0299	-0.2228	0.0969	0.0214
Personal net income	-0.0457	0.0248	0.0651	-0.0230	0.1462	0.8750
Relationship duration at marriage	0.0183	0.0340	0.5910	-0.2570	0.1047	0.0140
Labor force status^c : not working	0.0474	0.0625	0.4479	-0.3514	0.3051	0.2494
Labor force status^c : other	-0.0577	0.0677	0.3941	-0.0238	0.2830	0.9329
Labor force status^c : part-time employed	-0.0170	0.0555	0.7594	-0.1381	0.2422	0.5685
Gender role attitudes	-0.0082	0.0219	0.7069	0.0819	0.0896	0.3606
Satisfaction ( $\hat{α}$ )				-0.5552	0.0551	0.0000

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.

For men, the results reveal interesting differences in both submodels (Table 3). Regarding marital satisfaction (longitudinal model), socio-economic variables such as education and personal net income do not show significant effects. As in the model for women, children decrease relationship satisfaction in the marriage compared to childless men, and the effect size is larger than for women. In contrast to the model for women, there is no additional significant, negative effect of preschool children. Even though the labor force status shows an influence on marital satisfaction for men, these results have to be interpreted with caution, as over 80% of the person-periods for men indicate full-time employment. The relative load of household work has a smaller effect on marital satisfaction than for women, but the effect is likewise negative and statistically significant.

Table 3

Model for Men: Regression Coefficients of the Joint Model for Longitudinal and Time-to-Event Data

	Longitudinal Submodel			Time-to-Event Submodel
	(Marital Satisfaction)			(Risk of Marriage Dissolution)
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
(Intercept)	9.0393	0.1233	0.0000
Time	-2.8529	0.4976	0.0000
Time²	1.5018	0.7095	0.0343
Relative load of household work	-0.0603	0.0267	0.0237	0.2413	0.1312	0.0660
Premarital cohabitation^a : yes	-0.2385	0.1052	0.0234	-0.3357	0.3020	0.2663
Age at marriage	-0.1686	0.0417	0.0001	0.1291	0.1383	0.3508
Preschool child(ren) in hh^a : yes	0.0249	0.0507	0.6238	0.0196	0.2806	0.9442
Number of children in hh^b : 1	-0.3302	0.0790	0.0000	-0.1158	0.4041	0.7744
Number of children in hh^b : 2	-0.3910	0.0901	0.0000	0.3430	0.3825	0.3699
Number of children in hh^b : more	-0.3562	0.1152	0.0020	0.3927	0.4633	0.3967
Years of education	0.0341	0.0393	0.3864	-0.1266	0.1232	0.3043
Personal net income	0.0017	0.0192	0.9307	-0.0481	0.1647	0.7703
Relationship duration at marriage	-0.0015	0.0420	0.9708	-0.0709	0.1208	0.5573
Labor force status^c : not working	-0.2574	0.0977	0.0085	0.0613	0.4931	0.9012
Labor force status^c : other	-0.3296	0.0927	0.0004	-0.0242	0.3599	0.9465
Labor force status^c : part-time employed	-0.2288	0.1191	0.0548	-1.1292	1.0137	0.2653
Gender role attitudes	0.0281	0.0263	0.2846	0.1294	0.1162	0.2654
Satisfaction ( $\hat{α}$ )				-0.4534	0.0702	0.0000

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.

The estimated association parameter $\hat{α}$ in the time-to-event submodel for men is also significant, but the effect size is smaller than in the model for women (−0.4534 vs. −0.5552). In other words, the estimated marital satisfaction appears less predictive of the risk of marriage dissolution for men than for women. Another difference between the sexes in the submodel for the risk of marriage dissolution is the effect of the division of household work: for men, this covariate has a positive effect on the risk of marriage dissolution that is statistically significant only at the 10% level (p = 0.0660). In contrast to the model for women, the indirect, mediated effect is thus supplemented by a direct effect of household work on the risk of marriage dissolution. The estimated total effect of this predictor variable on the risk of marriage dissolution can be derived as −0.4534 × −0.0603 + 0.2413 ≈ 0.2686.
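The decomposition for both sexes can be reproduced with a few lines of R. The sketch below only copies the relevant estimates from Tables 2 and 3; the object and column names are hypothetical, and the exponentiated total effect is added here as an illustrative hazard-ratio reading of the log-hazard scale, not as a result reported by the models.

```r
# Estimates copied from Tables 2 and 3 (household work rows, association parameter)
effects <- data.frame(
  sex          = c("women", "men"),
  alpha        = c(-0.5552, -0.4534),  # satisfaction -> log-hazard (association)
  beta_long    = c(-0.1369, -0.0603),  # household work -> satisfaction
  gamma_direct = c( 0.1089,  0.2413)   # household work -> log-hazard (direct)
)

# indirect effect via satisfaction, and total effect on the log-hazard scale
effects$indirect <- effects$alpha * effects$beta_long
effects$total    <- effects$indirect + effects$gamma_direct

# multiplicative change in the hazard per one-unit increase of the index
effects$hr_total <- exp(effects$total)

print(effects, digits = 4)
```

The `total` column matches the values given in the text (0.1849 for women, 0.2686 for men), and the `indirect` column shows that the mediated part is larger for women than for men.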

Besides the decomposition of interpretable covariate effects, another strength of a joint model is the option to predict individual probabilities of still being in the relationship after the last measurement of a person. This may be useful for intervention planning in different research questions on the micro-level. In contrast to a classical TVC approach, these predictions are based on the whole estimated longitudinal trajectory of marriage satisfaction and do not rely only on the last measured value. The orange line in Figure 4 shows the predictions for a fictional part-time working woman who lived together with her partner before marriage. She lives with two children, at least one being a preschool child, and shows median values (for women) for the other covariates. The only part that varies between the four plots is her trajectory of marital satisfaction, indicated by the crosses to the left of the dashed line showing the time of the last measurement. Looking at a certain point of time in the future (black solid line), the different trajectories result in different predicted probabilities of still being in the marriage, even though each trajectory ends with the same last measurement of a 4 on the 11-point scale. This feature also allows the prediction to be updated with every newly obtained measurement on a person, such that a researcher can trace the development dynamically (see Appendix A.4).

Figure 4

Predicted Probability of Still Being in the Marriage for a Person Varying Only the Marital Satisfaction Trajectory

Note. Covariate values: part-time working, 2 children, at least one preschool child, premarital cohabitation, median values (for women) for the other covariates.
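Individual predictions of this kind can be obtained in $JM$ with the function `survfitJM()`. The following is a minimal sketch that uses the `aids` example data shipped with $JM$ rather than the pairfam data, with a deliberately simple model; the object names, the chosen patient and the number of Monte Carlo samples `M` are arbitrary choices for illustration and do not reproduce the predictions in Figure 4.

```r
library(JM)  # attaches nlme and survival as dependencies

# Example data from JM: 'aids' is in long format, 'aids.id' has one row per patient
fit_lme <- lme(sqrt(CD4) ~ obstime + obstime:drug,
               random = ~ 1 | patient, data = aids)
fit_cox <- coxph(Surv(Time, death) ~ drug, data = aids.id, x = TRUE)
fit_jm  <- jointModel(fit_lme, fit_cox, timeVar = "obstime",
                      method = "weibull-PH-aGH")

# Longitudinal history of one patient; the prediction conditions on all of
# these rows, not only on the last measured value
nd   <- aids[aids$patient == "141", ]
pred <- survfitJM(fit_jm, newdata = nd, idVar = "patient", M = 50)
pred
```

Passing a longer history in `newdata` (for instance after a new interview) yields an updated prediction, which is the dynamic use described above.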

Comparison to Other Modelling Approaches

In this section, we compare the estimation results, i.e., coefficients, standard errors, predictive performance (Mean Squared Error), of the joint model with two other modelling approaches (TVC approach in a Cox-proportional hazards model and a two-stage model). Comparison tables of the estimation results can be found in Appendix A.1.

A classical TVC approach in a time-to-event model ⁶ would heavily underestimate the effect of marital satisfaction on the risk of marriage dissolution (e.g., women: TVC approach: −0.3083, joint model: −0.5552). This may result from the fact that, owing to the estimation routine of the joint model, the risk of marriage dissolution depends not only on the observed current value but on the whole trajectory of marital satisfaction up to that point in time. Modelling marital satisfaction as an endogenous covariate results in some differences regarding the other covariates in the time-to-event submodel as well. For example, for women the effect of the division of household work on the risk of marriage dissolution is overestimated in a classical TVC model and its standard error is underestimated, which leads to a smaller p-value. These differences highlight the importance of the correct model choice when dealing with endogenous covariates in time-to-event models.

In our example the differences between a two-stage model and a joint model are only small and the tendency to underestimate the uncertainty of the satisfaction variable can only be found in the model for women. The regression coefficients for marital satisfaction differ only slightly with −0.5347 in the two-stage model and −0.5552 in the joint model, and the standard error of 0.0530 in the two-stage model is smaller than the standard error of 0.0551 of the presented joint model (men: two-stage: −0.4240, s.e. = 0.0708; joint model: −0.4534, s.e. = 0.0702).
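For readers unfamiliar with the two competing approaches, the following sketch shows how they are typically set up with $survival$ and $nlme$. It runs on simulated toy data, not on the pairfam data; all variable names, sample sizes and parameter values are invented, and the LMM uses a random intercept only to keep the example small. The output is therefore not meant to reproduce the coefficients reported here.

```r
library(nlme)
library(survival)

set.seed(42)
n_id <- 300                                   # hypothetical number of persons
toy  <- expand.grid(wave = 0:4, id = seq_len(n_id))
toy$t1 <- toy$wave                            # start of interval
toy$t2 <- toy$wave + 1                        # end of interval

# true trajectory with a person-specific intercept, observed with error
b0 <- rnorm(n_id, sd = 1)
toy$m_true <- 8 + b0[toy$id] - 0.3 * toy$t1
toy$sat    <- toy$m_true + rnorm(nrow(toy), sd = 0.8)

# event risk per interval depends on the true (unobserved) trajectory
toy$status <- rbinom(nrow(toy), 1, plogis(-3 - 0.5 * (toy$m_true - 8)))
# keep each person's rows up to and including the first event
keep <- ave(toy$status, toy$id, FUN = function(s) cumsum(cumsum(s))) <= 1
toy  <- toy[keep, ]

# (a) classical TVC approach: the observed value enters the Cox model directly
fit_tvc <- coxph(Surv(t1, t2, status) ~ sat, data = toy)

# (b) two-stage approach: fit the LMM first, then plug in its fitted values
fit_lme   <- lme(sat ~ t1, random = ~ 1 | id, data = toy)
toy$m_hat <- as.numeric(fitted(fit_lme))      # subject-specific fitted values
fit_two_stage <- coxph(Surv(t1, t2, status) ~ m_hat, data = toy)

rbind(tvc       = coef(summary(fit_tvc))[1, c("coef", "se(coef)")],
      two_stage = coef(summary(fit_two_stage))[1, c("coef", "se(coef)")])
```

The two-stage approach treats the fitted values as if they were known, which is why its standard errors can understate the uncertainty compared with a joint model that estimates both submodels simultaneously.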

Focussing on the longitudinal submodel, we compare the standalone longitudinal model (mixed model fitted with $lme()$) and the longitudinal submodel of a joint model, which controls for the non-random drop-out due to marriage dissolution. There are only small differences in coefficients and standard errors between the two models. However, in the joint model for women the effect of marriage duration has a more pronounced U-shape, i.e., larger absolute coefficients for the linear and quadratic terms. In the model for men, modelling the non-random drop-out by marriage dissolution leads to a change in the size and significance of the coefficient of premarital cohabitation. Note that the differences between the longitudinal model approaches may be larger in other applications when a larger share of events (higher number of drop-outs, i.e., more marriage dissolutions) is present.

We further evaluated the predictive performance of the three modelling approaches by predicting the event probability for several time points after the last individual measurement of marital satisfaction. Figure A.1 shows the mean squared error (MSE) of the models at specific points in time and highlights the advantages of a joint model regarding predictions in the long run. While the MSE is smallest for the TVC approach when predicting the event outcome up until six months after the last measurement, the MSE for the joint model is smaller when looking at later times, and the joint model outperforms the two competing models after ten months.
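As a reminder of how such an error measure can be computed, the sketch below assumes that the MSE at a given horizon is the mean squared difference between the predicted event probability and the observed 0/1 event indicator, ignoring censoring. This definition and all numbers are assumptions made for illustration; the exact computation used for Figure A.1 is documented in the supplementary code.

```r
# Hypothetical predicted probabilities of dissolution by a given horizon
pred_prob <- c(0.10, 0.40, 0.05, 0.70, 0.20)
# Hypothetical observed outcomes (1 = dissolved by then, 0 = still married)
observed  <- c(0, 1, 0, 1, 0)

# mean squared error between predicted probability and observed indicator
mse <- mean((pred_prob - observed)^2)
mse
```

Computing this quantity separately for each model and each prediction horizon gives curves that can be compared across modelling approaches.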

Summary & Conclusion

This tutorial paper aimed to introduce the method of a joint model for longitudinal and time-to-event data to the field of social science research. We demonstrated its suitability and added value for answering research questions with endogenous covariates in an application on marital satisfaction and marriage dissolution. Based on the pairfam data, our results indicate that the effect of marital satisfaction on the risk of marriage dissolution is larger than a Cox model with TVC suggests. The strength of the decomposition of effects has been demonstrated and shows for our sample, e.g., that the relative load of household work in a marriage has a direct effect on the risk of marital dissolution for men but not for women, and a strong indirect effect via marital satisfaction for both sexes. We believe that this model class is a useful tool in social science research and hope to contribute to its increasing usage.

For the sake of illustration, this tutorial paper kept the modelling structure as simple as possible, so that several extensions of the same data example suggest themselves. A first option is the consideration of a different linkage between the longitudinal and the time-to-event model, as there exist several options of association structures. Using the current value of the longitudinal model m_i(t), as presented in our main joint model in Equation (5), associates the predicted value at each time-point with the hazard function at the same time-point. One can also think of the slope of the estimated trajectory of the TVC (marital satisfaction) as being important for the hazard function: an increase or decrease in marital satisfaction, independent of the actual level, may influence the risk of marriage dissolution. The two options can be combined. Note that the effect size of the slope association structure depends on the units of time, whereas that of the current value does not. Thus, the estimates of the association structures cannot be compared in their effect size. Both options can be used in lagged versions as well.

We therefore tested the joint model with different association structures (current slope, current value and current slope, cumulative effect, lagged effect, see Appendix A.3). Model choice criteria favor the current value association with a lag over the other association structures for men, whereas the current value and current slope association is the favored model for women. Another common approach for the association structure is to use the estimated random effects of each person from the longitudinal submodel and link them to the person's survival. The interpretation of the respective association parameter does not depend on time, since random intercept and random slope do not depend on time by default. For an overview of association structures see Cremers et al. (2021).
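In formula notation, the association structures mentioned above can be sketched as follows. The shorthand γ^⊤ x_i(t) for the covariate terms of Equation (7), the symbols α_s (slope association) and c (lag), and the prime for the derivative with respect to time are notational assumptions made here for illustration:

\begin{array}{ll}
\text{current value:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α \, m_{i}(t)] \\
\text{current slope:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α_{s} \, m_{i}'(t)] \\
\text{current value and slope:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α \, m_{i}(t) + α_{s} \, m_{i}'(t)] \\
\text{lagged value:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α \, m_{i}(\max(t - c, 0))] \\
\text{cumulative effect:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α \int_{0}^{t} m_{i}(s) \, ds] \\
\text{random effects:} & h_{i}(t) = h_{0}(t) \exp[γ^{\top} x_{i}(t) + α^{\top} b_{i}]
\end{array}

For the time polynomial of Equation (6), and holding the other covariates fixed, the slope is m_i'(t) = (β_1 + b_{1i}) + 2 β_2 t. It is measured in satisfaction points per unit of time, which illustrates why the size of α_s depends on the time unit while α does not.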

In terms of model set-up, the variable of marital satisfaction is not perfectly normally distributed and another outcome distribution modelled via a Generalized Linear Mixed Model may be a more appropriate choice for the longitudinal submodel. This can be included using the $JMbayes2$ package (Rizopoulos et al., 2024) in R. Furthermore, the individual-specific effect of time in the LMM could be modelled non-linearly with semi-parametric methods via splines. Combining statistical modelling with machine learning methods for variable selection may be useful. A model-based boosting algorithm for joint models has been developed by Griesbach et al. (2023). Another extension is the use of a competing risks model in the survival part, which might be especially useful to account for different types of events or different reasons of dropout and is therefore appealing when focus is on the longitudinal submodel as well. For an extensive overview on recent developments in the joint modelling literature see Papageorgiou et al. (2019).

Regarding the content level, several extensions are conceivable. In order to exploit the data richness of pairfam even further, couple-level data analysis might give additional insights (Hickey et al., 2016; Ruppanner et al., 2017). The restriction to individual-level data is a strong limitation of the performed analysis, since the occurrence of marriage dissolution and its timing are assumed to be influenced only by the satisfaction level of one partner in the marriage. The predictive performance might be improved by also using variables of the partner in the model. In addition, one could rethink the exogeneity assumption of the other time-varying covariates in the time-to-event model and also model them as endogenous in a joint model, e.g., regarding the potential anticipation effect of divorce and its connection to working behaviour for women (Poortman, 2005).

Notes

1) The preregistration of this article can also be found in the supplementary material at Potts et al. (2025).

2) An introduction to linear mixed models can be found in Galecki and Burzykowski (2012).

3) Several classes of time-to-event models are explained by Blossfeld and Rohwer (2001).

4) The data set has been generated using the simPop package (Templ et al., 2017).

5) Note that the measures of household tasks differ, as Frisco and Williams (2003) use a measure of perceived fairness, whereas our variable measures to what extent a person does more or less of the household work. Furthermore, they concentrate on dual-earner households with data from the United States, whereas our database contains first marriages in Germany without any restrictions on the labor force status.

6) A classical TVC approach with a lagged satisfaction value (value of previous interview) did not reveal large differences in terms of the size of the coefficient (tendency to smaller effect size) and inference compared to the non-lagged TVC approach.

Funding

The work on this article was supported by the DFG (Number 426493614) and the Volkswagen Foundation (Freigeist Fellowship).

Acknowledgments

This paper uses data from the German Family Panel pairfam, coordinated by Josef Brüderl, Sonja Drobnič, Karsten Hank, Johannes Huinink, Bernhard Nauck, Franz J. Neyer, and Sabine Walper. The study was funded from 2004 to 2022 as a priority program and long-term project by the German Research Foundation (DFG).

Competing Interests

The authors have declared that no competing interests exist.

Data Availability

The study is preregistered at Potts et al. (2025). The tutorial code, R and R-Markdown, used during the current study, as well as the synthesized data set generated from the original pairfam data set, are available at Potts et al. (2026).

Supplementary Materials

Type of supplementary materials	Availability/Access
Data
Synthesized data set.	Potts et al. (2026)
Code
Tutorial code - R code.	Potts et al. (2026)
Tutorial code - R Markdown code.	Potts et al. (2026)
Material
No supplemental material available.	—
Study/Analysis preregistration
Preregistration.	Potts et al. (2025)
Other
No other materials available.	—

References

Ameri, S., Fard, M. J., Chinnam, R. B., & Reddy, C. K. (2016). Survival analysis based framework for early prediction of student dropouts. In Proceedings of the 25^th ACM International on Conference on Information and Knowledge Management. ACM.
Asar, O., Ritchie, J., Kalra, P. A., & Diggle, P. J. (2015). Joint modelling of repeated measurement and time-to-event data: An introductory tutorial. International Journal of Epidemiology, 44(1), 334-344.
Blossfeld, H.-P., & Rohwer, G. (2001). Techniques of event history modeling (2^nd ed.). Psychology Press.
Bound, J., Brown, C., & Mathiowetz, N. (2001). Measurement error in survey data. In Handbook of econometrics, (pp. 3705–3843). Elsevier.
Box-Steffensmeier, J. M., & Jones, B. S. (2004). Analytical methods for social research: Event history modeling: A guide for social scientists. Cambridge University Press.
Brüderl, J., Drobnič, S., Hank, K., Neyer, F. J., Walper, S., Wolf, C., Alt, P., Bauer, I., Böhm, S., Borschel, E., Bozoyan, C., Christmann, P., Edinger, R., Eigenbrodt, F., Garrett, M., Geissler, S., Gonzalez Avilés, T., Gröpler, N., Gummer, T., ... & Wetzel, M. (2023). The German Family Panel (pairfam). GESIS. ZA5678 Datenfile Version 14.0.0, https://doi.org/10.4232/pairfam.5678.14.0.0.
Caughlin, J. P., & Huston, T. L. (2006). The affective structure of marriage. In Cambridge handbook of personal relationships, (pp. 131–156). Cambridge University Press.
Clark, A. E., Diener, E., Georgellis, Y., & Lucas, R. E. (2008). Lags and leads in life satisfaction: A test of the baseline hypothesis. Economic Journal, 118(529), F222-F243.
Cremers, J., Mortensen, L. H., & Ekstrøm, C. T. (2021). A joint model for longitudinal and time-to-event data in social and life course research: Employment status and time to retirement. Sociological Methods & Research, 1-36.
Crowther, M. J. (2020). merlin — a unified modeling framework for data analysis and methods development in Stata. Stata Journal, 20(4), 763-784.
Crowther, M. J., Abrams, K. R., & Lambert, P. C. (2013). Joint modeling of longitudinal and survival data. Stata Journal, 13(1), 165-184.
Elmslie, B. T., & Tebaldi, E. (2014). The determinants of marital happiness. Applied Economics, 46(28), 3452-3462.
Faucett, C. L., & Thomas, D. C. (1996). Simultaneously modelling censored survival data and repeatedly measured covariates: A Gibbs sampling approach. Statistics in Medicine, 15(15), 1663-1685.
Ferrer, L., Rondeau, V., Dignam, J., Pickles, T., Jacqmin-Gadda, H., & Proust-Lima, C. (2016). Joint modelling of longitudinal and multi-state processes: Application to clinical progressions in prostate cancer. Statistics in Medicine, 35(22), 3933-3948.
Frisco, M. L., & Williams, K. (2003). Perceived housework equity, marital happiness, and divorce in dual-earner households. Journal of Family Issues, 24(1), 51-73.
Galecki, A., & Burzykowski, T. (2012). Linear mixed-effects model. In Linear mixed-effects models using R, (pp. 245–273). Springer New York.
Griesbach, C., Mayr, A., & Bergherr, E. (2023). Variable selection and allocation in joint models via gradient boosting techniques. Mathematics, 11(2), Article 411.
Hägglund, A. E., & Bächmann, A.-C. (2017). Fast lane or down the drain? Does the occupation held prior to unemployment shape the transition back to work? Research in Social Stratification and Mobility, 49, 32-46.
Hickey, G. L., Philipson, P., Jorgensen, A., & Kolamunnage-Dona, R. (2016). Joint modelling of time-to-event and multivariate longitudinal outcomes: Recent developments and issues. BMC Medical Research Methodology, 16, Article 117.
Hogan, J. W., & Laird, N. M. (1997). Mixture model for the joint distribution of repeated measures and event times. Statistics in Medicine, 16(3), 239-257.
Huinink, J., Brüderl, J., Nauck, B., Walper, S., Castiglioni, L., & Feldhaus, M. (2011). Panel analysis of intimate relationships and family dynamics (pairfam): Conceptual framework and design. Zeitschrift für Familienforschung, 23(1), 77-101.
Huss, B., & Pollmann-Schult, M. (2019). Relationship satisfaction across the transition to parenthood: The impact of conflict behavior. Journal of Family Issues, 41(3), 383-411.
Jensen, T. M., Shafer, K., Guo, S., & Larson, J. H. (2016). Differences in relationship stability between individuals in first and second marriages. Journal of Family Issues, 38(3), 406-432.
Kalbfleisch, J. D., & Prentice, R. L. (2002). The statistical analysis of failure time data (Wiley Series in Probability and Statistics, 2^nd ed.). John Wiley & Sons.
Karney, B. R., & Bradbury, T. N. (1995). The longitudinal course of marital quality and stability: A review of theory, methods, and research. Psychological Bulletin, 118(1), 3-34.
Kingsley, M. (2018). The influence of income and work hours on first birth for Australian women. Journal of Population Research, 35(2), 107-129.
Kurz, K., Steinhage, N., & Golsch, K. (2006). Case study Germany: Global competition, uncertainty and the transition to adulthood. In Globalization, uncertainty and youth in society, (pp. 47–78). Routledge.
Leiva-Yamaguchi, V., & Alvares, D. (2020). A two-stage approach for Bayesian joint models of longitudinal and survival data: Correcting bias with informative prior. Entropy, 23(1), Article 50.
Lillard, L. A. (1993). Simultaneous equations for hazards. Journal of Econometrics, 56(1–2), 189-217.
Lorber, M. F., Erlanger, A. C. E., Heyman, R. E., & O’Leary, K. D. (2014). The honeymoon effect: Does it exist and can it be predicted? Prevention Science, 16(4), 550-559.
Mikolai, J., & Kulu, H. (2018). Divorce, separation, and housing changes: A multiprocess analysis of longitudinal data from England and Wales. Demography, 55(1), 83-106.
Molenberghs, G., & Verbeke, G. (2001). A review on linear mixed models for longitudinal data, possibly subject to dropout. Statistical Modelling, 1(4), 235-269.
Núñez, J., Núñez, E., Rizopoulos, D., Miñana, G., Bodí, V., Bondanza, L., Husser, O., Merlos, P., Santas, E., Pascual-Figal, D., Chorro, F. J., & Sanchis, J. (2014). Red blood cell distribution width is longitudinally associated with mortality and anemia in heart failure patients. Circulation Journal, 78(2), 410-418.
Papageorgiou, G., Mauff, K., Tomer, A., & Rizopoulos, D. (2019). An overview of joint modeling of time-to-event and longitudinal outcomes. Annual Review of Statistics and Its Application, 6(1), 223-240.
Philipson, P., Sousa, I., Diggle, P. J., Williamson, P., Kolamunnage-Dona, R., Henderson, R., & Hickey, G. L. (2018). joineR: Joint Modelling of Repeated Measurements and Time-to-Event Data. R package Version 1.2.8.
Pinheiro, J., Bates, D., & R Core Team. (2023). nlme: Linear and Nonlinear Mixed Effects Models. R package Version 3.1-164.
Poortman, A.-R. (2005). Women’s work and divorce: A matter of anticipation? A research note. European Sociological Review, 21(3), 301-309.
Potts, S., Rappl, A., Kurz, K., & Bergherr, E. (2025). Bridging the gap: Introducing joint models for longitudinal and time-to-event data in the social sciences [Preregistration]. ArXiv. https://arxiv.org/html/2504.18288v1
Potts, S., Rappl, A., Kurz, K., & Bergherr, E. (2026). Supplementary Materials to “Bridging the gap: Introducing joint models for longitudinal and time-to-event data in the social sciences” [Supplemental procedures and analyses]. PsychOpen GOLD. https://doi.org/10.23668/psycharchives.21789
R Core Team (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing.
Rappl, A., Mayr, A., & Waldmann, E. (2021). More than one way: Exploring the capabilities of different estimation approaches to joint models for longitudinal and time-to-event outcomes. International Journal of Biostatistics, 18(1), 127-149.
Rizopoulos, D. (2010). JM: An R package for the joint modelling of longitudinal and time-to-event data. Journal of Statistical Software, 35(9), 1-33.
Rizopoulos, D. (2012). Joint models for longitudinal and time-to-event data: With applications in R. CRC Press.
Rizopoulos, D., Papageorgiou, G., & Miranda Afonso, P. (2024). JMbayes2: Extended Joint Models for Longitudinal and Time-to-Event Data. R package Version 0.5-0. https://github.com/drizopoulos/JMbayes2.
Rosen-Grandon, J. R., Myers, J. E., & Hattie, J. A. (2004). The relationship between marital characteristics, marital interaction processes, and marital satisfaction. Journal of Counseling & Development, 82(1), 58-68.
Ruppanner, L., Brandén, M., & Turunen, J. (2017). Does unequal housework lead to divorce? Evidence from Sweden. Sociology, 52(1), 75-94.
Skardhamar, T., & Telle, K. (2012). Post-release employment and recidivism in Norway. Journal of Quantitative Criminology, 28(4), 629-649.
Solomon, B. C., & Jackson, J. J. (2014). Why do personality traits predict divorce? Multiple pathways through satisfaction. Journal of Personality and Social Psychology, 106(6), 978-996.
Sweeting, M. J., & Thompson, S. G. (2011). Joint modelling of longitudinal and time-to-event data with application to predicting abdominal aortic aneurysm growth and rupture. Biometrical Journal, 53(5), 750-763.
Templ, M., Meindl, B., Kowarik, A., & Dupriez, O. (2017). Simulation of synthetic complex data: The R package simPop. Journal of Statistical Software, 79(10), 1-38.
Therneau, T. M. (2024). A Package for Survival Analysis in R. R package Version 3.6-4.
Vonesh, E. F., Greene, T., & Schluchter, M. D. (2005). Shared parameter models for the joint analysis of longitudinal data and event times. Statistics in Medicine, 25(1), 143-163.
Williamson, H. C., & Lavner, J. A. (2019). Trajectories of marital satisfaction in diverse newlywed couples. Social Psychological and Personality Science, 11(5), 597-604.
Wu, M. C., & Carroll, R. J. (1988). Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics, 45(3), 939-955.
Wulfsohn, M. S., & Tsiatis, A. A. (1997). A joint model for survival and longitudinal data measured with error. Biometrics, 53(1), 330-339.
Yamaguchi, K. (1991). Event history analysis (Applied Social Research Methods). SAGE Publications.

Appendices

A.1. Model Comparison

A.1.1. Longitudinal Model

Table A.1

Model Comparison Table for Women: Longitudinal (Sub)model for Modelling Marital Satisfaction

	Linear mixed model			Longitudinal submodel
	$lme()$			$JM()$
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
(Intercept)	8.7180	0.1098	0.0000	8.7403	0.1054	0.0000
Time	-2.8256	0.4079	0.0000	-3.0387	0.4023	0.0000
Time²	1.8383	0.5152	0.0004	2.0987	0.5022	0.0000
Relative load of household work	-0.1402	0.0192	0.0000	-0.1369	0.0191	0.0000
Premarital cohabitation^a: yes	-0.0988	0.0861	0.2516	-0.1036	0.0788	0.1885
Age at marriage	-0.1915	0.0387	0.0000	-0.1820	0.0361	0.0000
Preschool child(ren) in hh^a: yes	-0.0599	0.0418	0.1517	-0.0693	0.0412	0.0924
Number of children in hh^b: 1	-0.2696	0.0669	0.0001	-0.2520	0.0648	0.0001
Number of children in hh^b: 2	-0.2752	0.0763	0.0003	-0.2594	0.0729	0.0004
Number of children in hh^b: more	-0.2579	0.0970	0.0079	-0.2182	0.0921	0.0178
Years of education	0.0868	0.0337	0.0100	0.0672	0.0310	0.0299
Personal net income	-0.0469	0.0249	0.0595	-0.0457	0.0248	0.0651
Relationship duration at marriage	0.0328	0.0359	0.3612	0.0183	0.0340	0.5910
Labor force status^c: not working	0.0314	0.0642	0.6245	0.0474	0.0625	0.4479
Labor force status^c: other	-0.0683	0.0696	0.3264	-0.0577	0.0677	0.3941
Labor force status^c: part-time employed	-0.0288	0.0570	0.6128	-0.0170	0.0555	0.7594
Gender role attitudes	-0.0049	0.0222	0.8265	-0.0082	0.0219	0.7069

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.

Table A.2

Model Comparison Table for Men: Longitudinal (Sub)model for Modelling Marital Satisfaction

	Linear mixed model			Longitudinal submodel
	$lme()$			$JM()$
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
(Intercept)	8.9591	0.1221	0.0000	9.0393	0.1233	0.0000
Time	-2.7153	0.4870	0.0000	-2.8529	0.4976	0.0000
Time²	1.5466	0.6785	0.0227	1.5018	0.7095	0.0343
Relative load of household work	-0.0682	0.0267	0.0106	-0.0603	0.0267	0.0237
Premarital cohabitation^a: yes	-0.1696	0.1037	0.1019	-0.2385	0.1052	0.0234
Age at marriage	-0.1569	0.0433	0.0003	-0.1686	0.0417	0.0001
Preschool child(ren) in hh^a: yes	0.0273	0.0501	0.5858	0.0249	0.0507	0.6238
Number of children in hh^b: 1	-0.3394	0.0785	0.0000	-0.3302	0.0790	0.0000
Number of children in hh^b: 2	-0.4088	0.0889	0.0000	-0.3910	0.0901	0.0000
Number of children in hh^b: more	-0.3674	0.1138	0.0012	-0.3562	0.1152	0.0020
Years of education	0.0351	0.0381	0.3570	0.0341	0.0393	0.3864
Personal net income	-0.0021	0.0192	0.9137	0.0017	0.0192	0.9307
Relationship duration at marriage	-0.0118	0.0383	0.7592	-0.0015	0.0420	0.9708
Labor force status^c: not working	-0.2637	0.0970	0.0066	-0.2574	0.0977	0.0085
Labor force status^c: other	-0.3112	0.0894	0.0005	-0.3296	0.0927	0.0004
Labor force status^c: part-time employed	-0.2695	0.1233	0.0289	-0.2288	0.1191	0.0548
Gender role attitudes	0.0253	0.0260	0.3315	0.0281	0.0263	0.2846

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.
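As a reading aid for the Time and Time² rows in Tables A.1 and A.2, the following Python sketch (independent of the R code used for estimation) evaluates the fixed-effect time trend and locates the point at which the quadratic trend stops decreasing. The helper names `time_trend` and `turning_point` are hypothetical, and the time unit is whatever scaling the model uses, which this appendix does not state; whether the turning point lies inside the observed follow-up cannot be judged from the tables alone.

```python
# Fixed-effect estimates for Time and Time² copied from Tables A.1 and A.2.
# Keys: (sample, fitting function); values: (linear, quadratic) coefficient.
TIME_EFFECTS = {
    ("women", "lme"): (-2.8256, 1.8383),
    ("women", "JM"): (-3.0387, 2.0987),
    ("men", "lme"): (-2.7153, 1.5466),
    ("men", "JM"): (-2.8529, 1.5018),
}


def time_trend(t, linear, quadratic):
    """Contribution of the two time terms to expected satisfaction at time t."""
    return linear * t + quadratic * t ** 2


def turning_point(linear, quadratic):
    """Vertex of the parabola: the time at which the trend stops decreasing."""
    return -linear / (2 * quadratic)


for (sample, model), (linear, quadratic) in TIME_EFFECTS.items():
    t_star = turning_point(linear, quadratic)
    drop = time_trend(t_star, linear, quadratic)
    print(f"{sample:5s} {model:3s} turning point at t = {t_star:.3f}, "
          f"trend value there = {drop:.3f}")
```

All other covariates are held fixed in this calculation, so the sketch describes only the shape implied by the two time coefficients, not a full predicted trajectory.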

A.1.2. Time-to-Event Model

Table A.3

Model Comparison Table for Women: Time-to-Event (Sub)model for Modelling the Risk of Marriage Dissolution

	Time-varying covariate			Two-stage model			Time-to-event submodel
	$coxph()$			$lme()$ and $coxph()$			$JM()$
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
Years of education	-0.2420	0.0968	0.0090	-0.2346	0.0957	0.0097	-0.2228	0.0969	0.0214
Age at marriage	-0.0788	0.1035	0.4418	-0.1413	0.1060	0.1909	-0.1546	0.1070	0.1483
Preschool child(ren) in hh^a: yes	-0.3780	0.2096	0.0773	-0.3620	0.2115	0.0975	-0.3286	0.2121	0.1213
Number of children in hh^b: 1	0.0463	0.2931	0.8713	0.0451	0.2931	0.8734	0.0893	0.2969	0.7637
Number of children in hh^b: 2	0.1254	0.3017	0.6803	0.0411	0.3046	0.8939	0.0508	0.3082	0.8691
Number of children in hh^b: more	0.1319	0.3576	0.7140	0.0804	0.3619	0.8258	0.0799	0.3649	0.8266
Relative load of household work	0.1451	0.0771	0.0713	0.0715	0.0796	0.3968	0.1089	0.0793	0.1694
Premarital cohabitation^a: yes	0.0526	0.2197	0.8182	0.1019	0.2189	0.6586	0.1297	0.2225	0.5599
Personal net income	-0.0283	0.1470	0.7439	-0.0328	0.1600	0.7509	-0.0230	0.1462	0.8750
Relationship duration at marriage	-0.2673	0.1025	0.0080	-0.2512	0.1035	0.0149	-0.2570	0.1047	0.0140
Labor force status^c: not working	-0.3843	0.3039	0.2014	-0.3509	0.3085	0.2475	-0.3514	0.3051	0.2494
Labor force status^c: other	-0.0363	0.2820	0.8983	-0.0545	0.2833	0.8491	-0.0238	0.2830	0.9329
Labor force status^c: part-time employed	-0.1260	0.2403	0.6037	-0.1647	0.2409	0.4961	-0.1381	0.2422	0.5685
Gender role attitudes	0.0848	0.0887	0.3245	0.0876	0.0892	0.3214	0.0819	0.0896	0.3606
Satisfaction	-0.3083	0.0253	0.0000	-0.5347	0.0530	0.0000	-0.5552	0.0551	0.0000

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.

Table A.4

Model Comparison Table for Men: Time-to-Event (Sub)model for Modelling the Risk of Marriage Dissolution

	Time-varying covariate			Two-stage model			Time-to-event submodel
	$coxph()$			$lme()$ and $coxph()$			$JM()$
Variable	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value	Estimate	Std. err.	p-value
Years of education	-0.1012	0.1228	0.4061	-0.1292	0.1234	0.2961	-0.1266	0.1232	0.3043
Age at marriage	0.1366	0.1384	0.2970	0.1255	0.1378	0.3418	0.1291	0.1383	0.3508
Preschool child(ren) in hh^a: yes	0.0164	0.2809	0.9573	0.0064	0.2794	0.9832	0.0196	0.2806	0.9442
Number of children in hh^b: 1	-0.1469	0.4026	0.7291	-0.1767	0.4011	0.6667	-0.1158	0.4041	0.7744
Number of children in hh^b: 2	0.3322	0.3800	0.3978	0.2857	0.3781	0.4570	0.3430	0.3825	0.3699
Number of children in hh^b: more	0.3797	0.4603	0.4021	0.3476	0.4585	0.4335	0.3927	0.4633	0.3967
Relative load of household work	0.2502	0.1294	0.0206	0.2305	0.1311	0.0343	0.2413	0.1312	0.0660
Premarital cohabitation^a: yes	-0.3325	0.3001	0.2974	-0.3453	0.3003	0.2784	-0.3357	0.3020	0.2663
Personal net income	-0.0552	0.1679	0.7828	-0.0710	0.1762	0.7462	-0.0481	0.1647	0.7703
Relationship duration at marriage	-0.0722	0.1194	0.6172	-0.0704	0.1201	0.6286	-0.0709	0.1208	0.5573
Labor force status^c: not working	-0.1215	0.4968	0.8130	-0.0195	0.4982	0.9697	0.0613	0.4931	0.9012
Labor force status^c: other	-0.0453	0.3583	0.8939	-0.0528	0.3587	0.8764	-0.0242	0.3599	0.9465
Labor force status^c: part-time employed	-1.3864	1.0184	0.1806	-1.2501	1.0177	0.2260	-1.1292	1.0137	0.2653
Gender role attitudes	0.1173	0.1167	0.3469	0.1288	0.1162	0.2985	0.1294	0.1162	0.2654
Satisfaction	-0.2798	0.0351	0.0000	-0.4240	0.0708	0.0000	-0.4534	0.0702	0.0000

Note. Reference categories: ^a no, ^b zero, ^c full-time employed.
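The satisfaction coefficients in Tables A.3 and A.4 are reported on the log-hazard scale. As a reading aid, the following Python sketch (independent of the R code used for estimation) exponentiates them into hazard ratios and adds Wald-type 95% intervals. The normal approximation and the helper names `hazard_ratio` and `wald_interval` are assumptions made for illustration and are not part of the original analysis.

```python
import math

# (estimate, std. err.) of the Satisfaction effect, copied from Tables A.3 and A.4.
SATISFACTION = {
    "women": {
        "time-varying covariate": (-0.3083, 0.0253),
        "two-stage model": (-0.5347, 0.0530),
        "joint model": (-0.5552, 0.0551),
    },
    "men": {
        "time-varying covariate": (-0.2798, 0.0351),
        "two-stage model": (-0.4240, 0.0708),
        "joint model": (-0.4534, 0.0702),
    },
}


def hazard_ratio(estimate):
    """Convert a log-hazard coefficient into a hazard ratio."""
    return math.exp(estimate)


def wald_interval(estimate, std_err, z=1.96):
    """Approximate 95% interval on the hazard-ratio scale (normal approximation)."""
    return math.exp(estimate - z * std_err), math.exp(estimate + z * std_err)


for sample, models in SATISFACTION.items():
    for model, (estimate, std_err) in models.items():
        lower, upper = wald_interval(estimate, std_err)
        print(f"{sample:5s} {model:22s} HR = {hazard_ratio(estimate):.3f} "
              f"[{lower:.3f}, {upper:.3f}]")
```

Under these assumptions, a one-unit higher satisfaction value corresponds to a hazard ratio of roughly 0.73 for women in the time-varying covariate model and roughly 0.57 in the joint model, so the estimated association lies closer to the null when satisfaction is entered as an ordinary time-varying covariate.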

A.2. Predictive Performance

Figure A.1

Predictive Performance

Note. Upper panel: women; lower panel: men. Left: mean squared error (MSE) of the three modelling approaches by time since last measurement. Right: difference in MSE between the joint model and the Cox model, and between the joint model and the two-stage model, by time since last measurement.

A.3. Different Association Structures

Different association structures can be specified, for example, as follows:

Current value:
$h(t \mid M_i(t), x_i) = h_0(t) \exp[\gamma^T x_{i,\text{surv}} + \alpha m_i(t)]$
Current value and current slope:
$h(t \mid M_i(t), x_i) = h_0(t) \exp[\gamma^T x_{i,\text{surv}} + \alpha_1 m_i(t) + \alpha_2 m'_i(t)]$
Cumulative effect (area under longitudinal trajectory):
$h(t \mid M_i(t), x_i) = h_0(t) \exp[\gamma^T x_{i,\text{surv}} + \alpha \int_0^t m_i(s)\,ds]$
Lagged effect, where $c$ defines the desired time lag:
$h(t \mid M_i(t), x_i) = h_0(t) \exp[\gamma^T x_{i,\text{surv}} + \alpha m_i\{\max(t - c, 0)\}]$

and result in the following AIC values:

Table A.5

AIC for Joint Models With Different Association Structures

	Women	Men
Current value	66934.06	45255.43
Current slope	67004.06	45277.44
Current value + current slope	66933.88	45258.63
Cumulative	66956.5	45267.12
Current value with lag (one month)	66934.61	45253.71
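
To make the comparison in Table A.5 easier to read, the following Python sketch (independent of the R code used for estimation) ranks the association structures by their AIC difference to the best-fitting structure within each sample. The helper name `rank_by_aic` is hypothetical; the values are copied from the table.

```python
# AIC values copied from Table A.5.
AIC = {
    "women": {
        "current value": 66934.06,
        "current slope": 67004.06,
        "current value + current slope": 66933.88,
        "cumulative": 66956.5,
        "current value with lag (one month)": 66934.61,
    },
    "men": {
        "current value": 45255.43,
        "current slope": 45277.44,
        "current value + current slope": 45258.63,
        "cumulative": 45267.12,
        "current value with lag (one month)": 45253.71,
    },
}


def rank_by_aic(values):
    """Return (structure, AIC difference to the best structure), best first."""
    best = min(values.values())
    deltas = ((name, round(aic - best, 2)) for name, aic in values.items())
    return sorted(deltas, key=lambda item: item[1])


for sample, values in AIC.items():
    print(sample)
    for name, delta in rank_by_aic(values):
        print(f"  {name:36s} delta AIC = {delta:6.2f}")
```

For women, the current value, current value plus current slope, and lagged current value structures lie within one AIC point of each other; for men, the lagged and unlagged current value structures differ by less than two points. By a common rule of thumb, differences this small do not clearly favour one structure over another, whereas the current slope and cumulative structures fit noticeably worse in both samples.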

A.4. Dynamic Prediction

Figure A.2

Predicted Survival Probability for a Person Updating her Marital Satisfaction Trajectory

Note. Covariate values: woman, part-time working, 2 children, at least one preschool child, premarital cohabitation, median values (for women) for the other covariates.
