A persistent theme on the research agenda of social scientists is the consequences of social status mobility, especially mobility across generations. As the primary embodiment of status inconsistency, variation or continuity in social status from one generation to the next relates to a wide range of social outcomes, including but not limited to fertility (Kasarda & Billy, 1985), subjective wellbeing (Zang & de Graaf, 2016), and political behavior (Weakliem, 1992).
In empirical settings, scholars typically aim to estimate the social mobility effect net of the effects of social origin and destination (Hendrickx, De Graaf, Lammers, & Ultee, 1993). This task, however, can be challenging. Because the operationalization of social mobility is often a mathematical function of origin and destination, estimation of the net mobility effect can fail. This is essentially a problem of model identification, by which we mean that not all of the effects of origin, destination, and mobility can be uniquely ascertained (Blalock, 1966). The problem was noted in mobility research as early as the 1960s, when Blalock (1966) showed that the coefficients of origin (O hereafter), destination (D hereafter), and their difference (O-D hereafter) cannot all be uniquely identified in a linear or generalized linear model (depending on the properties of the outcome variable) used to explain some specific outcome of mobility.
In this light, the models conventionally used to reveal the net mobility effect, namely the square additive model (SAM hereafter) proposed by Duncan (1966), the diamond model (DM hereafter) proposed by Hope (1971, 1975), and the diagonal reference model (DRM hereafter) proposed by Sobel (1981, 1985), can be seen as various efforts to resolve this identification problem. A common feature of these models, as shown below, is that they mathematically transform at least one of the operationalizations of O, D, and mobility so as to ensure a full-rank design matrix for a linear or generalized linear model. This identification strategy is statistically workable, but one shortcoming is that it relies heavily on “mechanical” transformations of the measures of O, D, or mobility. As a result, the analytical results sometimes fall short of substantive and theoretical interpretability. This paper draws on the rising interest in mechanism-based model identification and presents an identification framework based on the front-door criterion in the causal inference literature (Winship & Harding, 2008). Instead of relying on omnibus mathematical operations, this approach is theory-directed: it specifies and controls for the meaningful mediators through which O, D, or social mobility affects the outcome variable.
The mechanism-based identification strategy is not new to empirical researchers. Its underlying rationale is closely related to the well-known proxy variable methods, and it has been proposed as a way to handle other “multiple clock” problems (e.g., age-period-cohort [APC] modeling; Bijlsma, Daniel, Janssen, & De Stavola, 2017; Winship & Harding, 2008). To the best of our knowledge, however, this study is the first to systematically discuss the mechanism-based identification framework in social mobility research.
The Identification Problem and Existing Strategies
The identification problem concerns the question of “whether or not there are too many unknowns for solution” for a given model configuration (Blalock, 1966, p. 52). A frequently encountered unidentifiable case is the multicollinearity problem, where some predictors are functions of the others so that their independent effects are not estimable. As is well known, this problem arises because the design matrix of these covariates is not of full rank. In research on the net mobility effect, this rank deficiency is exactly the cause of the identification challenge.
Specifically, social mobility has two basic facets: direction (upward, downward, or static) and distance (the number of steps moved on the status ladder). A handy way to construct such a measure is to compute the difference between O and D. For example, if both O and D refer to occupational categories with three levels (1 = high, 2 = middle, 3 = low), their difference O-D ranges from −2 to 2. Positive, negative, and zero values correspond to upward, downward, and static (no) mobility, respectively. The absolute value gives the number of steps moved; for instance, a move from low status to high status spans two steps and yields a value of positive two. Such steps can be understood as the difficulty of mobility (i.e., the number of barriers one has to pass in order to move upward or downward).
Despite its theoretical relevance and simplicity, including O-D along with O and D in a linear or generalized linear model makes the effects unidentifiable, because the design matrix is always one short of full column rank. To solve this problem, several models have been proposed. The first is the SAM. Instead of using O-D, the SAM parameterizes the mobility effect as the product of O and D, that is, the interaction term O*D (Duncan, 1966). However, this approach has been criticized for failing to capture the net mobility effect, because the mobility effect O*D cannot be entirely separated from the main effects of O and D (Hope, 1971, 1975).
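The rank deficiency can be checked numerically. The following is a minimal sketch, assuming linear scores for hypothetical three-level O and D variables; the variable names and the simulated data are illustrative and are not taken from the original analysis. It compares the column rank of the generic design matrix (O, D, and O-D) with that of the SAM design matrix (O, D, and O*D).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical linear scores for origin and destination (1, 2, 3).
O = rng.integers(1, 4, size=n)
D = rng.integers(1, 4, size=n)
const = np.ones(n)

# Generic model: intercept, O, D, and the difference O - D.
X_generic = np.column_stack([const, O, D, O - D])
# SAM: the difference is replaced by the product O * D.
X_sam = np.column_stack([const, O, D, O * D])

rank_generic = np.linalg.matrix_rank(X_generic)
rank_sam = np.linalg.matrix_rank(X_sam)

print("generic:", X_generic.shape[1], "columns, rank", rank_generic)
print("SAM:    ", X_sam.shape[1], "columns, rank", rank_sam)
```

Under these assumptions the generic matrix has four columns but rank three, whereas the SAM matrix attains full column rank.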
To preserve the logical measure of O-D, Hope (1971, 1975) proposes the DM, which uses a common dimension of social status that does not distinguish between origin and destination. This common status measure can be parameterized as the sum of O and D (House, 1978). This algebraic operation reduces the number of predictors to two (one for O+D and the other for O-D), so the design matrix now has full column rank. However, an overall status dimension has been called into question because it conflicts with the multidimensional conceptualization of social status (e.g., Weber, 1978). Moreover, without controlling for O and D separately, the O-D term will always appear to have some effect whenever O and D have different effects on the outcome, even if no mobility effect actually exists (House, 1978).
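Both properties of the DM can be illustrated with simulated data. This is a sketch under assumed values rather than a reproduction of any published analysis: O and D are hypothetical linear scores, and the outcome is generated with effects of 1 for O and 2 for D and with no mobility effect at all.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical linear scores for origin and destination (1, 2, 3).
O = rng.integers(1, 4, size=n)
D = rng.integers(1, 4, size=n)

# Assumed outcome: O and D have different effects, mobility has none.
Y = 1.0 * O + 2.0 * D + rng.normal(size=n)

# DM design matrix: intercept, common status O + D, and difference O - D.
X_dm = np.column_stack([np.ones(n), O + D, O - D])
rank_dm = np.linalg.matrix_rank(X_dm)

coef, *_ = np.linalg.lstsq(X_dm, Y, rcond=None)
coef_diff = coef[2]

print("rank of DM design matrix:", rank_dm)
print("coefficient on O - D:", round(float(coef_diff), 3))
```

The design matrix is of full rank, but because O + 2D = 1.5(O+D) − 0.5(O-D), the fitted coefficient on O-D is close to −0.5 even though no mobility effect was built into the data. This is the spurious effect described by House (1978).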
Perhaps the most widely used model in the current literature on social mobility is the DRM (Sobel, 1981, 1985). A conventional interpretation of this model is that the net mobility effect is estimated with reference to the diagonal cells of a mobility table. Specifically, using Sobel’s notation, the outcome Y of individual k with origin i and destination j is parameterized as Yijk = pμii+(1-p)μjj+γ(O-D)ijk+εijk, where μii and μjj are the population means in the (i, i)th and (j, j)th cells of the mobility table, εijk is the random error, and the mobility effect is captured by γ. The DRM makes sense because those who are immobile provide the status cues to which the mobile acculturate. To see how the DRM solves the identification problem, we turn to another DRM parameterization proposed by Sobel (Model 3.4 in Sobel, 1981, p. 898), which is
Simply switching the order of p and μii as well as the order of 1-p and μjj, Model (1) is equal to Model (2):
In this model, note that neither μii nor μjj is treated as data inputs. Instead, they are statistics that should be estimated from the data (but the sample estimators are not the sample means of the outcome in the diagonal cells of the mobility table, as noted by Sobel, 1981). Seeing μii and μjj to be some unknown coefficients, it is immediately clear that, relative to the unidentifiable generic model with the predictors of O, D, and O-D, the DRM introduces a new unknown coefficient p. Let and = , the design matrix is then transformed into [ ], where is the column vector of s, is the column vector of s, and is the column vector of (O-D)s. Since , the identification problem is circumvented.
In summary, from the perspective of model identification, the three existing models for estimating the net mobility effect provide different ways to ensure a full-rank design matrix. One common feature of the SAM, DM, and DRM is that a mathematical operation is imposed on either the main effects of O and D (the DM and DRM) or the measure of mobility (the SAM). This is a straightforward way of identifying statistical models, but it is mechanical and often lacks theoretical reasoning and justification. In the following two sections, we present a mechanism-based identification framework that takes an entirely different approach from the three existing models. Before showing the details, it is necessary to familiarize readers with the directed acyclic graph (DAG).
The Directed Acyclic Graph: An Overview
The mechanism-based identification of the net mobility effect is mostly based on the so-called front-door criterion in the DAG literature (Pearl, 1995, 2009). The DAG is a graphical representational system that can be used to show the interrelationships between variables. Two fundamental criteria for effect identification have been proposed by Pearl (1995): the back-door criterion and the front-door criterion.
The back-door criterion, intuitively, requires controlling for all of the confounders C that determine the values of both the predictor X and the outcome Y. This is tantamount to cutting off all of the potential confounding paths between X and Y. For example, in Figure 1A, the causal effect of X on Y cannot be identified unless the confounding path X←C→Y is blocked by fixing C. Following the conventional notation, we mark a statistically controlled variable with a square, written here as brackets, so the blocked path is X←[C]→Y. The front-door criterion shifts attention to the mediators that bridge X and Y. If all of the connections between X and Y go through the mediator M, then the causal effect of X on Y is estimable as the product of the effect of X on M and the effect of M on Y. Of course, the X-M and M-Y effects should be estimated without confounding, which might call for the back-door criterion. This way of computing the effect is also known as the path-tracing rule. However, readers should note that this rule may not apply in nonlinear models, where Monte Carlo simulation can be used instead (Bijlsma et al., 2017). In Figure 1B, if the link between X and Y is fully mediated by M, the causal effect of X on Y would be a*b. When estimating a, we do not need to control for any variable because the confounding path X←C→Y←M is automatically blocked by the collider Y. However, when estimating b, we may have to control for X or C to make sure that the confounding path M←X←C→Y is closed.
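The front-door logic in the linear case can be sketched with a small simulation. The data-generating values below (a = 0.5, b = 2, and a confounder effect of 1.5) are assumptions chosen for illustration, and the helper `ols` is a hypothetical convenience function, not part of any cited method. The sketch estimates a from a regression of M on X, estimates b from a regression of Y on M while controlling for X, and compares the product with the naive regression of Y on X.

```python
import numpy as np

def ols(y, *columns):
    """Return OLS coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(2)
n = 50_000
a_true, b_true = 0.5, 2.0            # assumed path coefficients; a * b = 1

C = rng.normal(size=n)               # unobserved confounder of X and Y
X = C + rng.normal(size=n)           # predictor
M = a_true * X + rng.normal(size=n)  # mediator carrying all of X's effect
Y = b_true * M + 1.5 * C + rng.normal(size=n)

naive = ols(Y, X)[1]                 # confounded: C is not controlled
a_hat = ols(M, X)[1]                 # X -> M, no control needed
b_hat = ols(Y, M, X)[1]              # M -> Y, controlling for X
front_door = a_hat * b_hat

print("naive estimate:     ", round(float(naive), 3))
print("front-door estimate:", round(float(front_door), 3))
```

Under these assumptions the naive coefficient absorbs part of the effect of C, while the product a*b stays close to the assumed value of one without C ever being observed.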
The DAG provides a handy analytical tool for examining and blocking confounding paths, as in Figure 2. In Figure 2A, both O and D are related to D-O, so the mobility effect faces three confounding paths: D-O←D→Y, D-O←O→Y, and D-O←D←O→Y. However, because of the multicollinearity problem, we cannot control for O and D simultaneously, so at least one confounding path remains open. If we change the measure of mobility (e.g., the SAM), we can control for both O and D, as in Figure 2B. In this case, all confounding paths are blocked: D*O←[D]→Y, D*O←[O]→Y, and D*O←[D]←[O]→Y, where brackets mark the controlled variables. In Figure 2C, the DM uses a common measure of status O+D, so after fixing this term, we can cut off the confounding path D-O←[O+D]→Y. This is also the case for the DRM in Figure 2D, where the three confounding paths are disabled through statistical control of the origin and destination terms.
In the following section, we use the DAG, especially the front-door criterion, to present the mechanism-based identification framework for estimating the net mobility effect.
Scenarios of the Mechanism-Based Identification
Mechanism-based identification rests on introducing mediators that fully bridge the link between a specific predictor and the outcome. This approach is desirable because it encourages a more nuanced reflection on “how” the effects of social mobility come about. In a sense, mechanism-oriented research is not new to social scientists, who have long been interested in the causal chain from one variable to another (e.g., structural equation modelling as in Blau & Duncan, 1967; see also Kelley, 1973), and this is also true of the literature on the consequences of social mobility. For instance, an early review article on the association between social mobility and fertility by Kasarda and Billy (1985) already called for more scholarly attention to the “intermediate variables.” As shown below, introducing the “intermediate variables” into the analysis not only enriches theoretical arguments but also provides a workable way to identify the net mobility effect. Specifically, there are four research scenarios, as shown in Figure 3.
Independent Mechanism
By independent mechanism, we mean that at least one of the effects of O and D on Y is fully mediated. Figure 3A illustrates the situations of full mediation of only O, only D, and both. When only O is fully mediated by Mo, we can control for D and Mo to ensure an unbiased estimate of the net mobility effect, as the confounding paths D-O←[D]→Y, D-O←O→[Mo]→Y, and D-O←D←O→[Mo]→Y are all blocked (brackets mark the controlled variables). Similarly, when only D is fully mediated by Md, the identification strategy requires fixing O and Md, as in D-O←D→[Md]→Y, D-O←[O]→Y, and D-O←D←[O]→Y. Finally, when O and D each have their own full mediators, the net mobility effect can be estimated by controlling for Mo and Md, which blocks the confounding paths D-O←O→[Mo]→Y, D-O←D→[Md]→Y, and D-O←D←O→[Mo]→Y.
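The case in which only O is fully mediated can be sketched as follows. All numerical values (the origin distribution, the 0.7 probability of staying in the origin category, and the coefficients, including an assumed net mobility effect of 1) are hypothetical choices for illustration and do not reproduce the authors' data-generating process. The sketch checks that a model with D-O, O, and D is rank deficient, whereas replacing O with its mediator Mo restores full rank and recovers the assumed mobility effect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# Hypothetical three-level origin and destination (1 = low, 3 = high).
O = rng.choice([1, 2, 3], size=n, p=[0.3, 0.6, 0.1])
stay = rng.random(n) < 0.7
D = np.where(stay, O, rng.choice([1, 2, 3], size=n))
mobility = D - O

# Assumed mechanism: O affects Y only through the mediator Mo.
Mo = 0.8 * O + rng.normal(size=n)
Y = 1.0 * mobility + 0.7 * Mo + 0.5 * D + rng.normal(size=n)

const = np.ones(n)
X_generic = np.column_stack([const, mobility, O, D])   # not identified
X_mech = np.column_stack([const, mobility, D, Mo])     # control D and Mo

rank_generic = np.linalg.matrix_rank(X_generic)
rank_mech = np.linalg.matrix_rank(X_mech)

coef, *_ = np.linalg.lstsq(X_mech, Y, rcond=None)
gamma_hat = coef[1]

print("rank with O and D:", rank_generic, "of", X_generic.shape[1])
print("rank with D and Mo:", rank_mech, "of", X_mech.shape[1])
print("estimated net mobility effect:", round(float(gamma_hat), 3))
```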
Joint Mechanism
Joint mechanism means that O and D share the full mediator Mod. That is to say, the origin and the destination status affect the outcome variable through the same set of intermediate variables. This is shown in Figure 3B. In this scenario, identifying the net mobility effect calls for blocking three potential confounding paths by controlling for Mod, as in D-O←O→[Mod]→Y, D-O←D→[Mod]→Y, and D-O←D←O→[Mod]→Y.
Partial Mechanism
Partial mechanism refers to the situation where some mediators are missing or unobserved. This is a practical concern, since scholars might not have access to all of the mediators for a particular predictor. To illustrate this case, we add a new unobserved mediator Uo for O in Figure 3C. Clearly, controlling for D and Mo alone blocks only the paths D-O←[D]→Y, D-O←O→[Mo]→Y, D-O←D←O→[Mo]→Y, and D-O←[D]←O→Uo→Y, and leaves D-O←O→Uo→Y unblocked.
This partial mechanism is problematic because it violates the front-door requirement that all mediators be taken into account. One possible way out is to introduce mediators for the mobility measure itself. Admittedly, the set of mediators for the mobility variable must also be complete, but the approach is still worth considering, for at least two reasons. First, relative to the main effects of O and D, the process of status mobility is more specific and better defined, so it is relatively easier for researchers to identify its intermediate variables (e.g., Kasarda & Billy, 1985). Second, if the research objective is to estimate the net mobility effect, the discussion here suggests that as long as sufficient mediators can be identified for at least one predictor among O, D, and mobility, we can obtain an unbiased estimate of the net mobility effect. This gives empirical researchers more leeway in identification.
With this new mediator Md-o, the net mobility effect can be estimated as the product of a and b. Hence, the question is now how to estimate a and b without bias. For a, we can directly regress Md-o on D-O, and the regression coefficient of D-O would be an unbiased estimate of a. That is because all of the confounding paths from D-O to Md-o are blocked by the collider Y, as in D-O←D→Y←Md-o, D-O←O→Mo→Y←Md-o, D-O←O→Uo→Y←Md-o, D-O←D←O→Mo→Y←Md-o, and D-O←D←O→Uo→Y←Md-o. The estimation of b is more complicated because no collider can be used. However, as long as we control for D-O, all confounding paths are disabled, as in Md-o←[D-O]←D→Y, Md-o←[D-O]←O→Mo→Y, Md-o←[D-O]←O→Uo→Y, Md-o←[D-O]←D←O→Mo→Y, and Md-o←[D-O]←D←O→Uo→Y, where brackets mark the controlled variable. The product of the estimates of a and b provides the point estimate of the net mobility effect.
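The partial mechanism argument can be sketched in a linear setting. Every numerical value below is an assumption made for illustration (a = 0.5 and b = 2, so the net mobility effect is 1), and the names `Mdo`, `Uo`, and `ols` are hypothetical; the sketch is not the authors' simulation design. It contrasts controlling for D and Mo, which leaves the path through the unobserved Uo open, with the front-door product a*b obtained through the mobility mediator.

```python
import numpy as np

def ols(y, *columns):
    """Return OLS coefficients for y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
n = 50_000
a_true, b_true = 0.5, 2.0   # assumed paths; net mobility effect = 1.0

# Hypothetical three-level origin and destination (1 = low, 3 = high).
O = rng.choice([1, 2, 3], size=n, p=[0.3, 0.6, 0.1])
stay = rng.random(n) < 0.7
D = np.where(stay, O, rng.choice([1, 2, 3], size=n))
mobility = D - O

Mo = 0.8 * O + rng.normal(size=n)              # observed mediator of O
Uo = 0.6 * O + rng.normal(size=n)              # unobserved mediator of O
Mdo = a_true * mobility + rng.normal(size=n)   # mediator of mobility
Y = b_true * Mdo + 0.7 * Mo + 0.9 * Uo + 0.5 * D + rng.normal(size=n)

# Controlling for D and Mo only: the path through Uo stays open.
biased = ols(Y, mobility, D, Mo)[1]

# Front-door route through the mobility mediator.
a_hat = ols(Mdo, mobility)[1]        # no controls needed
b_hat = ols(Y, Mdo, mobility)[1]     # control for mobility
net_effect = a_hat * b_hat

print("controlling for D and Mo:", round(float(biased), 3))
print("front-door product a*b:  ", round(float(net_effect), 3))
```

With these assumed values, the first estimate is pulled well away from one by the omitted path through Uo, while the product a*b stays close to it.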
Intermediate Confounded Mechanism
The intermediate confounded mechanism captures the situation where there exist confounders that link the mediator and the outcome. This makes the estimation of the net mobility effect tricky, as illustrated in the left subfigure of Figure 3D. On the one hand, Mo is the mediator of O, so we should control for it in order to block the paths D-O←O→[Mo]→Y and D-O←D←O→[Mo]→Y (brackets mark the controlled variable). On the other hand, Mo is a collider, so controlling for it would result in so-called endogenous selection bias by opening up the confounding paths D-O←O→[Mo]←Uo→Y and D-O←D←O→[Mo]←Uo→Y (Elwert & Winship, 2014). Of course, we may control only for D so as to block D-O←[D]→Y, D-O←[D]←O→Mo→Y, and D-O←[D]←O→Mo←Uo→Y, but this would not help with the other two confounding paths, D-O←O→Mo→Y and D-O←O→Mo←Uo→Y.
To escape this dilemma, we need to introduce a mediator of D-O. With Md-o, we can estimate the net mobility effect by estimating and multiplying a and b. Again, because of the collider Y, we do not need to control for extra variables when estimating a, since all of the confounding paths D-O←D→Y←Md-o, D-O←O→Mo→Y←Md-o, D-O←O→Mo←Uo→Y←Md-o, D-O←D←O→Mo→Y←Md-o, and D-O←D←O→Mo←Uo→Y←Md-o are already blocked. As for b, we need only fix D-O, which blocks the confounding paths Md-o←[D-O]←D→Y, Md-o←[D-O]←O→Mo→Y, Md-o←[D-O]←O→Mo←Uo→Y, Md-o←[D-O]←D←O→Mo→Y, and Md-o←[D-O]←D←O→Mo←Uo→Y.
A Simulation-Based Example
We use Monte Carlo simulations to illustrate the mechanism-based identification framework. Without loss of generality, both O and D are configured to have three categories (1 = low, 2 = middle, 3 = high). A multinomial distribution is used to allocate 10,000 cases among the three categories of O, with probabilities of 0.3, 0.6, and 0.1. The probabilities of entering the low, middle, and high statuses of D are 0.6, 0.3, and 0.1, respectively, for those from the low status of O; 0.1, 0.8, and 0.1 for those from the middle status of O; and 0.1, 0.3, and 0.6 for those from the high status of O. The measure of mobility is the difference between O and D. One caveat is necessary. We configure O and D to be ordinal variables to be consistent with the current literature on mobility, where the characteristics of the two generations at issue often take the form of an ordinal gradient, such as occupational prestige or income quantiles. In this regard, their difference makes practical sense. Note that the measures of O and D should have comparable scales; otherwise, some standardization is required. Of course, we may focus instead on continuous measures such as income, in which case the difference between O and D would be continuous.
For the case of independent mechanism, the data-generating process configures Mo and Y to be continuous and simulates their values using the following formulas:
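The generation of O and D described above can be sketched as follows. The sample size and probabilities are those stated in the text; the random seed, the variable names, the use of category-wise draws, and the sign convention for mobility (D minus O, so that positive values mean upward moves when 3 = high) are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 10_000
levels = np.array([1, 2, 3])            # 1 = low, 2 = middle, 3 = high

# Marginal distribution of origin.
p_origin = [0.3, 0.6, 0.1]
# Row o gives P(D = low, middle, high | O = o).
transition = np.array([
    [0.6, 0.3, 0.1],   # origin low
    [0.1, 0.8, 0.1],   # origin middle
    [0.1, 0.3, 0.6],   # origin high
])

O = rng.choice(levels, size=n, p=p_origin)

# Draw each destination from the transition row of its origin.
D = np.empty(n, dtype=int)
for o in levels:
    mask = O == o
    D[mask] = rng.choice(levels, size=mask.sum(), p=transition[o - 1])

mobility = D - O   # assumed sign convention: positive = upward

# Observed mobility table (rows: origin, columns: destination).
table = np.array([[np.sum((O == o) & (D == d)) for d in levels]
                  for o in levels])
print(table)
print("share immobile:", round(float(np.mean(mobility == 0)), 3))
```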
When O and D have the same mediator Mod, the data-generating process would follow the formulas of
The partial mechanism scenario requires introducing an unobserved mediator Uo and the mediator for the mobility variable Mmobility into the modelling process. To do so, we use the following data-generating rules:
Lastly, the scenario of intermediate confounded mechanism is simulated as follows:
To simplify our expository simulations, we configure the outcome variable Y to be continuous, which justifies the OLS model. The approach can be straightforwardly extended to discrete outcomes, where different link functions are adopted in the generalized linear model framework (Faraway, 2016).
Standard errors are computed using the bootstrap method (iteration = 500).
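A bootstrap standard error for a product-of-coefficients estimate can be sketched as below. The 500 resamples follow the text; the data-generating values, the case-resampling scheme, the percentile interval, and the helper `product_estimate` are assumptions of this sketch, since the exact bootstrap variant is not specified.

```python
import numpy as np

def product_estimate(mob, med, y):
    """Estimate (mob -> med) times (med -> y, controlling for mob)."""
    ones = np.ones(len(y))
    a = np.linalg.lstsq(np.column_stack([ones, mob]), med, rcond=None)[0][1]
    b = np.linalg.lstsq(np.column_stack([ones, med, mob]), y, rcond=None)[0][1]
    return a * b

rng = np.random.default_rng(5)
n = 2_000

# Hypothetical data: mobility scores, a mobility mediator, and an outcome.
mob = rng.choice([-2, -1, 0, 1, 2], size=n, p=[0.05, 0.15, 0.6, 0.15, 0.05])
med = 0.5 * mob + rng.normal(size=n)
y = 2.0 * med + rng.normal(size=n)

point = product_estimate(mob, med, y)

# Resample cases with replacement and re-estimate the product each time.
n_boot = 500
boot = np.empty(n_boot)
for i in range(n_boot):
    idx = rng.integers(0, n, size=n)
    boot[i] = product_estimate(mob[idx], med[idx], y[idx])

se = boot.std(ddof=1)
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print("point estimate:", round(float(point), 3))
print("bootstrap SE:  ", round(float(se), 3))
print("95% percentile interval:", round(float(ci_low), 3), round(float(ci_high), 3))
```

Resampling whole cases keeps each observation's mobility, mediator, and outcome together, so the sampling variability of both a and b is reflected in the spread of their product.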
The results of the simulations can be found in Figure 4. It is shown that across the four research scenarios, the mechanism-based identification approach can help to estimate the net mobility effect, with the sample mean and the pre-set effect of one lying within the 95% confidence intervals (CI). Therefore, the mechanism-based identification strategy adds a new tool for social scientists who are interested in the consequences of social mobility net of the influences of social origin and destination.
In a sense, the mechanism-based approach might perform better than the conventional modelling approaches. To see this, we use the simulated data of O, D, and O-D to fit the SAM, the DM, and the DRM. Unsurprisingly, the interaction term in the SAM (O*D) contrasts greatly with the effect of the difference measure O-D, with a coefficient of 0.147 (95% CI [−0.277, 0.571]). By comparison, the DM preserves the difference measure O-D, and the 95% CI of its coefficient includes the pre-set value of one (the point estimate is 0.865, with a 95% CI of [0.556, 1.175]). Lastly, the result of the DRM is not statistically significant, and the point estimate is extraordinarily large in magnitude (−112.477). In light of these findings, it seems that, unlike the DM, neither the SAM nor the DRM estimates the net mobility effect in an unbiased fashion.
Social scientists are familiar with mechanism-oriented research (Hedström & Ylikoski, 2010), but thus far, few have applied this line of thinking to the model identification issue in the analysis of the net mobility effect. This article revisits the previously used methods (the SAM, the DM, and the DRM) from the perspective of model identification. Moreover, we draw on the DAG, especially the front-door criterion, to present a mechanism-based identification framework. Four analytical scenarios are elaborated, and the Monte Carlo simulation analyses suggest that the mechanism-based approach works well in revealing the consequences of social mobility net of the influences of social origin and social destination.
However, this mechanism-based identification framework is no panacea. The key to its success lies in whether one is able to collect the variables that fully mediate the relationship between one predictor and the outcome. Since the intergenerational transition of social status can bring about a wide range of changes in one’s life, this is no easy task. For practical researchers, the partial mechanism situation illustrated earlier could be fairly common. In this case, one tool that could be helpful is mediation analysis (e.g., Imai, Keele, Tingley, & Yamamoto, 2011). If the mediators available to researchers fully mediate the relationship between a predictor and the outcome, the estimated direct effect in the mediation analysis, that is, the “residual” effect of the predictor on the outcome net of the mediation, should not be statistically significant. Mediation analysis could also enable scholars to pin down the key mediators. For instance, among multiple mediators, there could be some crucial ones that essentially play the mediating role, with the others serving only as proxies for them. By conducting the mediation analysis, these key mediators can be identified and taken into account when performing the mobility analysis.
Another issue that deserves more discussion is the measurement of social status. Although sociologists traditionally gravitate toward ordinal measures, more recent research has started to shift attention to continuous measures such as income. When examining intergenerational income mobility, the identification problem discussed in this article still exists; that is, the coefficients for parental income, children’s income, and their difference cannot be uniquely estimated. Therefore, the mechanism-based approach proposed in this paper should be informative in that setting as well.
To conclude, we would like to emphasize that the mechanism-based approach matters not only for model identification but also for the elaboration of theories. Without a good understanding of the underlying mechanisms, scholars cannot be sure whether and how social mobility causes a specific outcome, which might further call into question the analytical value of the whole status classification scheme (Weeden & Grusky, 2005).