
Latent Profile Analysis (LPA) is a method for extracting homogeneous clusters characterized by a common response profile. Previous works applying LPA to human value segmentation tend to select a small number of moderately homogeneous clusters based on model selection criteria such as the Akaike information criterion, the Bayesian information criterion, and entropy. The question is whether a small number of clusters is all that can be gleaned from the data. While some studies have carefully compared different statistical model selection criteria, there is currently no established criterion for assessing whether an increased number of clusters generates meaningful theoretical insights. This article examines the content and meaningfulness of the clusters extracted using two algorithms: Variational Bayesian LPA and Maximum Likelihood LPA. For both methods, our results point towards eight as the optimal number of clusters for characterizing distinctive Schwartz value typologies that generate meaningful insights and predict several external variables.

Latent Class Analysis (LCA;

The purpose of LPA is to reduce the data to a reasonable number of classes that are meaningful both in terms of statistical saliency and theoretical interpretability. In our work, we define the term “optimal number of clusters” as a number that is both statistically justifiable and substantively meaningful.

LCA and LPA are often used by psychologists and sociologists to classify individuals into specific personality types (

The majority of these previous works use ML-LPA, which estimates “conditional means and variances of the continuous indicators” (

The Schwartz value theory has been widely applied to studies in various disciplines, not only in psychology and sociology (

In this study we analyzed the secondary data compiled from the 8th round of the ESS (

Assuming that the dataset is organized in a way that the observations (i.e., respondents of the ESS8) are the rows and the indicators (i.e., Schwartz’ value indicators) are the columns of a matrix containing the CFA scores for each respondent, VB-LPA aims to group the rows to form homogeneous blocks. VB-LPA is specified according to the following generative model:

The model posits that there exists

Benefits of VB include the ability to explicitly monitor convergence, as well as ease of interpretation, since uncertainty is quantified by the inferred factorized distribution. However, the assumed factorized distribution is only an (optimistic) approximation of the posterior uncertainty, and the optimization is prone to local minima.
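As a generic illustration of this kind of variational mixture inference (a sketch, not the paper's own VB-LPA implementation), scikit-learn's `BayesianGaussianMixture` fits a Gaussian mixture by variational Bayes on a data matrix organized as described above, with respondents as rows and indicator scores as columns; the data here are simulated for illustration only:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Hypothetical data: rows = respondents, columns = value indicator (CFA) scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))

# Variational Bayesian Gaussian mixture as a generic stand-in for VB-LPA;
# n_components is an upper bound, and superfluous components are shrunk away.
vb = BayesianGaussianMixture(n_components=8, covariance_type="diag",
                             max_iter=500, random_state=0).fit(X)

labels = vb.predict(X)      # hard cluster assignments
resp = vb.predict_proba(X)  # posterior responsibilities (assignment uncertainty)
elbo = vb.lower_bound_      # evidence lower bound at convergence (per sample)
```

The responsibilities `resp` retain the assignment uncertainty that the VB approach makes explicit, and monitoring the lower bound gives the convergence check mentioned above.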

Using the Mplus software implementation (

Using our own software implementation (a Python implementation of VB-LPA is available in the

To evaluate the extracted clusters, we consider several generic metrics.

When the number of clusters is low, we expect the extracted profiles to be clearly distinct, but as the number of clusters increases, we expect the extracted profiles to gradually become more similar. To assess this, we examined the maximum Pearson correlation coefficient between cluster profiles: the correlation coefficient equals 1 when two cluster profiles are identical up to scaling and translation.
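The criterion can be sketched as follows (a minimal illustration with made-up profile vectors, not the study's data):

```python
import numpy as np

def max_profile_correlation(profiles):
    """Largest pairwise Pearson correlation between cluster mean profiles.

    profiles: (K, D) array, one row of indicator means per cluster.
    A value near 1 signals two profiles identical up to scaling/translation.
    """
    corr = np.corrcoef(profiles)          # K x K correlation matrix
    iu = np.triu_indices_from(corr, k=1)  # off-diagonal upper triangle
    return corr[iu].max()

profiles = np.array([[1.0, 0.5, -0.5],
                     [2.1, 1.1, -0.9],   # exactly 2 * row 0 + 0.1
                     [-1.0, 0.2, 0.8]])
m = max_profile_correlation(profiles)    # 1.0: rows 0 and 1 are redundant
```

Because the second profile is a scaled, shifted copy of the first, the maximum correlation reaches 1, flagging the two clusters as redundant in shape.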

In general, a clustering solution is good when the observations in each cluster are very similar while the clusters are very dissimilar. To assess this we examined the average within-cluster variance
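Under the usual decomposition of total variance into within- and between-cluster parts, this criterion can be sketched as follows (a generic implementation with toy data, assuming the per-observation averaging convention; the study's exact normalization may differ):

```python
import numpy as np

def within_between_variance(X, labels):
    """Average within-cluster variance and between-cluster variance of the means.

    Good solutions have low within-cluster variance (homogeneous clusters)
    and high between-cluster variance (well-separated clusters).
    """
    overall = X.mean(axis=0)
    within, between, n = 0.0, 0.0, len(X)
    for k in np.unique(labels):
        Xk = X[labels == k]
        mu_k = Xk.mean(axis=0)
        within += ((Xk - mu_k) ** 2).sum()                 # spread around own mean
        between += len(Xk) * ((mu_k - overall) ** 2).sum() # spread of cluster means
    return within / n, between / n

# Two tight, well-separated toy clusters:
X = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]])
labels = np.array([0, 0, 1, 1])
w, b = within_between_variance(X, labels)  # small w, large b
```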

The Davies-Bouldin (DB) score is a measure of the trade-off between the within and between cluster variance, defined as the average similarity of each cluster to its closest neighbor
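The DB score is available directly in scikit-learn; the following sketch (with simulated, clearly two-cluster data and k-means as a stand-in clustering) shows how its minimum over candidate cluster counts can guide model selection:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

# Hypothetical standardized scores with two well-separated groups;
# lower DB = tighter, better-separated clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, size=(100, 5)),
               rng.normal(+2, 1, size=(100, 5))])

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = davies_bouldin_score(X, labels)

best_k = min(scores, key=scores.get)  # candidate count with the lowest DB score
```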

Entropy, as defined in Mplus, is a statistic that measures the uncertainty of the assignment of observations to clusters. More precisely
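The Mplus relative entropy rescales the total Shannon entropy of the posterior class probabilities so that 1 means fully confident assignment and 0 means maximal uncertainty; a minimal sketch:

```python
import numpy as np

def relative_entropy(resp):
    """Mplus-style relative entropy of posterior class probabilities.

    resp: (n, K) matrix of posterior membership probabilities (rows sum to 1).
    Returns a value in [0, 1]; values near 1 indicate confident assignment.
    """
    n, K = resp.shape
    p = np.clip(resp, 1e-12, 1.0)         # guard against log(0)
    shannon = -(p * np.log(p)).sum()      # total classification entropy (nats)
    return 1.0 - shannon / (n * np.log(K))

certain = np.eye(3)[np.array([0, 1, 2, 0])]  # near-deterministic assignments
uniform = np.full((4, 3), 1 / 3)             # maximally uncertain assignments
e_hi, e_lo = relative_entropy(certain), relative_entropy(uniform)  # ~1.0 and 0.0
```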

External variables, that are expected to correlate with the value-based cluster assignment, can be used to validate the clustering and aid in determining the most appropriate number of clusters. To measure the degree of statistical relation between the cluster assignment and external categorical variables we used adjusted normalized mutual information (AMI) defined as

Mutual information measures the degree of statistical association between random variables and quantifies the amount of information (measured in nats) that the variables have in common. The AMI adjusts the mutual information for association due to chance, and scales the measure so that AMI = 1 corresponds to perfect agreement and AMI = 0 is the expected agreement due to chance when there is no association. Since mutual information takes the uncertainty related to the estimated cluster assignments into account, we find it well suited to measure the statistical relation between cluster assignments and external variables.
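The AMI is implemented in scikit-learn; the following sketch (with made-up labels standing in for cluster assignments and an external categorical variable) illustrates the two anchor points of the scale:

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score

# Hypothetical cluster assignments and an external categorical variable.
clusters = np.array([0, 0, 1, 1, 2, 2, 0, 1])
external = np.array(["A", "A", "B", "B", "C", "C", "A", "B"])  # same partition
unrelated = np.array(["A", "B", "A", "B", "A", "B", "A", "B"]) # cuts across it

ami_perfect = adjusted_mutual_info_score(clusters, external)   # 1.0
ami_weak = adjusted_mutual_info_score(clusters, unrelated)     # near 0
```

Because AMI is adjusted for chance, the weak association stays near 0 (and can even be slightly negative) rather than inflating as the number of categories grows.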

Components | AIC | BIC | Entropy | VLMR | BLRT
---|---|---|---|---|---
3 | 671730 | 672092 | 0.865 | — |
4 | 616549 | 617006 | 0.885 | 0.000 | 0.000
5 | 591249 | 591801 | 0.885 | 0.000 | 0.000
6 | 563286 | 563933 | 0.895 | 0.000 | 0.000
7 | 539388 | 540130 | 0.897 | 0.000 | 0.000
8 | 522803 | 523640 | 0.898 | 0.000 | 0.000
9 | 508036 | 508968 | 0.900 | 0.628 | 0.000
10 | 481011 | 482133 | 0.904 | 0.094 | 0.000
11 | 478958 | 480080 | 0.904 | 0.000 | 0.000
12 | 467310 | 468527 | 0.906 | 0.032 | 0.000
13 | 457601 | 458913 | 0.907 | 0.151 | 0.000
14 | 448117 | 449523 | 0.908 | 0.044 | 0.000
15 | 438810 | 440311 | 0.911 | 0.047 | 0.000
16 | 429312 | 430908 | 0.913 | 0.047 | 0.000
17 | 421874 | 423565 | 0.914 | 0.567 | 0.000
18 | 414963 | 416749 | 0.913 | 0.287 | 0.000
19 | 407574 | 409455 | 0.914 | 0.194 | 0.000
20 | 401787 | 403762 | 0.913 | 0.141 | 0.000

The model fit in terms of the highest attained ELBO is shown in

The maximum Pearson correlation between cluster profiles is shown in

The variance within and between clusters is shown in

The minima of the DB score shown in

The certainty with which the observations are assigned to their respective clusters, as measured by the Entropy, is shown in

The results of cluster evaluation indicate that

Finally,

Plots of AMI between cluster assignments and nine external variables are shown in

This article presented the VB-LPA algorithm and several statistical criteria inspired by

The qualitative evaluation of the value profiles demonstrated that, for instance, the all positive or the all negative clusters identified in

As for the comparison between the ML-LPA and the VB-LPA (summary of the comparisons is found in the

A limitation of our study is that the comparisons of ML-LPA and VB-LPA were not based on a simulation study but only on empirical data. The comparisons using the PVQ items based on the cluster evaluation metrics, however, enabled us to identify an optimal number of clusters that is statistically justifiable and substantively meaningful for interpreting the Schwartz theory. Another noteworthy point is that the ML-LPA based on the method factor presented in

This work has been conducted as part of the project “UMAMI: Understanding Mindsets Across Markets, Internationally” No. 61579-00001A funded by Innovation Fund Denmark.

The authors have declared that no competing interests exist.

Eldad Davidov would like to thank the University of Zurich Research Priority Program Social Networks for their support during work on this paper. The authors would like to thank Lisa Trierweiler for the English proofreading of the manuscript. Finally, the authors would like to thank the editor of the journal and the anonymous reviewers for valuable and constructive suggestions.