Original Article

Evaluating Individual Scientific Output Normalized to Publication Age and Academic Field Through the Scientometrics.org Project

Balázs Győrffy1,2,*, Boglárka Weltz1,2, Gyöngyi Munkácsy1,2, Péter Herman1,2, István Szabó3

Methodology, 2022, Vol. 18(4), 278–297, https://doi.org/10.5964/meth.9463

Received: 2022-05-12. Accepted: 2022-11-28. Published (VoR): 2022-12-22.

Handling Editor: Katrijn Van Deun, Tilburg University, Tilburg, The Netherlands

*Corresponding author at: Department of Bioinformatics, Semmelweis University, Tűzoltó utca 7-9., 1094, Budapest, Hungary, Tel: +3630-514-2822. E-mail: gyorffy.balazs@med.semmelweis-univ.hu

This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

When evaluating the publication performance of a scientist, one has to consider not only the differences in publication norms between scientific fields, but also the length of the academic career of the investigated researcher. Here, our goal was to establish a database suitable as a reference for ranking scientific performance by normalizing a researcher's output to that of researchers with the same academic career length who are active in the same scientific field. By using the complete publication and citation data of 17,072 Hungarian researchers, we established a framework enabling the quick assessment of a researcher's scientific output by comparing four parameters (h-index, yearly independent citations received, number of publications, and number of high impact publications) to the age-matched values of all other researchers active in the same scientific discipline. The established online tool, available at www.scientometrics.org, could be an invaluable help for faster and more evidence-based grant review processes.

Keywords: publications, h-index, citation, article rank, basic research, scientific output, Q1, internationalization, scientometrics

Regular publication of high-quality papers that have an impact is a prerequisite for success in an academic career, as emphasized by the commonly used aphorism "publish or perish" (Campbell, 2010), which was first used as early as 1927 (Southern California Sociological Society & University of Southern California, 1927). Major grant agencies, including the European Research Council, offer grants only to researchers who have a track record of significant research achievements documented by noteworthy publications as leading authors. However, no reviewer can understand all the details of every research project or discern the scientific track record of every researcher, and it is a futile challenge to compare and weigh the research topics of scientists or of scientific projects. On the other hand, the impact of research can be measured by analyzing the prestige of the journal in which it is published or by counting the citations it received.

So the question arises: how exactly can one quantify the impact of research achievements? Beyond the simple count of citations (Yan et al., 2011), multiple indicators have been developed to measure and compare researchers competing for grants and tenure positions, of which the two most widely used are the impact factor (McKiernan et al., 2019) and the h-index (Chapman et al., 2019). The impact factor is a measure of prestige for a journal and is based on the yearly average number of citations to recent articles published in that journal (Garfield, 1972). Theoretically, by adding up the impact factors of one's papers, the yearly average citation count for all previously published articles could be estimated. However, this might not be a reliable estimate, since the distribution of the citations received by the papers published in a given journal is skewed, and in most cases a small number of articles tends to account for a high percentage of the citations (Seglen, 1992). The h-index is an author-level metric defined as the highest number h such that h publications of a scientist received h or more citations each, while the remaining publications have no more than h citations each. The index gives an estimate of the importance, significance, and broad impact of a scientist's cumulative research contributions (Hirsch, 2005). Hirsch suggested considering the h-index when objectively comparing the scientific achievements of different individuals in a given field (Hirsch, 2005).

Previously, we compared different metrics available to determine impact in a large-scale performance evaluation of review-based grant allocation (Győrffy et al., 2020b). In that study, using 42,905 scored review reports for 13,303 proposals of the Hungarian National Scientific Research Fund, we evaluated multiple independent indicators for each scientist, including citations, h-index, scores provided by reviewers, and demographic data such as age and gender, and found that past scientific achievements—including the h-index, yearly independent citations, and the number of Q1 publications—were the strongest predictors of future output, by far outperforming reviewer scores (Győrffy et al., 2020b). The low efficiency of grant reviewers was also documented in another independent study (Fang et al., 2016). In an earlier study with a different cohort of researchers, we investigated the publication performance of Hungarian Momentum grant holders—Momentum is one of the largest individual research grants in Hungary, with funding comparable to European Research Council grant programmes. In that study we observed the strongest correlations between scientific output and the total number of citations, the h-index, and the impact factor in the last two years (Győrffy et al., 2018). Other parameters of the researchers, including gender, degree, and international grants, had no correlation with publication output. In summary, grant applicants' h-index, citation count, and number of high-rank publications proved to be the most robust parameters associated with future publication output.

We also have to mention another fundamental problem when measuring the scientific performance of researchers: the difference in age. Generally, more experienced senior scientists can publish more than younger fellows. An increasing proportion of scientists remain active for a longer period of their life (Ghaffarzadegan & Xu, 2018), and the mean age of scientists is steeply increasing, by about 2.3 years per decade (Blau & Weinberg, 2017). Productivity increases with age and peaks late in the career, but declines after the age of 60 (Aksnes et al., 2013). The conclusion of these observations is that we cannot expect a 30- or 40-year-old researcher to have the same h-index, citation count, or other parameters as someone at the age of 50 or 60.

In the present study, our goal was to determine the age-related distribution of the three most important parameters (h-index, citation count, and number of publications) of Hungarian researchers. As scientific disciplines can have different characteristics, we performed the entire analysis separately in eleven scientific sections. We aimed to validate the age-related distributions and to use these data to establish a reference database suitable for ranking scientific performance by normalizing a researcher's output to that of researchers with the same academic career length who are active in the same scientific field. Our results can also provide age-matched reference data for future studies evaluating other countries.

Method

Reference Database

We acquired publication and citation data from the Hungarian Scientific Work Archive (HSWA) for researchers who are required to regularly update their HSWA record, since this database serves in many settings as the reference for determining the scientific output of scientists in Hungary (applications for a PhD degree, professorship, awards, etc.). These include doctors of the Hungarian Academy of Sciences, members of the Academy, Momentum Grant holders, National Research, Development and Innovation Fund (OTKA) grant applicants who submitted an application since 2006, and Hungarian researchers with a university affiliation. Then, to include only researchers active at the time of analysis, we filtered the database to those who had updated their HSWA record within the last four years. Among grant applicants, we included both grant recipients and those whose application was unsuccessful. To separate researchers with identical names, all scientists were identified using their HSWA identification number. Publication age equals the current year minus the year of the researcher's very first publication, where the type of the first publication was set according to the scientific discipline (see the section "Computed Scientific Parameters" below). Researchers without a PhD degree were not included in the analysis.

When determining the yearly citation count, only independent citations were included. A citation is designated as independent if there is no overlap in the author lists of the cited and the citing documents. Both dependent and independent citations were included when determining the h-index, because Hirsch's original definition also included all citations for each research work.
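The author-overlap rule for independence can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation; the name strings are hypothetical, and HSWA presumably matches authors by identifier rather than by name:

```python
def is_independent(cited_authors, citing_authors):
    # A citation is independent if the cited and the citing
    # documents share no authors at all.
    return not (set(cited_authors) & set(citing_authors))

# Hypothetical author lists for illustration:
print(is_independent(["Kovacs A", "Nagy B"], ["Toth C"]))        # no shared author
print(is_independent(["Kovacs A"], ["Kovacs A", "Toth C"]))      # self-overlap
```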

We made a distinction according to the various publication patterns of the diverse scientific disciplines by considering different publication types. For researchers active in the life sciences (agricultural sciences, medicine, and biology) and material sciences (mathematics, engineering, chemistry, earth sciences, and physics), only papers published in peer-reviewed journals were considered. These include research articles, review articles, and short research articles. Additional publication types were also evaluated for the scientific disciplines belonging to the humanities and social sciences, including linguistics and literary studies, philosophy and history, and economics and law. These include books, book chapters, conference proceedings, conference abstracts, and monographs.

Scientific Disciplines

We assigned each researcher to one of eleven scientific disciplines, and these were grouped into three major scientific fields (humanities and social sciences, life sciences, and material sciences). The eleven disciplines are based on the eleven classes defined by the Hungarian Academy of Sciences: linguistics and literary studies (I), philosophy and historical sciences (II), mathematics (III), agricultural sciences (IV), medicine (V), engineering (VI), chemistry (VII), biological sciences (VIII), economics and law (IX), earth sciences (X), and physical sciences (XI). Researchers were assigned to the discipline they selected in the catalogue on the homepage of the Hungarian Academy of Sciences. For researchers without a selected discipline, the publication record was screened and the topics of the five most recent publications were used to determine the scientific focus.

Computed Scientific Parameters

For each researcher, the following three parameters (h-index, yearly independent citations, and number of publications) were computed for each age in yearly bins. A year was defined as a complete calendar year, and the calculations were performed up to the last complete calendar year. Age was defined as the number of years passed since the first publication. The computation steps are also summarized in Figure 1A and Figure 1B.

Figure 1A

Computation Steps to Determine Scientific Output of Researchers for Life Sciences and Material Sciences

Figure 1B

Computation Steps to Determine Scientific Output of Researchers for Humanities and Social Sciences

Note. To determine the scientific output of researchers, www.scientometrics.org evaluates four parameters: the h-index, the yearly independent citations, the number of publications in the last five years, and the number of high impact publications in the last ten years. Due to different publication characteristics, there are methodological differences in the parameter calculation between life sciences and material sciences (Figure 1A) and humanities and social sciences (Figure 1B). In mathematical sciences there was no filtering for authorship position.

The h-index is based on all publications using all citations (both dependent and independent). An h-index of h means that the researcher has at least h publications receiving at least h citations each. For example, a person with an h-index of three has three publications with at least three citations each, but not four publications with at least four citations each. The h-index is representative of the entire scientific career of a researcher.
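The h-index is straightforward to compute from a list of per-paper citation counts. A minimal sketch (not the authors' implementation):

```python
def h_index(citations):
    # Sort citation counts in descending order; h is the largest
    # 1-based position i whose paper still has at least i citations.
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# The example from the text: three papers with >= 3 citations each,
# but not four papers with >= 4 citations each.
print(h_index([5, 3, 3, 1]))  # -> 3
```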

The yearly citation count of all previous publications was computed by including independent citations only. To increase the weight of one's own publications, the citation count received for first/last/corresponding authored publications was doubled for researchers in life sciences and in material sciences. The yearly citation count is representative of the present impact of the scientist's past scientific achievements. Shared (divided) first/last/corresponding authorships were also counted as first/last/corresponding authorships.

The third computed parameter is the number of scientific papers published in the last five years in Q1 ranked journals. We allotted a journal rank to each journal using the "SCImago journal rank" parameter of the SCImago database. Publications in journals within the top 25% of their respective field were considered Q1 papers. This parameter is designed to show the current scientific activity of the researcher. We used a five-year cutoff because the most widely available basic research OTKA grants use the most important publications from the last five years for each researcher. In addition, only publications where the scientist is the first, last, or corresponding author are included. Publications appearing in Q2/Q3/Q4 ranked journals are not considered in life sciences and material sciences. Narrowing the publications to Q1 ranked research and review papers ensures that only high-quality publications are included. In Table 1 we list the different paper types included in the publication count in each of the eleven scientific disciplines.

Table 1

The Different Types of Papers Included in the Publication Count in Each of the Eleven Scientific Disciplines

| Scientific discipline | Q1 ranked papers | Q ranked papers | All publications | Only first/last/corresponding authored publications |
|---|---|---|---|---|
| Life sciences | | | | |
| Agricultural sciences | Yes | No | No | Yes |
| Biology | Yes | No | No | Yes |
| Medicine | Yes | No | No | Yes |
| Material sciences | | | | |
| Chemistry | Yes | No | No | Yes |
| Earth sciences | Yes | No | No | Yes |
| Engineering | Yes | No | No | Yes |
| Mathematics | Yes | No | No | No |
| Physics | Yes | No | No | Yes |
| Humanities and social sciences | | | | |
| Economics and law | Yes | Yes | No | No |
| Linguistics and literary studies | Yes | Yes | Yes | No |
| Philosophy and history | Yes | Yes | Yes | No |

Note. Different publications were counted when determining the number of papers published in the last five calendar years to reflect scientific discipline-specific differences. Yes = included in the publication count, paper = research or review paper, publication = any citable publication.

Of note, in mathematical sciences there was no filtering for authorship position (neither for citation nor for published papers), and there was no weight for first/last/corresponding authored papers (see Table 1). The reason for this is the widespread use of alphabetical author order in mathematical journals.

Due to different publication habits, in humanities and social sciences we included more publication types in the paper count. When including only first/last/corresponding authored Q1 articles, the majority of scientists from humanities and social sciences did not have even a single publication in the last five years. Such a low number of publications would prohibit the setup of a reliable reference database. In economics and law, scientific publications include Q ranked articles only (see Table 1). In linguistics and literary studies and in philosophy and historical sciences, the scientific publications include all Q ranked articles, short publications, multi-authored publications, books, book chapters, conference proceedings, conference abstracts, and monographs. Across all humanities and social sciences, journal publications were not weighted by Q rank; in other words, journals with Q2, Q3, and Q4 ranks were considered in the same way as Q1 ranked articles. In addition, there was no filtering for authorship position, and first/last/corresponding authored papers had the same weight as all other articles (see Table 1).

Correlation between the mean of the calculated parameter across all researchers within each scientific field and publication age was computed using Pearson correlation and linear regression.

Integrating the Computed Parameters Into a Single Overall Score

Each of the three parameters (h-index, yearly independent citations, and number of publications) was computed for each publication age. This makes it possible to compare a selected researcher to those who have the same academic career length. When determining the relative scientific output of a selected researcher, every scientist who has reached the same publication age is included in the comparison and ranking (which translates to including every scientist with an equal or longer career). In other words, someone with five years of experience will be compared to everyone who has five or more years of experience. However, for each included researcher, only the scientific parameters at the same publication age are considered, and only researchers active in the same scientific discipline are included.

The ranking, based on the h-index, the yearly independent citations in the last year, and the number of publications in the last five years, is performed by ranking all researchers and determining a percentile value for the investigated researcher. If there are multiple scientists with the same value (e.g., the same h-index or number of publications), the median of their percentiles is used. Finally, an "overall score" is computed as the mean of the three percentiles. To give higher weight to contemporary scientific activity, this computation uses a double weight for the number of scientific publications in the last five years.
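The tie-handling rule can be illustrated as follows. The percentile convention used here (100·i/n for 1-based sorted position i) is our assumption; the paper does not spell out the exact formula:

```python
def tie_median_percentile(values, target):
    # Rank all researchers by their parameter value; position i of n
    # (1-based, ascending) is assigned the percentile 100 * i / n.
    # Tied values all receive the median of their positions' percentiles.
    vals = sorted(values)
    n = len(vals)
    pcts = [100.0 * (i + 1) / n for i, v in enumerate(vals) if v == target]
    m = len(pcts)
    if m % 2 == 1:
        return pcts[m // 2]
    return (pcts[m // 2 - 1] + pcts[m // 2]) / 2

# Two researchers share an h-index of 2 among four; their percentile
# is the median of the tied positions (50 and 75), i.e. 62.5.
print(tie_median_percentile([1, 2, 2, 3], 2))
```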

High Impact Publications

Scientific impact can be assessed using the high number of independent citations received for a selected publication (Wilsdon et al., 2015). To pick the publications with the highest impact, we utilized the criteria of the Forefront grant scheme (the grant scheme with the highest amount of funding for individual researchers in Hungary) of the National Research, Development and Innovation Office (NRDIO). In the Forefront scheme, publications are selected whose average yearly citation count since publication places them in the top 10% of publications within their respective scientific discipline. The independent citation thresholds were based on the Scopus citation records for the world output, previously determined by the NRDIO. In particular, the cutoff values were: seven citations/year for linguistics and literary studies, seven for philosophy and historical sciences, five for mathematics, 10 for agricultural sciences, 13 for medicine, 10 for engineering, 17 for chemistry, 17 for biology, seven for economics and law, 12 for earth sciences, and 13 for physical sciences. The final high impact determination was based on averaging the total number of independent citations per year over the last ten years. The ten-year cutoff was used because this time length is also used in the Forefront grant scheme of the NRDIO. The number of high impact publications is multiplied by a configurable factor (by default, four), and this product is added to the overall score.

In brief: overall score = (percentile [h-index] + percentile [yearly citation of all previous publications] + 2 * percentile [scientific papers published in the last five years]) / 4 + (number of high impact publications * 4). As an example, let us assume a researcher has an h-index percentile of 75%, a citation percentile of 66%, a publication percentile of 85%, and one high impact publication. The overall score equals (75 + 66 + 2 * 85)/4 + 1 * 4 = 81.75, which translates to a D2 rank.
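The scoring formula translates directly into code; a sketch that reproduces the worked example above:

```python
def overall_score(h_pct, cit_pct, pub_pct, n_high_impact, bonus=4):
    # Mean of the three percentiles, with the recent publication
    # percentile double-weighted; each high impact publication then
    # adds a fixed bonus (four by default, configurable).
    return (h_pct + cit_pct + 2 * pub_pct) / 4 + n_high_impact * bonus

# The example from the text: percentiles 75 / 66 / 85 plus one
# high impact publication.
print(overall_score(75, 66, 85, 1))  # -> 81.75
```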

Online Interface

For the evaluation of any existing researcher as well as new researchers, an online application was developed using the Shiny R package (Chang & Cheng, 2019), together with the shinycssloaders (Sali, 2020) and shinythemes (Chang & Cheng, 2019) R packages. Data reshaping and aggregation are performed using the reshape2 R package (Wickham, 2007). Graphics are generated by the ggplot2 (Wickham, 2016) and gridExtra (Auguie & Antonov, 2017) packages. The assessment can be run either by entering the name or HSWA ID of a researcher or by entering all variables manually (scientific section, h-index, number of independent citations received in the last year, number of publications in the last five years, etc.). When using the HSWA ID or name of a researcher, the most recent data are immediately downloaded from the HSWA site and the computation and ranking are executed in real time. The homepage also provides a visual ranking showing the actual percentile as well as an easily interpretable classification of the selected researcher into deciles between D1 and D10, where D1 is the best. The registration-free homepage of the platform can be accessed at the address given in the Supplementary Materials.

Results

Reference Database

The reference database was established using a total of 17,072 researchers. Of these, 1,206 belong to agricultural sciences, 1,682 to biology, 2,266 to philosophy and history, 698 to physics, 693 to earth sciences, 2,731 to economics and law, 1,103 to chemistry, 599 to mathematics, 2,191 to engineering, 1,427 to linguistics and literary studies, and 2,476 to medicine. The average age of the researchers was 49.2 years and the median age was 46 years. When looking at five-year bins, the majority (52%) of researchers were born between 1970 and 1989, and 82% of all researchers were born between 1955 and 1994. The publication count totaled 252,889 publications for all researchers combined.

In the present analysis we considered all publication ages up to 45 years, which in most cases corresponds to a biological age of about 70 years. This is the usual retirement age in Hungary, and beyond this cutoff the reliability of the HSWA publication and citation records becomes uncertain.

H-Index and Age

To increase clarity, we combined all disciplines into three major categories designated as humanities and social sciences (n = 6,424), life sciences (n = 5,364), and material sciences (n = 5,284). Of note, the established online system shows the results for each field separately; only the aggregated results are presented below.

The median h-index displays a surprisingly strong correlation with publication age, with an R2 over 0.95 in each cohort. The Pearson correlation coefficient was 0.97, with a p value below 1E-30, in all three groups. In the linear regression, the equation describing the relationship between publication age and median h-index is:

h-index = slope * publication age

The slope was 0.13 in humanities and social sciences, 0.35 in life sciences, and 0.26 in material sciences (see Figure 2). Thus, despite the broader inclusion of publication types (counting, among others, book chapters and conference proceedings), the overall growth of the h-index remains the lowest in humanities and social sciences. Using this simple equation, one can calculate the age-related expected h-index within the respective discipline.
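A slope of this kind can be reproduced with an ordinary least-squares fit through the origin. Whether the published regression actually forced a zero intercept is our assumption, and the data below are hypothetical:

```python
def slope_through_origin(ages, medians):
    # Least-squares slope for the zero-intercept model y = b * x:
    # b = sum(x*y) / sum(x*x).
    sxy = sum(x * y for x, y in zip(ages, medians))
    sxx = sum(x * x for x in ages)
    return sxy / sxx

# Hypothetical age/median h-index pairs lying exactly on the
# reported life-sciences line (slope 0.35):
ages = [5, 10, 20, 30]
medians = [0.35 * a for a in ages]
print(slope_through_origin(ages, medians))
```

The expected h-index at a given publication age is then simply slope * age, as in the equation above.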

Figure 2

Overview of Age-Related Growth of the Median H-Index

Note. Figure A (n = 6,424) for Humanities and Social Sciences, Figure B (n = 5,364) for Life Sciences, and Figure C (n = 5,284) for Material Sciences show the lower quartile, median, and upper quartile values as well as the linear regression equation for the median.

All the h-index values for each age for each researcher—grouped according to scientific field—are listed in the anonymized Supplemental Table 1 of the Supplementary Materials.

Yearly Independent Citation

When summing up the number of independent citations received per year in each of the three major groups, the results were similar to those obtained when analyzing the h-index. Using the same form of equation ("citation/year = slope * publication age"), the slope was 0.1 in humanities and social sciences, 1.41 in life sciences, and 0.69 in material sciences. As an example: the median number of yearly new independent citations received for all previously published work at a publication age of 20 years (roughly a biological age of 45 years) for a life science researcher is 1.41 * 20 = 28.2. The Pearson correlation between publication age and yearly independent citations was 0.98 in life sciences and material sciences and 0.77 in humanities and social sciences (all three cohorts had p values below 1E-10). The linear regression between age and citations is displayed in Figure 3 for the three major fields.

Figure 3

Overview of Age-Related Growth of the Yearly Independent Citations

Note. Figure A (n = 6,424) for Humanities and Social Sciences, Figure B (n = 5,364) for Life Sciences, and Figure C (n = 5,284) for Material Sciences show the lower quartile, median, and upper quartile values as well as the linear regression equation for the median.

All the yearly independent citation values for each age for each researcher—grouped according to scientific field—are listed in the anonymized Supplemental Table 2 of the Supplementary Materials.

Publications and High Impact Publications

Finally, we also determined the number of publications in the last five years for each publication age for each researcher. Of note, for humanities and social sciences this included additional publication types (not only journal articles). In addition, authorship position was not assessed for humanities, social sciences, and mathematics; in all other fields of science, only first/last/corresponding authored publications in Q1 ranked journals were considered. The linear regression between the average number of publications and publication age yielded similar trends in life sciences and material sciences (number of first/last authored Q1 publications = 0.05 * publication age). In humanities and social sciences, the relationship between publication age and number of publications was best described by the equation: number of publications = 0.29 * publication age. Figure 4 shows the relationship between publication age and publication number in each of the three major fields.

Figure 4

Number of Publications (or Number of First/Last/Corresponding Authored Q1 Publications, According to Scientific Field) in the Last Five Years for Each Publication Age for Each Researcher

Note. Figure A shows the correlation between age and the number of publications in the last five years in the Humanities and Social Sciences. Figure B shows the correlation between age and first/last authored Q1 ranked publications in the last five years in the Life Sciences. Figure C shows the correlation between age and first/last authored Q1 ranked publications in the last five years in the Material Sciences. All panels show the lower quartile, the median, the upper quartile, and the average values, as well as the linear regression equation for the average. (All papers were included in humanities and social sciences as well as in mathematics regardless of authorship position; Q ranked publications were included in economics and law.)

The number of publications in the last five years or the number of first/last/corresponding authored Q1 publications in the last five years (according to scientific field) for each publication age for each researcher—grouped according to scientific field—are listed in the anonymized Supplemental Table 3 of the Supplementary Materials.

When counting the total number of high impact publications, there were 278 researchers in humanities and social sciences, 340 researchers in life sciences and 378 researchers in material sciences with at least one such publication. Thus, overall 5.8% of all researchers had such a paper. Of the 996 scientists with high impact publications 714 had one, 155 had two, and 127 researchers had more than two such papers.

Real-Time Ranking of a Researcher’s Scientific Output

Finally, we integrated all the results into an online platform capable of evaluating and ranking any existing researcher (by entering the name or the HSWA ID of the scientist) or ranking a new researcher given pre-computed values. The output of all researchers is visualized by showing the 100th, 90th, 75th, 50th, 25th, and 0th percentiles in each group for the h-index, the number of yearly citations, and the number of publications in the last five complete years, as displayed in Figure 5. Markedly, the online platform delivers the classification for each of the eleven scientific disciplines separately. Once a new researcher is added, the ranking is computed within his/her respective discipline and the values are graphically displayed by showing a ranked distribution of all researchers of the same age in the investigated cohort (Figure 6). Notably, the raw values are also displayed to increase the transparency and reproducibility of the computed parameters.

Figure 5

H-Index, Number of Independent Citations per Year, and Number of First/Last/Corresponding Authored Q1 Publications in the Last Five Years for Scientists Active in Chemistry (n = 829) According to Publication Age

Note. The bold dark green line marks the median value. The red line shows the performance of the very best scientist in each year to date. In contrast to Figures 3 and 4, the Y-axis is logarithmic (except for the h-index).

Figure 6

Ranking of a Random Researcher Active in Medicine Using Three Computed Parameters

Note. The left panel shows the h-index. The central panel shows the number of independent citations per year. The right panel shows the number of first/last/corresponding authored Q1 articles in the last five years. For all panels, n = 1,987 of 2,070. The dotted red lines show quartile cutoffs, and the red arrows indicate the actual rank. The shaded area on the right side represents those researchers who have not yet reached the current publication age of the evaluated researcher. The bottom part shows the percentile evaluation of the included parameters as well as the values used in the calculation of the total score used for ranking.

Finally, we computed the overall score for all researchers using a 400% weight for the journal-rank-based publication count in the equation. When comparing these overall scores to those computed with the default weight of 200%, there was a strong correlation (Pearson correlation coefficient = 0.98, p < 1E-16). Similarly, when comparing the overall scores between weights of 200% and 100%, the correlation was also high (Pearson correlation coefficient = 0.98, p < 1E-16). These results are not surprising: because we use ranking percentiles across all researchers rather than actual values when determining the overall score, only researchers with discrepant recent and earlier publication activity will change rank. The strong correlations support the robustness of our approach.

Discussion

Whenever we review a grant application, we have to weigh not only the submitted proposal but also the previous achievements of the applicant. Numerous tools, such as Google Scholar or Scopus, are at hand to provide an overview of the publication performance of a scientist; these are all capable of visualizing scientific progress in addition to providing basic metrics like the h-index or the total number of publications. However, all these tools have two fundamental weaknesses: they do not provide a relative assessment within the scientific field of the researcher, and they do not account for the publication age of different researchers. Previously, we have shown that age-related output follows different patterns in different scientific disciplines and that most academic careers require decades before the maximal scientific output is reached (Győrffy et al., 2020a). We have also investigated additional parameters, including gender, scientific degree, and international grants, and found no correlation with publication output (Győrffy et al., 2018); for this reason, we have not considered these factors in our current analysis.

Here, we established a method enabling the quick assessment of a researcher’s scientific output by comparing four parameters (h-index, number of yearly independent citations received, number of publications in the last five years, and number of high-impact publications in the last ten years) to the age-matched values of all other researchers active in the same scientific field in Hungary. As a result, we established an online platform that integrates all derived ranking values into a final “overall score”. The use of a single final score makes the system easy to understand and also enhances transparency (Hicks et al., 2015).
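The core age-matched comparison can be sketched in a few lines. This is a simplified illustration with invented peer data, assuming the reference set has already been filtered to researchers of the same publication age and scientific field; it is not the platform's actual code.

```python
import numpy as np

def age_matched_percentile(value, reference_values):
    """Percentile of `value` within an age- and field-matched reference set:
    the share of matched peers whose value does not exceed it."""
    reference = np.asarray(reference_values)
    return 100.0 * np.mean(reference <= value)

# Hypothetical peers: same field, same publication age.
peer_h_indices = [3, 5, 8, 8, 12, 15, 21, 30]
pct = age_matched_percentile(10, peer_h_indices)  # → 50.0
```

Repeating this for each of the four parameters yields the per-metric percentiles that feed into the overall score.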

Decision-making in science should be based on high-quality procedures using the highest quality data (Hicks et al., 2015). In this work, we selected parameters based on our previous work, in which we observed high correlations between the investigated parameters and scientific output in NRDIO applicants (Győrffy et al., 2020b) and in Momentum grant holders (Győrffy et al., 2018). In addition, we also demonstrate here that the median and the mean of the evaluated parameters show a strong correlation with publication age. For example, the solid, highly significant correlation between age and h-index supports its robustness. Previous studies also support the predictive power of these metrics, in other words the strong correlation between future productivity and the h-index (Hirsch, 2007), citations, or publication count (Carpenter et al., 2014). Other approaches have also documented the persistent and stable performance of academic researchers. For instance, the Q parameter, which represents a scientist’s sustained ability to publish high-impact (or low-impact) papers, is generally stable throughout a career and offers a quantitative prediction of the evolution of a scientific career (Sinatra et al., 2016).

When establishing scientometrics.org, we based a significant proportion of the work on the Leiden Manifesto for research metrics, which proposed a set of principles for research evaluation (Hicks et al., 2015). In particular, by using different algorithms for different scientific fields, we account for the variation by field in publication and citation practices. Second, the analysis steps are summarized in a single flowchart (illustrated in Figure 1), which includes all the executed steps and makes the data collection and analysis open, transparent, and simple. Third, the online system allows users to verify the data and the analysis by providing a direct link to the original data on the HSWA homepage, which also lists some of the computed parameters, including the h-index (albeit only for the current year). Fourth, the complete reference datasets can be downloaded directly from the homepage—in principle, each data point for every single researcher can be re-computed and validated. In line with the Leiden Manifesto, our tool is designed not as a substitute for judgement but as a quantitative support tool to assist qualitative, expert assessment in researcher evaluation.

We have not employed the impact factor in our system, despite several publications documenting both its predictive power in researcher evaluations (Győrffy et al., 2018) and its widespread use (McKiernan et al., 2019). The impact factor has serious limitations (Bordons et al., 2002) and would need substantial improvements to be a reliable quality indicator (Vanclay, 2012). Today, initiatives like DORA suggest greatly reducing the emphasis on journal impact factors and assessing research on its own merits rather than on the basis of the journal in which it was published. However, critics of DORA defend the impact factor and argue that confusion over how to judge scientific output will hamper scientific productivity (Tregoning, 2018). Here, we deliberately utilized only data freely available on the internet, and journal impact factors do not fulfill this criterion.

Notably, the investigated scientometric parameters have some disadvantages. Most publication quality metrics are based on citations—a constraint shared by other platforms, such as Google Scholar. The citation-based h-index can easily be manipulated by targeted self-citations. For this reason, we provide two further options in the analysis platform: computing the h-index using independent citations only, and computing the S-index, which is the h-index calculated from self-citations only. Another limitation of the h-index is that it drives researchers toward hot topics, where they can increase their score more quickly (Conroy, 2020). A recent study observed a weakening correlation between the h-index and scientific awards in recent years and proposed the use of a fractional h-index (Koltun & Hafner, 2021). For these reasons, and to improve the robustness of the analysis, we diversified the parameters by adding journal rank and by giving this score a 200% weight when combining the different parameters. The increased weight of this score strengthens the effect of the last five years on the final result, as opposed to the h-index and citation counts, which can accumulate throughout one’s career. Nevertheless, the utilization of the h-index and the inclusion of journal articles based on journal rank could result in “optimized” publication characteristics of researchers in the future. Such optimization techniques have been extensively analysed and discussed in a recent large-scale analysis (Ioannidis et al., 2019).
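The relationship between the h-index and the S-index can be made concrete with a short sketch. The citation counts below are invented example data, and the split of each paper's citations into independent and self-citations is assumed to be given; the same standard h-index definition is simply applied to each list.

```python
def h_index(citation_counts):
    """Largest h such that the researcher has h papers with >= h citations."""
    counts = sorted(citation_counts, reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

# Hypothetical per-paper citation counts, split into independent
# citations and self-citations.
independent = [25, 18, 12, 7, 4, 2, 1]
self_cites = [3, 2, 2, 1, 1, 0, 0]

h_independent = h_index(independent)  # h-index from independent citations
s_index = h_index(self_cites)         # S-index: h-index from self-citations only
```

A large S-index relative to the independent-citation h-index is exactly the self-citation pattern the platform's two extra options are designed to expose.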

We have to note an important limitation of the system: as it only includes Hungarian researchers, the actual ranking and the numerical percentages are validated only for Hungarian scientists. Nevertheless, researchers from other countries can still be ranked by comparing their relative overall scores derived from the current database. We plan to expand the database to other countries in the future—the only prerequisite is a comprehensive list of active researchers with scientific field classifications, as the publication data itself is accessible from sources other than the HSWA, such as Google Scholar or Scopus. A further limitation is that the included parameters capture output (publications) and impact (citations) only—other features, such as education, science communication, or leadership, could also foster a fruitful academic career. However, no reliable database of these parameters exists for all researchers, which prevents their objective use when comparing all scientists in a given scientific field.

In summary, we established a method to objectively compare and rank researchers based on their publication output. The analysis is performed by comparing each researcher to a common reference database comprising Hungarian researchers of the same publication age who are active in the same scientific discipline. A major advantage of our platform is the elimination of age-related disadvantages for early-career scientists (e.g., when comparing h-index or citation counts). Our future goal is to extend the online tool to other countries as well.

Funding

The authors have no funding to report.

Acknowledgments

The authors have no additional (i.e., non-financial) support to report.

Competing Interests

The authors have declared that no competing interests exist.

Data Availability

Data are freely available, see Győrffy et al. (2022a) and Győrffy et al. (2022b).

Supplementary Materials

For this article, a Shiny app was developed for the assessment of the age-normalized publication output of researchers in the same scientific discipline. The Shiny app can be accessed online (see Győrffy et al., 2022a, below). Three supplemental tables are available that contain the h-index, yearly independent citations, and article count of the publications in the past five years for each included researcher at each publication age (see Győrffy et al., 2022b).

Index of Supplementary Materials

  • Győrffy, B., Weltz, B., Munkácsy, G., Herman, P., & Szabó, I. (2022a). Supplementary materials to "Evaluating individual scientific output normalized to publication age and academic field through the Scientometrics.org project" [Shiny App]. Scientometrics. https://www.scientometrics.org/

  • Győrffy, B., Weltz, B., Munkácsy, G., Herman, P., & Szabó, I. (2022b). Supplementary materials to "Evaluating individual scientific output normalized to publication age and academic field through the Scientometrics.org project" [Supplemental tables: H-index, yearly citation, article count]. PsychOpen GOLD. https://doi.org/10.23668/psycharchives.12207

References