Research on Hate Speech. A Proposal of bibliometric analysis in Spain and LATAM during 2021 and 2022

Tamara Antona Jimeno, Ana Mayagoitia-Soria, Jasmina Đorđević

ICONO 14, Revista de comunicación y tecnologías emergentes, vol. 22, no. 1, 2024

Asociación científica ICONO 14

Investigación sobre el Discurso de Odio. Una propuesta de análisis bibliométrico en España y LATAM entre 2021 y 2022

Investigação sobre Discurso de Ódio. Uma proposta para uma análise bibliométrica em Espanha e LATAM entre 2021 e 2022

Tamara Antona Jimeno *

Universidad Complutense de Madrid (UCM), España

Ana Mayagoitia-Soria **,***

Universidad Complutense de Madrid (UCM), España

Universidad de Valladolid (Uva), España

Jasmina Đorđević ****

Universidad de Niš, República de Serbia

Received: 28 November 2023

Published: 16 January 2024

Abstract: This research is based on bibliometric analysis to find out if Hate Speech in Spain and Latin America has consolidated as a mature research field in recent years (2021 and 2022). Through the consultation of the Web of Science (WoS) database and by searching specific terms, we obtained a universe of 153 articles that allowed us to address our objective. The results revealed that the analysis of hate speech in Spain and LATAM is mature, characterized by consistent publication trends and influential authors while productive researchers tend to be highly cited. Conceptual trends include Hate Speech, social media, Twitter, and Freedom of Expression. There is also evidence that new theoretical foundations and reference groups in this evolving field are emerging.

Keywords: Hate Speech; Social Media; Bibliometric Analysis; Web of Science; Co-citation; Academic communication.

Resumen: Esta investigación utiliza el análisis bibliométrico para conocer si el Discurso de Odio (DDO) en España y Latinoamérica (LATAM) se ha consolidado como un campo de investigación maduro en años recientes (2021 y 2022). A través de la consulta de la base de datos Web of Science (WoS) y mediante la búsqueda de términos específicos se obtuvo un universo de 153 artículos para abordar la cuestión. Los resultados revelan que el análisis del DDO en España y LATAM es maduro, caracterizado por tendencias de publicación consistentes y autores influyentes, mientras que los investigadores productivos tienden a ser altamente citados. Las tendencias conceptuales incluyen el DDO, las redes sociales, Twitter y la libertad de expresión. También hay evidencia de que están surgiendo nuevos fundamentos teóricos y grupos de referencia en este campo en evolución.

Palabras clave: Discurso de odio; Redes sociales; Análisis bibliométrico; Web of Science; Co-citación; Comunicación académica.

Resumo: Esta pesquisa utiliza a análise bibliométrica para descobrir se o Discurso de Ódio (DDO) em Espanha e na América Latina (LATAM) se consolidou como um campo de pesquisa maduro nos últimos anos (2021 e 2022). Através da consulta da base de dados Web of Science (WoS) e da pesquisa de termos específicos, obteve-se um universo de 153 artigos para responder à questão. Os resultados revelam que a análise do DDO em Espanha e na América Latina é madura, caracterizada por tendências de publicação consistentes e autores influentes, enquanto os investigadores produtivos tendem a ser altamente citados. As tendências conceptuais incluem o DDO, redes sociais, Twitter e liberdade de expressão. Há também indícios de que estão a surgir novos fundamentos teóricos e grupos de referência neste domínio em evolução.

Palavras-chave: Discurso de ódio; Redes sociais; Análise bibliométrica; Web of Science; Co-citação; Comunicação académica.

1. Introduction

Hate Speech analysis is undoubtedly a growing area of interest within the scientific community. However, analyses from a bibliometric perspective are uncommon in this area compared to studies on the interaction between television and social networks (Segado-Boj et al., 2015). The aim of this research is to find out if the study of Hate Speech in Spain and Latin America (LATAM) has consolidated as a mature research field in recent years (2021 and 2022) using bibliometric tools.

Bibliometric analysis has become a fundamental tool for advancing research in various fields in Spain since López Piñero introduced, in the early 1970s, the methodology for evaluating scientific production and occurrences related to scientific communication. Spain has stood out as one of the most productive countries in the creation of bibliometric theses and articles (Delgado López-Cózar et al., 2006; Díaz-Campo, 2016), covering areas like environmental journalism (Barranquero & Marín, 2014), transmedia storytelling (Vicente-Torrico, 2017), organizational communication (Míguez-González & Costa-Sánchez, 2019), the growth of e-sports (Carrillo et al., 2018) and even political advertising (Arango Espinal et al., 2020).

Research from this perspective on media consumption is scarcer than in other areas (Repiso et al., 2011), particularly research on news dissemination in social networks (García-Perdomo et al., 2018; Kalsnes & Larsson, 2018; Segado-Boj et al., 2019). The boundaries of this line of research have blurred recently because of the exponential increase in academic attention, and new studies are needed to examine the origin of this trend, understand its characteristics, and explore possible avenues of growth. Furthermore, it is important to determine to what extent the increase in publications related to social networks reflects a solidly established and mature field in the Spanish and Latin American context.

Bibliometric analyses have various applications that include exploring areas of knowledge, detecting trends, delimiting the spectrum of existing approaches to specific topics, and identifying co-authorship networks. These networks represent sets of researchers who regularly collaborate with each other, reflecting the existence of research communities or possible academic genealogies with their particular characteristics (Segado-Boj et al., 2021a, p.81). Regarding the structure of research collaboration between Spain and LATAM, the nodes in its co-authorship network related to nations, institutions, and individuals have been expanding, making this rising interconnection a sign of maturity (Segado-Boj et al., 2021b).

Bibliometric analysis is a quantitative research method used to evaluate and analyze academic literature. This approach involves applying statistical and computational techniques to bibliographic data to gain insight into various aspects of academic publications, such as citation analysis, collaboration networks, research productivity and impact, or emerging trends (Todeschini & Baccini, 2016). Data collection and analysis rely on citation indexes (Web of Science, Scopus, Google Scholar), databases, and software tools. This method plays a crucial role in scientific research by providing objective and quantifiable information to identify research trends and limitations (Donthu et al., 2021).

Bibliometric indicators, while commonly used to analyze research fields and academic disciplines, can be applied to a wide range of objects and subjects. Recent bibliometric analyses cover scientific production in Social Science on investigative journalism (Segado-Boj et al., 2022), universities and their academic articles related to the United Nations Sustainable Development Goals (Repiso et al., 2011), as well as research on television series (Segado-Boj et al., 2021a).

Alicia Moreno-Delgado (2021) argues that academic attention to topics within Communication, Film, Radio, and Television extends beyond traditional domains, engaging scholars in Education, Economics, Sociology, and Humanities. Interdisciplinary diversity within this field poses both advantages and risks. For example, when addressing news dissemination on networks, the intricacies beyond "incidental consumption" should be considered, as discussed by Mitchelstein and Boczkowski (2018).

Bibliometric analysis is a versatile and informative tool applicable across diverse academic disciplines. It provides information on productivity, impact, and relationships, facilitating informed decision-making in both academia and research policy.

1.1. An approach to the concept of Hate Speech

Hate speech refers to the use of language to attack individuals or groups directly. Harassment, offensive language, threats, or violent and discriminatory expressions that incite hostility are used to target people because of their gender, sexual orientation, religious beliefs, race, and physical or intellectual disability (Carlson, 2021). It is also used to dehumanize individuals, distort reality, silence political affiliation, or show the superiority of one group over another (Guillén-Nieto, 2023).

Anonymity on the Internet stands out as fertile ground for the creation and dissemination of Hate Speech (Mondal et al., 2017). Studies examine the emotional and psychological impacts of online hate, addressing the LGBT community (Ștefăniță & Buf, 2021), cruelty in comments on TikTok towards marginalized groups (Paz-Rebollo et al., 2023), and self-censorship on Instagram due to the spiral of silence (Martínez Valerio & Mayagoitia Soria, 2021). Research has delved into hatred towards immigrants on Twitter (Román-San-Miguel et al., 2022), and politics plays a key role in the proliferation of Hate Speech on social media, leading to polarization and social fragmentation (González-Aguilar et al., 2023; Paz et al., 2021).

Bibliometric analysis shows a growing interest in Hate Speech research, particularly in Computer Science and Social Science, with an increase in scientific production on social media and online communities (Ramírez-García et al., 2022; Jahan & Oussalah, 2023). Although limited, research in Spain and Latin America reveals a significant increase in interest in networks in English, Portuguese, and Spanish (Izquierdo Montero et al., 2022). Despite fewer studies in Spanish, Spain and Latin American countries demonstrate consistent growth, an interdisciplinary approach, and diverse methodologies (Paz et al., 2020). Spain is recognized as a leading contributor in forensic linguistics for detecting verbal violence and discrimination, along with the United Kingdom and the United States (Alduais et al., 2023).

1.2. Research questions

Despite Spain's late incorporation into the international publication circuit of scientific research results in Communication, there has been an exponential growth in the number of publications (Saperas, 2016). This growth is mainly reflected in positivist works that explore the audience and its interaction with ICT and social networks (Segado-Boj et al., 2022). Social networks have nurtured their own field of study that goes beyond content production. This research proposes a bibliometric analysis focused on the scientific literature of Social Sciences in Web of Science (WoS). Based on the bibliometric proposal of Segado-Boj et al. (2021a) on television series, the article seeks to address three fundamental questions to deepen the issues raised in the context described.

  Q1. Is the analysis of hate speech in Spain and LATAM a truly mature field of research? Is there consistency in specialized journals and authors in the last two years? Are the most productive researchers also the most cited? Is there a constant annual evolution in the number of articles published?

  Q2. What have been the conceptual trends and lines of research in Hate Speech in Spain and LATAM in the last years?

  Q3. Is it possible to speak of theoretical foundations and reference groups that tend to be cited together in recent literature?

2. Methodology

Through mathematical methods, bibliometrics proves to be a valuable tool for transforming complex information into manageable data, making it possible to define the research profile of an academic institution on a specific topic (Moya-Anegón et al., 2007). This approach involves applying mathematical operations and statistical methods to analyze books, platforms, and media that disseminate knowledge (Pritchard, 1969).

The analysis of this research, based on the approach of delving into the state of science through the global production of highly specialized scientific literature (Okubo, 1997), focuses on Hate Speech diffusion studies using WoS as a source (n = 153 articles). We explored aspects such as authorship, publication journals, national-level production, international collaboration, and keyword co-occurrence, revealing the main intellectual trends in this field. In addition, we examined the co-occurrence of references to identify a theoretical foundation that serves as a basis for research in this area.

For bibliographic information, we consulted the Core Collection of the WoS database by searching Hate Speech AND ("Spain" OR "LatinAmerica" OR "Latin America" OR "LATAM" OR "Iberoamerica" OR "Ibero America") in all the fields of the documents. No complementary searches with alternative terms were performed because the concept is narrowly defined: at the academic level, "Hate Speech" is the most common way of referring to the phenomenon (Ramírez-García et al., 2022).
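The boolean search expression described above can be assembled programmatically, which makes it easier to document and rerun. The following is only a minimal sketch: the helper name `build_query` and the constant `REGION_TERMS` are hypothetical, and no WoS field tag is added because the text only states that all fields were searched.

```python
# Region terms exactly as listed in the text (including "LatinAmerica" as one word).
REGION_TERMS = [
    "Spain",
    "LatinAmerica",
    "Latin America",
    "LATAM",
    "Iberoamerica",
    "Ibero America",
]


def build_query(topic: str, regions: list[str]) -> str:
    """Combine a topic with an OR-group of quoted region terms (hypothetical helper)."""
    region_group = " OR ".join(f'"{term}"' for term in regions)
    return f"{topic} AND ({region_group})"


query = build_query("Hate Speech", REGION_TERMS)
print(query)
```

Keeping the region terms in a list makes it straightforward to add or remove a variant and regenerate the expression.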

Bibliometric analysis was chosen because it is widely used to assess and quantify the academic landscape and to trace its constant evolution (Segado-Boj et al., 2021b; Segado-Boj et al., 2022; Segado-Boj et al., 2023). Network theory is used to construct and analyze the networks: the centrality of nodes indicates the relevance of key themes, and co-citations between authors help identify a solid foundation in the scholarly literature and its connections (Barabási, 2009; Granovetter, 1973; Watts, 2003). This approach provides a theoretical and analytical framework for understanding the complexity of interconnected topics and discovering scientific communities. Degree centrality measures the importance of a node according to its number of connections, while betweenness assesses the share of shortest paths between other nodes in the network that pass through it (Wasserman & Faust, 1994).
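To make the two centrality measures concrete, the sketch below computes them on a toy five-node network. The network, the node names, and the function names are invented for illustration and are not the study's data. Betweenness is computed with Brandes' algorithm, a standard approach that is assumed here and is not necessarily what the software used in the study implements; it is reported as an unnormalized count.

```python
from collections import deque


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, starting from the nodes farthest from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}


# Toy network (hypothetical): a hub with two leaves, plus a bridge to a third leaf.
toy = {
    "hub": {"bridge", "leaf1", "leaf2"},
    "bridge": {"hub", "leaf3"},
    "leaf1": {"hub"},
    "leaf2": {"hub"},
    "leaf3": {"bridge"},
}
degree = degree_centrality(toy)
between = betweenness_centrality(toy)
print(degree["hub"], between["hub"], between["bridge"])
```

In this toy case the hub has the most connections, while the bridge node has few connections but still lies on every shortest path to the third leaf, which is the distinction the two measures are meant to capture.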

The delimitation of this article, which presents a bibliometric analysis of scientific collaboration in Spain and LATAM on Hate Speech, is due to the frequent patterns of collaboration and co-authorship between these two regions (Belli and Baltà, 2019; Segado-Boj et al., 2021b) in Social Sciences (Aguado-López and Becerril-García, 2016; Aguado-López et al., 2017).

The search was restricted to articles published in journals in the years 2021, 2022, and up to May 2023, resulting in a universe of 153 units related to the topic. The bibliometric information was downloaded on May 30, 2023, and the VOSviewer program (van Eck & Waltman, 2010) was used to obtain the descriptive data on productivity and performance. The same software was used to generate the co-citation and co-word networks. These were analyzed and represented with Pajek (Batagelj & Mrvar, 1998), without the content of the articles becoming part of the object of study of this research.

For the visual representation of the graphs, we applied a reduction criterion: only keyword pairs that co-occur at least three times and references that have been cited together at least four times are shown. To identify the different communities, the Louvain algorithm was used with the following parameters: multilevel coarsening, single refinement, resolution parameter = 1, number of random restarts = 1, maximum number of levels in each iteration = 20, and maximum number of repetitions in each level = 50. The color of the nodes in the graphs corresponds to these identified communities.
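The reduction criterion can be illustrated with a short sketch that counts how often keyword pairs appear together and keeps only pairs above a threshold. The records below are invented for illustration and are not the study's data. The function names are hypothetical, and lower-casing keywords before counting is an assumption about normalization.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for each unordered keyword pair, the articles in which both appear."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize and deduplicate so a pair counts once per article.
        unique = sorted({k.strip().lower() for k in keywords})
        counts.update(combinations(unique, 2))
    return counts


def filter_edges(counts, min_count):
    """Keep only the pairs that co-occur at least min_count times."""
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Hypothetical article keyword lists, invented for illustration.
articles = [
    ["hate speech", "twitter", "immigration"],
    ["hate speech", "Twitter"],
    ["hate speech", "twitter", "racism"],
    ["hate speech", "racism"],
]
edges = filter_edges(cooccurrence_counts(articles), min_count=3)
print(edges)
```

The same counting applied to the reference lists of the articles, with `min_count=4`, would give the co-citation pairs that pass the second threshold.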

Through the application of the method mentioned above, we obtained results that allowed us to answer our research questions.

3. Results

3.1. Data collection

The research on hate speech in Spain and LATAM has a total universe of 153 articles published from January 2021 to May 2023. The number of published articles increased from 2021 to 2022, with a growth rate of 29%. Up to May 2023, only six articles had been published (Graph 1). This number is significantly lower than in the previous years, indicating a potential slowdown or change in content production. However, it is important to note that this figure only covers the first months of 2023. The low count might reflect seasonal variation in production, changes in publication frequency, shifts in content strategy, or simply a lag in WoS indexing.
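For reference, a year-over-year growth rate of this kind is conventionally computed as shown below. The per-year counts appear only in Graph 1 and are therefore left as symbols here. The notation is introduced for illustration, and it is an assumption that the reported 29% was obtained with this formula.

```latex
g = \frac{N_{2022} - N_{2021}}{N_{2021}} \times 100\%,
\qquad
N_{2021} + N_{2022} + N_{2023} = 153,
\quad
N_{2023} = 6
```

Here $N_y$ denotes the number of articles published in year $y$, and the reported value is $g \approx 29\%$.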

Graph 1
Number of articles published per year on Hate Speech

Source: Own elaboration

Regarding productivity and impact, the collected data includes authors from various countries, such as Spain, Germany, the United States, and Italy, suggesting international diversity in the academic landscape (Table 1). The data indicates that publications within Spain and LATAM represent 69% of scientific production. Rosso from Valencia Polytechnic University in Spain has published the most articles (8). Both Gámez-Guadix (Spain) and Wachs (Germany) are the authors with the highest number of citations (69 each). Four authors (Arcila-Calderón, Blanco-Herrero, Sánchez-Holgado, & Amores) come from the University of Salamanca in Spain, which suggests potential collaboration within the institution. Concerning the impact of published work, authors with more citations tend to have a more significant impact in their respective fields, indicating that their research is influential and recognized by other researchers. While Rosso has the highest number of published articles, it is important to consider the balance between quantity and quality: Gámez-Guadix and Wachs, with fewer articles, have a comparable number of citations, indicating high-impact research.

Table 1
Authors with more than two manuscripts in 2021, 2022 and first months of 2023 published on Spain and LATAM

Source: Own elaboration

In our dataset, the range of citations varies from 10 to 32, highlighting discrepancies in the impact and influence of articles (Table 2). This analysis suggests that the most highly cited manuscripts delve into various aspects of Hate Speech, offering insights into its detection, analysis, and prevention across different contexts. High citation counts reveal these articles have made significant contributions to the field. Keywords associated with each article unveil the primary research foci, including hate speech detection, cyberhate, sentiment analysis, and machine learning, indicating academic interest in these areas. These keywords underscore the central themes of understanding and combating Hate Speech, particularly online. The presence of diverse keywords like affective computing, social bias, immigration, and multimodal analysis underscores the interdisciplinary nature of research in this domain. Additionally, keywords like newsgames and gamification suggest emerging, specialized research areas within the broader field of study, addressing unique aspects of the topic.

Table 2
Most cited manuscripts

Source: Own elaboration

The collected data shows a diverse range of knowledge areas covered by the top 10 journals that published the most articles on Hate Speech in Spain and LATAM during the analyzed period. Communication appears to be a dominant knowledge area with journals like Comunicar, Profesional De La Información, Revista Mediterránea Comunicación, and Revista Latina de Comunicación Social, all focusing on aspects of Communication and Education (Table 3). Some journals are quite specialized, such as Natural Language Processing and Information Systems, dedicated to Computer Science, and Complex Networks & Their Applications, specialized in Computational Intelligence. A few journals (Humanities & Social Sciences Communications and International Journal of Environmental Research and Public Health) bridge the areas of Social Science and Health Science, demonstrating interdisciplinary research. While the number of articles published in each journal does not necessarily reflect their impact, it does suggest their significance within their respective fields. This data highlights the broad spectrum of academic and research fields that these journals cater to.

Table 3
Top 10 journals that published the most articles on Hate Speech in Spain and LATAM

Source: Own elaboration

3.2. Conceptual and intellectual trends in the field

The relationship between the most used keywords and the number of articles provides a glimpse into the level of interest and attention each topic receives in the analyzed data (Table 4). The large number of articles that use hate speech as a keyword (66) indicates that this is a highly discussed and concerning issue, likely driven by its prevalence on the internet. While social media is a broad category, the number of articles (28) suggests that it remains a topic of significant interest, possibly because of its impact on various aspects of society, including communication, politics, and culture. As a prominent social media platform, Twitter has garnered attention (19 articles) due to its role in shaping public discourse and controversies related to content moderation. The focus on freedom of expression (18 articles) indicates ongoing debates and discussions about the boundaries of this fundamental right, particularly in the context of the digital age. Although not as widely discussed as other related topics, natural language processing (7 articles), cyberhate, deep learning, immigration, racism (6 articles each), machine learning and xenophobia (5 articles each) may suggest there is a niche but growing field with discussions revolving around its applications, challenges, and ethical considerations.

Table 4
Most used keywords

Source: Own elaboration

The analysis of the co-keyword network allows for a graphical identification of communities (groups of nodes), which are visually represented by various colors, as well as how they interrelate. This enables the identification of the main research themes and how they appear together in scientific literature.
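The communities in these networks come from the Louvain algorithm described in the Methodology, which searches for a partition of the nodes that maximizes modularity. As a reference sketch, generalized modularity with a resolution parameter is commonly written as below. The symbols are introduced here for illustration, and whether the software used in the study applies exactly this normalization is an assumption.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \gamma \, \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here $A_{ij}$ is the weight of the link between nodes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the weighted degree of node $i$, $m = \tfrac{1}{2} \sum_{i,j} A_{ij}$ is the total link weight, $c_i$ is the community of node $i$, $\delta$ is the Kronecker delta, and $\gamma$ is the resolution parameter. With $\gamma = 1$, the value reported in the Methodology, the expression reduces to standard modularity.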

In general (Figure 1), research trends encompass the detection and analysis of hate speech in text and social media (yellow), multilingual hate speech detection in social media (green), and the analysis and detection of specific types of hate speech, such as misogyny, with a particular emphasis on Twitter (blue). Three sets of nodes (red, white, and pink) represent niche research domains, including anti-immigration sentiment, racist discourse, manifestations of cyberhate, and the dynamics between bystanders and perpetrators in the context of online hate. The analysis of publication years by groups shows how research has evolved in these areas of study.

The highest degree centrality is associated with the keyword social media, which belongs to the blue community but is connected to the two larger communities, the yellow one through immigration and Twitter and the green one (online hate speech, Instagram, political communication, and digital campaigning), as well as to the red community (freedom of expression). Finally, there is a separate community (white) that connects with social media through racism. In conclusion, social media is the shared theme in the majority of Hate Speech studies in Spain and Latin America. Additionally, there is a community that does not interconnect with the others, with keywords related to natural language processing, text classification, and machine learning, which implies a focus on studies stemming from Engineering.

Figure 1
Keyword co-occurrence

Source: Own elaboration

The cluster with the most nodes (green) focuses on the political sphere, indicating the significance of Instagram and online hate speech, which can serve as a deterrent or a mobilizing factor in political communication and digital campaigning. The terms immigration, Twitter, social network analysis, and deep learning form a cluster (yellow), demonstrating a connection in the context of research and analysis to comprehend public opinion, sentiments, and narratives related to immigration using big data as a tool. Another cluster centers on the LGBTQ+ community, homophobia, and hate messages (in blue); the intersectionality of these terms and their association with the keyword social media signifies the impact of Hate Speech analysis in the digital space. Freedom of expression is directly linked to the terms religious freedom and dignity (red), which are fundamental human rights. In contrast to these concepts, racism and xenophobia (in white) form a group directly connected to social media. There is a separate cluster (in pink) in which the terms machine learning and text classification are linked to the keyword natural language processing; these three fields are interconnected within the domains of Artificial Intelligence and Computer Science and may indicate a particular focus on applying these technologies to address issues related to Hate Speech.

3.3. Co-citation analysis

The co-citation node analysis provided insights into the relationships between different publications and how they contribute to the theoretical framework of other researchers. In other words, it helped to determine whether various academic communities share a reference group or if there were multiple groups and how they interrelated.

As observed in Figure 2, two communities (green and yellow) are interconnected through several nodes. The highest degree centrality is held by Basile (2019) (see Table 5 for the complete reference), which connects the yellow and green groups to the blue one.

Most of these references pertain to studies conducted from a Computer Science perspective rather than from the field of Social Sciences. On the other hand, three groups (white, pink, and red) are not connected to the main co-citation network and are interdependent on each other; these are precisely the studies based on the Communication field. Particularly noteworthy is that these groups, unconnected to the primary co-citation network, include the two most highly cited authors, Gámez-Guadix and Wachs.

Figure 2
Co-citation network

Source: Own elaboration

Table 5
Publications that tend to be co-cited

Source: Own elaboration

4. Conclusions and discussion

The analysis of hate speech has evolved into a mature field of research in Spain and Latin America (LATAM), as evidenced by consistent publication trends, influential authors, and conceptual themes (Mondal et al., 2017; Paz-Rebollo et al., 2023; Ștefăniță & Buf, 2021). The analysis of 153 articles reveals that the number of citations and associated keywords provide valuable insights into the impact, relevance, and areas of focus in hate speech and related research. The combination of citations and keywords creates a comprehensive picture of the academic landscape, indicating that the most cited manuscripts cover various aspects of Hate Speech while referring to a broad range of areas and themes. Most of these themes are related to understanding and combating Hate Speech, particularly in participatory media (Instagram and Twitter), but other topics (homophobia, hate messages and freedom of speech) are also gaining in significance.

Co-citation analysis sheds light on multidisciplinarity. The co-citation network suggests robust theoretical foundations and solid reference works within each group. These co-cited publications originate from various conferences and journals, reflecting a multidisciplinary approach to the topic under analysis. Most references refer to research from a Computer Science perspective while the field of Social Sciences is less prominent.

The analysis shows that authors with more citations tend to have a more significant impact in their respective fields, indicating that their research is influential and widely recognized. Notably, Gámez-Guadix and Wachs, despite publishing fewer articles, have garnered a higher number of citations, highlighting the impact of their research. In addition, publications within Spain and LATAM represent 69% of scientific production, underscoring the regional importance of this research. The top 10 journals in this field cover a wide range of knowledge areas, with Communication emerging as a dominant domain. Journals such as Comunicar, Profesional De La Información, Revista Mediterránea Comunicación, and Revista Latina de Comunicación Social feature prominently in the landscape of hate speech analysis.

Regarding the research questions, this study provides valuable insights. As for the first research question, the results confirm that the analysis of hate speech in Spain and LATAM is mature, characterized by consistent publication trends and influential authors, while productive researchers tend to be highly cited. The second research question asked what the conceptual trends and lines of research in Hate Speech in Spain and LATAM have been in recent years. The analysis shows that conceptual trends have encompassed Hate Speech, social media, Twitter, and freedom of expression. Finally, the third research question asked whether it is possible to speak of theoretical foundations and reference groups that tend to be cited together in recent literature. Though not conclusive, there is evidence suggesting the emergence of theoretical foundations and reference groups in this evolving field.

In conclusion, it seems that the landscape of hate speech analysis in Spain and LATAM is characterized by its maturity, consistency in publication trends, influential researchers, and evolving conceptual themes. This research field continues to grow, revealing its relevance in an increasingly interconnected world.

Authors’ contribution

Tamara Antona Jimeno: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Visualization, Writing – review and editing. Ana Mayagoitia-Soria: Conceptualization, Formal analysis, Investigation, Methodology, Visualization, Writing - review and editing. Jasmina Đorđević: Investigation, Methodology, Writing - review and editing. All authors have read and agree to the published version of the manuscript. Conflicts of interest: The authors declare that they have no conflict of interest.


