Media bias and electoral discourse: a Natural Language Processing approach

Sesgo mediático y discurso electoral: una aproximación con Procesamiento del Lenguaje Natural

Preconceitos dos media e discurso eleitoral: uma abordagem de processamento da linguagem natural

Mar Castillo-Campos1*
David Becerra-Alonso2**
David Varona-Aramburu3***

1 Loyola Andalucía University, Spain
2 Loyola Andalucía University, Spain
3 Complutense University of Madrid, Spain

* Research assistant in the Department of Communication and Education. Loyola Andalucía University, Spain. Email: mcastillo@uloyola.es
** Associate Professor in the Department of Quantitative Methods. Loyola Andalucía University, Spain. Email: dbecerra@uloyola.es
*** Assistant Professor at the Department of Journalism and New Media. Complutense University of Madrid, Spain. Email: davarona@ucm.es

Received: 16/02/2024; Revised: 21/02/2024; Accepted: 18/01/2025; Published: 08/07/2025

To cite this article: Castillo-Campos, Mar; Becerra-Alonso, David; & Varona-Aramburu, David. (2025). Media bias and electoral discourse: a Natural Language Processing approach. ICONO 14. Scientific Journal of Communication and Emerging Technologies, 23(1): e2154. https://doi.org/10.7195/ri14.v23i1.2154

Abstract

The media must serve as a reliable channel of information, especially given the overabundance of information and the personal channels available to public figures. In a society that is increasingly polarized and distrustful of the media, this study seeks to analyze the media coverage of an electoral campaign in Spain, the first following the global COVID-19 pandemic. Using TF, TF*IDF, and word2vec for text quantification and vectorization, along with UMAP and t-SNE for cluster analysis, we examine how certain terms are used in media outlets and their semantic associations. Our findings reveal a tendency for the media to link certain candidates or parties with political extremes, violence, and negativity, often overshadowing substantive political discourse. In particular, coverage predominantly focusses on major parties and polarizing factions. Campaign events receive more attention than policy proposals, which are often neglected. These insights align with previous qualitative studies, demonstrating the efficacy of our quantitative approach in expanding the sample size, reducing analysis time, and revealing nuanced patterns not readily apparent through traditional methodologies. This study contributes to a deeper understanding of the dynamics of the media during election cycles and underscores the value of quantitative methods in media analysis.

Keywords
Elections coverage; Discourse analysis; Natural language processing; NLP; Computational communication; Politics.

Resumen

Los medios de comunicación deben ser un canal fiable de información, especialmente en el contexto de sobrecarga informativa y de la presencia de los perfiles y canales privados de las figuras públicas. En una sociedad cada vez más polarizada y desconfiada hacia los medios, este estudio pretende analizar la cobertura mediática de una campaña electoral en España, la primera tras la pandemia de COVID-19. Utilizando TF, TF*IDF y word2vec para la cuantificación y vectorización del texto, junto con UMAP y t-SNE para el análisis de conglomerados, examinamos cómo se utilizan determinados términos en los medios de comunicación y sus asociaciones semánticas. Nuestros resultados revelan una tendencia de los medios a vincular determinados candidatos o partidos con los extremos políticos, la violencia y la negatividad, a menudo eclipsando el contenido del discurso político sustantivo. La cobertura se centra mayoritariamente en los grandes partidos y las facciones polarizadoras. Los actos de campaña reciben más atención que las propuestas políticas, que a menudo pasan desapercibidas. Estos datos coinciden con los de estudios cualitativos anteriores, lo que respalda la eficacia de este enfoque cuantitativo al ampliar el tamaño de la muestra, reducir el tiempo de análisis y revelar patrones sutiles que no se aprecian fácilmente con las metodologías tradicionales. Este estudio contribuye a una comprensión más profunda de la dinámica de los medios de comunicación durante los períodos electorales y subraya el valor de los métodos cuantitativos en el análisis de los medios de comunicación.

Palabras clave
Cobertura electoral; Análisis del discurso; Procesamiento del lenguaje natural; Comunicación computacional; Política.

Resumo

Este estudo emprega métodos quantitativos e de inteligência artificial para examinar a cobertura da mídia durante uma campanha eleitoral. Utilizando TF, TF*IDF e word2vec para quantificação e vetorização de textos, juntamente com UMAP e t-SNE para análise de clusters, analisamos como certos termos são utilizados em diferentes meios de comunicação e suas associações semânticas. Nossos resultados revelam uma tendência dos meios de comunicação a associar certos candidatos ou partidos a extremos políticos, violência e negatividade, frequentemente ofuscando o discurso político substantivo. Notavelmente, a cobertura se concentra predominantemente nos principais partidos e nas facções polarizadoras. Os eventos da campanha recebem mais atenção do que as propostas políticas, que muitas vezes são negligenciadas. Esses achados estão alinhados com estudos qualitativos anteriores, demonstrando a eficácia de nossa abordagem quantitativa em expandir o tamanho da amostra, reduzir o tempo de análise e revelar padrões sutis que não são facilmente perceptíveis por métodos tradicionais. Este estudo contribui para uma compreensão mais profunda das dinâmicas midiáticas durante os ciclos eleitorais e ressalta o valor dos métodos quantitativos na análise da mídia.

Palavras-chave
Cobertura eleitoral; Análise do discurso; Processamento de linguagem natural; Comunicação computacional; Política.

1. Introduction

Journalists have the job of seeking and reporting what citizens do not directly experience. Thus, journalism has the possibility and responsibility to get the facts, interpret them, provide the context, and decide whether or not to report them. In the political sphere, this is especially important because journalism has been key to bringing politics closer to citizens and to achieving “greater visualization and transparency of the political decision-making process” (Holgado, 2017). For citizens, it is the main way to get to know the political candidates and their proposals without the politicians presenting them directly. For politicians, it serves as a platform for gaining visibility and relevance and for asking for votes. In fact, simply naming certain parties gives them an advantage by positioning them within the public imagination (Haselmayer et al., 2017). The media’s role is to mediate and connect both actors, citizens and politicians, facilitating mutual understanding. Ideally, this enables political parties to propose measures that interest citizens and citizens to make informed voting decisions.

However, the media’s approach to narrating political events often focusses not on party proposals, but rather on the actions politicians take to gain visibility and recognition. This contributes to social unrest, as “during electoral campaigns, politicians find it easier to disparage their opponents than to explain their policy proposals or actions” (Fernández-Fernández, 2009). Decades ago, Arredondo-Ramírez (2000) described elections as “an intense struggle to occupy more public space and to be identified, remembered, pointed out and, ultimately, chosen.” The presence and campaigns of candidates on social media are also of interest, as they are often replicated by the media (Bright et al., 2020; Green and Gerber, 2019). As argued by Thurber and Nelson (2018), “campaigns matter” in this regard.

The presence in the media (...) has constituted one of the most important ways that parties find to publicize their message and request votes during electoral campaigns, compared to other more classic methods such as poster hanging or rallies. In fact, these are still carried out, but with a media purpose; that is, they are designed to be televised or documented in press headlines (Holgado, 2017).

Electoral campaigns are highly mediated, fostering intense competition between media outlets (Denton et al., 2019; Thurber and Nelson, 2018). According to Castells (2009), media narratives often adopt infotainment-style language, emphasizing sensational elements such as intrigue, violence, or sex. Although not explicitly regulated within Spanish journalism, the use of violent language and the promotion of tension contradict ethical practices (Fernández-Fernández, 2009). This study does not examine the impact of language on individuals, but other authors have already demonstrated its influence (e.g., Van Duyn and Collier, 2019; Veres, 2006).

Furthermore, the media often amplifies the debate among political actors, focusing primarily on competing for headlines (Barandiarán et al., 2020). This often-hostile debate also benefits politicians, since “candidates using a more negative tone and, especially, those who make greater use of emotional appeals receive greater media coverage” (Maier and Nai, 2020). Aware of this effect, parties participate in various public forums and influence media narratives (López-García, 2017). Another effect of mediatization, a reciprocal process between politics and the media (Hjarvard, 2004; Deacon and Stanyer, 2014), is that parties leverage the popularity of their leaders to gain media exposure (Casero-Ripollés et al., 2016).

Just as politicians adopt a more negative or confrontational tone, the press similarly frames elections in terms of victories or defeats and emphasizes campaign conflicts or emotional appeals (Thurber and Nelson, 2018; Maier and Nai, 2020). All of this affects individuals’ perceptions of politicians leading up to elections (Antoniades, 2020; Casero-Ripollés et al., 2016), and shapes societal attitudes toward electoral processes (Goidel et al., 2021).

The influence also runs in the other direction: politicians shape the population’s perception of the media. Criticisms of the media by political figures exacerbate perceived bias among the public (Fawzi, 2019; Smith, 2010), which in turn contributes to disenchantment with government (Hutchens et al., 2016). This is not a marginal effect: according to a Reuters survey (2023), politicians are the most frequent critics of the media, accounting for 42% of criticisms directed at media and journalists.

These spheres are interconnected: negative information and language influence polarization, and political polarization further undermines public trust in the media and fosters hate speech (Reporteros sin Fronteras, 2020). A study on North American public trust in the media (Jones, 2004) found that the lack of information about political differences and electoral programs negatively impacts society’s perception of the press. Moreover, empty rhetoric makes citizens view political discourse as frustrating, divisive, and something to be avoided (Duggan and Smith, 2016). In Spain, the country where these elections take place, public trust in the media is lower than in Europe as a whole and continues to decline (Eurobarómetro, 2021; Asociación de la Prensa de Madrid, 2020).

1.1 Political parties in Spain

In Spain, despite more than 4,900 political parties being registered by 2022, only a dozen receive national media coverage, highlighting the persistent oligarchy within its so-called multi-party system. Spain’s Organic Law 5/1985 determines public media ad space for parties by their past election performance, which can reduce visibility for newer or smaller parties. Although private media are subject to certain regulations, they retain significant autonomy in deciding which parties to cover. This study underscores the importance of media in providing comprehensive and balanced political information, reflecting their autonomy in directing their decisions within the framework of Western democratic principles.

2. State of the art

There are studies on media coverage in electoral periods; however, most of them neither use quantitative methodologies nor incorporate artificial intelligence for the analysis, and they do not employ a comparative approach to investigate media outlets and candidates (Conroy et al., 2015; Córdoba-Cabús, 2018; De Vreese et al., 2006; Ergün and Karsten, 2021; Gallardo-Paúls and Enguix-Oliver, 2015; Holtz-Bacha et al., 2014; Mancera-Rueda and Villar-Hernández, 2020; Mazaira-Castro et al., 2019; Paniagua-Rojano et al., 2020; Sánchez-Gutiérrez and Nogales-Bocio, 2018; Smith, 1997; Terkildsen and Damore, 1999; Vizoso and López-García, 2020).

Applying neural networks and machine learning to journalism data, as proposed here, is a recent shift from predominantly qualitative approaches (Berrocal-Gonzalo et al., 2017; Fenoll and Rodríguez-Ballesteros, 2017; Mancera-Rueda and Villar-Hernández, 2020; Miguel-Sáez-de-Urabain et al., 2017; Sánchez-Gutiérrez and Nogales-Bocio, 2018; Van der Pas and Aaldering, 2020). Previous studies mostly employed text vectorisation to group or classify news rather than analyze term relationships (Orden-Cruz et al., 2019; Riedel et al., 2017; Zhao and Chang, 2020). Some authors also used similar techniques for supervised learning (Berven et al., 2020; Edell, 2018; Nicholls and Culpepper, 2021; Zhou et al., 2019).

Numerous studies examine electoral coverage in Spain. Berrocal-Gonzalo et al. (2017) used support vector machines to analyze conflict in electoral debates. Fenoll and Rodríguez-Ballesteros (2017) correlate media intensity with candidate rank in the 2015 general elections. Mancera-Rueda and Villar-Hernández (2020) compare how different media cover the Vox party. Sánchez-Gutiérrez and Nogales-Bocio (2018) analyze Podemos from a critical discourse perspective. Miguel-Sáez-de-Urabain et al. (2017) scrutinize El País’ coverage of the 2016 US elections.

On the other hand, the study Anatomy of the electoral hoax: political disinformation in Spain’s 2019 general election campaign (Paniagua-Rojano et al., 2020) analyses the content and dissemination of electoral hoaxes identified by news fact-checkers, as do Mazaira-Castro et al. (2019) and Vizoso and López-García (2020). Some researchers (López-Meri, 2017; Campos-Domínguez and Calvo, 2017; Chaves-Montero et al., 2017; Ballesteros-Herencia, 2020) focus on election coverage on social media. Although these platforms are undoubtedly growing, they are still not the main source of information in election campaign periods (67%, compared to 93% of citizens who prefer to be informed through the press, television or radio, whether in their digital versions or not). The latest studies we have found on comparative trust indicate that traditional media, including their digital versions, are still the most trusted by the population (Kalogeropoulos et al., 2019), and they are particularly consulted in the context of the pandemic (Li and Sun, 2021), a context in which these elections are taking place. We also found articles on the posts made by politicians on their own social media profiles during the campaign (e.g., Diez-Gracia et al., 2023) that, while potentially complementary, fall outside the scope of this study, which is focused on media analysis.

3. Proposal

This study examines the media coverage of Spain’s first electoral campaign post-COVID-19, focusing on the equitable treatment of parties and candidates rather than classifying news. The following objectives are derived:

-Analyze visibility disparities among political parties, candidates, and topics in media coverage.

-Examine how media outlets vary in their portrayal of presidential candidates, particularly in relation to their electoral programs or involvement in controversies.

4. Sample

For this study, we used the Really Simple Syndication (RSS) of the most widely read open media in Spain based on the report by the Reuters Institute and the University of Oxford (Newman et al., 2020) and the Association for Media Research (AIMC, 2020). The media that allow the extraction of news in real time through RSS are ABC, eldiario.es and El País. Furthermore, these three outlets are associated with a minimal risk of disinformation (Magallón et al., 2021). The media outlets 20 minutos and El Mundo would belong to the desired sample, although they do not have an open RSS and, therefore, do not allow the automatic extraction of news. The sample for this study is all news released by these three media outlets that refer to the Community of Madrid elections during the two-week period of the election campaign, from April 18 to May 2, 2021.

5. Methodology

NLP supported by machine learning is chosen as the quantitative methodology. Three methods are used: TF; TF*IDF; and neural-network vectorization of terms, followed by the calculation of Euclidean distances and cosine similarity between them.

TF or “term frequency” is a count of words in the text, both absolute and relative to the total number of words in the same document. The potential number of times a reader could have been exposed to different terms, ideas or candidates is calculated from it, given the possible correlation between the intensity of coverage of each candidate and their position in pre-election polls (Fenoll and Rodríguez-Ballesteros, 2017). It is a common starting metric in media analysis studies (Aljarah et al., 2020; Høyer and Nossen, 2015; Terachi et al., 2006; Yamamoto and Church, 2001).
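The absolute and relative counts described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `term_frequencies`, the regular-expression tokenizer, and the sample sentence are assumptions introduced here.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Return (absolute counts, relative frequencies) for one document."""
    # Hypothetical tokenizer: lowercase, then split on runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Relative frequency = occurrences of a term / total words in the document.
    relative = {term: n / total for term, n in counts.items()} if total else {}
    return counts, relative


# Made-up sample text, used only to exercise the function.
sample = "Vox and PP debate; Vox leads the debate"
absolute, relative = term_frequencies(sample)
print(absolute["vox"], relative["vox"])
```

In a real corpus, the tokenizer would typically also remove stop words, a step omitted here for brevity.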

TF*IDF is a metric that allows for locating media outlets that give more importance to a specific term. It is formulated as the product of the relative frequency of a word or set of words in a text (TF) and a factor that decreases as the term appears in more texts of the whole selection (the inverse document frequency, IDF). TF*IDF is commonly used to locate keywords for subsequent analyses (Köffer et al., 2018; Koloski et al., 2021; Xu et al., 2018) or to find differences related to terms used by different sources (Ghosh et al., 2020). In this research, TF*IDF is used to detect which media outlets give more importance to certain terms, whether candidates, certain political measures, or specific events that happened during the campaign.
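A minimal sketch of this comparison follows, treating each outlet's text as one document. The exact TF*IDF variant used in the study is not stated, so the smoothed inverse document frequency below, ln((1 + N) / (1 + df)) + 1, is an assumption, as are the function name `tf_idf` and the toy headlines.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps an outlet name to its list of tokens.

    Returns a dict: outlet -> {term: TF*IDF score}.
    """
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}
    # Document frequency: in how many outlets does each term appear?
    df = Counter()
    for outlet_counts in counts.values():
        df.update(outlet_counts.keys())
    scores = {}
    for name, outlet_counts in counts.items():
        total = sum(outlet_counts.values())
        scores[name] = {
            # Relative TF times a smoothed IDF (assumed variant).
            term: (n / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, n in outlet_counts.items()
        }
    return scores


# Hypothetical toy headlines, not data from the study.
toy = {
    "outlet_a": "ayuso wins debate ayuso".split(),
    "outlet_b": "vox rally vox debate".split(),
    "outlet_c": "debate on education".split(),
}
scores = tf_idf(toy)
print(max(scores["outlet_a"], key=scores["outlet_a"].get))
```

A term used by every outlet ("debate") keeps only its relative frequency as its score, while a term concentrated in one outlet ("ayuso" in `outlet_a`) is weighted upwards.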

In the final phase of the investigation, the word2vec model (Mikolov et al., 2013) is utilized to transform each word into a multidimensional vector, allowing the measurement of the probability that a given term appears adjacent to another. This probability is determined by the distance or similarity between terms within a sample and assesses, for each media outlet, the degree of association between two concepts. Similarly to other vector space models, word2vec generates vector representations such that similar or related words are positioned close to each other. For example, political parties such as PSOE, PP, and Unidas Podemos (UP) are located in the same region of the multidimensional space, whereas terms such as coronavirus, pandemic, and COVID are situated farther away. Smirnova et al. (2021) used word2vec to identify thematic areas of information and extract keywords. This methodology has also been applied in other media studies, including user comments on digital portals and the training of supervised learning experiments (Budiman et al., 2019; Köffer et al., 2018). Its academic application is more prevalent in the analysis of social media (characterized by shorter and less elaborate texts) or of scientific articles (Mustafa et al., 2021; Terachi et al., 2006); it is not commonly used in media analysis, where it often fails to produce conclusive results (Al-Omari et al., 2019).
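Once terms are represented as vectors, the two measures named in the methodology (cosine similarity and Euclidean distance) reduce to a few lines. The sketch below does not train a word2vec model: the three-dimensional vectors are invented purely for illustration, and the helper names (`cosine_similarity`, `euclidean_distance`, `nearest_terms`) are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors (0.0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_terms(target, vectors, k=3):
    """Rank all other terms by cosine similarity to `target`, highest first."""
    ranked = [
        (term, cosine_similarity(vectors[target], vec))
        for term, vec in vectors.items()
        if term != target
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)[:k]


# Invented toy embeddings; real word2vec vectors have far more dimensions.
toy_vectors = {
    "psoe": [0.9, 0.1, 0.0],
    "pp": [0.8, 0.2, 0.1],
    "pandemic": [0.0, 0.1, 0.9],
}
print(nearest_terms("psoe", toy_vectors, k=2))
```

With these made-up vectors, the party terms rank as neighbours of each other while "pandemic" ranks last, mirroring the example given in the text.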

In this research, the selection of terms is constructed from (a) the most frequent words for each media outlet (obtained with TF), (b) words with the highest TF*IDF value for each media outlet, (c) words that refer to policy proposals based on the electoral programs and (d) keywords from the slogans of each candidate for their own electoral campaign.

To effectively visualize high-dimensional data generated from text analysis, we rely on dimensionality reduction techniques such as t-SNE and UMAP. These methods help us transform the complex, multi-dimensional vectors into a 2D space where patterns and relationships are more easily discernible.

t-SNE (Van der Maaten and Hinton, 2008) is a non-linear dimensionality reduction technique that is particularly well-suited for embedding high-dimensional data for visualization in a low-dimensional space. t-SNE minimizes the Kullback-Leibler divergence between the joint probabilities of the high-dimensional data and the low-dimensional embedding, effectively preserving the local structure and creating visually interpretable clusters.
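For reference, the objective described above can be written out following the original t-SNE formulation (Van der Maaten and Hinton, 2008). The notation is introduced here for illustration: $y_i$ is the two-dimensional image of the $i$-th term vector, $n$ is the number of terms, and $p_{j|i}$ is the Gaussian-based conditional similarity of term $j$ to term $i$ in the original space.

```latex
C = \mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n},
\qquad
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^{2}\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^{2}\right)^{-1}}
```

Because pairs with large $p_{ij}$ dominate the sum, the cost mainly penalizes placing terms that are close in the original space far apart in the projection, which is why local structure is preserved.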

UMAP (McInnes et al., 2018) constructs a high-dimensional graph representation of the data and optimizes a low-dimensional graph to be as structurally similar as possible. This method is advantageous in maintaining both local and global data structure.

The visualization made with t-SNE and UMAP is crucial for identifying clusters, patterns and anomalies within the data.

All these experiments were conducted in Visual Studio Code using Python. The analysis of the terms was performed in Spanish, the original language of the data. Subsequently, the terms were translated into English for their presentation and inclusion in the article.

6. Results

Out of the 23 parties participating in the Community of Madrid elections, 18 are mentioned at least once. However, the majority are not mentioned more than once, with nearly all of the information concentrated on the six main parties. ABC mentions the most parties (18), followed by eldiario.es (8) and El País (7). None of these media outlets mentions the concurrent Green Municipalists, the United-Green Coalition, the Retired Social Democratic Party (PDSJE), the European Union of Pensioners, or For a More Just World (PUM+J) groups during the electoral campaign.

The party with the most mentions is Vox, with 1,129 mentions across three media outlets over two weeks, followed by PP with 960 mentions and UP with 730 mentions. These absolute figures indicate the number of times that the reader could have encountered mentions of a political group relative to another party. Fewer mentions correspond to a decreased likelihood of receiving votes. Fenoll and Rodríguez-Ballesteros (2017) report a correlation between the intensity of each candidate’s media coverage and their predicted position in the electoral ranking according to pre-electoral barometers.

6.1 Headlines analysis

Initially, we analyze the headline terms, which is key for two main reasons (Kalogeropoulos et al., 2018):

-Rapid information consumption leads many readers to skim headlines, with 57.8% in Spain doing so (Castillo, 2017). Competitive media tend to craft attention-grabbing, sometimes exaggerated, headlines (Rieis et al., 2015). More importantly, for Ecker et al. (2014), headlines are not only the first impression of news articles, but can influence how users perceive the full article.

-Headlines often serve as the only accompanying text on social media posts, where content exposure is algorithmically controlled (Kalogeropoulos et al., 2018). During elections, media outlets are particularly active on platforms such as Twitter (Paniagua-Rojano et al., 2020), amplifying the significance of headline content.

Therefore, it is worth highlighting some data that may be of interest.

For the headlines of the three media outlets analysed, the most repeated words are Ayuso (PP candidate) and Vox (political party), followed by Iglesias (UP candidate) and Gabilondo or PSOE (candidate and party). The representation of Ciudadanos and Más Madrid, both political parties, is residual. Among the most used terms are those that refer to events during the campaign (debate, bullets, rally...) and very few related to government proposals (pandemic, education, health...). The use in headlines of terms related to conflict is notable: hatred, falsehoods, threats, etc.

In a comparative density analysis [Figure 1], that is, an evaluation of the relevance of a word in a media outlet relative to other media outlets, the TF*IDF index for eldiario.es is 0.4269 for the word Ayuso, implying that the term is more relevant in the headlines published by eldiario.es than in those published by the other outlets. For El País the maximum TF*IDF index is observed for the term Vox (0.2954).

Figure 1. The words with the highest TF*IDF index are mentioned the most by one media outlet and less often mentioned by the other two media outlets

Source: elaborated by the authors.

6.2 Analysis of news stories

In this study, Isabel Díaz Ayuso and Pablo Iglesias are found to personify their respective parties, PP and UP: they are mentioned in the media more often than the parties themselves. In contrast, Vox is mentioned more frequently than its candidate, Rocío Monasterio. Despite being fourth in the Assembly of the Community (fifth in the configuration of the Assembly in 2015, prior to these elections), Vox ranks first in mentions [Figure 2].

Figure 2. Comparison between the absolute number of mentions of each party and each candidate

Source: elaborated by the authors.

The application of TF*IDF enables the detection of names that appear exclusively in certain media. It is common to cite Santiago Abascal (President of Vox) to reinforce or contrast ideas or provide context for Monasterio’s statements. eldiario.es and El País, in addition to referencing Ayuso, also mention Pablo Casado (President of the party) and José María Aznar (Former President of the party and Former President of the Government). ABC does not mention him at all, resulting in a TF*IDF index of zero [Figure 3]. Additionally, when eldiario.es or El País refers to Edmundo Bal (candidate for Ciudadanos), they often mention Inés Arrimadas, his counterpart in Congress. Conversely, ABC mentions Albert Rivera (Former President of Ciudadanos, now retired from politics) more frequently than Arrimadas.

Figure 3. The TF*IDF index for ABC for party partners is zero or close to zero. There are higher densities for El País and eldiario.es. Random sample, n=20% of news stories

Source: elaborated by the authors.

By employing word2vec, a certain polarization is observed: terms with negative and violent connotations are associated with the most ideologically distant parties, namely Vox and UP. Additionally, these two parties receive the most media representation.

Vox and its lead candidate are strongly linked to terms such as uglified, denying, doubting, and death. UP and Pablo Iglesias are strongly linked to threat, doubt, abandonment, and death. Isabel Díaz Ayuso, the most mentioned candidate, is strongly linked to president, re-election and victory. For each party or candidate, none of the words most strongly linked is directly related to electoral proposals.

Based on calculating the distance between the candidates and the proposals, there is a greater relationship between all politicians and the terms taxation, tourism, employment and transparency, and a weaker relationship with education. The exception is Iglesias (UP), for whom there is a close relationship with the terms of the social proposal, but a weak relationship with the environment [Figure 4].

Figure 4. Distance between the terms of political proposals and each candidate

Source: elaborated by the authors.

As can be measured with TF*IDF, for the first week of the election campaign (from 18 to 25 April), Ayuso is already the most relevant word among the three media analysed, with an advantage in eldiario.es (0.265) over the others (0.235 in El País and 0.148 in ABC). It is followed by Vox, more relevant in El País, and Iglesias, also in El País. ABC has a higher score with Gabilondo.

In the second week, from 26 April to 2 May, Vox is more relevant (with a higher index in El País) than Ayuso, followed by PP and Iglesias. ABC has an advantage over its competitors in the use of the terms elections, electoral, vote, voting... The media outlet eldiario.es maintains that the right-wing group is the most relevant: Ayuso, PP, and Vox, followed by UP, Iglesias and Díaz. El País follows a practically identical pattern to that of eldiario.es, although with Vox in the lead.

These results, which do not indicate excessive disparity, suggest that the terms and concepts covered by the media are relatively balanced. Although certain terms appear exclusively in some media outlets and not in others (e.g., “ayusada” or “ayusismo,” which are negative terms referring to Ayuso and appear in eldiario.es and El País but not in ABC), they are not used frequently enough to be significant.

Isabel Díaz Ayuso, Pablo Iglesias, and Vox (and to a lesser extent, Rocío Monasterio) are the central figures throughout the campaign. The mentions of Mónica García and her party, Más Madrid, are insignificant compared to those of the other candidates. Furthermore, as the frequency of words with violent or negative connotations (e.g., threat, violence, fascism, hatred, disaster) increases, mentions of terms related to political proposals (e.g., education, taxes, universities, immigrants) decrease significantly.

The alignment between the candidates and the terms chosen by their respective political parties for electoral propaganda has also been analyzed. During the campaign, the PP associated Ayuso’s candidacy with freedom; the Partido Socialista (PSOE) linked Gabilondo to serious government; Más Madrid characterized García as a doctor; Ciudadanos connected Bal with the political centre; and Vox associated Monasterio with security [Figure 5]. UP did not employ key terms in its electoral campaign, except for majority, which is also interpreted as democracy.

Figure 5. Election campaign posters. From left to right and from top to bottom: PP, PSOE, Más Madrid, Vox, Ciudadanos, and UP

Source: Retrieved from the electoral programs and websites of the parties.

ABC is the outlet that most closely aligns candidates with their chosen campaign terms. This could be due, among other reasons, to a higher fidelity reproduction of the messages disseminated by the political parties themselves, rather than ABC interpreting those messages or providing its own context. Edmundo Bal, the candidate for Ciudadanos, is frequently associated with the centre. For eldiario.es, the strongest relationship is observed between Mónica García and the doctor, while there is a significant distance for Monasterio and security. El País links Bal with centre and Gabilondo with serious more strongly, with a weaker link between Monasterio and security [Figure 6].

Figure 6. Campaign term diffusion. Comparison of the terms used by the parties and their reproduction in the media

Source: elaborated by the authors.

By employing word2vec, the words with the smallest Euclidean distance are obtained for each candidate and are subjected to t-distributed Stochastic Neighbour Embedding (t-SNE), a nonlinear reduction technique that projects data in a two-dimensional plane to obtain patterns based on data similarities. In ABC news stories, among the 10 terms most related to Ayuso (that is, those most likely to appear next to her), president (0.9836), re-election (0.9447), Ángel [Gabilondo] (0.9423) and PSOE (0.9404) are notable. In eldiario.es, re-election (0.8831) and victory (0.8731). In El País news stories, there is a close distance between Ayuso and the Vice President of the Government Yolanda [Díaz] (0.9335) and the verb justified (0.9108) [Figure 7].

Figure 7. Terms related to Isabel Díaz Ayuso measured with word2vec. There is a high probability that “Ayuso” appears alongside “president” and “re-election” in ABC, and alongside “re-election” and “victory” in eldiario.es

Source: elaborated by the authors.

For Vox, the diversity in coverage is notable. In ABC news stories, the closest terms are uglified (0.998977), advertiser (0.998976) and female (0.998958). In eldiario.es, the closest terms are digitization (0.995412), retarded (0.995026) and radical (0.994319). Attitude (0.903149) and condemnation (0.89991) are related to Rocío Monasterio, the candidate for Vox. In El País news stories, deny (0.8746), doubt (0.8547), death (0.8433) and threats (0.8358) are among the 10 closest words for this candidate [Figure 8].

Figure 8. Terms related to Vox measured with word2vec. Among the words closest to Vox are terms with negative connotations such as “uglified,” “female,” “radical,” “condemnation,” “deny,” “doubt,” “death,” and “threats”

Source: elaborated by the authors.

Regarding UP and its candidate Pablo Iglesias, the closest terms in news stories published by eldiario.es and El País have a negative connotation. In eldiario.es, there is a strong link with bullets (0.8734) and abandoned (0.855). For El País, there is a strong link with riot control (0.878), death (0.86095) and exit (0.84783). In ABC news stories, although they are not part of the ten closest terms, violence and hatred have very high rates (0.938 and 0.9265) [Figure 9].

Figure 9. Terms related to UP and its candidate Pablo Iglesias measured with word2vec. They are strongly linked to “violence,” “hatred,” and “death,” among others

Source: elaborated by the authors.

For other terms, in ABC news stories, it is less likely that Isabel Díaz Ayuso is linked to corruption (0.7659) or pandemic (0.8118), in contrast to Edmundo Bal (violence: 0.9801; corruption: 0.9662; and pandemic: 0.9596). For eldiario.es, there is a significant link between violence and dictatorship (0.9692) and between violence and the far right (0.9661). For El País, there is a low probability that Mónica García and corruption (0.6497) or Mónica García and violence (0.5757) appear together; the probability is higher for other candidates, notably Bal (corruption: 0.8303; violence: 0.7707).

The dimensionality reducer UMAP (Uniform Manifold Approximation and Projection) is then applied to project the vectors onto a two-dimensional plane, under criteria different from those of t-SNE. Certain groupings between terms are observed and differentiated for each media outlet. The terms are distributed on the basis of three themes:

-Words related to the parties and their candidates, forming relatively stable groups among themselves. They correspond to the “who” of the inverted pyramid in journalism;

-Words related to ideology (right, left, fascism, hatred, lies, etc.), belonging to “how” in the inverted pyramid; and

-Words related to political proposals (taxation, companies, health, etc.), which were broadly distributed and correspond to the “what” of the pyramid.

The informative focus prioritizes the subject (who) and the ideology (how) over the “what” of the inverted pyramid. The distribution obtained with UMAP places candidates and parties close to words related to ideology and aggressive connotations and far from electoral proposals. For El País news stories, Iglesias is more distant from the other candidates and particularly close to extreme and threat [Figure 10].

Figure 10. Distribution of terms for El País obtained with UMAP. The candidates and parties are grouped. “Iglesias” and “Monasterio” are close to “extreme” and “threats”

Source: elaborated by the authors.

Some terms referring to proposals, such as pandemic, mobility, education, or social are placed closer to words that have ideological connotations (false, far-right, fascism, etc.) [Figures 10, 11].

Figure 11. Distribution of terms for eldiario.es. Parties and candidates appear grouped and close to ideological and aggressive terms and farther from government proposals. “Iglesias” is close to “threats” and “extreme”, and “Monasterio” to “victory”

Source: elaborated by the authors.
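The grouping pattern described for the projections (candidates near ideological terms and far from proposals) can also be expressed numerically. The sketch below is illustrative only: the 2-D coordinates, the English tokens and the group assignments are invented, and the function name is hypothetical. Distances in UMAP or t-SNE projections are only loosely interpretable, so a comparison of this kind is best treated as a descriptive aid rather than a test.

```python
import math
from statistics import mean

# Invented 2-D coordinates standing in for a UMAP or t-SNE projection.
# They are not the coordinates plotted in the article's figures.
TOY_PROJECTION = {
    "iglesias": (1.0, 1.0),
    "monasterio": (1.5, 0.5),
    "extreme": (1.2, 1.1),
    "threats": (1.4, 0.8),
    "taxation": (6.0, 5.0),
    "health": (5.5, 6.0),
}

# Hypothetical assignment of terms to the three themes named in the text.
GROUPS = {
    "who": ["iglesias", "monasterio"],
    "how": ["extreme", "threats"],
    "what": ["taxation", "health"],
}


def mean_group_distance(projection, group_a, group_b):
    """Average Euclidean distance over all pairs (one term from each group)."""
    return mean(
        math.dist(projection[a], projection[b])
        for a in group_a
        for b in group_b
    )


if __name__ == "__main__":
    who_how = mean_group_distance(TOY_PROJECTION, GROUPS["who"], GROUPS["how"])
    who_what = mean_group_distance(TOY_PROJECTION, GROUPS["who"], GROUPS["what"])
    print(f"who-how: {who_how:.3f}  who-what: {who_what:.3f}")
```

A smaller who-how average than who-what average would correspond to the pattern reported above; repeating the comparison per outlet would allow the three newspapers to be contrasted.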

7. Conclusions and discussion

Based on the results, the hypothesis that digital media present unequal information related to electoral campaigns cannot be affirmed. Although there are differences between how the candidates were represented by each media outlet, there are important similarities:

-Information regarding the electoral campaign focused exclusively on the six main parties.

-There was substantial coverage of the right-wing bloc. PP and Vox (and/or their candidates), the groups considered conservative, received by far the most media coverage. They were followed by the party furthest to the left, UP, while groups considered moderate received little coverage, which polarized visibility toward the extremes of the ideological spectrum.

-The treatment of the most named parties can be considered negative and even violent: in this electoral campaign, in the three newspapers studied alone, there were 225 mentions of the word hate, 191 mentions of violence, 214 mentions of death, and only 153 mentions of proposal.

-Communication regarding electoral programs was neglected − even in the context of the pandemic, which framed these elections − in favour of the events, controversies and milestones produced during the campaign and of exclusively ideological discourses.
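Mention counts such as those listed above can be obtained with a simple term-frequency pass over the corpus. The following is a minimal sketch under stated assumptions: the two sample sentences and the English tracked terms are invented, the function name is hypothetical, and matching is on exact lowercase tokens without lemmatization, which may differ from the preprocessing used in the study.

```python
import re
from collections import Counter

# Invented sample texts; the study's corpus is not reproduced here.
TOY_ARTICLES = [
    "The debate turned to hate and violence instead of any proposal.",
    "A new health proposal was overshadowed by threats and violence.",
]

# English stand-ins for the tracked terms.
TRACKED_TERMS = ["hate", "violence", "death", "proposal"]


def count_mentions(articles, terms):
    """Count exact, case-insensitive token matches of each tracked term."""
    wanted = set(terms)
    counts = Counter()
    for text in articles:
        tokens = re.findall(r"\w+", text.lower())
        counts.update(token for token in tokens if token in wanted)
    # Report every tracked term, including those never mentioned.
    return {term: counts[term] for term in terms}


if __name__ == "__main__":
    for term, n in count_mentions(TOY_ARTICLES, TRACKED_TERMS).items():
        print(f"{term}: {n}")
```

Running the same function separately on headlines and on article bodies would give the two kinds of counts that the discussion compares.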

These conclusions reinforce and update previous studies that emphasize the unequal treatment of minority parties in the Spanish press: according to Gallardo-Paúls and Enguix-Oliver (2015), minority groups receive fewer mentions, often with ad hoc comments, and their voices are usually used to talk about the main parties, not to communicate their own ideas. As Haselmayer et al. (2017) report from the studies of Levendusky (2013) and Prior (2005), unbalanced coverage reinforces disparities in the distribution of political knowledge.

The use of warlike or confrontational language is so prevalent in the media and political discourse that readers are constantly exposed to these negative stimuli, which increase or provoke social tension and entail contempt and dangerous levels of aggression (Fernández-Fernández, 2009). As mentioned above, other studies have shown that the influence of language on the individual and the media can have relevant effects on the perception of politics by citizens (Casero-Ripollés et al., 2016; Gavin, 2018). In addition, journalists often use words typical of entertainment rather than informative words in the news and headlines.

This study shows that information about female candidates frequently includes quotes or endorsements from their male party colleagues (Ayuso may be an exception), while male candidates are not always juxtaposed with their female counterparts. Quotes from male colleagues are often used, even when their relevance is less direct. Van der Pas and Aaldering (2020) had previously detected gender bias in political media stories, where women politicians received 17 percentage points less media attention. This is not new (e.g., Conroy et al., 2015; Smith, 1997) but can also be seen in this study.

As already discussed in the state of the art, the excessive coverage of campaign events and the lack of information about political differences and programs decrease the population's confidence in the press (Jones, 2004). As observed in this research, during the 4M electoral campaign, even against the backdrop of a pandemic (which increased the demand for political proposals), the media did not thoroughly examine electoral measures and political differences.

In dialogue with other similar research, the paper by Mancera-Rueda and Villar-Hernández (2020), which focused on the media analysis of the Vox party for the April 2019 general elections, collected up to 413 mentions of the party or its then candidate Santiago Abascal in the headlines of 8 media outlets. In the present case, up to 93 mentions of Vox or Monasterio were collected in headlines, across 3 media outlets and for a regional election. If the scope is widened to article bodies, the count rises to 1,589 mentions, something not measured by previous studies. In the 2019 elections, Mancera-Rueda and Villar-Hernández already reported that Vox coverage was mainly based on campaign issues and its candidate, rather than on government proposals. In addition, that research identified “numerous samples of negative expressivity related to Vox” (Mancera-Rueda and Villar-Hernández, 2020), also detected in this study. Although it is not known whether this is the intention of the political party, as was the case with Podemos in the study by Casero-Ripollés et al. (2016), the controversy surrounding its interventions has earned it greater media coverage, which translates into greater exposure to potential voters.

Sánchez-Gutiérrez (2016) pointed out, after a discourse analysis, that there was a “negative media treatment in a systematic way towards Podemos and the personalization of negative actions in Iglesias.” Although no conclusions of this strength can be drawn from the present study, it can be perceived that Iglesias personifies UP in a large number of news items, with many more mentions of the candidate than of the political group. A later study (Sánchez-Gutiérrez and Nogales-Bocio, 2018) indicated, after a critical discourse analysis, that the idea of “failure” underlies the media discourse on Podemos and Pablo Iglesias. As observed in our experiment, there is a high or medium-high probability that doubt, abandonment, violence, hatred, and death appear together with the terms referring to this party. In any case, these elections were mediatized and aggressive, a tendency that the authors cited above had already detected, and which has negative effects on the population (frustration, fatigue, distrust of, and disappointment in government). Moreover, the weight of party information often rests on a candidate and their popularity, as noted by Casero-Ripollés et al. (2016).

The main strength of this study lies in the methodology applied to the fields of journalism and discourse studies. The exploratory, rather than classificatory, approach adopted avoids oversimplification and reveals new lines and research objectives to explore in depth. Unlike other studies, here a comparison is made for each medium and for each candidate or party, making it possible to compare the resources and discourses used by some of the most important media in the country.

However, limitations exist. The sample size of media outlets that offer open news is restricted, hindering broader comparisons. Manual disambiguation of terms presents another limitation, suggesting the potential for comparison with automated tools like BERT or ELMo. In future experiments, we will consider using these tools to optimize our work.

In any event, this experiment reveals some very interesting lines of research, not only in terms of Natural Language Processing (disambiguation of terms, stability metrics, etc.) but also in the field of political media coverage. While polarization assessment was not the primary objective, the findings suggest its relevance, aligning with concerns over its impact on democracy. The bias and polarization present in media outlets must also be addressed through automated methodologies, facilitating the comparison of texts and sources to achieve more diverse information. For future research directions, the use of new NLP technologies, especially generative tools, is proposed to detect bias in the media.

Authors’ Contribution

Mar Castillo-Campos: Methodology; Investigation; Data curation; Writing-original draft. David Becerra-Alonso: Conceptualization; Methodology; Software; Validation; Supervision. David Varona-Aramburu: Conceptualization; Validation; Writing-review & editing; Supervision. All authors have read and agreed to the published version of the manuscript.

Conflicts of interest

The authors declare that they have no conflict of interest.

References

AIMC (2020). Marco general de los medios en España. Asociación para la Investigación de Medios de Comunicación. https://bit.ly/4jeNJ6h

Aljarah, Ibrahim; Habib, Maria; Hijazi, Neveen; Faris, Hossam; Qaddoura, Raneem; Hammo, Bassam; Abushariah, Mohammad; & Alfawareh, Mohammad. (2020). Intelligent detection of hate speech in Arabic social network: A Machine Learning approach. Journal of Information Science 47(4). https://doi.org/10.1177/0165551520917651

Al-Omari, Hani; Abdullah, Malak; Altiti, Ola; & Shaikh, Samira. (2019). JUSTDeep at NLP4IF 2019 Task 1: Propaganda detection using ensemble deep learning models. Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, 113-118. https://doi.org/10.18653/v1/D19-5016

Antoniades, Nicos. (2020). Political marketing communications in today’s era: Putting people at the center. Society 57(6), 646-656. https://doi.org/10.1007/s12115-020-00556-6

Arredondo-Ramírez, Pablo. (2000). Los medios de comunicación y la contienda electoral. Renglones 46(12), 68-72. http://hdl.handle.net/11117/512

Asociación de la Prensa de Madrid (2020). Informe anual de la profesión periodística. https://acortar.link/rsxJDc

Ballesteros-Herencia, Carlos A. (2020). Los marcos del compromiso: framing y engagement digital en la campaña electoral de España de 2015. Observatorio (OBS*) 14(3). https://doi.org/10.15847/obsOBS14320201507

Barandiaran, Xabier; Unceta, Alfonso; & Peña, Simon. (2020). Comunicación política en tiempos de nueva cultura política. Revista ICONO 14. Revista científica de Comunicación y Tecnologías emergentes 18(1), 256-282. https://doi.org/10.7195/ri14.v18i1.1382

Berrocal-Gonzalo, Salomé; Martín-Jiménez, Virginia; & Gil-Torres, Alicia. (2017). Líderes políticos en YouTube: información y politainment en las elecciones generales de 2016 en España. El Profesional de la Información 26(5), 937-946. https://doi.org/10.3145/epi.2017.sep.15

Berven, Arne; Christensen, Ole A; Moldeklev, Sindre; Opdahl, Andreas L.; & Villanger, Kjetil J. (2020). A knowledge-graph platform for newsrooms. Computers in Industry, 123. https://doi.org/10.1016/j.compind.2020.103321

Bright, Jonathan; Hale, Scott; Ganesh, Bharath; Bulovsky, Andrew; Margetts, Helen; & Howard, Phil. (2020). Does campaigning on social media make a difference? Evidence from candidate use of Twitter during the 2015 and 2017 UK elections. Communication Research 47(7), 988-1009. https://doi.org/10.1177/0093650219872394

Budiman, Irwan; Nugrahadi, Dodon T; Faisal, Mohammad R.; & Rusli, Muhammad. (2019). A study on effect of generated features from Word2Vec vectors for text classification. International Conference on Wetland and Multidisciplinary Research 2019, Utsunomiya, Japan.

Campos-Domínguez, Eva; & Calvo, Dafne. (2017). Electoral campaign on the Internet: Planning, impact and viralization on Twitter during the Spanish general election, 2015. Comunicación y Sociedad (29), 93-116. https://doi.org/10.32870/cys.v0i29.6423

Casero-Ripollés, Andreu; Feenstra, Ramón A.; & Tormey, Simon. (2016). Old and new media logics in an electoral campaign: The case of Podemos and the two-way street mediatization of politics. The International Journal of Press/Politics 21(3), 378-397. https://doi.org/10.1177/1940161216645340

Castells, Manuel. (2009). Comunicación y poder. Alianza Editorial.

Castillo, Toni. (2017, December 14). El 56,8 % de los lectores españoles de prensa se informa a través de redes sociales, aunque sólo lee titulares y alguna noticia. Genbeta. https://bit.ly/3H4CWi0

Chaves-Montero, Alfonso; Gadea-Aiello, Walter F.; & Aguaded-Gómez, José I. (2017). La comunicación política en las redes sociales durante la campaña electoral de 2015 en España: uso, efectividad y alcance. Perspectivas de la Comunicación 10(1), 55-83.

Conroy, Meredith; Oliver, Sarah; Breckenridge-Jackson, Ian; & Heldman, Caroline. (2015). From Ferraro to Palin: Sexism in coverage of vicepresidential candidates in old and new media. Politics, Groups, and Identities 3(4), 573-591. https://doi.org/10.1080/21565503.2015.1050412

Córdoba-Cabús, Alba. (2018). Análisis del periodismo de datos en la campaña electoral del 20D a través de las ediciones digitales de diarios generalistas. Estudios sobre el Mensaje Periodístico 24(1), 137. https://doi.org/10.5209/ESMP.59942

Deacon, David; & Stanyer, James. (2014). Mediatization: key concept or conceptual bandwagon? Media, Culture & Society 36(7), 1032-1044. https://doi.org/10.1177/0163443714542218

Denton, Robert E; Trent, Judith S.; & Friedenberg, Robert V. (2019). Political Campaign Communication: Principles and Practices. Rowman & Littlefield.

De Vreese, Claes H; Banducci, Susan A; Semetko, Holli A.; & Boomgaarden, Hajo G. (2006). The news coverage of the 2004 European Parliamentary election campaign in 25 countries. European Union Politics 7(4), 477-504. https://doi.org/10.1177/146511650606944

Diez-Gracia, Alba; Sánchez-García, Pilar; & Martín-Román, Javier. (2023). Polarización y discurso emocional de la agenda política en redes sociales: desintermediación y engagement en campaña electoral. Revista ICONO 14. Revista científica de Comunicación y Tecnologías Emergentes 21(1). https://doi.org/10.7195/ri14.v21i1.1922

Duggan, Maeve; & Smith, Aaron. (2016, October 25). The political environment on social media. Pew Research Center. https://pewrsr.ch/4muEuSG

Ecker, Ullrich K. H; Lewandowsky, Stephan; Chang, Ee P.; & Pillai, Rekha. (2014). The effects of subtle misinformation in news headlines. Journal of experimental psychology: Applied 20(4), 323–335. https://doi.org/10.1037/xap0000028

Edell, Aaron. (2018, January 11). I trained fake news detection AI with >95% accuracy, and almost went crazy. Towards Data Science. https://bit.ly/43p8eaC

Ergün, Erkan; & Karsten, Niels. (2021). Media logic in the coverage of election promises: Comparative evidence from the Netherlands and the US. Acta Política 56(1), 1-25. https://doi.org/10.1057/s41269-019-00141-8

Eurobarómetro (2021). Opinión pública en la Unión Europea: informe nacional. España. Parlamento Europeo. https://bit.ly/4mkah8J

Fawzi, Nayla. (2019). Untrustworthy news and the media as “enemy of the people?” How a populist worldview shapes recipients’ attitudes toward the media. The International Journal of Press/Politics 24(2), 146-164. https://doi.org/10.1177/1940161218811981

Fenoll, Vicente; & Rodríguez-Ballesteros, Paula. (2017). Análisis automatizado de encuadres mediáticos. Cobertura en prensa del debate 7D: el debate decisivo. El Profesional de la Información 26(4), 630-640. https://doi.org/10.3145/epi.2017.jul.07

Fernández-Fernández, Maximiliano. (2009). Lenguaje violento en los medios de comunicación españoles. In Marie-Claude Chaput & Manuelle Peloille (Coord.), Sucesos, guerras, atentados: La escritura de la violencia y sus representaciones (pp. 193-208). PILAR (Presse, Imprimés, Lecture dans l’Aire Romane).

Gallardo-Paúls, Beatriz; & Enguix-Oliver, Salvador. (2015). Opciones discursivas en la cobertura electoral: los temas de la campaña europea de 2014. In Eulalia Hernández Sánchez & María Isabel López Martínez (Coord.), Sodalicia Dona (pp. 231-252). Universidad de Murcia.

Gavin, Neil T. (2018). Media definitely do matter: Brexit, immigration, climate change and beyond. The British Journal of Politics & International Relations 20(4), 827-845. https://doi.org/10.1177/1369148118799260

Ghosh, Shreenita; Su, Min-Hin; Abhishek, Aman; Suk, Jiyoun; Tong, Chau; Kamath, Kruthika; Hills, Ornella; Correa, Teresa; Garlough, Christine; Borah, Porismita; & Shah, Dhavan. (2020). Covering #MeToo across the news spectrum: Political accusation and public events as drivers of press attention. International Journal of Press/Politics 27(1). https://doi.org/10.1177/1940161220968081

Goidel, Kirby; Davis, Nicholas T.; & Goidel, Spencer. (2021). Changes in perceptions of media bias. Research & Politics 8(1). https://doi.org/10.1177/2053168020987441

Green, Donald P.; & Gerber, Alan S. (2019). Get out the vote: How to increase voter turnout. Brookings Institution Press.

Haselmayer, Martin; Wagner, Markus; & Meyer, Thomas M. (2017). Partisan bias in message selection: Media gatekeeping of party press releases. Political Communication 34(3), 367-384. https://doi.org/10.1080/10584609.2016.1265619

Hjarvard, Stig. (2004). From bricks to bytes: The mediatization of a global toy industry. In Ib Bondebjerg & Peter Golding (Eds.), European Culture and the Media (pp. 43-63). Intellect. https://doi.org/10.2307/j.ctv36xw5qn

Holgado, María. (2017). Publicidad e información sobre elecciones en los medios de comunicación durante la campaña electoral. Teoría y Realidad Constitucional 40, 457-485. https://doi.org/10.5944/trc.40.2017.20914

Holtz-Bacha, Christina; Langer, Ana I.; & Merkle, Susanne. (2014). The personalization of politics in comparative perspective: Campaign coverage in Germany and the United Kingdom. European Journal of Communication 29(2), 153-170. https://doi.org/10.1177/0267323113516727

Høyer, Svennik; & Nossen, Hedda A. (2015). Revisions of the news paradigm: Changes in stylistic features between 1950 and 2008 in the journalism of Norway’s largest newspaper. Journalism 16(4), 536-552. https://doi.org/10.1177/1464884914524518

Hutchens, Myiah J; Hmielowski, Jay D; Pinkleton, Bruce E.; & Beam, Michael A. (2016). A spiral of skepticism? The relationship between citizens’ involvement with campaign information to their skepticism and political knowledge. Journalism & Mass Communication Quarterly 93(4), 1073-1090. https://doi.org/10.1177/1077699016654439

Jones, David A. (2004). Why Americans don’t trust the media: A preliminary analysis. The Harvard International Journal of Press/Politics 9(2), 60-75. https://doi.org/10.1177/1081180X04263461

Kalogeropoulos, Antonis; Fletcher, Richard; & Nielsen, Rasmus K. (2018). News brand attribution in distributed environments: Do people know where they get their news? New Media & Society 21(3), 583–601. https://doi.org/10.1177/1461444818801313

Kalogeropoulos, Antonis; Suiter, Jane; Udris, Linards; & Eisenegger, Mark. (2019). News media trust and news consumption: Factors related to trust in news in 35 countries. International Journal of Communication 13(22), 3672-3693. https://rb.gy/vxm36o

Köffer, Sebastian; Riehle, Dennis M; Höhenberger, Steffen; & Becker, Jörg. (2018). Discussing the value of automatic hate speech detection in online debates. Multikonferenz Wirtschaftsinformatik: Data Driven X-Turning Data in Value 2018, Leuphana, Germany.

Koloski, Boshko; Pollak, Senja; Škrlj, Blaž; & Martinc, Matej. (2021). Extending neural keyword extraction with TF-IDF tagset matching. arXiv preprint arXiv:2102.00472. https://doi.org/10.48550/arXiv.2102.00472

Levendusky, Matthew S. (2013). How partisan media polarize America. University of Chicago Press.

Li, Zeming; & Sun, Xinying. (2021). Analysis of the impact of media trust on the public’s motivation to receive future vaccinations for COVID-19 based on protection motivation theory. Vaccines 9(12), 1401. https://doi.org/10.3390/vaccines9121401

López-García, Guillermo. (2017). Comunicación política y discursos sobre el poder. El Profesional de la Información 26(4), 573-578. https://doi.org/10.3145/epi.2017.jul.01

López-Meri, Amparo. (2017). Citizen contribution to the electoral debate and media coverage on Twitter. Prisma Social (18), 1-33.

Magallón, Raúl; Nachawati, Leila; & Seoane, Francisco. (2021). Evaluación del riesgo de desinformación: el mercado de noticias online en España. Madrid: GDI Global Disinformation Index. https://hdl.handle.net/10016/33587

Maier, Jürgen; & Nai, Alessandro. (2020). Roaring candidates in the spotlight: campaign negativity, emotions, and media coverage in 107 national elections. The International Journal of Press/Politics 25(4), 576-606. https://doi.org/10.1177/1940161220919093

Mancera-Rueda, Ana; & Villar-Hernández, Paz. (2020). Análisis de las estrategias de encuadre discursivo en la cobertura electoral sobre Vox en los titulares de la prensa española. Doxa Comunicación 31, 315-340. https://doi.org/10.31921/doxacom.n31a16

Mazaira-Castro, Andrés; Rúas-Araújo, José; & Puentes-Rivera, Iván. (2019). Fact-checking en los debates electorales televisados de las elecciones generales de 2015 y 2016. Revista Latina de Comunicación Social 74, 748-766. https://doi.org/10.4185/RLCS-2019-1355

McInnes, Leland; Healy, John; & Melville, James. (2018). UMAP: Uniform Manifold Approximation and Projection for dimension reduction. arXiv preprint arXiv:1802.03426. https://doi.org/10.48550/arXiv.1802.03426

Miguel-Sáez-de-Urabain, Ainara; Fernández de Arroyabe-Olaortua, Ainhoa; & Lazkano-Arrillaga, Iñaki. (2017). La espectacularización de la información política. El caso de El País en las elecciones estadounidenses de 2016. Revista Latina de Comunicación Social 72, 1131-1147. https://doi.org/10.4185/RLCS-2017-1211

Mikolov, Tomas; Chen, Kai; Corrado, Greg; & Dean, Jeffrey. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. https://doi.org/10.48550/arXiv.1301.3781

Mustafa, Ghulam; Usman, Muhammad; Afzal, Muhammad T; Shahid, Abdul; & Koubaa, Anis. (2021). A comprehensive evaluation of metadata-based features to classify research paper’s topics. IEEE Access 9, 133500-133509. https://doi.org/10.1109/ACCESS.2021.3115148

Nelson, Candice J.; & Thurber, James A. (Eds.) (2018). Campaigns and elections American style: The changing landscape of political campaigns. Routledge. https://doi.org/10.4324/9780429468278

Newman, Nic; Fletcher, Richard; Schulz, Anne; Andı, Simge; & Nielsen, Rasmus K. (2020). Reuters Institute Digital News Report 2020. Reuters Institute for the Study of Journalism. https://acortar.link/TvW7gs

Nicholls, Tom; & Culpepper, Pepper D. (2021). Computational identification of media frames: Strengths, weaknesses, and opportunities. Political Communication 38(1-2), 159-181. https://doi.org/10.1080/10584609.2020.1812777

Orden-Cruz, Carmen; Gómez-Martínez, Raúl; & Paule-Vianez, Jessica. (2019). Sentimiento de los medios de comunicación españoles en formato digital sobre el Ibex35. Revista Internacional de Investigación en Comunicación aDResearch ESIC 19(19), 56-67.

Paniagua-Rojano, Francisco; Seoane-Pérez, Francisco; & Magallón-Rosa, Raúl. (2020). Anatomy of the electoral hoax: Political disinformation in Spain’s 2019 general election campaign. Revista CIDOB d’Afers Internacionals 124, 123-146. https://doi.org/10.24241/rcai.2020.124.1.123

Prior, Markus. (2005). News vs. entertainment: How increasing media choice widens gaps in political knowledge and turnout. American Journal of Political Science 49(3), 577-592. https://doi.org/10.1111/j.1540-5907.2005.00143.x

Reporteros Sin Fronteras (2020). Spain: Growing polarisation, lack of transparency. https://rsf.org/en/spain

Reuters Institute for the Study of Journalism (2023). Digital News Report 2023. https://acortar.link/JBHtWS

Riedel, Benjamin; Augenstein, Isabelle; Spithourakis, Georgios P.; & Riedel, Sebastian. (2017). A simple but tough-to-beat baseline for the Fake News Challenge stance detection task. arXiv preprint arXiv:1707.03264. https://doi.org/10.48550/arXiv.1707.03264

Rieis, Julio; De-Souza, Fabrício; Vaz De-Melo, Pedro; Prates, Raquel; Kwak, Haewoon; & An, Jisun. (2015). Breaking the news: First impressions matter on online news. Proceedings of the International AAAI Conference on Web and Social Media, 9(1), 357-366. https://doi.org/10.1609/icwsm.v9i1.14619

Sánchez-Gutiérrez, Bianca. (2016). La representación mediática de los partidos políticos emergentes: el caso de Podemos y Ciudadanos en Atresmedia [Doctoral dissertation, Universidad de Sevilla].

Sánchez-Gutiérrez, Bianca; & Nogales-Bocio, Antonia I. (2018). La cobertura mediática de Podemos en la prensa nativa digital neoliberal española: una aproximación al caso de OkDiario, El Español y El Independiente. Estándares e indicadores para la calidad informativa en los medios digitales, 125-146.

Smith, Glen R. (2010). Politicians and the news media: How elite attacks influence perceptions of media bias. International Journal of Press/Politics 15(3), 319-343. https://doi.org/10.1177/1940161210367430

Smith, Kevin B. (1997). When all’s fair: Signs of parity in media coverage of female candidates. Political Communication 14(1), 71-82. https://doi.org/10.1080/105846097199542

Smirnova, O. V; Denisova, G. V; Alevizaki, O. R; Ilyichenko, D. S.; & Antipova, A.S. (2021). Space-related scientific and technical information in current media discourse: Research results. Scientific and Technical Information Processing 48(2), 133-138. https://doi.org/10.3103/S0147688221020118

Terachi, Masahiro; Saga, Ryosuke; & Tsuji, Hiroshi. (2006). Trends recognition in journal papers by text mining. 2006 IEEE International Conference on Systems, Man and Cybernetics 2006 (Vol 6, pp. 4784-4789). https://doi.org/10.1109/ICSMC.2006.385062

Terkildsen, Nayda; & Damore, David F. (1999). The dynamics of racialized media coverage in congressional elections. The Journal of Politics 61(3), 680-699. https://doi.org/10.2307/2647823

Thurber, James A.; & Nelson, Candice J. (2018). Elections in a polarized America: Understanding the dynamics and the transformation of American political campaigns. In Candice J. Nelson & James A. Thurber (Eds.) Campaigns and Elections American Style (pp. 1-10). Routledge.

Van der Maaten, Laurens; & Hinton, Geoffrey. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research 9(11).

Van der Pas, Daphne J.; & Aaldering, Loes. (2020). Gender differences in political media coverage: A meta-analysis. Journal of Communication 70(1), 114-143. https://doi.org/10.1093/joc/jqz046

Van Duyn, Emily; & Collier, Jessica. (2019). Priming and fake news: The effects of elite discourse on evaluations of news media. Mass Communication and Society 22(1), 29-48. https://doi.org/10.1080/15205436.2018.1511807

Veres, Luis. (2006). La retórica del terror: sobre lenguaje, terrorismo y medios de comunicación. Ediciones de la Torre.

Vizoso, Ángel; & López-García, Xosé. (2020). Newtral y Comprobado: experiencias de fact-checking durante la campaña electoral de las Elecciones Generales en España. In Iván Puentes-Rivera, Ana Belén Fernández-Souto & Montse Vázquez-Gestal (Eds.), Debate sobre los Debates Electorales y Nuevas Formas de Comunicación Política (pp. 77-98). Cuadernos Artesanos de Comunicación.

Xu, Kuai; Wang, Feng; Wang, Haiyan; & Yang, Bo. (2018). A first step towards combating fake news over online social media. In International Conference on Wireless Algorithms, Systems, and Applications, 521-531. Springer, Cham. https://doi.org/10.1007/978-3-319-94268-1_43

Yamamoto, Mikio; & Church, Kenneth W. (2001). Using suffix arrays to compute term frequency and document frequency for all substrings in a corpus. Computational Linguistics 27(1), 1-30. https://doi.org/10.1162/089120101300346787

Zhao, Jieyu; & Chang, Kai-Wei. (2020). LOGAN: Local Group Bias Detection by Clustering. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 1968-1977. https://doi.org/10.18653/v1/2020.emnlp-main.155

Zhou, Pei; Shi, Weijia; Zhao, Jieyu; Huang, Kuan-Hao; Chen, Muhao; & Chang, Kai-Wei. (2019). Analyzing and mitigating gender bias in languages with grammatical gender and bilingual word embeddings. Annual Meeting of the Association for Computational Linguistics. ACL 2019 Montréal, QC, Canada.