Automated detection of political ideology from text: a case study of newspapers in Uruguay


ABSTRACT_

Although objective reporting is the hallmark of professional journalism, several academics have argued that the media is ideologically biased. The association of the printed press (newspapers) with political parties has long been acknowledged in Uruguay. However, the lack of studies that empirically demonstrate and measure the extent of ideological bias towards political affiliations has prevented scholars from addressing the ideological diversity of newspapers. Here, the author describes the use of natural language processing and unsupervised machine learning algorithms, in conjunction with network graph analysis, to investigate the political leaning of five newspapers that published a total of 530 news articles on two political candidates from opposing parties during the election cycle of 2019.

THE INFLUENCE OF MEDIA IN SHAPING PUBLIC OPINION_

The media (radio, television, press, and online) has a vital role in society; its main function is to communicate local, national and international events to the public. Media can certainly influence what people think about a range of national and international issues. Several factors contribute to the degree to which people are influenced by media, and these factors vary according to geographical and cultural contexts. For instance, a recent study co-authored by Hong Tien Vu and Peter Bobkowski from the University of Kansas showed that the strength of the effect of journalism on public opinion (across 16 countries on five continents) depended on the public's age, educational level, living area and political ideology, and also on national macro variables such as economic development and media freedom [1, 2]. The study found that agendas set by media showed a moderately high correlation with the issues the public considered most relevant, with countries such as South Korea, Taiwan, South Africa, Philippines, Mexico and Chile displaying statistically significant relationships between media and public agendas [2].

Media can also limit the scope of arguments and perspectives that inform public debate, and thereby the construction of not only public belief but also attitudes towards social change, as demonstrated by Happer and Philo in 2013 [3]. These authors showed that, for the issue of disability in the United Kingdom, an increase in printed media reporting that discussed the topic in unsympathetic terms led to negative public opinion on disability benefits and the persons who receive them. The authors also showed that repeated exposure to media messages related to climate change predisposed people to adjust their views and opinions to new information [3]. This means that public opinion could be selectively changed by exposure to media, a phenomenon that was recently shown to occur in the United States when university researchers intentionally intervened in 48 media outlets to activate public expression, causing citizens to discuss major issues of policy and politics as part of the ongoing collective 'national conversation' [4]. The authors detected that their media intervention shifted the composition of Twitter opinions expressed in the national conversation by 2.3% towards the ideological direction implied by the published articles, and produced an increase in public engagement and discussion on Twitter of 62.7% relative to a day's volume [4].

Overall, these works showed that in different countries media had a considerable and tangible effect in shaping public opinion on public policy in general and on political issues in particular. This raises the question of whether media effects on public opinion are intentional and, more importantly, whether press and media reflect reality or instead filter and shape it according to their biases (inherent or intentional) towards certain policies and political views.

IS MEDIA BIASED?_

Although the premise of professional journalism rests on objective reporting, several scholars have described that the media consistently displays ideological bias [5-8]. Ideology is described here as a system of ideas, beliefs and ideals which constitute the basis for political and economic theories that guide policy making [9]. Media bias is considered intentional if it results from a conscious act or choice and is sustained over time [5, 7]. In this sense, media bias is considered a systemic tendency, rather than a set of isolated incidents, that journalists and/or media owners purposely implement in order to obtain a concrete political, social and/or economic goal.

Three types of media bias have been described [5], and these are: COVERAGE, GATEKEEPING, and STATEMENT. COVERAGE bias refers to the visibility of topics and entities, such as a person/politician or country, in media coverage. GATEKEEPING bias, also termed selection bias or agenda bias, refers to which stories media outlets select or discard for reporting. STATEMENT bias, also denominated presentation bias, relates to how articles and stories choose to report on concepts.

Media bias in news content could significantly impact the political attitude of voters and thus influence the outcome of elections [3]. What then are the potential effects of biased news consumption? A likely outcome would be a reduction of political diversity and views across the population, which in turn would diminish freedom of expression and democratic values. Indeed, it has been suggested that the presence and cultivation of ideologically diverse news content would lead to healthier democracies [5]. Because of this, several countries have established laws that regulate media ownership as a means to limit the concentration of media outlets in the hands of a few individuals or groups and their associated political ideologies [10, 11]. Concentration of media ownership has been the status quo in Uruguay [12, 13], a country located in South America that is the subject of study in this work.

IS THE MEDIA IN URUGUAY IDEOLOGICALLY BIASED?_

According to Adolfo Garcé, a political scientist, the media in Uruguay was historically biased in political matters and closely associated with political parties until the 1950s, and has gradually acquired more independence in its political views ever since [14]. Following a period of dictatorship in which many media outlets were censored and closed, Uruguay re-established its democratic government in 1985, and from then on the media, in particular the printed press, adjusted its political preferences to match the new ideological climate of the re-established political parties [15]. Since 1985, the primary type of journalism practiced in Uruguay has been 'declarative' in nature, and thus mainly characterized by reporters' citations of statements made by politicians [16]. Because of this, the ideological perspective of a media outlet, especially in the printed press, could become evident by analyzing the nature and frequency of citations of statements given by political figures who are aligned with the ideology of the media outlet in question. Over the years, the author himself has noticed ideological bias towards a particular politician when reading the political section of one of the major newspapers in Uruguay. This prompted the author to engage in a research-based art project for the automatic detection of political ideology from text using machine learning and natural language processing algorithms.

OBJECTIVE_

The objective of the current work is to implement automated techniques to identify political ideology from text, mainly machine learning and natural language processing algorithms such as topic modeling and clustering. For the purpose of this study, the text consisted of 440 newspaper articles written about two major political candidates running for president of Uruguay, published by journalists from three different media outlets between June 1 and July 25 of 2019. The text also included an additional 90 news articles written on two political debates held on October 1 and November 13 of 2019. The author also included documents containing the programmatic outlines championed by each candidate and their respective political parties.

BRIEF INTRODUCTION TO THE POLITICAL CONTEXT OF URUGUAY_

Uruguay implements a presidential form of government with division of power among the executive, legislative and judiciary branches. The president, who is also the head of state, is directly elected by the people for a five-year term. The president then appoints a council of ministers, one for each administrative department [17]. The vice-president oversees the national legislature, a bicameral parliament also elected by the people for a five-year term and composed of a 31-member senate and a 99-member chamber of deputies. The last national elections were held in October (1st round) and November (2nd round) of 2014, in which Dr. Tabaré Vázquez, representing the leftist political party known as Frente Amplio, was elected president. The current presidential and parliamentary elections took place in October (1st round) and November (2nd round) of 2019. For this election cycle, there were eleven candidates running for president representing the following political parties:

Daniel Martínez > Frente Amplio (incumbent party)

Luis Lacalle Pou > Partido Nacional

Ernesto Talvi > Partido Colorado

Guido Manini Ríos > Cabildo Abierto

Gonzalo Abella > Unidad Popular

Pablo Mieres > Partido Independiente

César Vega > Partido Ecologista Radical e Intransigente

Edgardo Novick > Partido de la Gente

Gustavo Salle > Partido Verde Animalista

Daniel Goldman > Partido Digital

Rafael Fernández > Partido de los Trabajadores

The current study focused on newspaper articles written about the two major candidates running for president of Uruguay in the recently held national elections: Daniel Martínez and Luis Lacalle Pou. These politicians were leading the public opinion polls [18] and represented political parties of contrasting ideologies (left and conservative, respectively). The newspaper articles written about Daniel Martínez and Luis Lacalle Pou date from the period around the primary elections held in Uruguay on June 30 of 2019, in which these politicians were elected to represent their political parties at the national elections of October and November of 2019.

The newspapers considered for this study were the following:

Primary Elections Period (June 1 to July 25 of 2019) - 440 news articles total:

El Observador (print & digital) https://www.elobservador.com.uy

La República (print & digital) https://www.republica.com.uy

Montevideo Portal (digital only) https://www.montevideo.com.uy/index.html

Political Debates between Daniel Martínez and Luis Lacalle Pou (October 1 and November 13 of 2019) - 90 news articles total:

El Observador

La República

Montevideo Portal

La Diaria (print & digital) https://ladiaria.com.uy

La Red21 (digital only) http://www.lr21.com.uy

The main corpus, used as the dataset to train the machine learning algorithms, was derived from a total of 440 articles published by El Observador, La República, and Montevideo Portal, in order to analyze document similarities (clustering) and to predict topics (topic modeling). Another 90 news articles on the political debates between the two candidates, published by La Diaria and La Red21 in addition to El Observador, La República and Montevideo Portal, were used instead to predict the political leaning of each newspaper based on the word context for a selected number of topics.

AUTOMATED DETECTION OF POLITICAL IDEOLOGY FROM TEXT_

In this study the author focused on news articles published by newspapers, which for many people are the primary source of information and thus play a pivotal role in shaping personal and public opinion. Automatic detection of political ideology from news articles is based on the notion that journalists can modulate the reader's perception of a political topic through word choice, for instance by using words with positive or negative connotations when referring to a political candidate or party, or by varying the credibility of the source [7]. The ideological perspective of a journalist is also often expressed in the choice of topics, as journalists with opposing ideologies will choose to write on different topics and make them more salient according to their views. Nonetheless, newspapers do not explicitly express their political preferences, which makes the task of detecting political ideology in news articles somewhat difficult.

Machine learning in conjunction with natural language processing algorithms has been implemented for the automated detection of political ideology from text. For instance, Elfardy et al. identified the ideological perspective of a person by using semantic features derived from the person's written texts [19]. The use of machine learning algorithms for detection of political ideology from news articles was also explored by Kulkarni and colleagues [20]. They proposed a model (based on a Bayesian approach with stochastic attention units) that leveraged not only the text contained within news articles but also their titles and hyperlink structure (news articles would provide weblinks to other media sources with similar political ideology) as a means to rank 59 news sources based on their predicted political ideology [20]. Gentzkow and Shapiro instead constructed an index of media political bias that measured the similarity of a news outlet's language to that of a congressional Republican or Democrat, according to written text derived from the Congressional Record in 2005 [5]. Their index measured the frequency of language usage that would 'sway' readers to the left or to the right on political issues, by examining the set of all phrases used by members of Congress during 2005 and identifying those phrases that were more frequently used by Democrats or Republicans. Consequently, they indexed newspapers by the degree to which they used 'politically charged' phrases in their news articles that were reminiscent of those used in political speeches by Democrat or Republican politicians [5]. The resulting index allowed the authors to compare newspapers to one another, rather than comparing them to any given standard of 'true' or 'unbiased' journalism [5]. Iyyer et al. implemented a recursive neural network for detection of ideological bias at the sentence level [21], differing from previous approaches that were based on 'bag of words' classifiers. The authors were interested in learning representations that could distinguish political bias given labeled data, with their dataset also derived from Congressional debates during 2005 [21]. Ahmed and Xing [22] developed topic models (multi-view Latent Dirichlet Allocation) capable of recognizing the ideological bias in a given document; their approach was also capable of summarizing where the bias was manifested on a topical level, and provided readers with alternate views that would help them to remain informed from different perspectives. Lazaridou and Krestel focused on the examination of 'selection bias' in the sense of how much space a newspaper dedicates to each political party; they also examined how often politicians were mentioned and how often politicians were quoted in news articles [8].

Despite all the technical advances previously mentioned, a key question remains to be answered: how will the automated detection of ideological bias in news articles eventually contribute to unbiased journalism and a more balanced coverage of political events and social issues for news readers in Uruguay? The work discussed here at least helps towards this goal by providing the Uruguayan reader with an objective analysis of political journalism and its inherent bias, so as to promote critical thinking and protect the reader from potential manipulation by the written press.

TECHNICAL IMPLEMENTATION & RESULTS_

Corpora assembly_

The corpus was assembled by retrieving articles containing the name of the political candidate (either Luis Lacalle Pou or Daniel Martínez), typing their names in the search box on the websites of three newspapers: El Observador, Montevideo Portal and La República. News articles on both political debates were downloaded from the newspapers' websites on the day of each debate or on the following days. The Python library NLTK [24] was used to estimate the size of the corpus in terms of tokens, which is shown in Table 1.

Table 1. Structure of corpus used in this study in terms of number of news articles and document size in tokens for each newspaper and document of proposal for governance.
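As a rough illustration of how corpus size in tokens can be estimated, the sketch below counts word and punctuation tokens with a regular expression. It is only a standard-library stand-in for an NLTK tokenizer, not the code used in the study; the function names and the toy articles are hypothetical, and a real tokenizer may split some strings differently.

```python
import re

# Hypothetical stand-in for an NLTK tokenizer: each run of word characters
# and each single punctuation mark counts as one token.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


def corpus_size(articles):
    """Return the total number of tokens across a list of article strings."""
    return sum(len(tokenize(article)) for article in articles)


# Toy articles invented for illustration (not taken from the corpus).
articles = [
    "Lacalle Pou presentó su programa.",
    "Martínez habló en Montevideo, ayer.",
]
print(corpus_size(articles))
```

Applied per newspaper and per candidate collection, a count of this kind would give one token figure for each row of a table such as Table 1.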

It is important to note that, because the search for articles on each political candidate was conducted independently even within the same newspaper, news articles mentioning both candidates are shared between collections. That is, the same text file mentioning Luis Lacalle Pou and Daniel Martínez appears in both (a) El_Observador/Luis_Lacalle_Pou and (b) El_Observador/Daniel_Martínez.

The set of news articles retrieved for 'Daniel Martínez' contained more tokens than the set retrieved for 'Luis Lacalle Pou' for the El Observador and Montevideo Portal newspapers, but not for La República. During the time period of the study (June 1 to July 25 of 2019), La República had a considerably smaller journalistic output in terms of published news articles, and consequently tokens, relative to the other two newspapers.

As part of the corpus, the author also included the programmatic lines (proposals for governance) of each of the political candidates [25, 26]. The proposals for governance were used as a reference to compare word contexts of selected topics from news articles on the political debates published by the five newspapers shown in Table 1.

Examining mentions of political pre-candidates from news articles_

When news articles are assembled into a single text file according to their publication date, the location of words of interest along a temporal line can be determined. This positional information can be displayed using dispersion plots (Figures 1 and 2) and served the purpose of investigating changes in language use over the studied time period (June 1 to July 25 of 2019, with primaries held on June 30 of 2019).

Figure 1. Dispersion plots for news articles searched and downloaded using the keyword 'Luis Lacalle Pou' from El Observador (top), Montevideo Portal (middle), and La República (bottom) newspapers, respectively. Each blue mark represents an instance of a word, whereas each row represents the entire corpus composed of news articles arranged into a single document according to publication date from June 1 (left) to July 25 (right) of 2019.

Figure 2. Dispersion plots for news articles searched and downloaded using the keyword 'Daniel Martínez' from El Observador (top), Montevideo Portal (middle), and La República (bottom) newspapers, respectively. Each blue mark represents an instance of a word, whereas each row represents the entire corpus composed of news articles arranged into a single document according to publication date from June 1 (left) to July 25 (right) of 2019.
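The positional information behind a dispersion plot can be sketched in a few lines. The example below assumes the articles have already been concatenated in date order and tokenized; the function name and the toy token list are hypothetical and are not the author's code. NLTK offers a `dispersion_plot` method on its `Text` class that draws this kind of figure directly.

```python
def dispersion_offsets(tokens, targets):
    """Map each target word to the token positions where it occurs.

    Matching is case-insensitive; positions index into the date-ordered corpus.
    """
    offsets = {target.lower(): [] for target in targets}
    for position, token in enumerate(tokens):
        key = token.lower()
        if key in offsets:
            offsets[key].append(position)
    return offsets


# Toy date-ordered corpus invented for illustration.
tokens = "lacalle habla y sartori responde a lacalle".split()
offsets = dispersion_offsets(tokens, ["Lacalle", "Sartori", "Villar"])
print(offsets)
```

Each list of positions corresponds to one row of marks in the plot, so a name that only appears late in the corpus shows up as marks clustered towards the right.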

From Figures 1 and 2 it can be seen that the frequency of mentions of the leading candidates, Luis Lacalle Pou and Daniel Martínez, was higher across the entire time period relative to their competitors within the same party (Juan Sartori, Jorge Larrañaga, Enrique Antía, and Carlos Iafigliola for Partido Nacional; and Carolina Cosse, Mario Bergara, and Óscar Andrade for Frente Amplio). Luis Lacalle Pou and Daniel Martínez eventually won their primaries on June 30 of 2019, and both politicians selected running mates (candidates for Vice President) soon afterwards: Beatriz Argimón was selected by Luis Lacalle Pou, and Graciela Villar by Daniel Martínez. The appearance of mentions of Argimón and Villar is clearly seen on the dispersion plots, especially for Villar, whose selection as running mate of Martínez was considered controversial because of her status as an outsider to the mainstream political system, in addition to the controversy surrounding her academic degree and title credentials [27, 28].

The percentage of mentions of each politician relative to the size of each newspaper's corpus of news articles is shown in Figure 3. The frequency of politicians' mentions in news articles corresponded well with the poll results after the election for each party [29].

Figure 3. Percentage of politicians' mentions relative to corpus size for each newspaper. Because in many news articles Luis Lacalle Pou is referred to as 'Lacalle' or 'Luis Lacalle', the author included the percentage of independent mentions for both last names, 'Lacalle' and 'Pou'. Cross-referencing of mentions of political rivals was also included: for instance, the percentage of mentions of Daniel Martínez in news articles searched and downloaded with Luis Lacalle Pou as keyword for each newspaper, and the percentage of mentions of Lacalle and Pou in news articles searched and downloaded with Daniel Martínez as keyword for each newspaper.
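A percentage of this kind can be computed as the share of corpus tokens that match a surname. The sketch below is a minimal illustration under that assumption; the function name and the toy tokens are hypothetical, and the exact normalization used for Figure 3 is not stated in the text.

```python
def mention_percentage(tokens, name):
    """Percentage of tokens equal to `name`, ignoring case."""
    if not tokens:
        return 0.0
    target = name.lower()
    hits = sum(1 for token in tokens if token.lower() == target)
    return 100.0 * hits / len(tokens)


# Toy tokens invented for illustration.
tokens = ["Lacalle", "dijo", "que", "Lacalle", "Pou", "gana", "hoy", "aquí"]
print(mention_percentage(tokens, "lacalle"))
print(mention_percentage(tokens, "pou"))
```

Counting 'Lacalle' and 'Pou' as separate single-token names mirrors the caption's choice of reporting both surnames independently.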

The results shown in Figure 3 revealed that for La República the nomination of Graciela Villar as Frente Amplio's candidate for vice-president was very relevant. This differed from Montevideo Portal and El Observador, whose emphasis was on Carolina Cosse, who came second in votes after Martínez but was not chosen to run on the ticket as vice-president alongside him. On the other side, the chosen running mate of Luis Lacalle Pou, Beatriz Argimón, did not generate media buzz: her mentions were proportionally fewer than those of other candidates such as Sartori, a newcomer who obtained second place in the primaries.

Identification of the predominant political party in news articles_

In order to investigate the role of Partido Nacional and Frente Amplio in the content of news articles from El Observador, Montevideo Portal, and La República newspapers, the author created a frequency-based score of Partido Nacional and Frente Amplio based on certain words that reflected each party:

Partido Nacional >

'lacalle', 'larrañaga', 'antía', 'sartori', 'nacionalista', 'blanco', 'pn', 'oposición', 'argimón'

Frente Amplio >

'martínez', 'cosse', 'bergara', 'andrade', 'frente', 'frentista', 'frenteamplista', 'fa', 'oficialismo', 'villar'

The author then analyzed the presence of these words in each sentence of the news articles of each newspaper in order to classify sentences into four categories: (a) sentences assigned to Partido Nacional; (b) sentences assigned to Frente Amplio; (c) sentences that contained word descriptors for both political parties; and (d) sentences without any reference to a political party's word descriptors, and thus of unknown assignment. The results are shown in Figure 4 and indicate that the biggest percentage of sentences associated with a political party was found in news articles from La República, suggesting possible ideological bias, especially since sentences associated with Partido Nacional were completely absent from its news articles searched and downloaded using the keyword Daniel Martínez. The author also looked at cross mentions, that is, sentences assigned to Frente Amplio in news articles searched and downloaded using Luis Lacalle Pou as keyword, and sentences assigned to Partido Nacional in news articles searched and downloaded using Daniel Martínez as keyword. Cross mentioning measures the balance in political writing, since no politician is consistently reported in isolation from his political context and thus from his political rivals. For instance, cross mentions of Frente Amplio in news articles searched and downloaded using Luis Lacalle Pou as keyword accounted for slightly over 9% of sentences, whereas cross mentions of Partido Nacional in news articles searched and downloaded using Daniel Martínez as keyword accounted for 5.7% and 6.0% of sentences, with La República not having any mention at all of Partido Nacional in articles searched and downloaded using Daniel Martínez as keyword. This phenomenon was also evident when the assignment of sentences to political party affiliation was calculated for each day during the period of study (Figure 5).
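The four-way classification described above can be sketched as follows. This is a simplified illustration rather than the author's code: it uses the word lists given earlier, treats the mere presence of a descriptor as enough to assign a sentence (the study describes a frequency-based score), and the function names and toy sentences are hypothetical.

```python
import re
from collections import Counter

# Party descriptors as listed in the text.
PN_WORDS = {"lacalle", "larrañaga", "antía", "sartori", "nacionalista",
            "blanco", "pn", "oposición", "argimón"}
FA_WORDS = {"martínez", "cosse", "bergara", "andrade", "frente", "frentista",
            "frenteamplista", "fa", "oficialismo", "villar"}


def classify_sentence(sentence):
    """Assign a sentence to one of the four categories by descriptor presence."""
    words = set(re.findall(r"\w+", sentence.lower()))
    has_pn = bool(words & PN_WORDS)
    has_fa = bool(words & FA_WORDS)
    if has_pn and has_fa:
        return "both"
    if has_pn:
        return "Partido Nacional"
    if has_fa:
        return "Frente Amplio"
    return "unassigned"


def party_shares(sentences):
    """Percentage of sentences falling in each category."""
    if not sentences:
        return {}
    counts = Counter(classify_sentence(s) for s in sentences)
    return {label: 100.0 * n / len(sentences) for label, n in counts.items()}


# Toy sentences invented for illustration.
sentences = [
    "Sartori critica al oficialismo",
    "Cosse y Bergara cierran la campaña",
    "Sartori viaja al interior",
    "Llueve en la capital",
]
print(party_shares(sentences))
```

A cross mention is then simply a sentence labeled 'Frente Amplio' in a collection retrieved with 'Luis Lacalle Pou' as keyword, or labeled 'Partido Nacional' in a collection retrieved with 'Daniel Martínez'.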

Figure 4. Percentage of sentences (relative to total sentences in text file) assigned to political affiliation (Partido Nacional, Frente Amplio or both) from news articles searched and downloaded from El Observador, Montevideo Portal, and La República, respectively. Unassigned sentences are shown in yellow. Results are based on combined news articles into a single file during the time period of study (six files in total: two files (Lacalle Pou / Martínez) for each newspaper).

Figure 5. Daily percentage of sentences assigned to political affiliation (Partido Nacional, Frente Amplio or both) from news articles searched and downloaded from El Observador, Montevideo Portal, and La República respectively. News articles published during the same day for each newspaper were combined into a single text file and analyzed in conjunction. Unassigned sentences are not shown. Primaries were held on June 30 of 2019.

From Figure 5 it can be seen that, across the three newspapers, the number of news days in which cross mentions of Frente Amplio exceeded 10% of sentences (articles searched and downloaded using Luis Lacalle Pou as keyword) was greater than the number of news days in which cross mentions of Partido Nacional exceeded 10% of sentences (articles searched and downloaded using Daniel Martínez as keyword). This can be explained by the combination of two processes. First, Partido Nacional is the leading opposition party to Frente Amplio and the current government, so its media appearances contain a rhetoric of criticism towards Frente Amplio and thus many more mentions of it, which get published by newspapers because the primary type of journalism in Uruguay is 'declarative' in nature. Second, Frente Amplio as the governing party does not necessarily need to mention Partido Nacional when campaigning, as it needs to convince the electorate to keep it in power for the next cycle. A similar phenomenon of more frequent media mentions of the ruling party was also observed in England [8].
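The comparison of news days against the 10% threshold can be expressed compactly. The sketch below assumes each sentence has already been labeled with its publication day and party category; the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def daily_share(labelled_sentences, label):
    """Return {day: percentage of that day's sentences carrying `label`}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, sentence_label in labelled_sentences:
        totals[day] += 1
        if sentence_label == label:
            hits[day] += 1
    return {day: 100.0 * hits[day] / totals[day] for day in totals}


def days_above(shares, threshold=10.0):
    """Days on which the share strictly exceeds the threshold, in sorted order."""
    return sorted(day for day, pct in shares.items() if pct > threshold)


# Toy labeled sentences invented for illustration: (day, category).
labelled = [
    ("2019-06-29", "Frente Amplio"),
    ("2019-06-29", "unassigned"),
    ("2019-06-30", "unassigned"),
    ("2019-06-30", "Partido Nacional"),
]
shares = daily_share(labelled, "Frente Amplio")
print(days_above(shares))
```

Running the same count for each keyword collection and comparing the lengths of the resulting lists gives the kind of day-level comparison discussed above.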

The author also studied the assignment of sentences to political affiliation in news articles reporting on the televised political debates between Luis Lacalle Pou and Daniel Martínez held on the nights of October 1 and November 13 of 2019 (Figure 6). The results showed that sentences assigned to both political parties (Partido Nacional and Frente Amplio mentioned in the same sentence) were predominant in news articles published by El Observador, Montevideo Portal, and La República, whereas sentences assigned to Frente Amplio alone were the predominant ones in La Diaria and La Red21. The journalistic reporting from La República on the political debates was clearly different from its reporting of the political campaign during the time period studied (compare Figures 4, 5 and 6). There was a clear drop in sentences containing mentions of both political parties in La Diaria and La Red21 relative to the other three media outlets. Similar to the previous results shown in Figures 4 and 5, Frente Amplio was the most mentioned political party when reporting on the debates.

Figure 6. Percentage of sentences assigned to political affiliation (Partido Nacional, Frente Amplio or both) from news articles searched and downloaded on the political debates between Luis Lacalle Pou and Daniel Martínez held on October 1 and November 13 of 2019. News articles for each newspaper were combined into a single text file (5 text files, one for each newspaper) and sentences parsed according to frequency counts for party-related words in them. Words considered as party-related were as follows: Partido Nacional > 'lacalle', 'larrañaga', 'antía', 'sartori', 'blanco', 'blancos', 'nacionalista', 'pn', 'oposición', 'argimón', 'abdala', 'delgado', 'alternancia', 'posadas'; Frente Amplio > 'martínez', 'cosse', 'bergara', 'andrade', 'frente', 'frentista', 'frenteamplista', 'fa', 'oficialismo', 'oficialista', 'villar', 'miranda', 'mujica', 'astori', 'vázquez', 'bonomi', 'leal', 'sendic', 'socialista'. The number of news articles sampled from each newspaper was: El Observador (30), Montevideo Portal (15), La República (26), La Diaria (14), La Red21 (5).

Abstracting the content (core topics) from news articles_

Unsupervised machine learning techniques such as text clustering and topic modeling helped the author to automate the elucidation of content within news articles and thus assess their relationship based on similarity in word usage [30, 31, 32]. News articles that were similar to each other could be grouped (clustered) together, and the resulting clusters generally portrayed the overarching topics, themes and/or patterns that related to the newspaper's vision of the political scene it was reporting on at the time. When topics were discrete and well discerned, clusters did not overlap. On the contrary, when topics were fuzzy, documents were hard to distinguish and clusters overlapped. Clustering of news articles per newspaper was used to determine the number of clusters, and thus the number of topics, discussed during the time period of study (Figure 7 and Figure 8). The Python code used is available upon request.

Figure 7. Results from k-Means clustering algorithm (MiniBatchKMeans implementation) for news articles searched and downloaded using 'Luis Lacalle Pou' as keyword. Left column shows clustering of 102 news articles retrieved from 'Montevideo Portal' newspaper. Center column shows clustering of 88 news articles retrieved from 'El Observador' newspaper. Right column shows clustering of 30 news articles retrieved from 'La República' newspaper. First two rows of graphs display dimensionality reductions of vector space into 2D scatter plots as 'Principal Component Analysis' (PCA) (first row) and 't-distributed stochastic neighbor embedding visualization' (t-SNE) (second row). Silhouette Scores (third row) and Elbow Curves (fourth row) are shown as a means to evaluate the selection of the optimal number of clusters (k). The Python machine learning visualization library Yellowbrick [33] was used to generate t-SNE, Silhouette Scores and Elbow Curves.

Figure 8. Results from k-Means clustering algorithm (MiniBatchKMeans implementation) for news articles searched and downloaded using 'Daniel Martínez' as keyword. Left column shows clustering of 121 news articles retrieved from 'Montevideo Portal' newspaper. Center column shows clustering of 94 news articles retrieved from 'El Observador' newspaper. Because only 5 news articles were retrieved from 'La República' newspaper, no clustering study was implemented and instead, two clusters (and thus two topics) were arbitrarily assigned. First two rows of graphs display dimensionality reductions of vector space into 2D scatter plots as 'Principal Component Analysis' (PCA) (first row) and 't-distributed stochastic neighbor embedding visualization' (t-SNE) (second row). Silhouette Scores (third row) and Elbow Curves (fourth row) are shown as a means to evaluate the selection of the optimal number of clusters (k). The Python machine learning visualization library Yellowbrick [33] was used to generate t-SNE, Silhouette Scores and Elbow Curves.
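To make the role of the silhouette score concrete, the sketch below computes the mean silhouette coefficient for a given cluster assignment using only the standard library. It illustrates the scoring step alone and is not the author's pipeline, which according to the captions relied on MiniBatchKMeans and Yellowbrick; the function names, the Euclidean distance and the toy points are assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient for points under a cluster labeling."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append((b - a) / denominator if denominator > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy 2D document vectors invented for illustration: two tight, distant groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labeling that matches the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labeling that splits each group
print(good > bad)
```

Scores close to 1 indicate compact, well-separated clusters, so comparing the score across candidate values of k is one way to choose the number of clusters, alongside the elbow curve.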

Figures 7 and 8 show that news articles published by 'Montevideo Portal' were more diverse in content (6 clusters for both Luis Lacalle Pou and Daniel Martínez) than articles published by 'El Observador' (5 and 4 clusters for Luis Lacalle Pou and Daniel Martínez, respectively). Articles published by 'La República' were fewer in number (30 articles retrieved with the keyword 'Luis Lacalle Pou' and only 5 with 'Daniel Martínez') and, in the case of 'Luis Lacalle Pou', much more diverse in content, which resulted in a high number of clusters (11). Following this text clustering analysis, the author implemented topic modeling as a means to identify the main topics discussed by each newspaper. For each topic, the 100 most frequently mentioned terms were retrieved, and the ones the author deemed most interesting are shown. Notice that the term 'Luis Lacalle Pou' does not occur within any of the topics. This is because the text clustering and topic modeling algorithms discard any term that appears in more than 85% of the news articles (terms that are too common), and every article contains the term 'Luis Lacalle Pou' because it was the keyword used to retrieve and download them. The same holds for 'Daniel Martínez'. The algorithms also discarded terms that occurred in less than 5% of the articles. The topics obtained are listed below, and their relationships, based on the terms they share, are shown in Figure 9 and Figure 10:

El Observador > Luis Lacalle Pou > 5 Topics (OLLPT1-T5)

Topic_1 (OLLPT1)

Sartori - Larrañaga - nacional - empresario - Argimón - directorio - política - precandidato - campaña - interna - Antía - denuncia - alianza - acto - precandidatos - nacionalista - senado - presidenta - fórmula - intendente - sector - diputado - Costa - líder - blancos - dirigentes - candidato - falsas - Moreira - Herrera - Abdala - vicepresidente - senador - Wilson - blanco - vivir - mujer - político - blanca - embargo - país

Topic_2 (OLLPT2)

Talvi - Martínez - Colorado - Cosse - voto - encuestas - encuesta - Sanguinetti - Radar - intención - Opción - Andrade - interna - elecciones - internas - Equipos - datos - votar - resultados - Factum - Larrañaga - encuestadoras - opinión - Bergara - Cifra - canal - números - margen - error - nacional - consultora - partidos - votos - candidatos - votantes - indecisos - pública - electoral - puntos - cabildo - consultores - informe - Sartori - liderazgo - electorado - precandidatos - expresidente - ventaja - votación - políticos - economista

Topic_3 (OLLPT3)

programa - propuestas - Silveira - gobierno - ley - seguridad - propuesta - educación - nacional - gasto - equipo - candidato - país - social - técnicos - sectores - estado - trabajo - políticas - ahorrar - reforma - blanco - estatal - internacionales - sociales - desarrollo - Iafigliola - públicas - déficit - medicamentos - inversión - convención - laborales - sector - negociaciones - descentralización - Equipos - senador - sistema - mayoría - acuerdo - parlamento - aborto - economista

Topic_4 (OLLPT4)

gobierno - proceso - precandidato - partidos - rumbo - país - campaña - gira - acto - internas - multicolor - gobernar - presidente - etapa - nacionalista - militancia - social - elecciones - militantes - sector - actos - república - discursos - Vázquez - candidatos

Topic_5 (OLLPT5)

Frente - Amplio - Talvi - Martínez - Villar - fórmula - izquierda - candidato - Manini - votos - oposición - colorados - centro - coalición - gobierno - elección - presidencial - política - Astori - blancos - debate - candidatos - derecha - victoria - colorado - opositor - líder - Batlle - elecciones - campaña - votación - oficialismo - interna - Cabildo - votantes - economía - historia
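The relationship between topics "according to sharing terms" can be illustrated with a small sketch. The article does not detail how its topic graphs were built, so the approach below (one edge per pair of topics that share at least one term) is an assumption, and the helper name shared_term_edges is hypothetical. The term sets are abbreviated subsets of three of the El Observador topics listed above, not the full lists.

```python
from itertools import combinations

# Abbreviated term sets taken from three of the El Observador topics above;
# the full topic lists are considerably longer.
TOPIC_TERMS = {
    "OLLPT1": {"Sartori", "Larrañaga", "interna", "precandidatos", "denuncia"},
    "OLLPT2": {"Talvi", "Martínez", "Sartori", "Larrañaga", "interna",
               "precandidatos", "encuestas"},
    "OLLPT5": {"Talvi", "Martínez", "interna", "coalición", "izquierda"},
}


def shared_term_edges(topic_terms):
    """Map each pair of topics to the sorted list of terms they share."""
    edges = {}
    for a, b in combinations(sorted(topic_terms), 2):
        common = topic_terms[a] & topic_terms[b]
        if common:  # only connect topics with at least one shared term
            edges[(a, b)] = sorted(common)
    return edges


if __name__ == "__main__":
    for (a, b), terms in shared_term_edges(TOPIC_TERMS).items():
        # The number of shared terms could serve as an edge weight.
        print(a, b, len(terms), terms)
```

Each resulting pair could then be passed to a graph library as a weighted edge, with topics as nodes.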

Montevideo Portal > Luis Lacalle Pou > 6 Topics (MPLLPT1-T6)

Topic_1 (MPLLPT1)

partido - candidato - fórmula - interna - Argimón - nacional - Sartori - elección - reforma - vivir - Martínez - programa - campaña - propuestas - política - seguridad - mujer - colorado - candidatos - directorio - Talvi - Larrañaga - ideas - educación - propuesta - miedo - vicepresidente - precandidato - acuerdo - diálogo - nacionalista - condiciones - vicepresidencia

Topic_2 (MPLLPT2)

encuesta - voto - Sartori - Antía - Larrañaga - Radar - partido - interna - nacional - votantes - blanca - consultora - candidato - Opción - votar - elecciones - internas - amplio - frente - colorado - encuestas - votará - crecimiento - lidera - competencia - Manini - Montevideo - precandidato - votos - vtv - elección

Topic_3 (MPLLPT3)

Talvi - Martínez - votos - Andrade - Cosse - partido - frente - amplio - escrutinio - Sanguinetti - colorado - Bergara - corte - Amorín - Batlle - resultados - internas - electoral - abierto - cabildo - precandidatos - encuestadoras - Larrañaga - partidos - nacional - datos - Equipos - Antía - Sartori - votantes - Manini - elecciones - frenteamplista - candidatos - Iafigliola - votará - interna - debatir

Topic_4 (MPLLPT4)

información - ministerio - asesor - social - sociales - trabajadores - declaraciones - desarrollo - programa - educación - políticas - gente - pública - educativo - entrevista - trabajo - público - propuestas - datos - gobierno - Montevideo - denuncia - seguridad - ley - propuesta

Topic_5 (MPLLPT5)

lista - falsas - noticias - Sartori - denuncia - campaña - redes - equipo - periodista - precandidatos - programa - precandidato - denuncias - sociales - democracia - políticos - listas - dirigentes - Montevideo - radio - ciudad - electoral - nacional - blanco - tema - compañeros

Topic_6 (MPLLPT6)

gobierno - país - uruguayos - presidente - precandidato - nacionalista - senador - líder - campaña - pueblo - ministro - política - gobernar - sector - uruguayo - oposición - blanco - partido - votar - Venezuela - acto - discurso - economía - estado - justicia - nacional - república

La República > Luis Lacalle Pou > 11 Topics (RLLPT1-T11)

Topic_1 (RLLPT1)

interna - redes - gobierno - resultado - blanca - democracia - elección - denuncia - Sartori - senador - precandidato - fórmula - precandidatos - programa - libertad - uruguay - escenario - periodista - electoral - sociales - acuerdo - relación - historia - directorio

Topic_2 (RLLPT2)

Opción - elecciones - Radar - nacionalistas - fa - electorado - encuesta - candidato - Talvi - colorados - consultores - fidelidad - Sanguinetti - frenteamplista - internas - Larrañaga - Sartori - nacionales - colorado - Martínez - ampli