
Sentiment Dynamics in Social Media News Channels

2018, Online Social Networks and Media


Nagendra Kumar (a), Rakshita Nagalla (b), Tanya Marwah (c), Manish Singh (a)

(a) Indian Institute of Technology Hyderabad, India
(b) Columbia University, New York, USA
(c) Carnegie Mellon University, Pittsburgh, USA

arXiv:1908.08147v1 [cs.SI] 21 Aug 2019

Abstract

Social media is currently one of the most important means of news communication. Since people consume a large fraction of their daily news through social media, traditional news channels use social media to catch the attention of users. Each news channel has its own strategy to attract more users. In this paper, we analyze how news channels use sentiment to garner users’ attention in social media. We compare the sentiment of news posts generated by television, radio and print media to show the differences in the news covered by these channels. We also analyze users’ reactions and the sentiment of users’ opinions on news posts with different sentiments. We perform our analysis on a dataset extracted from the Facebook Pages of five popular news channels. Our dataset contains 0.15 million news posts and 1.13 billion user reactions. Our results show that the sentiment of user opinion strongly correlates with the sentiment of news posts and the type of information source. Our study also illustrates the differences between the social media news channels of different types of news sources.

Keywords: Social Media Analysis; Sentiment Analysis; Data Mining; Opinion Analysis; Data Characterization

1. Introduction

Television, radio and print media are the three primary types of news sources in the world. Due to the unique nature of each communication medium and the manner in which their audiences consume information, they show a significant difference in their news articles. Despite these differences, all these sources are commonly accused of using exaggerated headlines to garner attention [1] and of focusing on negative news [2].
Today, social media has emerged as a powerful platform for news consumption, with 62% of U.S. adults reportedly getting their news from social media [3]. Therefore, traditional news channels have started generating and disseminating news through various social media platforms. As a growing number of people consume, share and discuss news online, it is important to understand whether the lack of regulation inherent in social media is being exploited to spread more aggressive and negative news. Surprisingly, we found that news is not necessarily negative across all the news channels. Rather, there is considerable variation in the way news is presented by different news channels, and this variation depends heavily on the medium through which these channels traditionally disseminated news, namely radio, TV or print media. Print and radio based channels post more positive news, while TV based channels post more negative news. The difference arises not only from the type of stories covered but also from how the same news is presented by different types of channels. For instance, Table 1 shows how the Dakota Access Pipeline protest, a major movement in the Northern United States to protect natural resources and spiritual sites, was reported by CNN and The Economist: the two posts are extreme in tone but opposite in polarity (we measure polarity on a scale of -5 to +5), despite being posted at the same time and referring to the same incident.

| Organization | News | Polarity |
|---|---|---|
| CNN | We’re at the Standing Rock Sioux Camp in North Dakota. Protesters here are fighting to block the Dakota Access Pipeline and have vowed to stand their ground - despite growing calls for them to leave camp and threats of prosecution from law enforcement. Any questions for CNN’s Sara Sidner? | -5 |
| The Economist | Whatever the final result of the huge, long-running protests by native Americans against the Dakota Access Pipeline, the demonstrations will surely be remembered as a landmark in relations between organised religion, Christianity in particular, and indigenous people | +3 |

Table 1: Polarity of a news item generated by two different types of channels

Email addresses: cs14resch11005@iith.ac.in (Nagendra Kumar), rn2439@columbia.edu (Rakshita Nagalla), tmarwah@andrew.cmu.edu (Tanya Marwah), msingh@iith.ac.in (Manish Singh)
Preprint submitted to Online Social Networks and Media, August 23, 2019

In contrast to the one-to-many communication structure of traditional media, social media enables many-to-many communication by allowing users to engage with news articles by liking, sharing and commenting on them. The numbers of likes, shares, and comments received by a news item are good objective measures of engagement and provide insight into what type of news interests users. We use this information to understand to what extent the sentiment policy employed by news media has been successful in catching users’ attention. We show that negative news receives more comments and shares than positive news, whereas positive news receives more likes than negative news. This finding is interesting because it agrees with the popular ‘negativity bias’ theory (a theory in social psychology stating that humans are more likely to focus on bad news [4]) for actions that require greater involvement, such as sharing and commenting, but not for relatively simpler actions such as liking a news article.

Comments allow users to express their opinion regarding a news item. These opinions can be used for opinion mining to gather information on how users perceive the news, predict real-world outcomes, gain useful insight into users’ collective behavior, etc. These mining tasks often involve aggregating users’ opinions from different news channels, which may bias the result because users’ opinion on a topic depends on many factors [5, 6], such as the region where the news is published, the time when it is published, the type of information source that published the news, the sentiment with which the news is written, etc. In this paper, we analyze how users’ opinions depend on two of these factors: the sentiment of the news article and the type of news channel. We obtain interesting insights that can be used to correct the bias arising from the wavering nature of users’ comments. To the best of our knowledge, very few studies [7, 8] have been devoted to the sentiment analysis of news articles, and none has studied the role of information sources coupled with the sentiment of news in users’ perception of the news. To gain more insight into the factors affecting the sentiment of news, we also categorize the news based on topic, time, and significance.

Some of our major findings show that the sentiment generated by social media channels of different types of information sources differs and that, most of the time, these news channels generate either positive or negative news. Surprisingly, print and radio based channels generate predominantly positive news on their social media pages, contradicting the popular opinion [9, 10] that news sites mostly post negative news to take advantage of the negativity bias. We also found that negative news was shared and commented on more often, but positive news was liked more often, shedding light on how the negativity bias operates at different levels of user engagement. Additionally, we show that the polarity of the comments is strongly related to the sentiment polarity of the news. As news becomes more negative, its comments also become more negative in tone, and vice versa.
News from TV based news channels prompts more negative reactions from users than news from print and radio based channels, suggesting that people react not only to the type of news article but also to the source of the news article.

Our key contributions are as follows:

• We analyze the sentiment of posts created by news channels on their social media pages.
• We investigate users’ reactions to news posts of varying sentiment from different types of news channels.
• We categorize the news posts into different topics to gain insight into the sentiment of news posts created under these topics.
• We compare the sentiment of big news headlines with that of niche news to investigate how big headlines impact the sentiment of news posts.
• We explicate the relationship between the sentiment polarity of news articles and the polarity of textual reactions in conjunction with the type of information source.
• We perform temporal analysis to investigate the posting behavior of news articles having different sentiments over a period of time.
• We perform our experiments on a large dataset containing 1.13 billion user reactions and 0.15 million news articles.

The rest of the paper is organized as follows. In Section 2, we briefly survey the related work. Section 3 presents our methodologies. Section 4 analyzes the polarity of different types of social media news channels. Section 5 describes the relationship between the popularity and polarity of news articles. In Section 6, we analyze users’ opinions on different types of channels with varying post sentiments. We present temporal analysis in Section 7 and conclude our work in Section 8.

2. Related Work

Online news content analysis is an active research topic in Computer Science. One of the major sub-topics is predicting the popularity of news content [11, 12, 13, 14, 15, 16]. Lerman et al. [14] estimated the overall popularity of content based on early users’ reactions to it. However, Bandari et al.
[12] predicted the popularity of content without using early users’ reactions. They considered multiple features of a news post to predict its popularity prior to its posting, such as the subjectivity of the news, the category of the news, etc. In this paper, we study the relationship between news post sentiment and news post popularity across different types of news channels. Unlike popularity prediction studies, we analyze the sentiment of news articles and its effect on popularity and users’ opinions.

Several approaches [17, 18, 19, 20, 21] have been proposed to analyze news propagation. Naveed et al. [20] showed that negative news is more attractive to users and easily catches their attention. They used 15 different set of content-based features and predicted the likelihood of a tweet being retweeted using logistic regression. Wu et al. [21] showed that the lifetime of negative news is very short, whereas positive news stays for a longer time. They predicted the decay of social media content using a classification technique, namely the support vector machine. Coscia et al. [18] studied the content of news and showed that news content should be significantly different from already existing content, as unique content spreads faster than average or existing content. However, we show that it is not only negative news that catches user attention; positive news can also garner a lot of user attention. The popularity gained by both positive and negative news is usually higher than that of neutral news.

Recently, researchers [22] at UFMG developed a tool that presents news to users based on their interest or polarity. They ranked news articles based on their popularity and sentiment score. Reis et al. [23] analyzed the news headlines from a popular global news channel. They showed that the sentiment of a headline correlates with the popularity of the news and that negative comments are posted independently of the sentiment score of the headlines.
Diakopoulos and Naaman [24] analyzed the relationships between news comment topicality, temporality, sentiment, and quality. They showed that comment sentiments (positive and negative) are important indicators of discourse quality. Our analysis is complementary to existing studies: it shows that the polarity of a comment is not completely independent of the polarity of the actual post; it is a function of the polarity of the news post.

Further, a work published at Social News on the Web (http://snow2013.isti.cnr.it/?cat=4) dealt with the problem of finding the topics of a news channel and users’ interests in social media [25]. The authors performed an analysis on The New York Times channel and showed that the news stories selected by editors for their newspaper or channel are far from those catching users’ attention on social media. In this paper, however, we perform sentiment analysis on multiple social media news channels and show the top topics of editor interest as well as the sentiment associated with each topic. Unlike the finding reported by Zubiaga [25] that news channels mainly report hard news with top priority, we have found that news channels uniformly generate hard news (e.g., politics, money, world) as well as soft news (e.g., entertainment, lifestyle, sports, science and technology) through their social media pages.

Another study by Nies et al. [26] showed the impact of an actual news article on social media. They compared the relevancy of an arbitrary news article with social media news to measure its impact. However, we analyze the impact of social media news content with varying sentiment on social media users. We show how news with different sentiments shapes users’ opinion about the news content. Compared to existing works, our focus is on analyzing the sentiments of news posts in social media and users’ reactions to them. We examine the correlation between the polarity of comments and the sentiment polarity of posts.
Unlike previous studies [27, 28, 29, 30, 31], our analysis shows that the sentiment of a comment is a function of the news post sentiment and the type of channel through which news was traditionally disseminated.

3. Methodology

In this section, we first describe the process of collecting the news from the Facebook pages of news channels. We then describe the method employed to measure sentiment polarity. Next, we present the method used to categorize the news posts. In the rest of the paper, we use terms such as ‘post’, ‘news post’, ‘news content’, and ‘news article’ interchangeably.

3.1. News Posts Collection

In order to characterize the news posted on social media, we collected news from the Facebook pages of five major news media channels. Our choice of Facebook over other social networking platforms is informed by research from the Pew Research Center, which notes that Facebook has the highest reach, with 44% of adults in the US getting news on the platform [3]. Further, we chose news sites with the highest valuation as calculated by Virtue’s Social Page Evaluator (http://www.adamsherk.com/social-media/most-valuable-news-site-facebook-pages/). In order to understand the differences between media, we chose two television based channels, two print media based channels, and one radio based channel. The dataset thus includes posts from the Facebook pages of CNN and Fox News, which are television news channels; The Economist and The New York Times (or NYT), which are weekly and daily newspaper organizations, respectively; and NPR, which is a public radio network.

We extract the dataset (which will be made available for download from the author’s website) from Facebook pages using the Facebook Graph API [32]. The dataset contains news articles posted by the pages, reactions on the posts, links to the original news articles, and attributes including the number of users who liked the page, organization name, post creation time, reaction time, etc. Users can react to posts created by pages in the form of likes, comments and shares. Reactions consist of textual comments and rating scores in the form of likes and shares. For each news channel, we present the number of posts, comments, likes, and shares, and the time interval in the collected news dataset as follows:

| News Channel | Posts | Comments | Likes | Shares | Time Interval |
|---|---|---|---|---|---|
| CNN | 33324 | 26582081 | 147310056 | 52936764 | April 2012 - Dec 2016 |
| NPR | 18266 | 4585776 | 56007054 | 18847580 | Nov 2013 - Dec 2016 |
| Fox News | 26525 | 83957661 | 443933576 | 143762565 | Jan 2014 - Dec 2016 |
| The Economist | 24272 | 1336956 | 20206137 | 6376644 | Dec 2014 - Dec 2016 |
| NYT | 47522 | 9226029 | 93891025 | 25616593 | April 2013 - Dec 2016 |

Table 2: Dataset Statistics

From Table 2, we can infer that Fox News is the most popular news channel, as it has the highest reaction-per-post ratio, whereas The Economist is the least popular, as it has the lowest reaction-per-post ratio. Here, reaction is the popularity measure, defined as the sum of likes, comments, and shares.

We also perform preprocessing to remove noisy and unimportant words from the textual posts and comments; this eliminates trivial words that appear frequently in the posts. We remove stop-words, such as ‘a’, ‘an’, ‘the’, etc., as these words do not contain significant information for our analysis. We also employ stemming and lemmatization [33] to reduce inflected or derived words to their root forms. Throughout the rest of the paper, we consider a common time frame from December 2014 to December 2016 for our analysis.

3.2. Sentiment Polarity Identification

In social media, users use informal language to present their textual content, which differentiates social media texts from standard texts.
We list a few examples of the usage of informal language as follows:

• Social media texts, especially comments, usually contain emoticons such as :), :(, :-), |-o, etc.
• An increasing number of users use acronyms such as LOL, smh, ty, wth, etc.
• Social media users use slang words very often, and these words have become a part of the social media lexicon. For example, meh, yep, giggly, and nah are a few commonly used slang words.
• Users also use multiple punctuation marks to emphasize certain words in a sentence.

In order to tackle all the above-mentioned issues, we use the Valence Aware Dictionary and sEntiment Reasoner (VADER), a powerful sentiment analyzer for finding the sentiment of social media texts [34]. VADER is a lexicon and rule-based sentiment reasoner and is best suited for sentiment analysis of content originating in social media [35]. It creates and utilizes a new gold standard sentiment lexicon with 7500 lexical features that are commonly used to express sentiment in social media text. It uses a rule-based method consisting of five rules that embody grammatical and syntactical conventions for expressing sentiment intensity (https://github.com/cjhutto/vaderSentiment). VADER has been compared with 11 sentiment analysis tools/techniques, including SentiWordNet [36], SenticNet [37], and LIWC [38], and it has been shown to outperform all of them. Further, VADER provides a sentiment score in the range of -1 to +1, with -1 being extremely negative, +1 being extremely positive, and 0 being neutral. For the sake of interpretability, we convert these polarity scores to an integer between -5 and +5. The inferred polarity scores were as expected, and a sample can be observed in Table 3.

| Score | Sample Post |
|---|---|
| +5 | It’s just an amazing thing to watch good old-fashioned regular human beings and a whole lot of love change the world seismically |
| +4 | Follow the Queen’s Diamond Jubilee celebrations with the latest photos, videos, facts and trivia. Tell us which part of the festivities you’re most impressed with |
| +3 | “When you do something extraordinary, it’s shown that you can inspire other people.” #CNNHeroes |
| +2 | The world’s first permanent ice hotel has opened in Sweden, thanks to new solar-powered cooling technology |
| +1 | Farmers in the Australian desert are growing 15,000 tons of tomatoes using seawater - and thousands of mirrors |
| 0 | In a tweet, President-elect Donald J. Trump says his businesses won’t do any new deals while he’s in office |
| -1 | Between 2007 and 2014, 30% of African elephants disappeared |
| -2 | “Being exposed to the daily hassles of traffic can lead to higher chronic stress and higher blood pressure,” according to a recent study conducted in Texas |
| -3 | Are we on the verge of a second Cold War? |
| -4 | Terror attacks have ripped apart small towns and big cities across the Middle East and Africa throughout 2016, and this weekend was no different |
| -5 | A young newlywed couple died a horrible death at the hands of the bride’s family |

Table 3: Sentiment polarity of sample posts

3.3. News Posts Categorization

To gain insight into the posting behaviour of news channels across categories, it is useful to categorize the news posts into multiple categories such as sport, entertainment, politics, science and technology (sci&tech), etc. Unlike on online news sites, news posted on social media channels is not categorized. In order to categorize these news posts, we use the unsupervised method LDA [39]. LDA is a probabilistic topic modeling algorithm that represents each document (in this case, a news post) as a mixture of various topics with definite probabilities (θ). A topic is comprised of words or terms. Terms that often occur together are placed under the same topic with high probabilities (φ).
The document-topic distribution (θ) and the term-topic distribution (φ) are computed using Gibbs sampling as follows:

$$\theta_{dj} = \frac{C^{DT}_{dj} + \alpha}{\sum_{k=1}^{T} C^{DT}_{dk} + T\alpha} \tag{1}$$

$$\phi_{ij} = \frac{C^{WT}_{ij} + \beta}{\sum_{k=1}^{W} C^{WT}_{kj} + W\beta} \tag{2}$$

where T, D and α represent the number of topics, the number of documents, and a smoothing constant, respectively. $C^{DT}_{dj}$ is the number of times a term in document d has been assigned to topic j. W, T and β represent the number of terms, the number of topics, and a smoothing constant, respectively. $C^{WT}_{ij}$ is the number of occurrences of word i that have been assigned to topic j. The Gibbs sampling method integrates these two assignments and updates the topic assignment until convergence.

In order to make the topic modeling richer, we augment the post message with its URL information obtained using the Graph API. The augmented text from external documents obtained via the URL was then processed to remove invalid characters and to correct spelling mistakes.

To determine the ideal number of topics k, we perform 5-fold cross validation on perplexity at different values of k. We then compute the rate of perplexity change (RPC) [40] on a 10% random sample. Perplexity is a statistical measure often used to assess the performance of topic models [41]; it reflects the capacity of a model to generalize to a test set or unseen posts. The point where perplexity no longer falls significantly with an increase in the number of topics is used as the ideal number of topics. In our experiment, we find that the optimum value of k is 10, beyond which perplexity does not change significantly.

Since Chang et al. [42] found that perplexity and human judgment are not well correlated, we also evaluate our topics manually using precision [43]. We ask five research scholars with knowledge of topic modeling to judge the relevancy of the topical words generated by the topic model. We ask the research scholars to label each topical word as relevant or non-relevant to the topic assigned by the topic model.
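As a minimal sketch of Equations (1) and (2) above, the following Python snippet evaluates θ and φ from already-sampled count matrices. It does not run Gibbs sampling itself, and it is not the authors' implementation: the function names, the toy counts, and the smoothing values α = 0.5 and β = 0.1 are assumptions chosen only for illustration.

```python
from typing import List


def estimate_theta(doc_topic_counts: List[List[int]], alpha: float) -> List[List[float]]:
    """Equation (1): theta[d][j] = (C_DT[d][j] + alpha) / (sum_k C_DT[d][k] + T * alpha)."""
    num_topics = len(doc_topic_counts[0])  # T
    theta = []
    for row in doc_topic_counts:
        denominator = sum(row) + num_topics * alpha
        theta.append([(count + alpha) / denominator for count in row])
    return theta


def estimate_phi(word_topic_counts: List[List[int]], beta: float) -> List[List[float]]:
    """Equation (2): phi[i][j] = (C_WT[i][j] + beta) / (sum_k C_WT[k][j] + W * beta)."""
    num_words = len(word_topic_counts)      # W
    num_topics = len(word_topic_counts[0])  # T
    # Total number of tokens assigned to each topic (column sums of C_WT).
    topic_totals = [sum(row[j] for row in word_topic_counts) for j in range(num_topics)]
    return [
        [(row[j] + beta) / (topic_totals[j] + num_words * beta) for j in range(num_topics)]
        for row in word_topic_counts
    ]


# Hypothetical counts: 2 documents, 3 vocabulary terms, 2 topics (8 tokens in total).
C_DT = [[3, 1], [0, 4]]          # rows: documents, columns: topics
C_WT = [[2, 0], [1, 1], [0, 4]]  # rows: terms, columns: topics
theta = estimate_theta(C_DT, alpha=0.5)
phi = estimate_phi(C_WT, beta=0.1)
```

With these toy counts, each row of `theta` is a distribution over topics for one document and each column of `phi` is a distribution over terms for one topic, which is what the smoothing terms Tα and Wβ in the denominators guarantee.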
Topics were labeled by the researchers independently, without influencing each other. Topics on which the researchers did not agree were discussed until a consensus was reached. We then compute precision as the fraction of generated topical words that are relevant to the assigned topic. We find that the topic model performs reasonably well, with 80.3% precision. One reason for this is that the number of topics selected for LDA categorization is well suited to the Facebook news posts dataset. Moreover, posts created by news channels on their social media pages are well framed, unlike user-generated content such as comments, tweets, etc.

Further, we provide a label for each topic based on the most relevant terms that uniquely define the topic. Since each topic contains thousands of terms, we extract the top relevant words based on the term-topic distribution. The relevance (r) of a term i to topic j is computed as follows:

$$r_{ij} = \lambda \log(\phi_{ij}) + (1 - \lambda) \log\left(\frac{\phi_{ij}}{p_i}\right) \tag{3}$$

where $\phi_{ij}$ is the term-topic probability and $p_i$ is the empirical probability of the word in the corpus. λ is a weighting term, and we choose 0.6 as an optimal value for λ, as shown by Sievert et al. [44]. We assign each post or document to these labeled topics based on the document-topic probability ($\theta_{dj}$). If document d shows the highest probability for topic j, d is assigned to topic j.

Figure 1: Distribution of news posts across categories

Figure 1 shows the distribution of posts for each channel across categories. We observe that the post distributions of the TV based channels CNN and Fox News are almost similar, with the politics and crime categories containing a higher percentage of posts. In the case of print media based channels such as The New York Times and The Economist, lifestyle, money, politics, and crime news seem to be more common. On the other hand, NPR, which is a radio based channel, posts lifestyle news most often, followed by entertainment (entertain). We show how channels post the news in these categories in Section 4.1.

4. Analysis of News Posts Polarity

We begin our investigation by analyzing the distribution of polarity of post messages, grouped as positive, negative and neutral, for the social media news channels as discussed in Section 1.

Figure 2: Polarity of news posts generated by pages

A quick glance at Figure 2 reveals that the dominant sentiment in posts by all the news channels is always either positive or negative, never neutral. Moreover, posts with neutral sentiment are the least common on the Facebook pages of all media sites except NPR. These inferences support the claim that all the channels tend to generate more positive or negative messages on their Facebook pages to attract users’ attention. Another interesting aspect is the similarity in the distribution of sentiment polarity of posts between media channels that function through the same medium of communication. Posts by television based news channels, such as Fox News and CNN, are predominantly negative, with Fox News generating the highest percentage (40%) of negative news across all the channels. On the other hand, posts by radio and print media based channels such as NPR, The Economist and The New York Times are mostly positive. The radio based news channel NPR generates the highest percentage (43%) of positive news and the lowest percentage (28%) of negative news across all the channels. The print media based channels, The Economist and The New York Times, generate a large proportion of positive news and a smaller proportion of neutral news. Despite the similar pattern of news generation by these two channels, The Economist generates a relatively larger percentage of both positive and negative news compared to neutral news. One reason for this is that The Economist substantially reports growth (i.e. positive news) and decline (i.e. negative news) in business, commerce, and trade.
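The grouping behind a distribution like the one in Figure 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the paper says VADER scores in [-1, +1] are converted to integers between -5 and +5 but does not give the rule, so the linear scaling with rounding below is an assumption, as is the grouping of 0 as neutral and the hypothetical example scores.

```python
from collections import Counter
from typing import Dict, Iterable


def to_integer_polarity(compound: float) -> int:
    """Map a VADER-style score in [-1, 1] to an integer in [-5, 5].

    Assumption: linear scaling followed by Python's built-in rounding
    (which rounds exact halves to the nearest even integer).
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    return round(5 * compound)


def polarity_class(polarity: int) -> str:
    """Assumed grouping: above 0 is positive, below 0 is negative, 0 is neutral."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def polarity_distribution(scores: Iterable[float]) -> Dict[str, float]:
    """Return the percentage of posts that fall into each polarity class."""
    counts = Counter(polarity_class(to_integer_polarity(s)) for s in scores)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one score is required")
    return {label: 100.0 * counts[label] / total
            for label in ("positive", "negative", "neutral")}


# Hypothetical scores for four posts from one channel.
example_scores = [0.62, -0.96, 0.04, 0.34]
example_distribution = polarity_distribution(example_scores)
```

Applying `polarity_distribution` separately to the posts of each channel would give one set of three percentages per channel, which is the kind of comparison the section draws between TV, print and radio based pages.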
To investigate how different types of channels report the same news, we also performed the sentiment experiment on the same news events covered by these channels. However, we did not notice a significant difference in sentiment compared to Figure 2; we observed a sentiment pattern similar to Figure 2 for all the channels.

The news reported by news sources has evolved differently because of the manner in which users consume information in each medium [45]. Our analysis suggests that these differences remain despite the information being disseminated on a common platform. News media are often criticized for their focus on negative news rather than providing a balanced picture of the world [19, 46, 47]. This phenomenon has been attributed to journalistic cynicism and an inherent preference for negative news among users. However, we observe through our analysis that print and radio based social media channels post more positive news than negative news. This finding raises important questions: Is this change in the type of content posted by print and radio based channels precipitated by users’ preference for positive news on social networking platforms? Does this mean that the popular negativity bias theories [4, 48], which state that humans have a predilection for negativity, do not hold true in the case of news consumption in social media? We attempt to answer these questions in Section 5.

4.1. News Posts Polarity across Categories

In this section, we analyze the polarity of news posts across categories to investigate how news channels generate news in each category. We compare the polarity of news generated in multiple categories such as sports, politics, health, entertainment, etc. We observe that channels from similar types of sources show similar patterns.
Due to space constraints, we present the results for only one news channel of each type of information source (print, television, and radio), as follows:

Figure 3: Fox News
Figure 4: The Economist
Figure 5: NPR

It can be observed from Figures 3 and 4 that news belonging to the crime, world and health categories is predominantly negative for both print and television based channels. One reason for this is that news related to crime is, most of the time, woeful and unpleasant. News related to health and the world easily catches the attention of the channels whenever a terrible accident takes place anywhere in the world. However, in the case of NPR (Figure 5), which is a radio based channel, all types of news except crime news are predominantly positive in tone. This suggests that the trend in dominant sentiment observed in Section 4 could also be a result of the same type of news being covered differently by different news channels depending on their primary medium. Thus, by analyzing the sentiments across categories for different organizations, we can conclude that both the differences in the type of news that is often covered and the differences in the tone with which the same news is covered are responsible for the difference in dominant sentiment observed in the previous section.

Moreover, except for NPR, the proportion of news that is neutral in tone is the smallest across all categories. This indicates that the trend of a higher fraction of positive or negative news, which we observed in Section 4, is not limited to a few categories but is one of the tactics adopted for the generation of all types of news. NPR, however, stands out, with negative news being the least common in the majority of the categories. This observation is also consistent with the predominantly positive nature of the news generated by NPR that we observed in the previous section.
Moreover, the similarity in the sentiment distribution of news channels that use the same medium further supports the influence of the medium on the tone with which channels disseminate news.

4.2. Big headlines versus niche news

In this section, we compare the polarity of big headlines and niche news. We analyze the differences in the sentiment of news posts reported by different types of channels for big headlines as well as niche news. We report the findings in the following figures:

Figure 6: Big headlines
Figure 7: Niche news

As can be seen in Figures 6 and 7, the sentiment polarity of big headlines differs from that of niche news. All the channels generate a higher percentage of positive headline news than of negative or neutral headline news (refer to Figure 6). Among all the channels, NPR generates the highest percentage (43%) of positive headline news. One reason for the higher percentage of positive news is that big headlines are very popular and persist for a longer time. If news channels continuously generated a higher fraction of negative news for these events, users might lose interest, which would lead to less engagement on these channels. Negative news goes away very fast [21], so when a big headline persists for some time, channels create more positive news to sustain it. Since research in psychology [49] shows that negative news causes users to worry, channels generate more positive news about the headline to retain users’ interest over time. On the other hand, we do not observe a significant difference in the polarity of niche news as compared to Figure 2. Positive or negative news is more popular than neutral news. TV based channels report more negative news than the radio and print media based channels, whereas print and radio based channels report more positive news.

4.3. Polarity of Same News Events across Channels

In this section, we examine how different types of news channels report the same news. We analyze the sentiments of ten different real-world news events that are posted by social media news channels. In Table 4, positive indicates the percentage of positive news created for the event and negative indicates the percentage of negative news created for the event. For brevity, we do not list the neutral news, which can be determined by subtracting the sum of positive and negative news from one hundred.

News Event | CNN | Fox News | NPR | NYT | The Economist
Presidential Election | 42.3 / 22.5 | 61.5 / 18.2 | 55.3 / 8.1 | 48.7 / 18.7 | 41.9 / 23.8
Same-sex Marriage | 45.6 / 32.8 | 52.3 / 40.7 | 45.2 / 40.4 | 60.5 / 19.8 | 65.3 / 18.6
Obamacare | 50.3 / 32.5 | 42.1 / 30.6 | 45.2 / 21.7 | 41.3 / 35.6 | 65.2 / 16.8
Football | 35.7 / 40.5 | 38.9 / 42.4 | 45.2 / 18.7 | 45.3 / 35.6 | 52.3 / 31.6
Dakota Access Pipeline | 37.4 / 42.3 | 38.5 / 40.9 | 60.4 / 25.7 | 39.4 / 38.5 | 50.6 / 28.3
US Ambassador | 32.3 / 45.7 | 37.7 / 48.6 | 35.3 / 25.6 | 48.5 / 41.3 | 41.4 / 35.4
Hollywood | 32.4 / 42.8 | 35.2 / 43.5 | 43.6 / 27.5 | 45.2 / 30.3 | 53.6 / 25.7
Ebola | 39.4 / 42.3 | 34.7 / 39.8 | 38.6 / 46.5 | 28.3 / 45.7 | 25.4 / 56.8
MH370 Flight | 15.3 / 50.2 | 10.3 / 60.4 | 6.8 / 75.6 | 8.4 / 60.3 | 5.3 / 66.8
Zika Virus | 19.4 / 41.6 | 34.2 / 47.5 | 28.8 / 50.3 | 10.4 / 48.8 | 31.5 / 40.3

Table 4: Sentiment polarity of news events across channels (each cell: Positive % / Negative %)

As can be seen in Table 4, different news channels report the same news events differently. Although the channels report the same news with different percentages of sentiment polarity, all of them generate a large fraction of positive news for big headlines such as the Presidential Election, Same-sex Marriage, and Obamacare (refer to Section 4.2). However, if a headline is very negative in nature, such as the MH370 Flight disappearance, all the news channels generate a large fraction (more than 50%) of negative news due to the nature of the event. Also, all the channels generate a major fraction of negative news if the news is related to an epidemic such as Ebola or Zika Virus, an attack, or a natural disaster.

Although all the big headlines are reported with a similar pattern of sentiment, regular events and events that are not big headlines are reported differently by the channels. For regular events or minor headlines (e.g., Football, US Ambassador, Hollywood, Dakota Access Pipeline), channels usually generate different sentiment patterns of positive, negative and neutral news. In this case, positive or negative news is more popular than neutral news. TV based channels report more negative news than the radio and print media based channels. On the other hand, print and radio based channels report more positive news. This observation is in line with our analysis in Section 4; because regular events make up the majority of news, we observe a similar sentiment pattern in Figure 2.
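As a small worked example of the arithmetic behind Table 4, the neutral share that the table omits can be recovered from the positive and negative percentages. The sketch below uses three CNN entries copied from Table 4; the dictionary layout and the helper name `neutral_share` are illustrative assumptions, not part of the original analysis.

```python
# Positive and negative percentages copied from three rows of Table 4
# (CNN column only). The dictionary layout is an assumption for illustration.
table4_cnn = {
    "Presidential Election": (42.3, 22.5),
    "MH370 Flight": (15.3, 50.2),
    "Zika Virus": (19.4, 41.6),
}


def neutral_share(positive: float, negative: float) -> float:
    """Return the neutral percentage implied by the other two classes."""
    neutral = 100.0 - (positive + negative)
    if neutral < 0:
        raise ValueError("positive and negative shares exceed 100%")
    # Round to one decimal place, matching the precision used in Table 4.
    return round(neutral, 1)


neutral_cnn = {
    event: neutral_share(pos, neg) for event, (pos, neg) in table4_cnn.items()
}

for event, share in neutral_cnn.items():
    print(f"{event}: {share}% neutral")
```

For these three rows the implied neutral shares are 35.2%, 34.5% and 39.0% respectively.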
Apart from the results shown in Table 4, we also observe that news related to major breakthroughs in science and technology (e.g., news from NASA or MIT), highly reputed awards (e.g., the Nobel Prize, the Oscars) and persons (e.g., the Pope, the Dalai Lama) is reported with a significantly positive tone (more than 55% positive) across all the channels. One reason is that these are highly esteemed organizations, awards and persons. Owing to this highly positive nature, all the channels generate a major fraction of positive news about them.

5. Popularity versus Polarity

We analyze the popularity of a news post as a function of its polarity. Affinity metrics such as the comments, likes, and shares received on a post are good indicators of its popularity. However, each of these actions involves a different level of interaction and is assigned a different weight in the Facebook newsfeed algorithm [50], with a share receiving the highest weight and a like the least. Hence, we do not aggregate these counts but analyze them separately. To account for the large difference in popularity between the news sites under consideration, we scale the counts of each affinity metric to the range 0 to 1 and use the normalized values to determine the popularity of news posts.

Figure 8: Likes

We observe in Figures 8-10 that posts that are either positive or negative are more popular than neutral ones in most cases. This suggests that news posts that are either positive or negative in tone tend to be more popular in social media. This finding is in line with observations made by Naveed et al. [20], who state that people are more attracted to positive or negative news than to neutral news. The exceptions are Fox News (for comments only) and The Economist. One of the reasons for this is that The Economist reported a major fraction of its news on money and lifestyle (refer to Figure 1).
The Economist also reported the highest number of money or business related news posts among the news channels. We observed that these posts mostly reported facts and figures (usually neutral in sentiment), which leads to a higher number of reactions for neutral news.

Figure 9: Comments
Figure 10: Shares

Further, regarding the preference between positive and negative content, we observe that positive posts receive more likes whereas negative posts receive more comments and shares. The results thus agree with the “negativity bias” for actions that involve a greater level of engagement, such as commenting and sharing, but disagree with it for simpler actions such as liking a post. Trussler and Soroka [51] performed an eye tracking experiment to understand consumer demand for negative news frames and found a similar result: participants said they preferred good news but in reality often chose negative news stories over positive ones. While it is apparent that negative news receives a greater level of engagement, it is important to understand whether users engage positively or negatively with the content in order to design any plan for eliciting appropriate user opinions. We answer this question in the next section.

6. User Opinion Analysis

In addition to indicating the popularity of a post, user opinions (comments) provide a great deal of information about the tone of the audience, which reveals whether the post is being perceived positively or negatively. To understand how users respond to posts of different sentiment polarities, we determine the average sentiment polarity of the comments received for each post.
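The per-post averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions: the post identifiers and comment scores are hypothetical, and the scores are assumed to have been produced beforehand by some sentiment tool; the function name `average_comment_sentiment` is not from the paper.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (post_id, comment_sentiment_score) pairs. The scores are
# assumed to have been computed beforehand by a sentiment analysis tool.
comment_scores = [
    ("post_1", 2.0), ("post_1", 1.0), ("post_1", -1.0),
    ("post_2", -3.0), ("post_2", -1.0),
    ("post_3", 0.0),
]


def average_comment_sentiment(pairs):
    """Group comment scores by post and return the mean score per post."""
    grouped = defaultdict(list)
    for post_id, score in pairs:
        grouped[post_id].append(score)
    return {post_id: mean(scores) for post_id, scores in grouped.items()}


avg_by_post = average_comment_sentiment(comment_scores)

for post_id, avg in avg_by_post.items():
    print(f"{post_id}: average comment sentiment {avg:.2f}")
```

Each post is thereby reduced to a single value that can be compared against the sentiment polarity of the post itself.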
We show the relationship between the average sentiment polarity of comments and the sentiment polarity of posts as follows:

Figure 11: CNN
Figure 12: Fox News
Figure 13: The Economist
Figure 14: The New York Times

We observe for all five channels that as posts become more positive, comments also become increasingly positive. As can also be seen in Table 5, there is a strong correlation between the sentiment polarity of comments and the sentiment polarity of posts. Comments on all three types of channels have a high sentiment correlation with the posts, and TV based channels show the highest correlation. To validate our findings, we compute p-values [52], which show that the correlations between post and comment sentiment are significant at p < .05. We can thus infer that posts written with varying levels of sentiment polarity prompt different reactions from users. A high correlation between post sentiment and comment sentiment suggests that measures of the sentiment polarity of posts can be used to correct for biases that occur when aggregating comments from various channels for tasks such as opinion mining, opinion summarization, and real-world outcome prediction.

However, the polarity at which a post starts attracting negative comments varies with the medium of the channel. While the Facebook pages of TV based channels, on average, attract negative comments for negative posts and positive comments for positive posts, comments on print media based channels do not become negative until posts become strongly negative in tone (i.e., a sentiment score below -3). It is interesting to note that the average comment sentiment polarity of NPR, which posts the highest proportion of positive content among all the channels, remains positive irrespective of the polarity of the post.
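To make the correlation analysis concrete, the sketch below computes a correlation coefficient between post sentiment and average comment sentiment together with a significance estimate. It rests on several assumptions: the text does not state which correlation coefficient or test was used, so Pearson's r and a permutation test are swapped in here for illustration, and the data values are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided permutation p-value for the observed correlation.

    Shuffles ys repeatedly and counts how often the shuffled correlation
    is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Hypothetical data: post sentiment scores and the average sentiment of
# the comments each post received.
post_sentiment = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
avg_comment_sentiment = [-1.8, -1.1, -0.9, -0.2, 0.1, 0.6, 0.8, 1.5, 1.7]

r = pearson_r(post_sentiment, avg_comment_sentiment)
p = permutation_p_value(post_sentiment, avg_comment_sentiment)
print(f"r = {r:.2f}, significant at 0.05: {p < 0.05}")
```

A permutation test is used only because it needs no distribution tables from the standard library; a parametric test on the correlation coefficient would serve the same purpose.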
Figure 15: NPR

News Channel | CNN | Fox News | The Economist | NYT | NPR
Correlation | 0.97 | 0.98 | 0.95 | 0.97 | 0.93

Table 5: Correlation between post sentiment and comment sentiment

Recall from Section 4 that posts by TV based news channels were predominantly negative, whereas those by print media and radio based channels were predominantly positive, with the radio based channel having the highest percentage of positive posts. This suggests that the sentiment expressed in the comments is strongly influenced not only by the polarity of the particular post but also by users’ opinion of the channel posting the news. The user opinion of the news channel is in turn shaped by whether the majority of its posts have a positive or negative tilt. That is, Facebook pages of TV news channels, which mostly post negative content, attract more negative comments, whereas channels that are positive in tone, like print media and radio, attract fewer negative comments.

7. Temporal Analysis

In this section, we analyze the polarity of news posts temporally. We investigate how the polarity varies over the years. We also investigate whether posts of a certain polarity drastically increase or decrease in particular months or on particular days of the week. A common time frame from December 2014 to December 2016 is considered for the analysis. We present the results as follows:

Figure 16: The Economist
Figure 17: NPR

Figures 16-19 show the polarity of the news articles over the years. We observe that the behavior of television, print and radio based channels remains the same as that revealed in Section 4, i.e., negative sentiment dominates in the television based channels, while positive sentiment dominates in the radio and print media based ones the majority of the time. We observe that, over time, the Facebook pages of print media based channels (Fig. 16) show a gradual decrease in the percentage of positive posts, while neutral posts increase and negative ones remain almost constant.
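The kind of temporal aggregation behind such plots can be sketched as below: posts are grouped by calendar month and the percentage of each polarity is computed per month. The post dates and labels are hypothetical, and the function name `monthly_polarity_share` is an assumption made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical (posting date, polarity label) pairs for one channel.
posts = [
    (date(2015, 4, 3), "positive"),
    (date(2015, 4, 18), "positive"),
    (date(2015, 4, 21), "negative"),
    (date(2015, 5, 2), "neutral"),
    (date(2015, 5, 9), "positive"),
]


def monthly_polarity_share(labelled_posts):
    """Return {(year, month): {polarity: percentage of that month's posts}}."""
    counts = defaultdict(Counter)
    for posted_on, polarity in labelled_posts:
        counts[(posted_on.year, posted_on.month)][polarity] += 1

    shares = {}
    for month, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[month] = {
            label: 100.0 * counter[label] / total
            for label in ("positive", "negative", "neutral")
        }
    return shares


shares = monthly_polarity_share(posts)

for (year, month), share in shares.items():
    print(year, month, {label: round(value, 1) for label, value in share.items()})
```

The same grouping applies to a day-of-week analysis by keying on `posted_on.weekday()` instead of the year and month.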
As can also be seen in Figures 16 and 17, there is a peak of positive sentiment during the months of April to July 2015. One reason is that two big headlines, about Obamacare and same-sex marriage, were trending during that time. Example news posts for these headlines are: (1) The Supreme Court has ruled on Obamacare subsidies; (2) US supreme court declares same-sex marriage legal. These big headlines lead to the sentiment peak in Figures 16 and 17, which is in line with our analysis in Section 4.2 that news channels generate slightly more positive than negative news for big headlines.

Figure 18: Fox News
Figure 19: CNN

Both television based channels, CNN and Fox News, exhibit a slight increase in positive news relative to negative news in 2016. Such similar trends in the graphs of same-medium news sites (Figs. 18 and 19) suggest that they behave similarly and respond similarly to external events. Two big headlines are among the reasons for the slight dip in negative sentiment and the rise in positive sentiment. The first is the US presidential election: these TV based channels are somewhat biased towards a party and generate positive news about that party. The second is the award of the Nobel Peace Prize to the Colombian president.

We also analyze the news posts over months and weeks but do not notice considerable variations in their polarities. On inspecting the sentiment distribution over the months, we do not observe any consistent and significant change for particular months of the year. The weekly analysis likewise does not reveal any significant change in the distribution, except for a slight increase in positive posts on weekends. One reason is that news channels posted slightly more entertainment, lifestyle, and sports news during the weekend.
This posting behavior leads to a slight increase in positive posts on weekends. Moreover, the dominant sentiment in both the monthly and weekly analysis is similar to the behavior of the channels observed in Section 4. Thus, by analyzing the sentiment temporally, we can conclude that, on average, the specific characteristics observed in Section 4 are exhibited consistently across weeks, months and years.

8. Conclusion

In this paper, we conducted an extensive analysis of social media news channels from three types of news sources to study the sentiment of the news generated by these channels and its effect on users’ reactions. We categorized the news to uncover the distribution of news posts and their sentiment polarity across categories. Our analysis revealed that the sentiment of the news posted by different types of social media channels depends on the medium through which these channels traditionally disseminated the news. We also investigated the popularity of news with different polarities to gain insight into the type of news that attracts users’ attention. Interestingly, we found that news with positive or negative sentiment receives a lot of user attention. We also found that users’ opinions depend on the sentiment of news articles and the type of information source. Finally, we performed a temporal analysis to understand how the news posting behaviour of social media channels evolves over time. Future work can look at actual online news articles from different types of online news channels to investigate the differences between these articles and to compare them with social media news posts.

References

[1] U. K. Ecker, S. Lewandowsky, E. P. Chang, R. Pillai, The effects of subtle misinformation in news headlines, Journal of experimental psychology: applied 20 (4) (2014) 323.
[2] S. Soroka, S.
McAdams, News, politics, and negativity, Political Communication 32 (1) (2015) 1–22.
[3] J. Gottfried, E. Shearer, News use across social media platforms 2016, Pew Research Center 26.
[4] P. Rozin, E. B. Royzman, Negativity bias, negativity dominance, and contagion, Personality and social psychology review 5 (4) (2001) 296–320.
[5] E. Cambria, B. Schuller, Y. Xia, C. Havasi, New avenues in opinion mining and sentiment analysis, IEEE Intelligent Systems 28 (2) (2013) 15–21.
[6] G. Vinodhini, R. Chandrasekaran, Sentiment analysis and opinion mining: a survey, International Journal 2 (6) (2012) 282–292.
[7] M. Bautin, L. Vijayarenu, S. Skiena, International sentiment analysis for news and blogs, in: ICWSM, 2008.
[8] N. Godbole, M. Srinivasaiah, S. Skiena, Large-scale sentiment analysis for news and blogs, ICWSM 7 (21) (2007) 219–222.
[9] T. E. Patterson, Out of Order: An incisive and boldly original critique of the news media’s domination of Ameri, Vintage, 2011.
[10] S. Stieglitz, L. Dang-Xuan, Emotions and information diffusion in social media - sentiment of microblogs and sharing behavior, Journal of Management Information Systems 29 (4) (2013) 217–248.
[11] M. Ahmed, S. Spagna, F. Huici, S. Niccolini, A peek into the future: Predicting the evolution of popularity in user generated content, in: Proceedings of the sixth ACM international conference on Web search and data mining, ACM, 2013, pp. 607–616.
[12] R. Bandari, S. Asur, B. A. Huberman, The pulse of news in social media: Forecasting popularity, ICWSM 12 (2012) 26–33.
[13] J. G. Lee, S. Moon, K. Salamatian, An approach to model and predict the popularity of online contents with explanatory factors, in: Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on, Vol. 1, IEEE, 2010, pp. 623–630.
[14] K. Lerman, T.
Hogg, Using a model of social dynamics to predict popularity of news, in: Proceedings of the 19th international conference on World wide web, ACM, 2010, pp. 621–630.
[15] G. Szabo, B. A. Huberman, Predicting the popularity of online content, Communications of the ACM 53 (8) (2010) 80–88.
[16] A. Tatar, P. Antoniadis, M. D. De Amorim, S. Fdida, From popularity prediction to ranking online news, Social Network Analysis and Mining 4 (1) (2014) 1–12.
[17] I. Arapakis, M. Lalmas, B. B. Cambazoglu, M.-C. Marcos, J. M. Jose, User engagement in online news: Under the scope of sentiment, interest, affect, and gaze, Journal of the Association for Information Science and Technology 65 (10) (2014) 1988–2005.
[18] M. Coscia, Average is boring: How similarity kills a meme’s success, Scientific reports 4 (2014) 6477.
[19] L. K. Hansen, A. Arvidsson, F. Å. Nielsen, E. Colleoni, M. Etter, Good friends, bad news-affect and virality in twitter, Future information technology (2011) 34–43.
[20] N. Naveed, T. Gottron, J. Kunegis, A. C. Alhadi, Bad news travel fast: A content-based analysis of interestingness on twitter, in: Proceedings of the 3rd International Web Science Conference, ACM, 2011, p. 8.
[21] S. Wu, C. Tan, J. Kleinberg, M. W. Macy, Does bad news go away faster?, in: Fifth International AAAI Conference on Weblogs and Social Media, 2011.
[22] J. Reis, P. Gonçalves, P. Vaz de Melo, R. Prates, F. Benevenuto, Magnet news: You choose the polarity of what you read, Proceedings of ICWSM.
[23] J. C. S. dos Rieis, F. B. de Souza, P. O. S. V. de Melo, R. O. Prates, H. Kwak, J. An, Breaking the news: First impressions matter on online news, in: Ninth International AAAI Conference on Web and Social Media, 2015.
[24] N. Diakopoulos, M. Naaman, Topicality, time, and sentiment in online news comments, in: CHI’11 Extended Abstracts on Human Factors in Computing Systems, ACM, 2011, pp. 1405–1410.
[25] A.
Zubiaga, Newspaper editors vs the crowd: on the appropriateness of front page news selection, in: Proceedings of the 22nd International Conference on World Wide Web, ACM, 2013, pp. 879–880.
[26] T. De Nies, G. Haesendonck, F. Godin, W. De Neve, E. Mannens, R. Van de Walle, Towards automatic assessment of the social media impact of news content, in: Proceedings of the 22nd International Conference on World Wide Web, ACM, 2013, pp. 871–874.
[27] M. Burke, M. Develin, Once more with feeling: Supportive responses to social sharing on facebook, in: Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work & Social Computing, ACM, 2016, pp. 1462–1474.
[28] C. Castillo, M. El-Haddad, J. Pfeffer, M. Stempeck, Characterizing the life cycle of online news stories using social media reactions, in: Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing, ACM, 2014, pp. 211–223.
[29] J. Cheng, C. Danescu-Niculescu-Mizil, J. Leskovec, How community feedback shapes user behavior, in: Eighth International AAAI Conference on Weblogs and Social Media, 2014.
[30] K. D’Costa, Dont read the comments! (why do we read the online comments when we know theyll be bad?), Scientific American.
[31] T. Moosa, Comment sections are poison: handle with care or remove them, The Guardian.
[32] Facebook, Facebook graph api, https://developers.facebook.com/docs/graph-api (2017).
[33] A. K. Uysal, S. Gunal, The impact of preprocessing on text classification, Information Processing & Management 50 (1) (2014) 104–112.
[34] C. J. Hutto, E. Gilbert, Vader: A parsimonious rule-based model for sentiment analysis of social media text, in: Eighth international AAAI conference on weblogs and social media, 2014.
[35] A. Tamersoy, M. De Choudhury, D. H. Chau, Characterizing smoking and drinking abstinence from social media, in: Proceedings of the 26th ACM Conference on Hypertext & Social Media, ACM, 2015, pp. 139–148.
[36] S. Baccianella, A. Esuli, F.
Sebastiani, Sentiwordnet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining, in: Lrec, Vol. 10, 2010, pp. 2200–2204.
[37] E. Cambria, C. Havasi, A. Hussain, Senticnet 2: A semantic and affective resource for opinion mining and sentiment analysis, in: FLAIRS conference, 2012, pp. 202–207.
[38] J. W. Pennebaker, M. E. Francis, R. J. Booth, Linguistic inquiry and word count: Liwc 2001, Mahwah: Lawrence Erlbaum Associates 71 (2001) (2001) 2001.
[39] D. M. Blei, A. Y. Ng, M. I. Jordan, Latent dirichlet allocation, Journal of machine Learning research 3 (Jan) (2003) 993–1022.
[40] W. Zhao, J. J. Chen, R. Perkins, Z. Liu, W. Ge, Y. Ding, W. Zou, A heuristic approach to determine an appropriate number of topics in topic modeling, BMC bioinformatics 16 (13) (2015) S8.
[41] T. L. Griffiths, M. Steyvers, Finding scientific topics, Proceedings of the National Academy of Sciences 101 (suppl 1) (2004) 5228–5235.
[42] J. Chang, J. L. Boyd-Graber, S. Gerrish, C. Wang, D. M. Blei, Reading tea leaves: How humans interpret topic models, in: Nips, Vol. 31, 2009, pp. 1–9.
[43] R. Agrawal, S. Gollapudi, A. Halverson, S. Ieong, Diversifying search results, in: WSDM, ACM, 2009, pp. 5–14.
[44] C. Sievert, K. E. Shirley, Ldavis: A method for visualizing and interpreting topics, in: Proceedings of the workshop on interactive language learning, visualization, and interfaces, 2014, pp. 63–70.
[45] W. L. Bennett, News: The politics of illusion, University of Chicago Press, 2016.
[46] C. Budak, S. Goel, J. M. Rao, Fair and balanced? quantifying media bias through crowdsourced content analysis, Public Opinion Quarterly 80 (S1) (2016) 250–271.
[47] K. Leetaru, Culturomics 2.0: Forecasting large-scale human behavior using global news media tone in time and space, First Monday 16 (9).
[48] R. F. Baumeister, E. Bratslavsky, C. Finkenauer, K. D. Vohs, Bad is stronger than good, Review of general psychology 5 (4) (2001) 323.
[49] W. M. Johnston, G. C.
Davey, The psychological impact of negative tv news bulletins: The catastrophizing of personal worries, British Journal of Psychology 88 (1) (1997) 85–91.
[50] C. Kim, S.-U. Yang, Like, comment, and share on facebook: How each behavior differs from the other, Public Relations Review 43 (2) (2017) 441–449.
[51] M. Trussler, S. Soroka, Consumer demand for cynical and negative news frames, The International Journal of Press/Politics (2014) 360–379.
[52] T. Dahiru, P-value, a true test of statistical significance? a cautionary note, Annals of Ibadan postgraduate medicine 6 (1) (2008) 21–26.