Misinformation in Social Media: Definition, Manipulation, and Detection
ABSTRACT
The widespread dissemination of misinformation in social media has recently received a lot of attention in academia. While the problem of misinformation in social media has been intensively studied, there are seemingly different definitions for the same problem, and inconsistent results in different studies. In this survey, we aim to consolidate the observations and investigate how an optimal method can be selected given specific conditions and contexts. To this end, we first introduce a definition for misinformation in social media and examine the difference between misinformation detection and classic supervised learning. Second, we describe the diffusion of misinformation and introduce how spreaders propagate misinformation in social networks. Third, we explain characteristics of individual methods of misinformation detection, and provide commentary on their advantages and pitfalls. By reflecting on the applicability of different methods, we hope to enable the intensive research in this area to be conveniently reused in real-world applications and to open up potential directions for future studies.

(This review article has been partially presented as a tutorial at SBP'16 and ICDM'17.)

1. INTRODUCTION
The openness and timeliness of social media have largely facilitated the creation and dissemination of misinformation, such as rumor, spam, and fake news. As witnessed in recent incidents of misinformation, how to detect misinformation in social media has become an important problem. It is reported that over two thirds of adults in the US read news from social media, with 20% doing so frequently1. Though the spread of misinformation has been studied in journalism, the openness of social networking platforms, combined with the potential for automation, facilitates the rapid propagation of misinformation to large groups of people, which brings about unprecedented challenges.

1 https://www.reuters.com/article/us-usa-internet-socialmedia/two-thirds-of-american-adults-get-news-from-social-media-survey-idUSKCN1BJ2A8

By definition, misinformation is false or inaccurate information that is deliberately created and is intentionally or unintentionally propagated. However, as illustrated in Figure 1, there are several similar terms that may easily get confused with misinformation. For example, disinformation also refers to inaccurate information, but is usually distinguished from misinformation by the intention of deception; fake news refers to false information in the form of news (which is not necessarily disinformation, since it may be unintentionally shared by innocent users); rumor refers to unverified information that can be either true or false; and spam refers to irrelevant information that is sent to a large number of users. A clear definition is helpful for establishing a scope or boundary of the problem, which is crucial for designing a machine learning algorithm.

Figure 1: Key terms related to misinformation.

Another challenge is that results on similar problems can often be inconsistent. This is usually caused by the heterogeneity of various misinformation applications, where different features, experimental settings, and evaluation measures may be adopted in different papers. The inconsistency makes it difficult to relate one method to another, which hinders research results from being applied in real-world applications. To this end, this survey aims to review existing approaches and literature by categorizing them based on their datasets and experimental settings. Through examining these methods from the perspective of machine learning, our goal is to consolidate seemingly different results and observations, and to allow related practitioners and researchers to reuse existing methods and learn from the results.

In this work, we aim to (1) introduce a definition for misinformation in social media that helps establish a clear scope for related research; (2) discuss how misinformation spreaders actively avoid being detected and propagate misinformation; and (3) review existing approaches and consolidate different results, observations and methods from the perspective of machine learning. As we discussed earlier, a definition for misinformation in social media can help a detection method focus on the specific scope of the problem.
Through studying the diffusion process of misinformation, and how misinformation spreaders manage to avoid being detected, we will introduce methods that are robust to such adversarial attacks. By reviewing existing approaches based on their datasets, features, and experimental settings, we find that the performance of a method relies on the information provided by a problem, such as the availability of content and network data, and on the requirements of a solution; thus no single method is superior to the rest. We hope these findings will make existing research and results handy for real-world applications.

The rest of the paper is organized as follows. Section 2 presents a definition for misinformation and discusses several related concepts. Section 3 examines misinformation diffusion and several types of adversarial attacks of misinformation spreaders, and introduces countermeasures that make a detection system robust to such attacks. Section 4 introduces misinformation detection methods, which focus on optimizing both accuracy and earliness. Section 5 discusses feature engineering methods, available datasets, ground truth and evaluation methods. Section 6 concludes the survey and provides several future directions in this area.

2. MISINFORMATION DEFINITION
There are several related terms similar to misinformation. While some of these concepts are relatively easy to distinguish, such as spam (sent to a large number of recipients), rumor (verified or unverified) and fake news (in the format of news), the most similar and confusing term is disinformation. Misinformation and disinformation both refer to fake or inaccurate information, and a key distinction between them lies in the intention: whether the information is deliberately created to deceive. Disinformation usually refers to the intentional cases, while misinformation covers the unintentional ones. Throughout our discussion, we use misinformation as an umbrella term that includes all false or inaccurate information spread in social media. We choose to do so since, on a platform where any user can publish anything, it is particularly difficult for researchers, practitioners, or even administrators of social network companies to determine whether a piece of misinformation is deliberately created or not.

The various concepts that are covered by the umbrella term, such as disinformation, spam, rumor and fake news, all share the characteristic that the inaccurate messages can cause distress and various kinds of destructive effects through social media, especially when timely intervention is absent. There have been examples of widespread misinformation in social media during the 2016 Presidential Election in the US that fueled unnecessary fears through social media. One of them is PizzaGate, a conspiracy theory about a pizzeria being a nest of child-trafficking. It started breaking out simultaneously on multiple social media sites including Facebook, Twitter and Reddit2. After being promoted by radios and podcasts3, the tense situation finally motivated someone to fire a rifle inside the restaurant4. PizzaGate even continued to circulate for a while after the gunfire and after being debunked.

To better understand misinformation in social media, we organize different types of misinformation below, though the categorization is not exclusive.

• Unintentionally-Spread Misinformation: Some misinformation is not intended to deceive its recipients. Regular and benign users may contribute to the propagation merely due to their trust in information sources, such as their friends, family, colleagues or influential users in the social network. Instead of wanting to deceive, they usually try to inform their social network friends of a certain issue or situation. An example is the widespread misinformation about Ebola5.

• Intentionally-Spread Misinformation: Some misinformation is intentionally spread to deceive its recipients, which has triggered the intensive discussion about misinformation and fake news recently. There are usually writers and coordinated groups of spreaders behind the popularity, who have a clear goal and agenda to compile and promote the misinformation. Typical examples of intentionally-spread misinformation include the conspiracies, rumors and fake news that were trending during the 2016 Presidential Election. For example, a fake-news writer, Paul Horner6, has claimed credit for several pieces of fake news that went viral in 2017.

• Urban Legend: An urban legend is intentionally-spread misinformation that is related to fictional stories about local events. The purpose is often entertainment.

• Fake News: Fake news is intentionally-spread misinformation that is in the format of news. Recent incidents reveal that fake news can be used as propaganda and go viral through news media and social media [39; 38].

• Unverified Information: Unverified information is also included in our definition, although it can sometimes be true and accurate. A piece of information can be defined as unverified information before it is verified, and information that is verified to be false or inaccurate obviously belongs to misinformation. Unverified information may trigger similar effects as other types of misinformation, such as fear, hatred and astonishment.

• Rumor: A rumor is unverified information that can turn out to be true (a true rumor). An example of a true rumor concerns the deaths of several ducks in Guangxi, China, which were claimed to be caused by avian influenza7. It remained a rumor until it was verified to be true by the government8. A similar example about avian influenza, which turned out to be false, was that some people had been infected through eating well-cooked chicken9.

• Crowdturfing: Crowdturfing is a concept that originated from astroturfing, in which a campaign masks its supporters and sponsors to make it appear to be launched by grassroots participants. Crowdturfing is "crowdsourced" astroturfing, where supporters obtain their seemingly grassroots participants through the internet. Similar to unverified information or rumor, the information promoted by crowdturfing may also be true, but the popularity inflated by crowdsourcing workers is fake and unfair. Some incidents of misinformation that cause negative effects are driven by crowdturfing. There are several online platforms where crowdturfing workers can be easily hired, such as Zhubajie, Sandaha, and Fiverr. There have been claims that crowdturfing has been used to target certain politicians10.

• Spam: Spam is unsolicited information that unfairly overwhelms its recipients. It has been found on various platforms including instant messaging, email and social media.

• Troll: Another kind of misinformation we focus on is trolling. Trolling aims to cause disruption and argument towards a certain group of people. Different from other types of misinformation that try to convince their recipients, trolling aims to increase the tension between ideas and ultimately to deepen hatred and widen the gap. For example, the probability that a median voter votes for a certain candidate can be increased by trolling. In 2016, the troll army that has been claimed to be controlled by the Russian government was accused of trolling at key election moments11.

• Hate Speech: Hate speech refers to abusive content on social media that targets certain groups of people, expressing prejudice and threats. A dynamic interplay was found between the 2016 presidential election and hate speech against some protected groups, and the peak of hate speech was reached on election day12.

• Cyberbullying: Cyberbullying is a form of bullying that happens online, usually in social media, and may consist of any form of misinformation, such as rumor and hate speech.

2 https://www.nytimes.com/2016/11/21/technology/fact-check-this-pizzeria-is-not-a-child-trafficking-site.html
3 https://www.nytimes.com/2017/03/25/business/alex-jones-pizzagate-apology-comet-ping-pong.html
4 https://www.nytimes.com/2016/12/05/us/pizzagate-comet-ping-pong-edgar-maddison-welch.html
5 http://time.com/3479254/ebola-social-media/
6 https://www.nytimes.com/2017/09/27/business/media/paul-horner-dead-fake-news.html
7 http://www.chinadaily.com.cn/en/doc/2004-01/28/content_301225.htm
8 http://www.people.com.cn/GB/shizheng/1026/2309847.html (in Chinese)
9 http://www.xinhuanet.com/food/2017-02/22/c_1120506534.htm
10 https://www.fastcompany.com/3067643/how-trumps-opponents-are-crowdsourcing-the-resistance
11 https://www.nbcnews.com/tech/social-media/russian-trolls-went-attack-during-key-election-moments-n827176
12 https://www.washingtonpost.com/news/post-nation/wp/2018/03/23/hate-crimes-rose-the-day-after-trump-was-elected-fbi-data-show/

3. MANIPULATION OF MISINFORMATION
In this section, we investigate solutions to address the challenges brought by adversarial attacks of misinformation spreaders. There are different types of spreaders; we focus on those who spread misinformation in social networks, and particularly on those who spread it on purpose. Traditional approaches mainly focus on their excessive suspicious content and network topology, which obviously distance them from normal users. However, as indicated by recent incidents of rumor and fake news, misinformation spreaders are not easily discovered with simple metrics like the number of followers or the followee/follower ratio. Instead, they actively disguise themselves, and the performance of a classic supervised learning system would degrade rapidly due to such adversarial attacks. For example, a malicious user may copy content from other legitimate accounts, or even use compromised accounts to camouflage the misinformation they are spreading. In order to appear as benign users, they may also establish links with other accounts to manipulate the network topology. To further complicate the problem, there is a lack of label information for the disguised content or behaviors, which makes it difficult to capture the signal of misinformation. In summary, there are mainly two kinds of attacks in social media.

Manipulation of Networks: Since many users follow back when they are followed by someone for the sake of courtesy, misinformation spreaders can establish a decent number of links with legitimate users [37]. These noisy links no longer reflect homophily between two nodes, which undermines the performance of existing approaches. In addition, misinformation spreaders may even form a group by connecting with each other, and such coordinated behaviors are particularly challenging for a traditional method.

Manipulation of Content: It is easy for a misinformation spreader to copy a significant portion of content from legitimate accounts. The misinformation that they intend to spread is camouflaged by the legitimate messages to avoid being detected. Traditional approaches merge the posts of an account into a single attribute vector, which becomes less distinguishable and fails to capture the signal of misinformation.
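To make the first attack above (manipulation of networks) concrete, the following minimal sketch shows how a naive followee/follower-ratio rule is evaded once a spreader collects courtesy follow-backs. The toy graph, account names, and threshold are ours for illustration only; they are not drawn from any cited work.

```python
# Toy illustration (not from any cited paper): a spreader who farms
# follow-backs pushes its followee/follower ratio below a naive threshold.

follows = {  # directed edges: follower -> followee
    ("spreader", "alice"), ("spreader", "bob"), ("spreader", "carol"),
    ("spreader", "dave"), ("spreader", "erin"), ("spreader", "frank"),
}

def ratio(user, edges):
    """Followee/follower ratio; high values are often treated as suspicious."""
    followees = sum(1 for src, dst in edges if src == user)
    followers = sum(1 for src, dst in edges if dst == user)
    return followees / max(followers, 1)

SUSPICIOUS = 3.0  # hypothetical cut-off for a naive rule

print("before follow-backs:", ratio("spreader", follows))  # 6.0 -> flagged

# Courtesy follow-backs from four of the six targeted users.
follows |= {(u, "spreader") for u in ["alice", "bob", "carol", "dave"]}

print("after follow-backs:", ratio("spreader", follows))   # 1.5 -> passes
print("flagged:", ratio("spreader", follows) > SUSPICIOUS)  # False
```

The same farmed links also pollute graph-based methods, since an edge to a legitimate user no longer signals similar behavior between the two endpoints.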
3.1 Content-based Manipulation
Social network users are naturally characterized by the content they create and spread. Therefore, a direct way to identify misinformation spreaders among social network accounts is to model their content information. Traditional approaches mainly focus on classification methods, trying to decode the coordinated behaviors of misinformation spreaders and learn a binary classifier. For the rest of the subsection, we introduce traditional methods, adversarial attacks against these models, and possible solutions for tackling the challenges. Figure 2 illustrates an example of misinformation on Twitter. In social media websites, the content information can usually be extracted from posts and user profiles. We summarize several categories of methods based on the content information they rely on.

Content extracted from a user's posts has been studied in early research to directly identify misinformation spreaders [21], and a text classifier can be used to classify malicious users. In order to jointly utilize the network information, previous work also extracts links between social network users, and the classification task is reduced to categorizing attributed vertices in a graph [14; 16].
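As a rough illustration of the traditional content-based setting described above, the sketch below concatenates each account's posts into one document and trains a standard text classifier. The toy accounts, labels, and the TF-IDF plus logistic regression choice are ours for illustration; this is not the exact pipeline of [21].

```python
# Minimal sketch of per-account content classification with scikit-learn.
# Toy data and the TF-IDF + logistic-regression choice are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

accounts = {
    "u1": ["miracle cure, click this link now", "they are hiding the truth"],
    "u2": ["great game last night", "coffee with friends this morning"],
    "u3": ["share before it gets deleted!!!", "secret plot revealed"],
    "u4": ["new paper accepted", "slides for tomorrow's lecture are up"],
}
labels = {"u1": 1, "u2": 0, "u3": 1, "u4": 0}  # 1 = misinformation spreader

# Traditional setting: merge all posts of an account into a single document.
docs = [" ".join(posts) for posts in accounts.values()]
y = [labels[u] for u in accounts]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(docs, y)

test = "the truth about this secret cure is being hidden, share now"
print(clf.predict([test]), clf.predict_proba([test]))
```

Network information can then be brought in by appending graph-based features to the same vectors, or by treating accounts as attributed vertices as in [14; 16].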
Figure 3: A toy example of camouflaged misinformation
spreaders, where a normal user’s posts (A, B and C) are
copied to camouflage a misinformation post (D).
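The toy example of Figure 3 can be replayed numerically. The keyword list and scoring rule below are invented solely for this sketch; they only illustrate how pooling posts A, B and C with the misinformation post D dilutes a signal that a per-post view would catch.

```python
# Illustrative dilution effect when posts are pooled per account (cf. Figure 3).
# The suspicious-word list and scoring rule are made up for this sketch.
SUSPICIOUS_WORDS = {"cure", "exposed", "share"}

posts = {
    "A": "lovely weather at the lake today",
    "B": "congrats to the local team on the win",
    "C": "trying a new pasta recipe tonight",
    "D": "miracle cure exposed, please share this now",
}

def score(text):
    """Fraction of tokens that hit the suspicious-word list."""
    tokens = text.lower().replace(",", " ").split()
    return sum(tok in SUSPICIOUS_WORDS for tok in tokens) / len(tokens)

per_post = {pid: round(score(t), 2) for pid, t in posts.items()}
pooled = round(score(" ".join(posts.values())), 2)

print("per-post scores:", per_post)      # D stands out clearly
print("pooled account score:", pooled)   # camouflaged by A, B and C
print("max per-post score:", max(per_post.values()))
```

Looking at the maximum or the distribution of per-post scores, instead of only the pooled account-level document, is one simple way to remain sensitive to a single camouflaged post.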
Besides relying on ground truth, unsupervised methods [13] and evaluation methods without the underlying ground truth can also be used for social media data. More details can be found in a related review [57].

13 http://infolab.tamu.edu/data/
14 http://service.account.weibo.com/
15 https://help.twitter.com/en/rules-and-policies/twitter-rules
16 https://www.snopes.com/
17 https://www.truthorfiction.com/
18 https://www.politifact.com/

• How to predict the potential influence of misinformation in social media? As an instance of classification, existing misinformation detection methods focus on optimizing classification accuracy. In real-world applications, however, detecting an influential spreader may be more useful than detecting ten unimportant ones that can hardly spread misinformation to regular users. It will be interesting to define the influence of misinformation spreaders and formulate a computational problem to cope with it.
• How are misinformation spreaders spreading misinformation and attracting attention? Existing research mostly focuses on the spreaders, i.e., the accounts that post misinformation in social media. In the real world, a spreader would do more than that to "spread" misinformation, such as commenting under certain topics, making friends with similar communities, and even privately messaging interested accounts. In addition to detecting spreaders, it would be interesting to discover and understand such spreading behaviors, which may ultimately facilitate building a robust detection system.

• How to make detection methods robust to adversarial attacks, or how to exploit adversarial learning to enhance a detection method? Adversarial machine learning aims to enable machine learning methods to be robust and effective in the presence of adversarial attacks. Current research focuses on adversarial attacks of misinformation spreaders; however, if there is a malicious adversary that has partial or full knowledge of the misinformation detection algorithm, existing methods can be vulnerable. It will be interesting to discover robust methods in the presence of adversarial attacks; a simple stress test along these lines is sketched after this list.
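One inexpensive way to probe the robustness question raised in the last item is to stress-test a content classifier on synthetically camouflaged inputs. The sketch below uses toy data and a simple append-benign-text attack of our own choosing (not a method from the cited literature); the same camouflaged examples can then be added back to the training set as a crude form of adversarial training.

```python
# Sketch: stress-test a toy detector with a copy-benign-content attack,
# then retrain with the camouflaged examples (crude adversarial training).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

spreader_docs = ["miracle cure exposed share now", "secret plot they hide from you"]
benign_docs = ["coffee with friends this morning", "great game last night", "new recipe tonight"]

X = spreader_docs + benign_docs
y = [1] * len(spreader_docs) + [0] * len(benign_docs)
clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(X, y)

# Attack: pad each spreader document with copied benign text.
camouflaged = [d + " " + " ".join(benign_docs) for d in spreader_docs]
print("clean   :", clf.predict_proba(spreader_docs)[:, 1])
print("attacked:", clf.predict_proba(camouflaged)[:, 1])  # scores typically drop

# Crude hardening: retrain with the camouflaged variants labeled as spreaders.
clf_robust = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf_robust.fit(X + camouflaged, y + [1] * len(camouflaged))
print("hardened:", clf_robust.predict_proba(camouflaged)[:, 1])
```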
7. ACKNOWLEDGEMENTS
This material is based upon work supported by, or in part by, the National Science Foundation (NSF) grant 1614576, and the Office of Naval Research (ONR) grant N00014-16-1-2257. Some content has been presented as a tutorial in SBP'16 and ICDM'17. We would like to acknowledge the insightful feedback from the audience.

8. REFERENCES
[1] L. Akoglu, H. Tong, and D. Koutra. Graph based anomaly detection and description: a survey. Data Mining and Knowledge Discovery, 29(3):626–688, 2015.
[2] M. Alfifi, P. Kaghazgaran, J. Caverlee, and F. Morstatter. Measuring the impact of ISIS social media strategy, 2018.
[3] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida. Detecting spammers on twitter. In Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference (CEAS), volume 6, page 12, 2010.
[4] D. M. Blei. Probabilistic topic models. Communications of the ACM, 55(4):77–84, 2012.
[5] J. Bollen, H. Mao, and A. Pepe. Determining the public mood state by analysis of microblogging posts. In ALIFE, pages 667–668, 2010.
[6] C. Castillo, M. Mendoza, and B. Poblete. Information credibility on twitter. In Proceedings of the 20th International Conference on World Wide Web, pages 675–684. ACM, 2011.
[7] C. Castillo, M. Mendoza, and B. Poblete. Predicting information credibility in time-sensitive social media. Internet Research, 23(5):560–588, 2013.
[8] Z. Chu, S. Gianvecchio, H. Wang, and S. Jajodia. Who is tweeting on twitter: human, bot, or cyborg? In Proceedings of the 26th Annual Computer Security Applications Conference, pages 21–30. ACM, 2010.
[9] A. Friggeri, L. A. Adamic, D. Eckles, and J. Cheng. Rumor cascades. In ICWSM, 2014.
[10] J. Gao, F. Liang, W. Fan, C. Wang, Y. Sun, and J. Han. On community outliers and their efficient detection in information networks. In SIGKDD, pages 813–822. ACM, 2010.
[11] G. B. Guacho, S. Abdali, N. Shah, and E. E. Papalexakis. Semi-supervised content-based detection of misinformation via tensor embeddings. In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pages 322–325. IEEE, 2018.
[12] A. Gupta and P. Kumaraguru. Credibility ranking of tweets during high impact events. In Proceedings of the 1st Workshop on Privacy and Security in Online Social Media, page 2. ACM, 2012.
[13] S. Hosseinimotlagh and E. E. Papalexakis. Unsupervised content-based identification of fake news articles with tensor decomposition ensembles. 2018.
[14] X. Hu, J. Tang, Y. Zhang, and H. Liu. Social spammer detection in microblogging. In IJCAI, volume 13, pages 2633–2639, 2013.
[15] Z. Jin, J. Cao, H. Guo, Y. Zhang, Y. Wang, and J. Luo. Detection and analysis of 2016 US presidential election related rumors on twitter. In International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, pages 14–24. Springer, 2017.
[16] P. Kaghazgaran, J. Caverlee, and A. Squicciarini. Combating crowdsourced review manipulators: A neighborhood-based approach. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pages 306–314. ACM, 2018.
[17] J. Kim, D. Kim, and A. Oh. Homogeneity-based transmissive process to model true and false news in social networks. arXiv preprint arXiv:1811.09702, 2018.
[18] E. M. Knorr and R. T. Ng. A unified notion of outliers: Properties and computation. In KDD, pages 219–222, 1997.
[19] S. Kwon and M. Cha. Modeling bursty temporal pattern of rumors. In ICWSM, 2014.
[20] K. Lee, J. Caverlee, and S. Webb. Uncovering social spammers: social honeypots + machine learning. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 435–442. ACM, 2010.
[21] K. Lee, B. D. Eoff, and J. Caverlee. Seven months with the devils: A long-term study of content polluters on twitter. In ICWSM. Citeseer, 2011.
[22] S. Lee and J. Kim. Early filtering of ephemeral malicious accounts on twitter. Computer Communications, 54:48–57, 2014.
[23] C. Li, W. Jiang, and X. Zou. Botnet: Survey and case study. In Innovative Computing, Information and Control (ICICIC), 2009 Fourth International Conference on, pages 1184–1187. IEEE, 2009.
[24] H. Li, Z. Chen, A. Mukherjee, B. Liu, and J. Shao. Analyzing and detecting opinion spam on a large-scale dataset via temporal and spatial patterns. In Ninth International AAAI Conference on Web and Social Media, 2015.
[25] E.-P. Lim, V.-A. Nguyen, N. Jindal, B. Liu, and H. W. Lauw. Detecting product review spammers using rating behaviors. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pages 939–948. ACM, 2010.
[26] M. McCord and M. Chuah. Spam detection on twitter using traditional classifiers. In International Conference on Autonomic and Trusted Computing, pages 175–186. Springer, 2011.
[27] M. McPherson, L. Smith-Lovin, and J. M. Cook. Birds of a feather: Homophily in social networks. Annual Review of Sociology, pages 415–444, 2001.
[28] J. M. Ponte and W. B. Croft. A language modeling approach to information retrieval. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 275–281. ACM, 1998.
[29] V. Qazvinian, E. Rosengren, D. R. Radev, and Q. Mei. Rumor has it: Identifying misinformation in microblogs. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1589–1599. Association for Computational Linguistics, 2011.
[30] J. Ratkiewicz, M. Conover, M. Meiss, B. Gonçalves, S. Patil, A. Flammini, and F. Menczer. Truthy: mapping the spread of astroturf in microblog streams. In Proceedings of the 20th International Conference Companion on World Wide Web, pages 249–252. ACM, 2011.
[31] J. Ratkiewicz, M. Conover, M. Meiss, B. Gonçalves, A. Flammini, and F. Menczer. Detecting and tracking political abuse in social media. In ICWSM, 2011.
[32] S. Rayana and L. Akoglu. Collective opinion spam detection: Bridging review networks and metadata. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 985–994. ACM, 2015.
[33] X. Rong. word2vec parameter learning explained. arXiv preprint arXiv:1411.2738, 2014.
[34] N. Ruchansky, S. Seo, and Y. Liu. CSI: A hybrid deep model for fake news detection. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pages 797–806. ACM, 2017.
[35] G. Salton and M. J. McGill. Introduction to Modern Information Retrieval. 1986.
[36] J. Sampson, F. Morstatter, L. Wu, and H. Liu. Leveraging the implicit structure within social media for emergent rumor detection. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management, pages 2377–2382. ACM, 2016.
[37] S. Sedhai and A. Sun. HSpam14: A collection of 14 million tweets for hashtag-oriented spam research. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 223–232. ACM, 2015.
[38] K. Sharma, F. Qian, H. Jiang, N. Ruchansky, M. Zhang, and Y. Liu. Combating fake news: A survey on identification and mitigation techniques. arXiv preprint arXiv:1901.06437, 2019.
[39] K. Shu, A. Sliva, S. Wang, J. Tang, and H. Liu. Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1):22–36, 2017.
[40] K. Starbird, J. Maddock, M. Orand, P. Achterman, and R. M. Mason. Rumors, false flags, and digital vigilantes: Misinformation on twitter after the 2013 Boston marathon bombing. iConference 2014 Proceedings, 2014.
[41] K. Thomas, C. Grier, and V. Paxson. Adapting social spam infrastructure for political censorship. In Proceedings of the 5th USENIX Conference on Large-Scale Exploits and Emergent Threats, pages 13–13. USENIX Association, 2012.
[42] N. Vo and K. Lee. The rise of guardians: Fact-checking URL recommendation to combat fake news. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pages 275–284. ACM, 2018.
[43] G. Wang, S. Xie, B. Liu, and P. S. Yu. Review graph based online store review spammer detection. In Data Mining (ICDM), 2011 IEEE 11th International Conference on, pages 1242–1247. IEEE, 2011.
[44] G. Wang, S. Xie, B. Liu, and P. S. Yu. Identify online store review spammers via social review graph. ACM Transactions on Intelligent Systems and Technology (TIST), 3(4):61, 2012.
[45] S. Webb, J. Caverlee, and C. Pu. Social honeypots: Making friends with a spammer near you. In Conference on Email and Anti-Spam, 2008.
[46] J. Weng, E.-P. Lim, J. Jiang, and Q. He. TwitterRank: finding topic-sensitive influential twitterers. In Proceedings of the 3rd ACM Conference on WSDM, pages 261–270. ACM, 2010.
[47] L. Wu, X. Hu, F. Morstatter, and H. Liu. Adaptive spammer detection with sparse group modeling. In ICWSM, pages 319–326, 2017.
[48] L. Wu, X. Hu, F. Morstatter, and H. Liu. Detecting camouflaged content polluters. In ICWSM, pages 696–699, 2017.
[49] L. Wu, J. Li, X. Hu, and H. Liu. Gleaning wisdom from the past: Early detection of emerging rumors in social media. In Proceedings of the 2017 SIAM International Conference on Data Mining, pages 99–107. SIAM, 2017.
[50] L. Wu and H. Liu. Tracing fake-news footprints: Characterizing social media messages by how they propagate. In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pages 637–645. ACM, 2018.
[51] L. Wu, F. Morstatter, X. Hu, and H. Liu. Mining misinformation in social media. Big Data in Complex and Social Networks, pages 123–152, 2016.
[52] S. Wu, Q. Liu, Y. Liu, L. Wang, and T. Tan. Information credibility evaluation on social media. In AAAI, pages 4403–4404, 2016.
[53] Y. Xie, F. Yu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov. Spamming botnets: signatures and characteristics. ACM SIGCOMM Computer Communication Review, 38(4):171–182, 2008.
[54] F. Yang, Y. Liu, X. Yu, and M. Yang. Automatic detection of rumor on Sina Weibo. In Proceedings of the ACM SIGKDD Workshop on Mining Data Semantics, page 13. ACM, 2012.
[55] J. Ye and L. Akoglu. Discovering opinion spammer groups by network footprints. In Machine Learning and Knowledge Discovery in Databases, pages 267–282. Springer, 2015.
[56] S. Yu, M. Li, and F. Liu. Rumor identification with maximum entropy in micronet. Complexity, 2017, 2017.
[57] R. Zafarani and H. Liu. Evaluation without ground truth in social media research. Commun. ACM, 58(6):54–60, 2015.
[58] Q. Zhang, S. Zhang, J. Dong, J. Xiong, and X. Cheng. Automatic detection of rumor on social network. In Natural Language Processing and Chinese Computing, pages 113–122. Springer, 2015.
[59] Z. Zhao, P. Resnick, and Q. Mei. Enquiring minds: Early detection of rumors in social media from enquiry posts. In Proceedings of the 24th International Conference on World Wide Web, pages 1395–1405. International World Wide Web Conferences Steering Committee, 2015.
[60] Y. Zhu, X. Wang, E. Zhong, N. N. Liu, H. Li, and Q. Yang. Discovering spammers in social networks. In AAAI, 2012.