A Taxonomy of Botnet Behavior, Detection and Defense
Sheharbano Khattak, Naurin Rasheed Ramay, Kamran Riaz Khan, Affan A. Syed, and Syed Ali Khayam

Abstract—A number of detection and defense mechanisms have emerged in the last decade to tackle the botnet
phenomenon. It is important to organize this knowledge to better understand the botnet problem and its solution space.
In this paper, we structure existing botnet literature into three comprehensive taxonomies of botnet behavioral features,
detection and defenses. This elevated view highlights opportunities for network defense by revealing shortcomings in
existing approaches. We introduce the notion of a dimension to denote different criteria which can be used to classify
botnet detection techniques. We demonstrate that classification by dimensions is particularly useful for evaluating botnet
detection mechanisms through various metrics of interest. We also show how botnet behavioral features from the first
taxonomy affect the accuracy of the detection approaches in the second taxonomy. This information can be used to
devise integrated detection strategies by combining complementary approaches. To provide real-world context, we
liberally augment our discussions with relevant examples from security research and products.

Index Terms—bot, botnet, botmaster, C&C, DNS flux, IP flux, spambot, stepping-stone, cyberwarfare, DDoS, spam,
cyberfraud, fast flux service network, bot family, complex event processing

• S. Khattak, N. R. Ramay, K. R. Khan and A. A. Syed are with SysNet, National University of Computer and Emerging Sciences, Islamabad, Pakistan. E-mail: {sheharbano.k,naurin.ramay,krkhan,affan.syed}@sysnet.org.pk
• S. A. Khayam is with PLUMgrid, Inc., 440 North Wolfe Rd., Sunnyvale CA 94085. E-mail: akhayam@plumgrid.com

1 INTRODUCTION

Botnets pose an alarming–and arguably the most potent–threat to the security of Internet-connected users and systems. As a response to this persistent yet rapidly-evolving threat, hundreds of scientific reports have been published on botnet architectures, economics, detection and defense. Despite continuously growing interest in this domain, the vast body of literature on botnets remains largely unstructured. While some surveys and taxonomies of botnet behavior, detection and defense have been proposed [1], [2], [3], [4], [5], [6], these efforts only address a subset of the entire botnet phenomenon.

This paper presents, to the best of our knowledge, the first systematic analysis of the botnet threat from three aspects: botnet behaviors/architectures, detection mechanisms, and defense strategies. This approach provides a panoramic view of the problem as well as the solution space, highlighting both pitfalls and opportunities for a robust defensive solution. Hence we contend that our taxonomies will aid in visualizing the diversity in botnet research, and in making informed decisions when devising new detection and defense mechanisms.

Our first taxonomy classifies botnet features based on their behaviors. Different phases in the life cycle of a botnet, such as host infection, rallying and command and control (C&C) communication, provide a high-level behavioral view. This behavioral landscape is complicated by evasion techniques and topological choices of botnet creators. We provide an extensive overview of all of these behavioral aspects, and posit that most of the past, present and future botnets can be entirely described and categorized with the help of this taxonomy.

The second taxonomy classifies different approaches for botnet detection. To cater to the interests of readers with diverse goals, we
highlight different bases or ‘dimensions’ which can be used to classify botnet detection approaches. Consequently, this taxonomy reveals previously-unexplored botnet detection dimensions which can/should be pursued in future research.

Our third taxonomy classifies botnet defense strategies. We identify shortcomings and strengths of individual defense mechanisms to highlight areas where further research is needed.

The proposed taxonomies reveal an inherent connection between botnet behavioral features and detection approaches. We show that the selection of botnet behavioral features (from our first taxonomy) has a direct impact on the accuracy of the detection approaches (from the second taxonomy). Network security research and products can use this information to evaluate the efficacy of different detection approaches for specific threats. Furthermore, complementary detection approaches can be combined to devise an integrated botnet detection and defense solution.

We conclude by discussing some botnet trends that we expect to continue into the future.

2 TAXONOMY OF BOTNET BEHAVIOR

A botnet is a collection of compromised machines (bots) receiving and responding to commands from a server (the C&C server) that serves as a rendezvous mechanism for commands from a human controller (the botmaster) (Fig. 1). To evade detection, the botmaster can optionally employ a number of proxy machines, called stepping-stones, between the C&C server and itself. Machines are infected by means of a malicious executable program referred to as the bot binary. Bots belonging to the same botnet form the bot family. The ultimate goal of a botnet is to carry out malicious activities or attacks on behalf of its controller.

To fully understand the botnet phenomenon, it is important to systematically explain different features related to botnet behavior¹. We explain the typical botnet life cycle with reference to our proposed taxonomy of botnet features (Fig. 2). From the point of view of a bot, the infection starts with execution of the bot binary on the victim machine. The bot binary is transported to the victim machine using a Propagation mechanism. The next step is to contact the C&C server and announce its presence. This is called the call-home mechanism or Rallying. Rallying marks the establishment of a C&C channel through which the bot receives updates and commands. Based on how C&C communication takes place, the botnet forms an overlay Topology. The newly recruited bot then waits for commands to serve the actual Purpose of the botnet and optionally spreads the infection to other hosts using Propagation mechanisms. An important consideration through all the botnet operations is Evasion. Different mechanisms are employed to ensure that the bot binary, C&C communication, C&C server(s) and botmaster may not be trivially detected.

1. For the rest of this document, the term botnet features refers to botnet behavioral features.

In this taxonomy, we broadly categorize botnet behavioral features as those concerning Propagation, Rallying, C&C, Purpose and Evasion. The rest of this section further classifies these high-level botnet features.

2.1 Propagation

One of the primary goals of a botnet is to continuously increase its footprint in terms of number of bots. Most bot binaries have in-built mechanisms to facilitate their propagation to other hosts. Depending on the degree of required human intervention, propagation mechanisms can be broadly classified as active and passive.

2.1.1 Active

In this mode of propagation, the botnet is capable of locating and infecting other hosts without any (human) user intervention.

A predominant active propagation mechanism is Scanning. A scanning bot probes other hosts in the network looking for one or more vulnerabilities to exploit. The vulnerability exploit helps the botnet in gaining administrative privilege to the victim machine, which is typically followed by installation of the bot binary, and eventually C&C communication ensues. Some botnets borrow their propagation tactics
from worms. They make copies of themselves and propagate automatically, aiming to infect as many hosts as possible. The worm may not necessarily include the main bot binary; however, it prepares ground for future bot binary installation. Both Storm [7] and Sinit [8] exhibited worm-like behavior for propagation. However, Sinit’s use of random scanning for peer-discovery, instead of a well-defined bootstrap process, resulted in poor overall network connectivity. Sality [9] was found to conduct a horizontal, covert and coordinated scan of the entire IPv4 address space during a 12-day span using ≈ 3 million distinct IP addresses.

Fig. 1. Structure of a typical botnet.

2.1.2 Passive

Passive propagation requires some level of user intervention. Next we describe the three most widely used passive propagation mechanisms.

Drive-by Download: Some websites have been either compromised, or specially crafted for automated installation of bot malware on machines of visitors. These websites contain cleverly crafted ‘active content’ (such as JavaScript or ActiveX controls) which automatically initiates download of the malware to the visitor’s machine.

Infected Media: Botnets can also find new bots to recruit by sharing of infected media (e.g., USB hard drives). This is a powerful method of propagation as it can potentially spread the infection to private networks not connected directly to the Internet. Stuxnet [10] is a highly targeted botnet which allegedly hurt Iran’s nuclear program by causing sensitive equipment to malfunction. It initially infected Iran’s uranium enrichment infrastructure through this mode of propagation.

Social Engineering: An underestimated but powerful method of bot recruitment is through social engineering. Social engineering encompasses all methods that entice the user to willingly download the bot binary. Some botnets exploit the culture of trust prevalent in social networks by posting catchy messages from users’ (hijacked) accounts. For example, Koobface [11] tricked users into clicking on a link that pointed to a fake YouTube website. The user was then asked to download a specific executable file to watch the video, which was actually malware that turned the machine into a bot. Another popular medium for social engineering is emails with interesting subjects and content, enticing users to download email attachments. Storm [7] sent spam emails with catchy subjects that contained malicious links to install the bot binary on victim machines.

2.2 Rallying Mechanism

Rallying is the process used by bots to discover their C&C servers. This marks the formal registration of a newly infected bot with the botnet. Some commonly used rallying methods are described next.

2.2.1 IP address

In this method, the IP address or some means to get the IP address of the C&C server is provided along with the bot binary. IP addresses can be hardcoded or dynamically assigned.

Static Hardcoded IP Addresses: The IP address of a C&C server can either be provided as part of the bot binary (binary hardcoding) or separately (seeding).

In binary hardcoding, the IP address of the C&C server is hardcoded into the bot binary.
Botnets were tempted to gravitate towards binary hardcoding because it eliminates the use of DNS from the picture, making their activities stealthier. However, this is a rather primitive method of rallying. An obvious pitfall is that reverse engineering the bot binary may reveal the C&C server, potentially leading to C&C server hijack. Another disadvantage of this approach is that a network administrator can easily blacklist C&C IPs at a network gateway using an ACL, thereby severing call-back channels of all the bots.

Fig. 2. Taxonomy of Botnet Behavior.

Seeding is primarily used by p2p botnets. At the time of infection, the bot is provided with an initial list of peers. The list reflects a group of active peers in the botnet and is regularly
updated. The peer list is separate from the bot binary and can be hidden anywhere on the infected machine with an elusive name. For example, Kelihos/Hlux, a p2p botnet, stored its peer list in the Windows registry under HKEY_CURRENT_USER/Software/Google together with other configuration details [12]. Reverse engineering the bot binary does not necessarily reveal the peer list.

Some botnets use methods that combine seeding with binary hardcoding. Nugache [13] is a good representative of such a botnet. Initial seeding is done either by pre-seeding the victim machine’s Windows Registry with a peer-list before actually running the malware, or by obtaining the list from a small set of default hosts hardcoded into the bot binary [14]. The former case falls under seeding while the latter is more typical of binary hardcoding.

2.2.2 Domain name

In this case, the bot is provided with domain names of potential C&C servers. In some cases, the domain name itself does not belong to the C&C server; rather it acts as a stepping-stone or a link to facilitate communication with the actual C&C server. A botnet may utilize additional services such as DDNS and rogue DNS servers to maximize its lifetime and make its C&C structure more resilient.

Hardcoded: Like IP addresses, domain names belonging to C&C servers can be hardcoded into the bot binary. This is a better approach than IP address hardcoding from the botnet point-of-view. If the IP address associated with the domain name is taken down, the botmaster can still carry out its malicious activities by mapping the domain name to a new IP address, with no update required on the bot end.

Generated: Botnets can dynamically generate domain names using an algorithm (Domain Generation Algorithm) known to the bot and the botmaster. This makes the job of law enforcement agencies difficult. Taking down a domain is a complicated process involving several formalities. By the time the older domain is taken down, the botnet has typically already moved to a new domain. This is called bot-herding.

2.3 C&C

Without C&C communication, a botnet is just an incoherent, random collection of infected machines. In other words, C&C communication forms the backbone of a botnet. Ideally, C&C communication should utilize a mechanism that involves minimum latency combined with simplicity, availability and stealth. C&C communication can either leverage existing communication protocols or use custom-made protocols for this purpose.

2.3.1 Existing Protocol

For a botnet, there are several incentives to use an existing protocol for carrying out its C&C communication. Existing protocols have been tried and tested and are less likely to have bugs compared to custom protocols. Also, using existing protocols enables C&C communication to mix with regular traffic, making detection difficult.

When botnets were in their infancy, IRC was the C&C medium of choice. IRC is widely deployed across the Internet and several public IRC networks are in existence. It has a simple text-based command syntax and provides almost real-time communication between bots and the C&C server. IRC remained dominant in botnet C&C for some time but is slowly being replaced by other protocols. The use of IRC is not common, particularly in enterprise networks. Also, the message format of the standard implementation of IRC is unique, making IRC traffic easily distinguishable from normal traffic. Agobot, Spybot, and Sdbot are some popular IRC-based botnets [15].

After the relative success of law enforcement agencies and industry in tackling the issue of IRC botnets, the next step in botnet evolution was HTTP C&C communication. In HTTP-based botnets, bots contact the C&C server periodically to fetch commands. Blocking of HTTP traffic is not a viable option for most organizations and corporate networks. Besides, HTTP is the most common protocol used on
the Internet, making it ideal for C&C communication. Use of HTTP as the C&C protocol results in a centralized botnet structure. In the context of larger botnets, some strategy must be adopted to keep the C&C server from being overwhelmed if all the bots happen to contact it simultaneously.

Peer-to-peer (p2p) networks, originally developed to facilitate file sharing among peer nodes, have been utilized for botnet C&C communication. Commands can be dispersed using any node in the p2p network, making detection of C&C servers very difficult. In addition, p2p traffic classification is a daunting task, which makes it hard for gateway security devices to detect and filter p2p traffic. Several protocols are available for p2p-based C&C communication, such as WASTE, BitTorrent, Kademlia, Direct Connect, Gnutella, and Overnet. Slapper [16] and Sinit [8] are the forerunners of the current breed of p2p botnets, followed by Phatbot [17], Storm [14] and Nugache [13].

Researchers have shown how Skype (proprietary protocol) [18] and VOIP (e.g. Session Initiation Protocol) [19] may be used in future for C&C communication.

2.3.2 Neoteric Protocol

Botnets can use proprietary application-level protocols for C&C communication. This helps to evade detection approaches that rely on traffic classification, as C&C traffic is not discernible. However, this very property can raise suspicion, as the C&C traffic stands out from regular traffic, which can be easily recognized. Researchers [20], [21] have used automated protocol reverse engineering to understand custom C&C protocols.

Botnet C&C can also use existing applications for C&C in a way in which they were not intended to be used. The recent Web 2.0 explosion has resulted in a plethora of services focused on user generated content. In particular, social networks have generated an enormous web following. Because of their huge size and dynamic nature, it is impractical to monitor or inspect all the user generated content. A botmaster can use any fake profile on social networks such as Facebook or Twitter as its C&C server, where commands are published as ‘feed’ or ‘status’. The longevity of the botnet depends on its ability to generate fake profiles and convey this information to bots. Researchers have analyzed the feasibility and dynamics of social networks as a C&C medium [22], [23].

Whitewell [24] used Facebook accounts as stepping-stones in establishing its C&C. The bot agent first accessed a Facebook account to retrieve configuration information including URLs pointing to C&C servers. Actual commands were received from the addresses pointed to by these URLs. Torpig used Twitter search trends in its domain generation algorithm while the actual C&C communication took place over HTTP [25]. Sninfs is another botnet that made use of Twitter to distribute C&C instructions to download information-stealing malware on bots [26].

Botnets appearing in the wild that purely rely on social networks for C&C have been shrugged off by security researchers as proof-of-concept efforts. Social networks are deemed infeasible for C&C communication because they present a central point of control which can be taken down easily [26], in contrast to HTTP-based botnets where taking down C&C server(s) is an arduous task involving third-party cooperation and possibly legal intervention. The task is further facilitated by the cooperative attitude of owners of these social networks who have a common interest in protecting their users.

2.4 Purpose

The main motive of a botmaster in recruiting and managing hundreds and thousands of bots is to use their combined power to carry out malicious activities on its behalf. We call this the purpose of the botnet. In contrast to other malware, motives behind botnet operation are of a much graver nature. The botmaster is able to derive large financial gains while remaining clandestine. The odds of being detected are also very low as the botmaster uses machines belonging to unwitting, innocent users for carrying out malicious activities. In addition to financial incentives, botnets are also driven by
other goals such as intellectual property stealing, spying and cyberwarfare. We now discuss some of the prime purposes which are being served by contemporary botnets. Note that a botnet, or any subset of bots in it, can serve many different purposes at different times as long as the requisite C&C commands are understood by the bots.

2.4.1 Information Gathering

Information may be gathered for financial gains or reconnaissance purposes. Bots are equipped with a variety of tactics to steal sensitive data and credentials, such as credit card numbers and bank account numbers, from infected machines. Botnets are also employed to gather information related to a rival party (a nation or a company), for the purpose of reconnaissance.

Aurora was a specialized botnet that came into the limelight following Google China’s public disclosure of being victimized by it. It was responsible for stealing intellectual property from several organizations located in different countries [27]. GhostNet is another example of a botnet used for cyber espionage. It infiltrated high-value political, economic and media locations in several countries, including embassies, foreign ministries and other government offices [28].

Both Aurora and GhostNet are representatives of an emerging class of cyber threats called Advanced Persistent Threats (APT). APT is a category of cybercrime aimed at political and business assets. The attackers make use of all the latest developments in intrusion technologies. Overlooking immediate financial gains, they focus on a specific target. The attack is stealthy and continues over a long stretch of time, and may continue even after the key goals of the attacker have been achieved. The attacker usually has a greater objective than financial gain, is well funded, knowledgeable and well organized [27].

Information gathering can also be financially motivated. During the ten days when Torpig was hijacked, a goldmine of data was recovered including online bank accounts and credit and debit cards [25].

2.4.2 Distributed Computing

Desktop computers typically utilize only a small percentage of the available computing power. Botnets utilize the maximum potential of the bots by using their storage and processing power to host and share files, perform distributed password cracking, or any other computational activity of distributed nature [29].

2.4.3 Cyberfraud

Cyberfraud refers to online activities related to deliberate deception for unfair gains. Botnets are used for carrying out cyberfraud. Web-phishing entices unsuspecting visitors into performing actions that they would not commit if informed about their consequences. Serving such content on authentic webservers runs the risk of being terminated by hosting authorities. Some botnets use bots to host pages of the botmaster’s choice by installing a stripped-down version of a webserver on the infected machine.

Torpig, an information stealing botnet, used web-phishing to harvest sensitive information from infected machines. When a user infected by Torpig visits a target website as specified in the bot configuration file, the original webpage to be displayed is replaced with a fake, but identical, page. The information entered by the user, such as passwords and credit card credentials, is conveyed to a drop-zone for the botmaster’s benefit [25].

Botnets can be used to rig the results of online games and polls by ordering bots to act on the botmaster’s behalf. Botnets also commit click fraud by directing bots to click on pay-per-click advertisements displayed on websites to yield biased click statistics. More recently, botnets have been found to manipulate search engine rankings for key search terms to attract traffic to fraudulent websites [30].

2.4.4 Spreading Malware

Some botnets are known to be used for launching other malware. The malware may be part of the botmaster’s scheme or the botnet’s services may have been rented for this purpose. ZeuS and Pushdo got installed on victim machines by piggybacking on other malware [31], [32].
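
To make the click fraud mentioned in Section 2.4.3 concrete from a defender's point of view, the following minimal sketch flags publishers whose click-through rate lies far above the median, which is one simple way that biased click statistics might surface. The data layout, the publisher names, the function name flag_suspicious_publishers and the threshold ratio are all illustrative assumptions and do not come from the surveyed literature; practical click-fraud detection relies on considerably richer signals.

```python
from statistics import median


def flag_suspicious_publishers(stats, ratio=3.0):
    """Return publishers whose click-through rate (CTR) exceeds ratio x median CTR.

    stats maps a publisher name to an (impressions, clicks) tuple.
    Both this layout and the default ratio are hypothetical choices.
    """
    # CTR per publisher; publishers without impressions are skipped.
    ctr = {name: clicks / impressions
           for name, (impressions, clicks) in stats.items()
           if impressions > 0}
    if not ctr:
        return []
    baseline = median(ctr.values())
    if baseline <= 0:
        return []
    return sorted(name for name, rate in ctr.items() if rate > ratio * baseline)


# Hypothetical figures: (ad impressions, ad clicks) per publisher.
example_stats = {
    "site-a": (10000, 100),
    "site-b": (8000, 96),
    "site-c": (5000, 45),
    "site-d": (4000, 600),
}
suspicious = flag_suspicious_publishers(example_stats)
```

The median serves as the baseline in this sketch because a single heavily inflated publisher would distort a mean; with the made-up figures above, only site-d exceeds three times the median rate.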
2.4.5 Cyberwarfare

Cyberwarfare involves measures taken by a state to disrupt or damage another state’s assets by penetrating its computers and networks. The last couple of years have seen botnets being increasingly used for cyberwarfare. Cyberspace is regarded as an important area for countries to gain a strategic edge over each other. The concept gained momentum when a massive Distributed Denial of Service (DDoS) attack was launched against Estonian websites in 2007, allegedly by Russia [33]. Stuxnet, discovered in July 2010, took nation-sponsored cyberwarfare to the next level. It was a highly targeted and sophisticated piece of malware, which revolutionized the cyberwarfare landscape. Stuxnet affected Iran’s nuclear sites by manipulating the Programmable Logic Controller (PLC) used to control uranium enriching centrifuges [10].

2.4.6 Unsolicited Marketing

Online marketing has proved to be more effective than traditional marketing methods, being instant yet cheap. However, this has been abused by some marketers by subjecting users to unsolicited advertisements in the form of spam emails, pop-up ads, etc. Several steps have been taken in the last few years to discourage this trend. Blacklisting was used effectively against mail servers responsible for sending spam. However, spammers have found an alternative in the form of botnets. Botmasters supply email templates to bots, which send spam emails based on these templates. Spreading the task of spam sending over hundreds of bots has made detection very difficult as each bot is responsible for sending only a small number of spam emails.

Rustock, Pushdo, Bagle, Bobax and MegaD are examples of botnets whose names have become synonymous with spam. Researchers tried to estimate the spam economy by studying a number of recovered C&C servers. The largest email list was estimated to be worth 10,000-20,000 dollars and the profit of the botnet’s operators for offering spam services was valued at 1.7-4.2 million dollars [34].

2.4.7 Network Service Disruption

The combined power of the bots, often running into thousands, can be used to bring down legitimate Internet services. Botnets were originally created in the context of IRC networks. A popular attack against IRC networks was the clone attack in which all the bots try to connect to an IRC network using clones. Clones are IRC clients controlled by programs/scripts. The resultant traffic overwhelms the IRC network by exhausting its resources. As a result, the network is brought down.

Botnets are also used to carry out DDoS attacks. Thousands of bots send requests to the victim service over a short period of time, causing the service to crash and consequently become unavailable to legitimate clients. The ability to launch DDoS attacks and make websites and critical services unavailable is also used for cyber-extortion. Large businesses and enterprises are willing to pay extortion money to botmasters rather than losing sales and credibility. Network service disruption can potentially transform into cyberwarfare if the attack is motivated by a state trying to disrupt another state’s cyber infrastructure, as was evident in the case of the cyber-attack against Estonia [33].

In a proof-of-concept experiment, researchers showed how DDoS can be trivially launched against a target by creating a specially crafted Facebook app. Using the app, users unwittingly generated traffic towards a victim [35].

2.5 Evasion

Botnets operate stealthily to evade detection and increase their probability and duration of survival. We can view the evasion strategies adopted by a botnet from the perspective of the bot, botmaster, C&C server and C&C communication.

2.5.1 Evasion tactics at Bots

For bots to remain available to the botmaster for an extended period of time, a number of mechanisms are employed to evade host-based detection. We discuss some of these mechanisms now.
Binary Obfuscation: Bot family expansion occurs by exploitation of vulnerabilities on machines that are subsequently infected by the bot-binary. The bot-binary incorporates mechanisms to coordinate with the botmaster to receive commands. To avoid being detected by host-based security applications, several evasion techniques are employed to conceal the bot-binary. Pattern-based detection approaches are defeated by the use of polymorphism. Polymorphism refers to the ability of the bot-binary to exist in several forms. One of the ways to achieve this is by using encryption. The same effect can also be achieved by packing the bot-binary. Packing refers to file condensation. In the context of malware, packing helps obfuscate the malicious code. Some packers are able to produce new binaries every time the original malicious executable is packed.

While code polymorphism succeeds in concealing the bot-binary from pattern-based security applications, it can still be detected by memory-based detection approaches. When executed, the bot-binary needs to be decrypted or unpacked, resulting in the same code. This problem is taken care of by code metamorphism. It allows for the bot-binary to be rewritten into different, but semantically equivalent, code to defeat memory-based detection approaches.

Anti-Analysis: Researchers analyze botnet behavior by running the bot binary on virtual machines or sandboxes such as [36]. Another method for botnet analysis is to use honeypots that emulate known software and network vulnerabilities to be infected by botnet(s). Honeypots are designed to be self-contained and prevent the spread of the botnet beyond the honeypot. To evade such analysis, some bot-binaries perform checks to determine the environment in which they are being executed. If the binary detects a virtual machine or sandbox, it can either refuse to run or modify its functionality to evade analysis. After an initial surge of VM-aware botnets, such as Conficker, Rbot, SDbot/Reptile, Mechbot, SpyBot and AgoBot, the trend for such evasion technique is going

flags the binary as suspicious. Secondly, virtual machines are now popular among ordinary users who are a legitimate (and ever growing) target, rather than being restricted for security analysis.

Security Suppression: After successfully infecting a machine, a botnet may proceed and disable existing security software on the victim machine. If the host is already infected with other competing malware, those are also wiped out. For example, Conficker [38] disables several security related Windows services and registry keys upon installation. It includes a domain name blacklist which it uses to block access to certain security related websites and a process blacklist to terminate processes that may aid in its detection [25].

Rootkit Technology: A rootkit is a program that maintains a persistent and undetectable presence on the infected machine by subverting normal operating system behavior. Botnets may install rootkits on compromised machines to gain privileged access to them. This enables them to carry out malicious activities while bypassing typical authentication and authorization mechanisms. As a result, traditional anti-virus software fails to detect intrusion.

2.5.2 Evasion tactics at C&C Servers

A botnet differs from other malware in the ability of the botmaster to remotely control and coordinate all the infected machines via C&C server(s). This essentially means that the ‘brain’ of the botnet lies in the C&C server. The same can turn into its Achilles’ heel. Therefore, botnets invest considerable resources to conceal the C&C server.

IP Flux: Botmasters utilize IP Flux to frequently change the IP address associated with the domain name of their C&C server. This helps in evading IP based blacklisting and blocking. This phenomenon is also known as Fast Flux and the resultant structure is called Fast Flux Service Network. IP Flux has been facilitated by Dynamic DNS (DDNS) service. Like DNS, DDNS performs domain name-to-IP mapping. In contrast with DNS which handles only static
down due to two reasons [37]. First, legiti- IP addresses, DDNS can operate with dynamic
mate programs rarely perform tests for the IP addresses too. Botnets make use of DDNS
execution environment, and thus unnecessarily to keep the (C&C server) domain name to
10

Fig. 3. Fast Flux Service Network (FFSN) (based on [39])

IP address mapping up to date in real-time. few seconds. Botnets use a domain name for a
Fast flux comes in two flavors; single flux and short period of time before disposing it off and
double flux. associating a new domain name with the same
Considering that the bots are aware of the set of proxy bots.
domain name associated with the C&C server,
Double flux takes IP Flux to the next level
there are two basic steps involved in commu-
by extending the concept of flux to the name-
nicating with it:
server responsible for resolving the C&C do-
i) Resolve the domain name to an IP ad- main name. A request for the C&C domain
dress. This information will be provided name will be resolved by a nameserver under
by the nameserver responsible for the the botnet control (step (i)). The IP address
requested domain. corresponding to the nameserver changes fre-
ii) Send request to the resolved IP address. quently. The nameserver’s response will in-
Single flux targets step (ii) in the method clude IP address associated with the requested
explained above. Bots do not communicate di- domain name. The resolved IP address also
rectly with the C&C server. There is an inter- changes frequently and corresponds to the
mediate layer of machines that act as proxies proxy bots, which relay messages back and
and relay communication between bots and forth between bots and the C&C server (step
C&C server. These machines themselves are (ii)).
compromised by the botnet, and hence we call
them ‘proxy bots’. The resolved IP addresses Fig. 3 provides an example of IP flux. The
correspond to the proxy bots. A different subset bot knows the C&C server domain name,
of the entire pool of proxy bot IP addresses is i.e., www.ffsn.com. The bot obtains informa-
associated with the domain name after short tion about the nameserver for www.ffsn.com
intervals of time. The botmaster uses its collec- from the Top-Level Domain (TLD) .com name-
tion of proxy bot IP addresses in a round robin server. Querying the authoritative nameserver
fashion, so the same IP addresses may reappear ns.ffsn.com results in www.ffsn.com being
in the domain’s A record 2 after some time. mapped to the IP address x.x.x.x. The IP ad-
To ensure that the frequent change in domain’s dress x.x.x.x corresponds to one of the proxy
record is seen by all bots, the TTL in the name bots. The relevant proxy bot then acts as a
server’s reply is set to be very short, usually a mediator between the bot and C&C server.
In double flux, the nameserver ns.ffsn.com is
2. DNS A (Address) Record performs domain name to IP also under botmaster’s influence in addition to
address mapping. proxy bots.
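The single-flux behavior described above (very short TTLs combined with a rotating set of A-record addresses) suggests a simple passive check. The following is a minimal sketch, not a method taken from the surveyed literature: the `DnsAnswer` log format, the function name `flag_fast_flux`, and both thresholds are assumptions chosen for illustration. Legitimate content delivery networks can show similar patterns, so a real detector would need additional features.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsAnswer:
    """One observed A-record answer (hypothetical log format)."""
    domain: str
    ip: str
    ttl: int  # seconds


def flag_fast_flux(answers, max_ttl=300, min_distinct_ips=5):
    """Return domains whose answers resemble single flux: every observed
    TTL is short and the A record rotates over many distinct IPs.
    Both thresholds are illustrative assumptions, not published values."""
    ips = defaultdict(set)
    longest_ttl = defaultdict(int)
    for a in answers:
        ips[a.domain].add(a.ip)
        longest_ttl[a.domain] = max(longest_ttl[a.domain], a.ttl)
    return sorted(
        d for d in ips
        if len(ips[d]) >= min_distinct_ips and longest_ttl[d] <= max_ttl
    )


# Toy observations: one flux-like domain and one stable domain.
observations = (
    [DnsAnswer("www.ffsn.com", f"198.51.100.{i}", 60) for i in range(1, 9)]
    + [DnsAnswer("example.org", "192.0.2.10", 3600)] * 8
)
print(flag_fast_flux(observations))  # ['www.ffsn.com']
```

Grouping by domain rather than by IP address keeps the check independent of which proxy bot happens to answer at a given moment, which is the property IP-based blacklisting lacks.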

Domain Flux: Domain flux associates multiple domain names with the same IP address. It helps evade URL-based filtering and detection techniques. Domain flux can be achieved either by leveraging the existing DNS service to provide domain wildcarding or by using a domain name generation algorithm. Domain wildcarding allows a higher level domain prefixed with any random string to be associated with the same IP address. For example, consider that nust.edu.pk maps to 10.1.1.1. Through domain wildcarding, *.nust.edu.pk will also be mapped to 10.1.1.1 where * can be any prefix. Alternately, bots can use an algorithm for domain name generation. This algorithm periodically generates a list of domain names. The list is usually large and not all the generated domain names are active at a given time. The bots identify the domain name currently being used as C&C server by trying to contact all these domain names one by one. The domain name that positively resolves indicates successful C&C server identification. Contacting several random domain names results in a large number of DNS Non-Existent Domain (NXDomain) responses, a possible symptom of botnet infection [40]. The botmaster knows which domain names will be contacted by the bots at a given time. This is possible because the same algorithm is known to the botmaster too. He/she pre-registers the domain names which are expected to be contacted by the bots. Note that the botmaster can associate the same IP address with each domain name registered.

Conficker attracted worldwide attention by affecting millions of computers and reportedly infected key infrastructures and government and military offices. It uses a domain generation algorithm that generates 50,000 domain names per day, of which 500 randomly selected domains are tried when contacting the C&C server [41].

Rogue DNS Server: The attitude towards cyber crime differs greatly among different countries. In particular, local authorities in some countries are indifferent to requests for takedown of malicious servers. Botnets turn this to advantage by having their own DNS service (possibly distributed) hosted in such locations. The addresses of these DNS servers are provided to bots which use them to resolve the address of C&C servers. Rogue DNS servers not only provide a mechanism for carrying out C&C stealthily and effectively, but can also redirect a website’s traffic to another bogus website (pharming) or a malicious website focused on stealing sensitive user information (phishing).

Anonymization: Anonymization hides the identity of the sender of a message such that a message cannot be traced back to its sender. Moreover, it cannot be confirmed if two messages were sent by the same sender. While this frustrates network surveillance and traffic analysis attempts, it is a boon for botnets which can carry out C&C communication using low latency anonymization networks, such as Tor [42], to conceal C&C server(s). Bots communicate with the C&C server without knowing its actual location. C&C servers can be hosted as Tor hidden services [43]. Hidden services are a facility provided by Tor [44] to make it possible to offer services, such as web publishing or instant messaging servers, while hiding their location.

2.5.3 Hiding C&C Communication

Detection of C&C communication can have various implications. On one hand, detection of C&C can enumerate bots which can be subsequently disinfected. On the other hand, it can expose C&C servers. The latter can have more serious consequences; for example, a C&C server can be hijacked, crippling the entire botnet. At worst, detection of C&C between botmaster and C&C server can reveal the main culprit, the botmaster. In addition to this, discovery of C&C communication can help defenders understand how the botnet operates and exploit this information to damage the botnet. For these reasons, botnets employ several mechanisms to obfuscate C&C communication.

Encryption: Botnets encrypt C&C communication to evade detection. Encryption causes content based analysis to fail, forcing researchers to rely on other traffic characteristics, such as packet arrival times and packet length. Moving from simple obfuscation techniques to elaborate encryption schemes, botnets have made C&C communication virtually
impenetrable. Nugache, touted to be one of the most advanced botnets [45], uses a sophisticated scheme for C&C encryption. A variable bit length RSA key exchange is followed by seeding symmetric Rijndael-256 session keys for each peer connection. Keystroke log files are also encrypted using Rijndael with the help of a key derived from some peer-specific information [13].

Protocol Manipulation: Some botnets use protocol tunneling to disguise C&C communication. Typically, firewalls allow HTTP traffic. Leveraging this configuration, botnets have started to use HTTP tunnels for C&C communication. Another emerging trend is the use of IPv6 tunneling for C&C communication [46]. Most modern operating systems support IPv6 by default. However, many intermediate devices do not recognize IPv6 traffic. IPv6 tunneling allows transportation of IPv6 packets over incompatible intermediate devices that only support IPv4. Many firewalls and IDS either do not support IPv6 or are misconfigured, limiting their ability to detect or filter IPv6 traffic. This can be exploited by botnets to carry out C&C communication while bypassing security measures.

Traffic Manipulation: A very active C&C communication may tip off security applications about the botnet. Botnets might purposely create low volume C&C traffic spread over relatively large periods of time to defeat statistical and volume based detection techniques.

Novel Communication Techniques: Botnets use novel communication techniques for C&C which cannot be trivially detected. Social networking websites, in particular Facebook and Twitter, are known to be used by botnets for C&C. Brazen [22], an information-stealing botnet, used Twitter to disseminate links that contained commands or executables to download. Bots subscribed to the malicious Twitter account using RSS to get status updates. Additionally, botnets may utilize any of the existing techniques for information hiding for C&C communication. Some possible candidates are the metadata in image files and least significant bit encoding in image files.

2.5.4 Evasion by the Botmaster

The botmaster is the most protected component of a botnet. Botmaster detection can lead to losing control of the botnet. Moreover, the botmaster can potentially face legal consequences in the form of a prison sentence and/or hefty fines. Therefore, botmasters utilize elaborate mechanisms to evade detection.

Stepping-Stones: Botmasters generally hide their true identity by setting up a number of intermediate hosts, called stepping-stones, between the C&C server and themselves. These stepping-stones can be network redirection services, for example proxies such as HTTP or SOCKS, and SSH servers. The stepping-stones themselves are hosts compromised by the botmaster. Botnets prefer to set up stepping-stones in countries with lax cyber crime legislation. This complicates things for law enforcement agencies by necessitating the cooperation of organizations in other countries. Also, the traceback mechanisms are severely crippled because network redirection services operate at the application level and discard all lower layer information before relaying messages to the next hop.

The botmaster can use an anonymization network as a stepping-stone. This offers the additional benefit of obscuring the botmaster’s IP address, making it very hard to trace back the botmaster. Defenders can leverage the inherent drawback of anonymity networks, i.e., traffic monitoring cannot be defeated at boundaries of the anonymity network. Detection measures deployed at Internet edges, before a botmaster’s traffic enters the anonymity network, can yield promising results. However, it is extremely challenging because a botmaster generates very little traffic and this traffic is not easily distinguishable from legitimate traffic.

2.6 Topology

Another aspect in the taxonomy of botnet features is the topology of botnets based on how bots communicate with the C&C server(s). This topology is essentially an overlay network and is agnostic to the underlying physical topology. Our classification draws inspiration from a report by Damballa [5] on botnet communication
topologies.

2.6.1 Centralized

In a centralized topology, all the bots report to and receive commands from a single C&C server. Thus it is an easy implementation with minimum C&C overhead. However, it is also a single point-of-failure for the entire botnet. To deal with this problem, techniques such as IP flux and domain flux have been adopted by botnets over time. Botnets using HTTP and IRC as means of C&C communication are typical examples of centralized botnet topology.

Star: The basic star topology was adopted by many initial IRC-based botnets. It is a simple model in which the bots directly communicate with the botmaster. An obvious advantage to this is increase in the speed of communication between bots and the botmaster. However, this topology also suffers from the problem of central point of failure: if the central C&C server is taken down, the entire botnet gets disbanded.

Hierarchical: Driven by the desire to conceal the botmaster, botnets incorporate one or more layers of proxies between the bots and themselves. The proxies themselves are compromised machines serving the botmaster. This renders a hierarchical quality to the resultant topology, hence the name. There is little probability that analyzing activities of a bot will expose the C&C server. Even if one of the immediate bot proxies is taken down, the botnet will still be functional. The hierarchical nature also allows for portions of the botnet to be separately rented out to third parties. Because of the multiple layers of indirection, communication between the bots and the botmaster is bound to experience some latency.

2.6.2 Decentralized

In decentralized topology, no single entity is responsible for providing command and control to bots. Bot management is either distributed among multiple C&C servers or there is no obvious master-slave relationship between bots and C&C server.

Distributed: In a distributed topology, multiple servers each control a subset of the bot family. The servers are able to communicate directly with each other. Typically, the servers are spread over different geographical locations. This allows for fast communication by allocating similarly located bots to the nearest C&C server. This also entails benefits of load distribution, availability and resilience. If one server is taken down, its bots can be distributed among the remaining servers. Besides, the issue of central point of failure has also been tackled as it is unlikely that all the countries hosting the C&C servers will simultaneously and positively entertain legal takedown requests. The resulting trade-off, for the botmaster, is increased complexity of the botnet implementation.

Random: In a random topology, there is no clear master-slave relationship. Any bot can be used to issue commands to other bots in the botnet.

In peer-to-peer (p2p) networks, communication between bots and the botmaster forms unpredictable routes. The botmaster can use any peer node to float commands which will be broadcast to all the bots. The absence of centralized C&C makes it extremely difficult to locate the botmaster or hijack the botnet. In hierarchical topology, shutdown of a proxy server may result in a portion of the botnet becoming dysfunctional. In contrast, in p2p topology, ‘cleaning’ of a single bot will have no effect as alternate routes are always available. A disadvantage to the p2p based random topology is that the C&C communication will experience unpredictable delays, making it unsuitable for carrying out coordinated, large-scale attacks. Furthermore, capture of a single bot reveals several other bots because each bot maintains a peer-list.

Cooke et al. [47] proposed a topology that has not been observed in real world botnets but gives insight into possible future botnet trends. In this topology, each bot knows about only one other bot. The botmaster can use any bot to issue a command which is encrypted and passed on to the next bot discovered through random scanning of the Internet. This demonstrates an extremely resilient model with high survivability.
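The resilience claims above can be compared on toy overlay graphs. The sketch below is only an illustration under stated assumptions: the six-bot networks, the node names (`cc`, `p0`, `p1`, `b0` to `b5`), the helper names `build` and `surviving_bots`, and the particular p2p link pattern are all hypothetical, and the distributed topology is not modeled. It counts how many bots can still receive commands after a single node is taken down.

```python
from collections import deque


def build(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def surviving_bots(graph, source, removed):
    """Breadth-first search from the node that issues commands, skipping
    taken-down nodes. Returns the bots (names starting with 'b') reached."""
    if source in removed:
        return []
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(n for n in seen if n.startswith("b"))


bots = [f"b{i}" for i in range(6)]

# Star: every bot talks to one C&C server.
star = build(("cc", b) for b in bots)

# Hierarchical: two proxies, each serving three bots.
hierarchical = build(
    [("cc", "p0"), ("cc", "p1")]
    + [("p0", b) for b in bots[:3]]
    + [("p1", b) for b in bots[3:]]
)

# P2P: each bot keeps a small peer list (ring neighbours plus one chord).
p2p = build(
    [(bots[i], bots[(i + 1) % 6]) for i in range(6)]
    + [(bots[i], bots[(i + 3) % 6]) for i in range(6)]
)

print(len(surviving_bots(star, "cc", {"cc"})))          # 0: C&C server taken down
print(len(surviving_bots(hierarchical, "cc", {"p0"})))  # 3: one proxy taken down
print(len(surviving_bots(p2p, "b0", {"b1"})))           # 5: one peer cleaned
```

In this toy setting the star network loses every bot with its server, the hierarchical network loses the portion behind the removed proxy, and the p2p network keeps all remaining bots reachable, which matches the qualitative ordering given in the text.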

2.6.3 Hybrid

An interesting topology for botnets would be a combination of centralized and decentralized topologies, for example, a botnet using centralized structure between C&C server and the front-end proxy bots but p2p as C&C for the bots under control of individual proxy bots. At the time of writing of this paper, we are not aware of any existing botnet that utilizes a hybrid topology. Further research is needed to investigate the pros and cons of such a structure.

2.7 Summary: Taxonomy of botnet behavior

As a first step to understand the botnet phenomenon, it is important to systematically explain botnet behavior. In this section, we presented a taxonomy of botnet behavior. Our high-level categorization of botnet behavior comprises Propagation, Rallying, C&C, Purpose, Evasion and Topology.

Propagation mechanisms refer to the methods used by botnets to infect other machines. Based on the degree of (human) user intervention required, we classify propagation mechanisms as Active and Passive. In Active mode of propagation, the botnet is capable of locating and infecting other machines without requiring assistance from a (human) user. Scanning is the only representative of this type of propagation. In Passive propagation, infection cannot spread to other machines without user assistance. Subclasses of passive propagation are malware distribution through drive-by download, infected media and social engineering.

Rallying is the process used by bots to discover their C&C server(s). Bots can locate C&C server(s) by IP address or domain name. Both IP address and domain name of the C&C server can either be hardcoded into the bot binary or generated using some algorithm.

Once the C&C server has been located, command-and-control (C&C) communication is used to issue instructions to the bots. These instructions could relate to update/modification of botnet malware, spreading the infection to other machines or other malicious activities. C&C communication can take place either through existing protocols or custom/neoteric protocols.

Purpose refers to the main motive of the botmaster in recruiting the bots. The most notable motives include information gathering, distributed computing, cyberfraud, spreading malware, cyber warfare, unsolicited marketing and network service disruption.

A primary consideration in the operation of a botnet is stealth or evasion. Botnets employ a number of mechanisms to evade detection and thus increase their probability and duration of survival. These mechanisms can be broadly categorized on the basis of the component of botnet infrastructure they are trying to obfuscate, i.e., bots, C&C server(s), C&C communication or botmaster. Evasion tactics at bots strive to defeat host-based detection. These involve binary obfuscation of the bot binary and rootkits. Additionally, botnets may also include security suppression and anti-analysis techniques to evade detection by security software and research tools (such as virtual machines or honeypots) respectively. Because the C&C servers are the control center of the botnet infrastructure, botnets set up a number of evasion tactics at C&C servers to enable them to function uninterrupted for the maximum possible duration. These tactics involve frequently changing the IP address (IP flux) or the domain name (domain flux) associated with the C&C server, anonymization of the C&C server and/or the use of DNS servers under the control of the botmaster (rogue DNS server). Another important aspect of evasion is hiding C&C communication. This can be achieved through encryption, protocol manipulation, traffic manipulation and/or use of novel communication techniques. Finally, evasion tactics by the botmaster typically entail the use of network redirection services or anonymization (stepping-stones).

Another important aspect in the taxonomy of botnet behavior is the topology of botnets based on how bots communicate with the C&C server(s). The high-level classification of botnet topology comprises centralized, decentralized and hybrid topologies. In a centralized topology, all the bots report to and receive commands from a single C&C server. Star and hierarchical topologies are derivatives of the centralized botnet topology. In decentralized topology, no
single entity is responsible for providing command and control to bots. Distributed and random topologies derive from the decentralized botnet topology. Finally, the hybrid topology represents a combination of centralized and decentralized topologies.

3 TAXONOMY OF BOTNET DETECTION MECHANISMS

Several mechanisms for botnet detection have emerged over time. To the best of our knowledge, none of these techniques unveil all the botnet components at once. Existing detection approaches try to identify a part of the big jigsaw puzzle, i.e., the botnet. One part can lead to the other, and different parts can be placed together to reveal a greater portion of the puzzle; however, there is no panacea. Based on which component is being targeted, we classify botnet detection into different facets: bot detection, C&C detection and botmaster detection. We have intentionally excluded C&C server detection from the aforementioned facets because detection of C&C communication typically reveals C&C servers too.

Our definition of botnet, a distributed malware with bots, C&C server(s), a botmaster, and the C&C communication between these components, is in harmony with the interpretation in existing literature. However, most previous literature refers to detection of single, bot-infected machines as bot detection, and detection of bot families as botnet detection. In view of the typical interpretation of the term ‘botnet’, this definition of botnet detection is paradoxical. We posit that what is commonly understood as botnet detection is still bot detection, with particular regard to bot families. Gu et al. [48] proposed a detection approach, BotMiner, that could detect bot families or sub-families within the monitored network. To elucidate the scope of their detection framework, they described a botnet as a coordinated group of malware infected machines with similar communication and activity patterns. This means that if the botmaster uses the bot family to carry out n non-overlapping malicious activities, BotMiner would detect n instances of the same bot family, hence the term bot sub-family detection. Another work [49] used the term bot family to refer to botnet infected machines with similar communication patterns. To provide a generalized frame of reference, we define the following terms:

Definition 1. Botnet Detection: Detection of all components of a botnet, comprising the botmaster, C&C server(s), means of C&C, and (all or a subset of) bots.

Definition 2. Bot Detection: Detection of botnet infected machines, with or without regard to bot families.

Definition 3. Bot Family Detection: A class of Bot Detection focused on bot family detection.

In this taxonomy, we broadly categorize botnet detection mechanisms as Bot Detection, C&C Detection and Botmaster Detection. These facets of botnet detection can be used in combination. For example, C&C Detection can be followed by Bot Detection and vice versa. Botnet detection approaches can be broadly categorized as Active and Passive (we later discuss other alternatives in Section 3.4). In active detection, the strategy is to take part in the botnet operation by impersonating a component of the botnet instead of passively observing its activities. For example, active C&C detection involves online manipulation of network flows to deduce information about possible C&C communication. In contrast, passive detection approaches detect botnets by silently observing and analyzing botnet activities without making a conscious effort to participate in the proceedings.

Our high-level classification of botnet detection mechanisms into ‘active’ and ‘passive’ approaches is of practical importance for any researcher or network administrator. Active detection techniques raise several legal and ethical questions, and whether they can be performed varies from organization to organization. For this reason, network policies typically prohibit active manipulation of network flows. Another issue that necessitates use of passive detection techniques is users’ privacy concerns. Moreover, active detection methods can jeopardize the security of the very hosts they seek to protect. If active detection is discovered by the botmaster, there is a chance
that he/she will direct attack traffic to the responsible host(s) as counter defense.

The rest of this section discusses further classification of these high level botnet detection facets, i.e., Bot Detection, C&C Detection and Botmaster Detection.

3.1 Bot Detection

Bot detection can be performed with or without regard to bot families. Users and network administrators are usually indifferent to information about bot families. Their primary concern is to protect their systems and networks from infections, regardless of details about bot family. On the other hand, security researchers are particularly interested in identifying bot families. The degree of prevalence of different botnets, their geographical distribution, and the common characteristics of botnets are some of the plausible reasons for their heightened interest. Detection of bots indicates vulnerability of a host or network to botnet infection. This can be followed by remedial strategies aimed at recovering from the infection and preventive measures to avoid getting infected in future.

3.1.1 Active Detection

Active bot detection involves participating in the botnet operation. This typically involves impersonating a component of the botnet. Active detection approaches involve Infiltration and C&C Server Hijack.

Infiltration: In infiltration, a defender-controlled machine masquerades as an actual bot and probes the C&C server, or other peers in case of a p2p-based botnet, to gain details about other bots. Nappa et al. [18] proposed a replay attack on a Skype-based botnet. The technique can be effective for other p2p botnets too. The defender progressively gains information about other bots by repeatedly issuing crafted, bogus messages to declare itself as a new bot and subsequently obtain a new peer-list.

C&C Server Hijack: Bots can be actively detected by C&C server hijack. Bots report to and receive commands from the C&C server. Taking control of the C&C server will reveal all the bots that contact it. This can be achieved by exploiting the botnet rallying mechanism. A defender can use this information to his/her advantage to hijack the server. This approach also leverages knowledge of botnet topology. Centralized botnet structures are more amenable to C&C server hijack. In decentralized botnets, the C&C server can be any peer and will, at most, reveal information about bots in its peer list. To gain further information, some other techniques need to be employed, such as active crawling of the p2p botnet. The seizure of C&C servers can be Physical or Virtual.

In a Physical hijack, law enforcement agencies physically seize the C&C servers. However, it is possible to take over the C&C servers without involving legal authorities by mutual cooperation. This is possible if the C&C servers in question are not in geographically diverse locations. With the help of service providers, researchers gained access to several C&C servers used by the Pushdo/Cutwail botnet [34]. In addition to other interesting information, 24 databases containing details about the bots and spam operations were disclosed.

In a Virtual takeover, defenders hijack the C&C servers by redirecting C&C communication to a machine under their control. This technique has been used by researchers to hijack botnets that use domain names for rallying. By virtue of DNS sinkholing, traffic sent by bots to known botnet domains can be forwarded to a defender-controlled machine. The domain names of C&C servers can be learnt by analyzing the botnet behavior on infected machines. Researchers [38], [25] have recovered future rendezvous points by reverse engineering the domain generation algorithm used by botnets utilizing domain flux.

3.1.2 Passive Detection

Passive detection approaches detect botnets by silently observing and analyzing their activities without making a conscious effort to participate in the proceedings. Passive techniques can be Syntactic or Semantic.

Syntactic: Syntactic or signature-based approaches identify botnets by comparison with pre-determined patterns of botnet infection obtained from observed samples. Rishi [50] is a purely syntactic approach for detection of IRC

[Figure 4: tree diagram. Botnet Detection branches into Bot Detection, C&C Detection and Botmaster Detection, each split into Active and Passive approaches. Node labels recoverable from the figure: Syntactic, Semantic, Correlation (Horizontal, Vertical), Behavioral Analysis, Host-Based, Network-Based, Infiltration, C&C Server Hijack (Physical, Virtual), Injection, Suppression, Behavior-Based, Statistical Approaches, Marking, Logging, Stepping-Stone Detection.]

Fig. 4. Taxonomy of Botnet Detection Mechanisms.

based bots. It formulated regular expressions as signatures to identify bot-like IRC nicknames. Snort [51] has a rich signature database and is used by BotHunter [52] along with other anomaly detection components to feed alerts to a correlation module.

Signature-based detection degrades if strong evasion mechanisms such as encryption and bot binary obfuscation are in place. New threats for which signatures have not been developed go completely undetected. Hence, this method should be complemented with behavior-based detection approaches.

Semantic: Semantic detection methods use the context of events and protocol information to announce detection of malicious behavior. The process of bot detection entails careful analysis and is based on deviation from established benign behavior or similarity with bot behavior. Semantic detection techniques can be further classified as those based on Correlation and Behavioral Analysis.

Correlation techniques are used to identify bots as well as bot families. A botnet is a coordinated attack infrastructure. This idea has been used to cluster hosts that perform similar activities or communication. Correlation-based techniques have been shown to successfully detect bots utilizing centralized as well as decentralized topology.

Correlation can be performed on primary data as well as secondary data. Primary data encompasses all the activities vital to the sustenance of the botnet, like egg download, propagation activities, and C&C communication. Attacks and malicious activities fall under secondary data. Secondary data analysis is an effect-to-cause approach and relates malicious behavior to bots. It cannot be used as a generic method to detect all kinds of botnets because it is restricted to the attack characteristics that the analyst is trying to target. Researchers have tried to detect and study different bots by correlating similar spam emails collected from records of popular email service providers [53]. Ramachandran et al. [54] detected botnets by monitoring and correlating DNSBL queries which botnets perform as a way of reconnaissance before launching a spam campaign. Researchers have also detected botnets by analyzing and correlating anomalous DDNS [55], DNS traffic [56] and queries to search engines [57].

Complex Event Processing (CEP) is an area that closely relates to correlation. It correlates events in real-time to detect a target complex event comprising multiple simple or complex events. The concept has recently been applied to the area of information security [58], [59], [60]. Given its complex nature, botnet detection seems to be a suitable candidate to be mapped as a CEP problem. However, we are not aware of any research effort that attempts this mapping.

Correlation can be further divided into two main branches; Vertical and Horizontal correla-
18

tion. rived from network traffic and services to de-


Horizontal correlation detects bots by observ- tect bots. It typically entails classification of
ing similarities in host behavior and/or com- traffic into network applications and looking
munication. BotMiner [48] clusters similar com- for bot-like behavior in individual application
munication and malicious activity patterns and traffic. Classification of traffic into network ap-
then performs cross-cluster correlation to list plications is not trivial. Applications using dy-
bots in the monitored network. BotSniffer [61] namic and random ports have rendered port-
uses spatial-temporal similarities between bot based application classification futile. Devel-
families to detect bots. Several methods have oping payload based application signatures is
been proposed that detect botnets by clustering defeated by the use of encrypted traffic and
flows with similar characteristics [62], [63], [64], privacy issues. Lu et al. [68] used payload
[65]. Horizontal correlation techniques suffer signature examination method to classify traf-
from a setback; there must be more than one fic and found that 40% network flows could
bot infected by the same botnet in the moni- not be classified into specific applications. A
tored network for successful detection. number of methods [69], [70], [71], [50] focus
Vertical correlation correlates activities of a on detection of IRC-based botnets and hence
single machine and compares it with a model require some means to divide traffic into IRC
of bot behavior. BotHunter [52] analyzes the se- and non-IRC parts. P2p bots can be sepa-
quence of communication exchanges between rated from p2p file sharing applications on
a host and the Internet. It models the infection the basis of flow characteristics, p2p churn,
as a loosely coupled sequence of five stages: and difference between machine and human
inbound scanning, exploit usage, egg down- behavior [65]. Researchers have detected bots
loading, outbound bot coordination dialog, and comprising FFSN’s by observing DNS resource
outbound attack propagation. Suspicious out- records [72], [73]. Similarly, bots can be de-
bound activity coupled with intrusion detec- tected by monitoring requests to rogue DNS
tion activity indicates a successful bot infection. servers known to support botnet activities. A
BotTracer [66] uses virtual machine techniques vein of research focuses on identifying C&C
to detect botnets based on three phases of communication in network traffic which can
botnet life cycle; automatic startup of bot with be used to reveal bots. More discussion can be
no user intervention, C&C establishment and found under the facet about C&C Detection.
attack.
Behavioral Analysis is another class of seman-
3.2 C&C Detection
tic detection techniques that analyzes botnets
by observing deviations of machine/traffic be- Detection of C&C channel is an important as-
havior from an established normal pattern or pect of botnet detection. Identification of C&C
its similarity with known botnet behavior. Be- and its subsequent analysis can help in under-
havioral Analysis can be Host-based or Network- standing botnet behavior. This information can
based. be leveraged to identify bots and possibly C&C
Host-based detection methods look for signs servers.
of bot-like behavior on a host. BotSwat [67] de-
tects botnet infection by identifying command- 3.2.1 Active Detection
response behavior. It tracks programs that use Active C&C detection involves taking part in
data received from unreliable network sources the botnet operation, for example, online ma-
(tainted data) and looks for possibility of re- nipulation of network flows to deduce infor-
mote bot initiation. It has related a set of sys- mation about possible C&C communication.
tem calls with bot activity, which are called Active C&C detection involves Injection or Sup-
‘gate functions’. Botnet infection is suspected pression.
if tainted data is passed as argument(s) to gate Injection: Injection entails injecting packets
functions. into suspicious network flows. The similarity
Network-based methods use information de- of the reply to the injected packets with typical

bot response indicates that the flow might be part of C&C communication. Injection can be performed either by inferring the botnet C&C protocol, or by blindly replaying incoming packets in the suspicious flow with or without minor changes. The latter route can be taken for botnets using a stateless C&C protocol. It will fail for botnets that are secured against replay attacks of this form by using timestamps or sequence numbers. C&C detection based on protocol inference requires reverse engineering the botnet C&C protocol. This allows initiation of an informed C&C dialog for forensic purposes. C&C detection based on protocol inference can be automated by feeding the detector with information about protocols of known botnets. Using this knowledge, packets can be injected into suspicious flows to compare the response with known botnet response [20]. BotProbe is a tool based on active injection techniques to identify chat-like botnet C&C communication [74]. BotProbe leverages two basic differences between human and bot responses. Unlike humans, bots respond deterministically to the same command and are intolerant to typographic mistakes.

Suppression: In suppression, incoming/outgoing packets in suspicious network flows are suppressed to elicit a known response from either end of the C&C communication. For example, consider a suspected bot that requests the C&C server for some update and the corresponding response is dropped. After some retries, the bot will activate its back-up mechanism. If the events that are fired up as part of the back-up mechanism are already known, botnet infection can be confirmed. SQUEEZE [75] triggers C&C failover strategies (backup C&C servers and use of domain generation algorithm) by progressively blocking connections to C&C servers that a bot tries to contact.

3.2.2 Passive Detection

Active C&C detection is complemented by passive techniques. Active C&C detection receives more scrutiny and criticism because of policy restraints and greater penalty in case of false positives. This explains the prevalence of passive C&C detection methods. Passive detection involves silently observing network traffic, looking for cues of C&C communication. These mechanisms can be broadly categorized as Syntactic and Semantic.

Syntactic: Syntactic C&C detection works by developing signature-based models of C&C traffic. The signatures are obtained by observing frequently occurring strings or token sequences in malicious traffic. Manual signature development is less reliable and time-consuming. Methods for automated signature/model generation have been proposed recently [76], [49], [77].

Semantic: Semantic C&C detection approaches use some heuristic to associate certain behavior with C&C traffic. These can be further divided into Statistical, Correlation and Behavior-based approaches.

Statistical Approaches can be used to detect botnet C&C communication. Machine learning has been extensively used for network traffic classification [78]. However, its efficacy in detecting C&C has not been explored much. Machine learning, particularly supervised learning, has been used for C&C detection. It involves identification of features, such as range of packet lengths, inter-packet arrival times and flow duration. Using these features, a classifier is trained on relevant datasets. Subsequently, it develops rules which are fed to the Machine Learning algorithm for classifying network traffic as benign and C&C. Machine learning algorithms have been used to first classify network traffic into IRC and non-IRC traffic and then identify botnet and non-botnet traffic within the IRC traffic [79]. In another approach, graph-based models are used to represent malicious C&C connections [80]. The graph model for each network connection is based on the system calls that lead to this connection and the system calls that operate on data that is returned. Machine learning techniques are then used to automatically generate graph templates for C&C activity by training the classifier on a set of graphs that are associated with known C&C.

The main idea behind Correlation based methods is that similar communication patterns in network traffic can point to C&C traffic. This implies that correlation must be

performed at a higher network element, such as a router, where network traffic from multiple hosts is visible and can be analyzed for similarity. Strayer et al. [64] used a method to first identify botnet-like traffic and then find C&C traffic by clustering flows with similar characteristics such as bandwidth, packet timing, and burst duration. Gu et al. [48] combined this approach with activity-based correlation of hosts and performed cross-cluster correlation to detect botnets.

In Behavior-based detection methods, C&C traffic is identified by observing its deviation from normal traffic or its similarity with an established behavioral model of C&C traffic. In behavior-based detection methods, the scope of analysis is typically restricted to individual hosts (no comparison is made with behavior of other hosts). Divergence of a flow from typical network usage for a user can qualify it as a possible C&C flow. For example, a connection made to Russia at midnight from the machine of a user who does not use the network after evening is cause for concern. Regardless of which C&C protocol is being followed, C&C communication has some behavioral characteristics that can give it away. Wurzinger et al. [49] presented a system to automatically generate C&C models from botnet samples run in a controlled environment. These models were generated by attributing response behavior, such as sending spam emails and carrying out DoS attacks, to previously-issued commands. The command portion of the model was signature-based. However, response was detected by observing anomalies in behavior. These command-response models were then used to detect C&C and subsequently bots.

3.3 Botmaster Detection

Not many botnet detection techniques target the botmaster. Botmaster detection can have serious ramifications for a botnet. On the one hand, it can lead to legal prosecution and hefty fines for the botmaster. On the other hand, it can lead to disbandment of the entire botnet based on information provided by the botmaster. For the reasons cited above, the botmaster is the most protected part of a botnet and hence difficult to detect. The botmaster issues only a few commands to the C&C server to be relayed to the bots, thus generating little traffic that might also be encrypted. Detection of the botmaster is further complicated by the presence of stepping-stones between the botmaster and the C&C server.

3.3.1 Active

Active botmaster detection involves manipulation of botnet activity. There are very few mechanisms that actively detect the botmaster. Active botmaster detection techniques revolve around Marking.

Marking techniques have been used extensively to trace back culprits responsible for malicious activities over the Internet. There are various flavors of packet marking such as probabilistic packet marking [81], [82], [83], ICMP traceback [83], [84] and deterministic packet marking [85], [86]. In marking schemes, some information is written into packets by either the victim machine or intermediate routers to help locate the attacker. Ramsbrock et al. [87] presented a mechanism for live traceback of the botmaster by injecting a watermark in response packets to the botmaster from a rogue bot under the defender’s control.

3.3.2 Passive

Passive detection of the botmaster involves analysis of network traffic and other data without manipulating or modifying botnet operation. Passive mechanisms for botmaster detection involve Logging and Stepping-stone Detection.

In logging mechanisms, routers log information about packets passing through them. This information is used to verify whether or not suspected packets were forwarded by specific routers. Logging mechanisms incur heavy computational complexity in addition to scalability issues. Source Path Isolation Engine (SPIE) is a hash-based IP traceback mechanism [88]. SPIE used a deterministic logging mechanism to reconstruct the path to the attacker. Logging mechanisms have not been used for botmaster detection so far, mainly because routers on a packet’s path cannot be dictated to support and maintain additional logs.
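
The logging idea described above can be illustrated with a small sketch. It assumes, purely for illustration, that each router keeps a Bloom filter of packet digests, which is one common way to keep hash-based logs compact. The class name DigestLog, the filter size, the number of hash functions, the router names r1 to r3 and the packet byte strings are all hypothetical and are not taken from [88].

```python
import hashlib


class DigestLog:
    """Bloom filter of packet digests kept by one (hypothetical) router."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, packet: bytes):
        # Derive one bit position per hash function by salting SHA-256
        # with the index of the hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + packet).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def record(self, packet: bytes) -> None:
        """Log a forwarded packet by setting its bit positions."""
        for pos in self._positions(packet):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_have_forwarded(self, packet: bytes) -> bool:
        # False means "definitely not logged"; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(packet)
        )


def routers_on_path(logs: dict[str, DigestLog], packet: bytes) -> list[str]:
    """Return the routers whose logs are consistent with forwarding `packet`."""
    return [name for name, log in logs.items() if log.may_have_forwarded(packet)]


logs = {name: DigestLog() for name in ("r1", "r2", "r3")}
attack = b"src=10.0.0.5 dst=192.0.2.9 payload=CMD"
for name in ("r1", "r3"):  # the suspected packet traverses r1 and r3 only
    logs[name].record(attack)
logs["r2"].record(b"src=10.0.0.7 dst=192.0.2.1 payload=GET /")

path = routers_on_path(logs, attack)
print(path)
```

A filter of this kind never misses a packet it has logged, but it can report a packet it never saw. The sketch also stores nothing unless the router cooperates, which mirrors the deployment obstacle noted above.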

Another way to detect the botmaster is through stepping-stone detection. The botmaster hides its identity behind one or more stepping-stones. While stepping-stone detection does not directly detect the botmaster, in certain cases it can be used recursively to identify the botmaster. There are two main difficulties in the detection of stepping-stone connections. Packets from the botmaster may arrive at the C&C server with random delays between them. The delays can be caused by network factors or the botmaster can intentionally introduce them to evade detection. The botmaster may also add additional packets, called chaff, to further confuse the detection process.

All existing stepping-stone detection methods work on the basis of correlation between connection content, host activity or packet timing. The methods based on host activity correlate user login information from different hosts that are part of the stepping-stone chain. As the stepping-stones are under the botmaster’s control, detection can be easily evaded by manipulating and forging information on these hosts. Content-based detection methods such as thumbprinting [89] detect connections belonging to the same chain by observing similarity in their contents. Because these methods are based on content inspection, they are effective for unencrypted traffic only. The predominance of encryption for obscuring C&C traffic has rendered content-based detection futile.

Another branch of approaches for stepping-stone detection leverages information regarding packet arrival time. Timing and chaff perturbation are great challenges for the effectiveness of these methods. Both timing and chaff perturbation are traffic manipulation techniques. Timing perturbation refers to random delays between packets while chaff refers to insertion of meaningless packets to frustrate analysis. Characteristics of interactive traffic such as packet size and timing have been used to detect encrypted stepping-stone connections [90]. Wang et al. [91] correlated inter-packet timing characteristics of both encrypted and unencrypted connections to detect stepping-stone connections. Donoho et al. [92] used wavelets and multiscale methods to separate short-term behavior of stepping-stone connections (delay and chaff perturbations) from their long-term behavior (which can be correlated). Blum [93] used techniques from computational learning theory and the analysis of random walks to detect and identify encrypted stepping-stone connections with polynomial upper bounds on the number of packets required for the analysis. Zhang et al. [94] proposed techniques to detect encrypted stepping-stone connections. Their method was agnostic to delay and chaff perturbations. Other timing based methods include [95], [96], [97].

The previously discussed techniques passively detect stepping-stones, while very few techniques actively detect stepping-stones [98], [95]. Wang et al. [98] coined the sleepy watermarking method to actively detect unencrypted stepping-stone connections. Sleepy Watermark Tracing (SWT) activates or ‘wakes up’ when an intrusion is detected. In such an event, it injects a watermark into the backward connection of the intrusion and collaborates with intermediate routers to reveal all the hosts in the stepping-stone chain.

3.4 Discussion on the Taxonomy

In the preceding discussion, we broadly classified botnet detection mechanisms as ‘active’ and ‘passive’. This classification was along the dimension Level of Activity. In this section, we identify a number of other dimensions that could be used to classify botnet detection mechanisms. We highlight classification based on the remaining dimensions as new avenues for research. Furthermore, we investigate the effect of different botnet features on the accuracy of botnet detection approaches.

3.4.1 Dimensions of Botnet Detection

Depending on the interest of the reader, each facet of botnet detection can be further explored in the light of any of the applicable botnet detection dimensions as described in Fig. 5. For example, a security researcher interested in estimating the footprint of different botnets would want to explore detection techniques with the dimension Discernment. On the other hand, researchers developing a botnet detection tool to be deployed at a large ISP would be interested

in the dimension Analysis Depth. We call these different classification criteria ‘dimensions’.

Some dimensions may not be applicable to certain facets. For example, Analysis Direction and Discernment are more relevant to Bot Detection. Specificity is applicable to Bot Detection and C&C Detection. The remaining dimensions can be used with any of the three facets of botnet detection; i.e., Bot Detection, C&C Detection and Botmaster Detection.

Fig. 5. Dimensions of Botnet Detection.

Degree of Automation: Depending on the degree of human participation in the detection process, we can classify botnet detection methods as manual, semi-automated and automated.

Manual approaches that require significant human effort to detect botnets fall under this category. Typically, such techniques require manual acquisition and reverse engineering of the bot binary for developing signatures that are fed into custom-made botnet detection software. Considerable human effort is required to reflect even the slightest change in botnet functionality. Many methods [99], [14] for P2P-based botnet detection are manual.

Semi-Automated methods for botnet detection require very little human intervention and most of the detection is performed in automated fashion. Typically, human involvement is required only when something changes in an existing botnet or new kinds of botnet have to be catered for in the detection engine. For example, [50], [52] heavily rely on manually developed signatures for botnet detection. If the botnet C&C style changes, or a new botnet appears, corresponding signatures have to be manually developed and incorporated in the detection framework. This is in contrast with recent work that automatically infers the botnet C&C protocol [20], which may be used for signature development.

An Automated system should require no human intervention after initial development. Ideally, any detection method should be as generic and automated as possible. Methods that rely on behavioral characteristics of botnets such as [64] or automatic generation of detection models [49] have the potential to operate in a fully automated manner.

Analysis Direction: Botnet detection can be carried out in several directions, where direction means the sequence of analysis. Some detection methods start at the bots and perform correlation at an upstream network component; others take the opposite route and observe anomalies at upstream network components and trace the effect back to the cause.

Top-Down approaches observe phenomena at upstream network components such as spam [53], [100], [101] and anomalies in DDNS [55] and DNS traffic [56]. Clustering and correlation techniques are then applied to identify bots belonging to specific botnets.

In Bottom-Up approaches, behavior exhibited by individual bots is analyzed at an upstream network component to identify bot families based on similarity of behavior. BotMiner [48] detects bot families by clustering machines that display similar communication and activity patterns.

Diffuse methods of detection are more relevant to P2P botnets where there is no hierarchy in the botnet components. The detection process can start at any bot and other bots can be discovered by analyzing communication patterns or peer-lists.

Analysis Depth: Some detection methods base their analysis on compact and easily accessible data, thus involving minimum overhead. Other techniques perform more in-depth analysis of data. This factor plays a significant role

in the effectiveness of network-based, real-time detection algorithms.

Deep Packet Inspection (DPI) based detection methods perform fine-grained analysis of data. DPI has been used for signature-based detection of malicious payloads [52] or communication [61], [50] and to compute content-similarity for packet payloads [65]. DPI-based methods involve significant computational and operational complexity, particularly if a large amount of network traffic is involved.

Partial DPI based methods perform DPI only for suspicious data instead of analyzing all data indiscriminately. Zhang et al. [102] proposed a sampling-based approach that identified bot flows which were then forwarded to fine-grained botnet detectors.

Header-Based methods operate on header/flow level data and incur minimum complexity, making them ideal for being deployed over large networks. Karasaridis et al. [70] compared flow records of suspicious hosts with IRC traffic models to identify botnet controller activity.

Specificity: Some detection methods are tailor-made to detect certain kinds of botnets. Other methods perform botnet detection on the basis of general botnet characteristics which do not vary among different instances. Specificity is the measure of a detection method’s dependence on instance-specific features.

Specific detection methods are custom-made to detect certain kinds of botnets. For example, Rishi [50] can only detect IRC-based botnets. These approaches fail to detect botnets using a different structure than the one targeted.

Generic approaches consider general characteristics of botnets instead of targeting specific botnet instances [64], [48]. As a result, the results are more promising and the scope of detection is broader.

Discernment: Discernment is the ability of a detection method to differentiate between different bot families. While a home user is less likely to be concerned about details of the other members of the bot family that infected his/her machine, this information is valuable to researchers who try to estimate the footprint of different botnets.

Discriminate methods for botnet detection not only identify the infected machine, but can also provide information about the bot family. This typically requires an elevated view of the network. Bot families are determined by clustering hosts with similar communication or behavioral characteristics [48]. Host-based detection techniques can also identify the bot family which the bot in question serves. Signature-based detection can possibly name the bot family to which the bot belongs. Use of system calls and library routines associated with known botnets can also reveal which bot family the bot is associated with.

Indiscriminate methods can detect botnet-infected machines but cannot make a distinction between different bot families. A good example of this is BotHunter [52], which can discover infected machines in the monitored network but gives no information about bot families.

Mode of Operation: Mode of operation refers to the suitability of the detection method to operate in a live or offline environment.

Live methods can detect botnets in real-time while the monitored host or network carries out its normal operations. For example, Ramsbrock et al. [87] use watermarking for live detection of the botmaster. The ability of a detection method to operate in live mode is closely related to how much it relies on DPI. A network-based detection method performing full DPI is bound to falter in a live environment with potentially huge and unpredictable traffic.

Offline methods for botnet detection provide promising results when run on log files or network traffic dumps. These methods are particularly helpful for performing network forensics or research purposes. BotHunter [52] can operate in both live and offline settings. In offline mode, it can operate on pcap files as well as Snort [51] logs. Some security companies [103], [104] offer services to investigate botnet-related incidents by analyzing logs and other evidence.

Location of Deployment: Based on location of deployment, botnet detection methods can be classified as host-based and network-based. These are not mutually exclusive and a distributed method can make use of both of them.

Host-Based methods analyze host behavior or data to detect potential botnet infection. Obviously, these methods can only declare botnet infection on individual machines and give no information about other bots belonging to the same bot family. Both BotSwat [67] and BotTracer [66] are examples of botnet detection systems deployable on hosts only.

Network-Based methods analyze network traffic and, depending on the scope of analysis, can be deployed anywhere in the network hierarchy. Possible deployment locations include a proxy server, an ISP and so forth. Karasaridis et al. [70] proposed a method to detect botnets on a large Tier-1 ISP network.

Degree of Activity: By degree of activity, we mean the extent to which the detection method interferes in the botnet operation. Some detection methods quietly observe ongoing botnet activity and base their decision on this information alone. Other detection techniques involve active participation in botnet C&C or infiltration.

In Active detection, the strategy is to take part in the botnet operation by impersonating a component of the botnet instead of passively observing its activities. Active C&C detection involves online manipulation of network flows to deduce information about possible C&C communication.

Passive detection approaches detect botnets by silently observing and analyzing botnet activities without making a conscious effort to participate in the proceedings.

3.4.2 Effect of Botnet Features on Botnet Detection

Different botnet features positively or negatively affect the accuracy of botnet detection mechanisms. This is very important information from the point of view of a security researcher. Armed with knowledge of pros and cons of individual detection mechanisms, complementary approaches can be paired to achieve a synergistic effect. Additionally, informed decisions can be made in the choice of detection mechanisms for designing tailored strategies targeted at detection of specific threats. To this end, we present three tables, Tables 1, 2 and 3, that investigate the effect of botnet features on the three facets, Bot Detection, C&C Detection and Botmaster Detection of botnet detection mechanisms, respectively. We do not aim to explain each row in the tables. Instead, we highlight some interesting observations.

Binary Obfuscation and Encryption degrade the accuracy of detection approaches that rely on reverse engineering of the bot binary or C&C. Syntactic detection approaches leverage known malicious patterns in host or network data. Infiltration and Injection require understanding of the bot binary and/or C&C protocol. For C&C Server Hijack, the IP addresses of C&C servers for subsequent hijack can be retrieved by analyzing the bot binary, in which these may be hardcoded. Another way of discovering such IP addresses is to monitor C&C communication. Binary Obfuscation complicates inference of the bot binary while Encryption renders C&C indecipherable.

The choice of botnet topology makes some detection approaches more effective. C&C Server Hijack is more effective for botnets utilizing a Centralized structure. Taking down the central C&C server(s) incapacitates the entire botnet. Horizontal Correlation associates group activity and communication patterns with botnet behavior. Similarity in communication patterns is particularly evident in the case of Centralized botnets. C&C Server Hijack is not an effective detection approach for Decentralized botnets as any node can be used to float commands. At best, hijacking a supernode can reveal a small portion of the entire botnet.

All bots will ultimately carry out the Purpose for which they were employed by the botmaster. The same, particularly if carried out in an aggressive manner, can hint at botnet infection. Correlation based on host activity is more effective for botnets with an aggressive Purpose. For example, botnets targeted at Unsolicited Marketing and Network Service Disruption are more likely to generate traffic that network traffic analysis can feasibly detect than botnets focused on Information Gathering.

Detection approaches that rely on botnet command-response behavior or observe similarity in communication patterns and flow characteristics (e.g. packet size, inter-packet arrival times and flow duration) degrade if a

TABLE 1
An overview of how different botnet features affect bot detection mechanisms.

Bot Detection Mechanism  Related Botnet Features: Improve          Related Botnet Features: Degrade
-----------------------  --------------------------------          --------------------------------
Infiltration             -                                         Encryption, Binary Obfuscation.
C&C Server Hijack        Centralized Topology, Rogue DNS Server.   Decentralized Topology, Bot Binary Obfuscation,
                                                                   Anonymization, IP Flux, Information Hiding.
Syntactic                -                                         Bot Binary Obfuscation, Information Hiding, Encryption.
Horizontal Correlation   Centralized Topology, Purpose.            Traffic Manipulation, Information Hiding,
                                                                   Decentralized Topology.
Vertical Correlation     Purpose, Propagation, Bot Evasion,        C&C Evasion, Binary Obfuscation.
                         Rogue DNS Server.
Host-based               -                                         Traffic Manipulation.
Network-based            C&C, IP Flux, Rogue DNS Server.           Protocol Manipulation, Traffic Manipulation,
                                                                   Information Hiding.

botnet makes use of Traffic Manipulation. This includes some Host-based detection approaches, Network-based methods, in particular those that rely on traffic classification, and Suppression in case of C&C detection. Use of Marking for botmaster detection experiences loss in accuracy if Traffic Manipulation, especially timing perturbations and traffic padding, is employed for C&C evasion.

Protocol Manipulation affects detection techniques that are based on observation of traffic anomalies or possible C&C communication. Information Hiding techniques make it difficult to detect botnets by analyzing network traffic for the presence of C&C communication. Behavioral Analysis, in particular Network-based methods that rely on traffic classification, degrades considerably because of Protocol Manipulation.

IP addresses of C&C servers can be discovered by monitoring C&C communication. Anonymization and IP Flux make it difficult to trace C&C communication to C&C servers. Logging mechanisms use information stored by routers to trace attack packets back to the source. Logging mechanisms do not help much in the presence of Anonymization, which hides the source of a network flow by bouncing packets through a network of volunteer servers before delivering them to the destination.

The choice of protocol for C&C communication is closely related to the effectiveness of some detection approaches. C&C based on Neoteric and unpopular protocols can be more conveniently detected than C&C that utilizes popular, Existing network protocols such as HTTP. Statistical approaches, particularly Machine Learning, have been used to classify network traffic and subsequently detect C&C by observing anomalies in individual application traffic. Their accuracy degrades if C&C com-
26

TABLE 2
An overview of how different botnet features affect C&C detection mechanisms.

C&C Detection Related Botnet Features


Mechanism
Improve Degrade
Injection Encryption.

Suppression Traffic Manipulation.

Syntactic Encryption.

Correlation Centralized Topology. Traffic Manipulation,


Information Hiding,
Protocol Manipulation.

Behavior-based Rogue DNS. Traffic Manipulation,


Information Hiding.

Statistical Approaches Neoteric Protocol. Existing Protocol,


Traffic Manipulation,
Information Hiding,
Protocol Manipulation,
Encryption.

TABLE 3
An overview of how different botnet features affect botmaster detection mechanisms.

Botmaster Detection Related Botnet Features


Mechanism
Improve Degrade
Marking Traffic Manipulation.

Logging Anonymization.

Stepping-stone Detection Stepping-stone. C&C Evasion.

munication employs popular Existing proto- instead of the host’s default DNS server. In
cols. Additionally, the task of machine learning the context of IP Flux, detection of FFSN by
algorithms is made significantly difficult by themonitoring DNS responses to suspicious do-
use of Traffic Manipulation, Information Hiding, main names offers two benefits. It can help
Protocol Manipulation and Encryption. Use of in detecting the compromised machines being
Existing Protocol for C&C further complicates used as proxies to host malicious botnet ser-
telling benign traffic and C&C apart. The afore- vices. Secondly, DNS queries to web services
mentioned features have a similar effect on hosted by FFSN can indicate that the querying
Stepping-stone detection. machine is also compromised. Moreover, DNS
queries directed to known Rogue DNS Servers
Botnets employ Rogue DNS Servers to serve can reveal bots.
phishing content and to avoid blacklisting ser-
vices. Behavior-based detection techniques rein- Some detection methods such as Vertical Cor-
force confidence about botnet C&C by looking relation correlate events in a temporal fashion.
for suspicious behavior, such as the use of Detection of malware Propagation and unusual
Rogue DNS Server for resolving domain names behavior such as Security Suppression, Anti-
27

analysis, Rootkit installation and use of Rogue tion involves taking part in the botnet oper-
DNS Server hint at botnet infection. Observing ation, for example, manipulation of network
Purpose of a botnet, in particular its attack be- flows to deduce information about possible
havior, adds to confidence of the detection ap- C&C communication. Suppression and Injection
proach. C&C Evasion techniques degrade verti- represent subclasses of active mode of C&C
cal correlation if potential C&C traffic is one of detection. Passive detection involves silently ob-
the inputs to the correlation engine. serving network traffic, looking for cues of
These tables can be best utilized by iden- C&C communication. This may entail com-
tifying complementary detection approaches parison with predetermined C&C signatures
that compensate each other’s shortcomings and (syntactic analysis) or behavioral characteristics
highlight their respective strengths. Develop- of C&C traffic (semantic analysis). The latter
ment of a comprehensive strategy along these can be further broken down into statistical and
lines will help in generic and more effective behavior-based approaches.
botnet detection. Some detection mechanisms target the bot-
master. Active detection involves manipulation
3.5 Summary: Taxonomy of botnet detec- of botnet activity, for example, insertion of
tion mechanisms information into packets either by victim ma-
chine or intermediate routers to help locate the
In the event that an intrusion attempt succeeds, attacker (marking techniques). Passive detection
the very least a robust security plan should involves analysis of network traffic without
address is its detection. With this objective in manipulation or modification of botnet opera-
mind, in this section, we provided an extensive tion. Logging and stepping-stone detection repre-
overview of botnet detection mechanisms. To sent subclasses of passive botmaster detection.
highlight the component of botnet infrastruc-
ture that a detection technique is targeting, We argued that the high-level classification
we classify botnet detection mechanisms as of botnet detection mechanisms, i.e., bot, C&C
those concerning bots, C&C communication and and botmaster detection (referred to as facets
botmaster. of botnet detection) can be further categorized
Bot detection refers to detection of botnet with different dimensions depending on inter-
compromised machines without regard to the est of the reader. In our taxonomy, we clas-
larger superset of botnet population of which sified all the facets of botnet detection with
the compromised host is a member. Based on the dimension level of activity, i.e., active and
the degree of participation of the defender passive detection. We identified a number of
in botnet operation, bot detection strategies other dimensions for classification of botnet de-
can be further classified as active and passive. tection techniques; degree of automation, analysis
Active bot detection requires the defender to direction, analysis depth, specificity, discernment,
impersonate as a component of the botnet. mode of operation and location of deployment. We
Subclasses include botnet infiltration and C&C left detailed discussion as an avenue for future
server hijack. Passive detection approaches detect research.
botnets by silently observing and analyzing We also analyzed how the absence or pres-
their activities without making a conscious ence of different botnet behavioral features af-
effort to participate in the proceedings. This fect the accuracy of botnet detection mecha-
analysis can leverage comparison with pre- nisms. In this regard, we presented three tables
determined patterns of botnet infection derived that investigate the effect of botnet behavioral
from observed samples (syntactic analysis) or features on the three facets (bot detection, C&C
the context of events and protocol information detection and botmaster detection) of botnet
(semantic analysis). The latter can further entail detection mechanisms. This information can be
correlation and behavioral analysis. utilized to devise a comprehensive detection
A number of botnet detection mechanisms strategy that combines complementary detec-
focus on detection of C&C channel. Active detec- tion approaches.
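
As an illustration of how the tables might be used in this way, the sketch below transcribes the Degrade column of Table 2 into a Python dictionary. It then looks for C&C detection mechanisms that a given set of botnet features leaves unaffected, and for pairs of mechanisms that have no degrading feature in common. The function names, and the reading of single-entry rows as Degrade entries, are assumptions made for this example; they are not part of the taxonomy itself.

```python
from itertools import combinations

# Degrade column of Table 2 (C&C detection), transcribed for illustration.
# Assumption: rows with a single entry list features that degrade the mechanism.
CC_DEGRADED_BY = {
    "Injection": {"Encryption"},
    "Suppression": {"Traffic Manipulation"},
    "Syntactic": {"Encryption"},
    "Correlation": {
        "Traffic Manipulation", "Information Hiding", "Protocol Manipulation",
    },
    "Behavior-based": {"Traffic Manipulation", "Information Hiding"},
    "Statistical Approaches": {
        "Existing Protocol", "Traffic Manipulation", "Information Hiding",
        "Protocol Manipulation", "Encryption",
    },
}


def unaffected_mechanisms(botnet_features, degraded_by=CC_DEGRADED_BY):
    """Return mechanisms whose Degrade entries share nothing with the features."""
    features = set(botnet_features)
    return sorted(m for m, bad in degraded_by.items() if not (bad & features))


def complementary_pairs(degraded_by=CC_DEGRADED_BY):
    """Return pairs of mechanisms that no single feature degrades together."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(degraded_by), 2)
        if not (degraded_by[a] & degraded_by[b])
    )


if __name__ == "__main__":
    print(unaffected_mechanisms({"Encryption"}))
    for pair in complementary_pairs():
        print(pair)
```

Under these assumptions, a botnet that uses Encryption leaves Behavior-based, Correlation and Suppression unaffected, and Injection and Suppression form one of the pairs that share no degrading feature. The same structure could hold the Improve column or the entries of Tables 1 and 3.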

[Figure 6: tree diagram. The root, Botnet Defense, branches into Preventive (Technical, Non-Technical) and Remedial (Defensive, Offensive); the lower levels are the classes described in Sections 4.1 and 4.2.]

Fig. 6. Taxonomy of Botnet Defense Mechanisms.
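
For readers who want to work with the hierarchy programmatically, the following is a minimal sketch that writes the classes named in Section 4 as a nested Python dictionary and enumerates the root-to-leaf paths. The variable and function names are hypothetical, and the nesting follows the section text rather than any published data file.

```python
# Nested mapping mirroring the classes named in Section 4; a leaf is an empty dict.
DEFENSE_TAXONOMY = {
    "Preventive": {
        "Technical": {"Host Cleanliness": {}, "Network Cleanliness": {}},
        "Non-Technical": {
            "Attacker Dissuasion": {},
            "Legal Accountability": {},
            "User Education": {},
        },
    },
    "Remedial": {
        "Defensive": {
            "Host-based": {"Disinfection": {}, "Reinstallation": {}},
            "Network-based": {"Block Bots": {}, "Block C&C Communication": {}},
        },
        "Offensive": {
            "Indirect Attack": {},
            "Direct Attack": {
                "Bots": {},
                "C&C Communication": {},
                "C&C Servers": {"Physical": {}, "Virtual": {}},
            },
        },
    },
}


def leaf_paths(tree, prefix=()):
    """Yield every root-to-leaf path as a tuple of class labels."""
    for label, subtree in tree.items():
        path = prefix + (label,)
        if subtree:
            yield from leaf_paths(subtree, path)
        else:
            yield path


if __name__ == "__main__":
    for path in leaf_paths(DEFENSE_TAXONOMY):
        print(" > ".join(path))
```

Each printed line is one class of defense mechanism together with its ancestors, which makes it straightforward to tag a countermeasure with its position in the taxonomy.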

4 TAXONOMY OF BOTNET DEFENSE MECHANISMS

At present, defense against botnets is mostly preventive or defensive. Preventive defense includes proactive measures to avoid botnet infection. Defensive methods are reactive in nature and concern themselves with cleaning systems once they have been infected. There is a pressing need for developing defense strategies that solve the problem at its root. While bot disinfection is important from the user's point of view, it does not hurt the botnet, which finds other machines to infect and serve its purpose. Ideally, a defense method should incapacitate the whole botnet or significantly damage it.

We broadly categorize botnet defense mechanisms as Preventive and Remedial. The rest of this section further classifies these high-level classes of botnet defense mechanisms.

4.1 Preventive

Hosts and networks can adopt preventive measures against botnets to raise the bar for possible botnet infection. These methods are effective before the botnet infection has taken place.

4.1.1 Technical

Technical approaches include botnet defense activities and measures that are related to computers and networks.

Host Cleanliness: Botnets spread by exploiting operating system or application vulnerabilities on the victim machine or by social engineering. Measures for host cleanliness should be adopted to proactively defend against botnet infection. A reasonable patch management system should be in place to install the latest security patches for operating systems and applications. Operating systems and most applications now come with an auto-update feature to relieve users of the responsibility of manual updating. Infection can be warded off by avoiding opening emails from unknown sources, particularly if the email contains attachments with executable files or scripts. Similarly, one should exercise restraint in clicking links on untrusted websites. Security settings of web browsers should be adjusted optimally and automatic execution of scripts such as JavaScript, ActiveX and VBScript should be turned off. Programs and users should be given only basic permissions in accordance with the principle of least privilege, and administrative rights should be exercised with discretion. Installation of security software such as anti-virus, anti-spyware, anti-trojan and rootkit detection packages and a firewall can help fend off some infections. The former can be complemented by a Host Intrusion Prevention System to protect from previously unknown threats by
identifying anomalous behavior. Closing ports used by applications favored by botnets for C&C communication, such as IRC and FTP, can reduce the risk of botnet infection to some extent. We realize that the measures described above compromise user experience and are generally bypassed by users. However, to date these remain the most effective preventive mechanisms, and thus the focus should remain on informing and motivating users of their importance while reducing their impact on usability.

Network Cleanliness: While network administrators can benefit from the previous section to defend individual hosts in the network, there are some network-level preventive steps that can further reduce the risk of botnet infection. The principle of separation of privilege dictates that multiple levels of security make it non-trivial for the attacker to circumvent them. The same is true of networks. In addition to host-based protection, servers, external connections and email gateways should all have optimum security in place. A Network Anomaly Detection System and a firewall are good options for proactive defense against the botnet threat. Access to known malicious domains should be blocked. A list of such domains (malicious/C&C/RBN domains) can be obtained from the Internet. Network traffic should be made to pass through a web proxy where incoming content can be scanned for malware and outbound content can be scrutinized for possible data leaks. Placing a honeypot and/or darknet in the network can indicate vulnerability of the network to certain threats and can point to the presence of infected machines in the monitored network. Finally, a comprehensive network security policy should be developed and enforced.

4.1.2 Non-Technical

These approaches fall under domains which are not directly related to the digital world of computers.

Attacker Dissuasion: It has been shown time and again that there is increasing financial motivation behind botnet operation. Several studies [105], [106] suggest the lucrativeness of the business of spam, click fraud and other malicious activities carried out by botnets. Targeting the business model employed by botnets can reduce incentives for the botmaster to run the botnet. MARK (the Multihost Adware Revenue Killer) was a distributed network of machines capable of controlling advertising impression numbers, click-through rates and software package installs, to carry out effective attacks against malicious-code generated revenue streams [107]. The motivation was to make the botmaster give up maintenance of the botnet by reducing advertising revenue generated by adware and botnets. A number of anti-spam approaches have been proposed [108], [109], [110]. Detecting and subsequently filtering spam email before it reaches the intended recipients can hurt the profitability of this model of marketing to advertisers. In turn, the botmasters will lose business and give up the botnet. Current defense mechanisms against DDoS are not very effective [111]. Better preventive strategies against DDoS that stop the attack before it affects the targeted service will reduce the utility of the botnet to the botmaster.

Legal Accountability: It is important to complement existing defense mechanisms against botnets with a solid legal framework. As long as the botmaster enjoys relative impunity over the Internet, technical detection and mitigation efforts alone will not suffice. Microsoft's takedown of the Rustock botnet [112] reinforces the frequently voiced concern of security organizations for more comprehensive legislation to tackle the botnet phenomenon [113]. The USA, EU countries, China and Japan have strict penalties for cyber crimes [114]. Despite this, the USA, Germany and France were the top three countries hosting C&C servers in a survey conducted by the security firm Damballa [115]. C&C servers tend to be scattered around the globe in countries with different legal attitudes towards cyber crime; therefore, taking down a server in one country will not decapitate the whole botnet. Botnets are a distributed phenomenon and therefore there is a strong need for collaborative legislation and cross-border cooperation to fight botnets. The legal consequences of cybercrime can potentially deter botmasters from operating botnets.

User Education: All the technical defenses will ultimately falter if the user has no motivation to follow them. Generally, users are not concerned about the security of their computers as long as they can perform their routine activities without hindrance. People are often the weakest link in an otherwise secure network. Botnets exploit this inherent weakness to infect systems by use of social engineering tactics. Some countries legally bind users to act responsibly by imposing privacy standards and fines for negligent or willful non-compliance [116]. Users should be educated about the gravity of the botnet threat and the measures they can take to avoid becoming bots.

4.2 Remedial

Remedial defense strategies help in partial or complete recovery from botnet infection. These mechanisms try to solve the problem either by removing the infection from infected machines and networks (defensive) or by damaging the botnet infrastructure in a way that either stops or significantly complicates perpetration of malicious activities (offensive).

4.2.1 Defensive Strategies

This class of methods is aimed at self-recovery in the event of botnet infection. Defensive strategies can be further classified as Host-based and Network-based.

Host-based: Host-based techniques help individual machines recover from botnet infection. Mitigation techniques that target C&C servers do nothing for the infected bots. Bots will continue to experience negative effects of the infection and can even unwittingly participate in a never-ending cycle of malicious activities if they fail to receive the stop command from the defunct C&C server for a previously issued start command. Host-based defense seeks to restore individual bots to their clean state. Host-based defensive strategies can involve Disinfection or Reinstallation.

One of the defensive methods for recovering machines from botnet infection is Disinfection. There are off-the-shelf programs dedicated to disinfecting compromised systems; however, their effectiveness in completely eliminating the botnet infection is uncertain. Determining all the services, files and registry keys that have been changed or installed does not guarantee total disinfection, particularly if rootkit installation was involved. Master Boot Record (MBR) infections are notoriously difficult to get rid of and disinfection efforts may prove futile. In most cases, OS reinstallation is required.

Another option to tackle the problem of botnet infection is Reinstallation. In the majority of cases, botnet infection cannot be entirely removed and users need to restore the operating system to an uncorrupted previous state. If a clean image is not available, complete reinstallation might be required.

Network-based: Network-based techniques aim to secure networks once it is known that one or more machines in the network are infected by a botnet. These techniques can be categorized into those that Block Bots or Block C&C Communication.

Botnet-infected networks can be secured by blocking Bots. To contain the infection, the administrator can quarantine the infected machines in the administered network and fix them before bringing them back on the network. This is similar to the notion of a walled garden. A walled garden is a state of isolation in which an ISP can place machines showing symptoms of botnet infection to reduce further damage to other machines inside or outside the network. The infected machine is denied all network communication except with a whitelist of domain names helpful for remediation of the infection. The bot machine is not brought back on until its cleanliness is confirmed as per the ISP policy. The Messaging Anti-Abuse Working Group (MAAWG) described the concept of a walled garden as a three-step process comprising detection, notification and remediation [117].

Network-based defense strategies also involve blocking C&C communication. A network can be protected by identifying C&C communication and then blocking it. The task can be done in a proactive fashion by making use of up-to-date URL and IP blacklists. However, identification of C&C traffic within the network traffic is not a trivial task. In the simplest form, traffic to and from known malicious ports should be blocked. For finer results, traffic analysis might be required. Alternatively, all network communication can be blocked except for a whitelist of known 'good' domains until the infected machines are cleaned.

4.2.2 Offensive

This class encompasses aggressive defense mechanisms that intimidate or completely paralyze a botnet. This can be done by carrying out an Indirect or a Direct Attack on botnets.

Indirect Attack: Indirect attack reduces the usability of the botnet to the botmaster. Unlike malware of the past, whose purpose was to create mayhem on the Internet, botnets have greater financial incentives for carrying out their malicious activities. Attacking the underlying business model to significantly reduce profits has the potential to discourage the botmaster to the extent that he gives up maintenance of the botnet. Some botnets steal sensitive information, such as passwords and banking credentials. The stolen information is transferred from the victim's computer to a dropzone and ultimately sold to third parties. The botmaster's credibility can be hurt if fake information is injected into the dropzone. This can be done by reverse engineering the botnet's C&C protocol. This will damage the botmaster's reputation and affect his business prospects. Ormerod et al. [118] suggested injecting fake but traceable credentials into the botnet dropzone so that the misfeasors could be identified and prosecuted.

Direct Attack: Direct attack entails mechanisms to directly hurt one or more components of the botnet. On the basis of the target of the attack, this class of remedial mechanisms can involve attack on Bots, C&C Communication and C&C Servers.

A direct attack can be staged on Bots. Like other software, the bot binary and C&C protocol are likely to have bugs [119], [120]. These can be exploited by the defender to take control of bots and remotely disinfect them or trigger the C&C command related to removal of the bot from the botnet. There are legal and ethical issues involved in remotely disinfecting a user's computer without his/her consent, and doing so may be seen as privacy invasion.

Another option is to coordinate a direct attack on C&C Communication. The C&C communication can be poisoned by introducing bogus commands in the channel. In p2p botnets, bots search for new commands by using bot command keys. Holz et al. [14] presented a way to disrupt botnet communication by injecting benign content with the same command keys as those used by botnets so that the actual commands never reach the bots. Sybil attack [121] is an attack against p2p botnets that introduces into the network peers under control of the defender. The sybils can be strategically positioned to gain control over a part of the p2p network. They can reroute queries to any non-bot node or a peer sybil. In this way, the sybils can disrupt a part of the overall C&C communication. Eclipse attack [122] is similar to Sybil attack except that it gains control over a much smaller portion of the p2p network. Inference of the C&C protocol enables the defender to alter and inject commands into C&C that help minimize the damage caused by the botnet. The specific commands for carrying out attacks such as spam and DDoS can be stopped from reaching bots and replaced by commands to remove bots from the botnet.

Direct attack can also be launched on C&C Servers. This involves Physical or Virtual removal of C&C servers.

Physical removal of C&C servers can potentially render the botnet dysfunctional. However, the task is easier said than done. It is applicable to centralized botnets only and requires prior knowledge of the location of the C&C servers, which itself is an arduous task. There are ethical and legal hurdles involved in taking down servers, and it requires cooperation from hosting providers. Most C&C servers are hosted by providers offering bullet-proof hosting, who are indifferent to the nature of the content being hosted. This necessitates legal intervention.

It is also possible to perpetrate a Virtual takeover of C&C servers. Physical seizure is not always necessary and the desired effect can be achieved by other means. Border Gateway Protocol (BGP) blackholing is a technique to null-route traffic to and from known malicious servers. The same can be done through DNS
Sinkholing. For DNS queries for known C&C servers, the sinkhole returns a non-routable address or any address other than the actual address of the malicious server. Alternatively, the defender has the option to carry out a DoS attack against a known, centralized C&C server to degrade its service.

4.3 Summary: Taxonomy of botnet defense mechanisms

Botnet defense comprises mechanisms that help in prevention of botnet infection. This is not always a dependable option as botnets represent an advanced threat and the intrusion may succeed despite proactive security measures. Botnet defense also covers reactive techniques such as recovery from botnet infection or offensive measures to disrupt or impair the botnet. This section presented a comprehensive taxonomy of botnet defense mechanisms. Botnet defense mechanisms can be broadly classified as Preventive and Remedial.

Preventive mechanisms discourage botnet infection proactively. These can be further classified as technical and non-technical. Technical mechanisms advocate hardening of computer and network security (host cleanliness and network cleanliness respectively) to ward off possible botnet infection. Non-technical mechanisms address botnet infection through other means, such as attacker dissuasion, legal accountability and user education.

Remedial defense mechanisms are reactive in nature. These can be further classified as defensive and offensive. Defensive strategies are concerned with recovery from botnet infection. This could involve host-based defense measures, such as host disinfection and software reinstallation, or network-based defensive measures, such as isolating bots or blocking C&C communication. Offensive strategies comprise aggressive defense mechanisms that aim to disrupt a botnet. This can be achieved by carrying out a direct or indirect attack on botnets. Direct attack entails mechanisms to directly hurt one or more components of a botnet. Indirect attack reduces the usability of the botnet to the botmaster so that there is little incentive to carry on its maintenance.

At present, defense against botnets is mostly preventive or defensive. Preventive defense includes proactive measures to avoid botnet infection. Defensive methods are reactive in nature and concern themselves with cleaning systems once they have been infected. There is a pressing need for developing defense strategies that solve the problem at its root. While bot disinfection is important from the user's point of view, it does not hurt the botnet, which finds other machines to infect and serve its purpose. Ideally, a defense method should incapacitate the whole botnet or significantly damage it.

5 RELATED WORK

A great deal of previous work has focused on different aspects of botnets, their detection and defense. Cooke et al. [47] in their pioneering work explained C&C structures and proposed a novel structure which they called 'random'. A brief discussion of this structure can be found in section 2.6.2. A branch of research seeks to demystify certain aspects of botnet operation, such as evasion techniques [123], [124], C&C structures [5] and propagation. Nazario [4] highlighted the multifaceted nature of botnets and maintained that this should be reflected in related taxonomies. His taxonomy was based on factors such as network structure, language of bot binary, features (attacks, server, proxy) and propagation strategies. This relates to Purpose, Topology and Propagation in our taxonomy of botnet features. His work does not take into account C&C characteristics, botnet evasion techniques and rallying mechanisms. Dagon [1] emphasized that the aim of a taxonomy should be identification of detection opportunities. The taxonomy classified C&C on the basis of the nature of C&C resources (public/private), RFC compliance, and activity level (how often the bots contact the botmaster). The concept of protocol-agnostic detection is related to secondary data Correlation and Network-based semantic bot detection in section 3.1.2. The taxonomy only highlights C&C channels and the remaining work provides a general discussion of rallying, detection and response strategies. Trend Micro and SANS published detailed reports [2], [125] explaining various botnet components. Dagon et al. [3] classified botnet structures into three models
and analyzed each model from the perspective work focuses on evasion mechanisms specific
of response mechanisms. to six research papers related to automated
A subset of botnet research revolves around botnet detection. Moreover, they do not discuss
botnet detection. Bailey et al. [126] provided weaknesses in botnets that defenders can use to
a survey of existing botnet research with em- improve their detection approaches. Garcia et
phasis on botnet technology and defense. The al. [131] laid out a topological map of network-
survey examined botnet detection methods on based botnet detection methods.
the basis of an interesting dimension, data A section of research concerns itself with
sources. The data sources included DNS, Net- the area of botnet defense methods. MAAWG
flow, packet tap, address allocation, honeypot (Messaging Anti-Abuse Working Group) [117]
and host data. The authors posited that this in- published a comprehensive report about how
formation would be a useful metric for choos- large scale bot infections can be mitigated in
ing the detection method that works on the residential networks. A number of reports pro-
data readily available to the interested party. vide a detailed discussion of policies [114], de-
In our botnet detection taxonomy, we have tection, measurement, disinfection and defense
discussed the aspect of data source in all the [6] against botnets. It discusses defense mecha-
subsections and made clear demarcation where nisms against botnets in great detail, however
absolutely necessary. Classification of detection there is no elaborate structure in the way this
techniques on the basis of cooperative behavior, information is presented. Resultantly, there is
signature matching and attack behavior map a chance that the reader will be overwhelmed
to our Correlation and Syntactic detection (sec- and confused as the discussion becomes more
tions 3.1.2, 3.2.2) and secondary data Corre- involved. A number of papers [132], [133] give
lation in section 3.1.2. A similar effort classified botnet detection techniques as signature-based, anomaly-based, DNS-based and mining-based [127]. Both of these lack the depth and organization characteristic of a taxonomy; they are high-level categorizations meant to help readers understand existing work in this area. Zeidanloo et al. [128] presented a taxonomy of botnet detection in which honeypots form one class. Honeypots are useful for understanding botnet operation and for assessing a network’s vulnerability to the botnet threat, yet it is important to understand that they do not identify bots or bot families per se. The other class, Intrusion Detection System (IDS), follows the established taxonomy for IDS [129], with the additional notion of activity in the context of network monitoring. The taxonomy provides a good overview of the existing detection methods but falls short in distinguishing between the different targets of botnet detection methods. Researchers [130] established a broad framework for evaluating the evadability of automated bot/botnet detection mechanisms. This bears slight similarity to our overview of the effect of different botnet features on botnet detection in subsection 3.4.2. However, their

useful insights into botnet defense but do not span the entire gamut of possible botnet countermeasures.

A common thread to all the efforts described above is that they only partially illuminate the botnet phenomenon. A number of works [134], [135], [136], [137] perform a wider study of botnets, describing their behavior, detection, defense and future challenges. Our work distinguishes itself from these efforts in that we structure existing botnet literature into a generic taxonomy, steering away from case-specific details. This makes our classification system relevant for most, if not all, existing and future botnets. Furthermore, the organization of our work aids understanding by making it possible to visualize complex botnet-related concepts.

6 FUTURE TRENDS

In this section, we spell out some botnet trends that we expect to continue into the future.

Botnets are turning to cloud computing to expand their potential. Botnets can utilize cloud platforms in two ways: (i) host the C&C server(s) on the cloud [138], [139] or (ii) create
bots on the cloud (botcloud) instead of infecting user machines [140], [141]. Hosting C&C servers on the cloud is an attractive option because of its on-demand, pay-as-you-go nature. Moreover, cloud-based hosting is quite inexpensive. Another trend, though still mostly research-based, is to create bots on the cloud (botclouds). This has a number of benefits. A botnet can be created almost instantly, in contrast to traditional botnets, where substantial time is spent recruiting bots. A botcloud uses resources much more efficiently. The bot population is always online, unlike traditional botnets, where bots disappear when users shut down their computers. Moreover, full processor and bandwidth utilization can be achieved. This is not the case in traditional botnets, where botnet activity has to be scaled down enough to avoid detection by security software. Finally, cloud security is still in a transient stage [142] and most existing detection approaches do not scale to clouds. Thus clouds provide good cover for botnets carrying out their malicious activities.

We expect to see a surge in botnets based on smartphones. While researchers [143] have already indicated that mobile phones could be used by botnets, it is only recently that we have seen an increase in the number of such botnets [144], [145], [146]. The ubiquitous use of mobile devices makes them an important frontier for botnets. Moreover, mobile phones can utilize a number of communication technologies (3G, 4G, WiFi, SMS), which multiplies the possibilities for C&C and malware propagation.

The issue of Advanced Persistent Threats (APT, section 2.4.1) is expected to worsen. 2012 witnessed the emergence of a multitude of sophisticated botnets geared to APT, such as Flame [147], Duqu [148], Gauss [149], Shamoon [150] and Mahdi [151]. Flame even incorporated an MD5 collision attack to spoof Microsoft’s digital certificates.

The popularity and viral nature of social networks will continue to attract botnets, which use them both as spam targets (victim) and for malware propagation (facilitator). However, wide adoption of social networks for C&C communication does not seem likely (section 2.3.2).

7 CONCLUSION

The number of users connected to the Internet almost doubled in the last five years [152]. This rate is expected to increase in view of technological advancements in the area of wireless communication. However, the remarkable growth in Internet usage is disproportionate to the security knowledge of common users [153]. Botnets take cyber attacks to the next level by abusing this discrepancy. Botnets employ a variety of mechanisms to compromise users’ machines. A botnet distinguishes itself from other malware in the ability of its compromised machines to establish command and control with remote server(s) controlled by human misfeasor(s).

In this paper, we have presented three comprehensive taxonomies related to botnet behavioral features, detection and defense. Our first taxonomy aims to demystify the adversary by exploring the botnet phenomenon from different angles, such as propagation, rallying, C&C and purpose. Moreover, we have also enumerated evasion mechanisms employed by a botnet for obscuring its different parts. The second taxonomy classifies botnet detection approaches. We have introduced the notion of ‘dimension’ in structuring mechanisms relevant to botnet detection. This empowers a defender to evaluate botnet detection approaches by different yardsticks. Furthermore, we have shown how different botnet features listed in our first taxonomy affect the accuracy of botnet detection approaches from our second taxonomy. Our third taxonomy provides a systematic analysis of botnet defense mechanisms. Together, these three taxonomies provide a comprehensive framework that could be utilized to understand the botnet problem and its solution space. The insights gained from this characterization can be used by defenders to identify shortcomings in existing approaches for botnet detection and defense and devise improved strategies.

We conclude by identifying current trends in botnet detection and defense, followed by future work. A quarter of the world’s computers were estimated to form part of a botnet half a decade ago [154]. The situation is no different
today, as botnets are still a primary threat to the security of the Internet [155]. The area of botnet detection and defense has evolved over the last few years. In particular, legal countermeasures have succeeded in disbanding a number of botnets [112], [12], [156], [157]. However, the long-term efficacy of these methods is debatable [158]. Reactive defense approaches like removal of C&C servers yield promising results in the short term, but botnets find a way to resurrect and resume their malicious activities. There is a pressing need to address the botnet problem closer to the source. ISPs can play a significant role in this context. This notion is reinforced by China’s success in handling its spam problem by extending liability to Chinese ISPs [159]. Moreover, preventive detection and defense approaches at home computers and routers would go a long way in curtailing the botnet phenomenon. Currently, there are very few bot detection tools [160], [161]. The majority of the existing detection approaches do not incorporate defense. It is not practical to delegate defense to home users, as the majority of them are not technologically competent. These insights motivated us to develop a framework for botnet detection and defense [162]. An integrated strategy where high-level dismantlement of botnet C&C hubs is complemented by detection and defense on home and network computers is expected to yield promising results in tackling the botnet problem.

ACKNOWLEDGMENTS

This work is supported by Pakistan National ICT R&D Fund.

REFERENCES

[1] D. Dagon, “Botnet Detection and Response-The network is the infection,” Cooperative Association for Internet Data Analysis DNS-OARC Workshop, July, vol. 25, 2005.
[2] T. Micro, “Taxonomy of botnet threats,” Micro, pp. 1–15, November 2006. [Online]. Available: http://tinyurl.com/c7mlsjo
[3] D. Dagon, G. Gu, C. P. Lee, and W. Lee, “A taxonomy of botnet structures,” Twenty-Third Annual Computer Security Applications Conference ACSAC 2007, vol. 36, pp. 325–339, 2007. [Online]. Available: http://tinyurl.com/8kxuknw
[4] J. Nazario, “Bot and botnet taxonomy,” http://tinyurl.com/6bcthj, 2008, [Online; accessed 15-December-2011].
[5] G. Ollmann, “Botnet Communication Topologies,” 2009. [Online]. Available: http://tinyurl.com/8kosjv3
[6] D. Plohmann, E. Gerhards-Padilla, and F. Leder, “Botnets: Detection, measurement, disinfection & defence,” European Network and Information Security Agency, Tech. Rep., 2011.
[7] P. Porras, H. Saidi, and V. Yegneswaran, “A multi-perspective analysis of the storm (peacomm) worm,” 2007. [Online]. Available: http://www.cyber-ta.org/pubs/stormworm/report
[8] P. Wang, S. Sparks, and C. C. Zou, “An advanced hybrid peer-to-peer botnet,” in Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets. Berkeley, CA, USA: USENIX Association, 2007, pp. 2–2. [Online]. Available: http://dl.acm.org/citation.cfm?id=1323128.1323130
[9] A. Dainotti, A. King, k. Claffy, F. Papale, and A. Pescapè, “Analysis of a "/0" stealth scan from a botnet,” in Proceedings of the 2012 ACM conference on Internet measurement conference, ser. IMC ’12. New York, NY, USA: ACM, 2012, pp. 1–14. [Online]. Available: http://doi.acm.org/10.1145/2398776.2398778
[10] T. Chen, “Stuxnet, the real start of cyber warfare?” IEEE Network, vol. 24, no. 6, 2010.
[11] J. Baltazar, J. Costoya, and R. Flores, “The real face of koobface: The largest web 2.0 botnet explained,” Trend Micro Threat Research, 2009. [Online]. Available: http://tinyurl.com/c95m86f
[12] T. Werner, “The inside story of the kelihos botnet takedown,” http://tinyurl.com/3gzmtzd, threatpost: The Kaspersky Lab Security News Service, 2011, [Online; accessed 2-December-2011].
[13] S. Stover, D. Dittrich, J. Hernandez, and S. Dietrich, “Analysis of the storm and nugache trojans: P2P is here,” in USENIX ;login, vol. 32, no. 6, 2007.
[14] T. Holz, M. Steiner, F. Dahl, E. Biersack, and F. Freiling, “Measurements and mitigation of peer-to-peer-based botnets: a case study on storm worm,” in Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats (LEET), Berkeley, CA, USA, 2008.
[15] P. Barford and V. Yegneswaran, “An Inside Look at Botnets,” in Malware Detection, ser. Advances in Information Security, M. Christodorescu, S. Jha, D. Maughan, D. Song, and C. Wang, Eds. Boston, MA: Springer US, 2007, vol. 27, ch. 8, pp. 171–191. [Online]. Available: http://dx.doi.org/10.1007/978-0-387-44599-1_8
[16] I. Arce and E. Levy, “An analysis of the slapper worm,” IEEE Security & Privacy, vol. 1, pp. 82–87, 2003.
[17] J. Stewart, “Phatbot trojan analysis,” http://tinyurl.com/9srw4gh, 2004, [Online; accessed 15-December-2011].
[18] A. Nappa, A. Fattori, M. Balduzzi, M. Dell’Amico, and L. Cavallaro, Detection of Intrusions and Malware, and Vulnerability Assessment, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010, ch. Take a Deep Breath: A Stealthy, Resilient and Cost-Effective Botnet Using Skype.
[19] A. Berger and M. Hefeeda, “Exploiting sip for botnet communication,” 2009 5th IEEE Workshop on Secure Network Protocols, pp. 31–36, 2009. [Online]. Available: http://tinyurl.com/8n9rkex
[20] J. Caballero, P. Poosankam, C. Kreibich, and D. Song, “Dispatcher: Enabling active botnet infiltration using
automatic protocol reverse-engineering,” in ACM Conference on Computer and Communications Security, Nov 2009.
[21] C. Y. Cho, D. Babić, E. C. R. Shin, and D. Song, “Inference and analysis of formal models of botnet command and control protocols,” in Proceedings of the 17th ACM conference on Computer and communications security, ser. CCS ’10. New York, NY, USA: ACM, 2010, pp. 426–439. [Online]. Available: http://doi.acm.org/10.1145/1866307.1866355
[22] J. Nazario, “Twitter based botnet command and control,” http://tinyurl.com/8ojo5wl, 2009, [Online; accessed 15-December-2011].
[23] E. J. Kartaltepe, J. A. Morales, S. Xu, and R. Sandhu, “Social network-based botnet command-and-control: emerging threats and countermeasures,” in Proceedings of the 8th international conference on Applied cryptography and network security, ser. ACNS’10. Berlin, Heidelberg: Springer-Verlag, 2010, pp. 511–528. [Online]. Available: http://dl.acm.org/citation.cfm?id=1894302.1894342
[24] A. Lelli, “Trojan.whitewell: Whats your (bot) facebook status today?” http://tinyurl.com/yeb5rvb, 2009, [Online; accessed 15-December-2011].
[25] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, and G. Vigna, “Your botnet is my botnet: Analysis of a botnet takeover,” in Proceedings of the 16th ACM conference on Computer and Communications Security (CCS), Nov 2009.
[26] R. Westervelt, “Botnet masters turn to google, social networks to avoid detection,” http://tinyurl.com/8ta9nly, 2009, [Online; accessed 15-December-2011].
[27] Damballa, “The command structure of the operation aurora botnet: History, patterns, and findings,” Tech. Rep., 2010.
[28] R. Deibert, A. Manchanda, R. Rohozinski, N. Villeneuve, and G. Walton, “Tracking ghostnet: Investigating a cyber espionage network,” Network, vol. JR02-2009, no. JR02-2009, p. 53, 2009. [Online]. Available: http://tinyurl.com/d5q3cj
[29] A. Cole, M. Mellor, and D. Noyes, “Botnets: The rise of the machines,” in Proceedings on the 6th Annual Security Conference, 2007.
[30] D. Wang, S. Savage, and G. M. Voelker, “Juice: A longitudinal study of an seo campaign,” in NDSS, San Diego, CA, USA, 2013.
[31] N. Lewis, “Zeus botnet analysis: Past, present and future threats,” http://tinyurl.com/d3vwcll, 2010, [Online; accessed 15-December-2011].
[32] A. Decker, D. Sancho, L. Kharouni, M. Goncharov, and R. McArdle, “Pushdo/cutwail botnet: A study of the pushdo/cutwail botnet,” TrendMicro Labs, Tech. Rep., 2009.
[33] I. Traynor, “Russia accused of unleashing cyberwar to disable estonia,” http://tinyurl.com/5pmk5g, May 2007, [Online; accessed 12-December-2011].
[34] B. Stone-Gross, G. S. T. Holz, and G. Vigna, “The underground economy of spam: A botmasters perspective of coordinating large-scale spam campaigns,” in USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET), 2011.
[35] E. Athanasopoulos, A. Makridakis, S. Antonatos, D. Antoniades, S. Ioannidis, K. Anagnostakis, and E. Markatos, “Antisocial networks: turning a social network into a botnet,” in Proceedings of the 11th Information Security Conference, Taipei, Taiwan, 2008.
[36] CWSandBox, “Cwsandbox,” http://mwanalysis.org/, [Online; accessed 15-December-2011].
[37] A. Mushtaq, “The dead giveaways of vm-aware malware,” http://tinyurl.com/4ejusr2, Jan 2011, [Online; accessed 15-December-2011].
[38] S. Shin and G. Gu, “Conficker and beyond: A large-scale empirical study,” in Proceedings of Annual Computer Security Applications Conference (ACSAC), 2010.
[39] L. T. Borup, “Peer-to-peer botnets: A case study on waledac,” Master’s thesis, Technical University of Denmark, Kongens Lyngby, Denmark, 2009.
[40] M. Antonakakis, R. Perdisci, Y. Nadji, N. Vasiloglou, S. Abu-Nimeh, W. Lee, and D. Dagon, “From throw-away traffic to bots: detecting the rise of dga-based malware,” in Proceedings of the 21st USENIX conference on Security symposium, ser. Security’12. Berkeley, CA, USA: USENIX Association, 2012, pp. 24–24. [Online]. Available: http://dl.acm.org/citation.cfm?id=2362793.2362817
[41] V. Tiu, “Information about worm:win32/conficker.d,” http://tinyurl.com/8c84vhl, March 2009, [Online; accessed 15-December-2011].
[42] Tor, “Tor: Anonymity online,” https://www.torproject.org/, [Online; accessed 20-December-2011].
[43] D. Brown, “Resilient botnet command and control with tor.” Presented at DEF CON 18, Las Vegas, Nevada, USA, 2010.
[44] Tor, “Tor hidden services,” https://www.torproject.org/docs/hidden-services.html.en, [Online; accessed 12-November-2012].
[45] D. Fisher, “Storm, nugache lead dangerous new botnet barrage,” http://tinyurl.com/8folbmw, 2007, [Online; accessed 15-December-2011].
[46] US-Cert, “Malware tunneling in ipv6.” http://tinyurl.com/6zvv568, 2005, [Online; accessed 15-December-2011].
[47] E. Cooke, F. Jahanian, and D. McPherson, “The zombie roundup: Understanding, detecting, and disrupting botnets,” ACM USENIX Workshop on Steps to Reducing Unwanted Traffic on the Internet SRUTI, vol. 7, pp. 39–44, 2005. [Online]. Available: http://tinyurl.com/8h94o9u
[48] G. Gu, R. Perdisci, J. Zhang, and W. Lee, “Botminer: Clustering analysis of network traffic for protocol- and structure-independent botnet detection,” in Usenix Security Symposium, 2008.
[49] P. Wurzinger, L. Bilge, T. Holz, J. Gobel, C. Kruegel, and E. Kirda, “Automatically generating models for botnet detection,” in European Symposium on Research in Computer Security (ESORICS), 2009.
[50] J. Goebel and T. Holz, “Rishi: Identify bot contaminated hosts by irc nickname evaluation,” in USENIX Workshop on Hot Topics in Understanding Botnets (HotBots’07), 2007.
[51] M. Roesch, “Snort - lightweight intrusion detection for networks,” in Proceedings of USENIX LISA ’99, 1999.
[52] G. Gu, P. Porras, V. Yegneswaran, M. Fong, and W. Lee, “Bothunter: Detecting malware infection through ids-driven dialog correlation,” in Usenix Security Symposium, 2007.
[53] L. Zhuang, J. Dunagan, D. Simon, H. Wang, I. Osipkov, G. Hulten, and J. Tygar, “Characterizing botnets from email spam records,” in USENIX Workshop on Large-Scale Exploits and Emergent Threats, 2008.
[54] A. Ramachandran, N. Feamster, and D. Dagon, “Revealing botnet membership using dnsbl counter-
intelligence,” in Conference on Steps to Reducing Unwanted Traffic on the Internet (SRUTI), 2006.
[55] R. Villamarin-Salomon and J. Brustoloni, “Identifying botnets using anomaly detection techniques applied to dns traffic,” in Proc. 5th IEEE Consumer Communications and Networking Conference (CCNC 2008), 2008.
[56] H. Choi, H. Lee, H. Lee, and H. Kim, “Botnet detection by monitoring group activities in dns traffic,” in Proc. 7th IEEE International Conference on Computer and Information Technology (CIT 2007), 2007.
[57] J. Zhang, Y. Xie, F. Yu, D. Soukal, and W. Lee, “Intention and origination: An inside look at large-scale bot queries,” to appear in NDSS, 2013.
[58] L. Aniello, G. Lodi, and R. Baldoni, “Inter-domain stealthy port scan detection through complex event processing,” in Proceedings of the 13th European Workshop on Dependable Computing, ser. EWDC ’11. New York, NY, USA: ACM, 2011, pp. 67–72. [Online]. Available: http://doi.acm.org/10.1145/1978582.1978597
[59] L. Aniello, G. A. Di Luna, G. Lodi, and R. Baldoni, “A collaborative event processing system for protection of critical infrastructures from cyber attacks,” in Proceedings of the 30th international conference on Computer safety, reliability, and security, ser. SAFECOMP’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp. 310–323. [Online]. Available: http://dl.acm.org/citation.cfm?id=2041619.2041651
[60] F. Doelitzscher, C. Reich, M. Knahl, and N. Clarke, “Incident detection for cloud environments,” in Proceedings of the Third International Conference on Emerging Network Intelligence (EMERGING 2011), Nov 2011.
[61] G. Gu, J. Zhang, and W. Lee, “Botsniffer: Detecting botnet command and control channels in network traffic,” in Network and Distributed System Security Symposium (NDSS), 2008.
[62] T. Wang and S.-Z. Yu, “Centralized botnet detection by traffic aggregation,” in IEEE International Symposium on Parallel and Distributed Processing with Applications, 2009.
[63] T. Strayer, R. Walsh, C. Livadas, and D. Lapsley, “Detecting botnets with tight command and control,” in Proceedings 2006 31st IEEE Conference on Local Computer Network, 2006.
[64] T. Strayer, D. Lapsley, R. Walsh, and C. Livadas, Botnet detection based on network behavior, ser. Advances in Information Security. Springer, 2008, vol. 36, pp. 1–24.
[65] T.-F. Yen and M. K. Reiter, “Traffic aggregation for malware detection,” in Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA), 2008.
[66] L. Liu, S. Chen, G. Yan, and Z. Zhang, “Bottracer: execution-based bot-like malware detection,” in 11th Information Security Conference, 2008.
[67] E. Stinson and J. C. Mitchell, “Characterizing bots’ remote control behavior,” in International Conference on Detection of Intrusions & Malware, and Vulnerability Assessment (DIMVA), 2007.
[68] W. Lu, M. Tavallaee, and A. A. Ghorbani, “Automatic discovery of botnet communities on large-scale communication networks,” in Proceedings of the 4th International Symposium on Information, Computer, and Communications Security, ser. ASIACCS ’09. New York, NY, USA: ACM, 2009, pp. 1–10. [Online]. Available: http://doi.acm.org/10.1145/1533057.1533062
[69] J. Binkley and S. Singh, “An algorithm for anomaly-based botnet detection,” in Usenix Steps to Reducing Unwanted Traffic on the Internet Workshop (SRUTI), 2006.
[70] A. Karasaridis, B. Rexroad, and D. Hoeflin, “Wide-scale botnet detection and characterization,” in Proceedings of the first conference on First Workshop on Hot Topics in Understanding Botnets (HotBots’07), 2007.
[71] T.-F. Yen and M. K. Reiter, “Are your hosts trading or plotting? telling p2p file-sharing and bots apart,” in International Conference on Distributed Computing Systems, 2010.
[72] E. Passerini, R. Paleari, L. Martignoni, and D. Bruschi, “Fluxor: Detecting and monitoring fast-flux service networks,” Detection of Intrusions and Malware and Vulnerability Assessment, pp. 186–206, 2008. [Online]. Available: http://tinyurl.com/c735zn9
[73] T. Holz, C. Gorecki, K. Rieck, and F. Freiling, “Detection and Mitigation of Fast-Flux Service Networks,” in Proceedings of NDSS 2008, San Diego, CA, USA, Feb. 2008. [Online]. Available: http://tinyurl.com/9q64vk5
[74] G. Gu, V. Yegneswaran, P. Porras, J. Stoll, and W. Lee, “Active botnet probing to identify obscure command and control channels,” in Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC), 2010.
[75] M. Neugschwandtner, P. M. Comparetti, and C. Platzer, “Detecting malware’s failover c&c strategies with squeeze,” in ACSAC, R. H. Zakon, J. P. McDermott, and M. E. Locasto, Eds. ACM, 2011, pp. 21–30. [Online]. Available: http://tinyurl.com/9wz2uvj
[76] R. Perdisci, W. Lee, and N. Feamster, “Behavioral clustering of http-based malware and signature generation using malicious network traces,” in USENIX Symposium on Networked Systems Design & Implementation (NSDI), 2010.
[77] C. Rossow and C. J. Dietrich, “Provex: Detecting botnets with encrypted command and control channels,” in DIMVA, 2013, pp. 21–40.
[78] T. Nguyen and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” IEEE Communications Surveys Tutorials, vol. 10, no. 4, pp. 56–76, 2008. [Online]. Available: http://tinyurl.com/9hpoa3d
[79] C. Livadas, R. Walsh, D. Lapsley, and W. Strayer, “Using machine learning techniques to identify botnet traffic,” in Proceedings of the 2nd IEEE LCN Workshop on Network Security, Nov 2006.
[80] G. Jacob, R. Hund, T. Holz, and C. Kruegel, “Jackstraws: Picking command and control connections from bot traffic,” in USENIX Security Symposium, 2011.
[81] S. Savage, D. Wetherall, A. R. Karlin, and T. E. Anderson, “Practical network support for ip traceback,” Computer Communication Review, vol. 30, pp. 295–306, 2000.
[82] D. X. Song and A. Perrig, “Advanced and authenticated marking schemes for ip traceback,” in IEEE INFOCOM, 2001, pp. 878–886.
[83] S. M. Bellovin, M. Leech, and T. Taylor, “ICMP traceback messages,” Obsolete Internet draft, February 2003. [Online]. Available: http://tinyurl.com/be2sa93
[84] A. Mankin, D. Massey, C. long Wu, S. F. Wu, and L. Zhang, “On design and evaluation of "intention-driven" icmp traceback,” in International Conference on Computer Communications and Networks, 2001.
[85] A. Belenky and N. Ansari, “Ip traceback with deterministic packet marking,” 2003.
[86] A. Yaar, A. Perrig, and D. Song, “Pi: A path identification mechanism to defend against ddos attacks,” in IEEE Symposium on Security and Privacy, 2003, pp. 93–107.
[87] D. Ramsbrock, X. Wang, and X. Jiang, “A first step
towards live botmaster traceback,” in Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection, ser. RAID ’08. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 59–77. [Online]. Available: http://dx.doi.org/10.1007/978-3-540-87403-4_4
[88] A. C. Snoeren, “Hash-based ip traceback,” Computer Communication Review, vol. 31, pp. 3–14, 2001.
[89] S. Staniford-Chen and L. T. Heberlein, “Holding intruders accountable on the internet,” in Proceedings of the 1995 IEEE Symposium on Security and Privacy, ser. SP ’95. Washington, DC, USA: IEEE Computer Society, 1995, pp. 39–. [Online]. Available: http://dl.acm.org/citation.cfm?id=882491.884246
[90] Y. Zhang and V. Paxson, “Detecting stepping stones,” in USENIX Security Symposium, 2000.
[91] X. Wang, D. S. Reeves, and S. F. Wu, “Inter-packet delay based correlation for tracing encrypted connections through stepping stones,” in European Symposium on Research in Computer Security (ESORICS), 2002, pp. 244–263.
[92] D. L. Donoho, A. G. Flesia, U. Shankar, V. Paxson, J. Coit, and S. Staniford, “Multiscale stepping-stone detection: Detecting pairs of jittered interactive streams by exploiting maximum tolerable delay,” in Proc. of The 5th International Symposium on Recent Advances in Intrusion Detection (RAID). Springer, 2002, pp. 17–35.
[93] A. Blum, D. Song, and S. Venkataraman, “Detection of interactive stepping stones: Algorithms and confidence bounds,” in Conference of Recent Advance in Intrusion Detection (RAID), Sophia Antipolis, French Riviera. Springer, 2004, pp. 258–277.
[94] L. Zhang, A. G. Persaud, A. Johnson, and Y. Guan, “Detection of stepping stone attack under delay and chaff perturbations,” in International Performance, Computing, and Communications Conference, 2006.
[95] X. Wang and D. S. Reeves, “Robust correlation of encrypted attack traffic through stepping stones by manipulation of interpacket delays,” in Proceedings of the 10th ACM conference on Computer and communications security, ser. CCS ’03. New York, NY, USA: ACM, 2003, pp. 20–29. [Online]. Available: http://doi.acm.org/10.1145/948109.948115
[96] W. T. Strayer, C. E. Jones, I. Castineyra, J. B. Levin, and R. R. Hain, “An integrated architecture for attack attribution,” Tech. Rep. BBN REPORT-8384, Dec 2003.
[97] K. Yoda and H. Etoh, “Finding a connection chain for tracing intruders,” in Proceedings of the 6th European Symposium on Research in Computer Security (ESORICS), 2000, pp. 191–205.
[98] X. Wang, D. S. Reeves, S. F. Wu, and J. Yuill, “Sleepy watermark tracing: An active network-based intrusion response framework,” in Proc. of the 16th International Information Security Conference, 2001, pp. 369–384.
[99] J. B. Grizzard and T. Johns, “Peer-to-peer botnets: Overview and case study,” in USENIX Workshop on Hot Topics in Understanding Botnets (HotBots’07), 2007.
[100] M. A. Rajab, J. Zarfoss, F. Monrose, and A. Terzis, “A multifaceted approach to understanding the botnet phenomenon,” in ACM Internet Measurement Conference (IMC), 2006.
[101] M. Rajab, J. Zarfoss, F. Monrose, and A. Terzis, “My botnet is bigger than yours (maybe, better than yours): Why size estimates remain challenging,” in USENIX Workshop on Hot Topics in Understanding Botnet, 2007.
[102] J. Zhang, X. Luo, R. Perdisci, G. Gu, W. Lee, and N. Feamster, “Boosting the scalability of botnet detection using adaptive traffic sampling,” in Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, ser. ASIACCS ’11. New York, NY, USA: ACM, 2011, pp. 124–134. [Online]. Available: http://doi.acm.org/10.1145/1966913.1966930
[103] FireEye, “Next generation threat protection - fireeye inc.” http://www.fireeye.com/, [Online; accessed 12-December-2011].
[104] Damballa, “Damballa::homepage,” http://www.damballa.com/, [Online; accessed 12-December-2011].
[105] T. Holz, M. Engelberth, and F. Freiling, “Learning more about the underground economy: A case-study of keyloggers and dropzones,” in European Symposium on Research in Computer Security (ESORICS), 2009.
[106] C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. M. Voelker, V. Paxson, and S. Savage, “Spamalytics: An empirical analysis of spam marketing conversion,” in Proceedings of the 15th ACM Conference on Computer and Communications Security, Alexandria, Virginia, USA, Oct 2008, pp. 3–14.
[107] R. Ford and S. Gordon, “Cent, five cent, ten cent, dollar: hitting botnets where it really hurts,” in Proceedings of the 2006 Workshop on New Security Paradigms (NSPW’06). New York, NY, USA: ACM, 2007, p. 310.
[108] URIBL, “uribl-website,” http://www.uribl.com/, [Online; accessed 12-December-2011].
[109] T. A. Meyer and B. Whateley, “SpamBayes: Effective open-source, Bayesian based, email classification system,” in Proceedings of the First Conference on Email and Anti-Spam (CEAS), 2004.
[110] A. Pitsillidis, K. Levchenko, C. Kreibich, C. Kanich, G. Voelker, V. Paxson, N. Weaver, and S. Savage, “Botnet Judo: Fighting Spam with Itself,” in Proceedings of the 17th Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, March 2010.
[111] T. Peng, C. Leckie, and K. Ramamohanarao, “Survey of network-based defense mechanisms countering the dos and ddos problems,” ACM Comput. Surv., vol. 39, April 2007.
[112] P. Bright, “How Operation b107 decapitated the Rustock botnet,” http://tinyurl.com/4bajdjx, 2011, [Online; accessed 10-December-2011].
[113] B. Furfie, “Laws must change to combat botnets Kaspersky,” http://tinyurl.com/9tpdyj5, Feb 2011, [Online; accessed 10-December-2011].
[114] A.-P. E. C. AEC, “Guide on Policy and Technical Approaches against Botnet,” http://tinyurl.com/9b68qmj, Dec 2008, [Online; accessed 10-December-2011].
[115] J. Leyden, “Botnet-harbouring survey fails to accounts for sinkholes,” http://tinyurl.com/cgohqqg, Oct 2010, [Online; accessed 10-December-2011].
[116] G. L. Orgill, G. W. Romney, M. G. Bailey, and P. M. Orgill, “The urgency for effective user privacy-education to counter social engineering attacks on secure computer systems,” in Proceedings of the 5th conference on Information technology education, ser. CITC5 ’04. New York, NY, USA: ACM, 2004, pp. 177–181.
[117] N. Mody, M. O’Reirdan, S. Masiello, and J. Zebek, “Common best practices for mitigating large scale bot infections in residential networks,” MAAWG, July 2009.
[118] T. Ormerod, L. Wang, M. Debbabi, A. Youssef, H. Binsalleeh, A. Boukhtouta, and P. Sinha, “Defaming botnet toolkits: A bottom-up approach to mitigating the threat,” in Proceedings of the 4th International Conference on Emerging Security Information, Systems and Technologies (SECURWARE), 2010.
[119] SecuriTeam, “Sasser worm remote ftpd buffer overflow exploit code (port 5554),” http://tinyurl.com/cdeyuy6, 2004, [Online; accessed 15-December-2011].
[120] C. Y. Cho and J. Caballero, “Botnet infiltration: Finding bugs in botnet command and control,” http://tinyurl.com/cz4j2z4, [Online; accessed 15-December-2011].
[121] J. R. Douceur, “The sybil attack,” in Proceedings of the International workshop on Peer-To-Peer Systems (IPTPS), March 2002.
[122] A. Singh, T.-W. J. Ngan, P. Druschel, and D. S. Wallach, “Eclipse attacks on overlay networks: Threats and defenses,” in IEEE International Conference on Computer Communications (Infocom), 2006.
[123] E. Karamatli, “Modern botnets: A survey and future directions,” http://tinyurl.com/cesnb7e, Bogazici University, Turkey, 2011.
[124] G. Ollmann, “Serial variant evasion tactics,” 2009. [Online]. Available: http://tinyurl.com/93jnurm
[125] K. Bong and J. Brozyck, “Managing large botnets,” http://tinyurl.com/blcuxbo, 2007, [Online; accessed 15-December-2011].
[126] M. Bailey, E. Cooke, F. Jahanian, Y. Xu, and M. Karir, “A survey of botnet technology and defenses,” in Proceedings of the 2009 Cybersecurity Applications & Technology Conference for Homeland Security. Washington, DC, USA: IEEE Computer Society, 2009, pp. 299–304. [Online]. Available: http://dl.acm.org/citation.cfm?id=1524292.1524347
[127] M. Feily, A. Shahrestani, and S. Ramadass, “A survey of botnet and botnet detection,” 2009 Third International Conference on Emerging Security Information Systems and Technologies, pp. 268–273, 2009. [Online]. Available: http://tinyurl.com/9njpehq
[128] H. R. Zeidanloo, M. J. Z. Shooshtari, M. Safari, P. V. Amoli, and M. Zamani, “A taxonomy of botnet detection techniques,” 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), pp. 158–162, 2010. [Online]. Available: http://tinyurl.com/9ttpjwm
[129] S. Axelsson, “Intrusion Detection Systems: A Survey and Taxonomy,” Chalmers Univ., Tech. Rep. 99-15, Mar.

modern botnets,” International Journal of Computer Science and Information Security, vol. 2, no. 1, 2009.
[134] I. Ullah, N. Khan, and H. A. Aboalsamh, “Survey on botnet: Its architecture, detection, prevention and mitigation,” in ICNSC, 2013, pp. 660–665.
[135] M. Eslahi, R. Salleh, and N. B. Anuar, “Bots and botnets: An overview of characteristics, detection and challenges,” in Proceedings of 2012 IEEE International Conference on Control System, Computing and Engineering (ICCSCE), 2012, pp. 349–354.
[136] Z. Zhu, G. Lu, Y. Chen, Z. J. Fu, P. Roberts, and K. Han, “Botnet research survey,” in Proceedings of the 2008 32nd Annual IEEE International Computer Software and Applications Conference, ser. COMPSAC ’08. Washington, DC, USA: IEEE Computer Society, 2008, pp. 967–972. [Online]. Available: http://dx.doi.org/10.1109/COMPSAC.2008.205
[137] S. S. C. Silva, R. M. P. Silva, R. C. G. Pinto, and R. M. Salles, “Botnets: A survey,” Computer Networks, Oct. 2012. [Online]. Available: http://dx.doi.org/10.1016/j.comnet.2012.07.021
[138] L. Bilge, D. Balzarotti, W. Robertson, E. Kirda, and C. Kruegel, “Disclosure: Detecting botnet command and control servers through large-scale netflow analysis,” in ACSAC, 2012. [Online]. Available: http://www.iseclab.org/papers/disclosure.pdf
[139] SecurityFocus, “Zeus botnet finds hold in amazon cloud,” http://www.securityfocus.com/brief/1046, 2009, [Online; accessed 12-November-2012].
[140] I. Burke, “Who needs botnets if you have google?” Presented at ZaCon2, Johannesburg, South Africa, 2010.
[141] K. P. Clark, M. Warnier, and F. M. T. Brazier, “Botclouds - the future of cloud-based botnets,” in CLOSER, F. Leymann, I. Ivanov, M. van Sinderen, and B. Shishkov, Eds. SciTePress, 2011, pp. 597–603. [Online]. Available: http://tinyurl.com/c6cqkkd
[142] Stratsec, “botcloud – an emerging platform for cyber-attacks,” http://tinyurl.com/cubnghx, 2012, [Online; accessed 12-November-2012].
[143] C. Xiang, F. Binxing, Y. Lihua, L. Xiaoyi, and Z. Tianning, “Andbot: towards advanced mobile botnets,” in Proceedings of the 4th USENIX conference on Large-scale exploits and emergent threats, ser. LEET’11. Berkeley, CA, USA: USENIX Association, 2011, pp. 11–11. [Online]. Available: http://dl.acm.org/citation.cfm?id=1972441.1972456
[144] Symantec, “Android.bmaster: A million-dollar mobile
2000. [Online]. Available: http://tinyurl.com/coej8xx botnet,” http://tinyurl.com/a4bdljv, 2012, [Online; ac-
[130] E. Stinson and J. C. Mitchell, “Towards cessed 12-November-2012].
systematic evaluation of the evadability of [145] K. Inc., “Irc bot for android.”
bot/botnet detection methods,” in Proceedings of http://tinyurl.com/7xrmlcb, 2012, [Online; accessed
the 2nd conference on USENIX Workshop on 12-November-2012].
offensive technologies. Berkeley, CA, USA: USENIX [146] X. Jiang, “Security alert: Anserverbot, new sophisticated
Association, 2008, pp. 5:1–5:9. [Online]. Available: android bot found in alternative android markets.”
http://dl.acm.org/citation.cfm?id=1496702.1496707 http://www.csc.ncsu.edu/faculty/jiang/AnserverBot/,
[131] S. Garcı́a, A. Zunino, and M. Campo, “Survey on 2011, [Online; accessed 12-November-2012].
network-based botnet detection methods,” Security [147] Micorsoft, “Flame malware collision attack explained,”
Comm. Networks, p. n/a, Jun. 2013. [Online]. Available: http://tinyurl.com/dxxlb5j, June 2012, [Online; accessed
http://dx.doi.org/10.1002/sec.800 1-Aug-2012].
[132] F. Leder, T. Werner, and P. Martini, “Proactive botnet [148] L. of Cryptography of Systems Security (CrySyS),
countermeasures an offensive approach,” in Cooperative “Duqu: A stuxnet-like malware found in the wild, tech-
Cyber Defence Centre of Excellence, Tallinn, Estonia, March nical report,” http://tinyurl.com/dxxlb5j, October 2011,
2009. [Online; accessed 1-Aug-2012].
[133] S. Stankovic and D. Simic, “Defense strategies against [149] K. Lab, “Gauss:abnormal distribution,”
40

http://tinyurl.com/8p34yp7, September 2012, [Online; Naurin Rasheed Ramay Naurin Rasheed Ramay received
accessed 12-Dec-2012]. her M.Sc in Security and Privacy as a Fulbright scholar from
[150] ——, “Shamoon the wiper - copycats at work,” Stevens Institute of Technology. She completed her BE in In-
http://tinyurl.com/a9axwgx, August 2012, [Online; ac- formation and Communication Systems from NUST School of
cessed 12-Dec-2012]. Electrical Engineering and Computer Science. She worked as
[151] Seculert, “Mahdi - the cyberwar savior?” a research assistant in the area of Network Security where she
http://tinyurl.com/brp64k4, July 2012, [Online; gained knowledge of botnets, intrusion and anomaly detection
accessed 12-Dec-2012]. methodologies. Her research interests are network security,
[152] InternetWorldStats, “Internet growth statistics,” cryptography and privacy.
http://www.internetworldstats.com/emarketing.htm,
[Online; accessed 12-December-2011].
[153] J. Stanton, K. Stam, P. Mastrangelo, and J. Jolton,
“Analysis of end user security behaviors,” Computers
Security, vol. 24, no. 2, pp. 124–133, 2005. [Online]. Kamran Riaz Khan Kamran Riaz Khan received his BS degree
Available: http://tinyurl.com/b3k6fg6 in Telecommunication Engineering in 2012 from National Uni-
[154] J. Fielding, “25% of all computers on botnets,” versity of Computer and Emerging Sciences, Pakistan. He is
http://tinyurl.com/9e7bdkr, January 2007, [Online; ac- working with SysNet labs at NUCES Islamabad as a Research
cessed 12-December-2011]. Assistant in the area of network security. He has participated
[155] CIOinsight, “Botnets still a major threat, researchers twice in Google Summer of Code as a student, with Ubuntu and
say at rsa,” http://tinyurl.com/cw5bypo, February 2011, Tor in 2010 and 2011 respectively. His research interests include
[Online; accessed 12-December-2011]. privacy protecting technologies and network security.
[156] M. S. Mimoso, “Fbi takes down dns changer
botnet; aided $14 million click fraud scheme,”
http://tinyurl.com/8nltjzf, Nov 2011, [Online; accessed
12-December-2011].
[157] D. Goodin, “Waledac botnet ‘decimated’ by ms take- Affan A. Syed Affan A. Syed is an Assistant Professor in EE
down,” http://tinyurl.com/7apnn9b, March 2010, [On- department and leads the SysNet lab at National University
line; accessed 12-December-2011]. of Computer and Emerging Sciences (NUCES), Pakistan. His
[158] ITN-News, “‘slain’ kelihos botnet still spams from be- research interest spans a wide spectrum; from exploring the
yond the grave,” http://tinyurl.com/9esmb3e, Feb 2012, deep linkages between sensing and energy in cyber-physical
[Online; accessed 1-February-2012]. systems to distributed systems, embedded systems, and com-
[159] R. McMillan, “China cleans up spam problem,” puter security. He has done research in underwater sensor
http://tinyurl.com/clnedlt, February 2011, [Online; ac- networks, network time-synchronization, high-latency medium
cessed 12-December-2011]. access control design, and industrial wireless-mesh networking.
[160] BotHunter, “Bothunter: A network-based botnet diag- He received his B.S. and M.S. in Electrical Engineering from Na-
nostic system,” http://www.bothunter.net/, [Online; ac- tional University of Science and Technology (NUST), Pakistan
cessed 12-December-2011]. and University of Southern California (USC) in 2001, and 2004
[161] TrendMicro, “Rubotted,” http://tinyurl.com/yd62cb8, respectively. He completed his Ph.D. in Computer Science from
[Online; accessed 12-December-2011]. USC in 2009 and then worked as a Post-Doctoral Research
[162] N. R. Ramay, S. Khattak, A. A. Syed, and S. A. Associate at the Information Sciences Institute (ISI) in Marina
Khayam, “Bottleneck: A generalized, flexible and exten- del Rey, CA. He is a member of ACM and IEEE, and part of the
sible framework for botnet defense,” in IEEE Symposium Phi Kappa Phi Honor society.
on Security and Privacy, 2012, May 2012, poster Paper.

Syed Ali Khayam Syed Ali Khayam received his MS and PhD
degree in Electrical Engineering from Michigan State University
in May 2003 and December 2006, respectively. From February
2007 to March 2012, he served as an assistant professor at the
School of Electrical Engineering & Computer Science (SEECS),
National University of Science and Technology (NUST), Pak-
istan. In June 2011, he co-founded a startup company, xFlow
Research which provided consultancy services to several net-
working and semiconductor vendors, including Marvell Tech-
nologies, Broadcom, Dell, Netgear, Cavium Networks, Tellabs
and Calient Technologies. Currently he is with PLUMgrid Inc.,
Sheharbano Khattak Sheharbano Khattak will start her PhD a network infrastructure software vendor. Dr. Khayam has over
in Computer Science at University of Cambridge in October 70 publications in some of the most prestigious conferences
2013. She did MS in Computer and Communication Security and journals, including IEEE Transactions, ACM and Elsevier
with distinction from NUST, Pakistan in 2013 and BS in Com- journals, ACM CCS, IEEE Infocom, RAID and ACSAC. He
puter Science from IIUI, Pakistan in 2010. Her current areas of has received research awards from Nokia Research, Korean
interest include Internet censorship, privacy and anonymity. She Research Foundation, Pakistan National ICT R&D Fund and
has worked on Network Intrusion Detection Systems, botnet Marvell Technologies. He also has 7 patents pending at USPTO,
detection tools and measurement of network data. some of which were indigenously drafted and filed by him.
