2012 Ninth International Conference on Information Technology - New Generations

Development and Implementation of a Chat Bot in a Social Network
Salto Martínez Rodrigo, Jacques García Fausto Abraham

978-0-7695-4654-4/12 $26.00 © 2012 IEEE. DOI 10.1109/ITNG.2012.147

Abstract: This document describes how to implement a Chat Bot on the Twitter social network for entertainment and viral advertising, using a database and a simple algorithm. The main goal is a successful Chat Bot implementation that people do not classify as SPAM. The result is a Twitter account (@DonPlaticador) that works without human intervention and gains more followers every day.

Index Terms: Social Media, Sentence Processing, Knowledge Databases, Artificial Intelligence.

I. INTRODUCTION

Years ago, Alan Turing posed the question "Can machines think?". Since then, a large number of bots have attempted to answer this question and to pass the Turing test [1].
The complexity is clear, and there are a fair number of methods to build a Chat Bot. Generally they are implemented on IRC channels and try to cover a wide range of issues and topics, which leaves many other areas of opportunity aside. The method presented here aims to take advantage of Social Networks (electronic platforms for social interaction), their usage and their general rules to implement a Chat Bot oriented to a specific topic. With the help of a relational database that holds a dictionary of key words and phrases, the Chat Bot is capable of answering questions, making specific searches and keeping up a conversation [2].
The most common uses of a Chat Bot are 1) Advertising (Spam), 2) Entertainment and 3) Customer Service (Knowledge Databases).

ELIZA is a computer program designed in 1966 by Joseph Weizenbaum, who was trying to keep a coherent conversation with the user. ELIZA searches for key words within the text written by the user and replies with a phrase from its database.

II.II ALICE

ALICE (Artificial Linguistic Internet Computer Entity) is an Internet project, part of the Pandora Project. This project involves the development of many types of bots, especially chat bots. On ALICE's webpage, the user can chat with an intelligent conversation program that simulates a real talk, so the user may find it hard to realize they are talking with a robot.
This technology was developed in Java by Dr. Richard S. Wallace.

III. DESIGN

III.I Analysis

An area of opportunity for the development and implementation of a Chat Bot is the social network Twitter, since it starts from a simple concept: the exchange of short messages no longer than 140 characters, which constrains both the amount of information and the way it is published. The limited number of characters is an advantage and an opportunity to improve the Chat Bot's performance, as it drastically reduces the amount of information the bot receives and processes, allowing a more accurate and detailed database to be built for the topic it manages.
For the bot to perform accurately, it is necessary to define its objective and the topics it will have knowledge of, since the replies it obtains are based on the text, phrases and words it receives from the users. It is also important to sort these phrases and words by relevance and resemblance, so that the answers are as correct as possible.
How easily the bot obtains answers depends on the way the users write their messages (tweets). The bot compares a tweet with its database, which, as previously said, is sorted by relevance of words and phrases, until it finds a suitable
answer. If no answer is found, the tweet is saved into the database for later analysis to improve the bot's capabilities.
It is possible to define one or more answers for one or more key words or phrases. This way the bot can find a suitable reply to synonyms without continuously repeating the same answer, which makes it harder for users to realize they are not talking to a person.
It is important to highlight that the database does not contain complete phrases; only the important and significant parts are stored, avoiding prepositions or any other data that does not represent relevant information.

III.II Algorithm

The process is divided into three parts: 1) Message reception, 2) Message processing, 3) Generation of a suitable reply.

III.II.I Message Reception

The bot must be capable of receiving messages written by a user, regardless of the platform or publishing method being used. The message must have its punctuation marks and special characters removed, and it must be converted to upper or lower case (whichever was defined beforehand).
Furthermore, the bot must be able to determine two things: one, whether the message was generated by the bot itself; and two, whether the message is a repeat. If the answer to either question is yes, the message must be rejected and the bot will not work with it. It is important to prevent the bot from having a conversation with itself and from replying to a message that has already been replied to.

III.II.II Message Processing

After the message has been formatted, the bot must look for the remaining words in the database.
The information in the database is stored in a table: one field contains the phrases separated by the special character '|' (pipe); another field stores the suitable replies to those key words, also separated by '|'. Finally, a third field with numeric values, which determines the relevance of the match, is used to sort the results and to choose the reply with the highest relevance.
The process consists of looking through the rows of the table until a suitable reply is found; once it gets a positive result, the process halts regardless of how many rows have been examined. If the process goes through every row in the table and no suitable answer is found, the original message is stored in the database for subsequent analysis, which ends the message processing.

III.II.III Generation of a suitable reply

If the process got a positive result during the previous step, every possible answer in the chosen row is retrieved; these answers are not ranked in any way, so that one can be chosen at random. Once the random answer has been picked, the username that generated the message is added to the beginning of it, so that not every user who follows the bot sees the generated answers and the bot is not considered SPAM.
Once the reply has been sent, the received message's ID is stored for management purposes; this way, replying to old messages can be avoided.
The full process is shown in Fig. 1.

III.II.IV Pseudocode

Variables:
ROW: DB Result Set.
Answer: Flag.
LastID: Twitter message ID.

Procedure:
LastID = 0
If Mentions > 0:
    For each Mention:
        Answer = FALSE
        Get ROWs from DB sorted by relevance
        While ROW && Answer == FALSE:
            If Key Word found && Username_Sender != Username_Bot:
                Get random answer
                Send answer
                Answer = TRUE
                LastID = tweet ID
            End (If)
        End (While)
        If Answer == FALSE:
            Insert tweet in DB
        End (If)
    End (For each)
End (If)
If LastID > 0:
    Update DB LastID
End (If)
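As a rough illustration, the pseudocode can be sketched in Python. This is only a minimal sketch under stated assumptions, not the authors' PHP implementation: the names `ROWS`, `normalize`, `find_replies` and `process_mentions` are hypothetical, the keyword table is an in-memory list rather than a database, and `send` is a placeholder callback standing in for whatever call actually posts the reply.

```python
import random
import re

# Hypothetical in-memory stand-in for the keyword table:
# (pipe-separated key phrases, pipe-separated replies, relevance)
ROWS = [
    ("hello|good morning", "Hi there!|Hello, how are you?", 1),
    ("party tonight|party", "I love parties!|Where is the party?", 5),
]


def normalize(text):
    """Lowercase the text and replace punctuation and special characters."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def find_replies(message, rows):
    """Return the replies of the most relevant matching row, or None."""
    padded = f" {normalize(message)} "
    by_relevance = sorted(rows, key=lambda row: row[2], reverse=True)
    for phrases, replies, _relevance in by_relevance:
        for phrase in phrases.split("|"):
            # Match whole words only, so "hello" does not match "helloween".
            if f" {normalize(phrase)} " in padded:
                return replies.split("|")  # halt at the first positive result
    return None


def process_mentions(mentions, rows, bot_name, last_id, send, unanswered,
                     rng=random):
    """Process (id, sender, text) mentions; return the last replied-to ID."""
    for tweet_id, sender, text in sorted(mentions):
        if sender.lower() == bot_name.lower() or tweet_id <= last_id:
            continue  # reject the bot's own or already handled messages
        replies = find_replies(text, rows)
        if replies is None:
            unanswered.append(text)  # kept for later human analysis
            continue
        # Prefix the sender's username, then pick one reply at random.
        send(f"@{sender} {rng.choice(replies)}")
        last_id = tweet_id
    return last_id
```

A caller would pass the stored last ID and persist the returned value, for example `last_id = process_mentions(mentions, ROWS, "DonPlaticador", last_id, print, [])`. As in the pseudocode, the ID advances only when a reply is sent.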

Fig. 1. General Chat Bot Process
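The three-field table described under Message Processing could be laid out roughly as follows. This is a sketch under assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the MySQL database mentioned in the paper, and the table names (`answers`, `unanswered`), column names and sample rows are hypothetical.

```python
import sqlite3


def build_tables(conn):
    """Create the hypothetical keyword table and load two sample rows."""
    conn.execute(
        "CREATE TABLE answers ("
        " phrases TEXT NOT NULL,"       # key phrases separated by '|'
        " replies TEXT NOT NULL,"       # candidate replies separated by '|'
        " relevance INTEGER NOT NULL)"  # higher value is checked first
    )
    conn.executemany(
        "INSERT INTO answers VALUES (?, ?, ?)",
        [
            ("hello|good morning", "Hi there!|Hello, how are you?", 1),
            ("party tonight|party", "I love parties!|Where is the party?", 5),
        ],
    )
    # Messages that found no reply are kept here for later human analysis.
    conn.execute("CREATE TABLE unanswered (message TEXT NOT NULL)")


def rows_by_relevance(conn):
    """Return rows as (phrase list, reply list), most relevant first."""
    cursor = conn.execute(
        "SELECT phrases, replies FROM answers ORDER BY relevance DESC"
    )
    return [(p.split("|"), r.split("|")) for p, r in cursor]


def store_unanswered(conn, message):
    """Save a message for which no suitable reply was found."""
    conn.execute("INSERT INTO unanswered VALUES (?)", (message,))


conn = sqlite3.connect(":memory:")
build_tables(conn)
```

Sorting in the query keeps the relevance ordering in the database, so the matching loop only has to walk the rows in the order they arrive and stop at the first hit.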

III.III Searches

In addition to replying to users' messages, the bot can perform searches within Twitter using the logic operators AND, OR and NOT. Having a limited but well-defined number of key words and phrases to look for allows the bot to avoid SPAM. If there is a clear way to start a conversation, only three or four search terms are necessary to achieve it. The purpose of performing searches is to find users to start a conversation with, although the most important part of the conversation takes place later. Since Twitter accounts are public by default, Twitter offers a search engine that can be accessed to perform specific searches. These searches can be customized by defining the language of the tweet, the time and the geographic zone.
It is also possible to search the profiles of the users, which makes it easier to know their preferences. Performing this kind of search makes the Bot more accurate.

III.IV Mentions

On Twitter, a mention is a tweet that contains the username of another person. It is the easiest way for users to interact; therefore it is the most important part of the bot. The message processing and the reply generation always start when a mention is received; this is the moment the interaction between the user and the bot begins. When a user sends a mention to the bot, it is because the bot has caught the attention of that user. From this point on, it is important to keep this attention. A real conversation with other users is different from a search, because there is no way to know what the users will write or what they are trying to express; this is where the database plays an important role. If the database is well organized and as complete as it can be, the retrieved replies can seem really natural and allow a conversation to flow without major problems.

III.V Contests

Companies have started noticing the importance of Twitter for positioning their brand and reaching more market segments. A very common practice is offering promotions or giving away products, mainly through activities whose main purpose is to get as many people as possible to know the company and talk about it. Most of these activities require people to follow the company and encourage other people to do the same in order to win. Managing these activities, validating their rules and informing the user of the progress of the activity can be automated using a Bot [3].
Implementing a Bot to take control of these tasks reduces the workload of the person in charge of the account, as the number of replies received within a couple of seconds is often too large for a human being to process.

III.VI Analysis of the Messages

Once there is a considerable number of messages that could not be replied to, the messages can be analyzed to get a clear idea of which are the most common messages the bot receives that were not contemplated in the Database. This analysis must be done by a human being and can be done in two ways.
The first is to analyze the complete text of each message to understand what it is trying to convey, create groups of topics that have not been attended to and, based on these, generate new entries in the Database that cover them.
The second is to divide the messages into words and phrases to find those that appear most frequently and generate specific answers for them. The problem with this method is that there is no full understanding of what the user is trying to express, and it is affected by syntax errors.

IV. IMPLEMENTATION

For the implementation of this Chat Bot, a web server with Internet access, PHP 5+, MySQL and access keys to the Twitter API were used.
The algorithm used for the implementation of this Bot can work in any other programming language and with any other database manager.

V. TESTS AND RESULTS

To perform the tests, different Twitter accounts were created with different goals: 1) @DonPlaticador, 2) @WootterC and 3) @Siguientescena_.

V.I @DonPlaticador

This account's goal is to entertain. It is the one that most closely follows the guidelines of a Chat Bot. @DonPlaticador resembles a Talking Parrot that lives to party and keeps lonely or bored people company.

Fig. 2. @DonPlaticador

@DonPlaticador uses searches to begin and follow conversations. To consider this account a success, three conditions had to be met: people should not report it as SPAM; it should keep a conversation going for more than 5 tweets on average without the user noticing that they are talking to a Bot; and users should follow the account without being followed back.
@DonPlaticador was created on July 16, 2010 and, as of today, October 1, 2011, it is still functioning without human intervention. As its profile in Fig. 3 shows, @DonPlaticador has more than five thousand followers and more than 212,000 tweets, with followers that so far do not know it is a Bot, or that think it needs a person to work.

Fig. 3. @DonPlaticador's Profile

V.II @WootterC

The goal of this account is to raise awareness of a Free Software project called Wootter by looking for people possibly interested in it. To consider this account a success, the project was not promoted by any other means; only the @WootterC account was used. Over a period of about eight months, as seen in Fig. 4, visits from forty-seven different countries were achieved, as well as wide coverage in digital media such as blogs, and invitations to different Free Software events organized by communities or universities.
The main activity this account performs is searching for terms like 'Free Software', 'Open Source' and 'Twitter Client'. It also sends some tweets that are scheduled to be sent periodically.

Fig. 4. Analytics Wootter

V.III @Siguientescena_

For the fourth edition of the International Festival of Alternative Performing Arts "Siguientescena", organized in Querétaro City, México, a social media campaign was conducted with the goal of attracting visitors to the event and giving away tickets for it. The method for giving away the tickets on Twitter consisted of asking people to compose a tweet with the text "The @Siguientescena_ Festival takes @username to backstage", where "@username" was the individual being voted to win the tickets. Furthermore, those individuals had to be followers of the account to participate.
A Bot was implemented to count and validate the votes automatically. It received over two thousand votes and informed each voter whether the vote was valid or not, speeding up the process and eliminating account handling errors.
It also performed specific searches relevant to the festival, such as tweets that contain the names of the performers.
The Bot was also used to answer the most common questions people had, regarding the performers' shows, presentations, times, ticket sales and locations.
Once again, wide coverage in digital media was achieved, with more than two thousand publications on Internet sites.

VI. CONCLUSION

It is difficult to create a Chat Bot if there is no specific goal. Only by having a good idea of what is intended to be achieved, and studying thoroughly the way to accomplish it, can good results be obtained. A Bot can hardly replace a human being, but it is a great help in accomplishing specific objectives with a limited reach.
By moving away from the general use that Chat Bots are usually given, a useful product can be obtained, one that allows the user to have a different experience without feeling plagued with useless and senseless information.
The next step towards improving the performance of these Bots, besides phrase or word hierarchy, is adding a numeric method to understand the context of the message and to distinguish the mood and the sense of the sentence, leaving aside grammatical errors. Such grammatical errors are generally not considered, and they make the end user think the Bot is programmed incorrectly, or that it just does not work the way it should.
It would be of great help to add other data to the tables of the Database, in order to have more information for selecting an answer more efficiently. Relying on the option Twitter offers to see a conversation history, an extra field could be defined, containing the IDs of the answers that should have been sent previously for the selected answer to be considered valid. This way, a context of what the topic of the conversation is, and which path the conversation is taking, can be obtained.
To successfully implement a Chat Bot, many factors must be considered. It is essential to monitor its operation continuously at the early stages and, if necessary, make the appropriate changes. Furthermore, the Database must be persistently updated to add new search terms, keywords, or answers that are more consistent with the people interacting with the Bot. This makes it possible to limit the time of each interaction with the Bot, so that users do not have the opportunity to learn all of its answers or limitations.

ACKNOWLEDGMENT

Special thanks to Irving Pérez de León, Iris Selenne Ramírez Rodriguez, Diego Octavio Ibarra Corona and Carlos Alberto Olmos Trejo for their collaboration and insightful comments.

REFERENCES
[1] Turing, A.M.: Computing Machinery & Intelligence. Mind LIX(236) (1950)
[2] Sawar, A., Atwell: Chatbots: are they really useful? LDV-Forum Band (2007)
[3] http://business.twitter.com/
