Development and Implementation of A Chat Bot
Index Terms: Social Media, Sentence Processing, Knowledge Databases, Artificial Intelligence.

I. INTRODUCTION

Years ago, Alan Turing posed the question “Can machines think?”. Since then, a large number of Bots have attempted to answer this question by trying to pass the “Turing test” [1].

The task is clearly complex, and there are a fair number of methods to build a Chat Bot. Chat Bots are generally deployed on IRC channels, where they try to cover a wide range of issues and topics while leaving many other areas of opportunity aside. The method presented here aims to take advantage of Social Networks (electronic platforms for social interaction), their usage and their general rules, to implement a Chat Bot oriented to a specific topic. With the help of a Relational Database that holds a dictionary of key words and phrases, the Chat Bot is capable of answering questions, making specific searches and keeping up a conversation [2].

The most common uses of a Chat Bot are 1) Advertising (Spam), 2) Entertainment and 3) Customer Service (Knowledge Databases).

ALICE (Artificial Linguistic Internet Computer Entity) is an Internet project, part of the Pandora Project. This project involves the development of many types of bots, especially chat bots. On ALICE’s webpage, the user can chat with an intelligent conversation program that simulates a real conversation, so the user may find it hard to realize they are talking with a robot. This technology is developed in Java by Dr. Richard S. Wallace.

III. DESIGN

III.I Analysis

An area of opportunity for the development and implementation of a Chat Bot is the Social Network Twitter, since it starts from a simple concept: the exchange of short messages no longer than 140 characters, which limits the amount of information and the way it is published. The limited number of characters is an advantage and an opportunity to improve the Chat Bot’s performance, as it drastically reduces the amount of information the bot receives and processes, allowing the generation of a more accurate and detailed database on the topic it manages.
For the bot to perform accurately, it is necessary to define its
objective and the topics it will have knowledge of, since the
replies it produces are based on the text, phrases and words it
receives from the users. It is also important to sort these phrases
and words by relevance and resemblance, so that the answers are
as correct as possible.
The easiest way for the bot to obtain answers depends on the
way the users write their messages (Tweets). The bot compares
a Tweet with its database, which, as previously said, is sorted
by relevance of words and phrases, until it finds a suitable
Fig. 1. General Chat Bot Process
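The matching step described in Section III.I, in which a message is compared against a dictionary sorted by relevance, can be illustrated with a minimal sketch. It is not the authors' implementation: the original system used PHP and MySQL, while this sketch uses Python with an in-memory SQLite database, and the table name `dictionary`, its columns, the sample rows and the substring-matching rule are all assumptions made for illustration.

```python
import sqlite3


def build_dictionary(entries):
    """Create an in-memory table of (phrase, relevance, reply) rows.

    The schema is hypothetical; the paper does not describe its tables.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dictionary (phrase TEXT, relevance INTEGER, reply TEXT)"
    )
    conn.executemany("INSERT INTO dictionary VALUES (?, ?, ?)", entries)
    return conn


def find_reply(conn, tweet):
    """Return the reply for the most relevant phrase found in the tweet."""
    text = tweet.lower()
    # Walk the dictionary from most to least relevant entry; longer
    # phrases win ties because they are more specific.
    rows = conn.execute(
        "SELECT phrase, reply FROM dictionary "
        "ORDER BY relevance DESC, length(phrase) DESC"
    )
    for phrase, reply in rows:
        if phrase.lower() in text:  # assumed rule: plain substring match
            return reply
    return None  # no suitable entry; the message stays unanswered


# Hypothetical sample data for an entertainment-oriented bot.
conn = build_dictionary([
    ("party", 1, "Did someone say party?"),
    ("where is the party", 5, "The party is wherever I am!"),
    ("hello", 2, "Hello! Want to chat?"),
])

print(find_reply(conn, "@bot Where is the party tonight?"))
```

Returning `None` for unmatched messages keeps them distinguishable from answered ones, so they could be stored and reviewed later. A substring test is the simplest possible rule; a real dictionary would likely need word-boundary handling.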
III.III Searches

In addition to replying to users’ messages, the bot can perform searches within Twitter using the logic operators AND, OR and NOT. Having a limited but well-defined set of key words and phrases to look for allows the bot to avoid SPAM. If there is a clear way to start a conversation, only three or four search terms are necessary to achieve it. The purpose of performing searches is to find users to start a conversation with, although the most important part of the conversation takes place later. Since Twitter accounts are public by default, Twitter offers a search engine that can be accessed to perform specific searches. It is possible to customize these searches by defining the language of the Tweet, the time and the geographic zone. It is also possible to search the users’ profiles, which makes it easier to learn their preferences. Performing this kind of search makes the Bot more accurate.

III.IV Mentions

On Twitter, a mention is a Tweet that contains the username of another person. It is the easiest way for users to interact; therefore it is the most important part of the bot. Message processing and reply generation always start when a mention is received; this is the moment the interaction between the user and the bot begins. When a user sends a mention to the bot, it is because the bot has caught that user’s attention. From this point on, it is important to keep this attention. A real conversation with other users is different from a search, because there is no way to know what the users will write or what they intend to express; this is where the database plays an important role. If the database is well organized and as complete as it can be, the retrieved replies can seem really natural and allow a conversation to flow without major problems.

III.V Contests

III.VI Analysis of the Messages

Once there is a considerable number of messages that could not be replied to, the messages can be analyzed to get a clear idea of the most common messages the bot receives that were not anticipated in the Database. This analysis must be done by a human being and can be done in two ways.

The first is to analyze the complete text of each message to understand what it is trying to convey, group the topics that have not been covered and, based on this, generate new entries in the Database that cover them.

The second is to divide the messages into words and phrases, find those that appear most frequently and generate specific answers for them. The problem with this method is that there is no full understanding of what the user is trying to express, and it is affected by syntax errors.

IV. IMPLEMENTATION

For the implementation of this Chat Bot, a web server with Internet access, PHP 5+, MySQL and access keys to the Twitter API were used.
The algorithm used for the implementation of this Bot can work in any other programming language and with any other database manager.

V. TESTS AND RESULTS

To perform the tests, different Twitter accounts were created with different goals: 1) @DonPlaticador, 2) @WootterC and 3) @Siguientescena_.

V.I @DonPlaticador

This account’s goal is to entertain. It is the one that most closely follows the guidelines of a Chat Bot. @DonPlaticador resembles a talking parrot that lives to party and keeps lonely or bored people company.
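Returning to the second analysis method of Section III.VI, in which unanswered messages are divided into words and the most frequent ones are identified, the counting step can be sketched as follows. This is only an illustration in Python, not the authors' PHP code; the function name `most_common_words`, the minimum word length and the sample messages are assumptions, and a human reviewer would still decide which frequent words deserve a new Database entry.

```python
import re
from collections import Counter


def most_common_words(messages, top_n=3, min_length=3):
    """Count words across unanswered messages, most frequent first."""
    counts = Counter()
    for message in messages:
        # Drop @mentions so usernames do not dominate the counts.
        text = re.sub(r"@\w+", " ", message.lower())
        # Keep runs of letters only (accented letters included).
        words = re.findall(r"[^\W\d_]+", text)
        # Skip very short words, which are mostly articles and particles.
        counts.update(w for w in words if len(w) >= min_length)
    return counts.most_common(top_n)


# Hypothetical messages the bot could not reply to.
unanswered = [
    "@bot where can I buy tickets?",
    "@bot are tickets free?",
    "@bot what time does the show start",
    "@bot tickets please!",
]

print(most_common_words(unanswered, top_n=1))  # [('tickets', 3)]
```

As the section notes, counting isolated words ignores what the user actually meant, and misspelled words are counted as different words, so the result is best treated as a hint for manual review.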
Fig. 3. @DonPlaticador’s Profile

V.II @WootterC

The goal of this account is to raise awareness of a Free Software project called Wootter by looking for people who might be interested in it. So that the account’s success could be assessed, the project was not promoted through any other means; only the @WootterC account was used. Over a period of about eight months, as seen in Fig. 4, visits from forty-seven different countries were achieved, as well as wide coverage in digital media such as blogs, and invitations to different Free Software events organized by communities or universities.

The main activity this account performs is searching; it searches for terms like ‘Free Software’, ‘Open Source’ and ‘Twitter Client’. It also sends some Tweets that are scheduled to be sent periodically.

Fig. 4. Analytics Wootter

V.III @Siguientescena_

For the fourth edition of the International Festival of Alternative Performing Arts “Siguientescena”, organized in Querétaro City, México, a social media campaign was conducted with the goal of attracting visitors to the event and giving away tickets for it. The method for giving away the tickets on Twitter consisted of asking people to compose a Tweet with the text “The @Siguientescena_ Festival takes @username to backstage”, where “@username” was the individual being voted to win the tickets. Furthermore, those individuals had to be followers of the account to participate.

A Bot was implemented to count and validate the votes automatically. It received over two thousand votes and informed each voter whether the vote was valid or not, speeding up the process and eliminating account handling errors.

It also performed specific searches relevant to the festival, such as Tweets that contain the names of the performers.

The Bot was also used to answer the most common questions people had, regarding performers, shows, presentations, times, ticket sales and locations.

Once again, wide coverage in digital media was achieved, with more than two thousand publications on Internet sites.

VI. CONCLUSION

It is difficult to create a Chat Bot if there is no specific goal. Only by having a good idea of what is intended to be achieved, and studying thoroughly the way to accomplish it, can good results be obtained. A Bot can hardly replace a human being, but it is a great help in accomplishing specific objectives with a limited reach.

By moving away from the general-purpose use Chat Bots are usually given, a useful product can be obtained, one that allows the user to have a different experience without feeling plagued with useless and senseless information.

The next step towards improving the performance of these Bots, besides phrase or word hierarchy, is adding a numeric method to understand the context of the message, to distinguish the mood and the sense of the sentence while tolerating grammatical errors. Such grammatical errors are generally not taken into account, and they make the end user think the Bot is programmed incorrectly, or that it just does not work the way it should.

It would be of great help to add other data to the tables of the Database, in order to have more information that allows an answer to be selected more efficiently. Relying on the option Twitter offers to see a conversation history, an extra field could be defined, containing the IDs of the answers that should have previously been sent for the selected answer to be considered valid. This way, the context of the conversation topic, and the path the conversation is taking, can be obtained.

To successfully implement a Chat Bot, many factors must be considered. It is essential to monitor its operation continuously in the early stages and, if necessary, make the appropriate changes. Furthermore, the Database must be persistently updated to add new search terms, keywords, or answers that are more consistent with the people interacting with the Bot. This makes it possible to limit the time users spend interacting with the Bot, so that they do not have the opportunity to discover all its answers or limitations.

ACKNOWLEDGMENT

Special thanks to Irving Pérez de León, Iris Selenne Ramírez Rodriguez, Diego Octavio Ibarra Corona and Carlos Alberto Olmos Trejo for their collaboration and insightful comments.

REFERENCES
[1] Turing, A.M.: Computing Machinery & Intelligence. Mind LIX(236) (1950)
[2] Sawar, A., Atwell: Chatbots: are they really useful? LDV-Forum Band (2007)
[3] http://business.twitter.com/