Chat Bot Making Process
Open-source devotee
Textcube maintainer / KOSS Lab.
Physicist / Neuroscientist
Adjunct Professor / Hanyang Univ. (Computer Science)
HAVE YOU ALREADY LISTENED TO MY TALK?
Then let's eat again!
> RUNME -LOOP=4
Became the first man to get two official presenter shirts at PyCon APAC 2016!
8.13.2016 (in Korean)
And now.
*Parody of something. Never mind.
WELCOME TO MY GARAGE!
Tons of garbage here!
First with the head, then with the heart.
TensorFlow
0.8 -> 0.9 -> 0.10RC0
Clipart* (c) thetomatos.com
INGREDIENTS FOR TODAY'S RECIPE
Data
Test: FAS dataset (26GB)
Tools
TensorFlow + Python 3
Today's insight
Multi-modal learning models and model chaining
I'm not sure, but I'll try to explain the whole process I did (in 30 minutes?)
Game screenshot* (c) CAVE
Forkcrane* (c) Iconix
And I assume that you already have experience / knowledge about machine learning and TensorFlow
Illustration *(c) marioandluigi97.deviantart.com
THINGS THAT WILL NOT BE COVERED TODAY
Phase space / embedding dimension
Vector representation of language sentences
Recurrent Neural Network (RNN)
Sequence-to-sequence model
GRU cell / LSTM cell
Word2Vec / Senti-Word-Net
Multi-layer stacking
Batch process for training
Clip * Idol M@ster the animation / Bandai Namco Games All rights reserved.
NEED TO LEARN?
codeonweb.com
https://www.codeonweb.com/course/@deep-learning-with-tensorflow-tutorials
ONE DAY IN SEOUL ITAEWON, 2013
It all started with dinner talk among neuroscientists...
WHAT IS A CHAT BOT?
Chatting bots
One of the oldest Human-Computer Interface (HCI) based machines
BASIC CHAT BOT COMPONENTS
Lexical Input → Natural Language Processor → Context Analyzer → Decision Maker → Response Generator → Lexical Output
TRADITIONAL CHAT BOTS
Lexical Input → Natural Language Processor → Context Analyzer → Decision Maker → Response Generator → Lexical Output
Natural Language Processor: morphemic analyzer / taxonomy analyzer
Decision maker: search engine / knowledge base
Response generator: templates
CHAT-BOTS WITH MACHINE LEARNING
Lexical Input → Natural Language Processor (sentence-to-vector converter) → Context Analyzer → Decision Maker → Response Generator → Lexical Output
General problems
Korean-specific problems
Dynamic type-changes
Photo * amazon.com
BACK TO THE ORIGIN
What I learned over 9 years
BRAIN AS A MULTI-MODAL CONTEXT MACHINE
Selection
Functionally orthogonal connection types should have complementary indicators for smaller dimensionality / better representation
Mixture
Final axes are weighted according to the context density of mixtures
Weight function
Maximize the state difference in context space
One liner:
divide and conquer
INFORMATION PATHWAY DURING CONVERSATION
During conversation:
1. Preprocessing
2. Send information
3. Context recognition
4. Spread / gather processes to determine answer
5. Send conceptual response to parietal lobe
6. Postprocessing to generate sentence
Clipart* (c) cliparts.co
ARCHITECTING
Separate the dots
1. Disintegrator
3. Context parser
4. Decision maker using ML model → Bot engine
Sentence generator:
6. Postprocessing with grammar engine and tone engine to generate sentence → Grammar model / Tone model
FINAL STRUCTURE
Lexical Input → Disintegrator → Context parser (knowledge engine / emotion engine / context memory) → Deep-learning model (sentence-to-sentence + context-aware word generator) → Sentence generator (grammar generator → tone generator) → Lexical Output
CREATING ML MODELS
Prepare: train dataset / test dataset / runtime environment
Define: input function / step function / evaluator / batch
Make: Estimator / Optimizer
Do: Training / Testing / Predicting
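A minimal sketch of this Prepare → Define → Make → Do cycle, using the skflow-style tf.contrib.learn API the talk's TF 0.8-0.10 code targets (the toy dataset and model function are placeholders, and exact signatures shifted between 0.x releases):

import numpy as np
from tensorflow.contrib import learn

# Prepare: toy train / test datasets (stand-ins for the real corpus)
X_train = np.random.rand(100, 10).astype(np.float32)
y_train = np.random.randint(0, 2, 100)
X_test = np.random.rand(20, 10).astype(np.float32)
y_test = np.random.randint(0, 2, 20)

# Define: a step function mapping a batch (X, y) to (prediction, loss)
def my_model(X, y):
    return learn.models.logistic_regression(X, y)

# Make: an Estimator bundling the model with its optimizer settings
classifier = learn.TensorFlowEstimator(model_fn=my_model, n_classes=2,
                                       batch_size=32, steps=1000,
                                       learning_rate=0.05)

# Do: training, testing, predicting
classifier.fit(X_train, y_train)
print('Accuracy:', np.mean(classifier.predict(X_test) == y_test))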
MODEL CHAIN ORDER
Lexical Input → Disintegrator → Context analyzer + Decision maker → Grammar generator → Tone generator → Lexical Output
Data along the chain: normal text → fragmented text sequence → fragmented text sequence → (almost) normal text → text with tones
Internally the chain passes a semantic sequence; at both ends, the input and output are text as conversation.
DISINTEGRATOR
Rouzeta (https://shleekr.github.io/)
Finite-state based Korean morphological analyzer (released 2 months ago!)
DISINTEGRATOR
get_training_data_by_disintegration

import konlpy  # KoNLPy's Twitter Korean tagger

def get_training_data_by_disintegration(sentence):
    disintegrated_sentence = konlpy.tag.Twitter().pos(sentence, norm=True, stem=True)
    original_sentence = konlpy.tag.Twitter().pos(sentence)
    inputData = []
    outputData = []
    is_asking = False
    for w, t in disintegrated_sentence:
        if t not in ['Eomi', 'Josa', 'Number', 'KoreanParticle', 'Punctuation']:
            inputData.append(w + '/' + t)  # keep the POS tag attached to each word
    for w, t in original_sentence:
        if t not in ['Number', 'Punctuation']:
            outputData.append(w)
    if original_sentence[-1][1] == 'Punctuation' and original_sentence[-1][0] == '?':
        if len(inputData) != 0 and len(outputData) != 0:
            is_asking = True  # to extract ask-response raw data
    return ' '.join(inputData), ' '.join(outputData), is_asking
SAMPLE DISINTEGRATOR
Super simple disintegrator using the Twitter Korean analyzer (with the KoNLPy interface)
[('', 'Noun'), ('', 'Josa'), ('', 'Noun'), ('', 'Noun'), ('', 'Josa'), ('', 'Noun'), ('', 'Josa'), ('', 'Verb'), ('.', 'Punctuation')]
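For reference, output like the sample above comes straight from the KoNLPy tagger; a tiny sketch (the Korean sentence is a stand-in, since the slide's original example text did not survive extraction):

from konlpy.tag import Twitter

tagger = Twitter()
# A stand-in sentence; yields (word, tag) pairs like the sample above
print(tagger.pos('나는 오늘 학교에 간다.'))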
CONVERSATION BOT MODEL
Embedding RNN sequence-to-sequence model for chit-chat
For testing purposes: 4-layer to 8-layer shallow learning (without input/output layers)
CONTEXT PARSER
Challenges
Continuous conversation
Context-aware talks
Ideas
Context parser = emotion engine + context memory + knowledge engine
MEMORY AND EMOTION
Context memory as short-term memory
Memorizes the current context (variable categories; tested with 4 situation types)
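A minimal sketch of such a short-term context memory, assuming a fixed-size recency buffer over the tested situation types (the class and buffer size are illustrative, not from the talk's code):

from collections import deque, Counter

class ContextMemory:
    """Short-term memory: keeps the last N context labels."""
    def __init__(self, size=10):
        self.buffer = deque(maxlen=size)

    def remember(self, context):  # e.g. 'LIFE', 'CHITCHAT', 'SCIENCE', 'TASK'
        self.buffer.append(context)

    def current_context(self):
        # The dominant label among recent turns is taken as the current context
        return Counter(self.buffer).most_common(1)[0][0] if self.buffer else None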
CONVERSATIONAL CONTEXT LOCATOR
Training the context space
Context-marked sentences (>20,000)
Contexts: LIFE / CHITCHAT / SCIENCE / TASK
Prepare generated 1-gram sets with a context bit
Train an RNN with 1-gram-to-vec
Matching the context space
Feed the input 1-gram sequence into the context space
Take the dominant axis
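A hedged sketch of the matching step: accumulate each input 1-gram's context-space vector and take the dominant axis (the vectors would come from the trained model; context_vectors here is a placeholder lookup):

import numpy as np

CONTEXTS = ['LIFE', 'CHITCHAT', 'SCIENCE', 'TASK']

def locate_context(unigrams, context_vectors):
    # context_vectors: dict mapping a 1-gram to its 4-dim context-space vector
    acc = np.zeros(len(CONTEXTS))
    for gram in unigrams:
        acc += context_vectors.get(gram, np.zeros(len(CONTEXTS)))
    return CONTEXTS[int(np.argmax(acc))]  # take the dominant axis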
EMOTION ENGINE
Input: text sequence
Output: emotion flag (6-type / 3-bit)
Training set
Sentences with 6-type categorized emotion
Current emotion indicator: the most weighted emotion axis using a WordVec model
Position in senti-space: [0.95, 0.14, 0.01, 0.05, 0.92, 0.23] → [1, 0, 0, 0, 0, 0] → 0x01 (index 1 of 6)
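The senti-space example above maps position → one-hot → flag; a small sketch, assuming the most weighted axis is picked by argmax:

import numpy as np

def emotion_flag(senti_position):
    # Pick the most weighted emotion axis; encode its 1-based index as a 3-bit flag
    axis = int(np.argmax(senti_position))
    one_hot = [1 if i == axis else 0 for i in range(len(senti_position))]
    return one_hot, axis + 1

one_hot, flag = emotion_flag([0.95, 0.14, 0.01, 0.05, 0.92, 0.23])
# one_hot -> [1, 0, 0, 0, 0, 0], flag -> 1 (0x01)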
Illustration *(c) http://ontotext.fbk.eu/
KNOWLEDGE ENGINE
Advanced topic: not necessary for chit-chat bots
Searches the tokenized knowledge related to the current conversation
Querying information
If the target of the conversation is a query, use the knowledge engine result as input to the sentence generator
SENTENCE GENERATOR
Generates a human-understandable sentence as the reply in a conversation
Idea
Thinking and speaking are separate processes in the brain
Models
Consists of two models: grammar generator + tone generator
RNN SEQ2SEQ GRAMMAR MODEL
Simple grammar model (word-based model with GRUCell and RNN seq2seq / the TensorFlow translation example)
HIDDEN_SIZE = 25
EMBEDDING_SIZE = 25
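A hedged sketch of building such a model on TF 0.x, following the TensorFlow translation example the slide cites; vocabulary size and sequence length are placeholders, and the same shape (with more layers) applies to the conversation bot model above:

import tensorflow as tf

HIDDEN_SIZE = 25      # hidden feature size of each GRU cell
EMBEDDING_SIZE = 25   # embedding dimension for each word
NUM_LAYERS = 3        # 3-layer encoder / decoder
VOCAB_SIZE = 5000     # placeholder vocabulary size
SEQ_LEN = 20          # placeholder (bucketed) sequence length

encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(SEQ_LEN)]
decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(SEQ_LEN)]

cell = tf.nn.rnn_cell.MultiRNNCell(
    [tf.nn.rnn_cell.GRUCell(HIDDEN_SIZE)] * NUM_LAYERS)

# TF 0.x API (moved to tf.contrib.legacy_seq2seq in later releases)
outputs, states = tf.nn.seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=VOCAB_SIZE, num_decoder_symbols=VOCAB_SIZE,
    embedding_size=EMBEDDING_SIZE,
    feed_previous=False)  # set True at inference time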
GRAMMAR GENERATOR
Training set
Make sequences by disintegrating normal sentences
Remove postpositions / conjunctions from the sequence
Model
3-layer sequence-to-sequence model (for each encoder / decoder)
Hidden feature size of GRU cell: 25; embedding dimension for each word: 25
TONE GENERATOR
Tones make a sentence more humanized
Every sentence has tones by speaker
The most important part of building the pretty girl chat-bot
Model
3-layer sequence-to-sequence model
TONE GENERATOR
Input: sentence without tones
Output: sentence with tones
Data: Normal sentences from various conversation sources
Training / test set
Remove tones from normal sentences
USEFUL TIPS
A sequence-to-sequence model is inappropriate as the bot engine
It easily diverges during training
Context-aware responses need to be generated not only from the ask, but from context-aware data / the knowledge base / the decision-making process
USEFUL TIPS
Sequence-to-sequence models work really well as the grammar / tone engine
This is the important point for today.
TRAINING MODELS
Goal is near here
TRAINING BOT MODEL
Input
Disintegrated sentence sequence without postpositions / conjunctions
Output
Answer sequence with nouns, pronouns, verbs, adjectives
Learning
Supervised learning (for a simple communication model / replaces templates)
TRAINING BOT MODEL
Training set
FAS log data ( http://antispam.textcube.org )
2006~2016 (from EAS data) / comments on weblogs / log size ~1TB (with spams)
Visited and crawled non-spam data, based on comment links (~26GB / MariaDB)
Preprocessing
Remove non-Korean characters from the data
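A minimal sketch of that preprocessing step, assuming "non-Korean" means anything outside the Hangul syllable block plus whitespace (the real pipeline's exact character classes are not shown):

import re

def strip_non_korean(text):
    # Keep Hangul syllables (U+AC00..U+D7A3) and whitespace; drop the rest
    return re.sub(r'[^\uac00-\ud7a3\s]', '', text)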
TRAINING GRAMMAR GENERATOR
Original dataset
Open books without license problems ( https://ko.wikisource.org )
Preprocessing
Input data: disintegrated sentence sequence
TRAINING TONE GENERATOR
Original dataset
Open books without license problems
Preprocessing
Input data: sentence sequence without tones
ONE PAGE SUMMARY
The simplest is the best
Lexical Input → Disintegrator (NLP + StV)
→ Context analyzer (context parser + context memory) + Decision maker [GUESS] [CARE] [PRESENT], backed by the knowledge engine and emotion engine
→ Deep-learning model (sentence-to-sentence + context-aware word generator)
→ Sentence generator (grammar generator → response generator)
→ Lexical Output
MAKING BOT
Let's make an anime character bot (as I promised)!
DATA SOURCE
Subtitle (caption) files of many animations!
Prototyping
Idol M@ster conversation scripts (translated by online fans)
Field tests
Animations only with female characters
New data!
Communication scripts from Idol M@ster 2 / OFA
DATA CONVERTER
Fetch: convert .smi to .srt; join the .srt files into one .txt
Remove: logo / ending / song scripts; timestamps; lines with Japanese characters and the lines following them; blank lines
Remove: character names, nouns, and numbers, using a custom dictionary (anime characters, locations, specific nouns)
Extract conversations:
if last_sentence[-1] == '?':
    conversation.add((last_sentence, current_sentence))
Conversation data for the bot model → train the sequence-to-sequence bot model
Tools: subtitle_converter.py / pandas
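A hedged, simplified stand-in for the extraction step of subtitle_converter.py, assuming one cleaned subtitle line per list entry (blank lines and song / credit lines already removed):

def extract_conversations(lines):
    # Pair each question line with the following line as (ask, response)
    conversations = []
    for last_sentence, current_sentence in zip(lines, lines[1:]):
        if last_sentence[-1] == '?':
            conversations.append((last_sentence, current_sentence))
    return conversations

pairs = extract_conversations(['Who are you?', "I'm a producer.", 'Nice.'])
# -> [('Who are you?', "I'm a producer.")]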
CONVENIENCES FOR DEMO
Simple bot engine
Ask → response sentence similarity match engine (similar to a template engine)
No knowledge engine
We just want to talk with him/her.
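A minimal sketch of such an ask → response similarity matcher using only the standard library (the demo engine's actual scoring is not shown; this just illustrates the template-engine-like idea):

import difflib

TEMPLATES = {  # hypothetical ask -> response pool
    'how are you?': 'Fine, thanks!',
    'what is your name?': 'I am a garage chat bot.',
}

def respond(ask, cutoff=0.6):
    match = difflib.get_close_matches(ask.lower(), list(TEMPLATES), n=1, cutoff=cutoff)
    return TEMPLATES[match[0]] if match else 'Sorry, I have no idea.'

print(respond('How are you'))  # -> 'Fine, thanks!'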
Bot training procedure (initialization)
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
total conversations: 4217
Transforming...
Total words, asked: 1062, response: 1128
Steps: 0
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had
negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.304
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.92GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device:
0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1501 get requests,
put_count=1372 evicted_count=1000 eviction_rate=0.728863 and unsatisfied allocation rate=0.818787
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 2405 get requests,
put_count=2388 evicted_count=1000 eviction_rate=0.41876 and unsatisfied allocation rate=0.432432
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281
Bot model training procedure (after first fitting)
ask: <REP>.
response (pred): NAME <REP>.
response (gold): NAME .
ask: <REP>.
response (pred): <REP>.
response (gold): .
ask: <REP>.
response (pred): <REP>.
response (gold): .
Grammar+Tone model training procedure (initialization)
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
total line: 7496
Fitting dictionary for disintegrated sentence...
Fitting dictionary for recovered sentence...
Transforming...
Total words pool size: disintegrated: 3800, recovered: 5476
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had
negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 970
major: 5 minor: 2 memoryClockRate (GHz) 1.304
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.92GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0)
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1501 get requests,
put_count=1372 evicted_count=1000 eviction_rate=0.728863 and unsatisfied allocation rate=0.818787
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 2405 get requests,
put_count=2388 evicted_count=1000 eviction_rate=0.41876 and unsatisfied allocation rate=0.432432
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281
Grammar+Tone model training procedure (after first fitting)
disintegrated: NOUN <REP>.
recovered (pred): <REP>.
recovered (gold): NOUN .
disintegrated: <REP>.
recovered (pred): <REP>.
recovered (gold): .
disintegrated: <UNK> .
recovered (pred): <REP>.
recovered (gold): .
disintegrated: <REP>.
recovered (pred): <REP>.
recovered (gold): .
The grammar model converges fast.
And you really need a GPU-accelerated environment to make them work.
[Bar chart: bot training and grammar training times, CPU-only vs. GPU (GTX 970)]
USEFUL TIPS FOR ANIME CHARACTER BOT
DO NOT MIX subtitles from different animations
It easily diverges during grammar model training. Strange, huh?
AND TODAY'S OBSTACLES
From TensorFlow 0.9RC, Estimator/TensorFlowEstimator.restore was removed and has not returned yet
I can create / train a model but cannot load it with the original code on TF 0.10RC
Workaround: a response matcher (match the ask sentence and return a response from the template pool)
SERVING
Like a peasant in Warcraft (OR workleft?)
TELEGRAM API
Why Telegram?
Telegram is my primary messenger
SERVING TELEGRAM BOT
Python 3
/etc/supervisor/conf.d/pycon_bot.conf
[program:pycon-bot]
command = /usr/bin/python3 /home/ubuntu/pycon_bot/serve.py
supervisorctl
ubuntu@ip-###-###-###-###:~$ sudo supervisorctl
pycon-bot RUNNING pid 12417, uptime 3:29:52
BOT SERVING CODE
/home/ubuntu/pycon_bot/serve.py
from telegram import Updater
from pycon_bot import pycon_bot, error, model_server

bot_server = None
grammar_server = None

def start(bot, update):
    # Minimal /start and /help handler (assumed; the original slide omits it)
    bot.sendMessage(chat_id=update.message.chat_id, text='Hi!')

def main():
    global bot_server, grammar_server
    updater = Updater(token='[TOKEN generated via BotFather]')
    job_queue = updater.job_queue
    dispatcher = updater.dispatcher
    dispatcher.addTelegramCommandHandler('start', start)
    dispatcher.addTelegramCommandHandler('help', start)
    dispatcher.addTelegramMessageHandler(pycon_bot)
    dispatcher.addErrorHandler(error)
    bot_server = model_server('./bot', 'ask.vocab', 'response.vocab')
    grammar_server = model_server('./grammar', 'fragment.vocab', 'result.vocab')
    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()
MODEL SERVER
pycon_bot.model_server

import pickle
from tensorflow.contrib import learn

class model_server:
    """ pickle version of TensorFlow model server """
    def __init__(self, model_path='.', x_proc_path='', y_proc_path=''):
        self.classifier = learn.TensorFlowEstimator.restore(model_path)
        self.X_processor = pickle.loads(open(model_path + '/' + x_proc_path, 'rb').read())
        self.y_processor = pickle.loads(open(model_path + '/' + y_proc_path, 'rb').read())

    def predict(self, input_data):
        X_test = self.X_processor.transform(input_data)
        prediction = self.classifier.predict(X_test, axis=2)
        return self.y_processor.reverse(prediction)
BOT ENGINE CODE
pycon_bot.pycon_bot

def pycon_bot(bot, update):
    msg = disintegrate(update.message.text)
    raw_response = bot_server.predict(msg)
    response = grammar_server.predict(raw_response)
    bot.sendMessage(chat_id=update.message.chat_id, text=' '.join(response))

pycon_bot.disintegrate

import konlpy  # KoNLPy's Twitter Korean tagger

def disintegrate(sentence):
    disintegrated_sentence = konlpy.tag.Twitter().pos(sentence, norm=True, stem=True)
    result = []
    for w, t in disintegrated_sentence:
        if t not in ['Eomi', 'Josa', 'Number', 'KoreanParticle', 'Punctuation']:
            result.append(w)
    return ' '.join(result)
RESULT
That's one small step for a man, one giant leap for anime fans.
And finally... I created a pretty sad bot.
Reason?
Idol M@ster conversations are mostly about failure and recovery rather than success.
Illustration * Idol M@ster / Bandai Namco Games. All rights reserved.
SUMMARY
Today
Covered the garage chat-bot making procedure
AND NEXT...
Add Idol M@ster 2 / OFA game conversation scripts to the current dataset
Suggested by Shin Yeaji (PyCon APAC staff) and Eunjin Hwang this week
Train the bot with some animations unknown (to me).
Finish anonymizing the FAS data and re-train the bot with TensorFlow (almost finished!)
In fact, the FAS data-based bot runs on Caffe. (http://caffe.berkeleyvision.org/)
Preparing this talk encouraged me to migrate my Caffe projects to TensorFlow
Idol M@ster?
Internet meme * (c) Marble Entertainment / inven.co.kr
First with the head, then with the heart.
SELECTED REFERENCES
De Brabandere, B., Jia, X., Tuytelaars, T., & Van Gool, L. (2016, June 1). Dynamic Filter Networks. arXiv.org.
Noh, H., Seo, P. H., & Han, B. (2015, November 18). Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction.
arXiv.org.
Andreas, J., Rohrbach, M., Darrell, T., & Klein, D. (2015, November 10). Neural Module Networks. arXiv.org.
Bengio, S., Vinyals, O., Jaitly, N., & Shazeer, N. (2015, June 10). Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks. arXiv.org.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 253-255. http://doi.org/10.1126/science.aac4520
Bahdanau, D., Cho, K., & Bengio, Y. (2014, September 2). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv.org.
Schmidhuber, J. (2014, May 1). Deep Learning in Neural Networks: An Overview. arXiv.org. http://doi.org/10.1016/j.neunet.2014.09.003
Zaremba, W., Sutskever, I., & Vinyals, O. (2014, September 8). Recurrent Neural Network Regularization. arXiv.org.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013, January 17). Efficient Estimation of Word Representations in Vector Space. arXiv.org.
Schmitz, C., Grahl, M., Hotho, A., & Stumme, G. (2007). Network properties of folksonomies. World Wide Web.
Esuli, A., & Sebastiani, F. (2006). Sentiwordnet: A publicly available lexical resource for opinion mining. Presented at the Proceedings of LREC.