A Developer’s Guide to Building AI Applications
Second Edition
Create Your First Conversational Application with Microsoft Azure AI
Foreword
One of the most compelling applications of AI is in making our everyday lives better and easier. Since the early days of computing, people have envisioned having meaningful dialogues with computers, expressing needs and ideas in natural language, the way we communicate with each other: say something to the computer, and it responds. Conversational AI shifts the interaction model from domain-specific, machine-driven commands to conversational interfaces that focus on people and expression. With conversational AI, developers can make computers communicate like people, by recognising words, understanding intent and responding in ways that feel natural and familiar.
Introduction
In this book, we look at the requirements for applying well-tested AI
solutions to everyday problems. To help you explore the possibilities
of AI, we will show you how to create a Virtual Assistant, a
conversational AI application that can understand language, perceive
vast amounts of information and respond intelligently. Along the
way, we will share the many AI resources and capabilities that are
available to developers.
Here is a roadmap to the contents of this book:
‘The Intersection of Data, AI and the Cloud’
This section explains the technological basis for this book and
why these technologies are increasingly offered in the cloud.
‘Microsoft Azure AI’
This section introduces the Microsoft Azure AI platform with a
variety of services, infrastructure and tools to empower
developers to build AI apps and agents and add knowledge
mining and machine learning capabilities. This book focuses on
conversational AI applications and provides pointers to
additional resources for other areas of Azure AI.
‘Conversational AI’
This section discusses the evolution of natural language
processing, Microsoft’s Language Understanding service
(formerly named LUIS) and Bot Framework ecosystem,
common use cases of conversational AI and the development life
cycle of conversational AI applications.
Microsoft Azure AI
Machine learning
Developers can get access to the advanced machine learning capabilities of Azure AI through the Azure Machine Learning (AML) service. AML is a managed cloud service where you can
train, manage and deploy models in the cloud or to edge devices
using Python and tools like Jupyter notebooks. You can even
deploy TensorFlow image classification and recognition models,
using a variety of deep neural networks, to Microsoft’s Project
Brainwave FPGA hardware in Azure for inference and training,
which offers extremely high-scale throughput and low latency.
The book Thoughtful Machine Learning with Python: A Test-
Driven Approach provides a starting point for AI programming
that can be useful for readers interested in using AML.
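As a rough illustration of the AML workflow, the following Python sketch submits a training script to a remote compute target with the azureml-core SDK. It is a minimal sketch, not this book’s sample code: the workspace config.json, the train.py script and the ‘cpu-cluster’ compute target are assumptions you would replace with your own resources.

```python
# A minimal sketch of submitting a training run with the Azure ML Python SDK.
# Assumes a workspace config.json downloaded from the Azure portal, a local
# train.py script and an existing compute cluster named "cpu-cluster".
from azureml.core import Workspace, Experiment, ScriptRunConfig

ws = Workspace.from_config()                 # loads ./config.json
experiment = Experiment(workspace=ws, name="demo-training")

run_config = ScriptRunConfig(
    source_directory=".",                    # folder containing train.py
    script="train.py",
    compute_target="cpu-cluster",
)

run = experiment.submit(run_config)          # queue the remote training run
run.wait_for_completion(show_output=True)    # stream logs until it finishes
```

From there, the trained model can be registered and deployed to a web service or an edge device, as described in the AML documentation.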
To help you get started with Azure AI, you can leverage the resources
available on the Azure AI website.
In this book, we will be focusing on showing how you can build a
conversational AI application using Bot Framework.
Conversational AI
Natural language processing (NLP) gives computers the ability to
read, understand and derive meaning from human language. Since
the 1950s, computer scientists have been working on the challenges
of NLP, but limitations in computing power and data sizes hindered
advancements in processing and analysing textual components,
sentiments, parts of speech and the various entities that make up
natural language communication.
That changed in the 2010s. Advances in cloud computing, machine
learning and the availability of vast amounts of text and
conversational data from messaging systems, social media and web
chats have helped us make immense progress in NLP. The
advancements in NLP have made it possible for computers to not
only identify words in text but also to understand the meaning behind
those words and the relationships between them.
NLP works by analysing a large body of human-generated text and turning it into machine-readable data. NLP identifies and extracts key metadata from the text, including the following (a short code example follows this list):
Entities
NLP identifies entities in text like people, places and things.
Entities can also be pieces of information requiring special
extraction, such as dates and times.
Relations
NLP identifies how entities are related using semantic
information.
Concepts
NLP extracts general concepts from the body of text that do not
explicitly appear. For example, the word ‘excel’ might return
concepts like ‘productivity tools’ and ‘numbers’, even if these
terms do not appear in the text. This is a powerful tool for making
connections that might not seem obvious at first glance.
Sentiment
NLP scores the level of positivity or negativity in the text. This
is useful, for example, to gauge sentiment related to a product or
service. Or, in a customer support context, this functionality is
helpful when determining whether to route a chat to a human
(upon detecting negativity).
Emotions
This is sentiment analysis at a finer granularity. In this case, NLP
classifies not just ‘positive’ and ‘negative’, but ‘anger’, ‘sadness’
and ‘joy’.
Keywords
NLP extracts keywords and phrases to use as a basis for indexing,
searching and sorting.
Categories
NLP creates a hierarchical taxonomy describing what the text is about and assigns the text to high-level categories (text classification). This is useful for applications like recommending relevant content, generating ads, organising emails and determining the intent of a user.
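Several of the metadata types above, such as entities, key phrases and sentiment, are exposed today through Azure’s Text Analytics APIs. The Python sketch below is purely illustrative; the endpoint and key are placeholders for your own Text Analytics (Language) resource.

```python
# A minimal sketch using the azure-ai-textanalytics package to extract
# entities, key phrases and sentiment from a short piece of text.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"  # placeholder
key = "<your-text-analytics-key>"                                   # placeholder

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
documents = ["The delivery was late again and the support chat was unhelpful."]

entities = client.recognize_entities(documents)[0]
key_phrases = client.extract_key_phrases(documents)[0]
sentiment = client.analyze_sentiment(documents)[0]

print([(e.text, e.category) for e in entities.entities])   # people, places, things
print(key_phrases.key_phrases)                              # candidate index terms
print(sentiment.sentiment, sentiment.confidence_scores)     # positive/neutral/negative
```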
In the past, you might have tried to simulate NLP-style capabilities
through rule-based approaches, such as regular expressions or
decision trees, which struggled at scale to understand the intent of
questions from a human. Or you might have used custom machine
learning models, which required access to specialised expertise, large
datasets and complex tools, limiting their implementation to only
large organisations with the resources to invest.
Now, consider where we are today. Easy-to-use APIs in the cloud
provide NLP capabilities that are powering the widespread use of
conversational AI. From the rise of open source tools to the arrival of
cloud APIs, NLP capabilities that were once solely in the domains of
academia and the research community are now available to a wider
audience across industries.
Language Understanding also allows developers to continuously improve the app through active learning. Language Understanding stores user queries and flags the utterances it is unsure of. You can then review those utterances, assign the correct intent and label the entities, and the service retrains the language model on this real-world data.
The service integrates with other AI tools in the cloud to power
natural language processing and understanding in apps, bots and
Internet of Things (IoT) devices. Through its Bot Framework,
Microsoft incorporates Language Understanding and other cognitive
services for the development of bots.
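To give a sense of what the service returns at runtime, the sketch below queries the Language Understanding v3 prediction REST endpoint with the requests library. The endpoint host, app ID and prediction key are placeholders from your own Language Understanding resource, and the intents and entities returned depend entirely on your model.

```python
# A minimal sketch of querying a published Language Understanding (LUIS) app
# via the v3.0 prediction REST API. All identifiers below are placeholders.
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # placeholder
app_id = "<your-luis-app-id>"                                      # placeholder
prediction_key = "<your-prediction-key>"                           # placeholder

url = f"{endpoint}/luis/prediction/v3.0/apps/{app_id}/slots/production/predict"
params = {
    "subscription-key": prediction_key,
    "query": "book a meeting room for 10am tomorrow",
    "show-all-intents": "true",
}

prediction = requests.get(url, params=params).json()["prediction"]
print(prediction["topIntent"])   # the highest-scoring intent in your model
print(prediction["entities"])    # extracted entities, e.g. a datetimeV2 value
```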
With the Bot Framework SDK, developers can easily model and build sophisticated conversations using their favourite programming languages. Developers can build conversational AI applications that converse free-form, or offer more guided interactions in which the application presents choices or possible actions. The conversation can use simple text or more complex rich cards that contain text, images and action buttons. Developers can add natural language interactions and question-and-answer capabilities that let users interact with bots in a natural way.
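To illustrate the programming model, here is a minimal Python sketch using the botbuilder-core package: a bot that welcomes new members and echoes back what the user types. The class name and welcome text are examples only, not code from the Bot Framework samples.

```python
# A minimal sketch of a bot built on the Bot Framework SDK for Python.
# The bot welcomes new members and echoes back whatever the user types.
from typing import List

from botbuilder.core import ActivityHandler, MessageFactory, TurnContext
from botbuilder.schema import ChannelAccount


class EchoBot(ActivityHandler):
    async def on_members_added_activity(
        self, members_added: List[ChannelAccount], turn_context: TurnContext
    ):
        for member in members_added:
            # Don't greet the bot itself when it joins the conversation.
            if member.id != turn_context.activity.recipient.id:
                await turn_context.send_activity(
                    MessageFactory.text("Hello! Say anything to get started.")
                )

    async def on_message_activity(self, turn_context: TurnContext):
        await turn_context.send_activity(
            MessageFactory.text(f"You said: {turn_context.activity.text}")
        )
```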
Azure Bot Service enables you to host intelligent, enterprise-grade
conversational AI applications with complete ownership and control
of your data. Developers can register and connect their bots to users
on Microsoft Teams and Web Chat, Facebook Messenger and more.
To add more intelligence to a conversational AI application, you can
add and customise pretrained API models and Cognitive Services,
including language, speech, knowledge and vision capabilities.
Bot Framework also provides a set of solution accelerators and
templates to help build sophisticated conversational experiences. The
Virtual Assistant solution accelerator brings together all the
supporting components and greatly simplifies the creation of a new
project including basic conversational intents, dispatch integration,
QnA Maker, Application Insights and an automated deployment.
The Power Virtual Agents offering builds on top of the Bot
Framework platform, providing a no-code graphical interface to
create conversational experiences.
Retail companies are also enabling users to quickly track packages and get order status updates, while still allowing a customer to be transferred to chat with a human agent.
Telecommunications companies are using virtual assistants with AI capabilities to learn more about customers and deliver rich, customised interactions, grow revenue and increase customer support teams’ productivity.
Enterprise Assistant
Organisations are using conversational AI to improve employee
engagement, connecting people, tasks, information and services
more effectively with more natural and intuitive interfaces. By
integrating employee assistants with voice and text interfaces
into enterprise devices and existing conversation canvases (e.g.,
Microsoft Teams, Slack and Web Chat), organisations speed up
the process of managing calendars, finding available meeting
rooms, finding people with specific skills or contacting HR. Integration with Dynamics, Power Apps, ServiceNow and other IT providers simplifies access for employees and allows them to easily find the data they need and perform the tasks they want. Integration with search also makes it possible to surface enterprise data to users in a natural way.
Call centre optimisation
Integrating a conversational experience into a call centre
telephone communications system can reduce call times with
human agents by clarifying information in advance or resolving
simple requests without the need for a human agent. In addition,
the solution replaces classic interactive voice response (IVR)
solutions with a modern conversational experience and enables a
consistent user experience through the duration of the call, or
until hand-off to a human agent.
Post-call analysis assesses call quality and customer feedback,
with insights available to improve the call flow and optimise the
user experience, increase first contact resolution and meet other
key performance indicators (KPIs).
The same assistant can be exposed through additional text-only
channels, enabling end users to interact through their channel of
choice and increasing the pay-off of the investment by ensuring
all users – whether they are using SMS or richer channels – can
participate.
In-car voice assistant
Voice-enabled assistants integrated into cars provide drivers and
passengers the ability to perform traditional car operations (e.g.,
navigation, radio) along with productivity-focused scenarios
such as moving meetings when you’re running late, adding items
to your task list and proactive experiences where the car can
suggest tasks to complete based on events such as starting the
engine, travelling home or enabling cruise control. Other use
cases include scheduling a service for a vehicle based on a user’s
preferences for service provider, vehicle location, provider
schedule availability, severity of issue, loaner preference, both
personal and work schedules and many more variables. This is
the power of bringing an automotive supplier’s data into the
picture and illustrates the fully integrated experience possible
through the Virtual Assistant solution.
Hospitality assistant
A Virtual Assistant integrated into a hotel-room device can
provide a broad range of hospitality-focused scenarios:
extending a stay, requesting late checkout, room service,
concierge services and finding local restaurants and attractions.
The app can be linked to a productivity account, opening up more
sophisticated experiences such as alarm calls, weather warnings
and learning patterns across stays.
These are some examples of the types of conversational AI
applications we will be focusing on building in this book. Let’s now
look at the typical workflow for developing a conversational AI
application.
Figure 2. The typical workflow for developing a conversational AI
application
Design
Developing a bot, like developing websites and applications, should
start with a design for a great experience. When humans interact with
bots, we expect that what we say is understood, what we receive as a
response is appropriate and what we get as a service is delightful. We
expect that, if we leave mid-conversation, the bot will remember
where we left off.
Your bot represents your brand, products and services for your
customers and employees, so it is imperative to start with a design-
led approach to ensure that the goal of the bot meets the explicit or
latent need of the human it serves. To design a delightful experience,
we recommend the best practices of researching targeted users,
defining bot personas, storyboarding bot scenarios, designing
conversation flow and defining an evaluation plan, without
specifying technical development details.
For each of these design activities, here are the key questions to
answer:
Researching targeted users
Who are your users? What are their objectives, needs and
expectations? What is the context for their interaction with the
bot? What does their environment look like? How will your bot
help them? What services should your bot provide them?
Defining bot personas
What should your bot look like (for instance, an avatar)? What
should it be named? Does the bot carry out your organisation’s
values? What is your bot’s personality? Does your bot have a
gender? Can it respond to off-topic questions? What tone of
voice should your bot use? How would your bot handle different
situations? How should your bot respond (with proactive,
reactive or exception management)?
Storyboarding bot scenarios
What is the user journey for your bot’s targeted users? What
should your bot do and not do? What are the goals and priorities
of your bot’s use cases?
Designing conversation flow
What conversation flows can you expect for your main use
cases? Simple Q and A, push notifications, step-by-step
instructions or more complex interactions?
Defining an evaluation plan
How would you measure success? What measurements do you
want to use to improve your service, and where should you insert
instrumentation?
Before writing code, review the bot design guidelines from
Microsoft’s Bot Framework documentation for best practices.
The Bot Framework also provides a set of tools to support the design phase; Figure 3, for example, shows a conversation .transcript file opened in the Bot Framework Emulator.
Figure 3. View of a .transcript file in Bot Framework Emulator
Build
A bot is a representational state transfer (REST) web service that
communicates with the user by sending and receiving messages and
events from conversational interfaces like chat rooms or Web Chat
widgets. With Microsoft’s Azure Bot Service and Bot Framework,
you can create bots in a variety of development environments and
languages. You can start your bot development in the Azure portal or
use one of the Bot Framework SDK templates for local development.
The templates support the C#, JavaScript and Python languages, with Java support in early preview at the time of writing.
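To make the ‘bot as a REST web service’ idea concrete, the following rough Python sketch exposes a bot on the /api/messages endpoint using aiohttp and the SDK’s adapter. The empty app ID and password assume local testing with the Bot Framework Emulator, and EchoBot stands in for whatever bot class you have built (such as the one sketched earlier).

```python
# A rough sketch of hosting a bot as a REST endpoint with aiohttp.
# Channels (or the Emulator) POST incoming activities to /api/messages.
from aiohttp import web
from botbuilder.core import BotFrameworkAdapter, BotFrameworkAdapterSettings
from botbuilder.schema import Activity

from bots import EchoBot  # assumption: your own bot class lives in bots.py

# Empty credentials are fine for local testing with the Bot Framework Emulator.
adapter = BotFrameworkAdapter(BotFrameworkAdapterSettings(app_id="", app_password=""))
bot = EchoBot()


async def messages(request: web.Request) -> web.Response:
    body = await request.json()
    activity = Activity().deserialize(body)
    auth_header = request.headers.get("Authorization", "")
    # Hand the activity to the adapter, which runs the bot's turn handler.
    await adapter.process_activity(activity, auth_header, bot.on_turn)
    return web.Response(status=201)


app = web.Application()
app.router.add_post("/api/messages", messages)

if __name__ == "__main__":
    web.run_app(app, host="localhost", port=3978)
```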
After you build the basic bot, extend its functionality in the ways your
design calls for. You can add NLP capabilities using Language
Understanding, add a knowledge base to answer common questions
using QnA Maker, add capabilities to manage complex conversation
flows and multiple knowledge domains using the Dispatch tool and
add graphics or menus using Adaptive Cards. Additionally,
Microsoft provides command-line tools to help you create, manage
and test these bot assets as part of a DevOps process.
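As one example of these extensions, the sketch below wires Language Understanding into a bot’s turn handler using the botbuilder-ai package. The app ID, prediction key and endpoint are placeholders, and the intent names depend on your own language model.

```python
# A minimal sketch of adding Language Understanding to a bot with botbuilder-ai.
# The LUIS identifiers below are placeholders for your own application's values.
from botbuilder.ai.luis import LuisApplication, LuisRecognizer
from botbuilder.core import ActivityHandler, MessageFactory, TurnContext

luis_app = LuisApplication(
    "<your-luis-app-id>",                                   # placeholder
    "<your-prediction-key>",                                 # placeholder
    "https://<your-resource>.cognitiveservices.azure.com",   # placeholder
)
recognizer = LuisRecognizer(luis_app)


class IntentBot(ActivityHandler):
    async def on_message_activity(self, turn_context: TurnContext):
        result = await recognizer.recognize(turn_context)
        intent = LuisRecognizer.top_intent(result)  # e.g. 'BookRoom' or 'None'
        await turn_context.send_activity(
            MessageFactory.text(f"Detected intent: {intent}")
        )
```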
You can access a variety of samples that showcase the conversational capabilities available through the SDK, from basic capabilities such as multi-turn dialogues to more advanced capabilities such as proactive messaging and authentication.
In addition, Microsoft provides a more advanced Virtual Assistant
template, which is recommended as a starting point for building a
more sophisticated conversational experience. It brings together
many best practices for building conversational experiences and
automates the integration of components that have been found to be
highly beneficial by Bot Framework developers.
For example, a conversational experience built on the Virtual Assistant template gives developers support for multiple languages, NLP models for base conversational intents, custom personalities to answer more general questions, integrated language generation for more natural responses, an introduction experience for new users, context switching and Skill support.
In the next section of this book, we will use the Virtual Assistant
template to create a conversational AI application.
Test
To test your conversational AI application, Microsoft provides the
Bot Framework Emulator, which enables developers to test conversations
quickly and easily. You can also write unit tests using the Bot
Framework SDK, which can focus on functionality testing of specific
dialogues. Once configured through the Azure portal, your bot can be
reached through a web chat interface, enabling broader testing by end
users early in your development process.
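For dialogue-level unit tests, the botbuilder-testing package provides a DialogTestClient that drives a dialogue without any channel. The sketch below is illustrative only, with a throwaway one-step waterfall standing in for your own dialogues; the exact helper names reflect the Python SDK at the time of writing.

```python
# A minimal sketch of unit testing a dialogue with botbuilder-testing.
# The one-step waterfall below is a stand-in for a real dialogue.
import asyncio

from botbuilder.dialogs import DialogTurnResult, WaterfallDialog, WaterfallStepContext
from botbuilder.testing import DialogTestClient


async def greet_step(step: WaterfallStepContext) -> DialogTurnResult:
    await step.context.send_activity("Hello! How can I help?")
    return await step.end_dialog()


async def test_greeting_dialog():
    dialog = WaterfallDialog("greetingDialog", [greet_step])
    client = DialogTestClient("test", dialog)    # "test" is the channel id

    reply = await client.send_activity("hi")     # one user turn in, bot reply out
    assert reply.text == "Hello! How can I help?"


if __name__ == "__main__":
    asyncio.run(test_greeting_dialog())
```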
Publish
When you are ready for your bot to be available on the web, either
publish your bot to Azure or to your own web service or data centre
– wherever a normal web application can be hosted.
Connect
Azure Bot Service does most of the work necessary to connect your
bots to a range of channels and devices. Configured through the
Azure portal, you can connect your bots to Facebook Messenger,
Slack, Microsoft Teams, Cortana, email, Telegram, Twilio, LINE
and other channels. You can also use Web Chat widgets to embed
your bots in your websites or mobile applications.
You can use the Direct Line channel to connect your bot to your own
client application, or the Direct Line Speech channel that enables
low-latency speech interfaces with client applications using the
Microsoft Speech SDK. Thus, you can embed text and speech
experiences into desktop applications, mobile apps and devices such
as cars, speakers and alarm clocks.
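The Direct Line channel is itself a REST API, so any client that can make HTTP calls can talk to your bot. The Python sketch below starts a conversation and posts a user message; the Direct Line secret is a placeholder taken from your bot’s Direct Line channel configuration in the Azure portal.

```python
# A minimal sketch of talking to a bot over the Direct Line 3.0 REST API.
# The Direct Line secret is a placeholder from your bot's channel settings.
import requests

DIRECT_LINE = "https://directline.botframework.com/v3/directline"
secret = "<your-direct-line-secret>"  # placeholder
headers = {"Authorization": f"Bearer {secret}"}

# Start a new conversation.
conversation = requests.post(f"{DIRECT_LINE}/conversations", headers=headers).json()
conversation_id = conversation["conversationId"]

# Send a message activity from the user to the bot.
activity = {"type": "message", "from": {"id": "user1"}, "text": "hello"}
requests.post(
    f"{DIRECT_LINE}/conversations/{conversation_id}/activities",
    headers=headers,
    json=activity,
)

# Poll for the bot's replies (a production client would use the WebSocket stream).
replies = requests.get(
    f"{DIRECT_LINE}/conversations/{conversation_id}/activities", headers=headers
).json()
print([a.get("text") for a in replies["activities"]])
```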
Bot Framework and members of the open source community also
provide code-based adapters to connect your bots to other channels,
such as Google Assistant, Amazon Alexa, Webex Teams, websockets
and webhooks.
Evaluate
Recordings of conversations between bots and users provide valuable
business insights to help you evaluate your bot’s performance. At this
phase, best practices include evaluating success metrics that you
defined during the design phase, reviewing instrumentation logs,
collecting user feedback, refining and iterating. Bot Framework
provides sample Application Insights queries and a Power BI
dashboard to help you grasp the full breadth of your bot’s
conversations with users and gain key insights into your bot’s health
and behaviour.
Multimodal Input
The Virtual Assistant provides a range of input mechanisms: text, tap
and speech. This can be extended as needed to include vision through
the integration of vision cognitive services. Additional input types
can easily be integrated, depending on device or canvas capabilities.
A Bot Framework-based conversational experience can also be
extended to support gestures (if available from the end user device),
enabling users to switch between input types as desired.
Adaptive Cards
Adaptive Cards add graphical capabilities such as images, buttons and rich card layouts to your Assistant’s responses. The cards are platform-agnostic pieces of UI, authored in JSON, that supported apps and services can exchange. When delivered to a specific app, the JSON is transformed into native UI that automatically adapts to its surroundings. This enables you to design and integrate lightweight UI for all major platforms and frameworks.
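To show what this looks like from a bot’s point of view, the sketch below builds a small Adaptive Card as JSON and attaches it to a reply with the SDK’s CardFactory. The card content is an arbitrary example, not one of the Virtual Assistant’s built-in cards.

```python
# A minimal sketch of replying with an Adaptive Card from a Bot Framework bot.
# The card JSON follows the Adaptive Cards schema; its content is an example.
from botbuilder.core import ActivityHandler, CardFactory, MessageFactory, TurnContext

CARD = {
    "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
    "type": "AdaptiveCard",
    "version": "1.3",
    "body": [
        {"type": "TextBlock", "text": "Book a meeting room", "weight": "Bolder"},
        {"type": "TextBlock", "text": "Room 401 is free at 10:00.", "wrap": True},
    ],
    "actions": [{"type": "Action.Submit", "title": "Book it", "data": {"room": "401"}}],
}


class CardBot(ActivityHandler):
    async def on_message_activity(self, turn_context: TurnContext):
        reply = MessageFactory.attachment(CardFactory.adaptive_card(CARD))
        await turn_context.send_activity(reply)
```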
If the conversation canvas has a screen, these cards can be rendered
across a broad range of devices and platforms, thus providing a UX
that is consistent with the service or context in which the card is
embedded. Devices that do not have screens can make use of the
speech-friendly responses provided alongside the Adaptive Cards or
any combination of delivery mechanisms appropriate to the context.
The Virtual Assistant and related Skills work comprehensively with
Adaptive Cards, and their design and branding can be fully
customised to suit your scenario. Figure 6 shows a few examples.
Developing Your Virtual Assistant
The architecture and capabilities of the Virtual Assistant template are described in the online documentation for the template. Alternatively, you can follow the Bot Framework quick-start documentation to create a simpler experience that can be extended for additional scenarios.
Online Tutorial: Create a Virtual Assistant
Follow the online tutorial (in C# or TypeScript) to create your first Virtual Assistant app that greets a new user and handles basic conversational intents.
Connecting Assistants to Clients and Channels
You can find details on how to connect to channels in the Bot Framework channels documentation, which links to additional channel-specific instructions. You also have the option to connect your Assistant to Amazon Alexa, Google Home and others through integrations provided by the Bot Builder open source community.
Optional: Adding Intelligence to
Your Assistant with Skills
A Bot Framework Skill provides a conversational component model
enabling developers to split their Assistant experience into a set of
conversational building blocks, which can be developed
independently of each other and brought together into one unified
experience. This is a common pattern for larger conversational
experiences, whereby there is one ‘parent bot’ that users interact
with, which then hands them off to various ‘child’ Skills to handle
certain tasks.
Think about the broad set of common capabilities and dialogues that
developers have traditionally built themselves. Productivity
scenarios are a good example, where each organisation would need
to create its own language models, dialogues, API integration and
responses. The job is then further complicated by the need to support
multiple languages, resulting in a large amount of work required for
any organisation building their own assistant experience.
Bot Framework provides a range of multilanguage open source
Conversational Skills – including Calendar, Email, To Do and Point
of Interest – to reduce this effort. The framework also offers a number
of experimental Skills, including Phone, News, Weather, Music and
IT Service Management.
These Conversational Skills are themselves bots and incorporate
language models, dialogues and integration code. They are built in
the same way as any bot, but can be incorporated into an existing conversational experience through simple configuration to extend its capabilities. All aspects of each Skill are completely
customisable by developers, and the full source code is provided on
GitHub alongside the Virtual Assistant.
Organisations can also create Skills for their private use or to share
with other organisations to compose into their own experiences. For
example, a conversational app developed by a meal delivery service
for their own channels (mobile apps, websites and conversational
canvases) can also be exposed as a Skill for household IoT devices
and cars to integrate as appropriate. This highlights a core capability
of Bot Framework and Azure Bot Service: they enable you to write a
Skill once and then provide it through many different channels
(including Alexa and Google Assistant) with a single code base to
reduce duplication across different ecosystems.
Building Responsible AI
The Capgemini Research Institute, in their July 2019 report,
identified that nearly nine in ten organisations have encountered
unintended consequences resulting from the use of AI. The authors of the report also catalogued organisations’ top concerns about these systems.
As is true for any technology, trust will ultimately depend on whether AI-
based systems can be operated reliably, safely and consistently – not only
under normal circumstances, but also in unexpected conditions or when
they are under attack.
– Microsoft President Brad Smith, The Future Computed
In order to build trust, it is critical that people understand what a solution can do and what information that solution collects. Developers should ask themselves critical questions about both throughout the development process.
About the Authors
Elaine Chang is a leader of product development and customer
success for conversational AI at Microsoft, where she focuses on
solutions including Virtual Assistant Solution Accelerator and Skills.
She has been one of the key product leaders for Microsoft Bot
Framework and has led Azure Bot Service to general availability and
enterprise compliance.
Elaine is a featured speaker at Microsoft Build Conference, Microsoft
Ignite Conference, Microsoft MVP Summit, Microsoft AI Innovate
and more. Elaine is also a strategic innovator, a certified professional
coach and a business leader who advocates driving innovation
through diversity and inclusion.
Darren Jefford has over 20 years of engineering and architecture experience across a variety of industries. While at Microsoft, he has
worked in high-impact, customer-facing roles to architect and deliver
highly complex solutions using a broad range of technologies. In
recent years, he has led some of the first conversational AI projects
for a variety of organisations.
Darren is currently a principal architect in the Bot Framework team
at Microsoft, where he leads the Virtual Assistant team to enable
complex conversational experiences with key customers and the
broader developer ecosystem.
Darren is a regular speaker at Microsoft events and is also the author
of two books focusing on Visual Studio and BizTalk Server.