Machine Learning
• Instead of writing code, you feed data to the generic algorithm and it builds its
own logic based on the data.
• Machine Learning (ML) encompasses a lot of things. The field is vast and is
expanding rapidly. It is a branch of Artificial Intelligence (AI). Loosely speaking, ML
is the field of study that gives computer algorithms the ability to learn without
being explicitly programmed. The outcome we want from our computer
algorithm is PREDICTION. This is different from our previous problems, where we
wanted the algorithm to solve a specific problem such as finding the best web
page for our search, sorting a list of items, or generating a secure means of
computing a shared secret in cryptography. What are we going to use to predict?
An example application
• An emergency room in a hospital measures 17 variables (e.g., blood
pressure, age, etc.) of newly admitted patients.
• A decision is needed: whether to put a new patient in an intensive-care
unit.
• Due to the high cost of ICU, those patients who may survive less than a
month are given higher priority.
• Problem: to predict high-risk patients and discriminate them from low-
risk patients.
Another application
• A credit card company receives thousands of applications for new cards.
Each application contains information about an applicant:
• age
• marital status
• annual salary
• outstanding debts
• credit rating
• etc.
• Problem: to decide whether an application should be approved, or to
classify applications into two categories, approved and not approved.
Machine Learning Types
• Supervised Learning
• Uses labeled data
• Results are compared with the correct answers.
• Requires large amounts of data to refine the model and produce more
accurate results.
• Common Techniques: Classification, Regression
• Use Cases: Fraud Detection, Image Recognition
• Unsupervised Learning
• Works with unlabeled data.
• A learning algorithm is used to detect patterns.
• The most common unsupervised learning technique is clustering, which takes
unlabeled data and uses algorithms to put similar items into groups.
• Use Cases: Customer Segmentation, Sentiment Analysis
• Reinforcement Learning
• Learns through a trial-and-error process.
• Learning improves based on positive and negative reinforcement.
• Use Cases: Games, Robotics
Machine Learning: Algorithms, Use Case Examples, and Outcomes
• Supervised Learning (used when we know the classification of the data and what to predict)
o Linear Regression: estimate product price elasticity
o Logistic Regression: classify customers on likeliness to repay a loan
o Linear / Quadratic Discriminant Analysis: classify customers on likeliness to repay a loan
o Decision Tree: find attributes in a product that make it likely to be purchased
o Naïve Bayes: analyze sentiments to assess product perception
o Support Vector Machine: analyze sentiments to assess product perception
o Random Forest: predict power usage in a distribution grid
o AdaBoost: detect fraudulent activity on a credit card
• Unsupervised Learning (used when we don't know the classification of the data and want the algorithm to classify the data)
o K-Means Clustering: segment customers into groups by characteristics
o Gaussian Mixture Model: segment customers based on less distinctive characteristics
o Hierarchical Clustering: inform product usage by grouping customers
o Recommender System: recommend news articles to readers based on what they are currently reading
• Reinforcement Learning (used when we don't have training data and the only way to learn about the environment is to learn with it)
o Balance the load on electricity grids in varying demand cycles
o Optimize the driving behavior of self-driving cars
o Find real-time pricing during a product auction
• Outcome: Descriptive (What happened?), Predictive (What will happen?), Prescriptive (What to do?)
Machine Learning today is extensively used and has well defined
Algorithms, Tools and Technology while other AI technologies are confined
to vendor provided solutions…
• Technology: Python, Hadoop, Java, R, MATLAB, ELM, Scala
• Algorithm: Regression, Decision Tree, Naïve Bayes, Support Vector Machine, Random Forest, AdaBoost, Gradient-boosting trees
• Tools: Scikit learn, Shogun, Apache Mahout, H2O, Cloudera Oryx, GoLearn, Weka
Landscape of ML Solutions
[Figure: ML solutions arranged from "DIY" to "Buy" and by user group: business application users, data analysts, data scientists, ML engineers, and engineers.]
• Categories shown: embedded machine learning, machine learning APIs, augmented analytics, data analysis software, data science and machine-learning platforms, deep-learning frameworks, deep-learning cloud platforms, deep-learning hardware
• Products and vendors named: Skymind's DL4J, Salesforce Einstein, Caffe, SAP Clea, Google's TensorFlow, Theano, Microsoft Cognitive Toolkit, H2O.ai's Deep Water, Baidu's Pebble, Intel BigDL, Amazon Web Services' (AWS) Apache MXNet, Intel Nervana, Microsoft Azure, Rescale, AWS, Google Cloud Platform, Nvidia, AMD, IBM, Intel
• Languages named: R, Python, Scala, Matlab
Source: "Magic Quadrant for Data Science and Machine-Learning Platforms," 22 February 2018. (G00326456)
© 2018 Gartner, Inc. and/or its affiliates. All rights reserved.
Application Example:
Natural Language Processing
An example
• Data: Loan application data
• Task: Predict whether a loan should be approved or not.
• Performance measure: accuracy.
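Accuracy, the performance measure named above, is the fraction of predictions that match the true labels. The sketch below is one minimal way to compute it; the function name and the loan decisions are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical loan decisions: "yes" = approved, "no" = not approved.
actual = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no"]

score = accuracy(actual, predicted)  # 4 of the 5 predictions match
print(f"accuracy: {score:.2f}")
```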
From Business Problem to Machine Learning
Problem: A Recipe
Step-by-step “recipe” for qualifying a business problem as a machine
learning problem
1. Do you need machine learning?
2. Can you formulate your problem clearly?
3. Do you have sufficient examples?
4. Does your problem have a regular pattern?
5. Can you find meaningful representations of your data?
6. How do you define success?
When to use machine learning
Problem formulation
2. Can you formulate your problem clearly?
• What do you want to predict given which input?
• Pattern: “given X, predict Y”
o What is the input?
o What is the output?
Example: sentiment analysis
• Given a customer review, predict its sentiment
• Input: customer review text
• Output: positive, negative, neutral
Collecting data
3. Do you have sufficient examples?
• Machine learning always requires data!
• Generally, the more data, the better
• Each example must contain two parts (supervised learning)
o Features: attributes of the example
o Label: the answer you want to predict
Example: sentiment analysis
• Thousands of customer reviews and ratings from the Web
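As a rough illustration of the "features plus label" structure, the sketch below stores each sentiment example as a small record. The class name `Example` and the three reviews are hypothetical; a real data set would hold thousands of reviews.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    features: str  # input X: the customer review text
    label: str     # output Y: "positive", "negative", or "neutral"


# Hypothetical reviews, invented for illustration only.
dataset = [
    Example("Great product, works as advertised", "positive"),
    Example("Stopped working after two days", "negative"),
    Example("It arrived on Tuesday", "neutral"),
]

# Most learning code wants inputs and labels as two parallel lists.
X = [example.features for example in dataset]
y = [example.label for example in dataset]
print(len(X), "examples with labels", sorted(set(y)))
```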
Regularities in the data
4. Does your problem have a regular pattern?
• Machine learning learns regularities and patterns
• Hard to learn patterns that are rare or irregular
1. Split data (inputs & labels) into training & testing subsets
2. Train a model on the training set
3. Make predictions on the testing set
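The split, train, and predict workflow can be sketched in plain Python. This is only an illustration under stated assumptions: the loan records are invented, and the "model" is a simple nearest-centroid classifier chosen for brevity, not a method the slides prescribe.

```python
import random


def split_data(examples, test_fraction=0.25, seed=0):
    """Step 1: shuffle, then split into training and testing subsets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train(training_set):
    """Step 2: compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in training_set:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals]
            for label, totals in sums.items()}


def predict(model, features):
    """Step 3: predict the class whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical data: (annual salary, outstanding debt) in $1000s -> decision.
data = [((90, 5), "yes"), ((85, 10), "yes"), ((70, 8), "yes"), ((95, 20), "yes"),
        ((30, 25), "no"), ((25, 15), "no"), ((40, 35), "no"), ((20, 30), "no")]

train_set, test_set = split_data(data)
model = train(train_set)
predictions = [predict(model, features) for features, _ in test_set]
correct = sum(p == label for p, (_, label) in zip(predictions, test_set))
print(f"test accuracy: {correct}/{len(test_set)}")
```

The test examples are held back during training so that the accuracy reflects how the model handles data it has not seen.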
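Underfitting vs. Overfitting
[Figure: error plotted against model complexity (low to high), with one curve for the training set and one for the test set.]
• Underfitting: the predictor is too "simplistic" and cannot capture the pattern
• Overfitting: the predictor is too "powerful" (rote learning); it is easy to be good on the training data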
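One way to see both failure modes is to vary the complexity of a k-nearest-neighbour classifier on noisy data. The sketch below is an assumption-laden illustration: the data are synthetic, and k-NN is used only because it is easy to write, with k = 1 acting as rote learning and k equal to the training-set size acting as the most simplistic predictor.

```python
import random
from collections import Counter


def make_data(n, rng, noise=0.2):
    """Synthetic 1-D data: the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, data, k):
    hits = sum(knn_predict(train, x, k) == label for x, label in data)
    return hits / len(data)


rng = random.Random(42)
train_set = make_data(200, rng)
test_set = make_data(200, rng)

# k = 1 memorizes the training set (most complex model);
# k = 200 always predicts the overall majority label (simplest model).
results = {}
for k in (1, 15, 200):
    results[k] = (accuracy(train_set, train_set, k), accuracy(train_set, test_set, k))
    print(f"k={k:3d}  train={results[k][0]:.2f}  test={results[k][1]:.2f}")
```

With k = 1 every training point is its own nearest neighbour, so the training accuracy is perfect even though 20% of the labels are noise. Comparing the train and test columns shows how much of that score carries over to unseen data.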
An example: the learning task
• Learn a classification model from the data
• Use the model to classify future loan applications into
• Yes (approved) and
• No (not approved)
• What is the class for the following case/instance?
Machine Learning: Supervised
and Unsupervised Learning
Supervised learning vs. unsupervised learning
• Supervised learning: discover patterns in the data that relate data
attributes with a target (class) attribute.
• These patterns are then utilized to predict the values of the target attribute
in future data instances.
Differences between Supervised and
Unsupervised Learning
• Supervised learning: classification is seen as supervised learning from
examples.
• Supervision: The data (observations, measurements, etc.) are labeled with
pre-defined classes, as if a “teacher” had provided the classes (supervision).
• Test data are classified into these classes too.
• Unsupervised learning (clustering)
• Class labels of the data are unknown
• Given a set of data, the task is to establish the existence of classes or clusters
in the data
Difference between Classification and Clustering
Classification
• Classification is a supervised learning technique in which predefined labels are assigned to instances based on their properties.
• Classification is the process of learning a model that describes different predetermined classes of data. It is a two-step process, comprising a learning step and a classification step. In the learning step, a classification model is constructed; in the classification step, the constructed model is used to predict the class labels for given data.
• Classification techniques: decision trees, KNN, regression, Naïve Bayes, neural networks, logistic regression, etc.
• Example: In a banking application, a customer who applies for a loan may be classified as safe or risky according to his/her age and salary. The produced model could be in the form of a decision tree or a set of rules.
Clustering
• Clustering is an unsupervised learning technique in which similar instances are grouped based on their features or properties.
• Clustering organises data into clusters such that objects inside a cluster have high similarity and objects in different clusters are dissimilar to each other.
• The main target of clustering is to divide the whole data into multiple clusters. Unlike in classification, the class labels of objects are not known beforehand.
• In clustering, the similarity between two objects is measured by a similarity function based on the distance between the two objects: the shorter the distance, the higher the similarity; the longer the distance, the higher the dissimilarity.
• Clustering techniques: K-Means
• Example: customer segmentation
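The distance-based notion of similarity used in clustering can be made concrete with Euclidean distance. This is one common choice and an assumption here, since the slides do not name a specific distance; the three customers are invented.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical customers described by (age, annual salary in $1000s).
alice = (30, 50)
bob = (33, 54)
carol = (60, 120)

d_ab = euclidean_distance(alice, bob)    # sqrt(3**2 + 4**2)
d_ac = euclidean_distance(alice, carol)

# Shorter distance means higher similarity: Alice is more like Bob than Carol.
print(f"alice-bob: {d_ab:.2f}, alice-carol: {d_ac:.2f}")
```

In practice, features measured on very different scales are usually rescaled first, so that no single feature dominates the distance.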
Supervised learning process: two steps
Learning (training): Learn a model using the training data
Testing: Test the model using unseen test data to assess the
model accuracy
Machine Learning in Enterprise Computing
[Figure: new data is fed to a machine learning model, which produces a result; the model is updated by retraining.]
Traditional rule-based approach vs. machine learning
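The contrast between a hand-written rule and a learned one can be sketched as follows. Everything here is hypothetical: the salary cut-off of 50, the loan history, and the deliberately simple one-threshold "model" are chosen only to show that retraining on new data changes the behaviour without editing the code.

```python
def rule_based(salary):
    """Traditional approach: a person hard-codes the decision logic."""
    return salary >= 50


def learn_threshold(examples):
    """Machine learning approach: pick the salary cut-off that best fits the data."""
    best_threshold, best_hits = None, -1
    for candidate in sorted({salary for salary, _ in examples}):
        hits = sum((salary >= candidate) == approved for salary, approved in examples)
        if hits > best_hits:
            best_threshold, best_hits = candidate, hits
    return best_threshold


# Hypothetical history: (annual salary in $1000s, was the loan approved?)
history = [(20, False), (35, False), (42, False), (58, True), (63, True), (80, True)]
threshold = learn_threshold(history)

# New data arrives; retraining updates the model, the code stays the same.
new_history = history + [(45, True), (48, True), (50, True)]
new_threshold = learn_threshold(new_history)

print(f"learned cut-off: {threshold}, after retraining: {new_threshold}")
```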
Fundamental assumption of learning
Assumption: The distribution of training examples is identical to the
distribution of test examples (including future unseen examples).
Some 8s from the MNIST data set
If you think about it, everything is just numbers
The neural network we made in Part 2 only took in three numbers as
the input (“3” bedrooms, “2000” sq. feet, etc.). But now we want to
process images with our neural network. How in the world do we feed
images into a neural network?
• To a computer, an image is really just a grid of
numbers that represent how dark each pixel is:
To feed an image into our neural network, we simply treat the
18x18 pixel image as an array of 324 numbers:
To handle 324 inputs, we’ll just enlarge our neural
network to have 324 input nodes:
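Treating an image as a list of numbers amounts to flattening the pixel grid row by row. The sketch below assumes a made-up 18x18 grayscale image with values from 0 to 255; the final rescaling step is a common convention, not something the slides require.

```python
WIDTH = HEIGHT = 18

# Hypothetical 18x18 grayscale image: each value says how dark a pixel is (0-255).
image = [[(row * WIDTH + col) % 256 for col in range(WIDTH)] for row in range(HEIGHT)]

# Flatten the grid row by row into one list of 18 * 18 = 324 numbers,
# one number per input node of the network.
inputs = [pixel for row in image for pixel in row]

# Optional: rescale to the range 0.0-1.0.
scaled = [pixel / 255 for pixel in inputs]

print(len(inputs))
```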
Training Data
Mmm… sweet, sweet training data
Clustering (Unsupervised Learning)
• Clustering is a technique for finding similarity groups in data, called clusters.
That is, it groups data instances that are similar to (near) each other into one cluster and data
instances that are very different (far away) from each other into different clusters.
• Clustering is often called an unsupervised learning task as no class values
denoting an a priori grouping of the data instances are given, which is the
case in supervised learning.
• Due to historical reasons, clustering is often considered synonymous with
unsupervised learning.
• In fact, association rule mining is also unsupervised.
An illustration
• The data set has three natural groups of data points, i.e., 3
natural clusters.
What is clustering for?
Let us see some real-life examples
• Example 1: group people of similar sizes together to make “small”,
“medium” and “large” T-Shirts.
• Tailor-made for each person: too expensive
• One-size-fits-all: does not fit all.
• Example 2: In marketing, segment customers according to their
similarities
• To do targeted marketing.
• Example 3: Given a collection of text documents, we want to organize
them according to their content similarities,
• To produce a topic hierarchy
• In fact, clustering is one of the most utilized data mining techniques.
• It has a long history and is used in almost every field, e.g., medicine, psychology,
botany, sociology, biology, archeology, marketing, insurance, libraries, etc.
• In recent years, due to the rapid increase of online documents, text clustering
has become important.
K-Means Clustering (Unsupervised Learning)
• The k-Means clustering algorithm, which is effective for large datasets, puts
similar, unlabeled data into groups.
• The first step is to select k, the number of clusters, generally by
visualizing the data to see if there are noticeable groupings.
Algorithm:
1. Pick a number k of random cluster centers
2. Assign every item to its nearest cluster center using a distance metric
3. Move each cluster center to the mean of its assigned items
4. Repeat steps 2-3 until convergence (change in cluster assignment less than a threshold)
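The k-means steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions: Euclidean distance as the metric, initial centers drawn at random from the data, and "no assignment changed" as the convergence test. The six points are invented.

```python
import math
import random


def k_means(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Pick k random items as the initial cluster centers.
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iters):
        # Assign every item to its nearest center (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Converged: no item changed cluster since the last pass.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Move each center to the mean of its assigned items.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centers, assignment


# Hypothetical 2-D data with two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignment = k_means(points, k=2)
print(assignment)
print(centers)
```

Cluster numbers are arbitrary labels: which group is called 0 and which is called 1 depends on the random initial centers.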
Association
• Association rules help establish associations amongst data objects
inside large databases.
• This unsupervised technique is about discovering interesting
relationships between variables in large databases. For example,
people who buy a new home are likely to also buy new furniture.
• Other Examples:
• A subgroup of cancer patients grouped by their gene expression
measurements
• Groups of shoppers based on their browsing and purchasing histories
• Movies grouped by the ratings given by movie viewers
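The strength of an association such as "new home, then new furniture" is commonly quantified with support and confidence. These two measures are standard in association rule mining but are not defined in the slides, so treat the sketch below as an assumption; the purchase records are invented.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= transaction for transaction in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that also
    contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical purchase records, one set of items per customer.
transactions = [
    {"new home", "furniture", "insurance"},
    {"new home", "furniture"},
    {"new home", "insurance"},
    {"furniture"},
    {"car", "insurance"},
]

s = support(transactions, {"new home", "furniture"})        # 2 of 5 records
c = confidence(transactions, {"new home"}, {"furniture"})   # 2 of the 3 home buyers
print(f"support={s:.2f}, confidence={c:.2f}")
```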