Machine Learning With Matlab
Machine Learning
Basic Concepts
[Figure: scatter plot with legend entries Group1 to Group8]
Machine Learning
Characteristics and Examples
Characteristics
Lots of data (many variables)
System too complex to know the governing equation (e.g., black-box modeling)
Examples
Credit rating transition matrix (rows and columns ordered AAA to D)
       AAA      AA       A        BBB      BB       B        CCC      D
AAA    93.68%   5.55%    0.59%    0.18%    0.00%    0.00%    0.00%    0.00%
AA     2.44%    92.60%   4.03%    0.73%    0.15%    0.00%    0.00%    0.06%
A      0.14%    4.18%    91.02%   3.90%    0.60%    0.08%    0.00%    0.08%
BBB    0.03%    0.23%    7.49%    87.86%   3.78%    0.39%    0.06%    0.16%
BB     0.03%    0.12%    0.73%    8.27%    86.74%   3.28%    0.18%    0.64%
B      0.00%    0.00%    0.11%    0.82%    9.64%    85.37%   2.41%    1.64%
CCC    0.00%    0.00%    0.00%    0.37%    1.84%    6.24%    81.88%   9.67%
D      0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    100.00%
Exploration
Modeling
Evaluation
Deployment
[Figure: plot panels with axes MPG, Acceleration, Displacement, Weight, and Horsepower]
Data Exploration
Interactions Between Variables
[Figures: Andrews plot (f(t) versus t), glyph plot, Chernoff faces, and a plot of Coordinate Value across MPG, Acceleration, Displacement, Weight, and Horsepower; cars shown include plymouth satellite, chevrolet impala, ford torino, and pontiac catalina]
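The plot types named on this slide can be reproduced roughly as follows. This is a minimal sketch, assuming the `carsmall` sample data set that ships with the Statistics Toolbox; the choice of the first nine cars and the standardization options are illustrative assumptions, not taken from the slides.

```matlab
% Exploratory multivariate plots on the carsmall sample data (assumed data set).
load carsmall
X = [MPG Acceleration Displacement Weight Horsepower];
varNames = {'MPG','Acceleration','Displacement','Weight','Horsepower'};

keep  = all(~isnan(X), 2);            % drop rows with missing values
X     = X(keep, :);
names = cellstr(Model(keep, :));      % car names for the remaining rows

figure; plotmatrix(X);                                        % pairwise scatter plots
figure; parallelcoords(X, 'Standardize', 'on', 'Labels', varNames);
figure; andrewsplot(X, 'Standardize', 'on');                  % one curve f(t) per car
figure; glyphplot(X(1:9,:), 'ObsLabels', names(1:9));         % star glyphs (default)
figure; glyphplot(X(1:9,:), 'Glyph', 'face', 'ObsLabels', names(1:9));  % Chernoff faces
```

Standardizing puts variables with very different ranges (for example Weight and MPG) on a comparable scale before plotting.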
Machine Learning
Type of Learning: Categories of Algorithms
Unsupervised Learning: Clustering
Supervised Learning: Classification, Regression
(Supervised learning: develop predictive model based on both input and output data)
Unsupervised Learning
Clustering
K-means, Fuzzy K-means
Hierarchical
Neural Network
Gaussian Mixture
Clustering
Overview
What is clustering?
Segment data into groups,
based on data similarity
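As an illustration of segmenting data by similarity, here is a minimal k-means sketch. The data is synthetic, and the number of groups `k` and the `'Replicates'` setting are assumptions chosen for the example.

```matlab
% Minimal k-means sketch on synthetic 2-D data (all values are made up).
rng(1);                                    % fix the random seed for repeatability
X = [randn(50,2)*0.05 + 0.2;               % blob around (0.2, 0.2)
     randn(50,2)*0.05 + 0.6];              % blob around (0.6, 0.6)
k = 2;                                     % assumed number of groups
[idx, C] = kmeans(X, k, 'Replicates', 5);  % idx: group per row, C: centroids
gscatter(X(:,1), X(:,2), idx);             % plot each point marked by its group
```

Running several replicates and keeping the best solution is one common way to reduce the risk of ending in a poor local minimum.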
[Figure: scatter plot of the data, then the same data segmented into Group1 to Group8]
Multi-dimensional dataset
Statistics Toolbox
Cosine Distance: useful for clustering variables
Euclidean Distance: default
>> silhouette(data,clusters)
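The effect of the distance metric can be checked with silhouette values. The sketch below uses synthetic data and hypothetical variable names (`clustersCos`, `sCos`); note that the default metric of both `kmeans` and `silhouette` is squared Euclidean distance.

```matlab
% Compare the default and cosine distances on the same made-up data.
rng(2);
data = [randn(40,3) + 3; randn(40,3) - 3];            % two synthetic groups

clusters = kmeans(data, 2);                            % default: squared Euclidean
s = silhouette(data, clusters);                        % one value per observation

clustersCos = kmeans(data, 2, 'Distance', 'cosine');   % cluster by direction
sCos = silhouette(data, clustersCos, 'cosine');

fprintf('mean silhouette: %.2f (default), %.2f (cosine)\n', mean(s), mean(sCos));
```

Silhouette values lie between -1 and 1; values near 1 indicate points that sit well inside their assigned cluster.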
Clustering
Neural Network
Outputs are computed by applying a nonlinear transfer function to the weighted sum of the inputs
[Diagram: input variables, weights, bias, transfer function, output variable]
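The computation of a single neuron can be written out in a few lines of plain MATLAB. This is a sketch with made-up numbers; the choice of `tanh` as the transfer function is an assumption for illustration.

```matlab
% Single artificial neuron: output = f(w*x + b). All values are illustrative.
x = [0.5; -1.0; 2.0];      % input variables (column vector)
w = [0.4, 0.3, -0.2];      % weights, one per input
b = 0.1;                   % bias
n = w * x + b;             % weighted sum of inputs plus bias (about -0.4 here)
y = tanh(n);               % nonlinear transfer function applied to the sum
```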
Clustering
Neural Network
Multi-layered networks
created by cascading
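One way to read "cascading" is that the outputs of one layer become the inputs of the next. The following sketch shows a two-layer forward pass under that assumption; the layer sizes, weights, and the linear output layer are all hypothetical.

```matlab
% Two-layer feedforward pass; sizes and values are illustrative assumptions.
x  = [0.5; -1.0];                    % 2 input variables
W1 = [0.1 0.4; -0.3 0.2; 0.5 0.5];   % layer 1: 3 neurons, 2 inputs each
b1 = [0; 0.1; -0.1];                 % layer 1 biases
W2 = [0.7 -0.2 0.3];                 % layer 2: 1 neuron, 3 inputs
b2 = 0.05;                           % layer 2 bias

h = tanh(W1 * x + b1);               % hidden layer output feeds the next layer
y = W2 * h + b2;                     % linear output layer
```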
[Figures: plots of Weight 2 versus Weight 1]
Statistics Toolbox
Cluster Analysis
Summary
No method is perfect (depends on data)
Process is iterative; explore different algorithms
Beware of local minima
K-means, Fuzzy K-means
Hierarchical
Neural Network
Gaussian Mixture
Exploration
Modeling
Evaluation
Deployment
Supervised Learning
Classification for Predictive Modeling
Develop predictive model based on both input and output data
Decision Tree
Ensemble Method
Neural Network
Support Vector Machine
Classification
Overview
What is classification?
Predicting the best group for each point
Learns from labeled observations
Uses input features
[Figure: scatter plot of points labeled Group1 to Group8]
Example Classification
Decision Trees
Statistics Toolbox
Ensemble Learners
Statistics Toolbox
Overview
[Figure: scatter plot of group1 to group8 on axes x1 and x2]
Decision Trees
Statistics Toolbox
Evaluate the
>> tree(x_new)
[Figure: scatter plot of group1 to group8 on axes x1 and x2]
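A trained tree can be evaluated on new data as sketched below. This assumes a recent release where `fitctree` and `predict` are available, and uses the `fisheriris` sample data; the observation `x_new` is hypothetical. The `tree(x_new)` call shown on the slide belongs to the older `classregtree` interface.

```matlab
% Train a classification tree and evaluate it on a new observation.
load fisheriris                      % sample data: meas (150x4), species (labels)
tree  = fitctree(meas, species);     % learn from labeled observations
x_new = [5.9 3.0 5.1 1.8];           % a hypothetical new observation (1x4)
label = predict(tree, x_new);        % predicted group for x_new (cell array)
disp(label{1});
```

`view(tree, 'Mode', 'graph')` displays the fitted tree, which can help when checking which features drive the splits.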
Statistics Toolbox
[Figure: scatter plot of group1 to group8 on axes x1 and x2]
Documentation helps you choose an appropriate algorithm for your particular problem
Statistics Toolbox
(as of R2013a)
Overview
[Figure: classification plot with the Support Vectors marked]
Classification
Summary
Decision Tree
Ensemble Method
Neural Network
Support Vector Machine
Supervised Learning
Regression for Predictive Modeling
Develop predictive model based on both input and output data
Linear
Non-linear
Non-parametric
Regression
Statistics Toolbox
Curve Fitting Toolbox
Linear Regression
Common examples:
Straight line
$y = \beta_0 + \beta_1 x_1$
Plane
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Polynomial
Polynomial with cross terms
$y = \beta_0 + \beta_1 x_1^2 + \beta_2 (x_1 x_2) + \beta_3 x_2^2$
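A plane model of the form $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ can be fitted as in the sketch below. The data and the "true" coefficients are made up; `fitlm` assumes a release of the Statistics Toolbox that includes it, while the backslash solution needs no toolbox.

```matlab
% Fit a plane y = b0 + b1*x1 + b2*x2 to synthetic data.
rng(3);
x1 = rand(100,1);
x2 = rand(100,1);
y  = 1 + 2*x1 - 3*x2 + 0.01*randn(100,1);  % assumed coefficients [1 2 -3] plus noise

X    = [ones(100,1) x1 x2];                % design matrix with intercept column
beta = X \ y;                              % least-squares estimate of [b0; b1; b2]

mdl = fitlm([x1 x2], y);                   % same model via fitlm
disp(mdl.Coefficients.Estimate');          % should closely match beta'
```

The model is called linear because it is linear in the coefficients, which is why the polynomial forms also fall under linear regression.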
Nonlinear Regression
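$y = \beta_0 + \beta_1 \cos(\beta_3 x) + \beta_2 \sin(\beta_3 x)$
y ~ b0 + b1*cos(x*b3) + b2*sin(x*b3)
Exponential Growth
$y = \beta_0 e^{\beta_1 t}$
@(b,t)(b(1)*exp(b(2)*t))
Logistic Growth
$y = \dfrac{\beta_0}{1 + \beta_1 e^{(\ldots)}}$
@(b,t)(1/(b(1) + exp(b(2)*t)))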
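An exponential growth model written as an anonymous function can be fitted with `fitnlm`, as sketched here. The data, the starting values `beta0`, and the noise level are assumptions made for the example, and `fitnlm` requires a release that provides it.

```matlab
% Fit y = b(1)*exp(b(2)*t) to synthetic data with fitnlm.
rng(4);
t = (0:0.1:2)';
y = 2*exp(1.5*t) .* (1 + 0.01*randn(size(t)));  % made-up data, true b = [2 1.5]

model = @(b,t) b(1)*exp(b(2)*t);                % model function of coefficients b
beta0 = [1; 1];                                 % assumed starting guess
mdl   = fitnlm(t, y, model, beta0);             % iterative nonlinear least squares
bHat  = mdl.Coefficients.Estimate;              % fitted coefficients
```

Unlike linear regression, the fit is iterative, so a poor starting guess can lead to slow convergence or a different local solution.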
Logistic regression
Response variable is binary (true / false)
Results are typically expressed as an odds ratio
Poisson regression
Model count data (non-negative integers)
Response variable comes from a Poisson distribution
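Both model types can be fitted with `fitglm` by choosing the response distribution. The sketch below uses synthetic data with assumed coefficients; variable names such as `logitMdl` and `poisMdl` are hypothetical.

```matlab
% Logistic and Poisson regression on synthetic data with fitglm.
rng(5);
x = linspace(-3, 3, 200)';

% Logistic regression: binary response
p        = 1 ./ (1 + exp(-(0.5 + 2*x)));                   % assumed true probabilities
yBin     = double(rand(200,1) < p);                        % 0/1 outcomes
logitMdl = fitglm(x, yBin, 'Distribution', 'binomial');    % logit link by default
oddsRatio = exp(logitMdl.Coefficients.Estimate(2));        % odds ratio per unit of x

% Poisson regression: count response
counts  = poissrnd(exp(0.3 + 0.8*x));                      % non-negative integers
poisMdl = fitglm(x, counts, 'Distribution', 'poisson');    % log link by default
```

Exponentiating a logistic regression coefficient gives the multiplicative change in the odds for a one-unit increase in the predictor, which is why results are often reported as odds ratios.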
Interactive environment
Visual tools for exploratory data analysis
Easy to evaluate and choose best algorithm
Apps available to help you get started
(e.g., neural network tool, curve fitting tool)
Multivariate Classification in
the Life Sciences
Classification
with MATLAB
Regression
with MATLAB