Introduction to Artificial Intelligence (AI)
4. Learning Algorithms
Motivation
Real-world example:
• Fish packing plant: separate sea bass from salmon using optical sensing
• Features: physical differences such as length, lightness, width, number and shape of fins, position of the mouth
• Noise: variations in lighting, position of the fish on the conveyor, "static" due to the electronics
Motivation
Histograms for the length feature for the two categories
Motivation
Histograms for the lightness feature for the two categories
Decision boundary
The two features of lightness and width for sea bass and salmon
How would our system automatically determine the decision boundary?
Loss
Loss is a function of the error over the training data.
Error is the difference between a single actual value and a single predicted value.
Loss
Regression loss functions
Mean squared loss (RMSE) – Python code

import numpy as np

def rmse(predictions, targets):
    differences = predictions - targets
    differences_squared = differences ** 2
    mean_of_differences_squared = differences_squared.mean()
    rmse_val = np.sqrt(mean_of_differences_squared)
    return rmse_val
Mean absolute loss (MAE) – Python code

import numpy as np

def mae(predictions, targets):
    differences = predictions - targets
    absolute_differences = np.absolute(differences)
    mean_absolute_differences = absolute_differences.mean()
    return mean_absolute_differences
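As a quick sanity check (the prediction and target values below are made up for illustration), both loss functions can be applied to small NumPy arrays:

import numpy as np

predictions = np.array([1.0, 2.5, 0.5])   # hypothetical model outputs
targets = np.array([1.0, 2.0, 1.0])       # hypothetical ground-truth values

print(rmse(predictions, targets))   # ≈ 0.408
print(mae(predictions, targets))    # ≈ 0.333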
Learning algorithms
1950s - 1970s: Early Foundations
• 1957: Perceptron (Frank Rosenblatt) – one of the earliest neural networks, designed for binary classification.
• 1960s: K-nearest neighbors (KNN) – a simple instance-based learning method developed for classification tasks.
• 1969: Strassen's algorithm for matrix multiplication.
1980s: The Rise of Neural Networks
• 1980: Multi-layer perceptron training by backpropagation – developed by Paul Werbos, later popularized in the 1980s for training neural networks.
1990s: Advancements in Ensemble Methods and Optimization
• 1995: Random Forest (Leo Breiman) – a decision tree-based ensemble learning technique that reduces overfitting.
• 1995: Support Vector Machines gain practical relevance with the advent of kernel methods.
(All of these algorithms are implemented on the CPU.)
Learning algorithms
2000s: Kernel Methods and Probabilistic Models
• 2001: AdaBoost – an adaptive boosting method developed by Yoav Freund and Robert Schapire.
• 2006: NVIDIA releases CUDA.
• 2009: Andrew Ng utilizes GPUs to accelerate the training of large neural networks.
2010s: Deep Learning Revolution
• 2012: AlexNet (Krizhevsky et al.) – a deep convolutional neural network that won the ImageNet competition, leading to breakthroughs in computer vision.
• 2014: Generative Adversarial Networks (GANs) (Ian Goodfellow et al.) – introduced a new framework for generating synthetic data through adversarial learning.
• 2017: Transformers (Vaswani et al.) – revolutionized natural language processing (NLP) by eliminating the need for recurrent neural networks.
2020s: Scalable AI and Further Innovations
• 2020: GPT-3 (OpenAI) – a large-scale transformer-based model demonstrating significant progress in language understanding and generation.
Random forest
Adaboost
'Boosting': a family of algorithms which converts weak learners (e.g., decision stumps or decision trees) into strong learners.

$H(x) = \mathrm{sign}\left(\sum_{i=1}^{n} \alpha_i h_i(x)\right)$

$h_i(x)$: the weak learners
$\alpha_i$: the weight of learner $i$
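As a rough illustration (not the slides' own code; the stumps and weights below are hypothetical), the final classifier is just this weighted vote of weak learners:

import numpy as np

def stump_predict(X, feature, threshold, polarity):
    # A decision stump h_i(x): +1 or -1 depending on one thresholded feature
    return polarity * np.where(X[:, feature] > threshold, 1, -1)

def adaboost_predict(X, stumps, alphas):
    # H(x) = sign( sum_i alpha_i * h_i(x) )
    votes = sum(a * stump_predict(X, *s) for s, a in zip(stumps, alphas))
    return np.sign(votes)

# Hypothetical ensemble: (feature index, threshold, polarity) and weights alpha_i
stumps = [(0, 0.5, 1), (1, -0.2, -1), (0, 1.0, 1)]
alphas = [0.8, 0.4, 0.3]
X = np.array([[0.7, 0.3], [-0.3, -0.5]])
print(adaboost_predict(X, stumps, alphas))   # e.g. [ 1. -1.]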
Adaboost
Adaboost
Weak learners for image recognition
Haar filters
Common Haar features; there are 160,000+ possible features associated with each 24 × 24 window.
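A hedged sketch of how a single rectangle feature could be evaluated efficiently with an integral image (the window and rectangle coordinates here are invented for illustration):

import numpy as np

def integral_image(img):
    # Cumulative sums over rows and columns: any rectangle sum becomes O(1)
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1, c0:c1] recovered from the integral image
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

window = np.random.rand(24, 24)                 # one 24 x 24 detection window
ii = integral_image(window)
# A two-rectangle Haar-like feature: top half minus bottom half of the window
feature = rect_sum(ii, 0, 0, 12, 24) - rect_sum(ii, 12, 0, 24, 24)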
Cascade filter
Cascade filter
Prepare data:
• Positive images: images which contain the target object
• Negative images: images which do not contain the target object
A ratio of 2:1 or higher between negative and positive samples is considered acceptable.
Cascade filter
Biological Neurons
◉ A typical biological neuron is composed of:
○ A cell body;
○ Dendrites: input channels
○ Axon: output cable; it usually branches.
Biological Neurons
◉ The major job of a neuron:
○ It receives information, usually in the form of electrical pulses, from many other neurons.
○ It sums these inputs in a complex dynamic way.
○ It sends out information in the form of a stream of electrical impulses down its axon and on to many other neurons.
○ The connections (synapses) are crucial for excitation, inhibition or modulation of the cells.
○ Learning is possible by adjusting the synapses!
How can we build a mathematical model of the neuron?
Model of a neuron
◉ Simplest model
inputs
outputs
System
Model of a neuron
Input x → System → output y

Relationship: $y = \sum_{i=1}^{m} x_i$

But:
• The neuron only fires when it is sufficiently excited
• The firing rate has an upper bound
Model of a neuron
◉ Modified model:
○ b: threshold (bias) → the neuron will not fire until its input is "high" enough.
Based upon this model, is it possible for the inputs to inhibit the activation of the neuron?
The synaptic weights!
Model of a neuron
$u_k = \sum_{i=1}^{m} w_i x_i$
$v_k = u_k + b_k$
$y_k = \varphi(u_k + b_k)$
Model of a Neuron
◉ Three basic components for the model of a neuron:
○ A set of synapses or connecting links: characterized by a weight or
strength of its own.
○ An adder for summing the input signals, weighted by the
respective synapses of the neuron (a linear combiner).
○ An activation function for limiting the amplitude of the neuron
output
◉ Mathematical model:
$u_k = \sum_{i=1}^{m} w_i x_i$
$y_k = \varphi(u_k + b_k)$
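A minimal NumPy sketch of this model (the weights, bias, and the choice of the logistic function as φ are arbitrary illustration values):

import numpy as np

def neuron(x, w, b):
    u = np.dot(w, x)                   # linear combiner: u_k = sum_i w_i x_i
    v = u + b                          # induced local field: v_k = u_k + b_k
    return 1.0 / (1.0 + np.exp(-v))    # activation: y_k = phi(v_k), here logistic

x = np.array([0.5, -1.0, 2.0])         # hypothetical inputs
w = np.array([0.2, 0.4, 0.1])          # hypothetical synaptic weights
b = -0.3                               # bias
print(neuron(x, w, b))                 # ≈ 0.40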
Type of Activation (Squash) Functions
◉ Threshold function
(McCulloch-Pitts model -
1943)
◉ Piecewise-linear function
Type of Activation (Squash) Functions
◉ Logistic function:
◉ Hyperbolic tangent
function:
Type of Activation (Squash) Functions
◉ Gaussian functions
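The squash functions above can be sketched as follows (slope and width parameters are placeholder values, not from the slides):

import numpy as np

def threshold(v):                      # McCulloch-Pitts: 1 if v >= 0, else 0
    return np.where(v >= 0, 1.0, 0.0)

def piecewise_linear(v):               # linear in the middle, clipped to [0, 1]
    return np.clip(v + 0.5, 0.0, 1.0)

def logistic(v, a=1.0):                # 1 / (1 + exp(-a v)), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-a * v))

def tanh_act(v):                       # hyperbolic tangent, output in (-1, 1)
    return np.tanh(v)

def gaussian(v, sigma=1.0):            # bell-shaped response centred at v = 0
    return np.exp(-(v ** 2) / (2 * sigma ** 2))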
Network Architectures
◉ Network architecture defines how nodes are connected.
Learning in neural networks
◉ Learning is a process by which the free parameters of a
neural network are adapted through a process of
stimulation by the environment in which the network is
embedded.
◉ Process of learning:
○ The NN is stimulated by an environment
○ The NN undergoes changes in its free parameters as a result of
this stimulation.
○ The NN responds in a new way to the environment because of the
changes that have occurred in its internal structure.
How can the network adjust the weights?
Simplest neural network: Perceptron
◉ Perceptron is built around the McCulloch-Pitts model.
Perceptron
◉ Goal: to correctly classify the set of externally applied stimuli x1, x2, …, xm into one of two classes, C1 and C2.
The input vector: $\mathbf{x}(n) = [+1, x_1(n), x_2(n), \ldots, x_m(n)]^T$
The weight vector: $\mathbf{w}(n) = [b(n), w_1(n), w_2(n), \ldots, w_m(n)]^T$
where n denotes the iteration step.
Perceptron
◉ Output of the neuron: $y(n) = \varphi\big(\mathbf{w}^T(n)\,\mathbf{x}(n)\big)$, where $\varphi$ is the threshold function.
◉ What is the decision boundary?
Decision boundary
◉ m = 1: ?
◉ m = 2: ?
◉ m = 3: ?
◉ How to choose the proper weights?
Selection of weights
Two basic methods can be employed to select a suitable weight vector:
◉ By off-line calculation of weights (without learning).
○ Possible if the system is relatively simple.
◉ By a learning procedure.
○ The weight vector is determined from a given (training) set of input-output vectors (exemplars) in such a way as to achieve the best classification of the training vectors.
Off-line calculation of weights
Example: Truth table of NAND
Three points (0,0), (0,1) and (1,0) belong to one class, and (1,1) belongs to the other class.
The decision boundary is the straight line described by the following equation:
$x_1 + x_2 = 1.5$, or $-x_1 - x_2 + 1.5 = 0$, i.e., $\mathbf{w} = (1.5, -1, -1)$
Is the decision line unique for this problem?
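A small sketch that checks this weight vector against the NAND truth table, using augmented inputs (+1, x1, x2):

import numpy as np

w = np.array([1.5, -1.0, -1.0])            # (bias, w1, w2) from the example
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array([1.0, x1, x2])            # augmented input vector
    y = 1 if np.dot(w, x) > 0 else 0       # threshold activation
    print((x1, x2), '->', y)               # reproduces NAND: 1, 1, 1, 0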
Perceptron Learning
◉ If C1 and C2 are linearly separable, there exists a weight vector w such that:
$\mathbf{w}^T\mathbf{x} > 0$ for every input vector $\mathbf{x}$ belonging to class C1
$\mathbf{w}^T\mathbf{x} \le 0$ for every input vector $\mathbf{x}$ belonging to class C2
◉ Given a training set $X = \{(\mathbf{x}(i), d(i))\}$, where $d(i)$ is the desired label.
◉ Training target: find a weight vector w such that the perceptron can correctly classify the training set X.
Perceptron Learning
◉ Feed a pattern x to the perceptron with weight vector w; it will produce a binary output y (1 or 0). First consider the case $\mathbf{w}^T\mathbf{x} \le 0$, i.e., y = 0.
◉ If the correct label (all the labels of the training samples are known) is d = 0, should we update the weights?
◉ If the desired output is d = 1, assume the new weight vector is $\mathbf{w}' = \mathbf{w} + \Delta\mathbf{w}$; then we want $\mathbf{w}'^T\mathbf{x} > \mathbf{w}^T\mathbf{x}$.
◉ But how to choose Δw?
Perceptron Learning
◉ If the true label is d = 1 and the perceptron makes a mistake (it outputs y = 0), its synaptic weights are adjusted by:
$\mathbf{w}(n+1) = \mathbf{w}(n) + \eta\,\mathbf{x}(n)$
Perceptron Learning
◉ Now consider the case $\mathbf{w}^T\mathbf{x} > 0$, i.e., y = 1.
◉ We only adjust the weights when the perceptron makes a mistake (d = 0).
◉ If the true label is d = 0 and the perceptron makes a mistake, its synaptic weights are adjusted by:
$\mathbf{w}(n+1) = \mathbf{w}(n) - \eta\,\mathbf{x}(n)$
Perceptron Learning
◉ To unify this algorithm:
○ Consider the error signal e = d − y
○ The error signal when d = 1: e = 1 − 0 = 1
○ The error signal when d = 0: e = 0 − 1 = −1
◉ Then the two update rules combine into:
$\mathbf{w}(n+1) = \mathbf{w}(n) + \eta\,e(n)\,\mathbf{x}(n)$
Perceptron Learning
◉ Algorithm Perceptron
Start with a randomly chosen weight vector w(1);
while there exist input vectors that are misclassified by w(n) do
    Let x(n) be a misclassified input vector;
    Update the weight vector to w(n+1) = w(n) + η e(n) x(n);
    Increment n;
end-while
Perceptron Learning
◉ Example: let us consider a simple classification problem where the input space is one-dimensional, i.e., a real line:
○ Class 1 (d = 1): x = 0.5, 2
○ Class 2 (d = 0): x = −1, −2
◉ Solution:
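A compact sketch of the learning loop applied to this one-dimensional example (the learning rate, initialization, and epoch limit are illustrative choices, not the slides' values):

import numpy as np

def train_perceptron(X, d, eta=1.0, max_epochs=100):
    # X: inputs with a leading +1 column; d: desired labels in {0, 1}
    w = np.zeros(X.shape[1])                 # w(1); random initialization also works
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, d):
            y = 1 if np.dot(w, x) > 0 else 0
            e = target - y                   # error signal e = d - y
            if e != 0:
                w = w + eta * e * x          # w(n+1) = w(n) + eta e(n) x(n)
                mistakes += 1
        if mistakes == 0:                    # every training vector is classified correctly
            break
    return w

X = np.array([[1, 0.5], [1, 2.0], [1, -1.0], [1, -2.0]])   # augmented inputs (+1, x)
d = np.array([1, 1, 0, 0])
print(train_perceptron(X, d))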
Perceptron Learning
Perceptron Convergence Theorem
◉ Perceptron Convergence Theorem:
If C1 and C2 are linearly separable, after a finite number
of steps, the weights stop changing
Multilayer Perceptrons
◉ Multilayer perceptrons (MLPs)
○ Generalization of the single-layer perceptron
◉ Consist of
○ An input layer
○ One or more hidden layers of computation nodes
○ An output layer of computation nodes
◉ Architectural graph of a multilayer perceptron with two hidden layers:
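A minimal sketch of the forward pass through such a network with two hidden layers (layer sizes, random weights, and the logistic activation are arbitrary illustration choices):

import numpy as np

def layer(x, W, b):
    # one layer of computation nodes: phi(W x + b), with the logistic squash function
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

rng = np.random.default_rng(0)
sizes = [4, 5, 3, 1]                        # input, two hidden layers, output
Ws = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

x = rng.standard_normal(4)                  # a hypothetical input vector
for W, b in zip(Ws, bs):
    x = layer(x, W, b)                      # propagate layer by layer
print(x)                                    # network output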
Backpropagation
Data Augmentation
Data augmentation is a technique used to create new artificial data from already existing data sets.
Motivation
Underfitting: the model is too simple and performs poorly on both the training data and the testing data.
Reasons:
• Low variation of the data and a highly biased model.
• The model developed cannot handle complex data.
• Small size of the training dataset.
• Training data of poor quality, containing noise.
Overfitting: the model works well with the training data but performs poorly with the testing data; it has started to pick up noise and incorrect data entries.
Reasons:
• High variation of the data and low bias.
• The model created is too complex and advanced.
• The size of the training data is high.
Data Augmentation
Data augmentation methods:
• Geometric Transformation: flipping, cropping, rotating, zooming
• Color Transformation: brightness, darkness, sharpness, saturation, color augmentation
• AI Generative: Generative Adversarial Networks, Variational Auto-Encoders, Neural Style Transfer
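A small NumPy sketch of a few of the geometric and color transformations listed above (the image is just a random placeholder array):

import numpy as np

img = np.random.rand(24, 24)                 # placeholder grayscale image in [0, 1]

flipped = np.fliplr(img)                     # horizontal flip
rotated = np.rot90(img)                      # rotation by 90 degrees
cropped = img[2:22, 2:22]                    # crop a 20 x 20 region
zoomed = np.kron(cropped, np.ones((2, 2)))   # crude 2x zoom by pixel repetition
brighter = np.clip(img + 0.2, 0.0, 1.0)      # simple brightness shift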
Benefits of Neural Networks
◉ High computational power
○ Generalization: producing reasonable outputs for inputs not encountered during training (learning).
○ Has a massively parallel distributed structure.
◉ Useful properties and capabilities
○ Nonlinearity: most physical systems are nonlinear.
○ Adaptivity (plasticity): has a built-in capability to adapt the synaptic weights to changes in the environment.
○ Fault tolerance: if a neuron or its connecting links are damaged, the overall response may still be acceptable (due to the distributed nature of the information stored in the network).
Limitations of Neural Networks
◉ Fully connected → different from biological neurons
◉ Input size is enormous
◉ Cannot share weights
First convolutional neural network
AlexNet (2012)
CNN layers
Convolution layer
Activation layer
• If the activation derivative saturates at 0, the weights of the deeper layers remain unchanged during training (vanishing gradients)
Pooling layer
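To make the three layer types concrete, here is a hedged sketch of one convolution, a ReLU activation, and 2 x 2 max pooling applied to a small feature map (the filter values are made up):

import numpy as np

def conv2d(img, kernel):
    # 'valid' 2-D convolution (implemented as cross-correlation, as in most CNN libraries)
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # activation layer: negative responses are set to zero
    return np.maximum(x, 0)

def max_pool2x2(x):
    # pooling layer: keep the maximum of each non-overlapping 2 x 2 block
    H, W = x.shape
    x = x[:H - H % 2, :W - W % 2]
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

img = np.random.rand(8, 8)                        # placeholder input patch
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])      # hypothetical 2 x 2 filter
feature_map = max_pool2x2(relu(conv2d(img, kernel)))
print(feature_map.shape)                          # (3, 3)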