DEPARTMENT OF APPLIED MATHEMATICS, COMPUTER SCIENCE AND STATISTICS
HYPERPARAMETER OPTIMIZATION
Big Data Science (Master in Statistical Data Analysis)
PARAMETER OPTIMIZATION
̶ So far, we have talked about parameter optimization:
̶ Our model contains trainable parameters
̶ We define a loss function
̶ An optimization algorithm searches for the parameters that minimize the loss:
‒ Analytic solutions
‒ Newton-Raphson
‒ (Stochastic) gradient descent
‒ ...
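As a quick refresher, below is a minimal gradient-descent sketch for a toy least-squares problem (the data, learning rate and iteration count are all illustrative):

```python
import numpy as np

# Toy least-squares problem: the weights w are the trainable parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Gradient descent on the mean squared error loss.
w = np.zeros(3)
learning_rate = 0.1
for _ in range(500):
    gradient = 2.0 / len(y) * X.T @ (X @ w - y)  # gradient of the MSE w.r.t. w
    w -= learning_rate * gradient

print(w)  # close to w_true
```

Note that the learning rate and the number of iterations are not touched by the gradient update itself: they are fixed beforehand, which is exactly the situation described on the next slide.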
HYPERPARAMETER OPTIMIZATION
̶ Most models also have hyperparameters:
̶ Fixed before training the model
̶ Encode assumptions about the model
̶ Not accounted for in the gradient of the loss function
EXAMPLES OF HYPERPARAMETERS
Linear models:
• Regularization constant

Random Forest:
• Number of trees
• Maximum depth
• Minimum leaf size
• Criterion for split
• Number of features per split
• ...

SVM:
• Kernel
• Margin
• Kernel parameters: polynomial degree, Gaussian kernel width, ...

Neural networks:
• Architecture: number of layers, size of each layer
• Activation function
• Dropout
• Regularization
• ...

KNN:
• $K$
• Distance metric
• Parameters of approximate search structures
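To make this concrete, a brief sketch of how such hyperparameters are fixed before training, here as constructor arguments of scikit-learn estimators (the specific values are arbitrary):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Hyperparameters are passed to the constructor and stay fixed during fitting.
rf = RandomForestClassifier(
    n_estimators=500,        # number of trees
    max_depth=10,            # maximum depth
    min_samples_leaf=5,      # minimum leaf size
    criterion="gini",        # criterion for split
    max_features="sqrt",     # number of features per split
)
svm = SVC(kernel="rbf", C=1.0, gamma=0.1)  # kernel, margin penalty, kernel width
knn = KNeighborsClassifier(n_neighbors=7, metric="euclidean")  # K, distance metric
```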
CHOOSING HYPERPARAMETERS
̶ Manual search
̶ Grid search
̶ Random search
̶ Automated methods:
̶ Bayesian optimization
̶ Evolutionary optimization
MANUAL TUNING
̶ Using assumptions or knowledge to select the hyperparameters
̶ Pros:
̶ Computationally efficient
̶ Cons:
̶ Requires manual labor
̶ Prone to bias
̶ Limited combinations are tested
GRID SEARCH
̶ For each hyperparameter, define a subset of values that will be
tested
̶ Iteratively test all combinations
̶ Pros:
̶ The individual effect of parameters can be studied
̶ Cons:
̶ The number of combinations can become very high
̶ Few values are tested for every parameter
̶ The combined effect of parameters is not completely modeled
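A minimal grid-search sketch with scikit-learn's GridSearchCV (the dataset and the candidate values are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Subset of values per hyperparameter; all combinations are evaluated
# (3 x 3 = 9 here), each with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1e-1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

Here 3 × 3 = 9 combinations are evaluated; adding a third hyperparameter with 3 values would already triple that, which is why the number of combinations grows so quickly.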
RANDOM SEARCH
̶ A probability distribution is specified for each hyperparameter
̶ Samples are drawn and tested
̶ Pros:
̶ The combined effect of parameters is somewhat modeled
̶ More values per parameter can be considered
̶ Cons:
̶ The search is not guided
̶ The individual effect of parameters is not clear
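The same search written with RandomizedSearchCV, drawing each hyperparameter from a distribution instead of a fixed grid (again a sketch; the ranges and the budget n_iter are arbitrary):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# One distribution per hyperparameter; n_iter samples are drawn and evaluated.
param_distributions = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-4, 1e0)}
search = RandomizedSearchCV(SVC(kernel="rbf"), param_distributions,
                            n_iter=30, cv=5, random_state=0)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```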
GRID VS RANDOM
J. Bergstra, Y. Bengio, "Random Search for Hyper-Parameter Optimization", Journal of Machine Learning Research 13 (2012) 281-305
HYPERPARAMETER OPTIMIZATION AS AN OPTIMIZATION PROBLEM
AUTOMATED HYPERPARAMETER OPTIMIZATION
̶ Why not solve hyperparameter optimization in the
same way as parameter optimization?
̶ Main approaches:
̶ Bayesian optimization
̶ Evolutionary algorithms
SEQUENTIAL MODEL-BASED BAYESIAN OPTIMIZATION (SMBO)
1. Query the function $f$ at $t$ points and record the resulting pairs $S = \{(\boldsymbol{\theta}_i, f(\boldsymbol{\theta}_i))\}_{i=1}^{t}$
2. For a fixed number of iterations:
   1. Fit a probabilistic model $\mathcal{M}$ to the pairs in $S$
   2. Apply an acquisition function $a(\boldsymbol{\theta}, \mathcal{M})$ to select a promising input $\boldsymbol{\theta}$ to evaluate next
   3. Evaluate $f(\boldsymbol{\theta})$ and add $(\boldsymbol{\theta}, f(\boldsymbol{\theta}))$ to $S$
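A minimal SMBO sketch with a Gaussian-process surrogate and the expected-improvement acquisition function, minimizing a one-dimensional toy objective f that stands in for an expensive validation-loss evaluation (all names and settings are illustrative):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(theta):
    # Stand-in for an expensive objective, e.g. a cross-validated loss.
    return np.sin(3 * theta) + 0.1 * theta ** 2

rng = np.random.default_rng(0)
low, high = -3.0, 3.0

# 1. Query f at t initial points and record the pairs (theta_i, f(theta_i))
theta_obs = rng.uniform(low, high, size=5)
y_obs = np.array([f(t) for t in theta_obs])

for _ in range(20):
    # 2.1 Fit a probabilistic model M (here a Gaussian process) to S
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(theta_obs.reshape(-1, 1), y_obs)

    # 2.2 Acquisition function: expected improvement over a candidate grid
    cand = np.linspace(low, high, 1000).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    best = y_obs.min()
    z = (best - mu) / np.maximum(sigma, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)
    theta_next = cand[np.argmax(ei), 0]

    # 2.3 Evaluate f and add the new pair to S
    theta_obs = np.append(theta_obs, theta_next)
    y_obs = np.append(y_obs, f(theta_next))

print("best theta:", theta_obs[y_obs.argmin()], "loss:", y_obs.min())
```

Libraries such as scikit-optimize, Optuna or Hyperopt implement this loop with different surrogates and acquisition functions, so it rarely needs to be written by hand.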
GENETIC ALGORITHMS
̶ Applying the principles of natural selection to optimization
̶ Solutions are encoded as "chromosomes"
̶ A crossover operator combines two chromosomes into new ones
̶ A mutation operator introduces random mutations
1. Generate an initial population of solutions
2. For a number of generations:
1. Cross over solutions to increase the population size
2. Apply the mutation operator
3. Evaluate new solutions
4. Discard some "bad" solutions to maintain a "good" population
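A bare-bones genetic-algorithm sketch for two continuous hyperparameters; the fitness function is a toy stand-in for cross-validated performance, and the population size, mutation rate and number of generations are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(chrom):
    # Stand-in for cross-validated performance of a model built with the
    # hyperparameters in chrom; higher is better (maximum at (1, -2)).
    c, g = chrom
    return -(c - 1.0) ** 2 - (g + 2.0) ** 2

def crossover(a, b):
    # Uniform crossover: each gene is taken from one of the two parents.
    mask = rng.random(a.shape) < 0.5
    return np.where(mask, a, b)

def mutate(chrom, rate=0.3, scale=0.5):
    # Each gene is perturbed with probability `rate`.
    noise = rng.normal(scale=scale, size=chrom.shape)
    return np.where(rng.random(chrom.shape) < rate, chrom + noise, chrom)

# 1. Initial population of candidate hyperparameter settings
population = rng.uniform(-4, 4, size=(20, 2))

for _ in range(30):
    # 2.1-2.2 Crossover and mutation create offspring
    parents = population[rng.integers(len(population), size=(20, 2))]
    offspring = np.array([mutate(crossover(p, q)) for p, q in parents])
    # 2.3-2.4 Evaluate everyone and keep the best 20 solutions
    pool = np.vstack([population, offspring])
    scores = np.array([fitness(c) for c in pool])
    population = pool[np.argsort(scores)[-20:]]

print("best hyperparameters:", max(population, key=fitness))
```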
PARTITIONING
PARTITIONING FOR HYPERPARAMETER OPTIMIZATION
̶ Remember: NEVER TRAIN ON THE TEST SET
̶ This also holds when optimizing hyperparameters
TEST SET + CROSS VALIDATION
[Diagram: the data is first split into a training set and a held-out test set; the training set is then divided into cross-validation folds, each fold serving once as the validation set while the remaining folds are used for training.]
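In code this pattern usually looks as follows: hold out a test set, tune with cross-validation on the remaining data, and touch the test set only once at the very end (a scikit-learn sketch; dataset and grid are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Held-out test set: never used during hyperparameter tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Cross-validated search on the training portion only.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    cv=5)
search.fit(X_train, y_train)

# The test set is used exactly once, for the final performance estimate.
print(search.best_params_, search.score(X_test, y_test))
```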
NESTED CROSS VALIDATION
[Diagram: nested cross-validation; each outer fold serves once as the test set, and the remaining folds are split again into inner folds, each serving once as the validation set for hyperparameter tuning.]
NESTED CROSS VALIDATION: EXAMPLE
̶ 5 folds
̶ 3 classifiers: Logistic Regression, Random Forest, SVM
̶ We want to know which classifier is better suited to our problem
̶ We also want to optimize the hyperparameters of each classifier
̶ 3 inner folds for hyperparameter optimization
̶ The ultimate goal is to have a system in production doing real
predictions
NESTED CROSS VALIDATION: EXAMPLE
1. For each outer fold i in [1...5]:
   1. Validation set: fold i
   2. Training set: folds {1,2,3,4,5}\{i}
   3. Split the training set into 3 inner folds
   4. For each classifier $C$ in {LR, RF, SVM}:
      1. For each combination of hyperparameters $\theta_c$ for $C$:
         1. For each inner fold j in [1...3]:
            1. (Inner) validation set: fold j
            2. (Inner) training set: folds {1,2,3}\{j}
            3. Train classifier $C(\theta_c)$ on the inner training set
            4. Evaluate $C(\theta_c)$ on the inner validation set
         2. Calculate the average performance of $C(\theta_c)$ across the 3 inner folds
      2. Select the best-performing hyperparameters $\theta_c^{*(i)}$ for classifier $C$
      3. Evaluate $C(\theta_c^{*(i)})$ on the (outer) validation set
2. Calculate the average performance of each $C(\theta_c^{*(i)})$ across all outer validation folds
3. Select the best classifier $C^*$
4. Select $\theta_{C^*}^*$ as the optimal hyperparameters for $C^*$
5. Train $C^*(\theta_{C^*}^*)$ on the entire dataset

̶ Note that the best parameters $\theta_c^{*(i)}$ for each classifier depend on the outer fold that was used for training
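The same scheme can be written compactly with scikit-learn by nesting a hyperparameter search inside an outer cross-validation; a sketch under the example's 5 outer / 3 inner folds (dataset and hyperparameter grids are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)

candidates = {
    "LR": (LogisticRegression(max_iter=5000), {"C": [0.1, 1, 10]}),
    "RF": (RandomForestClassifier(random_state=0), {"n_estimators": [100, 300]}),
    "SVM": (SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}),
}

for name, (estimator, grid) in candidates.items():
    # Inner loop: hyperparameter search on the training folds.
    inner_search = GridSearchCV(estimator, grid, cv=inner_cv)
    # Outer loop: each outer fold is used once as the validation set.
    scores = cross_val_score(inner_search, X, y, cv=outer_cv)
    print(name, scores.mean())
```

The classifier with the best averaged outer score would then be re-tuned and retrained on the entire dataset, as in steps 3 to 5 above.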