ML m5_2


Cross-Validation and Resampling Methods

• Cross-validation is a technique used to check how well a machine learning model will
perform on new, unseen data. It’s especially useful when we don’t have a large dataset
to work with.

• It helps us estimate the model's accuracy and reliability by making the most of the
available dataset.

• Why is cross-validation needed?

• When building a machine learning model, we train it on a dataset and then test its
performance on a separate dataset. However, a single split can give a misleading
estimate and mask overfitting or underfitting, especially if the dataset is small.
Cross-validation helps address this by using the data more effectively.
• How does cross-validation work?

• Divide the Dataset: The dataset is divided into multiple smaller parts, called
folds.

• Train and Test: The model is trained on some folds and tested on the remaining
ones. This process is repeated multiple times:

• In each iteration, a different fold is used as the test set, and the remaining
folds are used for training.

• Average the Results: The performance from all the test folds is averaged to get a
final evaluation score.
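The three steps above (divide, train/test in rotation, average) can be sketched directly. This is a minimal illustration, not a library implementation; `train_and_score` is a hypothetical user-supplied function that trains on one list and returns a score on the other.

```python
import random

def k_fold_cv(data, k, train_and_score, seed=0):
    """Divide `data` into k folds; each fold serves once as the test set.
    The scores from all k iterations are averaged into one estimate."""
    data = list(data)
    random.Random(seed).shuffle(data)       # shuffle once before folding
    folds = [data[i::k] for i in range(k)]  # k roughly equal parts
    scores = []
    for i in range(k):
        test = folds[i]                     # fold i is the test set
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        scores.append(train_and_score(train, test))
    return sum(scores) / k                  # final averaged score
```

With 10 instances and k = 5, each iteration trains on 8 instances and tests on the remaining 2.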
• Stratification is a technique used in data splitting to ensure that the proportions
of different groups or classes in the dataset are maintained in every subset, such
as training and validation sets. It ensures the data's structure remains balanced
and representative of the original dataset.
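A stratified split can be sketched by grouping instances by class and sampling the same fraction from each group. This is an illustrative sketch, assuming labeled two-way splits; the function name is made up for this example.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, test_frac=0.2, seed=0):
    """Split into train/test while preserving per-class proportions."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)               # group instances by class
    rng = random.Random(seed)
    train, test = [], []
    for y, xs in by_class.items():
        rng.shuffle(xs)
        n_test = round(len(xs) * test_frac)  # same fraction per class
        test += [(x, y) for x in xs[:n_test]]
        train += [(x, y) for x in xs[n_test:]]
    return train, test
```

For a dataset with 80 instances of one class and 20 of another, a 25% test split contains 20 and 5 instances of the two classes, keeping the original 4:1 ratio in both subsets.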

• There are several types of cross-validation techniques, each designed to handle
different scenarios or data structures. Some of them are:

• K-Fold Cross-Validation
• 5 × 2 Cross-Validation

• Bootstrapping
K-Fold Cross-Validation

• In K-fold cross-validation, the dataset is divided randomly into K equal-sized parts; in
each of K iterations, one part is used as the validation set and the remaining K − 1 parts
form the training set.

• There are two problems with this approach.

• First, to keep the training set large, we allow validation sets that are small.

• Second, the training sets overlap considerably, namely, any two training sets share K − 2 parts.

• One extreme case of K-fold cross-validation is leave-one-out where given a dataset of N instances,
only one instance is left out as the validation set (instance) and training uses the N − 1 instances.
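The leave-one-out extreme can be sketched as K-fold with K = N: each of the N rounds holds out exactly one instance. As above, `train_and_score` is a hypothetical evaluation function supplied by the caller.

```python
def leave_one_out(data, train_and_score):
    """Leave-one-out: N rounds, each leaving out a single instance
    for validation and training on the remaining N - 1 instances."""
    data = list(data)
    n = len(data)
    scores = []
    for i in range(n):
        test = [data[i]]                 # one held-out instance
        train = data[:i] + data[i + 1:]  # the other N - 1 instances
        scores.append(train_and_score(train, test))
    return sum(scores) / n
```

Note the two problems stated above at their extreme: the validation set shrinks to a single instance, and any two training sets share N − 2 instances.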
5 × 2 Cross-Validation

• In 5 × 2 cross-validation, the dataset is split into two halves, and this splitting is
repeated five times; in each replication, each half is used once for training and once
for validation, giving ten train/validation runs in total.
Bootstrapping
• Bootstrapping is a way to create multiple new datasets from a single original dataset.

• Instead of splitting the data like in cross-validation, bootstrap creates new samples by randomly
selecting data points from the original dataset with replacement.

• Replacement refers to the process of randomly selecting data points from the original sample such that
each data point can be chosen more than once.

• For example, if your original dataset contains the points [A, B, C, D], a bootstrap sample might look like [B,
B, D, A], where B is selected twice.

• In the bootstrap, we sample N instances from a dataset of size N with replacement. The original dataset
is used as the validation set.
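Drawing a single bootstrap sample is a one-liner over the original dataset. A minimal sketch, using the [A, B, C, D] example from above:

```python
import random

def bootstrap_sample(data, seed=0):
    """Sample N instances with replacement from a dataset of size N,
    so the same instance may appear more than once in the sample."""
    rng = random.Random(seed)
    data = list(data)
    return [rng.choice(data) for _ in range(len(data))]
```

Because instances can repeat, a bootstrap sample typically omits some of the original instances even though it has the same size as the original dataset.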
Measuring Classifier Performance
• For classification, especially for two-class problems, a variety of measures has been
proposed. There are four possible cases: a positive instance may be predicted positive
(true positive) or negative (false negative), and a negative instance may be predicted
positive (false positive) or negative (true negative).
• [Figure] Receiver Operating Characteristic (ROC) curves, commonly used to evaluate
the performance of classifiers in machine learning.

• [Figure] Precision and recall illustrated with Venn diagrams, as commonly used in
information retrieval and machine learning to evaluate classification models.
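The four cases and the measures built from them can be sketched as follows. This is an illustrative helper (the function name is made up), counting the confusion-matrix cells for a two-class problem and deriving precision, recall, and the false-positive rate plotted on the x-axis of an ROC curve.

```python
def binary_metrics(y_true, y_pred):
    """Count the four cases (TP, FP, TN, FN) for labels in {0, 1}
    and derive precision, recall, and false-positive rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are real
    recall = tp / (tp + fn) if tp + fn else 0.0     # of real positives, how many are found
    fpr = fp / (fp + tn) if fp + tn else 0.0        # ROC x-axis; recall is the y-axis
    return {"precision": precision, "recall": recall, "fpr": fpr}
```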
