The document discusses overfitting and underfitting in machine learning models. Overfitting occurs when a model performs well on training data but poorly on new data, often due to noise, while underfitting happens when a model fails to perform adequately on both training and new data due to insufficient or incorrect data. Solutions to overfitting include pruning, cross-validation, and regularization to improve model accuracy and generalization.
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0 ratings0% found this document useful (0 votes)
2 views
Introduction to Machine Learning
The document discusses overfitting and underfitting in machine learning models. Overfitting occurs when a model performs well on training data but poorly on new data, often due to noise, while underfitting happens when a model fails to perform adequately on both training and new data due to insufficient or incorrect data. Solutions to overfitting include pruning, cross-validation, and regularization to improve model accuracy and generalization.
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 11
Overfitting
Overfitting
• Refers to a model which performs very well w.r.t.
training set. Such model is able to predict the noise and randomness w.r.t. training set very well • But would not perform well when it comes to newly arrived data. • Occurs with nonparametric and nonlinear models. Underfitting
• Refers to a model that does not perform well
both with training set and newly arrived data. • An underfitted model would be directly rejected due to its performance via benchmark and accuracy. Reasons for underfitting occurrences
• Under fitting is said to happen when a model is
unable to capture or understand the nature of data. This has a direct impact on the accuracy of the model. • The reason for this could be: • Less data available to analyze • Wrong data Reasons for underfitting occurrences
• Wrong features considered
• It could be also be due to wrong formed model • The solution to under fitting is to make use of more data and possibly even clean data by removing unwanted features or columns. How under fitting looks when plotted: Does over fitting get affected by noise?
• Over fitting is said to happen when a model
gets easily affected by noise or outliers. • The details available in such a case are more, and as a result of this, model gets affected. The over fitting diagram when plotted: Over fitting can be overcome by:
1. Pruning: Pruning as the name suggests, is
cutting. We prune wanted data in addition to nodes that gets easily affected by wrong data. Over fitting can be overcome by:
2.Cross-Validation: Sample prediction error is one way
that helps on the problem of over fitting. This is generally accomplished with the help of k fold validation. In k fold validation, the original sample data is categorized into k subsets. One of the samples is used for testing, and remaining subsets are used to form the model. The output results which would be collected and then averaged out get the final estimation. Over fitting can be overcome by:
3. Regularization: The aim of this is to find
out features that align with the objective of the problem and thus removing features that do not contribute to the final output.