How To Improve Model
Training set: the data you run your learning algorithm on.
Dev (development) set: the data you use to tune parameters, select features, and make other decisions regarding the learning algorithm. Sometimes also called the hold-out cross-validation set.
But in the era of deep learning the split may even go down to 99/0.5/0.5. If the data size is 1,000,000, then 5,000 examples will still be in each of the dev and test sets.
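As a minimal sketch of such a 99/0.5/0.5 split (the array of indices stands in for real data; names and seed are illustrative):

```python
import numpy as np

# Shuffle indices for a dataset of 1,000,000 examples, then carve off
# 0.5% each for the dev and test sets; the remaining 99% is training data.
rng = np.random.default_rng(0)
n = 1_000_000
indices = rng.permutation(n)

n_dev = n_test = int(0.005 * n)          # 0.5% each -> 5,000 examples
dev_idx = indices[:n_dev]
test_idx = indices[n_dev:n_dev + n_test]
train_idx = indices[n_dev + n_test:]     # remaining 99% -> 990,000 examples
```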
Importance of Choosing dev and test sets wisely
For example, housing data coming from Mumbai while we are trying to predict house prices in Chandigarh. Otherwise we may waste a lot of time improving performance on the dev set, only to find out that it does not work well on the test set.
Sometimes we have only two partitions of the data; in that case they are called train/dev or train/test sets.
Using a single Evaluation Metric
You should be clear about what you are trying to achieve and what you are trying to tune.
Classifier | Precision | Recall
A          | 95        | 90
B          | 98        | 85
Recall – of the total true examples, how many have been correctly extracted.
Precision and Recall
• Total days – 30 (one month)
• Actual rain – 10
• Model predicts rain on 9 days (5 are correct and 4 are incorrect)
• Precision = 5/9 (How many selected item are relevant ? )
• Recall = 5/10 (How many relevant items are selected ? )
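The rain example above can be checked in a few lines (variable names are illustrative):

```python
# 10 actual rainy days; the model predicts rain on 9 days, of which 5 are correct.
true_positives = 5
predicted_positive = 9   # days the model said "rain"
actual_positive = 10     # days it actually rained

precision = true_positives / predicted_positive  # how many selected items are relevant
recall = true_positives / actual_positive        # how many relevant items are selected
# precision = 5/9 ≈ 0.556, recall = 5/10 = 0.5
```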
Classifier | Precision | Recall | F1 Score
A          | 95        | 90     | 92.4
B          | 98        | 85     | 91.0

F1 Score = harmonic mean of Precision and Recall:

F1 = 2 / (1/Precision + 1/Recall) = 2 · Precision · Recall / (Precision + Recall)
Optimize one metric and satisfy the others

Classifier | Accuracy | Running Time | Safety | False Positive | …
A          | 90       | 20 ms        | No     |                |
B          | 92       | 80 ms        | Yes    |                |
C          | 96       | 2000 ms      | Yes    |                |
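A sketch of this "optimize one, satisfy the rest" rule: among classifiers that meet a running-time constraint (the 100 ms threshold here is an assumption, not from the slides), pick the one with the highest accuracy.

```python
# Metrics from the table above.
classifiers = {
    "A": {"accuracy": 90, "time_ms": 20},
    "B": {"accuracy": 92, "time_ms": 80},
    "C": {"accuracy": 96, "time_ms": 2000},
}

# Satisficing metric: running time must be <= 100 ms (assumed threshold).
feasible = {k: v for k, v in classifiers.items() if v["time_ms"] <= 100}

# Optimizing metric: among the feasible classifiers, maximize accuracy.
best = max(feasible, key=lambda k: feasible[k]["accuracy"])
print(best)  # B — C is more accurate but violates the time constraint
```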
Modify input features based on insights from error analysis: create additional features that help the algorithm eliminate a particular category of errors. These new features could help with both bias and variance.
Reduce or eliminate regularization (L2 regularization, L1 regularization, dropout): reduces avoidable bias, but increases variance.
Add regularization (L2 regularization, L1 regularization, dropout): this technique reduces variance but increases bias.
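To make the L2 option concrete, here is a minimal sketch of adding an L2 penalty to a loss; the function name and the `lambda_` hyperparameter value are assumptions for illustration.

```python
import numpy as np

def l2_regularized_loss(base_loss, weights, lambda_=0.01):
    # Larger lambda_ shrinks the weights more: lower variance, higher bias.
    return base_loss + (lambda_ / 2) * sum(np.sum(w ** 2) for w in weights)

# Toy weight matrices: 4 + 2 = 6 squared-weight terms, each equal to 1.
w = [np.ones((2, 2)), np.ones((2, 1))]
print(l2_regularized_loss(1.0, w, lambda_=0.1))  # 1.0 + 0.05 * 6 = 1.3
```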
[Figure: neural network with inputs x1, x2, x3, x4 and output ŷ]
Dropout regularization: prevents overfitting

This technique has also become popular recently. We drop out some of the hidden units for specific training examples. Different hidden units may go off for different examples, and in different iterations of the optimization different units may be dropped randomly.

The dropout rate can also differ across layers. So we can select specific layers which have a higher number of units and may be contributing more towards overfitting; these layers are suitable for higher dropout rates.
[Figure: per-layer dropout rates, e.g. 0, 0.2, 0.2, 0, 0, 0]
Dropout
• Dropout also helps in spreading out the weights across all layers, as the network will be reluctant to put too much weight on any specific node. So it helps in shrinking the weights and has an adaptive effect on them.
• Dropout has a similar effect to L2 regularization in combating overfitting.
• We don't use dropout on test examples.
• We also need to bump up (rescale) the values at the output of each layer to compensate for the dropped units.
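The points above can be sketched as "inverted dropout" for one layer's activations: drop units with probability 1 − keep_prob during training, then divide by keep_prob so the expected activation is unchanged (the "bump up"), and do nothing at test time. Function and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(a, keep_prob=0.8, training=True):
    if not training:                       # no dropout at test time
        return a
    mask = rng.random(a.shape) < keep_prob  # keep each unit with prob keep_prob
    return (a * mask) / keep_prob           # rescale survivors by 1/keep_prob

a = np.ones((4, 3))
out = dropout_forward(a, keep_prob=0.8)
# Surviving units are scaled to 1/0.8 = 1.25; dropped units become 0.
```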
Early Stopping

[Figure: training error vs. # iterations]

Sometimes the dev set error goes down and then starts going up. So you may decide to stop training where the curve has started taking a different turn. By stopping halfway we also reduce the number of iterations needed to train, and hence the computation time.
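One common way to implement this is a patience counter: stop once the dev error has not improved for a fixed number of evaluations and keep the best iteration seen. The function name, the patience value, and the error sequence below are all made up for illustration.

```python
def early_stop_index(dev_errors, patience=2):
    """Return the index of the lowest dev error, stopping once it has
    failed to improve for `patience` consecutive evaluations."""
    best_err, best_i, waited = float("inf"), 0, 0
    for i, err in enumerate(dev_errors):
        if err < best_err:
            best_err, best_i, waited = err, i, 0
        else:
            waited += 1
            if waited >= patience:
                break          # dev error has "taken a different turn"
    return best_i

# Dev error falls, then rises again; we stop at its minimum (index 2).
dev_errors = [0.50, 0.40, 0.35, 0.37, 0.39, 0.42]
print(early_stop_index(dev_errors))  # 2
```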