4 - Training and Testing Classifier Models

The procedure of training and testing classifier models in machine learning typically
involves the following steps:
1. Data Preparation
2. Splitting the Data
3. Model Selection
4. Model Training
5. Hyperparameter tuning
6. Batch normalization
7. Model Evaluation
1. Data Preparation

• Split the dataset into features (independent variables) and labels
(dependent variable).

• Preprocess the data, including handling missing values, scaling features,
and encoding categorical variables (see the sketch after this list).
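As a minimal sketch of this step in Python with pandas and scikit-learn (the toy table, column names, and values are made up for illustration):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy dataset: a numeric feature with a missing value,
# a categorical feature, and the label.
df = pd.DataFrame({
    "age":   [25.0, 32.0, None, 51.0, 46.0],
    "city":  ["NY", "LA", "NY", "SF", "LA"],
    "label": [0, 1, 0, 1, 1],
})

# Features (independent variables) vs. label (dependent variable).
X = df.drop(columns=["label"])
y = df["label"]

# Numeric columns: fill missing values, then scale; categorical: one-hot encode.
numeric = Pipeline([("impute", SimpleImputer(strategy="mean")),
                    ("scale", StandardScaler())])
preprocess = ColumnTransformer([("num", numeric, ["age"]),
                                ("cat", OneHotEncoder(), ["city"])])

X_prepared = preprocess.fit_transform(X)
print(X_prepared.shape)  # 5 rows: 1 scaled numeric column + 3 one-hot columns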
2. Splitting the Data

• Divide the dataset into three subsets: training, validation and test.

a. Training: Up to 75 percent of the total dataset is used for training. The
model learns on the training set; in other words, this set is used to fit the
weights and biases that go into the model.

b. Validation: Between 15 and 20 percent of the data is used while the
model is being trained to evaluate initial accuracy, observe how the
model learns, and fine-tune hyperparameters. The model sees the validation
data but does not use it to learn weights and biases.

c. Test: Between 5 and 10 percent of the data is used for the final
evaluation. Because the model has never seen this data, the evaluation is
free of bias. (A two-stage split with these proportions is sketched below.)
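One way to realize those proportions is a two-stage split, sketched here with scikit-learn on synthetic data (the sizes and random_state values are illustrative, not prescriptive):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Carve off ~10% as the test set first, then split the remaining 90% so
# that validation ends up at ~15% of the original data (0.15 / 0.90).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.15 / 0.90, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 750 150 100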
3. Model Selection

• Choose a classifier model based on the problem requirements and characteristics of the data.

• Common classifier models include the following (one of each is instantiated in the sketch after this list):

1. logistic regression,

2. decision trees,

3. random forests,

4. support vector machines, and

5. neural networks.
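By way of illustration, each family above has a scikit-learn implementation; the settings shown here are starting points only, not recommendations:

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# One candidate per family; defaults or near-defaults throughout.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree":       DecisionTreeClassifier(),
    "random_forest":       RandomForestClassifier(n_estimators=100),
    "svm":                 SVC(),
    "neural_network":      MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
}

for name, model in candidates.items():
    print(name, type(model).__name__)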
4. Model Training

• Train the selected model on the training set.

• The model learns the patterns and relationships between features and
labels in the training data, as in the sketch below.
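A minimal training sketch, assuming the random-forest candidate from the previous step and synthetic data for self-containment:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

# fit() is the training step: the model learns feature-label relationships.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

print(clf.score(X_val, y_val))  # accuracy on held-out validation data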
5. Hyperparameter tuning

Hyperparameters can be imagined as settings that control the behavior of a
training algorithm. The algorithm learns its parameters from the data during
the training phase; human-adjustable hyperparameters govern how that learning
proceeds. The designer sets them based on theoretical considerations or tunes
them automatically (a tuning sketch follows the list below).

In the context of deep learning, examples of hyperparameters are:

1. Learning rate

2. Number of hidden units

3. Convolution kernel width

4. Regularization techniques
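A sketch of automatic tuning over some of the hyperparameters named above (learning rate, number of hidden units, and regularization strength), using scikit-learn's grid search on synthetic data; the grid values are illustrative:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Every combination in the grid is trained and scored with 3-fold
# cross-validation; the best-scoring setting is kept.
param_grid = {
    "learning_rate_init": [1e-3, 1e-2],
    "hidden_layer_sizes": [(16,), (64,)],
    "alpha": [1e-4, 1e-2],  # L2 regularization strength
}
search = GridSearchCV(MLPClassifier(max_iter=1000), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)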
6. Batch normalization

Two techniques, normalization and standardization, both aim to transform the data by
putting all the data points on the same scale in preparation for training.

The normalization process usually consists of scaling the numerical data down to a scale
from zero to one.

Standardization, on the other hand, usually consists of subtracting the dataset's
mean from each data point and then dividing the difference by the dataset's
standard deviation. That forces the standardized data to take on a mean of zero
and a standard deviation of one. Standardization is often referred to as
normalization; both involve putting data on some known or standard scale.
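The two scalings can be compared side by side; a minimal sketch with scikit-learn's MinMaxScaler (normalization) and StandardScaler (standardization) on made-up numbers:

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

data = np.array([[1.0], [5.0], [10.0], [20.0]])

# Normalization: rescale the values to the [0, 1] range.
normalized = MinMaxScaler().fit_transform(data)

# Standardization: subtract the mean, divide by the standard deviation.
standardized = StandardScaler().fit_transform(data)

print(normalized.ravel())                       # values between 0 and 1
print(standardized.mean(), standardized.std())  # ~0.0 and 1.0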
7. Model Evaluation

• Once a model has been trained, performance is gauged with a confusion
matrix and precision/accuracy metrics.
a. Confusion matrix

A confusion matrix describes the performance of a classifier model; for a
binary classifier it is a 2x2 grid of actual versus predicted classes.
Consider a simple classifier that predicts whether a patient has cancer or not. There are four
possible results:

• True positives (TP): The prediction was yes, and the patient does have cancer.
• True negatives (TN): The prediction was no, and the patient does not have cancer.
• False positives (FP): The prediction was yes, but the patient does not have cancer (also
known as a "Type I error").
• False negatives (FN): The prediction was no, but the patient does have cancer (also
known as a "Type II error").
A confusion matrix can also hold more than two classes per axis, with one row and one column per class.
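Sticking to the binary case, the four counts can be read off with scikit-learn's confusion_matrix; a minimal sketch with made-up screening labels (1 = cancer, 0 = no cancer):

from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = cancer, 0 = no cancer.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary 0/1 labels, rows are actual classes and columns are predicted
# classes, so ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1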
b. Precision / Accuracy

It is also useful to calculate precision and accuracy from the classifier's
predictions and the actual values.

Accuracy measures how often the classifier is correct over all observations.
Using the example counts TP = 100, TN = 50, FP = 10, and FN = 5 (165
observations in total), the calculation is (TP+TN)/total = (100+50)/165 = 0.91.

Precision measures how often the actual value is Yes when the prediction is Yes.
In this case, the calculation is TP/predicted yes = 100/(100+10) = 0.91.
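The same arithmetic spelled out in a few lines, using the counts from the worked example above:

# Counts from the worked example: TP=100, TN=50, FP=10, FN=5 (165 total).
tp, tn, fp, fn = 100, 50, 10, 5
total = tp + tn + fp + fn

accuracy = (tp + tn) / total  # (100 + 50) / 165 ≈ 0.91
precision = tp / (tp + fp)    # 100 / (100 + 10) ≈ 0.91
print(round(accuracy, 2), round(precision, 2))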
