AIML Assignment Report


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

JNANA SANGAMA, BELAGAVI-590018

AIML ASSIGNMENT (18CS71)


on
“IOT BASED RAILWAY TRACK FAULT DETECTION
USING ML ALGORITHMS”
Submitted in partial fulfilment of the requirements for the 7th semester
Bachelor of Engineering
in
Information Science and Engineering
Submitted by
RAKSHA H P 1BI20IS069
RAKSHITHA S 1BI20IS070
SANJANA S 1BI20IS080
RAKSHITHA Y 1BI21IS407

Under the guidance of


Dr. ANUPAMA K C
Assistant Professor
Dept. of ISE, BIT
Bangalore-04

DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING


BANGALORE INSTITUTE OF TECHNOLOGY
KR Road, V. V. Puram, Bengaluru, Karnataka-560004
2023-24
INTRODUCTION TO EVALUATION METRICS IN MACHINE
LEARNING

In machine learning, evaluation metrics are crucial for assessing the performance and effectiveness
of a model. For our crack detection project involving machine learning algorithms, various
evaluation metrics, including the confusion matrix, play a pivotal role in gauging the model's
accuracy, precision, recall, and overall effectiveness in identifying cracks on the toy train track.

1. Accuracy: A fundamental metric representing the ratio of correctly predicted instances to the
total number of instances in the dataset. However, it might not be sufficient for imbalanced datasets
or when the costs of false positives and false negatives vary significantly.

2. Precision: Indicates the model's ability to correctly identify positive instances (cracks) among all
instances predicted as positive. Precision focuses on minimizing false positives and is calculated as
the ratio of true positives to the sum of true positives and false positives.

3. Recall (Sensitivity): Reflects the model's capability to correctly detect all positive instances in
the dataset. It's calculated as the ratio of true positives to the sum of true positives and false
negatives, emphasizing the reduction of false negatives.

4. F1 Score: Harmonic mean of precision and recall, providing a balanced evaluation metric that
considers both precision and recall. It's useful when seeking a balance between false positives and
false negatives.
5. Confusion Matrix: A confusion matrix is a tabular representation of the model's predictions
against the actual ground truth labels. It consists of four values:
True Positives (TP): Actual cracks correctly predicted as cracks.
True Negatives (TN): Non-cracks correctly predicted as non-cracks.
False Positives (FP): Non-cracks incorrectly predicted as cracks (Type I error).
False Negatives (FN): Cracks incorrectly predicted as non-cracks (Type II error).
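The classification metrics defined above can be computed directly with scikit-learn. Below is a minimal sketch; the labels are made-up illustrations (1 = crack, 0 = non-crack), not data from this project:

```python
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, f1_score)

# Made-up labels for illustration (1 = crack, 0 = non-crack), not project data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)                   # 4 1 1 4
print(accuracy_score(y_true, y_pred))   # (TP+TN)/total = 0.8
print(precision_score(y_true, y_pred))  # TP/(TP+FP) = 0.8
print(recall_score(y_true, y_pred))     # TP/(TP+FN) = 0.8
print(f1_score(y_true, y_pred))         # harmonic mean of precision and recall
```

Here every metric happens to equal 0.8 because false positives and false negatives are balanced; on imbalanced crack data the four values would diverge.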

6. Mean Absolute Error (MAE): One of the simplest metrics, MAE measures the average absolute difference between actual and predicted values, where "absolute" means each difference is taken as a positive number. Consider linear regression, where the model fits a best-fit line between the dependent and independent variables: the error of each prediction is the difference between the actual and predicted value, and the MAE for the complete dataset is the mean of these absolute errors.

MAE = (1/N) * Σ |Y − Y'|

Here, Y is the actual outcome, Y' is the predicted outcome, and N is the total number of data points.
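As a quick sketch, MAE can be computed in a few lines of NumPy; the values below are made up for illustration:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])  # actual outcomes Y
y_pred = np.array([2.5,  0.0, 2.0, 8.0])  # predicted outcomes Y'

# MAE = (1/N) * sum(|Y - Y'|)
mae = np.mean(np.abs(y_true - y_pred))
print(mae)  # 0.5
```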

7. Mean Squared Error (MSE): One of the most widely used metrics for regression evaluation, MSE measures the average squared difference between the predicted values and the actual values. Since the errors are squared, MSE only assumes non-negative values and is usually positive and non-zero. Moreover, squaring penalizes large errors disproportionately, so MSE can overstate how bad the model is when a few predictions are far off.
MSE = (1/N) * Σ (Y − Y')²

Here, Y is the actual outcome, Y' is the predicted outcome, and N is the total number of data points.
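Using the same illustrative values as a sketch, MSE in NumPy:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])  # actual outcomes Y
y_pred = np.array([2.5,  0.0, 2.0, 8.0])  # predicted outcomes Y'

# MSE = (1/N) * sum((Y - Y')^2); the one error of 1.0 dominates the squares
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.375
```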

8. R squared (R²): Also known as the Coefficient of Determination, R² is another popular metric for regression model evaluation. It compares the model against a constant baseline that always predicts the mean of the data. The R² score is always less than or equal to 1, regardless of how large or small the target values are:

R² = 1 − (SS_res / SS_tot)

where SS_res is the sum of squared residuals and SS_tot is the total sum of squares about the mean.

9. Adjusted R squared error: Adjusted R squared, as the name suggests, is an improved version of R squared. A limitation of R² is that its score can increase as more terms are added, even when the model is not actually improving, which may mislead data scientists. Adjusted R squared overcomes this issue: it is always less than or equal to R², because it adjusts for the number of predictors and only shows improvement when there is a real improvement.

Adjusted R² = 1 − [(1 − R²)(n − 1) / (n − k − 1)]

Here, n is the number of observations and k denotes the number of independent variables.
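Both scores can be sketched in NumPy; the values are illustrative, and k = 1 predictor is an assumption of the example:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total squares about the mean
r2 = 1 - ss_res / ss_tot

n, k = len(y_true), 1                            # k = 1 predictor (assumed)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(r2, adj_r2)                                # adjusted score is the lower one
```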

10. Logarithmic loss: Also known as Log loss, this metric works by penalizing confident but incorrect classifications. It works well with multi-class classification. To compute Log loss, the classifier must assign a probability to every class for each sample. If there are N samples belonging to M classes, the Log loss is calculated as:

Log loss = −(1/N) Σ_i Σ_j y_ij log(p_ij)

Here, y_ij indicates whether sample i belongs to class j, and p_ij is the predicted probability that sample i belongs to class j.
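A minimal sketch with scikit-learn, using a hypothetical 3-class example where each row holds the predicted class probabilities for one sample:

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical 3-class example: y_true holds the true class index,
# probs holds one row of predicted probabilities (summing to 1) per sample.
y_true = [0, 2, 1]
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.2, 0.6, 0.2]])

# -(1/N) * sum_i sum_j y_ij * log(p_ij); only the true class's
# probability contributes for each sample.
loss = log_loss(y_true, probs)
print(loss)
```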

11. Root Mean Square Error (RMSE): RMSE is a metric obtained by simply taking the square root of the MSE value. Like MSE, RMSE is not robust to outliers: the squaring gives higher weight to large errors in the predictions.

12. Root Mean Squared Logarithmic Error (RMSLE): There are times when the target variable varies over a wide range of values, and we want to penalize underestimation of the target values more heavily than overestimation. For such cases, RMSLE is used as the evaluation metric, since computing the error on the logarithms of the values achieves this objective.
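Both can be sketched in NumPy; the values are illustrative, and log1p (log of 1 + x, which tolerates zeros) is the common convention for RMSLE:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 2.0, 8.0])

# RMSE: square root of the mean squared error
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# RMSLE: RMSE computed on log(1 + x), so relative (ratio) errors matter
rmsle = np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))
print(rmse, rmsle)
```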

13. R2 Score: The coefficient of determination, also called the R² score (introduced as metric 8 above), is used to evaluate the performance of a linear regression model. It is the proportion of the variation in the dependent output attribute that is predictable from the independent input variable(s), and it is used to check how well observed results are reproduced by the model.

14. Matthews Correlation Coefficient (MCC): The MCC is in essence a correlation coefficient between the observed and predicted binary classifications; it returns a value between −1 and +1. A coefficient of +1 represents a perfect prediction, 0 is no better than random prediction, and −1 indicates total disagreement between prediction and observation.
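A minimal sketch with scikit-learn; the labels are made-up illustrations (1 = crack, 0 = non-crack):

```python
from sklearn.metrics import matthews_corrcoef

# Made-up binary labels for illustration (1 = crack, 0 = non-crack).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]

# MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
mcc = matthews_corrcoef(y_true, y_pred)
print(mcc)  # 0.6
```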
PROBLEM STATEMENT

The maintenance of railway tracks is a crucial task that ensures the safety and efficiency of the
transportation system. However, traditional methods of track maintenance are reactive and
inefficient, leading to higher costs and increased risk. One of the main challenges in track
maintenance is the detection and classification of faults, which can be subtle and difficult to
identify. The objective is to develop an automated system that can quickly and accurately detect
faults or abnormalities on the tracks, enabling immediate action to prevent accidents. The system
should improve the efficiency and reliability of fault detection, reduce human effort and time
required for inspections, and provide data-driven insights for better track maintenance.

To address this issue, the proposed project aims to develop an IoT-based solution utilizing Machine
Learning (ML) algorithms for the automatic detection of cracks in railway tracks. The system will
incorporate sensors along the tracks to collect data on track conditions, and an ML model will
analyze this data in real-time to identify potential cracks or defects. This project seeks to enhance
railway safety by providing an intelligent, proactive approach to track maintenance, minimizing the
risk of accidents and ensuring the reliability of the railway network.
DATASET USED

In our project involving crack detection using LiDAR and an ML algorithm, the dataset plays a
crucial role in training and evaluating the model's performance.

Dataset Description:

1. LiDAR Data: The dataset likely consists of LiDAR-generated point clouds representing the toy
train track. Each data point includes 3D coordinates (x, y, z) captured by the LiDAR sensor.
Additional attributes may include intensity or reflectivity values associated with each point.

2. Ground Truth Labels: The dataset is likely annotated or labeled to indicate cracks and non-
crack segments along the track. Each data point in the LiDAR data might be labeled as either a
crack or a non-crack segment for supervised learning.

3. Features and Attributes: Besides spatial coordinates and labels, the dataset may include
additional features or attributes derived from the LiDAR data, such as local surface normals, point
densities, or statistical features extracted from the point clouds. These features aid in training the
ML algorithm to detect patterns associated with cracks.

4. Dataset Size: The dataset's size, in terms of the number of samples (point cloud instances) and
the balance between crack and non-crack instances, influences the model's training and evaluation.
A sufficiently large and balanced dataset is crucial for robust model training.

5. Preprocessing Steps: Preprocessing techniques might have been applied to clean, normalize, or augment the dataset. Cleaning may involve noise removal or outlier detection, while augmentation might include downsampling, upsampling, or data transformations to enhance the dataset's diversity.

Considerations:
Imbalance: Consideration should be given to the class distribution, i.e., whether there is an imbalance between crack and non-crack instances. Addressing class imbalance might involve techniques such as oversampling, undersampling, or using weighted loss functions during model training.

Training-Validation Split: The dataset is typically divided into training and validation sets for
model training and evaluation. A common split might allocate around 70-80% for training and the
remaining for validation to ensure the model generalizes well to unseen data.
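The split described above can be sketched with scikit-learn's train_test_split; the feature matrix and labels below are random stand-ins for the real LiDAR-derived features, and the 80/20 ratio matches the split discussed:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 100 samples, 4 features, 60 non-crack / 40 crack labels.
rng = np.random.default_rng(0)
X = rng.random((100, 4))
y = np.array([0] * 60 + [1] * 40)

# stratify=y keeps the crack/non-crack ratio the same in both splits,
# which matters for the imbalanced data discussed above.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
print(len(X_train), len(X_val))  # 80 20
```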
Data Integrity and Quality: We will ensure the dataset's integrity by verifying annotations,
checking for label inconsistencies, and assessing the quality of the LiDAR data. Erroneous or noisy
data might affect the model's performance.

Understanding these dataset attributes and considerations is crucial for preparing, training, and
evaluating the ML model accurately for crack detection on the toy train track. Adjustments in
preprocessing, feature engineering, or dataset balancing can significantly impact the model's
effectiveness.
PROPOSED SYSTEM

The proposed solution enables crack detection and immediate action on a toy train track. A TF Mini-S LiDAR sensor is affixed to the toy train, enabling precise collection of spatial data along the track. The LiDAR, continuously scanning its surroundings, provides detailed 3D coordinates and additional information, forming a comprehensive point-cloud representation of the track's surface. This data is processed in real time by a Raspberry Pi running an ML model, such as a Convolutional Neural Network (CNN), designed for crack detection.

As the toy train progresses, the LiDAR sensor identifies potential cracks or irregularities on the track surface. Once the ML algorithm detects such an anomaly, an audible alarm is triggered, alerting the driver to apply the brakes. This proactive approach ensures immediate action upon crack detection, enhancing safety by preventing further movement and providing a clear warning signal for quick intervention or inspection.
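As an illustration of how the Raspberry Pi might read the sensor, here is a minimal sketch of parsing one TF Mini-S measurement frame. The 9-byte frame layout assumed here (0x59 0x59 header, little-endian distance in centimetres, checksum equal to the low byte of the sum of the first eight bytes) follows the Benewake TFmini-S product manual; the serial wiring and any alarm threshold are assumptions, not part of this report's implementation:

```python
def parse_tfmini_frame(frame: bytes):
    """Parse one 9-byte TF Mini-S frame; return distance in cm, or None if invalid."""
    if len(frame) != 9 or frame[0] != 0x59 or frame[1] != 0x59:
        return None                            # bad length or missing header
    if (sum(frame[:8]) & 0xFF) != frame[8]:    # checksum: low byte of the sum
        return None
    return frame[2] | (frame[3] << 8)          # distance = low + high * 256

# In the real system the frame would arrive over the Raspberry Pi's serial
# port (e.g. via pyserial) and the distance would feed the crack classifier.
frame = bytes([0x59, 0x59, 0x2A, 0x00, 0x64, 0x00, 0x00, 0x00, 0x40])
print(parse_tfmini_frame(frame))  # 42
```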

The collaborative integration of LiDAR technology, ML algorithms, and the Raspberry Pi computing platform enables a responsive and efficient crack detection system. By combining precise data collection with swift analysis, the system quickly identifies and responds to track irregularities, enforcing a safety mechanism that halts the train and signals potential hazards, thereby ensuring safer operation of the toy train track.
EVALUATION METRICS

In our railway track crack detection project using LiDAR and an ML algorithm, evaluation metrics
help assess the performance and effectiveness of the system. Here are some key evaluation metrics
commonly used in machine learning and their relevance to our project:

Accuracy: Useful as an overall performance indicator but may not be sufficient alone, especially
with imbalanced data where the number of non-crack instances significantly outweighs crack
instances.

Precision and Recall: Crucial for understanding false positives and false negatives. High precision
ensures that when the model predicts a crack, it is highly likely to be accurate, while high recall
ensures the model captures most actual cracks.

F1 Score: Useful for balancing precision and recall, especially when there's a trade-off between the
two metrics.

ROC Curve and AUC: Helpful for selecting optimal thresholds and understanding the model's
ability to distinguish between crack and non-crack instances.
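The AUC can be sketched with scikit-learn, assuming the classifier outputs a crack probability per sample; the scores below are illustrative:

```python
from sklearn.metrics import roc_auc_score

# Illustrative crack probabilities from a hypothetical classifier;
# y_true holds the ground-truth labels (1 = crack).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# AUC: probability that a random crack sample scores above a random non-crack.
auc = roc_auc_score(y_true, scores)
print(auc)  # 0.75
```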

Using a combination of these evaluation metrics provides a comprehensive understanding of the crack detection model's performance, highlighting its strengths and areas for improvement.
Selecting the appropriate metrics depends on the project's objectives, dataset characteristics, and
the desired balance between precision and recall in crack detection.
RESULTS
Upon implementing the crack detection system using LiDAR and an ML algorithm for the toy train
track, anticipated evaluation metrics are as follows:

The Confusion Matrix is expected to reveal a high accuracy rate, aiming for around 95% or higher,
showcasing the system's ability to accurately identify both positive and negative instances.
Precision and recall metrics are projected to achieve approximately 90% or more, reflecting the
system’s precision in classifying positive instances and its ability to capture most actual positive
cases.

The F1 Score, combining precision and recall, is estimated to be around 92-95%, indicating a
balanced performance between these crucial metrics. Specificity (True Negative Rate) is
anticipated to demonstrate values around 90-95%, illustrating the system's proficiency in
identifying true negative cases. Moreover, a low False Positive Rate, expected to be under 10%,
signifies minimal instances of incorrectly flagged negatives.

A Matthews Correlation Coefficient (MCC) above 0.85 is envisioned, depicting a strong correlation between observed and predicted classifications. These projected results are indicative of
a robust crack detection system that aims to accurately identify track irregularities while
minimizing false alarms, ensuring enhanced safety measures on the toy train track.

These expected results are projected based on the system's design, the anticipated performance of
the LiDAR technology, and the model's training specifics. Actual outcomes might vary based on
various factors including data quality, model complexity, and real-world operational conditions.
