Predictive Maintenance Model for Marine Vessels using Machine Learning Section: Research Paper
Predictive Maintenance Model for Marine Vessels using Machine Learning
Thirein Myo 1, Zakariya Al Hasani 2, Muhammad R Ahmed 3, Badar Al Baroomi 4
1, 2, 3, 4 Military Technological College, Muscat, Oman.
Email: 1 thirein.myo@mtc.edu.om, 2 1706009@mtc.edu.om, 3 Muhammad.Ahmed@mtc.edu.om,
4 badar.albaroomi@mtc.edu.om
Abstract
The field of predictive maintenance has gained increasing interest recently, driven by
improved monitoring techniques and a growing number of new methodologies and algorithms
across different learning methods. Industry urgently needs to detect faults accurately and in
advance in the production environment in order to minimize maintenance costs, prevent sudden
failures and ensure optimum use of machines. Ideally, the process begins with collecting
historical data from many sensors installed in different devices. In this paper, available
propulsion system data is used because of time limitations, as recording historical data takes
a vast amount of time. Instead, the paper focuses on the implementation of machine learning
models using two popular algorithms. The evaluation of the applied machine learning algorithms
provides promising results for implementation in the industry.
Index Terms: Machine learning, Marine vessels, predictive maintenance.
1. Introduction
Technical maintenance can be defined as a group of operations and practices that aim to
assure uninterrupted and efficient operation of machinery and equipment in various industrial
fields and to conserve their performance for as long as possible. Onboard a ship, diligence in
implementing an effective maintenance program is essential to keep any machinery or
mechanical system going, whether it is small equipment or a big structure. Effective
maintenance may help to extend the life span of a machine and keep it running smoothly.
Appropriate maintenance planning is therefore critical in all types of industries, including
the maritime industry. Maintenance needs manpower and time that might not always be
available, because the machines onboard a ship outnumber the crew members.
There are several types of maintenance used on marine ships: preventive (also referred to as
routine), corrective, planned and predictive maintenance. Routine maintenance is carried out
on a particular schedule and commonly includes activities such as checking, cleaning,
inspecting, and replacing. It might be scheduled monthly, weekly, or even daily. Its goals are
to prevent likely issues and to identify existing troubles so that they can be fixed as quickly
as possible. Planned maintenance might be scheduled once a year or as needed, because it is a
comprehensive, time-consuming and expensive process. Corrective maintenance includes the
repairs and necessary replacements that restore the machine to full operating condition.
Corrective maintenance is performed after a defect or problem is detected during the routine
maintenance inspection. The event criticality of the maintenance types is shown in Figure 1.
Eur. Chem. Bull. 2023, 12(Special Issue 1), 1572-1583
Figure 1: Comparison of maintenance types

In most such cases, it is probably too late to repair anything. This can cost much money and
time in the end and, furthermore, can endanger the safety of the ship and crew. Predictive
maintenance (PdM) continuously monitors machine performance and condition during ordinary
operation to minimize the occurrence of failures and to ensure optimal operation of machines.
Moreover, predictive maintenance helps in the early detection of machine defects that may
cause unnecessary costs or unplanned failures. The main goal of predictive maintenance is to
schedule corrective maintenance and prevent sudden machine failures before the vessel goes to
open water, where replacement and repairs are more complicated and costly. The predictive
maintenance process uses many sensors to monitor different parameters inside a system or
machine. The maritime industry today is highly focused on minimizing malfunctions and defects
on marine vessels by collecting data for specific parameters using sensors fixed on different
parts of the machine and then analyzing and processing the data using artificial intelligence
or machine learning algorithms.
2. Literature Review
Maintenance technicians and machine operators have been striving in recent years to develop
predictive maintenance in various fields, including the marine industry, in conjunction with
machine learning and artificial intelligence, in order to predict machine downtime before the
actual breakdown. The main objective of maintenance is to minimize equipment malfunctions and
to avoid failures that may delay operations. Jimenez et al. proposed a predictive maintenance
solution for marine vessels based on an artificial intelligence model using operational data
[1]. The main idea is to use historical data to reveal trends in equipment behavior in order to
predict when the equipment will break down. Once a failure and its timing have been predicted,
predictive maintenance tasks are planned. Three different types of data were collected and
studied: vibration data, lubricating oil data and performance data. The analyzed data were
obtained from historical values from sensors that measure the health of ship engines and
compressors. One of the limitations of the study is the lack of capacity for more data. In
addition, data requested from third-party providers experienced significant transmission
delays.
Lazakis et al. carried out a study on a methodological and systemic approach to identify and
analyze the physical properties of important ship mechanical systems and components [2]. To
track and forecast upcoming values of physical characteristics connected to ship critical
systems, a critical piece of ship main engine equipment is used as input to a dynamic time
series neural network. Fault Tree Analysis (FTA) and Failure Mode and Effects Analysis (FMEA)
are combined to identify the essential main engine components and systems, together with the
pertinent parameters to be monitored. In a Panamax-sized
container ship case study, Artificial Neural Networks (ANN) are utilized to forecast future
values of all main engine cylinder exhaust gas temperatures. The information used in the
neural networks was gathered during a measuring campaign conducted on board the ship
while it was traveling through the Mediterranean. Moreover, the validation of forecast results
was carried out through comparison with actual observations made on board the ship.
Through dependability modelling and tools, the suggested hybrid technique effectively
demonstrated a systematic strategy for first identifying crucial systems and components,
followed by the use of neural networks to monitor their physical properties. In a nutshell, the
FMEA and FTA tools may work in tandem to provide a suitable general model for acquiring
important critical systems together with potential causes and consequences of failure and
pertinent physical metrics. Lastly, through the time series analysis of respective physical
attributes, the application of the ANN provides a more focused approach for assessing and
tracking the status of the detected Fault Tree components.
Gohel et al. published a proposal for the design and development of a machine learning
algorithm to carry out predictive maintenance of nuclear infrastructure. Prediction was
implemented using logistic regression (LR) and support vector machine (SVM) algorithms.
Predictive analytics includes building the data framework to monitor performance continuously
by analyzing the sensor data to give advance alerts of component failures. In their research,
SVM and LR were selected, studied and compared. It was found that the proposed framework
provides higher accuracy than the research conducted by other researchers [3].
Berghout et al. presented a novel data-driven method for estimating the degradation of a
combined diesel-electric and gas propulsion plant for marine propulsion systems [4]. The
suggested method used a particular kind of deep belief neural network (DBN) constructed on
online sequential extreme learning machine (OS-ELM) rules. The DBN can perform convolutional
mapping and pooling in each sub-network of its hidden layers, in accordance with the theory of
ELM with local receptive fields. The framework was assessed using time-varying data derived
from a numerical model of the system and compared with its initial variants (ELM, OS-ELM). In
terms of prediction capability, the findings demonstrate that the combinatorial DBN (C-DBN) is
more effective, particularly for a single output. Thus, the adoption of planned maintenance
procedures in real time is quite promising. This effectiveness is based on the developed
approach and its dynamic adaptability together with the forgetting mechanism and
regularization paradigm. Additionally, based on the planned filtering method and deep
reconstruction, extracting more relevant feature representations proves crucial for both
generalization and accurate approximation. Random sampling and constrained conditions were
used to conduct this comparative investigation. As a result, further research must be done to
analyze this dataset using more cross-validation, activation functions and other probability
distributions for subsampling. To achieve greater levels of accuracy and generalization, it
might be worthwhile to look into the use of random search techniques for hyperparameter
tuning.
Ineffective or improper maintenance can lead to dangerous conditions on board ship that may
result in accidents, serious damage to machines and loss of life. Lazakis et al. introduced
the INCASS approach, an innovative system that monitors machinery, ship structures and
equipment [5]. INCASS depends on certain vessel case studies to test and validate it under
real conditions, which includes data collection through sensors installed on machinery. The
information is gathered from various sources, including the OREDA database, historical data,
and the opinions of experts and ship operators. The paper presents the methodology used in the
INCASS project to analyze the recorded data and integrate the results of the analysis into the
decision support system (DSS).
Ideally, the work starts with installing sensors on the required machines to collect the data.
However, collecting historical data from sensors takes a huge amount of time before an
acceptable dataset is obtained. Because of that, in this project the available data from Kaggle
[6] is analyzed and modeled using machine learning algorithms. The remainder of the paper is
organized as follows: the two machine learning algorithms are discussed in Section 3, the
analysis of the data is presented in Section 4, Section 5 presents the results and discussion,
and the paper is concluded in Section 6.
3. Machine Learning Algorithms
A. Logistic Regression Algorithms
Among the available machine learning algorithms, the logistic regression algorithm and the
random forest algorithm are selected in this work because they are popular and are recommended
by data scientists for their good performance. Logistic regression (LR) is a statistical model
that is commonly used for analysis and classification and falls under supervised learning. A
linear regression algorithm is used for solving regression problems, whereas logistic
regression is used for solving classification problems [7]. LR can be used for binary and
multiclass classification problems. LR for multiclass classification is also called
multinomial logistic regression. The underlying formulae in LR are:
$$h_\theta(X) = \frac{1}{1 + e^{-\theta X}} \tag{1}$$

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right) \tag{2}$$
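As a minimal sketch of Eqs. (1) and (2), the snippet below evaluates the logistic function and the averaged log-loss for a small set of samples. It assumes the usual reading of the formulae, in which p_i is the value of h for sample i and m is the number of samples. The names `sigmoid`, `predict_proba` and `log_loss`, and the toy data, are illustrative assumptions and are not taken from the paper's implementation.

```python
import math


def sigmoid(z):
    """Logistic function of Eq. (1): 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta, x):
    """h(x) for one sample; theta and x are equal-length lists of numbers."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


def log_loss(theta, X, y):
    """Cost J(theta) of Eq. (2), averaged over the m samples."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(theta, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return -total / m


# Hypothetical toy data: the first feature is a constant 1.0 acting as the bias term.
X = [[1.0, 0.5], [1.0, 2.0], [1.0, -1.0]]
y = [0, 1, 0]
theta = [0.0, 0.0]

# With all-zero parameters every p_i is 0.5, so the loss equals ln(2).
loss_at_zero = log_loss(theta, X, y)
```

Training a logistic regression model amounts to searching for the parameter vector that makes this cost as small as possible; the sketch only evaluates the cost and does not perform that search.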
B. Random Forest Algorithms
With the advancement in technology, many machine learning frameworks have come into wide use.
Each new framework overcomes limitations of the previous ones, such as limitations related to
noise interference, parameters, and high threshold values. Random forest is a supervised
machine learning algorithm used for classification and regression because of its simplicity
and diversity. A decision tree grows a single tree, whereas a random forest grows multiple
trees. A single decision tree often gives a poor predictive model because the problem of
overfitting arises. So instead of using a decision tree, a random forest model is a good
approach, as it reduces the problem of overfitting. In a random forest, every tree produces its
own classification result for each provided input, and the forest selects the best result by
voting across the trees. The following figure explains how random forest works [8].
Figure 2: Working principle of the random forest algorithm
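The idea of growing many trees and letting them vote can be sketched in a few lines. The example below is a deliberately simplified illustration, not the implementation used in this work: each "tree" is a one-split stump trained on a bootstrap sample and a randomly chosen feature, and the forest returns the majority vote. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority(labels, fallback):
    """Most common label, or the fallback when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else fallback


def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one feature by trying midpoints between sorted values."""
    fallback = majority(labels, None)
    values = sorted({row[feature] for row in rows})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or [values[0]]
    best = None
    for threshold in candidates:
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left, fallback), majority(right, fallback)
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def stump_predict(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)         # random feature for this tree
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def forest_predict(forest, row):
    """Majority vote over the predictions of all trees."""
    votes = Counter(stump_predict(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical, well-separated toy data: class 0 near (1, 1), class 1 near (9, 9).
rows = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 9], [9, 8], [9, 9], [8, 8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
```

A real random forest grows full decision trees and samples a subset of features at every split; the stump version keeps only the two ingredients named in the text, namely multiple trees and a vote.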

4. Data Analysis
A. Data Collection
Acquiring the data, which means gathering measurement data from different sensors and
processing the initial signals to obtain useful features that can indicate the health of the
system, is the first step in diagnosing and predicting the failure of any machine or
equipment. Raw data (unprocessed data collected from on-board ship sensors and experts) is
converted into useful information using machine learning techniques, database technologies
and artificial intelligence. An important step after data mining is to reduce the number of
features, since the extracted features are usually too many to use in the operation. Common
dimensionality reduction techniques, for example Principal Component Analysis (PCA), kernel
PCA, and Isomap, are used to remove unessential features. Because the equipment needed to
collect the required historical data for the ship’s propulsion system was not available in the
college, the data was obtained from the Kaggle source [6] to create a predictive model.
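To make the dimensionality reduction step concrete, the following sketch computes the first principal component of a small table of readings using power iteration on the covariance matrix. It illustrates plain PCA only; the paper does not state which reduction technique, if any, was applied to the Kaggle data, and the function name and toy readings are assumptions made for the example. The sketch also assumes the data has non-zero variance.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the leading principal component."""
    n, d = len(rows), len(rows[0])
    # Center every column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the direction to get one feature per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical readings in which the second feature is exactly twice the first,
# so a single component captures all of the variance.
readings = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores = first_principal_component(readings)
```

Here two correlated features collapse into one score per row, which is the sense in which unessential features are removed.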
B. Naval Propulsion System
Naval ships have many mechanical systems and machines, but some of them, such as the main
engine, propulsion system, steering gear and generators, are so important that the ship cannot
go to sea without them. Therefore, it is very important to maintain them continuously.
Corrective or planned maintenance might not be enough to keep ships, especially those at sea,
from experiencing machinery failures. For this reason, predictive maintenance plays its role in
the early detection of machinery failure before the ship goes to sea. One of the most critical
systems in naval ships is the propulsion system. The marine propulsion system is the mechanism
that ships use to produce thrust to move and manoeuvre across the water. Whereas sails and
paddles are still used on some small vessels, most modern vessels are propelled by mechanical
systems that consist of an engine or electric motor turning a propeller. There are different
types of propulsion systems used in vessels, but the diesel propulsion system is the most
popular type; it converts thermal energy into mechanical power. Marine engines are among the
largest and most expensive engines in the world and are responsible for the ship's propulsion.
Therefore, it is very important for maintenance technicians to maintain these engines
regularly to ensure that they continue operating with high efficiency and to avoid any sudden
malfunctions. The behavior of the main components of a marine vessel propulsion system cannot
easily be modelled from prior physical knowledge, given the huge number of variables that
affect them. Instead, Data-Driven Models (DDMs) rely on advanced statistical techniques to
create models from the large amount of historical data gathered by on-board ship automation
systems, without the need for any prior knowledge. DDMs are highly useful for continuous
monitoring of a propulsion system and its equipment and for making decisions based on the
actual condition of the propulsion plant [9].
Figure 3: Marine Propulsion System
Figure 4: Naval Propulsion System

Table 1: A sample of the dataset
Table 1 shows a sample of the dataset. The first column of the table shows the row number,
while the second column shows the unique ID number of each product or piece of equipment. The
third column represents the type of machine, which can be low, medium or high. The next five
columns are the parameters of the ship’s propulsion system that are analysed to predict
failure, as shown in the table: air temperature, process temperature, rotational speed, torque
and tool wear. The Target column lists “No Failure” as 0 and “Failure” as 1. The last column
gives the type of failure, or “No Failure”.
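The column layout described above can be read with Python's standard `csv` module, as in the sketch below. The header names are an assumption about how the Kaggle file labels its columns, and the two data rows are invented placeholders for illustration only; they are not values from the dataset.

```python
import csv
import io

# ASSUMPTION: header names as they are believed to appear in the Kaggle CSV.
# The two rows are made-up placeholders, not real records.
SAMPLE_CSV = """UDI,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Target,Failure Type
1,M00001,M,298.1,308.6,1551,42.8,0,0,No Failure
2,L00002,L,298.9,309.1,2861,4.6,143,1,Power Failure
"""

# The five parameters used as model inputs.
FEATURES = [
    "Air temperature [K]",
    "Process temperature [K]",
    "Rotational speed [rpm]",
    "Torque [Nm]",
    "Tool wear [min]",
]


def load_rows(text):
    """Parse CSV text into (type, features, target, failure_type) tuples."""
    parsed = []
    for record in csv.DictReader(io.StringIO(text)):
        features = [float(record[name]) for name in FEATURES]
        parsed.append(
            (record["Type"], features, int(record["Target"]), record["Failure Type"])
        )
    return parsed


records = load_rows(SAMPLE_CSV)
```

The row number and product ID are identifiers rather than measurements, which is why the sketch leaves them out of the feature list.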
Figure 5: Counts of failure and no failure

As shown in Figure 5, the number of cases of no failure in the propulsion system is 9652,
while the number of cases of failure is 330. A widespread problem in datasets used for
classification models is class imbalance, where the number of observations for one target
class label is much higher than for the other class labels. In this model, the number of
failures is much lower than the number of no-failure cases, as shown in the figure; this
problem often leads to unsatisfactory results and may affect the relationship between
features. Figure 6 illustrates the number of failures of each type in the propulsion system.
The x-axis represents the type of failure, of which there are four: power, tool wear,
overstrain and heat dissipation, while the y-axis represents the number of failures of each
type. It can be seen from Figure 6 that heat dissipation has the highest number of failures,
at around 112, followed by power with 95 failures and overstrain with 78. Tool wear has the
lowest number of failures, 45 out of the total of 330.
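The degree of imbalance can be quantified directly from the counts quoted above. The short calculation below uses only those counts; the variable names are illustrative.

```python
# Counts reported for Figure 5 and Figure 6.
no_failure = 9652
failure = 330
failure_types = {
    "heat dissipation": 112,
    "power": 95,
    "overstrain": 78,
    "tool wear": 45,
}

total = no_failure + failure              # 9982 records in all
failure_share = failure / total           # fraction of records that are failures
imbalance_ratio = no_failure / failure    # no-failure cases per failure case

# Accuracy of a trivial model that always predicts "No Failure".
baseline_accuracy = no_failure / total

# The four failure types add up to the total number of failures.
failure_type_total = sum(failure_types.values())
```

Failures make up roughly 3.3% of the records, so a model that always predicts "No Failure" would already be about 96.7% accurate on the full dataset. This is why accuracy alone can flatter a classifier on imbalanced data.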
Figure 6: Counts of failure types
Figure 7: The possibility of failure at each type

Figure 7 shows the possibility of failure for each parameter. It is clear from the figure that
as each parameter value increases, the possibility of failure increases. For example, in the
first graph, when the temperature rises, the possibility of failure of the device or system
increases. It is also noticeable that when the temperature is higher than 300 K, the
possibility of failure increases significantly until it reaches the point at which the device
will fail, at around 305 K. In addition, in the rotational speed graph, the possibility of
failure increases sharply when the rotational speed is above about 1750 rpm and continues to
rise until around 2700 rpm, at which point the system will fail. The probability of system
failure increases dramatically when the torque is above about 50 Nm and continues to rise until
it reaches the point where system failure will occur, at about 70 Nm.
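One simple way to obtain curves of this kind is to group the records into ranges of a parameter and compute the fraction of failures in each range. The sketch below does this for torque, using the 50 Nm and 70 Nm thresholds mentioned above as bin edges. The readings are hypothetical and were chosen only to show the mechanics; the paper does not describe how Figure 7 was actually produced.

```python
def failure_rate_by_bin(values, targets, edges):
    """Fraction of failures among records in each [edges[i], edges[i+1]) bin."""
    counts = [0] * (len(edges) - 1)
    failures = [0] * (len(edges) - 1)
    for value, target in zip(values, targets):
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += 1
                failures[i] += target
                break
    # None marks a bin with no records, where a rate is undefined.
    return [f / c if c else None for f, c in zip(failures, counts)]


# Hypothetical torque readings (Nm) with a 0/1 failure flag for each.
torque = [20.0, 35.0, 45.0, 55.0, 62.0, 68.0, 72.0, 75.0]
target = [0, 0, 0, 0, 1, 0, 1, 1]

rates = failure_rate_by_bin(torque, target, [0, 50, 70, 100])
```

With these placeholder readings the failure fraction rises from bin to bin, which mirrors the trend described for Figure 7.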
5. Results and Discussion
Predictive maintenance models using the LR and random forest algorithms, considering all
features of the dataset, were written in the Python programming language. 25% of the dataset
is held out as test data to evaluate the proposed model. A confusion matrix is used to measure
the performance of the classification models on the given test data. The matrix has two
dimensions, actual values and predicted values, and covers the total number of predictions.
The predicted values are those predicted by a model, and the actual values are the true values
for the given observations. The confusion matrix is a 5×5 matrix because the prediction model
has five classes: “No Failure” and the four failure types. In an ideal confusion matrix, all
predicted values correspond to the actual values, which means that all counts lie on the
diagonal running from the upper-left corner of the matrix to the lower-right corner. However,
both matrices contain some values outside this diagonal, as can be seen in Figure 8 and
Figure 9; in those cases the predicted value does not correspond to the true value, which is
considered a prediction error. The accuracy is a measure of classification performance and is
calculated from the confusion matrix using the formula given below:
$$\text{Classification accuracy} = \frac{\text{number of correct predictions}}{\text{total number of predictions}} \tag{3}$$
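As an illustration of how a multiclass confusion matrix and the accuracy of Eq. (3) are computed, the sketch below tallies actual against predicted labels and divides the diagonal total by the number of predictions. The class names follow the failure types listed in Section 4, but the label sequences are hypothetical and are not the model outputs reported in this paper.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows index the actual class, columns index the predicted class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


CLASSES = ["No Failure", "Power", "Tool Wear", "Overstrain", "Heat Dissipation"]

# Hypothetical labels: one "Tool Wear" case is mistaken for "No Failure".
actual = ["No Failure", "No Failure", "Power", "Tool Wear",
          "Overstrain", "Heat Dissipation", "No Failure", "Power"]
predicted = ["No Failure", "No Failure", "Power", "No Failure",
             "Overstrain", "Heat Dissipation", "No Failure", "Power"]

matrix = confusion_matrix(actual, predicted, CLASSES)
score = accuracy(matrix)
```

The single off-diagonal count sits in the "Tool Wear" row and the "No Failure" column, and seven of the eight predictions are correct, giving an accuracy of 0.875.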
Figure 8: Confusion Matrix of LR method

Figure 9: Confusion Matrix of Random Forest method

The accuracy of the model using the logistic regression algorithm was 99.84%, while the
accuracy of the model using the random forest algorithm was 99.88%. It can be concluded that
the accuracy of the random forest algorithm is slightly better than that of the logistic
regression algorithm in this case study. Moreover, three other metrics were used to assess the
performance of each of the two algorithms, as shown in Table 2: Cohen’s Kappa score, the
Matthews correlation coefficient and the Hamming loss. All of them measure the agreement
between the values predicted by a machine learning model and the actual values. For Cohen’s
Kappa score and the Matthews correlation coefficient, the higher the value, the more accurate
the model, while for the Hamming loss, the lower the value, the better the model performance.
On all three metrics, the random forest algorithm performed better than the logistic
regression algorithm, as shown in the table.
Table 2: Performance of the algorithms
Machine learning algorithm | Cohen’s Kappa score | Matthews correlation coefficient | Hamming loss
Logistic regression | 0.97649 | 0.97650 | 0.002
Random forest | 0.982367 | 0.982369 | 0.001
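The three metrics in Table 2 can all be derived from a confusion matrix. The sketch below uses the standard definitions for single-label classification: Cohen's Kappa compares observed agreement with the agreement expected by chance, the Matthews correlation coefficient uses its multiclass generalization, and the Hamming loss is the fraction of wrong labels. The function name and the 2×2 toy matrix are assumptions made for illustration; the paper does not show how its values were computed.

```python
import math


def agreement_metrics(matrix):
    """Return (kappa, mcc, hamming) for a square confusion matrix (rows = actual)."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    actual_totals = [sum(matrix[i]) for i in range(k)]
    predicted_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    cross = sum(a * p for a, p in zip(actual_totals, predicted_totals))

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    observed = correct / n
    expected = cross / n ** 2
    kappa = (observed - expected) / (1 - expected)

    # Matthews correlation coefficient, multiclass form.
    denominator = math.sqrt(
        (n ** 2 - sum(p * p for p in predicted_totals))
        * (n ** 2 - sum(a * a for a in actual_totals))
    )
    mcc = (correct * n - cross) / denominator if denominator else 0.0

    # Hamming loss for single-label problems: fraction of wrong predictions.
    hamming = 1 - observed
    return kappa, mcc, hamming


# Hypothetical 2x2 matrix: 7 of 10 predictions are correct.
toy_matrix = [[5, 1], [2, 2]]
kappa, mcc, hamming = agreement_metrics(toy_matrix)
```

Because the Hamming loss for single-label classification equals one minus the accuracy, the reported accuracies of 99.84% and 99.88% correspond to losses of about 0.0016 and 0.0012, which agree with the rounded values in Table 2.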
6. Conclusion
Technological advances, the need to reduce maintenance costs, difficult operating conditions,
and optimization form a combination that must be tackled with massive data and predictive
analytics. Given the high cost of machinery and equipment used in marine vessels and the
impact of exorbitant maintenance costs, it is expected that marine vessels will undergo rapid
development, as the industry is noticeably moving forward with data science and more complex
maintenance approaches. The predictive maintenance field is still in its infancy in the marine
industry, and there is still a lot of work to be done to uncover the complete potential
represented by massive data and artificial intelligence. Two machine learning algorithms were
selected for implementation in the model: a logistic regression algorithm and a random forest
algorithm. Based on the performance of the models measured by several evaluation techniques,
the proposed model with the two machine learning algorithms gives promising results. It is
planned to improve the performance of the model by tackling the imbalance of the dataset in a
future study. Moreover, the implementation of the proposed model in marine vessels is
intended, beginning with collecting data by installing sensors in a marine propulsion system.
References
[1] V. J. Jimenez, N. Bouhmala, and A. H. Gausdal, “Developing a predictive
maintenance model for vessel machinery,” J. Ocean Eng. Sci., vol. 5, no. 4, pp. 358–
386, Dec. 2020, doi: 10.1016/j.joes.2020.03.003.
[2] I. Lazakis, Y. Raptodimos, and T. Varelas, “Predicting ship machinery system
condition through analytical reliability tools and artificial neural networks,” Ocean
Eng., vol. 152, pp. 404–415, Mar. 2018, doi: 10.1016/j.oceaneng.2017.11.017.
[3] H. A. Gohel, H. Upadhyay, L. Lagos, K. Cooper, and A. Sanzetenea, “Predictive
maintenance architecture development for nuclear infrastructure using machine
learning,” Nucl. Eng. Technol., vol. 52, no. 7, pp. 1436–1442, Jul. 2020, doi:
10.1016/j.net.2019.12.029.
[4] T. Berghout, L.-H. Mouss, T. Bentrcia, E. Elbouchikhi, and M. Benbouzid, “A deep
supervised learning approach for condition-based maintenance of naval propulsion
systems,” Ocean Eng., vol. 221, p. 108525, Feb. 2021, doi:
10.1016/j.oceaneng.2020.108525.
[5] I. Lazakis, K. Dikis, A. L. Michala, and G. Theotokatos, “Advanced Ship Systems
Condition Monitoring for Enhanced Inspection, Maintenance and Decision Making in
Ship Operations,” Transp. Res. Procedia, vol. 14, pp. 1679–1688, 2016, doi:
10.1016/j.trpro.2016.05.133.
[6] Tolga, Predictive Maintenance Dataset. Accessed: Jul. 08, 2022. [Online]. Available:
https://www.kaggle.com/datasets/tolgadincer/predictive-maintenance
[7] JavaTpoint, “Logistic Regression in Machine Learning,” javatpoint, Aug. 20, 2022.
https://www.javatpoint.com/logistic-regression-in-machine-learning (accessed Aug.
20, 2022).
[8] Y. Liu, Y. Wang, and J. Zhang, “New Machine Learning Algorithm: Random Forest,”
in Information Computing and Applications, vol. 7473, B. Liu, M. Ma, and J. Chang,
Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 246–252. doi:
10.1007/978-3-642-34062-8_32.
[9] J. M. Apsleytt et al., “Propulsion Drive Models for Full Electric Marine Propulsion
Systems,” p. 6.
AUTHORS PROFILE
Thirein Myo received his PhD in Mechanical Engineering from the University of Adelaide,
Australia in 2017. He completed the Data Science Architect Master’s Program from Intellipaat
Software Solutions Pvt Ltd. in collaboration with IBM. Furthermore, he obtained the Deep
Learning AI TensorFlow Developer certificate from Coursera in collaboration with Google.
Zakariya Al Hasani received an Honours Degree of Bachelor of Engineering in Marine Engineering
from the University of Portsmouth.
Muhammad Ahmed obtained his PhD from the University of Canberra, Australia. He received a
Master of Engineering Studies in Telecommunication and a Master of Engineering Management
degree from the University of Technology, Sydney (UTS), Australia.
Badar Al Baroomi obtained a Bachelor of Engineering in Electronics and Communication from the
University of Leeds, United Kingdom in 2005. He received a Master of Engineering in Systems
Engineering from the University of Toulouse, France in 2015.