Audio-Based Fault Diagnosis For Belt Conveyor Rollers


Neurocomputing 397 (2020) 447–456




Mingjin Yang, Wenju Zhou∗, Tianxiang Song
Shanghai University, 99 Shangda Road, Shanghai, China

Article history: Received 15 May 2019; Revised 23 August 2019; Accepted 6 September 2019; Available online 20 March 2020.

MSC: 00-01; 99-00

Keywords: Stacked sparse encoders; Convolutional neural network; Fault diagnosis; Spectral clustering; Feature extraction

Abstract: In order to monitor the states of the rollers running on a belt conveyor online, a class of audio-based fault diagnosis systems is studied in this paper. Firstly, audio data is collected from the belt conveyor by sensors and analyzed using stacked sparse auto-encoders and a convolutional neural network. Secondly, the fault features are extracted from the audio data using a spectral clustering algorithm. Finally, a real fault diagnosis system is applied on a belt conveyor working in a coal preparation plant. The operating results show that the fault diagnosis system works very well for roller fault detection, with an accuracy rate of 96.7%.

© 2020 Elsevier B.V. All rights reserved.

This work was supported by the National Natural Science Foundation of China (61833011, 61877065). Corresponding author: W. Zhou, zhouwenju@shu.edu.cn. https://doi.org/10.1016/j.neucom.2019.09.109

1. Introduction

Nowadays, intelligent methods have been applied in actual production and have shown that they can improve effectiveness and efficiency. Over the past decade, artificial intelligence has become popular in many fields such as agriculture, the textile industry and the electronics industry [1]. However, the coal industry is still outdated and makes little use of artificial intelligence, which implies high costs and high risks. Therefore, it is urgently required that artificial intelligence technology be applied in coal production. Some intelligent systems combining big data and machine learning algorithms have been researched and trialled in coal production, such as the intelligent control system for flotation reagents [2], the coal separation system based on machine vision technology [3], and the intelligent monitoring system [4].

With the development of artificial intelligence, more and more fault diagnosis systems in industry have been extended with intelligent functions [5]. Guo et al. propose a new learning algorithm for pattern recognition in which SVMs with a binary tree recognition strategy are used to tackle the audio classification problem [6]. Ying et al. present a hidden Markov model (HMM) algorithm for fault diagnosis in systems with partial and imperfect tests [7]. Wang et al. propose a novel hybrid approach based on a random forests classifier for fault diagnosis in rolling bearings [8]. A tool combining an artificial neural network with an expert system is developed for transformer fault diagnosis using the dissolved gas-in-oil analysis method by Wang et al. [9]. Wu et al. present an algorithm to classify industrial system faults based on K-means clustering and a probabilistic neural network (PNN) [10]. Ramos et al. present a hybrid algorithm using fuzzy clustering techniques, which is applied in a fault diagnosis scheme to detect new faults [11]. Gao et al. present a new subspace clustering method to analyse high-dimensional data based on locality-preserving robust latent low-rank recovery, which maps the high-dimensional and non-linear data into a low-dimensional latent space by preserving local similarities [12]. Zhang et al. adopt and enhance the rough k-means algorithm in order to achieve a superior clustering strategy that considers imbalanced clusters [13]. Aiming to improve the accuracy of fault diagnosis, [14-16] have researched some common time- and frequency-domain methods, including the zero-crossing rate, the power spectrum, modern power spectral estimation, mel-frequency cepstrum coefficients (MFCC) and empirical mode decomposition (EMD).

Fault diagnosis systems are widely used in the coal industry. The key challenge of intelligent fault diagnosis is to find features that can separate the abnormal data from the normal data. Yu et al. [17] propose a novel method to diagnose roller faults by reducing the feature dimensions, and the results show that the fault diagnosis model is effective and adaptive. Chen et al. [18] propose an integrated framework combining fault classification and location based on an innovative machine-learning algorithm. The paper [19] addresses a methodological framework for the diagnosis of

multi-faults in rotating machinery using feature rankings, K-nearest neighbors and random forest.

Shao et al. use an auto-encoder to compress data and reduce dimensions; a novel convolutional deep belief network is then constructed with Gaussian visible units to learn the representative features [20]. Wen et al. propose a new CNN based on LeNet-5 for fault diagnosis: through a conversion method that turns signals into two-dimensional (2-D) images, the features of the converted 2-D images can be extracted [21]. Pan et al. propose a novel deep learning network (LiftingNet) that learns features adaptively from raw mechanical data without prior knowledge in order to detect motor faults [22].

The intelligent algorithms mentioned above provide new references for diagnosing faults and push more and more related technologies to be used in place of human work in industry. Fault diagnosis involves the following typical data pipeline: data acquisition and conditioning, feature extraction, feature selection, and a final detection/classification/forecast stage [23]. Data acquisition can resort to different types of sensors, including acoustic, vision, electric, oil spectrometry, thermal imaging or acceleration sensors. Extracting the sound signature of a machine is a useful tool in fault diagnosis. Audio-based fault diagnosis using microphones is an emerging field with great potential, since microphones are noninvasive sensors [24]. Yadav et al. [25] propose a novel prototype-based engine fault classification scheme employing the audio signature of engines; the proposed method is based on sum-peak analysis for the classification between healthy and faulty classes. Glowacz and Glowacz [26] present an early diagnostic technique for stator faults of the single-phase induction motor based on recognizing acoustic signals. Lee et al. [27] present a data mining solution that utilizes audio data to efficiently detect faults in railway condition monitoring systems.

Fig. 1. The top part of belt conveyor.

Fig. 2. The rollers of belt conveyor.
In the coal industry, fault diagnosis for belt conveyor rollers still relies on traditional methods such as workers listening and looking. Millions of belt conveyors play important roles in the coal industry every day, so automatic fault diagnosis for belt conveyor rollers is urgently needed. Because the fault sound differs from the normal sound made by the running rollers, roller faults can be diagnosed from the sound. However, the fault sound is often drowned in noise, and it is challenging to recognise and locate the abnormal rollers on a running belt conveyor. Many belt conveyors run in harsh industrial environments with strong noise, and the remote transmission of the gathered faint signal also needs to be solved in such environments. In this paper, a novel fault diagnosis system is proposed to monitor the roller states by selecting sound data from the running belt conveyor. Stacked sparse auto-encoders and a convolutional neural network are used to analyse the audio data, and a spectral clustering algorithm is employed to categorise the kinds of faults. A real diagnosis system built with our method is applied at XISHAN Coal Electricity Group. The fault diagnosis system provides a new technology for detecting belt roller faults, which can not only replace manual detection but also improve the fault diagnosis accuracy.

This paper is organized as follows. Three feature extraction methods are presented and compared in Section 2. Section 3 presents the audio cluster analysis and equipment fault grading. Section 4 provides the implementation of the intelligent fault diagnosis system. Section 5 states the conclusions.

2. Feature extraction

In order to recognise the fault sound from the normal sound, features should be extracted from the audio data. The sound is made mainly by the friction of the rollers themselves and of the rollers with the belt. The belt conveyor system is shown in Fig. 1 and the rollers mounted on the belt conveyor are shown in Fig. 2. The period and amplitude of the sound change with the load carried by the belt. The audio data obtained directly from the sensors often contains much disturbing data; the sensor installation is shown in Fig. 3. Therefore, it is necessary to eliminate the irrelevant data. Several methods for extracting features to obtain effective characterization information are studied as follows [28].

2.1. Data analysis and preprocessing

Sounds in the frequency range of 20 Hz to 20,000 Hz can be heard by humans, and the spectrum of the noise generated by the rollers is distributed in the same range. The noise differs noticeably between the running states of the belt conveyor, so it can be used to monitor the belt running state, and it changes significantly when a roller is damaged. The sampling frequency of the sensors is set to 44,100 Hz in this paper. The noise audio is sampled in various states from no-load to full-load. A total of 8000 signal waves are sampled and each wave lasts 20 s. To avoid manual interference during the sampling process, the first three seconds and the last three seconds are removed from each 20 s audio signal. The remaining audio is split evenly, giving audio sequences of 80,000 bytes with a length of 14 s.

The audio data collected from six sensors is shown in Fig. 4. From Fig. 4 we can see that the range of the audio fluctuation is large. The first and second waveforms in Fig. 4 are normal waves, the third and fourth are disturbing waves, and the remaining ones are fault waves.
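As a concrete illustration of this preprocessing step, the following minimal Python sketch (not the authors' code; the file name and the use of scipy for WAV reading are assumptions) trims the first and last three seconds of a 20 s recording sampled at 44,100 Hz and splits the remainder into 14 s pieces:

    import numpy as np
    from scipy.io import wavfile

    FS = 44_100        # sensor sampling rate used in the paper (Hz)
    TRIM_S = 3         # seconds removed at the start and at the end
    CLIP_S = 20        # nominal length of each recorded wave (s)

    def trim_recording(path):
        """Load one 20 s recording and keep only the middle 14 s."""
        fs, signal = wavfile.read(path)           # hypothetical WAV file per sensor
        assert fs == FS, f"unexpected sampling rate {fs}"
        return signal[TRIM_S * fs:(CLIP_S - TRIM_S) * fs].astype(np.float32)

    def split_evenly(segment, piece_s=14):
        """Split a trimmed segment into equal pieces of piece_s seconds."""
        n = piece_s * FS
        pieces = [segment[i:i + n] for i in range(0, len(segment) - n + 1, n)]
        return np.stack(pieces)

    # usage (file name is hypothetical):
    # pieces = split_evenly(trim_recording("sensor03_wave0001.wav"))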

Fig. 3. The sensor setting on belt conveyor.

The difference method, the linear fitting method and the sine fitting method are used to try to distil out the main trend of the audio waves. The results are shown in Fig. 5.

According to Fig. 5, the difference method has the best effect in data processing. Compared with the original waveform, the amplitude range using the difference method is reduced from about [-20,000, 20,000] to [-5000, 5000], while the fault information is still preserved.
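A hedged sketch of the three detrending options, assuming the trimmed waveform is already a NumPy array; the first-order difference is the variant the paper finally adopts, while the linear and sinusoidal fits are included only for comparison:

    import numpy as np

    def detrend_difference(x):
        """First-order difference: removes the slow trend, keeps fault impulses."""
        return np.diff(x)

    def detrend_linear(x):
        """Subtract a least-squares straight-line fit."""
        t = np.arange(len(x))
        slope, intercept = np.polyfit(t, x, deg=1)
        return x - (slope * t + intercept)

    def detrend_sine(x, freq_hz, fs=44_100):
        """Subtract a least-squares sine fit at an assumed trend frequency."""
        t = np.arange(len(x)) / fs
        basis = np.column_stack([np.sin(2 * np.pi * freq_hz * t),
                                 np.cos(2 * np.pi * freq_hz * t),
                                 np.ones_like(t)])
        coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
        return x - basis @ coeffs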
2.2. Common feature extraction

By using the center distance and origin distance of the samples, common time-domain characteristic parameters are found, which include the mean value (M), the peak value (P), the root mean square (RMS), the variance (V), the standard deviation (SD), the skewness (S) and the kurtosis (K).

For the audio data sampled in fault diagnosis, there is a weak correlation between peak values and belt speeds, and a strong correlation between peak values and fault data. The energy of the field sampling signal can be described by the RMS, so the RMS can be used to judge whether a roller is running normally.

The fault signal often contains high-frequency components, and the zero-crossing rate (ZCR) is closely related to the high-frequency components in the signal. In order to obtain the zero-crossing rate, the data is divided into frames, every frame is centralized and normalized, and then the zero-crossing rate of each frame is calculated.

In addition, it is found that the amplitude changes little in the normal signal and much more in the abnormal signal. Therefore, peaks whose values exceed a certain threshold are counted, and the number of peaks with amplitude over the threshold in a frame is used as a criterion for fault identification. In this paper, the amplitude threshold is 2000 and the threshold on the number of peaks per frame is 9. These statistical results (SR) are used as a feature for fault diagnosis. Then, the principal component analysis (PCA) algorithm is used to reduce the dimensions of the suspected fault data. The covariance in PCA can be calculated as follows:

cov(X, Y) = Σ_{i=1}^{n} (X_i - X̄)(Y_i - Ȳ) / (n - 1)    (1)

The detailed information of the PCA results is summarized in Table 1. The first six features are principal components with a large variance proportion, and their total proportion is 0.9951. The principal components carry most of the feature information, so they can be used for further analysis.

Table 1
Variance proportion of each feature.

Feature      SR       P        ZCR      M        S        K        Total
Proportion   0.5642   0.1314   0.1164   0.1020   0.0588   0.0223   0.9951
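To make this feature set concrete, the following sketch (the frame length and the exact feature order are assumptions, not values taken from the paper) computes the frame-wise statistics described in this subsection and reduces them with PCA:

    import numpy as np
    from sklearn.decomposition import PCA

    AMP_THRESHOLD = 2000     # amplitude threshold from the text
    PEAK_THRESHOLD = 9       # peaks per frame regarded as suspicious

    def frame_signal(x, frame_len=2048):
        """Cut a 1-D signal into non-overlapping frames (length is an assumption)."""
        n_frames = len(x) // frame_len
        return x[:n_frames * frame_len].reshape(n_frames, frame_len)

    def frame_features(frame):
        """Per-frame statistics: M, P, RMS, V, SD, S, K, ZCR and peak count (SR)."""
        centered = frame - frame.mean()
        normed = centered / (centered.std() + 1e-12)
        zcr = np.mean(np.abs(np.diff(np.sign(normed))) > 0)
        sr = np.sum(np.abs(frame) > AMP_THRESHOLD)
        return np.array([
            frame.mean(),                    # mean value M
            np.abs(frame).max(),             # peak value P
            np.sqrt(np.mean(frame ** 2)),    # RMS
            frame.var(),                     # variance V
            frame.std(),                     # standard deviation SD
            (normed ** 3).mean(),            # skewness S
            (normed ** 4).mean() - 3.0,      # excess kurtosis K
            zcr,                             # zero-crossing rate
            sr,                              # over-threshold peak count (SR)
        ])

    def reduce_with_pca(features, n_components=6):
        """Keep the six leading components (99.51% of the variance in Table 1)."""
        return PCA(n_components=n_components).fit_transform(features)

    # usage: feats = np.stack([frame_features(f) for f in frame_signal(piece)])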
2.3. Feature extraction based on stacked sparse auto-encoder

Previous studies have shown that the location of the sensor is closely related to the diagnosis result in practical applications. Sensors are placed according to their monitoring range. In order to avoid misjudgment, the impact of the sensor location should be minimized as much as possible.

Mel-frequency cepstral coefficients (MFCC), which cover both time and frequency information, are widely used in audio analysis and fault diagnosis. Moreover, MFCC is less affected by the sensor sampling location. The MFCC frequency-domain plot of one sampled signal is shown in Fig. 6. In Fig. 6, the abscissa represents the number of windows, the ordinate represents the characteristics of the different dimensions, and the color represents the size of the eigenvalue.

An auto-encoder (AE) is a symmetric neural network that learns features in an unsupervised way by minimizing reconstruction errors [29]. However, the AE has some problems: it cannot effectively extract meaningful features when the input layer is simply copied to the hidden layer. An extension of the AE, the sparse AE (SAE), is inspired by the sparse encoder. By introducing a sparse penalty term into the AE to learn relatively sparse features, the performance of the traditional AE is effectively improved.

Given a set of n input examples x_i, i = 1, ..., n, the weight matrices W_1, W_2 and the bias vectors b_1 and b_2 are adapted using back-propagation to minimise the reconstruction error (RE). The reconstruction error can be calculated as follows:

L(X, Z) = (1/2n) Σ_{i=1}^{n} ||x_i - z_i||² + (λ/2) Σ_l Σ_i Σ_j (W_ji^l)²    (2)

Fig. 4. Noise audio signal waveforms from different sensors.



Fig. 5. Comparison of waveforms before and after processing (difference method, linear function and sine function model from top to bottom; the waveform after eliminating the trend is shown in blue).

Fig. 6. MFCC frequency domain.
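Fig. 6 visualises the 12x72 MFCC matrix that serves as the auto-encoder input; as a hedged illustration, the sketch below computes such a matrix with librosa, where the window and hop settings are assumptions because the paper does not report them:

    import numpy as np
    import librosa

    def mfcc_matrix(piece, fs=44_100, n_mfcc=12, n_frames=72):
        """Return a fixed 12 x 72 MFCC matrix for one audio piece.

        The hop length is chosen so that roughly n_frames windows cover
        the piece; the exact windowing used in the paper is not reported.
        """
        piece = piece.astype(np.float32)
        hop = max(1, len(piece) // n_frames)
        mfcc = librosa.feature.mfcc(y=piece, sr=fs, n_mfcc=n_mfcc,
                                    n_fft=4 * hop, hop_length=hop)
        return mfcc[:, :n_frames]

    # the flattened 864-dimensional vector feeds the stacked sparse auto-encoder:
    # x = mfcc_matrix(piece).reshape(-1)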

where X denotes the sample matrix, Z denotes the reconstruction matrix, n denotes the number of samples, and W_ji^l denotes the weight coefficients.

Further, we constrain the expected activation of the hidden units to be sparse. We add a regularisation term which penalises a deviation of the expected activation from a (low) fixed level ρ. Thus, it turns into the following optimisation problem:

L_sparse(X, Z) = L(X, Z) + β PN    (3)

where β denotes the penalty factor and PN is a sparse penalty term that can be calculated as follows:

PN = Σ_{j=1}^{S_2} KL(ρ || ρ̂_j)    (4)

where S_2 denotes the number of hidden units and KL is the Kullback-Leibler (KL) divergence, calculated as follows:

KL(ρ || ρ̂_j) = ρ ln(ρ / ρ̂_j) + (1 - ρ) ln((1 - ρ) / (1 - ρ̂_j))    (5)
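A minimal NumPy sketch of Eqs. (2)-(5) for a single encoder layer; it is meant only to make the sparsity penalty concrete, and the default coefficient values are assumptions (the sparse and penalty coefficients used for the final model are reported after Table 2):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def sae_loss(X, W1, b1, W2, b2, lam=1e-4, beta=0.016, rho=0.013):
        """Eqs. (2)-(5) for a single auto-encoder layer.

        rho  : target average activation (sparse coefficient, value assumed here)
        beta : coefficient of the KL penalty (penalty coefficient, value assumed here)
        lam  : weight-decay coefficient (assumed)
        """
        H = sigmoid(X @ W1 + b1)                   # hidden activations (n, S2)
        Z = sigmoid(H @ W2 + b2)                   # reconstruction (n, d)
        n = X.shape[0]
        recon = 0.5 / n * np.sum((X - Z) ** 2)                        # Eq. (2), data term
        decay = 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))       # Eq. (2), weight decay
        rho_hat = H.mean(axis=0)                                      # average activation per unit
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))  # Eqs. (4)-(5)
        return recon + decay + beta * kl                              # Eq. (3)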
The 12x72 dimensional data obtained by MFCC is expanded into a one-dimensional vector as the input of the sparse auto-encoder. So far, there is no algorithm that can automatically optimize the network structure, so multiple comparative experiments are carried out in this paper and different parameters are selected for training in order to obtain the optimal results.

A total of 8000 sets of sample data are collected, of which 7000 are used as the training set and 1000 as the test set. The number of hidden layers is set as 2, 3, 4 and 5, respectively. The results of the experiments are given in Table 2.

Table 2
Comparison of experimental results of different parameters.

Input size   Number of hidden layers   Number of nodes        RE
864          2                         256, 256               0.340
864          2                         256, 128               0.342
864          2                         128, 128               0.362
864          3                         256, 128, 128          0.239
864          3                         256, 128, 64           0.253
864          3                         200, 100, 50           0.229
864          4                         128, 128, 64, 32       0.120
864          4                         128, 64, 32, 16        0.109
864          4                         128, 32, 16, 8         0.063
864          5                         128, 64, 32, 32, 16    0.121
864          5                         128, 32, 32, 32, 32    0.118
864          5                         128, 64, 32, 16, 8     0.084

Table 2 shows that the result is optimal in the case of four hidden layers with 128, 32, 16 and 8 nodes, respectively. The remaining parameters are as follows: the activation function is sigmoid, the loss function is the mean square error, the learning rate is 0.1, the momentum is 0.8, the number of iterations is 100,

the batch size is 200, the sparse coefficient is 0.013, and the penalty coefficient is 0.016.
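For reference, a hedged PyTorch sketch of the selected 864-128-32-16-8 encoder with sigmoid units; the greedy layer-wise pre-training and the optimiser settings listed above are not reproduced here:

    import torch
    import torch.nn as nn

    class StackedEncoder(nn.Module):
        """Encoder of the selected SAE: 864 -> 128 -> 32 -> 16 -> 8, sigmoid units."""

        def __init__(self, sizes=(864, 128, 32, 16, 8)):
            super().__init__()
            layers = []
            for n_in, n_out in zip(sizes[:-1], sizes[1:]):
                layers += [nn.Linear(n_in, n_out), nn.Sigmoid()]
            self.encoder = nn.Sequential(*layers)

        def forward(self, x):              # x: (batch, 864) flattened MFCC matrix
            return self.encoder(x)

    # usage sketch, batch size 200 as reported above:
    # codes = StackedEncoder()(torch.randn(200, 864))   # -> (200, 8)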

3. Audio cluster analysis and equipment fault grading

The real-time audio data obtained from the belt conveyor cannot be used directly for fault diagnosis. Because of the noise generated by the rollers during running, the components of the collected audio signal are very complicated. In this paper, a clustering algorithm is used to classify the faults.

3.1. Audio recognition based on K-means algorithm

The K-means clustering algorithm has become the most popular clustering method due to its unsupervised nature and simple idea. In the K-means clustering algorithm, the clusters are determined by calculating the distances among the sample data points. It is described as Algorithm 1.
Algorithm 1 K-means algorithm.
Input: The data set X;
Output: The data set X and the corresponding cluster labels 1, ..., k;
1: Initialise the K clustering centers μ_1, ..., μ_k
2: for j = 1 to k do
3:    Assign each sample i to a cluster: C_j = arg min_j ||x^(i) - μ_j||², where C_j is the label to which sample i belongs
4:    Update the clustering centers: μ_j = Σ_{i=1}^m 1{c^(i) = j} x^(i) / Σ_{i=1}^m 1{c^(i) = j}
5: end for
6: Get the cluster that each sample belongs to.
7: return The corresponding cluster labels 1, ..., k.
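A compact realisation of Algorithm 1 with scikit-learn; the feature matrix is assumed to hold one row per 14 s audio piece (either the PCA features or the SAE codes):

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_pieces(features, n_clusters=4, seed=0):
        """Cluster the audio pieces and return a label per piece.

        n_clusters=4 matches the setting found most consistent with practice
        (normal / interference / fault / serious fault).
        """
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
        labels = km.fit_predict(np.asarray(features))
        return labels, km.cluster_centers_

    # usage: labels, centers = cluster_pieces(sae_codes)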

The clustering quality of the K-means algorithm depends strongly on the initialisation of the clustering centers, so the cluster quality cannot be guaranteed to be optimal due to the random initial selection of the clustering centers.

The audio data is split into small pieces with a 14 s time span. In order to evaluate the performance of the algorithm models, it is assumed that: (1) piece data with normal sound belongs to the same cluster; (2) for piece data with abnormal sound, the largest component weight is taken as the actual label of the sample. The features obtained based on PCA and SAE are taken as the input of the K-means algorithm. The K value is set to 2, 3, 4, 5 and 6 for comparison. The experimental results are shown in Fig. 7.

Fig. 7. Accuracy of K-means algorithm for different features (Solid line is the result of features obtained based on PCA, and dotted line is the result of features obtained based on SAE).

According to Fig. 7, one can see that: (1) when the number of clusters is 2, the result is the best; (2) the features obtained based on SAE are better than those obtained based on PCA, especially when the number of clusters is more than 2; however, the features based on PCA are less distinguishable, so when there are only two clusters their clustering results are better than those based on SAE. If the number of clusters is 2, the audio signal is only divided into normal and fault, and the analysis finds that most devices cannot be diagnosed from this result. The clustering is more consistent with the practical application when the number of clusters is 4. The resulting fault classification is given in Table 3.

Table 3
Fault classification result.

Source of noise                                                 Fault level     Maintenance measure
Broken roller                                                   Very serious    Immediate repair
Roller rotates off-center, drum collision due to loose bolt     Serious         Repair after shutdown
Normal operating equipment, belt friction                       Normal          No repair
In and out of coal                                              Normal          No repair

3.2. Audio recognition based on spectral clustering

The K-means algorithm easily falls into local optima. When the number of clusters is 4, the accuracy is 90%. The results obtained by spectral clustering often outperform those of K-means. The main feature of spectral clustering is that groups can be obtained by using similarity measures among the piece data. A formal description of the spectral clustering algorithm is given in Algorithm 2.

Algorithm 2 Spectral clustering algorithm.
Input: The data set X;
Output: The data set X and the corresponding cluster labels 1, ..., k;
1: Turn the original problem into a graph: connect each point with the nearest k_knn points with the highest similarity based on k-nearest neighbors.
2: Calculate the affinity matrix W_ij = W_ji = exp(-||x_i - x_j||² / (2σ²));
   Calculate the degree matrix D with D_ii = Σ_{j=1}^n W_ij;
   Calculate the Laplacian matrix L = D - W (so that f^T L f = f^T D f - f^T W f).
3: Compute the eigenvectors of the Laplacian matrix: the first C eigenvalues and the corresponding eigenvectors.
4: The eigenvectors are clustered by K-means.
5: end
6: Get the cluster that each sample belongs to.
7: return The corresponding cluster labels 1, ..., k.

The SAE method is used to obtain the features. The k_knn value is set to 8, 9, 10, 11 and 12, and the number of clusters is set to 2, 3, 4, 5 and 6 for comparison. Experimental results are shown in Fig. 8. The experimental results are best when k_knn is 10. The samples are obtained by dividing the source audio data into 10 equal parts. The features obtained based on PCA and SAE are taken as the inputs of the spectral clustering algorithm, and the experimental results are shown in Fig. 9.
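Algorithm 2 can be realised in a few lines of NumPy and scikit-learn; in the sketch below the Gaussian width sigma and the symmetrisation of the k-nearest-neighbour graph are assumptions:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neighbors import kneighbors_graph

    def spectral_cluster(X, n_clusters=4, k_knn=10, sigma=1.0, seed=0):
        """Spectral clustering following Algorithm 2 (unnormalised Laplacian)."""
        X = np.asarray(X, dtype=float)

        # Step 1: k-nearest-neighbour connectivity graph (k_knn = 10 performs best).
        conn = kneighbors_graph(X, n_neighbors=k_knn, mode="connectivity").toarray()
        conn = np.maximum(conn, conn.T)                          # symmetrise

        # Step 2: Gaussian affinity, degree matrix and Laplacian L = D - W.
        d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        W = conn * np.exp(-d2 / (2.0 * sigma ** 2))
        D = np.diag(W.sum(axis=1))
        L = D - W

        # Step 3: eigenvectors belonging to the smallest eigenvalues of L.
        _, eigvecs = np.linalg.eigh(L)
        U = eigvecs[:, :n_clusters]

        # Step 4: K-means on the spectral embedding gives the final labels.
        return KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(U)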

Fig. 8. Accuracy of spectral clustering algorithm for different features (Solid line is the result of features obtained based on PCA, and dotted line is the result of features obtained based on SAE).

Fig. 9. Accuracy of spectral clustering algorithm for different features (Solid line is the result of features obtained based on PCA, and dotted line is the result of features obtained based on SAE).

Fig. 10. Misdiagnosis rate of spectral clustering under different values (N is normal category, F is fault category).

According to Fig. 9, the experimental results of the spectral clustering algorithm lead to the same conclusion as those of the K-means algorithm; however, the spectral clustering algorithm is more accurate. As k_knn moves up or down from 10, the accuracy goes down. The relationship between the misdiagnosis rate and k_knn is further analyzed, with faults and serious faults grouped together. Experimental results using the features obtained based on SAE are shown in Fig. 10.

According to Fig. 10, when the value of k_knn is reduced, the sensitivity of the spectral clustering algorithm increases; it becomes sensitive to fluctuations in the normal samples, which leads to a decrease in diagnostic accuracy.

4. Implementation of intelligent fault diagnosis system

Using the above analysis methods, this section compares different fault diagnosis models. The research results are applied in a practical fault diagnosis system for a 1500 m belt conveyor.

4.1. Fault diagnosis based on different models

4.1.1. Fault diagnosis based on Support Vector Machine

The Support Vector Machine (SVM) is developed from the optimal separating plane under the linearly separable condition. The SVM tries to place a boundary between two classes such that the distance between the boundary and the nearest data point in each class is maximal. The nearest data points, which define the margin, are known as support vectors.

The features obtained based on PCA and SAE are taken as the inputs of the SVM algorithm. The sampled data is labeled according to the results of the cluster analysis. The data sets are divided into training samples and test samples with a ratio of 7:3. The SVM classifier adopts the Gaussian radial basis function as its kernel, and the penalty factor is set to 0.8. The features obtained based on SAE are better than those obtained based on PCA: the diagnostic accuracy with SAE-based and PCA-based features is 91.9% and 81.7%, respectively.
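A hedged scikit-learn sketch of this experiment, in which the penalty factor 0.8 is interpreted as the soft-margin parameter C and the kernel width is left at its default:

    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def train_svm(features, labels, seed=0):
        """RBF-kernel SVM on the cluster-derived labels with a 7:3 split."""
        X_tr, X_te, y_tr, y_te = train_test_split(
            features, labels, test_size=0.3, random_state=seed, stratify=labels)
        clf = SVC(kernel="rbf", C=0.8, gamma="scale")   # C = penalty factor 0.8
        clf.fit(X_tr, y_tr)
        return clf, clf.score(X_te, y_te)

    # usage: model, test_accuracy = train_svm(sae_codes, cluster_labels)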

4.1.2. Fault diagnosis based on deep neural network

A lot of manual work is needed to extract fault features in traditional fault diagnosis methods. However, an auto-encoder (AE) can extract features automatically in deep learning. What is more, using the SAE to obtain the network parameters and initialize the network can alleviate the problem that the back-propagation algorithm easily falls into local optima. The network weights obtained by pre-training can better express the input data structure and improve the generalization ability.

A Softmax classifier is used to solve the multi-class recognition problem. The parameters from the feature extraction process of the SAE are taken as the initial values of the deep neural network, and a Softmax classification layer is added on top of the network. The structure of the deep neural network includes the input layer (864 nodes), the first hidden layer (128 nodes), the second hidden layer (32 nodes), the third hidden layer (16 nodes), the fourth hidden layer (8 nodes) and the Softmax classification layer (4 nodes) as the output.

The sample sets are divided into training sets and test sets with a ratio of 7:3. The average accuracy is 94.4% when minimizing the logarithmic loss function. Experimental results show that the deep neural network has higher recognition accuracy than the SVM algorithm.
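A PyTorch sketch of the network described above (864-128-32-16-8 with a 4-way softmax head) trained with the logarithmic (cross-entropy) loss; initialising the hidden layers from the pre-trained SAE is only indicated by a comment because that code is not given in the paper:

    import torch
    import torch.nn as nn

    class FaultDNN(nn.Module):
        """864 -> 128 -> 32 -> 16 -> 8 -> 4 deep neural network."""

        def __init__(self):
            super().__init__()
            self.body = nn.Sequential(
                nn.Linear(864, 128), nn.Sigmoid(),
                nn.Linear(128, 32), nn.Sigmoid(),
                nn.Linear(32, 16), nn.Sigmoid(),
                nn.Linear(16, 8), nn.Sigmoid(),
                nn.Linear(8, 4),          # softmax is folded into the loss below
            )
            # In the paper the hidden layers are initialised from the pre-trained
            # SAE; here those weights would be copied in before training.

        def forward(self, x):
            return self.body(x)

    # minimising the logarithmic loss corresponds to cross-entropy training:
    # model = FaultDNN()
    # loss = nn.CrossEntropyLoss()(model(torch.randn(64, 864)),
    #                              torch.randint(0, 4, (64,)))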

4.1.3. Fault diagnosis based on deep convolutional neural network

Convolutional neural networks (CNNs), which are specifically designed for analysing variable and complex signals, have been shown to outperform many other techniques. Due to their unique ability to maintain the initial information regardless of shift, scale and distortion, achieved through shared weights and spatial sub-sampling, CNNs are widely used.

The MFCC feature obtained previously is a 12x72 dimensional matrix, which is used as the input of the deep convolutional neural network. 7000 samples are used as the training set, and the remaining 1000 samples are used as the test set.

Fig. 11. The diagnosis results for data from different sensors.

Table 4
Comparison of experimental results of different parameters (input size 72x12x1 for all configurations).

No   Structure of hidden layers                                                                        Accuracy rate
1    Conv.(64x(3x3)), pooling(2x2), softmax(4)                                                          0.668
2    Conv.(48x(3x3)), pooling(2x2), Conv.(32x(3x3)), pooling(2x2), softmax(4)                           0.632
3    Conv.(48x(3x3)), pooling(2x2), Conv.(16x(3x3)), pooling(2x2), softmax(4)                           0.570
4    Conv.(48x(3x3)), Conv.(16x(3x3)), pooling(2x2), full connection(100), softmax(4)                   0.638
5    Conv.(48x(3x3)), pooling(2x2), Conv.(16x(3x3)), pooling(2x2), full connection(100), softmax(4)     0.660
6    Conv.(48x(3x3)), pooling(2x2), Conv.(16x(3x3)), pooling(2x2), full connection(10), softmax(4)      0.880
7    Conv.(48x(2x2)), pooling(2x2), Conv.(16x(2x2)), pooling(2x2), full connection(10), softmax(4)      0.635
8    Conv.(32x(3x3)), pooling(2x2), Conv.(32x(3x3)), pooling(2x2), softmax(4)                           0.640
9    Conv.(32x(3x3)), pooling(2x2), Conv.(32x(3x3)), pooling(2x2), full connection(10), softmax(4)      0.900
10   Conv.(32x(3x3)), pooling(2x2), Conv.(16x(3x3)), pooling(2x2), full connection(10), softmax(4)      0.980
11   Conv.(32x(2x2)), pooling(2x2), Conv.(16x(2x2)), pooling(2x2), full connection(10), softmax(4)      0.902
12   Conv.(32x(3x3)), Conv.(16x(3x3)), pooling(2x2), full connection(10), softmax(4)                    0.693
13   Conv.(32x(3x3)), pooling(2x2), Conv.(16x(3x3)), full connection(10), softmax(4)                    0.680
14   Conv.(32x(3x3)), pooling(2x2), Conv.(16x(3x3)), pooling(2x2), full connection(100), softmax(4)     0.740

Table 5
Statistical results of running state.

Month No              1       2       3       4       5       6       7       8       9       10      11      12
Fault No              125     93      89      101     77      118     84      92      104     75      91      98
Serious fault No      72      49      68      52      33      66      51      56      63      46      57      49
Real total fault No   194     140     152     151     109     182     131     153     165     119     144     145
Accuracy rate         98.5%   98.6%   96.8%   98.7%   99.1%   98.9%   97.0%   96.7%   98.8%   98.3%   97.3%   98.6%

Multiple experiments are carried out in order to obtain the optimal results. The results of these experiments are given in Table 4.

Table 4 shows that: (1) choosing a 3x3 convolution kernel is optimal, and reducing the size of the convolution kernel leads to a decrease in accuracy; (2) removing the pooling layer behind a convolutional layer leads to a decrease in classification accuracy; (3) the classification effect of adding a fully connected layer is better than using Softmax directly; (4) over-fitting easily occurs when the number of parameters of the fully connected layer increases, which leads to a decrease in classification accuracy.

The optimal structure of the convolutional neural network includes the input layer (72x12x1), a convolution layer (32x(3x3), stride 1), a pooling layer (2x2, stride 2), a convolution layer (16x(3x3), stride 1), and a pooling layer (2x2, stride 2). The learning rate is 0.01, the number of iterations is 50, the sample size is 256, and the activation function is sigmoid. The classification accuracy is 98%.
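A hedged PyTorch sketch of this architecture; the classifier head (a 10-unit fully connected layer followed by a 4-way softmax) follows the best-performing row of Table 4, and the padding choice is an assumption since the paper does not state it:

    import torch
    import torch.nn as nn

    class FaultCNN(nn.Module):
        """Conv(32, 3x3) -> pool -> Conv(16, 3x3) -> pool -> FC(10) -> softmax(4)."""

        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1), nn.Sigmoid(),
                nn.MaxPool2d(kernel_size=2, stride=2),
                nn.Conv2d(32, 16, kernel_size=3, stride=1, padding=1), nn.Sigmoid(),
                nn.MaxPool2d(kernel_size=2, stride=2),
            )
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.LazyLinear(10), nn.Sigmoid(),   # avoids hard-coding the flattened size
                nn.Linear(10, 4),                  # 4 output classes; softmax in the loss
            )

        def forward(self, x):                      # x: (batch, 1, 72, 12) MFCC "image"
            return self.classifier(self.features(x))

    # usage sketch:
    # logits = FaultCNN()(torch.randn(8, 1, 72, 12))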
The deep convolutional neural network shows higher recognition accuracy than the SVM and the deep neural network. The diagnosis results for the data in Table 4 are shown in Fig. 11.

Fig. 12. Comparison of different algorithm accuracy.
4.2. Intelligent fault diagnosis system

The belt conveyor in a coal preparation plant is a popular class of delivery equipment. A fault in the belt conveyor may cause the whole system to break down, so it is necessary to monitor equipment failures. However, it is difficult to judge the running state from the sampled data directly. In this paper, a clustering algorithm is used to obtain the correlation between different rollers' faults and, at the same time, to establish fault classification standards. The above results are used for fault diagnosis.

A real fault diagnosis system based on the technology proposed in this paper has been built and applied at XISHAN Coal Electricity Group. It consists of hardware and software. The hardware includes a data acquisition module, a communication module, an AC power supply module, a DC power supply module, a control module, an industrial computer, and so on. The software has three types of functions: fault identification, fault statistics and mobile terminal message push. In total, the hardware consists mainly of 64 sensors, 11 network switches, 11 optical fiber transceivers, 11 AC/DC converters, 1 controller, 1 industrial computer, 1 AC distribution cabinet and some cables and optical fibers.

In the real fault diagnosis system, there are four fault levels, represented by the numbers 0, 1, 2 and 3. The number 0 denotes the normal signal, the number 1 indicates the interference signal, the number 2 means a fault, and the number 3 means a serious fault. The diagnostic accuracy of the different algorithms in the various categories is shown in Fig. 12.

According to Fig. 12, the deep convolutional neural network has the highest accuracy in all categories, and this result is used to realize the intelligent fault diagnosis. The real fault diagnosis system has been running for more than one year; the operation results for 12 months are given in Table 5.

5. Conclusion

In this paper, an audio-based intelligent fault diagnosis system for belt conveyors is studied, which has achieved good results in inspecting the running states of rollers in the coal industry. The intelligent fault diagnosis system not only has higher accuracy but also works in real time. The practical application shows that the system can effectively improve the intelligence of production and reduce the amount of labor. The real fault diagnosis system has been running for more than one year in a coal preparation plant, with advantages such as high reliability and adaptation to harsh industrial environments.

The fault diagnosis technology proposed in this paper provides a new idea for belt roller fault diagnosis. The technology can be extended to related process industry fields. At the same time, combined with big data technology, the audio-based idea and technology can be used for equipment life prediction.

Declaration of Competing Interest

None.

References

[1] N. Zheng, L. Su, D. Zhang, et al., A computational model for Ratbot locomotion based on cyborg intelligence, Neurocomputing 170 (C) (2015) 92–97.
[2] M. Tian, M. Tian, J. Yang, et al., Optimization of RBF neural network used in state recognition of coal flotation, J. Intell. Fuzzy Syst. 34 (2) (2018) 1193–1204.
[3] C. Igathinathane, U. Ulusoy, Machine vision methods based particle size distribution of ball- and gyro-milled lignite and hard coal, Powder Technol. 297 (2) (2016) 71–80.
[4] V. Agrawal, B. Panigrahi, P. Subbarao, Intelligent decision support system for detection and root cause analysis of faults in coal mills, IEEE Trans. Fuzzy Syst. 25 (4) (2017) 934–944.
[5] S. Xiao, S. Liu, F. Jiang, et al., Nonlinear dynamic response of reciprocating compressor system with rub-impact fault caused by subsidence, J. Vib. Control 25 (11) (2019) 1737–1751.
[6] G. Guo, S. Li, Content-based audio classification and retrieval by support vector machines, IEEE Trans. Neural Netw. 14 (1) (2003) 209–215.
[7] J. Ying, T. Kirubarajan, K. Pattipati, et al., A hidden Markov model-based algorithm for fault diagnosis with partial and imperfect tests, IEEE Trans. Syst. Man Cybern. Part C 30 (4) (2000) 463–473.
[8] Z. Wang, Q. Zhang, J. Xiong, et al., Fault diagnosis of a rolling bearing using wavelet packet denoising and random forests, IEEE Sens. J. 17 (17) (2017) 5581–5588.
[9] Z. Wang, Y. Liu, P. Griffin, A combined ANN and expert system tool for transformer fault diagnosis, in: Proceedings of the 2000 IEEE Power Engineering Society Winter Meeting (Cat. No. 00CH37077), 2, 2000, pp. 1261–1269.
[10] D. Wu, Q. Yang, F. Tian, et al., Fault diagnosis based on k-means clustering and PNN, in: Proceedings of the 2010 Third International Conference on Intelligent Networks and Intelligent Systems, 2010, pp. 173–176.
[11] A. Ramos, C. Corona, J. Verdegay, et al., An approach for fault diagnosis using a novel hybrid fuzzy clustering algorithm, in: Proceedings of the 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2018, pp. 1–8.
[12] J. Gao, M. Kang, J. Tian, et al., Unsupervised locality-preserving robust latent low-rank recovery-based subspace clustering for fault diagnosis, IEEE Access 6 (2018) 52345–52354.
[13] T. Zhang, F. Ma, et al., Interval type-2 fuzzy local enhancement based rough k-means clustering considering imbalanced clusters, IEEE Trans. Fuzzy Syst. (2019), doi:10.1109/TFUZZ.2019.2924402.
[14] L. Rabiner, M. Sambur, An algorithm for determining the endpoints of isolated utterances, Bell Syst. Tech. J. 54 (2) (1975) 297–315.
[15] G. Rilling, P. Flandrin, P. Goncalves, On empirical mode decomposition and its algorithms, in: Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing, 3, 2003, pp. 8–11.
[16] A. Sokhandan, P. Adibi, M. Sajadi, Multitask fuzzy Bregman co-clustering approach for clustering data with multisource features, Neurocomputing 247 (2017) 102–114.
[17] X. Yu, F. Dong, E. Ding, et al., Rolling bearing fault diagnosis using modified LFDA and EMD with sensitive feature selection, IEEE Access 6 (2018) 3715–3730.
[18] Y. Chen, O. Fink, G. Sansavini, Combined fault location and classification for power transmission lines fault diagnosis with integrated feature extraction, IEEE Trans. Ind. Electron. 65 (1) (2018) 561–569.
[19] R. Sánchez, P. Lucero, R. Vásquez, et al., Feature ranking for multi-fault diagnosis of rotating machinery by using random forest and KNN, J. Intell. Fuzzy Syst. 34 (6) (2018) 3463–3473.
[20] H. Shao, H. Jiang, H. Zhang, et al., Electric locomotive bearing fault diagnosis using a novel convolutional deep belief network, IEEE Trans. Ind. Electron. 65 (3) (2018) 2727–2736.
[21] L. Wen, X. Li, L. Gao, et al., A new convolutional neural network-based data-driven fault diagnosis method, IEEE Trans. Ind. Electron. 65 (7) (2018) 5990–5998.
[22] J. Pan, Y. Zi, J. Chen, et al., LiftingNet: a novel deep learning network with layerwise feature learning from noisy mechanical data for fault classification, IEEE Trans. Ind. Electron. 65 (6) (2018) 4973–4982.
[23] C. Li, D. Valente, C. Mariela, et al., A systematic review of fuzzy formalisms for bearing fault diagnosis, IEEE Trans. Fuzzy Syst. 27 (7) (2019) 1362–1382.
[24] P. Henriquez, J. Alonso, M. Ferrer, et al., Review of automatic fault diagnosis systems using audio and vibration signals, IEEE Trans. Syst. Man Cybern. Syst. 44 (5) (2014) 642–652.
[25] S. Yadav, K. Tyagi, B. Shah, et al., Audio signature-based condition monitoring of internal combustion engine using FFT and correlation approach, IEEE Trans. Instrum. Meas. 60 (4) (2011) 1217–1226.
[26] A. Glowacz, Z. Glowacz, Diagnosis of stator faults of the single-phase induction motor using acoustic signals, Appl. Acoust. 117 (2017) 20–27.
[27] J. Lee, H. Choi, D. Park, et al., Fault detection and diagnosis of railway point machines by sound analysis, Sensors 16 (4) (2016) 549.
[28] S. Ramírez-Gallego, B. Krawczyk, S. García, et al., A survey on data preprocessing for data stream mining: current status and future directions, Neurocomputing 239 (C) (2017) 39–57.
[29] H. Shin, M. Orton, D. Collins, et al., Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data, IEEE Trans. Pattern Anal. Mach. Intell. 35 (8) (2012) 1930–1943.

Mingjin Yang received the B.Sc. degree from Xuzhou Normal University, China in 2011, and the M.Sc. degree from Shanxi University, China in 2014. He is now a doctoral student at Shanghai University. His research interests include power control and fault diagnosis.

Wenju Zhou received the B.Sc. and M.Sc. degrees from Shandong Normal University, China in 1990 and 2005, and the Ph.D. degree from Shanghai University, China in 2014. He is now a distinguished researcher and doctoral supervisor at Shanghai University. His research interests include robotics control, machine vision and the industrial applications of automation equipment.
