
Transportation Infrastructure Geotechnology

https://doi.org/10.1007/s40515-024-00436-0

TECHNICAL PAPER

A Comparative Study of Soft Computing Paradigms for Modelling Soil Compaction Parameters

Lal Babu Tiwari1 · Avijit Burman1 · Pijush Samui1

Accepted: 2 July 2024


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024

Abstract
Optimum water content (OWC) and maximum dry density (MDD) are crucial compaction
parameters of soils in geotechnical and geological engineering. However, determining
these parameters through laboratory tests is time-consuming. Therefore, this study
aims to estimate the OWC and MDD of soils using a widely employed soft computing
paradigm called artificial neural network (ANN). To achieve this, a comprehensive
database of soil compaction records was collected. The performance of the employed
ANN was compared with four additional soft computing paradigms, namely extreme
learning machine, support vector regressor, k-nearest neighbour regressor and group
method of data handling. Experimental results indicate that the ANN model
successfully estimates the OWC (training RMSE = 0.0400 and testing RMSE = 0.0530)
and MDD (training RMSE = 0.0421 and testing RMSE = 0.0522). The k-nearest neighbour
regressor and the group method of data handling were found to be less effective than
the other models, with training RMSE = 0.1187 and testing RMSE = 0.0834 for OWC and
training RMSE = 0.1214 and testing RMSE = 0.1366 for MDD predictions, respectively.
Overall, the employed ANN was found to be the most suitable alternative for
estimating the soil compaction parameters and can be used in civil engineering
projects to assess the soil compaction status during the course of construction works.

Keywords Soil compaction · Soft computing · Artificial neural network · Extreme learning machine · Machine learning

* Lal Babu Tiwari


lalbabut.ce@nitp.ac.in
Avijit Burman
avijit@nitp.ac.in
Pijush Samui
pijush@nitp.ac.in
1 Department of Civil Engineering, National Institute of Technology, Patna, Patna 800005, India


1 Introduction

Engineering structures often encounter soil masses that lack the necessary prop-
erties for construction purposes (Kurnaz and Kaya 2020). In such instances, it
becomes necessary to enhance the geotechnical properties of soils in order to
meet the requirements (Nagaraj et al. 2015; Najjar et al. 1996; Omar et al. 2003;
Yousif and Mohamed 2022). The choice of soil improvement techniques to
employ depends on various factors, including the soil type at the site, soil condi-
tions and economic considerations. The ultimate objective of all these improve-
ment methods is to enhance soil density and strength while minimising permea-
bility and settlement. Among these techniques, the compaction method is utilised
to increase soil density and bearing capacity, while simultaneously reducing
permeability.
Compaction involves increasing the dry density of soil through the application of
energy while altering its water content (Ardakani and Kordnaeij 2019; Bardhan and
Asteris 2023; Günaydın 2009). During compaction, the air volume decreases, while the
water and solid components do not compress, resulting in closer grain packing. When
water is added to the soil and the soil is compacted, it attains a specific dry unit
weight. If more water is added to the same soil and it is compacted using the same
energy, the dry unit weight gradually increases. As the water content increases, the
dry unit weight reaches its maximum value, known as the maximum dry density (MDD).
Beyond this limit, the dry unit weight decreases with additional water. The water
content at which the maximum dry density is achieved is referred to as the optimum
water content (OWC). The OWC and
MDD are crucial parameters that signify the compaction behaviour of soils. They
are determined through laboratory tests such as light compaction and heavy com-
paction tests. These parameters play a significant role in compacted fillings nec-
essary for engineering structures like highways, railways and earth dams (Kurnaz
and Kaya 2020).
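As a rough illustration of how the MDD and OWC are read off a compaction curve, the sketch below fits a parabola through the highest measured dry density and its two neighbours and reports the vertex. The function name, the three-point fit and the sample readings are assumptions made for illustration; they are not taken from the study or from a test standard.

```python
def compaction_peak(water_contents, dry_densities):
    """Estimate (OWC, MDD) from compaction-test points via a three-point parabola.

    Illustrative only: points must be sorted by water content and the
    largest dry density must not be the first or last point.
    """
    k = max(range(len(dry_densities)), key=dry_densities.__getitem__)
    if k == 0 or k == len(dry_densities) - 1:
        raise ValueError("peak is not bracketed by the measured points")
    x0, x1, x2 = water_contents[k - 1:k + 2]
    y0, y1, y2 = dry_densities[k - 1:k + 2]
    # Coefficients of y = a*x^2 + b*x + c through the three points.
    d = (x0 - x1) * (x0 - x2) * (x1 - x2)
    a = (x2 * (y1 - y0) + x1 * (y0 - y2) + x0 * (y2 - y1)) / d
    b = (x2**2 * (y0 - y1) + x1**2 * (y2 - y0) + x0**2 * (y1 - y2)) / d
    c = (x1 * x2 * (x1 - x2) * y0 + x2 * x0 * (x2 - x0) * y1
         + x0 * x1 * (x0 - x1) * y2) / d
    owc = -b / (2 * a)               # water content at the vertex
    mdd = a * owc**2 + b * owc + c   # dry density at the vertex
    return owc, mdd


# Hypothetical readings: water content (%) and dry density (g/cm3).
w = [12.0, 14.0, 16.0, 18.0]
rho_d = [1.782, 1.798, 1.798, 1.782]
owc, mdd = compaction_peak(w, rho_d)
```

With these made-up readings the vertex falls midway between the two highest points, which is why a fitted curve rather than the largest single reading is used here.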
The conventional laboratory tests for determining the OWC and MDD of soils
are known to be time-consuming, labour-intensive and demanding. Consequently,
numerous scientists and researchers have endeavoured to establish empirical cor-
relations based on regression analysis to estimate the compaction parameters
using readily available index properties of soils (Blotz et al. 1998; Di Matteo
et al. 2009; Gurtug and Sridharan 2004; Khuntia et al. 2015; Omar et al. 2003;
Sridharan and Nagaraj 2005). The influence of index properties on soil compac-
tion has been recognised for a considerable period. Main factors include grain
size distribution for coarse-grained soils, as well as consistency limits for fine-
grained soils. Moreover, the tests required to determine these index properties
generally involve relatively easy and cost-effective procedures when compared to
the compaction tests themselves.
Correlations proposed to establish connections between physical characteristics and
compaction parameters are typically derived through multiple linear regression
analysis (Sinha and Wang 2008; Wang and Yin 2020). However, a crucial issue with
these correlations is that they are often developed for specific localities or soils
of the same geological origin. Using these correlations for areas beyond the original
locality can lead to substantial disparities between the expected and computed
compaction parameters. Therefore, it is essential to exercise caution when utilising
compaction parameters determined through empirical correlations, particularly when
they are applied to regions outside the scope of the correlations' development
(Bardhan and Asteris 2023; Sinha and Wang 2008; Alkayem et al. 2023; Asteris et al.
2022).
In recent times, geotechnical engineering practice has increasingly incorporated
soft computing methods. Machine learning approaches have recently been used to
predict the OWC and MDD of soils in order to address the issue with a larger database
and greater accuracy. Sinha and Wang (2008) used 55 soil samples for estimating soil
compaction characteristics using artificial neural networks (ANNs). The compaction
parameters of 212 samples were estimated by Ardakani and Kordnaeij (2019) using the
group method of data handling (GMDH). In order to estimate the OWC and MDD of soils,
Kurnaz and Kaya (2020) used GMDH, support vector machine (SVM), extreme learning
machine (ELM) and Bayesian regularisation neural network based on the results of 451
experiments utilising the index properties and traditional Proctor tests. Recently,
Bardhan and Asteris (2023) and Bardhan et al. (2023b) estimated the OWC and MDD of
soils using hybrid ANN and ANFIS techniques, respectively, and found satisfactory
results.
In this study, a widely used soft computing technique, called ANN, was used for
the estimation of OWC and MDD of soils. The performance of the employed ANN
was compared with four additional soft computing paradigms namely ELM, support
vector regressor (SVR), k-nearest neighbour regressor (KNR) and group method of
data handling (GMDH). The index properties of soil samples were utilised as input
parameters in estimating the compaction parameters for all the models.

2 Research Significance

Over the past decades, many soft computing algorithms have been developed and
published to forecast the behaviour of intricate phenomena. These phenomena are
highly non-linear, which makes it impractical to rely on deterministic methodologies.
Artificial intelligence and machine learning play a prominent role among these
methods. Although initially used in medicine (Rosenblatt 1958), these approaches have
been implemented primarily in the sciences and engineering (Zhang et al. 2021; Phoon
and Wang 2023). Contemporary intelligence approaches are commonly employed in
geotechnical and geological engineering, for example in slope stability analysis
(Asteris et al. 2022), reliability analysis (Bardhan 2024) and estimation of
geotechnical parameters (Bardhan and Asteris 2023; Bardhan et al. 2023b), as
documented in the literature (Zhang et al. 2022a, 2022b). Against this background,
this study employs a widely used machine learning technique, called ANN, for the
estimation of OWC and MDD of soils. The performance of the employed ANN was compared
with four additional soft computing paradigms, namely ELM, SVR, KNR and GMDH.


3 Methodology

Theoretical details of the employed models are presented in this section. Note that
the detailed mathematical background of ANN and ELM is not presented in this study
because it is already established in the literature; the works of Bardhan and Asteris
(2023), Ghani et al. (2021), Bardhan et al. (2023a), Kurnaz and Kaya (2020) and Huang
et al. (2006) can be referred to for more details.

3.1 Artificial Neural Network

ANN is composed of three major sets of neuron layers (see Fig. 1a). The first and
last layers are known as the input and output layers, which have the same number of
neurons as the problem's input and output variables, respectively (Bardhan and
Asteris 2023). Between these two layers there is at least one hidden layer. The
signal enters through the input layer, the hidden layers act as the network's
computational engine, and the prediction is made in the output layer in accordance
with the input variables. Weights and biases are the two basic parameters of the ANN.
Weights (w) represent the interconnections between the neurons of adjacent layers,
and biases (b) determine the network's degree of freedom (DOF). Each node, except for
the input nodes, applies an activation function, i.e. the transfer function, to the
weighted sum of its inputs to evaluate its output. The most prominent activation
functions are as follows:
Sigmoid function: $$f(z) = \frac{1}{1 + e^{-z}} \tag{1}$$

Linear function: $$f(z) = z \tag{2}$$

Hyperbolic tangent function: $$f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{3}$$
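The three activation functions in Eqs. (1) to (3) can be written directly in code. This is a minimal sketch using only the standard library; the function names are chosen here for illustration.

```python
import math


def sigmoid(z):
    """Eq. (1): squashes any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(z):
    """Eq. (2): passes the input through unchanged."""
    return z


def tanh_activation(z):
    """Eq. (3): hyperbolic tangent, with outputs in (-1, 1)."""
    ez, enz = math.exp(z), math.exp(-z)
    return (ez - enz) / (ez + enz)
```

For large |z| the explicit exponentials in `tanh_activation` can overflow, so `math.tanh` is the safer choice in practice; the explicit form is kept here only to mirror Eq. (3).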
Fig. 1 Basic architecture of (a) ANN and (b) ELM

In ANN, the output of a node is then used as input for the subsequent node and so
forth until the network output is obtained. The backpropagation algorithm is used to
calculate the error by comparing the actual and predicted outcomes (the problem
target and network outcome, respectively). Then, the error is propagated back one
layer at a time across the ANN structure, and each weight is modified in accordance
with how much it has contributed to the error.
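The weight-update idea can be illustrated with the simplest possible case: a single linear neuron trained by gradient descent on a squared error (the delta rule). This is a sketch under that assumption, not the training routine used in the study; the function name, learning rate and sample values are hypothetical.

```python
def sgd_step(weights, bias, x, target, learning_rate=0.1):
    """One gradient-descent update for a linear neuron.

    The loss is 0.5 * (pred - target)^2, so the gradient with respect to
    each weight is error * input, and with respect to the bias is error.
    """
    pred = bias + sum(w * xi for w, xi in zip(weights, x))
    error = pred - target
    new_weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias, error


# Hypothetical sample: two inputs and one target value.
weights, bias = [0.0, 0.0], 0.0
x, target = [1.0, 2.0], 1.0
weights, bias, err_before = sgd_step(weights, bias, x, target)
_, _, err_after = sgd_step(weights, bias, x, target)
```

Each weight moves in proportion to its own input, so the weight attached to the larger input receives the larger correction. In a multi-layer network the same rule is applied layer by layer using the chain rule.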

3.2 Extreme Learning Machine

Instead of the iterative solution of a conventional feed-forward ANN, an ELM (Huang
et al. 2006) draws the parameters of its hidden neurons from a continuous probability
distribution. The basic architecture of ELM is illustrated in Fig. 1b. The most
prominent benefits of an ELM are its lower design complexity and its capability of
solving classification and regression problems in a shorter time, because the biases
and weights of the hidden neurons are randomised and the least-squares solution for
the output weights is obtained using the Moore–Penrose generalised inverse. Thus, it
is not required to employ iterative training techniques (unlike ANN), which tend to
converge towards a local minimum instead of the global minimum. The present work
employed the ELM approach for training on predictor/target data pairs. Let xi
represent the predictors and yi denote the targets. For N training samples
(i = 1, 2, …, N) with d-dimensional predictor vectors, the single-layer feed-forward
network (SLFN) containing L hidden neurons can be represented in mathematical form as:
$$f_L(x) = \sum_{i=1}^{L} h_i(x)\,\beta_i = h(x)\,\beta \tag{4}$$

where $\beta = [\beta_1, \beta_2, \ldots, \beta_L]^T$ denotes the matrix of output weights between the
hidden and output neurons, $h(x) = [h_1(x), h_2(x), \ldots, h_L(x)]$ represents the hidden neuron
outputs for predictor $x$, and $h_i(x)$ is the output of hidden neuron $i$, given by:

$$h_i(x) = G(a_i, b_i, x), \quad a_i \in R^d, \; b_i \in R \tag{5}$$

where $G(a_i, b_i, x)$ is a non-linear piecewise continuous function of the hidden neuron
parameters (a, b) that should satisfy the ELM approximation theorem. The present work
employed the log-sigmoid function, which is widely used in neural network modelling,
to develop the ELM as

$$G(a, b, x) = \frac{1}{1 + \exp(-(a \cdot x + b))} \tag{6}$$

It is essential to minimise the approximation error by least-squares fitting when
solving for the weights that connect the hidden and output layers:

$$\min_{\beta \in R^{L \times m}} \lVert H\beta - T \rVert^2 \tag{7}$$

where $\lVert \cdot \rVert$ is the Frobenius norm, and H and T represent the hidden-layer output
matrix and the target matrix, respectively, given by:

$$H = \begin{bmatrix} h(x_1) \\ \vdots \\ h(x_N) \end{bmatrix} = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_L, b_L, x_1) \\ \vdots & \ddots & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_L, b_L, x_N) \end{bmatrix} \tag{8}$$

$$T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix} = \begin{bmatrix} t_{11} & \cdots & t_{1m} \\ \vdots & \ddots & \vdots \\ t_{N1} & \cdots & t_{Nm} \end{bmatrix} \tag{9}$$

Solving this linear system yields the optimal solution $\beta^* = H^{+} T$, where $H^{+}$ is the
Moore–Penrose generalised inverse of H. The output for a given input vector x is then
predicted using $\beta^*$ in Eq. (4).
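The ELM training procedure can be sketched in a few lines: draw random hidden-layer parameters, build the hidden-layer output matrix H, and solve a linear least-squares problem for the output weights. To stay within the standard library, the sketch below replaces the Moore–Penrose inverse with ridge-regularised normal equations, which approach the same solution as the ridge term shrinks. The data, the sizes, the seed and all function names are assumptions for illustration, and a single output (m = 1) is assumed.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_outputs(X, A, b):
    """Hidden-layer output matrix H (N x L): H[i][j] = sigmoid(a_j . x_i + b_j)."""
    return [[sigmoid(sum(a_k * x_k for a_k, x_k in zip(a_j, x)) + b_j)
             for a_j, b_j in zip(A, b)] for x in X]


def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def elm_output_weights(H, t, ridge=1e-3):
    """beta = (H^T H + ridge * I)^-1 H^T t, a regularised stand-in for H^+ t."""
    L = len(H[0])
    HtH = [[sum(row[p] * row[q] for row in H) + (ridge if p == q else 0.0)
            for q in range(L)] for p in range(L)]
    Htt = [sum(row[p] * ti for row, ti in zip(H, t)) for p in range(L)]
    return solve_linear(HtH, Htt)


rng = random.Random(0)
d, L, N = 2, 4, 30
A = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(L)]  # random input weights
b = [rng.uniform(-1, 1) for _ in range(L)]                      # random hidden biases
X = [[rng.random() for _ in range(d)] for _ in range(N)]        # synthetic inputs in [0, 1]
t = [0.3 * x[0] + 0.7 * x[1] ** 2 for x in X]                   # synthetic target
H = hidden_outputs(X, A, b)
beta = elm_output_weights(H, t)
predictions = [sum(h * w for h, w in zip(row, beta)) for row in H]
```

The hidden parameters A and b are never updated after they are drawn; only the output weights are fitted, which is what removes the need for iterative training.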

3.3 Support Vector Regressor

SVR is a technique for regression problems that is derived from the ideas of SVM.
Unlike standard regression models, SVR seeks the function that fits the data within a
predetermined tolerance margin, referred to as the epsilon-insensitive zone, rather
than only minimising the prediction error. Errors that fall inside this zone are
disregarded, which directs attention towards more substantial deviations. SVR
operates by mapping the input space to a feature space of higher dimensions using
kernel functions such as linear, polynomial or radial basis functions. Within this
feature space, SVR identifies the hyperplane that best fits the data, balancing the
complexity of the model against the accuracy of predictions. SVR achieves stable
performance with high-dimensional data because the fitted function depends only on a
subset of the training data called support vectors. SVR is extensively utilised in
domains such as financial forecasting and environmental modelling, and in any field
that requires accurate prediction with controlled adaptability.
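Two ingredients of SVR mentioned above are easy to show in isolation: the epsilon-insensitive loss, which ignores errors inside the tolerance zone, and the radial basis function kernel. The sketch below is not a full SVR solver; the function names and the values of epsilon and gamma are illustrative assumptions.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon zone, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


# The first prediction lies inside the zone and contributes no loss.
losses = epsilon_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.30, 0.60])
```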

3.4 k‑Nearest Neighbours Regressor

KNR is a non-parametric algorithm that predicts continuous outcomes in a
straightforward manner. In contrast to k-nearest neighbours classification, where a
class label is determined by the majority vote of neighbouring data points, KNR
estimates the value of a target variable by averaging the values of the k nearest
training data points in the feature space. Proximity between data points is commonly
evaluated using distance metrics such as Euclidean, Manhattan or Minkowski distances.
The selection of k, the number of neighbours, is essential as it governs the model's
balance between bias and variance: a smaller k can result in overfitting, while a
larger k can result in underfitting. KNR is advantageous because of its simplicity,
as it does not require any assumptions about the distribution of the data. It
performs well on non-linear data patterns and is extensively utilised in various
domains, including recommendation systems, stock price forecasting and other
situations where localised prediction is of great importance.
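The averaging rule described above fits in a few lines. This is a minimal brute-force sketch assuming Euclidean distance and unweighted averaging; the function name and the toy data are hypothetical.

```python
import math


def knn_regress(train_X, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Toy one-dimensional data with one distant point.
train_X = [[0.0], [1.0], [2.0], [10.0]]
train_y = [0.0, 1.0, 2.0, 10.0]
pred_k3 = knn_regress(train_X, train_y, [1.1], k=3)
pred_k4 = knn_regress(train_X, train_y, [1.1], k=4)
```

Raising k from 3 to 4 pulls the distant point into the average, which illustrates how a larger k smooths the prediction at the cost of local detail.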


3.5 Group Method of Data Handling

GMDH is an inductive learning technique employed to model intricate systems. It
autonomously constructs and selects models by iteratively generating and evaluating
hypotheses using the available data. GMDH generates polynomial models of increasing
complexity and assesses them based on criteria such as accuracy and generalisation
capability. By utilising a self-organising methodology, it integrates components of
neural networks with evolutionary algorithms to discover the most advantageous model
structure. This approach is especially efficient for tasks that require precise
forecasts from datasets that are either noisy or limited in size. The versatility of
GMDH enables it to manage non-linear relationships and interactions among variables,
making it well suited to a wide range of applications including time series
forecasting, pattern recognition and system identification. GMDH is a valuable tool
in various sectors, such as economics and engineering, due to its capability to
uncover the underlying data structure without the need for substantial prior
information.

4 Data Description

An extensive record of soil compaction parameters from a published study by Wang and
Lin (2020) was used to develop prediction models of the OWC and MDD. Several details
are available in the gathered dataset, viz., contents of gravel (G in %), sand (S in
%) and fines (F in %), liquid limit (LL in %), plastic limit (PL in %) and compaction
energy (CE in kJ/m3). These six variables were used for the estimation of the OWC and
MDD of soils. Table 1 shows the descriptive statistics of the soil parameters in the
employed database, and Table 2 lists the minimum and maximum values of OWC and MDD
for different soil types. Figure 2 shows the density histograms for each variable;
note that normalised values were used in Fig. 2. Moreover, the ridgeline chart and
correlation matrix for the employed dataset are presented in Figs. 3 and 4,
respectively. From Fig. 4, the degree of correlation between input soil
characteristics and OWC and MDD can be determined.

Table 1  Descriptive details of the dataset

Parameter             G (%)   S (%)   F (%)     LL (%)  PL (%)  CE (kJ/m3)  OWC (%)  MDD (g/cm3)
Minimum                0.00    0.00    8.60      16.00    6.10      155.00     5.30         1.09
Mean                   7.47   29.45   63.09     108.73   22.00      894.07    17.51         1.75
Maximum               67.10   89.00  100.00     608.00   48.30     2755.00    43.70         2.33
Standard error         0.97    1.56    2.00      10.92    0.49       48.92     0.40         0.01
Standard deviation    14.57   23.39   30.02     164.22    7.40      735.44     5.96         0.20
Variance             212.16  547.01  900.95  26,967.70   54.83  540,865.23    35.57         0.04
Kurtosis               3.48   -0.47   -1.25       3.26    0.75        2.27     2.86         0.98
Skewness               2.09    0.62   -0.38       2.20    0.58        2.00     1.04        -0.11


Table 2  Soil type wise OWC and MDD details

Soil type  OWC min (%)  OWC max (%)  MDD min (g/cm3)  MDD max (g/cm3)
CH               12.10        43.70             1.09             1.87
CL               10.20        22.00             1.62             2.04
CL-ML            17.00        17.00             1.78             1.78
GC                5.90        19.40             1.67             2.33
GM               13.90        13.90             1.79             1.79
GP-GC             6.80         9.20             2.06             2.20
GW-GC             5.30         7.50             2.16             2.31
MH               19.40        31.00             1.40             1.64
ML               13.60        22.00             1.55             1.85
SC                9.00        20.40             1.59             2.09
SM                9.00        13.20             1.90             2.04
SP-SC             8.80        14.50             1.83             2.06
SW-SC             7.30         9.80             2.01             2.14

Fig. 2  Density histograms of the normalised variables: (a) G, (b) S, (c) F, (d) LL, (e) PL, (f) CE, (g) OWC and (h) MDD (x-axis: parameter value; y-axis: density)

According to Fig. 4, the contents of G and S and the CE are negatively correlated
with the OWC, whereas the F content, LL and PL are positively correlated with it.
Conversely, the F content, LL and PL are negatively correlated with the MDD, whereas
G, S and the CE are positively correlated with it. Overall, the linear association of
individual soil parameters with the OWC and MDD is weak to moderate: the absolute
correlation coefficients range from 0.16 (CE) to 0.69 (F and PL) for OWC and from
0.18 (CE) to 0.74 (PL) for MDD. Following data collection, the entire dataset was
normalised between 0 and 1 and then partitioned. The entire dataset of 226
experimental records was divided into two subsets: (a) a training (TR) subset
consisting of 80% (181 samples) of the original dataset and (b) a testing (TS) subset
consisting of the remaining 45 samples.

Fig. 3  Ridgeline chart of the variables (G, S, F, LL, PL, CE, OWC and MDD; x-axis: range)

Fig. 4  Correlation matrix between input variables and (a) OWC and (b) MDD

Variable       G      S      F     LL     PL     CE
S           0.21
F          -0.65  -0.88
LL         -0.22   0.05  0.067
PL         -0.28  -0.11   0.23   0.62
CE          0.09   0.12  -0.13   0.36  0.092
OWC        -0.51  -0.57   0.69   0.41   0.69  -0.16
MDD         0.55   0.52  -0.67  -0.49  -0.74   0.18
Although there is no pre-defined criterion for the number of samples to employ in a
data-driven model, the choice is mostly influenced by the nature of the task. In
general, a model constructed from a large sample is regarded as superior to one
constructed from a small number of observed data points. Given this, 80% of the
dataset was used for training and the remaining 20% was chosen as the testing
dataset. The computational modelling steps for estimating the OWC and MDD of soils
are as follows: (a) selection of dataset; (b) data normalisation between 0 and 1; (c)
data partitioning and selection of TR and TS subsets; (d) computational modelling
using ANN, ELM and the other employed models and (e) prediction of TR and TS subsets.
Figure 5 displays the computational modelling process as a flow chart.
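The normalisation and partitioning steps could look as follows. The study does not state how the records were assigned to the TR and TS subsets, so the seeded random shuffle here is an assumption, as are the function names.

```python
import random


def min_max_normalise(column):
    """Scale a list of values to [0, 1]; assumes the column is not constant."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


# 226 records with an 80/20 split, as reported for the dataset.
train_idx, test_idx = train_test_split_indices(226)

# Minimum, mean and maximum compaction energy from Table 1 as a small example column.
ce_scaled = min_max_normalise([155.0, 894.07, 2755.0])
```

Rounding 0.8 × 226 gives the 181 training records and 45 testing records reported above.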

5 Results and Discussion

The results of the models employed to estimate the OWC and MDD of soils are detailed
in this section. As stated above, the main dataset was divided into training (181
samples) and testing (45 samples) subsets before the models were constructed. It
should be noted that all models were built and tested using the same training (TR)
and testing (TS) subsets. The results of the models were then evaluated using five
performance metrics, namely determination coefficient (R2), variance account factor
(VAF), root mean square error (RMSE), mean absolute error (MAE) and weighted mean
absolute percentage error (WMAPE). Note that these indices are widely used to assess
a model's performance from different perspectives (Bardhan et al. 2023a; Kaloop et
al. 2022; Kardani et al. 2022; Kumar et al. 2023; Liu et al. 2023; Benzaamia et al.
2024; Asteris et al. 2024; Huat et al. 2024; He et al. 2024). The mathematical
expressions of these indices are given by:
$$R^2 = \frac{\sum_{i=1}^{n} (y_i - y_{mean})^2 - \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - y_{mean})^2} \tag{10}$$

$$VAF\,(\%) = \left(1 - \frac{\mathrm{var}(y_i - \hat{y}_i)}{\mathrm{var}(y_i)}\right) \times 100 \tag{11}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{12}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{13}$$

$$WMAPE = \frac{\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times y_i}{\sum_{i=1}^{n} y_i} \tag{14}$$

where $y_i$ and $\hat{y}_i$ are the actual and estimated values, n is the number of samples and
$y_{mean}$ is the average of the actual values. For an ideal model, the values of R2, VAF,
RMSE, MAE and WMAPE are 1, 100, 0, 0 and 0, respectively (Bardhan and Asteris 2023;
Ghani et al. 2021; Khan et al. 2022; Salami et al. 2022; Topal et al. 2022; Wu et al.
2023; Wang et al. 2020; Liu et al. 2022).

Fig. 5  Illustration of computational modelling
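The five indices in Eqs. (10) to (14) can be computed with a few helper functions. This is a standard-library sketch; the function names and the sample values are chosen here for illustration, and the WMAPE helper assumes positive observations.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _variance(values):
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def r2(y, y_hat):
    """Eq. (10): (total sum of squares - residual sum of squares) / total."""
    y_mean = _mean(y)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    return (ss_tot - ss_res) / ss_tot


def vaf(y, y_hat):
    """Eq. (11): variance accounted for, in percent."""
    residuals = [yi - pi for yi, pi in zip(y, y_hat)]
    return (1.0 - _variance(residuals) / _variance(y)) * 100.0


def rmse(y, y_hat):
    """Eq. (12): root mean square error."""
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat)) / len(y))


def mae(y, y_hat):
    """Eq. (13): mean absolute error."""
    return sum(abs(pi - yi) for yi, pi in zip(y, y_hat)) / len(y)


def wmape(y, y_hat):
    """Eq. (14) for positive observations.

    For positive y_i, |(y_i - p_i) / y_i| * y_i equals |y_i - p_i|,
    which also avoids dividing by a zero observation.
    """
    return sum(abs(yi - pi) for yi, pi in zip(y, y_hat)) / sum(y)


# Hypothetical actual and estimated values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
scores = {name: fn(y_true, y_pred) for name, fn in
          [("R2", r2), ("VAF", vaf), ("RMSE", rmse), ("MAE", mae), ("WMAPE", wmape)]}
```

Because the data are normalised to [0, 1], an observation can be exactly zero, which is why the simplified form of Eq. (14) is used in `wmape`.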
It is important to note that the construction of an optimum model requires
appropriate tuning of hyper-parameters. In ANN, the hyper-parameters are the number
of hidden neurons (NH) and the number of hidden layers (NL), whereas in ELM there is
only one hyper-parameter, i.e. NH, since NL = 1 in ELM. In this study, NH values
ranging from 2 to 15 were investigated in order to select the optimum value of NH for
ANN and ELM. Using a trial-and-error approach, the optimum value of NH was found to
be 8 for both ANN and ELM, with NL = 1 in both cases. During ANN modelling, the
activation functions used in the hidden and output layers were tansig and purelin,
respectively, whereas the sigmoid activation function was used during ELM modelling.
The final architecture of ANN and ELM consists of six input neurons, NH = 8, NL = 1
and one output neuron. For SVR, the radial basis function was used as the kernel
function. During KNR modelling, the values of leaf_size and n_neighbors were varied
in the range of 5 to 50 and 1 to 10, respectively, and the most suitable values were
determined to be 35 and 6, respectively. The most suitable structure of GMDH consists
of 6 hidden layers with 10 neurons in each layer. The best performance was achieved
when the number of hidden layers was set to 4.
Tables 3 and 4 present the prediction results of the employed models for soil OWC and
MDD, respectively. The models' outcomes for the training (TR), testing (TS) and total
datasets are provided. It should be noted that the performance of each model on the
training subset was used to define the goodness of fit of the generated models,
whilst the performance on the testing subset was used to assess their generalisation
capabilities. Based on the experimental findings, the employed ANN model has the
greatest R2 and lowest RMSE values in both OWC and MDD prediction.
In the training phase, the employed ANN model had the maximum accuracy, with
R2 = 0.9350, VAF = 93.4980, RMSE = 0.0400 and MAE = 0.0302 for OWC and R2 = 0.9341,
VAF = 93.3524, RMSE = 0.0421 and MAE = 0.0292 for MDD predictions. During the testing
phase, these metrics were R2 = 0.9050, VAF = 87.0247, RMSE = 0.0530 and MAE = 0.0418
for OWC and R2 = 0.9131, VAF = 88.8297, RMSE = 0.0522 and MAE = 0.0384 for MDD
predictions. For the employed ELM model, these metrics were R2 = 0.8472,
VAF = 84.7219, RMSE = 0.0613 and MAE = 0.0463 for OWC and R2 = 0.8589, VAF = 85.8912,
RMSE = 0.0608 and MAE = 0.0483 for MDD predictions in the training phase. During the
testing phase, these indices were R2 = 0.8785, VAF = 83.9878, RMSE = 0.0599 and
MAE = 0.0470 for OWC and R2 = 0.8793, VAF = 86.8381, RMSE = 0.0555 and MAE = 0.0443
for MDD predictions. Overall, the employed ANN and ELM models predict soil OWC with
92.49% (R2 = 0.9249) and 84.69% (R2 = 0.8469) and MDD with 92.77% (R2 = 0.9277) and
86.10% (R2 = 0.8610) accuracies, respectively. These findings indicate that the
employed ANN model has excellent predictive performance in both cases. The
performance of SVR, KNR and GMDH models can be seen in Tables 3 and 4.

Table 3  Performance indices for OWC prediction

Phase     Model  R2      VAF (%)  RMSE    MAE     WMAPE
Training  ANN    0.9350  93.4980  0.0400  0.0302  0.0948
Training  ELM    0.8472  84.7219  0.0613  0.0463  0.1451
Training  SVR    0.7847  78.3070  0.0814  0.0632  0.1977
Training  KNR    0.7557  75.3429  0.1187  0.0954  0.2987
Training  GMDH   0.9014  89.8857  0.0719  0.0598  0.1867
Testing   ANN    0.9050  87.0247  0.0530  0.0418  0.1318
Testing   ELM    0.8785  83.9878  0.0599  0.0470  0.1481
Testing   SVR    0.8890  83.1367  0.0798  0.0666  0.2098
Testing   KNR    0.8291  78.0284  0.0834  0.0658  0.2074
Testing   GMDH   0.8965  86.0160  0.0757  0.0632  0.1992
Total     ANN    0.9249  92.3343  0.0429  0.0326  0.1022
Total     ELM    0.8469  84.5081  0.0610  0.0464  0.1457
Total     SVR    0.7973  79.0016  0.0811  0.0638  0.2001
Total     KNR    0.6744  63.4560  0.1126  0.0895  0.2805
Total     GMDH   0.8980  89.1910  0.0727  0.0605  0.1892

Table 4  Performance indices for MDD prediction

Phase     Model  R2      VAF (%)  RMSE    MAE     WMAPE
Training  ANN    0.9341  93.3524  0.0421  0.0292  0.0545
Training  ELM    0.8589  85.8912  0.0608  0.0483  0.0896
Training  SVR    0.8692  86.3074  0.0812  0.0643  0.1207
Training  KNR    0.8210  81.9376  0.0843  0.0705  0.1320
Training  GMDH   0.8554  84.3084  0.1214  0.1072  0.2013
Testing   ANN    0.9131  88.8297  0.0522  0.0384  0.0717
Testing   ELM    0.8793  86.8381  0.0555  0.0443  0.0827
Testing   SVR    0.8904  86.7884  0.0891  0.0762  0.1422
Testing   KNR    0.8824  86.4311  0.0759  0.0618  0.1154
Testing   GMDH   0.8523  82.0889  0.1366  0.1211  0.2260
Total     ANN    0.9277  92.5144  0.0443  0.0310  0.0579
Total     ELM    0.8610  86.0610  0.0598  0.0475  0.0882
Total     SVR    0.8713  86.2622  0.0829  0.0666  0.1250
Total     KNR    0.8310  82.7531  0.0827  0.0688  0.1286
Total     GMDH   0.8528  83.7240  0.1246  0.1100  0.2062

According to the results presented in the tables, the employed ANN achieved the best
estimation of OWC and MDD. Therefore, the expression of the developed ANN is
presented in Eq. (15).

$$y_i = f\left(b + \sum_{i=1}^{n} x_i \, w_i\right) \tag{15}$$

where b is the bias term; xi is the input vector; wi is the weight vector connecting
the ith hidden neuron and the input neurons and n is the number of datasets under
consideration. The details of weights and biases are presented in Tables 5 and 6 for
OWC and MDD estimations, respectively.
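A forward pass through a network of this shape (one hidden layer with a hyperbolic tangent activation and a linear output neuron) can be sketched as below. The sketch assumes that tansig corresponds to the hyperbolic tangent and purelin to the identity, and that each row of the input-to-hidden weight matrix belongs to one input variable. That layout is an assumption about how the tabulated weights are arranged, and the small weights used here are hypothetical rather than the published ones.

```python
import math


def ann_predict(x, w_in, b_hidden, w_out, b_out):
    """Single-hidden-layer forward pass: tanh hidden units, linear output.

    w_in[i][j] is assumed to be the weight from input i to hidden neuron j.
    """
    hidden = [
        math.tanh(b_j + sum(x[i] * w_in[i][j] for i in range(len(x))))
        for j, b_j in enumerate(b_hidden)
    ]
    return b_out + sum(h * w for h, w in zip(hidden, w_out))


# Hypothetical network with two inputs and two hidden neurons.
w_in = [[0.5, -0.5],
        [0.25, 0.75]]
b_hidden = [0.0, 0.0]
w_out = [1.0, -1.0]
b_out = 0.1

y_zero = ann_predict([0.0, 0.0], w_in, b_hidden, w_out, b_out)
y_one = ann_predict([1.0, 0.0], w_in, b_hidden, w_out, b_out)
```

Under the same layout assumption, a 6 × 8 weight matrix, 8 hidden biases, 8 output weights and one output bias could be passed to `ann_predict` in place of the toy values, with the six normalised inputs supplied in the order G, S, F, LL, PL and CE. The result would be a normalised value that still has to be scaled back to physical units.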
It is critical to emphasise that a data-driven model is incomplete without a
visual representation of the outcomes. Visualisations make it easier to find and
understand patterns, correlations and outliers in a dataset. Visual representations
can help in investigating trends in data without having to go through intricate
details. Thus, graphical representations of the results of the employed models
are presented in the form of scatter plots, error histogram with kernel smooth
and Taylor diagram. These diagrams are quite useful for thoroughly examining
a model’s overall soundness. To better demonstrate, illustrations of actual and
estimated values of OWC and MDD are shown in Figs. 6 and 7 for the TR and

Table 5  Details of weights and biases of the ANN model (for OWC prediction)

Input to hidden weights


− 0.2851 − 1.8501 1.6729 − 0.3194 − 0.0563 1.6499 0.3605 0.4408
0.7096 − 0.3957 − 1.2495 − 0.4542 1.1879 − 0.6058 − 1.7517 1.0711
0.2163 0.6399 1.5018 − 0.9966 0.6795 − 2.6815 0.9233 0.6727
− 0.1058 − 0.3427 0.7542 − 0.0961 − 0.2581 − 0.3605 0.9262 − 0.5295
1.4579 − 0.3634 1.2445 0.2644 2.3073 − 0.0447 − 0.5627 − 0.6741
1.5235 0.1336 − 0.0825 0.2749 0.5472 0.5907 1.1809 − 1.1262
Hidden layer biases
− 2.4278 1.7801 − 2.6136 − 0.8976 0.5322 0.3803 − 2.1043 1.9041
Hidden to output layer weights
1.0871 0.1893 1.1713 − 2.3532 0.5656 0.3024 0.2595 0.7555
Output bias
− 0.0662

13
Transportation Infrastructure Geotechnology

Table 6  Details of weights and biases of the ANN model (for MDD prediction)

Input to hidden weights


− 0.7434 − 1.2766 0.7416 0.6738 − 1.1615 − 0.4158 − 0.1298 − 0.3006
− 0.7950 − 0.0195 0.2576 − 0.8791 − 0.1081 1.3879 − 0.9192 0.4867
− 0.0001 − 1.3136 0.0106 0.9037 0.4408 0.2675 − 1.0693 1.1008
− 1.2892 − 0.1583 − 2.2788 − 1.3369 0.9586 − 0.3934 0.2793 0.4822
− 1.5619 0.3298 − 0.5097 − 0.3697 1.0157 0.7176 − 0.3532 0.8924
0.5833 − 1.1278 − 0.0027 0.2840 1.9951 − 0.2235 − 0.1722 − 0.1914
Hidden layer biases
1.5169 1.3590 − 2.4753 − 0.0189 − 0.7289 − 1.2399 − 0.3471 − 1.4741
Hidden to output layer weights
− 0.7313 − 0.8056 1.5831 − 0.6448 − 0.2140 − 0.8880 − 1.4498 − 1.6123
Output bias
0.5382

TS subsets, respectively. Herein, the scatterplots of the employed models are
shown together. The degree of deviation between the actual and estimated
values can be visualised through the red dotted lines set at the 10% level
in these diagrams. Histograms with kernel smoothing of the absolute error between
the actual and estimated values of OWC and MDD are presented in Fig. 8. It should
be noted that the absolute error illustrations are provided for the whole datasets.
As can be observed, the error distributions are concentrated towards the left of the
error range (i.e. they are positively skewed), with the bulk of samples falling
between 0 and 0.15. These results indicate that the estimates of the employed
models lie close to the actual values of OWC and MDD of soils.

Fig. 6  Scatter plot of OWC predictions: a training, b testing and c total datasets

Fig. 7  Scatter plot of MDD predictions: a training, b testing and c total datasets

Fig. 8  a, b Error histogram with kernel smooth for OWC and MDD estimations (for all datasets).
Axes: error range (0.00 to 0.40) against count. Normal-fit parameters shown in the legends:

Model   (a) OWC mu   (a) OWC sigma   (b) MDD mu   (b) MDD sigma
ANN     0.0325       0.0280          0.031        0.03169
ELM     0.0464       0.0397          0.04753      0.03633
SVR     0.0638       0.0501          0.06664      0.04934
KNR     0.0895       0.0684          0.06879      0.046
GMDH    0.0604       0.0404          0.10997      0.05867
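
The absolute-error summary reported with the error histograms (mean mu, standard deviation
sigma, and the share of samples within a given error range) can be reproduced along the
following lines. This is a minimal sketch: the function name absolute_error_summary and
the sample values are hypothetical, and the use of the population standard deviation is an
assumption, since the fitting procedure behind the legend values is not stated.

```python
from statistics import fmean, pstdev


def absolute_error_summary(actual, estimated, threshold=0.15):
    """Summarise |actual - estimated| for one model.

    ASSUMPTION: sigma is the population standard deviation of the absolute
    errors; threshold is the upper bound of the error range of interest.
    """
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - e) for a, e in zip(actual, estimated)]
    within = sum(err <= threshold for err in errors)
    return {
        "mu": fmean(errors),
        "sigma": pstdev(errors),
        "share_within_threshold": within / len(errors),
    }


# Hypothetical normalised values, for illustration only (not data from the study).
actual = [0.20, 0.45, 0.60, 0.80]
estimated = [0.25, 0.40, 0.70, 0.60]
summary = absolute_error_summary(actual, estimated)
print(summary)
```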
The Taylor diagram (Taylor 2001) is a 2-D graphic that provides a rapid
evaluation of a model’s precision. It summarises the relationship between
estimated and actual observations using three statistics: the correlation
coefficient (R), the standard deviation ratio and the RMSE. Each model is
represented as a point in the diagram; for an ideal model, this point coincides
with the reference point, ‘Ref’. Figure 9 shows the Taylor diagrams for the ANN
and ELM models employed for OWC and MDD predictions, with the results for the
TR and TS subsets presented separately. As can be seen, the ANN model is the
most precise in both the OWC and MDD predictions, as its violet marker appears
closest to the ‘Ref’ point.
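
The three statistics that place a model on a Taylor diagram can be computed as in the
sketch below. Following Taylor (2001), the RMS term is taken here as the centred
(bias-removed) RMS difference E', which is tied to the other two quantities by
E'^2 = sd_pred^2 + sd_obs^2 − 2 · sd_pred · sd_obs · R; this relation is what allows
all three to be read from a single point. The function name taylor_statistics and the
sample values are hypothetical.

```python
import math


def taylor_statistics(observed, predicted):
    """Return R, the standard deviation ratio and the centred RMS difference."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    # Population standard deviations of both series.
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_p = math.sqrt(sum(d * d for d in dev_p) / n)
    if sd_o == 0.0 or sd_p == 0.0:
        raise ValueError("both series must vary")
    r = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * sd_o * sd_p)
    # Centred RMS difference: RMS of the differences after removing both means.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return {"r": r, "sd_obs": sd_o, "sd_pred": sd_p,
            "sd_ratio": sd_p / sd_o, "crmsd": crmsd}


# Hypothetical normalised values, for illustration only (not data from the study).
observed = [0.20, 0.45, 0.60, 0.80]
predicted = [0.25, 0.40, 0.70, 0.60]
stats = taylor_statistics(observed, predicted)

# Law-of-cosines relation linking the three statistics.
lhs = stats["crmsd"] ** 2
rhs = (stats["sd_pred"] ** 2 + stats["sd_obs"] ** 2
       - 2 * stats["sd_pred"] * stats["sd_obs"] * stats["r"])
print(stats, abs(lhs - rhs) < 1e-12)
```

An ideal model gives R = 1, a standard deviation ratio of 1 and a centred RMS difference
of 0, which is the position of the ‘Ref’ point.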


Fig. 9  Taylor diagram for OWC (a training and b testing) and MDD (c training and d testing) predictions

6 Practical Implication

ANNs have significant practical implications for solving regression problems
across various domains. They are powerful tools owing to their ability to model
complex, non-linear relationships between inputs and outputs, which makes them
particularly effective for regression tasks where traditional linear models fall short.
The approach presented in this study can be used for the estimation of soil compaction
parameters. In many cases, the subgrade/sub-base materials of railways, roadways and
airport runways need to undergo treatment in order to enhance their strength.
Typically, engineers and practitioners engage in soil stabilisation, which involves
combining several soils and compacting them to enhance the initial strength of the
subgrade materials. Hence, the suggested framework can be employed to evaluate the
degree of compaction during the initial phases for various types of soils and soil
combinations. This, in turn, allows researchers and practitioners to design the
subgrade/sub-base materials by adjusting the mixture proportions of different soils
according to the specific needs of the site. The process of deploying the proposed ANN
to estimate the soil compaction parameters can be outlined as follows: (a) gathering
samples during the initial survey and site inspection and (b) determining fundamental
soil properties such as grain size and plasticity characteristics. Based on these
characteristics, the proposed approach can be utilised to obtain approximate estimates
of the OWC and MDD. Previously obtained test results can serve as a benchmark to verify
the estimated results. However, it is recommended to conduct laboratory trials first,
especially for crucial sites, before relying directly on the estimated values.
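
The benchmark check mentioned above can be sketched as a simple relative-deviation test.
The function name check_against_benchmark, the 10% tolerance and the sample values are
assumptions made for illustration; they are not prescribed by the study.

```python
def check_against_benchmark(estimated, laboratory, tolerance=0.10):
    """Compare a model estimate with a previously obtained laboratory value.

    Returns the relative deviation and whether it lies within the tolerance.
    ASSUMPTION: a 10% relative tolerance is used purely as an example.
    """
    if laboratory == 0:
        raise ValueError("laboratory value must be non-zero")
    deviation = abs(estimated - laboratory) / abs(laboratory)
    return deviation, deviation <= tolerance


# Hypothetical MDD values in g/cc, for illustration only.
deviation, acceptable = check_against_benchmark(estimated=1.80, laboratory=1.85)
print(f"relative deviation = {deviation:.3f}, acceptable = {acceptable}")
```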

7 Summary and Conclusion

The present work presents a comparative analysis of five soft computing techniques
for the estimation of OWC and MDD of soils. Based on the experimental results,
the employed ANN model was found to be the best model, with (a) R² = 0.9350 and
RMSE = 0.0400 (in the training phase) and R² = 0.9050 and RMSE = 0.0530 (in the
testing phase) for OWC and (b) R² = 0.9341 and RMSE = 0.0421 (in the training phase)
and R² = 0.9131 and RMSE = 0.0522 (in the testing phase) for MDD predictions. KNR and
GMDH were found to be the least effective models for the estimation of OWC and MDD,
respectively. The overall performance shows that the employed ANN model (R² = 0.9249
and RMSE = 0.0429 for OWC and R² = 0.9277 and RMSE = 0.0443 for MDD predictions) is
the most precise model and can be utilised as an alternative tool to estimate soil
compaction parameters to aid geotechnical engineers in the design phase of civil
engineering projects. The main advantages of the employed ANN model include (i)
the use of real-life datasets; (ii) the consideration of 13 different soil types;
(iii) higher prediction accuracy and (iv) a high degree of reliability. However, the
selection of the optimum model using a trial-and-error approach can be seen as one of
the limitations of the present study. In addition, an external validation is necessary
to ensure the robustness of the employed model for a new dataset. Therefore, the future
direction of this work may include (i) a detailed assessment of the accuracy of other
soft computing models using actual data from various areas of geotechnical engineering;
(ii) a comparison of the ANN model against hybrid ANN models and (iii) implementation
of meta-heuristic algorithms to construct high-performance models, followed by a
comparative assessment of results.
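
For readers wishing to reproduce the two indices quoted above on their own data, a
minimal sketch is given below. It assumes the standard definitions, namely
RMSE = sqrt(mean((actual − predicted)^2)) and R² = 1 − SS_res / SS_tot; the function
names and the sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (standard definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    if ss_tot == 0.0:
        raise ValueError("actual values must vary")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, for illustration only (not data from the study).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```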

Author Contribution All authors contributed equally to this manuscript.

Data Availability All data will be made available on request.

Declarations
Ethics Approval and Consent to Participate Not applicable.

Consent for Publication Not applicable.

Competing Interests The authors declare no competing interests.

References
Alkayem, N.F., Shen, L., Mayya, A., Asteris, P.G., Fu, R., Di Luzio, G., Strauss, A., Cao, M.: Prediction
of concrete and FRC properties at high temperature using machine and deep learning: a review of
recent advances and future perspectives. J. Build. Eng. 83, 108369 (2023)


Ardakani, A., Kordnaeij, A.: Soil compaction parameters prediction using GMDH-type neural network
and genetic algorithm. Eur. J. Environ. Civ. Eng. 23, 449–462 (2019)
Asteris, P.G., Rizal, F.I.M., Koopialipoor, M., Roussis, P.C., Ferentinou, M., Armaghani, D.J., Gordan,
B.: Slope stability classification under seismic conditions using several tree-based intelligent
techniques. Appl. Sci. 12(3), 1753 (2022)
Asteris, P.G., Karoglou, M., Skentou, A.D., Vasconcelos, G., He, M., Bakolas, A., Zhou, J., Armaghani,
D.J.: Predicting uniaxial compressive strength of rocks using ANN models: incorporating porosity,
compressional wave velocity, and Schmidt hammer data. Ultrasonics 141, 107347 (2024)
Bardhan, A.: Probabilistic assessment of heavy-haul railway track using multi-gene genetic
programming. Appl. Math. Model. 125, 687–720 (2024)
Bardhan, A., Asteris, P.G.: Application of hybrid ANN paradigms built with nature inspired
meta-heuristics for modelling soil compaction parameters. Transp. Geotech. 41, 100995 (2023)
Bardhan, A., Alzo’ubi, A.K., Palanivelu, S., Hamidian, P., GuhaRay, A., Kumar, G., Tsoukalas, M.Z.,
Asteris, P.G.: A hybrid approach of ANN and improved PSO for estimating soaked CBR of
subgrade soils of heavy-haul railway corridor. Int. J. Pavement Eng. 24, 2176494 (2023a)
Bardhan, A., Singh, R.K., Ghani, S., Konstantakatos, G., Asteris, P.G.: Modelling soil compaction
parameters using an enhanced hybrid intelligence paradigm of ANFIS and improved Grey Wolf Optimiser.
Mathematics 11, 3064 (2023b)
Benzaamia, A., Ghrici, M., Rebouh, R., Zygouris, N., Asteris, P.G.: Predicting the shear strength of
rectangular RC beams strengthened with externally-bonded FRP composites using constrained
monotonic neural networks. Eng. Struct. 313, 118192 (2024)
Blotz, L.R., Benson, C.H., Boutwell, G.P.: Estimating optimum water content and maximum dry unit
weight for compacted clays. J. Geotech. Geoenviron. Eng. 124, 907–912 (1998)
Di Matteo, L., Bigotti, F., Ricco, R.: Best-fit models to estimate modified proctor properties of compacted
soil. J. Geotech. Geoenviron. Eng. 135, 992–996 (2009)
Ghani, S., Kumari, S., Bardhan, A.: A novel liquefaction study for fine-grained soil using PCA-based
hybrid soft computing models. Sādhanā 46, 113 (2021). https://doi.org/10.1007/s12046-021-01640-1
Günaydın, O.: Estimation of soil compaction parameters by using statistical analyses and artificial neural
networks. Environ. Geol. 57, 203–215 (2009)
Gurtug, Y., Sridharan, A.: Compaction behaviour and prediction of its characteristics of fine grained soils
with particular reference to compaction energy. Soils Found. 44, 27–36 (2004)
He, B., Armaghani, D.J., Lai, S.H., He, X., Asteris, P.G., Sheng, D.: A deep dive into tunnel blasting
studies between 2000 and 2023—a systematic review. Tunn. Undergr. Space. Technol. 147, 105727
(2024)
Huang, G.B., Zhu, Q.Y., Siew, C.K.: Extreme learning machine: theory and applications.
Neurocomputing 70, 489–501 (2006). https://doi.org/10.1016/j.neucom.2005.12.126
Huat, C.Y., Armaghani, D.J., Lai, S.H., Motaghedi, H., Asteris, P.G., Fakharin, P.: Analyzing surface
settlement factors in single and twin tunnels: a review study. J. Eng. Res. (2024).
https://doi.org/10.1016/j.jer.2024.05.009
Kaloop, M.R., Bardhan, A., Hu, J.W., Abd-Elrahman, M.: Estimation of lightweight aggregate concrete
characteristics using a novel stacking ensemble approach. Adv. Nano Res. 13, 499–512 (2022)
Kardani, N., Aminpour, M., Raja, M.N.A., Kumar, G., Bardhan, A., Nazem, M.: Prediction of the
resilient modulus of compacted subgrade soils using ensemble machine learning methods.
Transp. Geotech. 36, 100827 (2022)
Khan, K., Iqbal, M., Jalal, F.E., Amin, M.N., Alam, M.W., Bardhan, A.: Hybrid ANN models for
durability of GFRP rebars in alkaline concrete environment using three swarm-based optimization
algorithms. Constr. Build. Mater. 352, 128862 (2022)
Khuntia, S., Mujtaba, H., Patra, C., Farooq, K., Sivakugan, N., Das, B.M.: Prediction of compaction
parameters of coarse grained soil using multivariate adaptive regression splines (MARS). Int. J.
Geotech. Eng. 9, 79–88 (2015)
Kumar, V., Rao, B., Burman, A., Kumar, S., Bardhan, A.: An exact solution of three-dimensional rock
mass strength criterion. Model. Earth Syst. Environ. 9, 723–734 (2023)
Kurnaz, T.F., Kaya, Y.: The performance comparison of the soft computing methods on the prediction of
soil compaction parameters. Arab. J. Geosci. 13, 1–13 (2020)
Liu, D., Liu, H., Wu, Y., Zhang, W., Wang, Y., Santosh, M.: Characterization of geo-material parameters:
gene concept and big data approach in geotechnical engineering. Geosyst. Geoenviron. 1(1), 100003
(2022)


Liu, S., Wang, L., Zhang, W., He, Y., Pijush, S.: A comprehensive review of machine learning-based
methods in landslide susceptibility mapping. Geol. J. 58(6), 2283–2301 (2023)
Nagaraj, H.B., Reesha, B., Sravan, M.V., Suresh, M.R.: Correlation of compaction characteristics of
natural soils with modified plastic limit. Transp. Geotech. 2, 65–77 (2015)
Najjar, Y.M., Basheer, I.A., Naouss, W.A.: On the identification of compaction characteristics by
neuronets. Comput. Geotech. 18, 167–187 (1996)
Omar, M., Shanableh, A., Basma, A., Barakat, S.: Compaction characteristics of granular soils in United
Arab Emirates. Geotech. Geol. Eng. 21, 283–295 (2003)
Phoon, K.K., Zhang, W.: Future of machine learning in geotechnics. Georisk Assess. Manage. Risk. Eng.
Syst. Geohazards 17(1), 7–22 (2023)
Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the
brain. Psychol. Rev. 65(6), 386 (1958)
Salami, B.A., Iqbal, M., Abdulraheem, A., Jalal, F.E., Alimi, W., Jamal, A., Tafsirojjaman, T., Liu, Y.,
Bardhan, A.: Estimating compressive strength of lightweight foamed concrete using neural, genetic
and ensemble machine learning approaches. Cem. Concr. Compos. 133, 104721 (2022)
Sinha, S.K., Wang, M.C.: Artificial neural network prediction models for soil compaction and
permeability. Geotech. Geol. Eng. 26, 47–64 (2008)
Sridharan, A., Nagaraj, H.B.: Plastic limit and compaction characteristics of fine grained soils. Proc. Inst.
Civ. Eng. Improv. 9, 17–22 (2005)
Taylor, K.E.: Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res.
Atmos. 106, 7183–7192 (2001). https://doi.org/10.1029/2000JD900719
Topal, U., Goodarzimehr, V., Bardhan, A., Vo-Duy, T., Shojaee, S.: Maximization of the fundamental
frequency of the FG-CNTRC quadrilateral plates using a new hybrid PSOG algorithm. Compos.
Struct. 295, 115823 (2022)
Wang, H.-L., Yin, Z.-Y.: High performance prediction of soil compaction parameters using multi
expression programming. Eng. Geol. 276, 105758 (2020)
Wang, L., Wu, C., Tang, L., Zhang, W., Lacasse, S., Liu, H., Gao, L.: Efficient reliability analysis of earth
dam slope stability using extreme gradient boosting method. Acta. Geotech. 15, 3135–3150 (2020)
Wu, C., Hong, L., Wang, L., Zhang, R., Pijush, S., Zhang, W.: Prediction of wall deflection induced by
braced excavation in spatially variable soils via convolutional neural network. Gondwana. Res. 123,
184–197 (2023)
Yousif, A.A.A., Mohamed, I.A.: Prediction of compaction parameters from soil index properties case
study: dam complex of Upper Atbara Project. Am. J. Pure. Appl. Sci. 4, 1–9 (2022)
Zhang, W., Li, H., Li, Y., Liu, H., Chen, Y., Ding, X.: Application of deep learning algorithms in
geotechnical engineering: a short critical review. Art. Intell. Rev. 54, 1–41 (2021)
Zhang, W., Gu, X., Tang, L., Yin, Y., Liu, D., Zhang, Y.: Application of machine learning, deep learning
and optimization algorithms in geoengineering and geoscience: comprehensive review and future
challenge. Gondwana. Res. 109, 1–17 (2022a)
Zhang, W., Liu, S., Wang, L., Samui, P., Chwała, M., He, Y.: Landslide susceptibility research
combining qualitative analysis and quantitative evaluation: a case study of Yunyang County in Chongqing,
China. Forests 13(7), 1055 (2022b)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and
applicable law.
