Modeling The Chloride Migration of Recycled Ag
Handling Editor: Zhen Leng

Keywords: Recycled aggregate concrete; Rapid chloride migration test; Ensemble learners; SHAP analysis

The use of supplementary cementitious materials such as slag, and of recycled aggregate, in concrete can mitigate some of the negative environmental impacts of using virgin materials. However, the durability of recycled aggregate concrete (RAC) and its resistance to harsh environmental conditions such as chloride penetration must be investigated before practical applications. The rapid chloride migration test (RCMT) is a well-established test that provides valuable estimates of concrete quality against chloride penetration. RCMT coupled with machine learning techniques can lead to reliable models, which could save time, cost, and materials, and reduce the need for skilled technicians. In this study, five homogeneous ensemble learners, including two types of bagging and three types of boosting techniques, were developed to model the RCMT output using a comprehensive database collected from the literature. Different types of analysis, including statistical measures, SHapley Additive exPlanations (SHAP) sensitivity analysis, a SHAP parametric study, and a comparison study, were conducted to examine the performance of the developed models and the effects of the input features on predictions. The results show that the developed extreme gradient boosting learner, with a mean absolute percentage error of about 9%, possesses excellent capability for modeling the RCMT of RAC. The RCMT testing age is the most influential factor affecting the RCMT output, followed by the amounts of natural fine aggregate and superplasticizer. Finally, a graphical user interface (GUI) was designed, which allows users to enter the input features and obtain the RCMT output in a user-friendly environment.
absorption negatively impacts the mechanical properties and durability performance of RAC (Guo et al., 2018).

Among the various types of chemical attack that a concrete structure can experience during its service life, chloride penetration is the dominant damage mechanism: it causes the corrosion of steel bars and deteriorates the mechanical performance of reinforced concrete structures (Liang et al., 2021). Environmental factors such as high temperature and relative humidity, as well as harsh exposure zones, including submerged, tidal, splash, and salt spray zones, exacerbate chloride ion penetration into concrete and reduce the service life of reinforced concrete structures. Chloride ingress into concrete is a long process that may occur over the course of several years. To simulate this process in the laboratory, a few standardized short-term chloride penetration resistance tests have been proposed to investigate the concrete quality against chloride ion penetration. The rapid chloride penetration test (RCPT), rapid chloride migration test (RCMT), resistivity test, and pressure penetration test are the most popular short-term laboratory-based tests of the chloride penetration resistance of concrete. RCPT and RCMT are two well-established short-term chloride penetration resistance tests based on imposing a direct current (DC) voltage on concrete specimens exposed to chemical solutions. Several criticisms have been raised against the RCPT, concerning the chloride ion movement, the role of other ions such as hydroxide in the results, the non-steady-state migration measurement, and the heating of the concrete sample caused by the high applied voltage (Bagheri and Zanganeh, 2012). In addition, the use of SCMs in concrete mixtures reduces the hydroxide ion concentration in the pore solution, so the RCPT measurement indicates a higher resistance to chloride ion penetration. To address these shortcomings, the RCMT was proposed, which correlates well with the results of long-term chloride penetration tests (Bagheri and Zanganeh, 2012). In the RCMT, as per NT Build 492 (NT Build 492, 1999), a standard concrete specimen is exposed to limewater on one side and 3% NaCl on the other side under an applied voltage, and the chloride migration coefficient (CMC) is calculated using the Nernst-Einstein equation (NT Build 492, 1999). Although this experimental test can provide reliable information on the durability and chloride penetration resistance of concretes, the preparation of the concrete specimens and apparatus, as well as the experimental process of the RCMT, are time and resource intensive.

Several studies have been conducted in the last few years to model the mechanical properties and durability of various types of concretes using machine learning (ML) techniques (Behnood and Golafshani, 2021). Ensemble ML techniques, which combine several base ML models, have been employed successfully in recent years because of their highly reliable predictions. Examples of previous studies modeling the properties of RAC using ensemble techniques are listed in Table 1. All studies given in this table report that the ensemble models outperform other types of base ML models. Regarding the chloride penetration resistance of RAC, K. H. Liu et al. (2022) developed several individual and ensemble ML models for predicting the charge passed in the rapid chloride penetration test (RCPT) using 226 experimental data gathered from the literature. They concluded that the developed gradient-boosting decision tree is the best predictive model compared to the other developed ML models. However, while developing the gradient-boosting decision tree, they did not investigate the effect of some hyperparameters (e.g., the maximum tree depth), which can significantly affect the accuracy of the developed model. For the chloride penetration resistance of concrete mixtures containing SCMs, Quan Tran (2022) employed several ML techniques, modeled the chloride diffusion coefficient of concrete using 127 data samples, and claimed that the developed gradient boosting model had the best accuracy among all developed ML models. However, the developed models were built with the default hyperparameter values given in the sklearn library of Python, not the optimal values.

Table 1
The ensemble models used for modeling the mechanical properties and durability of RAC.
Concrete type | Modeled property | Number of data | Ref
Self-compacting RAC | Compressive strength | 515 | (de-Prado-Gil et al., 2022)
RAC with slag and fly ash | RCPT | 226 | (K. H. Liu et al., 2022)
RAC without SCMs | Sulfate resistance | 143 | (K. Liu et al., 2022)
RAC with slag and fly ash | Elastic modulus | 526 | Han et al. (2020)
RAC without SCMs | Compressive strength | 721 | Quan Tran et al. (2022)
RAC with slag | Compressive strength | 126 | Imran et al. (2022)
RAC without SCMs | Compressive strength | 209 | Duan et al. (2020)
RAC without SCMs | Compressive and flexural strengths | 638–139 | Yuan et al. (2022)
RAC with silica fume, slag, fly ash, and metakaolin | Carbonation depth | 713 | Nunez and Nehdi (2021)

A considerable number of experimental studies have been carried out to investigate the chloride penetration of RAC using RCMT. Collecting the data from relevant studies and modeling the RCMT results of RAC
E. Mohammadi Golafshani et al. Journal of Cleaner Production 407 (2023) 136968
using ML techniques can provide several advantages, including (1) development of a unique predictive model integrating the individual studies, (2) reduction of the effect of noisy data of the experimental studies on predictions, (3) generation of a reliable durability prediction model for engineers, and (4) determination of the most influential factors affecting the chloride migration for optimizing RAC mix design. Because of the better performance of the ensemble models compared to individual ML models, two types of bagging ensemble techniques and three types of boosting ensemble techniques were employed to model the RCMT of RAC. In addition, the influential hyperparameters of the ensemble techniques were tuned in order to achieve accurate models. This paper is organised into the following sections. First, the description of the gathered data is given in Section 2. Then, Section 3 presents the research methodology used in this study. Finally, the results and discussion are presented in Section 4, followed by the summarized findings of this study in Section 5.

2. Data description

A comprehensive database of the experimental observations of RCMT, as one of the well-established accelerated chloride penetration tests, was gathered from the literature. The total number of samples gathered is 227, obtained from 10 scientific studies. The references and the sample IDs used for developing ML models in this study are given in Appendix A. For modeling the RCMT of RAC, the contents (in kg/m³) of cement (C), slag (S), water (W), natural coarse aggregate (NCA), natural fine aggregate (NFA), recycled coarse aggregate (RCA), superplasticizer (SP), as well as the values of recycled coarse aggregate water absorption (RWA), compressive strength (MPa) at 28 days (CS), and the testing age in days (TA) were selected as the candidate input features of the ML models. The linear correlation coefficients of the initial candidate input features are illustrated in Fig. 1(a), indicating that NCA and RCA are highly correlated. Hence, the RCA ratio (RAR) was defined as the ratio of the amount of RCA to the total amount of coarse aggregate, and replaced NCA and RCA in the final input features to reduce the multi-collinearity risk. As shown in Fig. 1(b), the mutual correlation coefficients of the final input features are less than 0.7, denoting a low multi-collinearity risk in the defined system. The output of the RCMT system is the chloride migration coefficient (CMC) of RAC. The negative linear correlation value (−0.35 in Fig. 1(b)) of slag with the output indicates the positive influence of this SCM in decreasing the chloride migration of RAC. Moreover, the adverse effect of RCA on chloride migration can be observed from the positive linear correlation values of RAR (0.35 in Fig. 1(b)) and RWA (0.27 in Fig. 1(b)) with the output.

The statistical parameters of the input and output features in the RCMT database are given in Table 2. More than 55% and 29% of samples in the gathered database include RCA and slag, respectively. This shows the importance of slag as an SCM in concrete mixtures for the chloride migration reduction of RAC in past studies. Besides, the mean and maximum values of the compressive strength of RAC are promising for achieving high-performance RAC, especially in practical applications. Fig. 2 visualizes the contour plots between the input and output features using the probability distributions. As shown in this figure, increasing slag, NFA, SP, CS, and TA generally decreases the CMC of RAC. In contrast, W, RAR, and RWA adversely influence the CMC of RAC. However, the general trend is not apparent regarding C, and more investigations are required. The density of C contents is higher in the range of almost [330 kg/m³, 450 kg/m³], and the corresponding CMC values are in the range of about [10 × 10⁻¹² m²/s, 14 × 10⁻¹² m²/s]. The probability distributions of S and RAR are denser around zero, indicating a zero amount of S and RAR in a significant number of data records in the database. The probability distribution of W is higher between about 140 kg/m³ and 160 kg/m³, and the corresponding CMC values vary between almost 9 × 10⁻¹² m²/s and 14 × 10⁻¹² m²/s. For NFA, two spots with high density can be observed: a small spot centered at almost 700 kg/m³ and a big spot centered at about 1000 kg/m³. The big spot has a lower CMC than the small spot, suggesting that a higher amount of NFA can reduce chloride migration. In the case of RWA, two dense regions can be seen. The first region, centered at zero, is related to data records without RCA, and the second region, centered at almost 6.8%, corresponds to the data records with RCA. The centers of the dense clusters of SP and CS are almost 2.3 kg/m³ and 52 MPa, respectively. Three distinct dense regions with centers of 7, 28, and 91 days are observed in the probability distributions of TA, indicating the focus of researchers on measuring the CMC at these standard ages.

3. Research methodology

Fig. 3 summarizes the general framework of this study to develop reliable models for estimating the chloride penetration of RAC. Gathering a reliable database and cleaning the data substantially impact the quality and credibility of the developed ML models. Next, the collected database is partitioned into several groups for various purposes. After developing the ensemble models, the reliability of the ML models is examined using unseen data, and the best one is chosen for further analyses. A SHapley Additive exPlanations (SHAP)-based interpretation method is used to find the relationships between the input and output features to interpret the outcomes of the final ensemble model, which is followed by developing a graphical user interface. More explanations of the research methodology are given in the following subsections.

Table 2
Statistical parameters of the input and output features of the RCMT database.
Input | Mean | std | Min | 25% | 50% | 75% | Max

Fig. 2. Probability distribution contour plots between the input and output features.

Fig. 3. The research methodology.

This section divides the database randomly into two main groups for the development and testing phases. In the development phase, the training-validating samples are used to develop the ensemble models, while the performance of the developed models is verified using the testing samples. For the development phase, the k-fold cross-validation technique is used, allowing all samples to be used in both the training and validating stages, as demonstrated in Fig. 4. In this regard, the training-validating samples are partitioned randomly into k folds of almost equal size, and k ensemble models are created in total. In each iteration, one ensemble model is generated in which (k−1) folds are
algorithm, and the average error of the developed ensemble model in k folds for validating sample groups determines the suitability of the selected hyperparameters.

Notation for the error indices (including RRMSE and PI): $EO_i$: the ith experimental observation; $MO_i$: the ith model output; $N$: the number of samples; $\overline{EO}$: the average experimental observation.
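The k-fold tuning step described above (score each hyperparameter setting by its average validation error over k folds, then keep the best) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, RMSE is assumed as the validation error, and a small nearest-neighbour regressor (`knn_predict`, a hypothetical stand-in) replaces the ensemble learners so that the example needs only the Python standard library.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k folds of almost equal size."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]) ** 0.5


def knn_predict(x_train, y_train, x_new, n_neighbors):
    """Hypothetical stand-in model: average the targets of the nearest points."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))
    return mean([y_train[i] for i in order[:n_neighbors]])


def cv_error(x, y, n_neighbors, k=5, seed=0):
    """Mean validation RMSE over k folds for one hyperparameter setting."""
    errors = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]  # the other k-1 folds
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        preds = [knn_predict(x_tr, y_tr, x[i], n_neighbors) for i in held_out]
        errors.append(rmse([y[i] for i in held_out], preds))
    return mean(errors)


# Synthetic data with a noisy decreasing trend (NOT the paper's database).
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(60)]
y = [20 - 1.5 * xi + rng.gauss(0, 1) for xi in x]

candidates = [1, 3, 5, 15]  # hypothetical hyperparameter grid
scores = {c: cv_error(x, y, c) for c in candidates}
best = min(scores, key=scores.get)
print("mean validation RMSE per setting:", scores)
print("selected setting:", best)
```

Every sample is used for validation exactly once and for training k−1 times, which is what allows a small database to serve both purposes.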
uses the Shapley value to calculate the feature importance, makes the developed ML models interpretable, and gives a clear insight into the contributions of the input features in predicting the output feature. The Shapley value of the jth feature value ($\varphi_j$) is calculated by the weighted summation of the marginal contribution of the jth feature to the model output over all possible subsets of features ($S$), excluding the jth feature, as follows:

$$\varphi_j = \sum_{S \subseteq \{1,\dots,M\} \setminus \{j\}} \frac{|S|!\,(M - |S| - 1)!}{M!} \left[ f(S \cup \{j\}) - f(S) \right]$$
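To make the weighted sum concrete, the sketch below evaluates the Shapley formula by brute force for a three-feature toy model, taking M as the total number of features (the standard reading of the formula). The model `toy_model`, the instance, and the all-zero baseline are hypothetical; here f(S) is taken to be the model output when the features in S take the instance values and the remaining features take the baseline values. Exhaustive enumeration is only feasible for a handful of features, and SHAP implementations for tree ensembles rely on faster algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values: enumerate every subset S that excludes feature j."""
    M = n_features
    phi = []
    for j in range(M):
        others = [i for i in range(M) if i != j]
        total = 0.0
        for size in range(M):  # |S| ranges over 0 .. M-1
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for subset in combinations(others, size):
                S = frozenset(subset)
                total += weight * (value_fn(S | {j}) - value_fn(S))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical model with one interaction term (not from the paper)."""
    return 2.0 * z[0] - 3.0 * z[1] + z[0] * z[2]


instance = (1.0, 2.0, 3.0)   # the sample being explained
baseline = (0.0, 0.0, 0.0)   # reference values for "absent" features


def f(S):
    """Model output with features in S at instance values, the rest at baseline."""
    z = [instance[i] if i in S else baseline[i] for i in range(len(instance))]
    return toy_model(z)


phi = shapley_values(f, 3)
print(phi)  # the interaction term x0*x2 is shared equally between features 0 and 2
```

The values sum to f(all features) − f(no features), the efficiency property that lets SHAP values be read as additive contributions to a single prediction.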
Table 5
Error indices of the developed ensemble models.
Models | Phases | RMSE (×10⁻¹² m²/s) | MAE (×10⁻¹² m²/s) | MAPE (%) | R-squared | RRMSE | PI
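The error indices named in the header of Table 5 can be computed from paired experimental observations (EO) and model outputs (MO). The sketch below is illustrative rather than the authors' code, and it rests on two assumptions: R-squared is taken as the coefficient of determination, and RRMSE is taken as the RMSE divided by the mean experimental observation. PI is omitted because its definition is not recoverable from this text. The sample values are made up.

```python
from math import sqrt


def error_indices(eo, mo):
    """Return RMSE, MAE, MAPE (%), R-squared and RRMSE for observations eo and outputs mo."""
    n = len(eo)
    eo_mean = sum(eo) / n
    residuals = [e - m for e, m in zip(eo, mo)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    ss_tot = sum((e - eo_mean) ** 2 for e in eo)     # total sum of squares
    rmse = sqrt(ss_res / n)
    return {
        "RMSE": rmse,
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100.0 * sum(abs(r / e) for r, e in zip(residuals, eo)) / n,
        "R2": 1.0 - ss_res / ss_tot,                 # assumed: coefficient of determination
        "RRMSE": rmse / eo_mean,                     # assumed: RMSE relative to the mean of EO
    }


# Made-up CMC values in units of 1e-12 m2/s (NOT taken from the paper).
observed = [10.0, 12.0, 8.0, 14.0]
predicted = [11.0, 12.0, 7.0, 13.0]
indices = error_indices(observed, predicted)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE carry the units of the CMC, whereas MAPE, R-squared, and RRMSE are dimensionless, which makes the latter three more convenient when comparing models built on different databases.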
Fig. 8. The CMC predictions against the experimental observations for a) RFL, b) BETL, c) ABL, d) GBL, and e) XGBL models (out-of-range data points are shown in red). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
the RMSE of 0.95 × 10⁻¹² m²/s has the best RMSE without considering outliers in the testing phase.

Fig. 9 depicts the error probability distributions of the developed ensemble models for the RCMT system. The results indicate that all developed models possess good distributions around zero error. However, the mean bias errors of the GBL and XGBL models are lower than those of the other ensemble models. The error probability distributions of the RFL and BETL models are similar, showing only a slight difference between their performances. The XGBL model ranks second for both the least mean bias error and the standard deviation, while the GBL model ranks first and third for the least mean bias error and the standard deviation, respectively. In addition, the third rank for the least mean bias error and the first rank for the least prediction variability belong to the developed ABL model.

Only the developed XGBL model, as the best-developed ensemble model, was used for the sensitivity analysis. For this purpose, the SHAP technique was used to determine the importance of each input feature in the RCMT prediction. The average impact of each input feature on the output was calculated based on the mean value of absolute SHAP values, as demonstrated in Fig. 10. The testing age, with the highest mean SHAP value, is the most noteworthy feature affecting the RCMT prediction, followed by the natural fine aggregate and superplasticizer. There is no significant difference between the mean SHAP values of water, slag, and cement features influencing the CMC. In addition, recycled coarse aggregate water absorption has slightly more influence than the recycled coarse aggregate ratio on the model output.

To investigate the effects of the input features on the RCMT system, a parametric study was conducted for all the input features, as shown in Fig. 11. For this, the SHAP values for each input feature were plotted against the feature values. Increasing cement, slag, natural fine aggregate, superplasticizer, compressive strength, and testing age reduces the chloride migration of RAC. In contrast, increasing water, the recycled coarse aggregate ratio, and the recycled coarse aggregate water absorption negatively impacts the chloride migration of RAC. It is worth mentioning that increasing the recycled coarse aggregate ratio to more than 0.5 exacerbates the chloride migration of RAC. The rate of increase in chloride migration is notable for recycled coarse aggregate water absorption below about 5%. A significant difference exists between the chloride migration of RAC with and without slag. For compressive strengths of less than almost 40 MPa, the chloride migration of RAC is more critical than for concrete with compressive strengths of more than 40 MPa. The negative impacts of the recycled coarse aggregate ratio and recycled coarse aggregate water absorption in the concrete mixture can be related to the weaker interfacial transition zone of concrete for the higher range of RCA in the concrete mixture and the more porous structure of RCA with higher water absorption compared to normal concrete. This negative impact can be compensated using less water,

5. Conclusions

The rapid chloride penetration test of recycled aggregate concrete (RAC) can provide valuable information about the durability of concrete and its resistance to chloride ion ingress. In this study, a comprehensive database of the rapid chloride migration test (RCMT) of RAC was gathered from the literature, and the chloride migration coefficient (CMC) of RAC was the system output. Five homogeneous ensemble models, including Random forest learner (RFL), Bagged extra trees learner (BETL), Adaptive boost learner (ABL), Gradient boosting learner (GBL), and Extreme gradient boosting learner (XGBL), were developed to estimate the CMC of RAC. The following conclusions were drawn from this study.

• All developed ensemble models in this study possess “good performance” in the validating phase, which shows the high generalization capability of ensemble models in estimating the rapid chloride migration test output of RAC.
• The developed XGBL and GBL models need a lower maximum tree depth and a higher number of base models compared to the developed RFL, BETL, and ABL models.
• The developed XGBL model outperforms the other ensemble models with the root mean squared error, the mean absolute percentage error, and the R-squared of 1.04 × 10⁻¹² m²/s, 6.69%, and 0.94 in the testing phase, respectively. In addition, the prediction performance of the XGBL model in the testing phase is excellent, considering the relative root mean squared error of 0.08.
Fig. 11. Parametric study of a) C, b) S, c) W, d) NFA, e) RAR, f) RWA, g) SP, h) CS, and i) TA on the CMC obtained by the XGBL model.
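Both the importance ranking (mean absolute SHAP value per feature) and the parametric study in Fig. 11 (SHAP value plotted against feature value) start from the same per-sample matrix of SHAP values. The sketch below shows the two post-processing steps on a tiny made-up matrix; the numbers are hypothetical, are not the paper's results, and only three of the input features are included for brevity. Producing the SHAP matrix itself would require a fitted model and a SHAP explainer, which is outside this sketch.

```python
def mean_abs_shap(shap_rows, names):
    """Rank features by the mean of their absolute SHAP values over all samples."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)


def dependence_pairs(feature_rows, shap_rows, j):
    """(feature value, SHAP value) pairs for feature j, sorted by feature value."""
    return sorted((f[j], s[j]) for f, s in zip(feature_rows, shap_rows))


names = ["TA", "NFA", "SP"]          # testing age, natural fine aggregate, superplasticizer
features = [                         # hypothetical feature values per sample
    [7.0, 700.0, 2.0],
    [28.0, 850.0, 2.5],
    [91.0, 1000.0, 3.0],
]
shap_matrix = [                      # hypothetical SHAP values, same shape as `features`
    [2.0, -0.5, 0.3],
    [0.1, 0.4, -0.2],
    [-2.4, 0.6, -0.1],
]

ranking = mean_abs_shap(shap_matrix, names)
ta_curve = dependence_pairs(features, shap_matrix, names.index("TA"))
print(ranking)
print(ta_curve)
```

A SHAP value that falls as the feature value rises, as in the made-up testing-age column here, corresponds to a feature whose increase lowers the predicted CMC.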
Table 6
Comparison of different ML models for the RCMT system.
Best developed ML model | Concrete type | Input feature number | Data number | RMSE (×10⁻¹² m²/s) | MAE (×10⁻¹² m²/s) | R-squared | Refs
GBL | Concrete with silica fume, slag, and fly ash | 9 | 127 | 3.72 | 2.71 | 0.87 | Quan Tran (2022)
XGBL | Concrete with silica fume, slag, and fly ash | 12 | 176 | 2.14 | 1.35 | 0.86 | Taffese and Espinosa-Leal (2022)
Developed XGBL model in this study | RAC with slag | 9 | 227 | 1.04 | 0.73 | 0.94 | –
Fig. 12. A graphical user interface designed for modeling the RCMT output.
• Testing age and the amounts of natural fine aggregate and superplasticizer are the three crucial features affecting the CMC of RAC. There is no significant difference between the importance of water, slag, and cement contents affecting the CMC of RAC. Moreover, the recycled coarse aggregate ratio and the recycled coarse aggregate water absorption have the least impact on the CMC of RAC.
• The CMC of RAC is more critical for a recycled coarse aggregate ratio of more than 0.5, a recycled coarse aggregate water absorption of less than about 5%, and a compressive strength of less than almost 40 MPa. In addition, there is a significant difference between the chloride migration of RAC with and without slag.
• Recycled aggregate ratio and water absorption were two parameters used in this study as input features to model the CMC of RAC. These two parameters negatively affect the CMC of RAC, and the effect of the recycled coarse aggregate water absorption is slightly greater than that of the recycled coarse aggregate ratio.
• The use of recycled aggregate brings environmental benefits, including the preservation of precious virgin resources and the prevention of landfilling. Surface treatment techniques can reduce the negative impacts of recycled aggregate on the CMC by reducing its porous structure and producing a better interfacial transition zone in concrete.

For future studies, it is suggested to use other types of machine learning techniques, including fuzzy-inference systems and newly proposed ensemble techniques such as CatBoost and LightGBM, to model the chloride migration coefficient. In addition, the determination of hyperparameters of ensemble algorithms using existing techniques is a challenging task. To solve this problem, it is suggested to use metaheuristic optimization algorithms to tune the hyperparameters. Other chloride ion penetration resistance tests of concrete could also be modeled using machine learning techniques. The correlation between the predictions of different machine learning models for various tests can provide valuable information. Because of the importance of recycled aggregate water absorption in the chloride permeability of concrete, as concluded in this paper, it is suggested to improve the quality of recycled aggregate using novel surface treatment techniques to alleviate the chloride penetration of RAC. The chloride ion ingress resistance of concrete containing recycled coarse and/or fine aggregates, waste materials, and other types of supplementary cementitious materials is another interesting topic that can be modeled using machine learning techniques.

CRediT authorship contribution statement

Emadaldin Mohammadi Golafshani: Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft. Alireza Kashani: Investigation, Supervision, Writing – review & editing. Ali Behnood: Investigation, Supervision, Writing – review & editing. Taehwan Kim: Investigation, Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.
