Scikit-ANFIS Python implementation

The document presents Scikit-ANFIS, a Python implementation of the Adaptive Neuro-Fuzzy Inference System (ANFIS) that is compatible with the popular scikit-learn library. This implementation allows users to manually generate and train fuzzy systems while providing superior performance compared to existing Python-based ANFIS implementations. The paper highlights the advantages of Scikit-ANFIS in terms of usability, compatibility, and performance across various applications in machine learning and data analysis.


Int. J. Fuzzy Syst.

(2024) 26(6):2039–2057
https://doi.org/10.1007/s40815-024-01697-0

Scikit-ANFIS: A Scikit-Learn Compatible Python Implementation for Adaptive Neuro-Fuzzy Inference System

Dongsong Zhang (1,2) · Tianhua Chen (2)

Received: 1 August 2023 / Revised: 5 January 2024 / Accepted: 30 January 2024 / Published online: 3 June 2024
© The Author(s) 2024

Corresponding author: Tianhua Chen, T.Chen@hud.ac.uk

1 School of Big Data and Artificial Intelligence, Xinyang College, Xinyang 464000, Henan, China
2 School of Computing and Engineering, University of Huddersfield, Huddersfield HD1 3DH, UK

Abstract The adaptive neuro-fuzzy inference system (ANFIS) has shown great potential in processing practical data from control, prediction, and inference applications, offering both high performance and system interpretability as a result of the hybridization of neural networks and fuzzy systems. Matlab has been a prevalent platform that allows ANFIS to be used and deployed conveniently. On the other hand, due to the recent popularity of machine learning and deep learning, which are predominantly Python-based, implementations of ANFIS in Python have attracted recent attention. Although there are a few Python-based ANFIS implementations, none of them is directly compatible with scikit-learn (https://scikit-learn.org/), one of the most frequently used libraries in machine learning. As such, this paper proposes Scikit-ANFIS, a novel scikit-learn compatible Python implementation of ANFIS that adopts a uniform format, such as the fit() and predict() functions, to provide the same interface as scikit-learn. Scikit-ANFIS is designed in a user-friendly way, not only to manually generate a general fuzzy system and train it with the ANFIS method but also to automatically create an ANFIS fuzzy system. We also provide four kinds of representative cases to show that Scikit-ANFIS represents a valuable addition to the scikit-learn compatible Python software that supports ANFIS fuzzy reasoning. Experimental results on four datasets show that Scikit-ANFIS outperforms recent Python-based implementations while achieving performance on par with ANFIS in Matlab, the standard implementation officially provided by Matlab, which indicates the performance advantages and application convenience of our software.

Keywords Neuro-fuzzy · Fuzzy system · ANFIS · Python · Scikit-learn · PyTorch

1 Introduction

Since the adaptive neuro-fuzzy inference system (ANFIS) [1] was proposed in 1993 as a creative method of combining the advantages of the fuzzy system and the neural network, it has been extensively applied in numerous fields. ANFIS is a five-layer neural network model that integrates fuzzy sets and fuzzy logic to model a fuzzy system. The model features a two-step learning algorithm comprising a forward pass and a backward pass, which allows for automatic adjustment of the antecedent and consequent parameters by minimizing the error between the actual and target outputs [1]. This approach provides two main benefits: automatic learning from the data, and the use of fuzzy if-then rules to explain the model-generated results. By combining the fuzzy system's explainability with the neural network's self-learning ability, this approach aims to deliver both accuracy and interpretability.

As a result of the above advantages, research on and application of ANFIS have drawn widespread attention in many domains. Health and well-being is one of the priority application areas, due to the interpretability and accuracy typically required in the healthcare domain that ANFIS

2040 International Journal of Fuzzy Systems, Vol. 26, No. 6, September 2024

may provide. Researchers have proposed a new approach to clinical decision support that uses data-driven techniques to create interpretable fuzzy rules. This approach combines decision tree learning mechanisms with an ANFIS framework, resulting in a method that outperforms many other popular machine learning techniques in terms of accuracy [2]. Other researchers have used ANFIS optimized through artificial bee colonies to classify heartbeat sounds, aiming at early detection of cardiovascular disease [3]. For the diagnosis of Alzheimer's disease, some researchers have proposed a technique that first transforms the diagnosis into a clustering problem and then uses ANFIS to optimize fuzzy rules, which ultimately improves the accuracy of diagnosis [4]. ANFIS is also widely used in the field of control and engineering. A new control framework combining ANFIS with robust proportional integral derivative control has been proposed for building structure damping systems, which can effectively ensure the stability and robustness of the controller [5]. Some researchers have proposed using adaptive virtual synchronous generators with ANFIS controllers as inverter controllers in photovoltaic systems, which can enhance the system response in different operating scenarios [6]. An ANFIS model has also been utilized to enhance electricity demand forecasting accuracy in a developing country, surpassing prior models and databases [7]. To predict power generation in photovoltaic systems, ANFIS models [8, 9] optimized by a genetic algorithm or particle swarm optimization have been developed using Matlab [10] software with sound performance. These applications across a wide range of domains demonstrate the effectiveness and popularity of ANFIS as a significant data analysis and model construction tool, predominantly in decision-making and forecasting tasks, which in turn calls for more effort in building an accessible development environment to streamline its applications.

At present, Matlab [10] is a widely used platform for convenient utilization and deployment of ANFIS. However, due to the increasing popularity of machine learning and deep learning, which are mainly based on Python, Python-based implementations of ANFIS have gained increasing attention. Despite the availability of some Python-based ANFIS implementations such as ANFIS-PyTorch [11], ANFIS-Numpy [12], and ANFIS-PSO [13], none of these implementations is directly compatible with scikit-learn, one of the most commonly used machine learning libraries.

Furthermore, due to emerging advances in deep learning models, ANFIS has recently undergone new developments, including cascade ANFIS [14, 15], as well as integration with deep learning technology [16]. This has led to the emergence of a popular research area known as deep neuro-fuzzy systems (DNFS) [17], of which ANFIS is an essential component. However, although some researchers have tried combining Python-based DNFS [17] with scikit-learn, such as PyTSK [18], no cases have been found that combine ANFIS with scikit-learn, to the best of our knowledge.

Conventionally, ANFIS application and development are conducted in Matlab. However, with the rapid progress of deep learning and machine learning, which is commonly conducted in a Python environment, it is critical to develop ANFIS in an environment that is directly compatible with Python, Sklearn, and PyTorch [19]. This will facilitate the research and development of ANFIS and ensure compatibility with the latest technologies.

This paper reports a novel implementation of ANFIS in the Python programming language. To ease the use of ANFIS alongside popular machine learning models, which have been realized in the popular scikit-learn library, our implementation, termed Scikit-ANFIS, fully supports the interfaces specified by scikit-learn. Furthermore, our ANFIS implementation, which may be utilized as an optimization method, also supports the training of an existing fuzzy system. Through several case studies and cross-validated experiments, our results demonstrate the superior performance of Scikit-ANFIS compared to other ANFIS-based or DNFS-based Python software, and performance on par with the standard ANFIS implementation in Matlab. Concretely, our contributions can be summarized as:

(1) Our Scikit-ANFIS implementation is fully compatible with commonly used scikit-learn functions such as fit() and predict(). This makes our implementation directly usable in combination with existing machine learning models and methods as typically applied through scikit-learn.
(2) Scikit-ANFIS allows the manual generation of a general-purpose Takagi-Sugeno-Kang (TSK) [20] fuzzy system using natural language. To the best of our knowledge, our method is the only Python-based implementation that supports fuzzy reasoning with complex rules and a choice of logical operators.
(3) Scikit-ANFIS utilizes the scikit_anfis() class to train a pre-existing TSK fuzzy system and to automatically generate an ANFIS fuzzy system based on user-specified input-output data pairs, resulting in an efficiently optimized fuzzy system.
(4) The Scikit-ANFIS implementation can automatically save and load the trained ANFIS with the best

D. Zhang, T. Chen: Scikit-ANFIS: A Scikit-Learn Compatible Python Implementation 2041

performance to/from a local model file, which is not currently available in other Python-based ANFIS implementations.

2 Technical Background on ANFIS

We begin with a brief introduction of the ANFIS fuzzy system [1], as shown in Fig. 1. The basic architecture of ANFIS consists of five layers, with the output of the nodes in each respective layer represented by $O_{i,j}$, where $i$ indexes the node within layer $j$ (written as $O_i^j$ in the equations below).

In the first layer, the node outputs are the membership scores generated from the values of the fuzzy input variables, defined as:

$$O_i^1(x_1) = \mu_{A_i}(x_1), \quad O_i^1(x_2) = \mu_{B_i}(x_2), \quad i = 1, 2 \tag{1}$$

where $x_1, x_2$ represent the crisp values of the two input variables, $A_i, B_i$ are the fuzzy sets associated with this node, and $\mu_{A_i}(x_1)$, $\mu_{B_i}(x_2)$ denote the membership functions of the linguistic labels $A_i$ and $B_i$ respectively. Any continuous and piecewise differentiable function, such as the commonly used bell-shaped, Gaussian, trapezoidal, and triangular membership functions, can be used as a membership function in this layer, and each membership function itself includes a set of parameters. When the values of these parameters change, the membership function also varies, so the parameters in this layer are called premise parameters [1].

In the second layer, each node represents the accumulated firing strength of the rule antecedents through a t-norm operator such as the product:

$$O_i^2 = w_i = \mu_{A_i}(x_1) \times \mu_{B_i}(x_2), \quad i = 1, 2 \tag{2}$$

In the third layer, each node outputs a normalized firing strength:

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{3}$$

In the fourth layer, each rule consequent is calculated with the associated parameters $p_i, q_i, r_i$:

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i (p_i x_1 + q_i x_2 + r_i), \quad i = 1, 2 \tag{4}$$

When the values of the premise parameters are given, the single node in the fifth layer can be expressed as the sum of the linear combinations of consequent outputs, i.e.,

$$O_1^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}, \quad i = 1, 2 \tag{5}$$

It is important to note that, compared with ordinary neural networks, ANFIS grows faster in terms of the total number of parameters $P_t$, which can be calculated as follows [21]:

$$\begin{cases} P_p = \mathit{in} \times MF(\mathit{input}) \times coef(MF) \\ P_c = \mathit{rs} \times (\mathit{in} + 1) \times \mathit{out} \\ P_t = P_p + P_c \end{cases} \tag{6}$$

where $P_p$ and $P_c$ denote the number of premise parameters and the number of consequent parameters respectively; $\mathit{in}$ stands for the number of inputs, $MF()$ is the number of membership functions for each input, and $coef()$ is the number of coefficients of each membership function; $\mathit{rs}$ stands for the number of rules, and $\mathit{out}$ is the number of nodes in the fifth layer.

Given the original definition of ANFIS as introduced above, it represents a TSK-type fuzzy system that naturally fits regression and control problems. Figure 1 shows an ANFIS with two inputs, two rules, and one output, where each input has two Gaussian membership functions. Each Gaussian membership function has two coefficients, so the total number of parameters in this ANFIS is 14: $P_t = 2 \times 2 \times 2 + 2 \times 3 \times 1 = 14$.

Fig. 1 ANFIS example with two inputs, two membership functions in each input, and two rules
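The five layers of Eqs. (1) to (5) can be traced numerically for the two-input, two-rule network of Fig. 1. The sketch below uses only the Python standard library; the Gaussian parameterization (mean, sigma), the function names, and all parameter values are assumptions chosen for illustration, not values from the paper.

```python
import math


def gaussian_mf(x, mean, sigma):
    """Gaussian membership function with two premise parameters."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def anfis_forward(x1, x2, premise_a, premise_b, consequents):
    """Trace Eqs. (1)-(5) for two inputs and two rules.

    premise_a[i], premise_b[i]: (mean, sigma) of fuzzy sets A_i and B_i.
    consequents[i]: (p_i, q_i, r_i) of rule i.
    Returns the normalized firing strengths and the network output.
    """
    # Layer 1: membership degrees, Eq. (1)
    mu_a = [gaussian_mf(x1, m, s) for m, s in premise_a]
    mu_b = [gaussian_mf(x2, m, s) for m, s in premise_b]
    # Layer 2: firing strengths via the product t-norm, Eq. (2)
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths, Eq. (3)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents, Eq. (4)
    o4 = [wb * (p * x1 + q * x2 + r)
          for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output, Eq. (5)
    return w_bar, sum(o4)


# Made-up parameter values, for illustration only.
premise_a = [(0.0, 1.0), (1.0, 1.0)]
premise_b = [(0.0, 1.0), (1.0, 1.0)]
consequents = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]

w_bar, y = anfis_forward(0.3, 0.7, premise_a, premise_b, consequents)
print(w_bar, y)
```

With these values the input (0.5, 0.5) is equidistant from both rule centres, so both normalized firing strengths are 0.5, and both rule consequents evaluate to 1.0 at that point.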


Depending on how the consequent parameters are set and updated, Jang, the inventor of ANFIS [1], proposed two learning algorithms (i.e. training strategies) for the ANFIS model, namely hybrid and online. In hybrid learning, the antecedents are updated by the gradient descent method, while the consequents are calculated by the least squares method after fixing the premise parameters. In online learning, by contrast, all parameters are updated by the gradient descent method.

As depicted in Fig. 2, the hybrid learning algorithm comprises two stages: the forward pass and the backward pass. In the forward pass, the functional signal from Layer 1 is passed directly through the ANFIS network to Layer 4, where the consequent parameters are calculated by the least squares estimate (LSE) for input data X and target data Y. At this point, the premise parameters of the membership functions in Layer 1 remain fixed. The backward pass starts after the total root mean square error (RMSE) loss has been computed. During this pass, the consequent parameters are kept unchanged while the premise parameters are updated using the gradient descent method.

Fig. 2 The hybrid learning process implemented in ANFIS model

For clarity, the list of terminology abbreviations used in the paper is given in Table 1.

Table 1 The terminology list of abbreviations used in the paper

Abbreviation  Expansion
TSK           Takagi-Sugeno-Kang
ANFIS         Adaptive Neuro-Fuzzy Inference System
DNFS          Deep Neuro-Fuzzy Systems
Sklearn       scikit-learn
N/A           No Answer
Hybrid        Gradient descent and least squares estimate
Online        Gradient descent only
PSO           Particle Swarm Optimizer
GA            Genetic Algorithm
ABC           Artificial Bee Colony
LSE           Least Squares Estimate
RMSE          Root Mean Square Error
MBGD          Minibatch Gradient Descent
BN            Batch Normalization
UR            Uniform Regularization
LU            Layer Normalization
ReLU          Rectified Linear Unit
MFt           Membership Function types
FIS           Fuzzy Inference System
10-CV         10-fold cross-validation
Acc           Accuracy
n/a           Not applicable

3 Related Work

3.1 Recent Software Development for Fuzzy Systems

Generally speaking, a fuzzy system has a good level of interpretability, because it encodes knowledge in imprecise terms and uses an intuitive inference mechanism that mimics human reasoning [22, 23]. During the early development of plain fuzzy systems in Python, many Python libraries moved in the direction of general-purpose fuzzy system applications, such as PyFuzzy [24], Fuzzylab [25], and Scikit-Fuzzy [26].

However, many of these tools are outdated or no longer maintained. A more recent open-source package for general fuzzy systems is Simpful [27], which supports the natural language definition of fuzzy variables, fuzzy sets, and fuzzy rules, as well as TSK reasoning of any order. A common limitation is that most of the above software aims to provide a general framework, which tends to require the creation of a fuzzy system by hand. Manual creation becomes impractical when working with even a moderate-sized data set. This limitation is even more apparent when automated optimization of system parameters is needed, which can be dealt with through Matlab, but


existing Python implementations are usually not applicable.

Focusing on the ANFIS framework [1], which has been a very prevalent TSK-type fuzzy system since its inception for a variety of domain problems [28], our Scikit-ANFIS is the first open-source Python tool to combine the creation of a general-purpose TSK fuzzy system with an embedded ANFIS optimization method.

3.2 Brief History of ANFIS Software Development

Table 2 summarizes the major developments in ANFIS software since Jang proposed ANFIS. Matlab-ANFIS is one of the most popular tools used to implement the ANFIS model [10]; it can not only create the ANFIS model directly to train and test on a data set but also utilize ANFIS as an optimization method to train an existing fuzzy system. However, Matlab is commercial software that is not open to the public. Furthermore, an extra installation of the Matlab Engine API for Python is required to access Matlab from Python.

The ANFIS-C [1] and ANFIS-Vignette [21] software, written in C and R respectively, are outdated and not regularly updated. Current ANFIS software such as ANFIS-PyTorch [11], ANFIS-Numpy [12], and ANFIS-PSO [13] is mostly developed in Python 3 [29]. None of these three Python-based packages supports the scikit-learn interface, and each supports only a limited number of membership function types (5, 3, and 3, respectively). ANFIS-PyTorch is the only one of the three that supports both the hybrid and online learning algorithms, while ANFIS-Numpy only supports hybrid learning, and ANFIS-PSO supports the particle swarm optimizer (PSO). By contrast, our Scikit-ANFIS supports 12 different membership function types, the same as Matlab-ANFIS. Additionally, Scikit-ANFIS fully supports both learning algorithms (Hybrid/Online) for ANFIS training and also supports the scikit-learn interface, which makes it more user-friendly and more widely applicable.

Table 2 Overview of the software for ANFIS

Name                 Language  Library        MFt  Learning strategy  Sklearn  Release
ANFIS-C [1]          C         N/A            4    Hybrid/Online      No       1993
ANFIS-Vignette [21]  R         N/A            4    Hybrid/Online      No       2012
Matlab-ANFIS [10]    Matlab    N/A            12   Hybrid/Online      No       2023
ANFIS-PyTorch [11]   Python 3  PyTorch        5    Hybrid/Online      No       2019
ANFIS-Numpy [12]     Python 3  Numpy [30]     3    Hybrid             No       2020
ANFIS-PSO [13]       Python 3  Numpy          3    PSO                No       2021
Scikit-ANFIS         Python 3  PyTorch+Numpy  12   Hybrid/Online      Yes      2023

3.3 Review on Deep Neuro-fuzzy Systems Framework

Deep neuro-fuzzy systems (DNFS), which represent one of the most advanced developments combining deep learning and fuzzy systems, have become a focus in fuzzy logic research [31]. This is because fuzzy systems can not only deal with the widespread inaccuracy and uncertainty in the real world but also potentially enrich the representation of deep models. At the same time, ANFIS can be seen as a simplified representation of DNFS [17], which itself is in principle a fuzzy system whose membership function parameters can be tuned by a five-layer adaptive neural network [1].

We further compare our Scikit-ANFIS implementation with other deep neural fuzzy methods for regression and classification in Table 3. There are several methods available for solving classification tasks, including the Neuro-Fuzzy [32] method based on the C language, DNFC [33] based on Matlab, and TSK-MBGD-UR-BN [35] and PyTSK [18] based on Python 3. For regression tasks, there are also various methods available, such as FCM-RDpA [36] developed in Matlab, and MBGD-RDA [34] and HTSK-LN-ReLU [37] developed in Python 3. However, among the methods compared, only our Scikit-ANFIS is capable of solving both classification and regression tasks. It is worth noting that Python 3 has become a popular choice among the fuzzy logic research community, likely due to its widespread use in developing artificial neural networks. Although the methods mentioned above offer practical solutions for their specific tasks and implement different optimization techniques such as gradient descent, minibatch gradient descent [38], Adam [39], AdaBound [40], Powerball [41], and AdaBelief [42], they do not utilize the ANFIS architecture or its training methods. By contrast, our Scikit-ANFIS not only adopts ANFIS's five-layer architecture but can also adapt to the upcoming requirements of DNFS research for network interpretability and high performance, with the assistance of the PyTorch and Numpy frameworks.


Table 3 The difference between Scikit-ANFIS and other deep neural fuzzy methods for two common tasks: regression and classification

Name                 Language  MFt  Layers  Optimization method       Tasks                      Sklearn  Release
Neuro-Fuzzy [32]     C         1    4       Gradient Descent          Classification             No       1993
DNFC [33]            Matlab    1    8       Gradient Descent          Classification             No       2020
MBGD-RDA [34]        Python 3  1    5       MBGD+AdaBound             Regression                 No       2020
TSK-MBGD-UR-BN [35]  Python 3  1    6       MBGD+AdaBound+BN+UR       Classification             No       2020
FCM-RDpA [36]        Matlab    1    5       MBGD+Powerball+AdaBelief  Regression                 No       2021
HTSK-LN-ReLU [37]    Python 3  1    7       MBGD+Adam+LU+ReLU         Regression                 Yes      2022
PyTSK [18]           Python 3  2    6       MBGD+Adam                 Classification             Yes      2022
Scikit-ANFIS         Python 3  12   5       ANFIS                     Regression+Classification  Yes      2023

Fig. 3 Overview of Scikit-ANFIS architecture

4 Scikit-ANFIS

4.1 Architecture Overview

The diagram in Fig. 3 illustrates the overall structure of our Scikit-ANFIS. Scikit-ANFIS employs the data loader module to read data from the dataset, which is then divided into training and test datasets by the data splitter module. These datasets are sent to the generated fuzzy system for training. Alternatively, the initialized fuzzy system can directly read the training and test data from the dataset through the data loader module and train itself accordingly. To create a fuzzy system for predictive tasks, Scikit-ANFIS provides two options: the manual generation module can be utilized to define and generate a fuzzy system, or the automatic method module can generate an ANFIS fuzzy system by default without any definition. For training, Scikit-ANFIS uses the ANFIS optimizer module to train the initialized fuzzy system. Once training is completed, the optimized fuzzy system is selected and tested with data. The evaluation module then examines the test outputs to produce a report.

Scikit-ANFIS is implemented in Python 3 and has two main dependencies, PyTorch [19] and Numpy [30]. (The code for Scikit-ANFIS, the associated cases, and the user guide will be publicly available at https://github.com/hudscomdz/scikit-anfis. Scikit-ANFIS can be installed by using the following command: pip install skanfis.) Our Scikit-ANFIS currently supports the following primary functions: (i) Twelve types of membership functions, such as Gaussian, bell, triangular, and others. (ii) Fuzzy sets written in natural language, and complex fuzzy rules with the logical operators AND, OR, and NOT. (iii) Two training strategies of ANFIS, namely hybrid and online. (iv) Automatic generation and training of the ANFIS fuzzy system. (v) A uniform structure, such as the fit() and predict() functions, to provide the same interface as scikit-learn.

Fig. 4 The illustration of the manual generation method for the general fuzzy system
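The uniform fit() and predict() structure named in item (v) above follows the usual scikit-learn estimator conventions: hyperparameters are stored unchanged by the constructor, fit() learns from data and returns the estimator itself, and predict() maps new inputs to outputs. The toy class below is a hypothetical illustration of those conventions using only the standard library; it is not the scikit_anfis class, and its training rule is a deliberate simplification rather than ANFIS training.

```python
import math


class TinyTSKRegressor:
    """Toy zero-order TSK model with a scikit-learn style interface.

    Hypothetical class for illustration; not part of Scikit-ANFIS.
    """

    def __init__(self, centers=(0.0, 1.0), sigma=0.5):
        # Constructor arguments are stored as-is (scikit-learn convention).
        self.centers = centers
        self.sigma = sigma

    def _normalized_strengths(self, x):
        # One Gaussian membership function per rule, then normalization.
        w = [math.exp(-0.5 * ((x - c) / self.sigma) ** 2) for c in self.centers]
        total = sum(w)
        return [wi / total for wi in w]

    def fit(self, X, y):
        # Simplification: each rule consequent is the firing-strength-weighted
        # mean of the targets (not the LSE step of ANFIS hybrid learning).
        num = [0.0] * len(self.centers)
        den = [0.0] * len(self.centers)
        for x, target in zip(X, y):
            for i, wb in enumerate(self._normalized_strengths(x)):
                num[i] += wb * target
                den[i] += wb
        # Learned attributes carry a trailing underscore by convention.
        self.consequents_ = [n / d for n, d in zip(num, den)]
        return self  # enables model.fit(X, y).predict(X_new)

    def predict(self, X):
        return [
            sum(wb * c for wb, c in
                zip(self._normalized_strengths(x), self.consequents_))
            for x in X
        ]


X_train = [0.0, 0.25, 0.5, 0.75, 1.0]
y_train = [0.0, 0.5, 1.0, 1.5, 2.0]
model = TinyTSKRegressor().fit(X_train, y_train)
print(model.predict([0.1, 0.9]))
```

Full compatibility with scikit-learn utilities such as pipelines and cross-validation additionally relies on get_params() and set_params(), which estimators typically inherit from sklearn.base.BaseEstimator.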

4.2 Implementation Details

4.2.1 Manual Generation of a General Fuzzy System

Considering that Simpful [27] is already an open-source, general-purpose fuzzy system package developed in Python, our implementation also makes use of some existing components defined by Simpful for efficient development. As depicted in Fig. 4, the main difference between our manual generation method for a general fuzzy system and Simpful is that the former can interact with the ANFIS optimizer in Scikit-ANFIS, which not only performs ANFIS training of the fuzzy system but also returns the trained results to the fuzzy system to generate new output. Simpful, by contrast, can only generate an output by passing the received input data through the fuzzy knowledge base, without any model training operation.

Similarly to Simpful, after receiving the natural language information, our manual generation method automatically parses the fuzzy variables, fuzzy sets, and fuzzy rules, creating Scikit-ANFIS's fuzzy system object. The fuzzy rules are Takagi and Sugeno's fuzzy if-then rules [1], and their natural language description supports the commonly used fuzzy operators AND, OR, and NOT, as detailed in [27]. When the input dataset is fed into the fuzzy system object created by the manual generation method, the object conducts fuzzy reasoning through the inference() interface and provides output results. Additionally, the object can communicate with the ANFIS model in Scikit-ANFIS through five interfaces: get_antecedent(), get_consequent(), get_rules(), set_antecedent(), and set_consequent(). Through the first three interfaces, Scikit-ANFIS sends the antecedent parameters, consequent parameters, and fuzzy rules of the fuzzy system object to the ANFIS model for training. After training is finished, the last two interfaces return the trained ANFIS model's parameters to the fuzzy system object, enabling it to perform fuzzy inference with the optimized parameters.

Fig. 5 The illustration of the automatic method for ANFIS fuzzy system

4.2.2 Automated Method to Initialise an ANFIS Fuzzy System

To help users create an ANFIS model, Scikit-ANFIS designs and implements an automatic method for the ANFIS fuzzy system, which shares the same scikit_anfis() class with the ANFIS optimizer module. Figure 5 illustrates the functional diagram of the method, which can automatically generate an ANFIS fuzzy system object, including a knowledge base and an ANFIS model, for the input dataset. The input dataset is used to automatically extract a knowledge base consisting of input variables, output variables, membership functions, and fuzzy rules. This leads to


the generation of an ANFIS model that adheres to the strict respectively, as illustrated in Fig. 3. The main procedure
requirements of the type-3 fuzzy inference system as pro- of Scikit-ANFIS encapsulated between lines 1 and 15,
posed by the original paper [1]. The rule base follows fuzzy comprises three key elements. Firstly, in line 1, a general
if-then rules and can be effortlessly mapped to an equiva- fuzzy system object fs is manually created. Following
lent ANFIS architecture [1]. The resulting ANFIS model that, lines 2 and 3 specify fuzzy sets, input variables,
comprises a five-layer neural network structure, as illus- fuzzy rules, and output variables for fs. In line 4, we
trated in Fig. 1, and provides a robust fuzzy inference check whether fs is empty. If it is not, we use the
system. antecedent and consequent parameters along with rules
from fs to represent a 5-layer ANFIS model called
4.2.3 ANFIS as an Optimizer anfis layers in line 5. At the same time, we create a
scikit anfis object based on anfis layers in line 6, also
ANFIS optimizer as an optimization module also utilizes specifying the maximum number of epochs for training
the scikit anfisðÞ class to help the initialized fuzzy system (max epoch), the training strategy (hybrid), and the task
to be trained more efficiently. The ANFIS optimizer takes type (label), and the optimizer. By default, max epoch is
the training set as input and uses forward propagation and set to 10, hybrid is set to True, indicating the use of the
cost function to calculate the total loss of the ANFIS neural hybrid training strategy, label is set to ‘‘r‘‘, indicating
network generated. The forward() method is used for for- that the model is intended for regression tasks, and op-
ward propagation, built on the PyTorch framework. The timizer is set to the gradient descent method namely
default training algorithm used is hybrid learning, with ‘‘Adam’’. Secondly, if fs is empty, we automatically
online learning available as an alternative. Then, it updates create an ANFIS fuzzy system from lines 7 to 10. Line 8
all the antecedent and consequent parameters in the model is the 5-layer ANFIS model anfis layers derived from
through the backpropagation process. This entire process is the input dataset (X, Y), and line 9 generates a
the training process for the five-layer ANFIS model, and scikit anfis object and specifies its associated parameters,
the number of times the model is trained is related to the such as max epoch, hybrid, label, and optimizer. Thirdly,
‘epoch’ hyperparameter. The optimizer used to update the the most crucial step involves the implementation of the
parameter through backpropagation is usually related to the ‘optimizer’ hyperparameter of the model. Our Scikit-ANFIS has implemented various optimizers based on gradient descent, including Adam [39], SGD [43], Rprop [44], L-BFGS [45], Adadelta [46], and Adagrad [47], with Adam being the default.

Once the training of the ANFIS model is completed, all parameters of the current model with minimum loss can be saved to the local model file ‘tmp.pkl’. This saved model file can later be used to continue training or to run tests. To ensure that a manually created general fuzzy system and a trained ANFIS model are consistent with each other, the premise and consequent parameters can be transferred from the ANFIS model to the fuzzy system. This is done through two interfaces, set_antecedent() and set_consequent(), which are explicitly called as shown in Fig. 4. Moreover, the optimized fuzzy system can give fuzzy inference results based on the test data.

4.2.4 Scikit-ANFIS

The implementation of our Scikit-ANFIS is detailed in Algorithm 1. The input, training, and test datasets are denoted as (X, Y), (X_train, Y_train), and (X_test, Y_test)

ANFIS optimizer from lines 11 to 15. At each epoch in the loop, we feed the training dataset (X_train, Y_train) into the scikit_anfis object to complete the forward pass from layer 1 to layer 5 in its ANFIS model. During this process, we update the consequent parameters and obtain the final output Ŷ. We then compute the RMSE loss between the predicted Ŷ and the training target Y_train, followed by updating the premise parameters of the ANFIS model in its backward process.

When the ANFIS optimizer has completed training the model, fs can be used for fuzzy inference on the test data, as in lines 16 to 20. The premise and consequent parameters are returned to fs using the set_antecedent() and set_consequent() interfaces, and the inference() function of fs is then called directly to complete the fuzzy system reasoning. Alternatively, we can opt for the more convenient scikit_anfis object in line 21 for fuzzy inference. This method uses the test input data X_test to generate the predicted output Y_pred through fuzzy reasoning of the ANFIS model. This option has been used in subsequent cases in this paper. It is important to note that during testing, there is no need to provide the test target value Y_test to the trained scikit_anfis object since all of its parameters are already fixed. Finally, in the last line, we compare and evaluate the fuzzy inference result Y_pred with the test target value Y_test.
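The forward half of one hybrid-learning epoch (layers 1 to 3, identification of the consequent parameters, then the RMSE between Ŷ and Y_train) can be sketched in NumPy. This is an illustrative sketch and not the Scikit-ANFIS source: the Gaussian membership functions, the grid partition of rules, the least-squares step, the toy data, and the helper names `normalized_firing_strengths` and `fit_consequents` are all assumptions, and the backward update of the premise parameters by the optimizer is omitted.

```python
import itertools
import numpy as np


def gaussian_mf(x, mean, sigma):
    """Gaussian membership grade of x for a fuzzy set (mean, sigma)."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def normalized_firing_strengths(X, means, sigmas):
    """Layers 1 to 3 for a grid-partitioned rule base (assumed layout).

    X: (n_samples, n_inputs); means, sigmas: (n_inputs, n_mfs).
    Returns an (n_samples, n_rules) array whose rows sum to 1.
    """
    n_inputs, n_mfs = means.shape
    # Layer 1: grades[s, i, k] = grade of sample s, input i, fuzzy set k.
    grades = gaussian_mf(X[:, :, None], means[None], sigmas[None])
    # Layer 2: one rule per combination of fuzzy sets, product T-norm.
    combos = itertools.product(range(n_mfs), repeat=n_inputs)
    w = np.stack(
        [np.prod([grades[:, i, k] for i, k in enumerate(c)], axis=0)
         for c in combos],
        axis=1,
    )
    # Layer 3: normalize the firing strengths per sample.
    return w / w.sum(axis=1, keepdims=True)


def fit_consequents(X, y, w_bar):
    """Least-squares estimate of first-order TSK consequents (layer 4)."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    # Rule r contributes w_bar[:, r] * [x, 1] to the design matrix.
    A = (w_bar[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A, theta


rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(50, 2))
y_train = 2.0 * X_train[:, 0] - X_train[:, 1] + 0.5  # toy linear target
means = np.array([[0.0, 1.0], [0.0, 1.0]])           # 2 fuzzy sets per input
sigmas = np.full((2, 2), 0.5)

w_bar = normalized_firing_strengths(X_train, means, sigmas)
A, theta = fit_consequents(X_train, y_train, w_bar)
y_hat = A @ theta                                    # layer 5 output
rmse = float(np.sqrt(np.mean((y_hat - y_train) ** 2)))
print(f"rules: {w_bar.shape[1]}, training RMSE: {rmse:.2e}")
```

Because the normalized firing strengths sum to one, a linear target can be reproduced exactly by the first-order consequents, so the toy RMSE is close to zero; on real data the premise parameters would then be adjusted in the backward pass.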


Algorithm 1 Pseudo Code for Scikit-ANFIS Implementation

Table 4 Summary of the three regression datasets and one classification dataset

Dataset Name                            Abbr   Task type        Features   Number of Samples
Restaurant Tipping Problem [48]         Tip    Regression       2          441
Two-input Nonlinear Function [1]        Sinc   Regression       2          121
Predict Chaotic Dynamics Problem [1]    PCD    Regression       4          1000
Iris Classifier [49]                    Iris   Classification   4          150

4.3 Code Overview using Scikit-ANFIS

Our Scikit-ANFIS can be used in two different manners.

The first manner, shown in Listing 1, optimizes a manually created TSK fuzzy system. Scikit-ANFIS implements an ANFIS model based on a general TSK fuzzy system object in lines 4 to 5 and then uses the scikit-learn interface to train and test the model in lines 6 to 7. The ANFIS model is trained on the training data using the fit() function. Following this, the trained ANFIS model is tested using the predict() function with the test data.

In the second manner, Scikit-ANFIS automatically generates and optimizes a TSK fuzzy system by creating an ANFIS model, as demonstrated in line 3 of Listing 2. The ANFIS model created is not only a TSK system but also adopts a scikit-learn compatible interface design, so the training and testing of the TSK fuzzy model can be completed by


calling the fit() and predict() functions in lines 4 and 5, respectively.

5 Experimentation

5.1 Experimental Setup

For a fair comparison and consistency with previous ANFIS implementations (Matlab-ANFIS [10], ANFIS-PyTorch [11], and ANFIS-Numpy [12]), this section reports and discusses experiments on both regression and classification tasks. The first regression dataset is the restaurant tipping problem [48] taken from the Matlab file repository. The next two regression datasets are from Jang’s literature [1], using ANFIS to model a nonlinear sinc equation and predict future values of a chaotic time series, respectively. Finally, we use the ANFIS model to train and test the popular iris benchmark dataset [49], which is essentially a three-class classification problem. The details regarding task type, features (i.e. inputs), and the number of samples in all four datasets are presented in Table 4. It should be noted that we do not pre-process data or conduct any hyperparameter tuning, for a fair and straightforward comparison.

5.2 Case Studies

In this section, we report on four case studies, including three regression problems and one classification problem, with the major Python code to demonstrate how Scikit-ANFIS can be applied in practice.

5.2.1 Restaurant Tipping Problem

The restaurant tip problem is to calculate a fair tip ratio of the total bill according to the service and food quality of a restaurant. Listing 3 shows an example of Scikit-ANFIS code to manually define a general TSK fuzzy inference system through natural language, and then train and test the fuzzy inference system (FIS) with a Scikit-ANFIS object, based on the Matlab data file ‘data_Tip_441.mat’ with two inputs and one output for a total of 441 samples.

In line 8 of Listing 3, the data ‘tip_data’ is loaded from the Matlab file using the loadmat() command in the scipy.io package. Then in line 10, using the train_test_split() command from the sklearn package, the


above ‘tip_data’ is split in half randomly to get the


training data and the test data. Lines 11 and 12 extract
the output ‘y_test’ and input ‘X_test’ from the test data
respectively for performance evaluation after the test
results. In line 15 a TSK fuzzy system object is created
by default. The language variable ‘Service’ and its three
triangular fuzzy sets, ‘poor’, ‘good’, and ‘excellent’, are
defined in lines 18 to 21, with values ranging from 0 to
10. Similarly, lines 22 to 24 define the language variable
‘Food’, which has two triangular fuzzy sets, ‘rancid’ and
‘delicious’. The output crisp values of ‘small’ and ‘av-
erage’ are set to 5% and 15%, respectively, in lines 27
and 28. Meanwhile, the output value of a ‘generous’ tip
is defined in line 30 as a linear function that depends on
the scores for ‘Food’ and ‘Service’. Once the fuzzy rules
are defined in lines 33 to 36, a general TSK fuzzy sys-
tem is created, and then, as shown in line 39, a Scikit-
ANFIS object needs to be generated based on it to
implement ANFIS training of this fuzzy system. Line 39
uses the default setting for the epoch number and the
training strategy of our Scikit-ANFIS object, that is, the
epoch is 10 and the training strategy is hybrid learning.
Then, ANFIS model training was conducted on the
training data ‘train_data’ using the fit() command in line
40. The predict() command is used to perform model
prediction on the input ‘X_test’ in the test data to obtain
the prediction result ‘y_pred’ in line 41. Finally, the root
mean square error (RMSE) value between ‘y_pred’ and
the actual result ‘y_test’ of the test data was calculated
in line 42.
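To make the structure of such a tipping system concrete, the following pure-Python sketch evaluates a small TSK system with triangular fuzzy sets, two crisp outputs, and one linear output. It does not reproduce Listing 3: the triangle vertices, the three rules, the use of max for OR, and the linear consequent `food + service + 5` are illustrative assumptions, and the helper names `triangle` and `tsk_tip` are hypothetical.

```python
def triangle(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


def tsk_tip(service, food):
    """Weighted-average TSK inference for scores in the range 0 to 10."""
    # Membership grades (vertices are assumed for illustration).
    poor = triangle(service, 0, 0, 5)
    good = triangle(service, 0, 5, 10)
    excellent = triangle(service, 5, 10, 10)
    rancid = triangle(food, 0, 0, 10)
    delicious = triangle(food, 0, 10, 10)

    # Rule firing strengths, with OR modelled as max (an assumption).
    w_small = max(poor, rancid)
    w_average = good
    w_generous = max(excellent, delicious)

    # Consequents: two crisp values (5% and 15%) and one linear function.
    z_small, z_average = 5.0, 15.0
    z_generous = food + service + 5.0

    total = w_small + w_average + w_generous
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return (w_small * z_small + w_average * z_average
            + w_generous * z_generous) / total


print(tsk_tip(5, 5))    # mid-range service and food
print(tsk_tip(10, 10))  # best service and food
```

In an ANFIS setting, the triangle vertices are the premise parameters and the crisp and linear outputs are the consequent parameters that training adjusts.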

5.2.2 Two-Input Nonlinear Function

The two-input nonlinear function is the sinc equation, expressed as follows:

\[ z = \operatorname{sinc}(x, y) = \frac{\sin(x)}{x} \times \frac{\sin(y)}{y} \tag{7} \]

where x and y are two input variables, and z is the output variable. According to [1], x and y were selected from the
range of -10 to 10 with an interval of 2, and 121 data pairs
could be obtained as training samples. Listing 4 shows how
to generate Scikit-ANFIS code to train an ANFIS model
with 16 fuzzy rules and 4 bell membership functions
assigned to each input variable, as well as the test results.
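The 121 training pairs and a bell membership function can be sketched with NumPy as follows. The variable names and the parameter values passed to `bell_mf` are assumptions made for illustration; the generalized bell form 1 / (1 + |(x - c) / a|^(2b)) follows [1], and the removable singularity of sin(x)/x at x = 0 is handled by `np.sinc`, which returns 1 there.

```python
import numpy as np


def sinc2d(x, y):
    """Eq. (7): (sin(x)/x) * (sin(y)/y), with the limit value 1 at 0."""
    # np.sinc(t) is the normalized sinc sin(pi*t)/(pi*t), so rescale by pi.
    return np.sinc(x / np.pi) * np.sinc(y / np.pi)


def bell_mf(x, a, b, c):
    """Generalized bell membership function with width a, slope b, centre c."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))


# Grid from -10 to 10 with an interval of 2: 11 points per axis, 121 pairs.
grid = np.arange(-10, 11, 2, dtype=float)
xx, yy = np.meshgrid(grid, grid)
sinc_data = np.column_stack([xx.ravel(), yy.ravel(), sinc2d(xx, yy).ravel()])

print(sinc_data.shape)
print(bell_mf(np.array([0.0, 2.0]), a=2.0, b=2.0, c=0.0))
```

Each row of `sinc_data` holds one (x, y, z) triple, which matches the two-input, one-output layout described for this case study.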


Line 8 of Listing 4 shows that users can load the data ‘sinc_data’ from the csv file ‘data_sinc_121.csv’ using the read_csv() command in the pandas package. Similar to Listing 3, the train_test_split() command from the sklearn package is used in line 10 to split ‘sinc_data’ randomly in a 60:40 proportion into the ‘train_data’ and ‘test_data’ sets. Since ‘test_data’ is a pandas DataFrame, ‘y_test’ can be extracted from ‘test_data’ with the pop() command in line 11 according to the variable name ‘z’; after extraction, ‘test_data’ no longer has the column ‘z’ and holds only the two input variables. Line 14 first creates a TSK fuzzy system object, and then the user defines four bell fuzzy sets for each of the two language variables ‘x0’ and ‘x1’ in lines 17 to 26. In lines 29 to 30, a total of 15 output variable values from ‘sinc_x_y0’ to ‘sinc_x_y14’ are defined as linear functions that depend on the scores of the input language variables ‘x0’ and ‘x1’. The crisp value of the last output variable ‘sinc_x_y15’ is defined as one in line 31. After the fuzzy rule set is added in lines 34 to 39 (owing to space limitations, most rules are omitted here), a general-purpose TSK fuzzy system is complete. In line 42, a Scikit-ANFIS object is created based on the above fuzzy system with an epoch of 250 and a hybrid learning strategy. Next, the ANFIS model is trained on the training data ‘train_data’ using the fit() command in line 43. Then in line 44, the predict() command is applied to the test data ‘test_data’, and the prediction result is ‘y_pred’. Finally, the RMSE value between the predicted result ‘y_pred’ and the actual result ‘y_test’ is calculated in line 45.

5.2.3 Predict Chaotic Dynamics Problem

The chaotic dynamics prediction problem comes entirely from [1]; its goal is to predict the future value of a chaotic time series by creating an ANFIS model. In [1], time series values are obtained at each integer point, the values at four consecutive time points with an interval of six are selected as the input, and the value at the fifth point is the output. In this way 1000 input–output data pairs can be extracted, which are formally expressed as follows:

\[ [x0, x1, x2, x3, y0] = [v(t-18), v(t-12), v(t-6), v(t), v(t+6)] \tag{8} \]

where ‘x0’, ‘x1’, ‘x2’, and ‘x3’ denote the time series values of the four inputs respectively, ‘y0’ denotes the time series value of the output, and v(t) denotes the time series value at the time point t.

Listing 5 shows how to manually generate a general fuzzy system with 16 fuzzy rules and 2 Gaussian membership functions for each of the 4 input variables, create a Scikit-ANFIS object based on it, and then complete hybrid training and testing of the ANFIS model. In lines 8 and 9, we load all the input data ‘X’ and the real output data ‘y’ from the ‘data_PCD_1000.txt’ data file, respectively, using the loadtxt() command with its ‘usecols’ parameter from the numpy package. As in Listings 3 and 4, the train_test_split() command from the sklearn package is also used in line 11 of Listing 5 to randomly split data ‘X’ into ‘X_train’ and ‘X_test’ according to the specified 70:30 proportion, and data ‘y’ into ‘y_train’ and ‘y_test’ in the same proportion. In line 14, an empty TSK fuzzy system object is created for the problem. In lines 17 to 28, the fuzzy sets and Gaussian membership functions of the four input variables ‘x0’, ‘x1’, ‘x2’, and ‘x3’ are defined. Lines 31 to 36 show how the 16 fuzzy rules are defined, and again most of the rules are omitted due to space limitations. Since the antecedents of the first 15 rules use all four input language variables, the loop in lines 39 to 40 defines the output variables of the first 15 rules in the form of linear functions. However, line 41 specifically defines the output variable in the last rule, ‘R16’, as a linear function with a coefficient of zero on the fourth input language variable ‘x3’, because the antecedent of the last rule uses only the other three input language variables. In line 44, based on the newly created fuzzy system ‘fs’, a Scikit-ANFIS model is generated with 500 epochs and a hybrid training strategy. Line 45 uses the fit() command to feed the input data ‘X_train’ and output data ‘y_train’ to the Scikit-ANFIS model for hybrid training. After training, the predict() command in line 46 is used to predict the input data ‘X_test’ in the test set, and the prediction result ‘y_pred’ is obtained. Finally, the RMSE between ‘y_pred’ and ‘y_test’ is calculated in line 47 using the mean_squared_error() command from the sklearn package.
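The windowing in Eq. (8) can be sketched as a small NumPy helper. The function name `make_lagged_pairs`, its arguments, and the placeholder series are assumptions for illustration; the placeholder is a simple ramp and not the chaotic series used in [1], so that each extracted value is easy to check by eye.

```python
import numpy as np


def make_lagged_pairs(v, t_start, n_pairs, lag=6):
    """Build [v(t-18), v(t-12), v(t-6), v(t)] -> v(t+6) pairs for lag=6."""
    v = np.asarray(v, dtype=float)
    t = np.arange(t_start, t_start + n_pairs)
    if t[0] - 3 * lag < 0 or t[-1] + lag >= len(v):
        raise ValueError("series too short for the requested window")
    X = np.column_stack([v[t - 3 * lag], v[t - 2 * lag], v[t - lag], v[t]])
    y = v[t + lag]
    return X, y


# Placeholder series (assumption): v(t) = t, so the lags are visible directly.
v = np.arange(200.0)
X, y = make_lagged_pairs(v, t_start=18, n_pairs=100)
print(X[0], y[0])
```

With a real series of sufficient length, setting `n_pairs=1000` would yield the 1000 input–output pairs described above.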


5.2.4 Building a Classifier for Iris problem

The Iris dataset is a classic multi-class classification dataset [49] with 150 data samples in 3 classes; each class corresponds to one species of iris plant and contains 50 samples. Listing 6 shows how Scikit-ANFIS automatically generates an ANFIS model for training and testing on the iris dataset. First, the load_iris() command in scikit-learn is used to load and return the iris dataset without downloading any files from an external website, as shown in line 7 of Listing 6. In line 9, the train_test_split() command is used again to randomly split the iris input data ‘iris.data’ into ‘X_train’ and ‘X_test’ in the specified 80:20 ratio, and the output target data ‘iris.target’ into ‘y_train’ and ‘y_test’ in the same ratio. The Scikit-ANFIS fuzzy system object is created for the classification problem in line 12. It takes ‘iris.data’ as an argument to automatically extract all four input variables and assign two Gaussian membership functions to each by default, sets the epoch to 100, and uses the hybrid training method (hybrid=True). In particular, the label parameter in line 12 is set to ‘c’, indicating that this is an ANFIS model for a classification problem, whereas the default value ‘r’ indicates a regression model. Line 13 uses the fit() command to feed the input data ‘X_train’ and output data ‘y_train’ to the Scikit-ANFIS model for classification training. After training is complete, the predict() command in line 14 is used to predict the input data ‘X_test’ in the test set, and the predicted classification result is ‘y_pred’. Subsequently, the accuracy of the model between ‘y_pred’ and ‘y_test’ is calculated and printed in line 15 using the accuracy_score() command from the sklearn package.
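The data handling and evaluation steps of this workflow can be sketched with scikit-learn alone. Because the exact Scikit-ANFIS import path and constructor signature are not shown in the text, a `DecisionTreeClassifier` is used here as a stand-in estimator; this substitution and the `random_state` values are assumptions, and any estimator exposing fit() and predict() could take its place.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the bundled iris data (no external download is needed).
iris = load_iris()

# 80:20 random split of inputs and targets.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0
)

# Stand-in estimator (assumption): the paper's model would be created here
# instead, with its epoch, hybrid, and label settings.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"accuracy: {acc:.3f}")
```

The point of the sketch is the shared interface: only the line that constructs `model` would change when a different scikit-learn compatible estimator is used.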


[Figure: line chart of accuracy scores (y-axis) versus number of epochs (x-axis: 1, 5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 80, 100, 200) for PyTSK(gaussMF), HTSK-LN-ReLU(gaussMF), ANFIS-Numpy(gaussMF), ANFIS-PyTorch(gaussMF), Matlab-ANFIS(bellMF), Matlab-ANFIS(gaussMF), Scikit-ANFIS(bellMF), and Scikit-ANFIS(gaussMF)]

Fig. 6 Comparison of accuracy scores for several software methods in terms of different epochs using the Iris dataset

Subsequently, Fig. 6 shows a comparison of output accuracy scores produced by various software methods such as PyTSK [18], HTSK-LN-ReLU [37], ANFIS-Numpy [12], ANFIS-PyTorch [11], Matlab-ANFIS [10], and our Scikit-ANFIS with different epoch sizes under the same Iris dataset in Table 4. For a fair comparison, all the methods generate fuzzy systems with two membership functions for each of the four input variables, resulting in 16 fuzzy rules. The Iris dataset is then divided into a fixed 80:20 proportion for training and testing purposes (refer to Listing 6: lines 7-9). It is important to note that the training set and test set remain the same across all the methods. In addition, in order to explain the influence of the membership function of input variables on a fuzzy system, the text in parentheses after each software name indicates which membership function is used to construct the fuzzy system. For example, the ‘PyTSK(gaussMF)’ and ‘Matlab-ANFIS(bellMF)’ entries in the figure indicate that the PyTSK software uses the Gaussian membership function and the Matlab-ANFIS software uses the bell membership function, respectively. PyTSK and HTSK-LN-ReLU used their own gradient descent methods to train on the dataset, while the other ANFIS methods used a hybrid approach. The experiments were repeated 10 times per group for accuracy.

As shown in Fig. 6, the accuracy score of PyTSK rises steadily from 0.533 to 0.9 as the number of epochs increases from 1 to 200. Although the accuracy score of HTSK-LN-ReLU also shows an overall increase from 0.333 to 0.967, it rises and falls between epochs, with slight fluctuations. With the increasing number of epochs, the accuracy scores of ANFIS-Numpy and ANFIS-PyTorch fluctuate between 0.333 and 1.0 and between 0.167 and 1.0, respectively. In addition, Matlab-ANFIS(bellMF) exhibits a small change from 0.8 to 1.0, and the accuracy of Matlab-ANFIS(gaussMF) decreases from 0.933 to 0.8 as the number of epochs increases. In contrast, the accuracy score of our Scikit-ANFIS increases steadily as the number of epochs grows larger, with minimal variation: the score of Scikit-ANFIS(bellMF) increases from 0.867 to 0.967 and that of Scikit-ANFIS(gaussMF) from 0.9 to 0.967. The relatively stable accuracy score of our Scikit-ANFIS is particularly beneficial in real scenarios where the ANFIS model is applied. This makes our Scikit-ANFIS more efficient and adaptable to handle different ANFIS training configurations of fuzzy inference systems.

5.3 Using Sklearn-supported Cross-validation for Scikit-ANFIS

To further demonstrate the effectiveness of our Scikit-ANFIS software tool, we completed a 10-fold cross-validation (‘10-CV’ for short) experimental comparison of various software for the above four datasets using the cross_val_score() command in Scikit-learn. ‘10-CV’ refers to dividing the data into 10 equal folds of smaller


Table 5 Summary of 10-fold cross-validation (‘10-CV’ for short) experiments of various software for ANFIS or DNFS under four different datasets at the same setting of 100 epochs and Gaussian membership functions

                                 Regression (RMSE ↓)                           Classification (Acc ↑)
Methods                          Tip            Sinc           PCD             Iris
Matlab-ANFIS(hybrid) [10]        8e-7±3e-8      0.127±0.016    0.002±5e-5      0.875±0.018
ANFIS-PyTorch(hybrid) [11]       3e-6±1e-6      0.236±0.082    0.030±0.004     0.933±0.067
ANFIS-Numpy [12]                 0.171±0.510    0.283±0.326    0.194±0.200     0.860±0.156
Scikit-ANFIS(hybrid)             1e-6±1e-7      0.109±0.068    0.041±0.007     0.952±0.054
PyTSK [18]                       n/a            n/a            n/a             0.301±0.098
HTSK-LN-ReLU [37]                0.945±0.451    0.859±0.643    1.100±0.065     0.346±0.105
Matlab-ANFIS(online) [10]        0.426±0.013    0.119±0.015    0.103±0.001     0.968±0.019
ANFIS-PyTorch(online) [11]       2.043±1.516    0.141±0.075    0.403±0.006     0.493±0.326
Scikit-ANFIS(online)             0.382±0.187    0.101±0.069    0.044±0.005     0.960±0.044

Each value in the experiment is in the form of the mean score ± standard deviation. The Iris column reports the accuracy (Acc) score computed at each ‘10-CV’ iteration, while the Tip, Sinc, and PCD columns report the root mean squared error (RMSE) score. ↓ indicates that the smaller the value is, the better the performance, while ↑ indicates that the larger the value is, the better the performance. ‘n/a’ indicates ‘not applicable’
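A mean ± standard deviation entry of the kind reported in Table 5 can be obtained from cross_val_score() roughly as follows. This is a sketch under stated assumptions: `LinearRegression` and the synthetic data stand in for the compared models and datasets, and converting each fold's negated mean squared error to an RMSE with a square root is one plausible post-processing step, not necessarily the one used in Listing 7.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data (assumption), two inputs and one output.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Stand-in estimator (assumption); any scikit-learn compatible model fits here.
model = LinearRegression()

# scikit-learn maximizes scores, so the MSE is returned negated per fold.
scores = cross_val_score(model, X, y, cv=10, scoring="neg_mean_squared_error")

# Per-fold RMSE, then the mean and standard deviation across the ten folds.
rmse = np.sqrt(-scores)
print(f"RMSE: {rmse.mean():.3f} ± {rmse.std():.3f}")
```

For a classifier, passing `scoring="accuracy"` returns the per-fold accuracy directly, so no sign change or square root is needed.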

sets, then training the model using 9 of the folds as training data, and testing the resulting model on the remaining 1 fold of the data to compute a performance measure such as root mean squared error or accuracy. The four datasets in the ‘10-CV’ experiment are from Table 4 and can be divided into three regression sets (Tip, Sinc, PCD) and one classification set (Iris) according to task type. In the experiment, the regression and classification data sets are processed differently. The main difference lies in the setting of the ‘scoring’ parameter, which defines the model evaluation rule in the cross_val_score() command. For regression, ‘neg_mean_squared_error’ is specified as the ‘scoring’ parameter, as shown in line 4 of Listing 7, while for classification, the ‘scoring’ parameter is set to ‘accuracy’, as shown in line 4 of Listing 8. In addition, line 4 of Listing 7 or 8 conducts the ‘10-CV’ experiment on the data set composed of input data ‘X’ and output data ‘y’, where the ‘model’ parameter of the cross_val_score() command can refer to our Scikit-ANFIS or to any ANFIS or DNFS model to be trained and tested. In line 5 of Listings 7 and 8, the mean and standard deviation of the model’s evaluation scores over the ten folds are calculated and printed in the console.

Table 5 summarizes the mean and standard deviation of evaluated scores in each of the ‘10-CV’ experiments with various software for ANFIS or DNFS such as Matlab-ANFIS [10], ANFIS-PyTorch [11], ANFIS-Numpy [12], PyTSK [18], HTSK-LN-ReLU [37], and Scikit-ANFIS on the four datasets from Table 4. To ensure that every software builds the same fuzzy sets and fuzzy rules for the same data set, every experiment uses 100 epochs and an initial step of 0.01, with two Gaussian membership functions designated for each input variable. As illustrated in Listing 3, our experiment comprises two input variables and three fuzzy rules for Tip. Similarly, as per Listing 4, we have enforced two input variables and 16 fuzzy rules for Sinc. For PCD and Iris, we have assigned four input variables and 16 fuzzy rules, as elaborated in Listings 5 and 6. Each experiment was repeated 10 times, and the average was taken as the final result.

In Matlab-ANFIS, we can complete the ‘10-CV’ experiment by calling crossvalind() with the Kfold method and the anfis() command in Matlab [10]. In addition, ANFIS-PyTorch can be called by the cross_val_score() command through a Numpy wrapper. In contrast, the other four Python packages can be called directly, because all five Python packages are based on the Python 3 programming language and have natural interoperability with the sklearn package. The experimental


results under the two training strategies of the three ANFIS-based software Matlab-ANFIS, ANFIS-PyTorch, and Scikit-ANFIS are distinguished by adding ‘hybrid’ or ‘online’ in parentheses, as shown in Table 5. However, there is only a hybrid training method for ANFIS-Numpy, and PyTSK and HTSK-LN-ReLU can be classified as online training methods because they rely on gradient descent to realize the backpropagation update of all parameters in the network. Since PyTSK only applies to classification problems, its experimental results in the three regression data sets are denoted as ‘n/a’. We have also highlighted the best performance values in bold for both the first four methods under the hybrid training strategy and the remaining five methods under the online strategy, for each data set.

The findings in Table 5 reveal that, under the ‘hybrid’ training method, the evaluation performance (0.109±0.068 and 0.952±0.054) of our Scikit-ANFIS exceeds that of the other three software in the Sinc and Iris data sets. In the Tip and PCD data sets, the RMSE performance of Matlab-ANFIS (8e-7±3e-8 and 0.002±5e-5) exceeds ours; in Tip ours still exceeds ANFIS-PyTorch and ANFIS-Numpy, whereas in PCD the values in Table 5 also place ANFIS-PyTorch (0.030±0.004) ahead of ours (0.041±0.007), with ours ahead of ANFIS-Numpy. On average, our developed Scikit-ANFIS comes closer to the performance of the commercial software Matlab-ANFIS than ANFIS-PyTorch and ANFIS-Numpy do. In addition, under the ‘online’ training method, the RMSE performance (0.382±0.187, 0.101±0.069, and 0.044±0.005) of our Scikit-ANFIS outperforms the other three software, HTSK-LN-ReLU, Matlab-ANFIS, and ANFIS-PyTorch, in the Tip, Sinc, and PCD data sets, while the accuracy score 0.968±0.019 of Matlab-ANFIS only slightly outperforms the 0.960±0.044 of ours in the Iris data set, and ours consistently outperforms the other three software, PyTSK, HTSK-LN-ReLU, and ANFIS-PyTorch. These results also highlight the faster convergence of our Scikit-ANFIS compared to PyTSK, HTSK-LN-ReLU, Matlab-ANFIS, and ANFIS-PyTorch in updating all the antecedent and consequent parameters of fuzzy systems based on the gradient descent method.

5.4 Discussions and Limitations

Figure 6 and Table 5 indicate that Matlab-ANFIS is still the relatively best-performing software among the existing ANFIS-based or DNFS-based software for both regression and classification tasks. However, as closed-source commercial software, Matlab-ANFIS is not convenient to use together with a mainstream Python-based machine learning library such as scikit-learn. In contrast, our Scikit-ANFIS software is open source and better suited for the Python platform, outperforming state-of-the-art ANFIS-based or DNFS-based Python software in terms of performance, and is the closest to Matlab-ANFIS. This superiority can be attributed to several key factors. First and foremost, our software benefits from the step size (i.e., learning rate) update method implemented following two heuristic rules [1], which plays an important role in guiding the ANFIS model to accelerate the convergence speed of gradient descent when backpropagating. At the backward stage of Scikit-ANFIS, we apply the adaptive gradient descent method (Adam by default) instead of strict gradient descent to identify the parameters in the ANFIS network. This enables us to obtain an unscaled direct estimation of the parameter updates, which is well-suited for problems that are large in terms of data or parameters. In addition, although PyTSK and HTSK-LN-ReLU also adopt stochastic gradient descent methods such as Adam, they are different from the ANFIS model in that they completely rely on input–output data pairs to generate the corresponding network structure and membership parameters. Our Scikit-ANFIS, however, relies not only on input and output data, but also on human knowledge to construct fuzzy if-then rules, so the resulting fuzzy system is more consistent with the real data distribution and achieves remarkable results.

Another key contribution of our software is the support of the ANFIS model as an optimization method to directly train existing TSK fuzzy inference systems, allowing users to apply the software more easily. Although Matlab-ANFIS can accomplish the same function, it can only be used for fuzzy systems created using the Matlab language and is not easily compatible with the Python platform.

One of the most significant advantages of our proposed syntax is its high level of consistency with that of scikit-learn: both adopt a universal format that first creates a model, which can then be fed data through fit(), before outputting a result through predict(). It is also worth noting again that the fit() and predict() functions of our Scikit-ANFIS follow the same interface provided by scikit-learn, which allows the proposed model to be used efficiently with other available machine learning algorithms. This is also reinforced by the fact that the output generated by predict() can be directly used to calculate metrics such as RMSE and accuracy (see code example Listing 6: lines 14-15).

Although our Scikit-ANFIS helps users to efficiently combine the ANFIS model with other machine learning algorithms in scikit-learn, it has some limitations. Scikit-ANFIS relies on adaptive gradient descent methods to update the antecedent and consequent parameters of a fuzzy system, and it is hard to choose good hyperparameters for these gradient descent methods in advance. Since the optimal value of a hyperparameter cannot be known beforehand, this limitation can be addressed by using the GridSearchCV() function in scikit-learn’s model_selection package to try all candidate values in a user-specified grid and select the best one.

During the training process of ANFIS, the optimization methods used play an essential role in obtaining effective results [28]. The commonly used methods are Gradient Descent and Least Squares Estimate, but heuristic


algorithms such as PSO [50] and GA [51] can also be utilized to train the premise and consequent parameters of ANFIS. To further enhance the training method of Scikit-ANFIS, we aim to integrate PSO and GA in a hybrid training method along with Gradient Descent or LSE. This will improve the training process [28] and ultimately lead to better overall performance of our Scikit-ANFIS.

Another limitation of the existing Scikit-ANFIS is that the fuzzy system does not directly address variables with missing values, which is also a limitation of some machine learning algorithms as implemented by scikit-learn. A common workaround is to use imputation techniques (e.g., advanced fuzzy interpolation techniques [52] in the context of a fuzzy system) to fill missing values with artificially generated values before training and/or prediction.

One more limitation worth discussing is that Scikit-ANFIS can still suffer from the curse of dimensionality, particularly for systems initialized by the grid partition method. This is because both the number of fuzzy rules and the training time grow exponentially with the number of input variables (with m fuzzy sets for each of n input variables, grid partitioning yields m^n rules). This limits the number of input variables and membership functions that can be used, resulting in reduced prediction accuracy due to the absence of important characteristic variables [53]. Although the original ANFIS paper [1] does not discuss this limitation, possibly because the data sets available at the time were relatively small, it is naturally desirable for a model to work with data of high dimensionality to meet the growing trend. While this paper reports only the implementation of the original ANFIS, part of future work will concentrate on advanced dimensionality reduction techniques, such as those based on granule computing and rough sets [54–58], for developing novel computational intelligence models and applications in the era of big data.

6 Conclusion

It is common among the research community to apply ANFIS based on the Matlab platform to a variety of regression, classification, process control, and pattern recognition applications, which makes it difficult for users to combine it with the common scikit-learn library on the Python platform. Hence, in this work, we implement Scikit-ANFIS, a user-friendly and scikit-learn-compatible Python software using the ANFIS architecture, specifically designed for training TSK fuzzy systems. Scikit-ANFIS takes a universal format to create a model, train the model through fit(), and test it through predict(). Our Scikit-ANFIS implementation demonstrates performance gains on four standard data sets compared to existing Python programs that have implemented the ANFIS or DNFS method. Furthermore, our Scikit-ANFIS allows for training an existing TSK fuzzy system directly and automatically generating an ANFIS fuzzy system based on stipulated input–output data pairs.

For future research, we will explore how the Scikit-ANFIS software can be used with deep neuro-fuzzy systems to further strengthen the performance for solving regression and classification problems. Additionally, we will apply it to more complex problem domains such as the health and care domain, where both model performance and interpretability are usually among the top concerns in medical practice [59].

Acknowledgements This work is partially supported by the Henan Key Research and Development Breakthrough Program of China (No. 222102210191).

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Jang, J.: Anfis: adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. 23, 665–685 (1993)
2. Chen, T., et al.: A decision tree-initialised neuro-fuzzy approach for clinical decision support. Artif. Intell. Med. 111, 101986 (2021)
3. Keikhosrokiani, P., Naidu, A., Anathan, A.B., Iryanti Fadilah, S., Manickam, S., Li, Z.: Heartbeat sound classification using a hybrid adaptive neuro-fuzzy inferences system (anfis) and artificial bee colony. Digital Health 9, 85 (2023). https://doi.org/10.1177/20552076221150741
4. Chen, T., et al.: A dominant set-informed interpretable fuzzy system for automated diagnosis of dementia. Front. Neurosci. 16, 867664 (2022)
5. Zand, J.P., Katebi, J., Yaghmaei-Sabegh, S.: A generalized ANFIS controller for vibration mitigation of uncertain building structure. Struct. Eng. Mech. 87, 231–242 (2023)
6. Osheba, D.S., Osheba, S., Nazih, A., Mansour, A.S.: Performance enhancement of PV system using VSG with ANFIS controller. Electr. Eng. 105, 2523–2537 (2023)
7. Arévalo, P., Cano, A., Jurado, F.: Large-scale integration of renewable energies by 2050 through demand prediction with ANFIS, ECUADOR case study. Energy 286, 129446 (2024)
8. Lara-Cerecedo, L., Pitalúa-Díaz, N., Hinojosa-Palafox, J.: Comparative study of the prediction of electrical energy from a photovoltaic system using the intelligent systems ANFIS and ANFIS-GA. Revista Mexicana de Ingeniería Química 22, 1–16 (2023)
9. Lara-Cerecedo, L.O., Hinojosa, J.F., Pitalúa-Díaz, N., Matsumoto, Y., González-Angeles, A.: Prediction of the electricity generation of a 60-kw photovoltaic system with intelligent

models ANFIS and optimized ANFIS-PSO. Energies 16, 6050 (2023)
10. MathWorks: neuro-adaptive learning and anfis - r2023a (2023). https://uk.mathworks.com/help/fuzzy/neuro-adaptive-learning-and-anfis.html. Accessed 5 Jan 2024
11. Power, J.: Anfis in pytorch (2019). https://github.com/jfpower/anfis-pytorch. Accessed 5 Jan 2024
12. Meggs, T.: Anfis (2020). https://github.com/twmeggs/anfis. Accessed 5 Jan 2024
13. Gilardi, G.: Anfis (2021). https://github.com/gabrielegilardi/ANFIS. Accessed 5 Jan 2024
14. Rathnayake, N., Dang, T.L., Hoshino, Y.: A novel optimization algorithm: cascaded adaptive neuro-fuzzy inference system. Int. J. Fuzzy Syst. 23, 1955–1971 (2021)
15. Rathnayake, N., Rathnayake, U., Dang, T.L., Hoshino, Y.: A cascaded adaptive network-based fuzzy inference system for hydropower forecasting. Sensors 22, 2905 (2022)
16. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
17. Talpur, N., et al.: Deep neuro-fuzzy system application trends, challenges, and future perspectives: a systematic survey. Artif. Intell. Rev. 56, 865–913 (2023)
18. Cui, Y., Wu, D., Jiang, X., Xu, Y.: Pytsk: a python toolbox for tsk fuzzy systems. arXiv preprint arXiv:2206.03310 (2022). https://doi.org/10.48550/arXiv.2206.03310
19. Ketkar, N., Moolayil, J.: Deep Learning with Python: learn best practices of deep learning models with PyTorch, 2nd edn. Apress, Berkeley, CA (2021)
20. Takagi, T., Sugeno, M.: Fuzzy identification of systems and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. SMC-15, 116–132 (1985)
21. Fresno, C., Fernández, E.A.: Anfis vignette (2012). https://github.com/jfpower/anfis-pytorch/blob/master/Anfis-vignette.pdf. Accessed 5 Jan 2024
22. Chen, T., Shang, C., Su, P., Shen, Q.: Induction of accurate and interpretable fuzzy rules from preliminary crisp representation. Knowl.-Based Syst. 146, 152–166 (2018)
23. Carter, J., Chiclana, F., Khuman, A.S., Chen, T. (eds.): Fuzzy logic: recent applications and developments, 1st edn. Springer, Switzerland (2021)
24. Liebscher, R.: Pyfuzzy-python fuzzy package (2014). http://pyfuzzy.sourceforge.net/. Accessed 5 Jan 2024
25. Avelar, E., Castillo, O., Soria, J.: Fuzzy logic controller with fuzzylab python library and the robot operating system for autonomous robot navigation: a practical approach. Intuit Type-2 Fuzzy Logic Enhanc. Neural Optim. Algor. Theory Appl. 862, 355–369 (2020)
26. Scikit-fuzzy (2023). https://pythonhosted.org/scikit-fuzzy/. Accessed 5 Jan 2024
27. Spolaor, S., et al.: Simpful: a user-friendly python library for fuzzy logic. Int. J. Comput. Intell. Syst. 13, 1687–1698 (2020)
28. Karaboga, D., Kaya, E.: Adaptive network based fuzzy inference system (anfis) training approaches: a comprehensive survey. Artif. Intell. Rev. 52, 2263–2293 (2019)
29. Oliphant, T.E.: Python for scientific computing. Comput. Sci. Eng. 9, 10–20 (2007)
30. Oliphant, T.E., et al.: A guide to NumPy, vol. 1. Trelgol Publishing, USA (2006)
31. Zheng, Y., Xu, Z., Wang, X.: The fusion of deep learning and fuzzy systems: a state-of-the-art survey. IEEE Trans. Fuzzy Syst. 30, 2783–2799 (2022)
32. Sun, C., Jang, J.: A neuro-fuzzy classifier and its applications. In: Proceedings Second IEEE International Conference on Fuzzy Systems, pp. 94–98. IEEE (1993)
33. Talpur, N., Abdulkadir, S.J., Hasan, M.H.: A deep learning based neuro-fuzzy approach for solving classification problems, pp. 167–172. IEEE (2020)
34. Wu, D., Yuan, Y., Huang, J., Tan, Y.: Optimize tsk fuzzy systems for regression problems: Minibatch gradient descent with regularization, droprule, and adabound (mbgd-rda). IEEE Trans. Fuzzy Syst. 28, 1003–1015 (2020)
35. Cui, Y., Wu, D., Huang, J.: Optimize tsk fuzzy systems for classification problems: Minibatch gradient descent with uniform regularization and batch normalization. IEEE Trans. Fuzzy Syst. 28, 3065–3075 (2020)
36. Shi, Z., et al.: Fcm-rdpa: Tsk fuzzy regression model construction using fuzzy c-means clustering, regularization, droprule, and powerball adabelief. Inf. Sci. 574, 490–504 (2021)
37. Cui, Y., Xu, Y., Peng, R., Wu, D.: Layer normalization for tsk fuzzy system optimization in regression problems. IEEE Trans. Fuzzy Syst. 31, 254–264 (2022)
38. Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. arXiv preprint arXiv:1609.04836 (2016). https://doi.org/10.48550/arXiv.1609.04836
39. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014). https://doi.org/10.48550/arXiv.1412.6980
40. Luo, L., Xiong, Y., Liu, Y., Sun, X.: Adaptive gradient methods with dynamic bound of learning rate. arXiv preprint arXiv:1902.09843 (2019). https://doi.org/10.48550/arXiv.1902.09843
41. Yuan, Y., Li, M., Liu, J., Tomlin, C.: On the powerball method: variants of descent methods for accelerated optimization. IEEE Control Syst. Lett. 3, 601–606 (2019)
42. Zhuang, J., et al.: Adabelief optimizer: adapting stepsizes by the belief in observed gradients. Adv. Neural. Inf. Process. Syst. 33, 18795–18806 (2020)
43. Bottou, L.: Large-scale machine learning with stochastic gradient descent, pp. 177–186. Springer, Berlin (2010)
44. Riedmiller, M., Braun, H.: A direct adaptive method for faster backpropagation learning: The rprop algorithm. In: IEEE International Conference on Neural Networks, pp. 586–591. IEEE (1993)
45. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45, 503–528 (1989)
46. Zeiler, M.D.: Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012). https://doi.org/10.48550/arXiv.1212.5701
47. Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12, 2121–2159 (2011)
48. Pathak, A.: Restaurant tipping problem using fuzzy logic (2023). https://github.com/ap1904/RTP. Accessed 5 Jan 2024
49. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936)
50. Turki, M., Bouzaida, S., Sakly, A., M’Sahli, F.: Adaptive control of nonlinear system using neuro-fuzzy learning by pso algorithm, pp. 519–523. IEEE (2012)
51. Cárdenas, J.J., García, A., Romeral, J., Kampouropoulos, K.: Evolutive ANFIS training for energy load profile forecast for an IEMS in an automated factory. In: ETFA2011, pp. 1–8. IEEE (2011)
52. Chen, T., Shang, C., Yang, J., Li, F., Shen, Q.: A new approach for transformation-based fuzzy rule interpolation. IEEE Trans. Fuzzy Syst. 28, 3330–3344 (2019)
53. Stathakis, D., Savina, I., Nègrea, T.: Neuro-fuzzy modeling for crop yield prediction. Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 34, 1–4 (2006)
54. Li, W., et al.: Feature selection approach based on improved fuzzy c-means with principle of refined justifiable granularity. IEEE Trans. Fuzzy Syst. 31, 2112–2126 (2023)
55. Su, P., et al.: Corneal nerve tortuosity grading via ordered weighted averaging-based feature extraction. Med. Phys. 47, 4983–4996 (2020)
56. Li, W., et al.: Double-quantitative feature selection approach for multi-granularity ordered decision systems. IEEE Trans. Artif. Intell. 1, 1–12 (2023)
57. Mac Parthaláin, N., Jensen, R., Diao, R.: Fuzzy-rough set bireducts for data reduction. IEEE Trans. Fuzzy Syst. 28, 1840–1850 (2019)
58. Li, W., Zhou, H., Xu, W., Wang, X.-Z., Pedrycz, W.: Interval dominance-based feature selection for interval-valued ordered data. IEEE Trans. Neural Netw. Learn. Syst. 34, 6898–6912 (2023)
59. Chen, T., Carter, J., Mahmud, M., Khuman, A.S.: Artificial intelligence in healthcare: recent applications and developments, vol. 1. Springer Nature, Singapore (2022)

Dongsong Zhang received the PhD degree in Computer Science and Technology from National University of Defense Technology, China, in 2012. He is currently an assistant professor in the School of Big Data and Artificial Intelligence at Xinyang College. His research interests are in soft computing (neural networks, fuzzy logic) and real-time systems.

Tianhua Chen received the Ph.D. degree in Computational Intelligence from Aberystwyth University, Aberystwyth, U.K., in 2017. He is currently a Reader (Associate Professor) in Artificial Intelligence with the Department of Computer Science, School of Computing and Engineering, University of Huddersfield, UK. He has published over 60 peer-reviewed papers in leading international journals and conferences, including a lead-authored paper selected as an IEEE Transactions on Fuzzy Systems Publication Spotlight Article by the IEEE Computational Intelligence Society. His research interests include artificial intelligence for health and wellbeing, explainable AI, and neuro-fuzzy systems. Tianhua is an Editorial Board Member of Artificial Intelligence in Medicine journal (Elsevier), BMC Medical Informatics and Decision Making journal (Springer), and PLOS ONE.