A Review of Data Mining Applications For
Abstract

Many quality improvement (QI) programs including six sigma, design for six sigma, and kaizen require collection and analysis of data to solve quality problems. Due to advances in data collection systems and analysis tools, data mining (DM) has widely been applied for QI in manufacturing. Although a few review papers have recently been published to discuss DM applications in manufacturing, these only cover a small portion of the applications for specific QI problems (quality tasks). In this study, an extensive review covering the literature from 1997 to 2007 and several analyses on selected quality tasks are provided on DM applications in the manufacturing industry. The quality tasks considered are: product/process quality description, predicting quality, classification of quality, and parameter optimisation. The review provides a comprehensive analysis of the literature from various points of view: data handling practices, DM applications for each quality task and for each manufacturing industry, patterns in the use of DM methods, application results, and software used in the applications. Several summary tables and figures are also provided along with the discussion of the analyses and results. Finally, conclusions and future research directions are presented.

Keywords: Knowledge discovery in databases; Data mining; Quality improvement; Six sigma; Design for six sigma; Quality description; Prediction; Classification; Parameter optimisation; Data mining software; Manufacturing

© 2011 Elsevier Ltd. All rights reserved.
3. Do these applications contribute to the literature in terms of new methods, new methodologies, or successful implementation of the existing ones? What are the reasons for selecting these methods? What are the reported benefits and shortcomings of these methods? What are the results of these applications?
4. Which typical patterns or sequences of DM functions and methods are used in these applications to solve the QI problems?
5. Which software has been used in these applications? What are the reasons for selecting these software products? What are the reported benefits and shortcomings of the software?

The review of the literature covers the publications from 1997 through 2007. Applications in the manufacturing sector in general are considered and the publications are selected for the review based on the selection criteria provided in Table 1.
Each publication selected for review is analysed in detail to fill out a table that summarises its basic characteristics which are of interest in this study as shown in Table 2. This table serves as the major source of information to answer the research questions. For this purpose, the applications are sorted according to a characteristic under consideration, and/or simple statistics are calculated for those that satisfy a certain property. The complete table is available in Köksal, Batmaz, and Testik (2008).
Classification of DM functions and methods for a particular application area is an important activity on its own. Even though DM functions are typically classified as data summarisation, classification, prediction, clustering and so on, these classes depend very much on the type of knowledge that can be discovered in databases (Choudhary et al., 2008; Wei, Piramuthu, & Shaw, 2003). Assigning the methods to the functions is also challenging, since some DM methods serve multiple DM functions. In this study, DM functions and methods used in the applications are classified based on the desired knowledge to be discovered from the quality data and the way they are used in the applications.
In the following sections, a classification of QI and control tasks within the scope of this study is given first. Next, the KDD process and relevant DM functions are briefly described. A comprehensive analysis of the reviewed literature from various points of view is provided subsequently. The analysis summarises data handling practices, discusses the application studies for each quality task and for each manufacturing industry, presents patterns of DM function and method usage, examines results of the applications, and surveys the software used in these applications. Finally, conclusions are presented and future research directions are discussed.

2. Quality improvement tasks

A classification of the QI and control tasks and the context of the review are provided in the following. The classification is based on the purpose of a study, and it does not necessarily yield disjoint classes, since the final purpose of a study can be achieved in a hierarchy of lower level purposes. In addition, the aim of this classification is not to identify disjoint classes, but to present an analysis of the DM applications considering different purposes in QI. Hence, before such a classification is presented, some background on various quality initiatives and commonly used quality problem solving approaches is given as well as the QI and control activities that occur in a life cycle of a product.
As quality gained more and more importance over time, many quality initiatives and concepts have emerged. Inspection (100%), statistical quality control (SQC), total quality control (TQC), zero defects, total quality management (TQM), kaizen, ISO 9000 quality standards, quality award programs (Malcolm Baldrige, European Quality Award and so on), 6σ, DFSS, and lean six sigma have been among the most recognised ones (Fasser & Brettner, 2002; Montgomery, 2005). One apparent trend in the evolution of these initiatives is the emphasis on more proactive approaches and “upstream (design) processes” (Kolarik, 1995; Montgomery, 2005; Phadke, 1989; Taguchi, Chowdhury, & Taguchi, 2000). Furthermore, emphasis on the bottom-line results in shorter time periods has also gained importance and led to the 6σ quality programs. These programmes have received considerable recognition around the world due to their success, especially in improvement of “on-line” or downstream processes using the so called define-measure-analyse-improve-control (DMAIC) approach to reach six sigma (less than 3.4 parts per million (PPM) defectives) quality levels (Brady & Allen, 2006; Fasser & Brettner, 2002). Realising that such high
product and/or
from the area of artificial intelligence (AI) and DM into the systems.
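
The 3.4 PPM figure quoted for six sigma quality can be checked with a short calculation. The sketch below assumes the conventional 1.5σ long-term shift of the process mean (an assumption that the text above does not state), so that the nearer specification limit lies 4.5σ from the shifted mean, and it counts only that one tail of a normal distribution. The function name is illustrative.

```python
import math


def upper_tail_ppm(z: float) -> float:
    """Defectives per million lying more than z standard deviations above the mean."""
    # Upper-tail probability of the standard normal distribution.
    tail_probability = 0.5 * math.erfc(z / math.sqrt(2.0))
    return tail_probability * 1_000_000


# Assumption: specification limit at 6 sigma, process mean shifted by 1.5 sigma,
# which leaves 4.5 sigma between the shifted mean and the nearer limit.
six_sigma_ppm = upper_tail_ppm(6.0 - 1.5)
print(f"{six_sigma_ppm:.2f} PPM")  # prints approximately 3.40 PPM
```

Without the assumed shift, the same function applied at 6σ gives a far smaller defect rate, which is why the shift convention matters when the 3.4 PPM figure is quoted.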
Table 3
Quality control and improvement activities and methods (adapted from Phadke, 1989).
Product development stage | Quality control and improvement | Examples of traditional methods used
Product design, manufacturing process design | Concept design | QFD, Pugh’s concept selection, TRIZ, technological forecasting
Product design, manufacturing process design | Parameter design | Design of experiments, ANOVA, response surface modelling and analysis, regression,
Product design, manufacturing process design | Tolerance design | Statistical tolerancing, cost analysis
Manufacturing | Inspection/Screening | Pattern recognition, automated inspection
Manufacturing | Quality analysis | ANOVA, regression, classification, clustering
Manufacturing | Process control | Feedback control, feed forward control, manual adjustments
Manufacturing | Quality monitoring | Statistical process control (control charts), principal component analysis
Customer usage | Warranty and repair/replacement | Replacement analysis
consist of important DM applications through on-line process monitoring and automatic process control, they make use of special methods collected under temporal data mining due to ‘time’ being an important variable of the models. Furthermore, the literature on these topics is too voluminous to be included in the scope of this review. For example, reviews of artificial neural networks (ANNs) and wavelet transforms for quality monitoring are provided in Zorriassatine and Tannock (1998) and Ganesan, Das, and Venkataraman (2004), respectively. Similarly, a review paper on process control using ANNs is provided in Hussain (1999). These reviews indicate the widespread use of DM in the area. Yet, there are many other DM methods in the literature, such as principal component analysis (PCA), partial least squares (PLS), support vector machines (SVM), and decision trees (DTs), that have found applications in quality monitoring and process control. A separate review of these might also be beneficial for future studies.
The final phase of the product development process considers the usage of the product by the customers. Since the product is in the hands of the customer at this phase, quality perception can be improved through repair or replacement of the product under warranty, and other services beyond warranty. We have also excluded DM applications involving QI and control at after-sales-services, which typically utilise text mining approaches due to the textual nature of the associated data.
In summary, the QI and control activities we choose to consider in this study involve the following tasks, which we refer to as quality tasks from this point on:

Description (or characterisation) of product and process quality: Quality of products or processes can be defined or characterised by performing the following tasks:
– Identifying attributes/variables, which affect quality significantly.
– Ranking the attributes/variables based on their significance.
– Identifying how low, medium and high yielding products are naturally grouped in data, and finding the most probable causative factor(s) that discriminate between low and high yielding products.
These activities are typically performed at earlier stages (define-measure-analyse) of DMAIC and DMADV for quality analysis and product/process design.

Predicting quality: When quality output is a real valued variable, developing models that relate input characteristics of quality to the output, and using such models to predict what the resulting quality characteristic value will be for a given set of input parameter values. This quality task is typically performed at the Analyse stage of both DMAIC and DMADV cycles dealing with real valued quality output characteristics. The prediction models developed can be used later in Improve or Design (Optimise) – Verify stages, or directly in the Control stage.

Classification of quality: Classifying a quality characteristic of interest for nominal, binary or ordinal outputs (such as defects). For a given set of input parameters, predicting the class of the quality output. Similar to predicting quality, this quality task is typically performed at the Analyse stage of both DMAIC and DMADV cycles.

Parameter optimisation: Based on the learned characteristics of the cases yielding high quality, finding optimal levels of process/product parameters that consistently yield target quality performance. Note that this quality task is also performed at the Improve or Develop (Optimise) – Verify stages.

3. The knowledge discovery in databases process and data mining

Fayyad, Piatetsky-Shapiro, and Smyth (1996) define KDD as “the nontrivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data”. It consists of the following main steps: (i) data preparation, (ii) data preprocessing, (iii) DM, (iv) evaluation and interpretation, (v) implementation. Note that DM is a step in the KDD process that consists of applying data analysis and discovery algorithms. DM tasks can be classified into two groups (Han & Kamber, 2006): descriptive and predictive. These tasks can be accomplished by using various methods based on DM functions, used to specify the types of patterns to be mined. These functions include summarisation (characterisation), clustering, association, classification, prediction and so on. Below, the steps of the KDD process are described briefly.

3.1. Data preparation

DM functions mainly utilise available data sources such as data warehouses, marts, databases or files for gathering data (Pyle, 1999). In applications, the data sources are first located, accessed, and integrated. Next, selected data is put into a tabular format in which instances and variables take place in rows and columns, respectively (Giudici, 2003). If the data set built is very large, a representative reduced data set can be obtained by sampling. In certain situations all data may not be readily available for mining, where data farming process can help us “defining features that are the most appropriate for DM” (Kusiak, 2006). Then, one may collect necessary data recording the feature values directly from real time or experimental observations, or indirectly from simulation results.

3.2. Data preprocessing

Real-world data is generally dirty, incomplete and inconsistent. Redundancies may also occur due to integration of data from various sources. The main purpose of this step is to handle these kinds of
problems to improve the data quality. In addition, transforming and reducing data can help to improve the accuracy and efficiency of DM function(s). Basic data preprocessing techniques are as follows (see Giudici, 2003; Pyle, 1999; Witten, 2005):

– Data cleaning involves techniques for filling in missing values, smoothing out noise, handling outliers, detecting and removing redundant data.
– Data transformation puts the data into appropriate forms for mining when necessary.
– Data reduction is applied to reduce the data set to be mined. While ‘dimension reduction’ technique eliminates unnecessary attributes, ‘data compression’ and ‘numerosity reduction’ techniques provide other forms of reduced data representations.
– Discretisation, a form of data reduction, reduces the number of levels of an attribute by collecting and replacing low-level concepts with high-level concepts.

3.3. Data mining

DM methods can be categorised based on various criteria. In this study, we classify them according to the ‘types of knowledge mined’ (DM functionalities) such as clustering, association, classification, and so on for achieving descriptive/predictive DM tasks (Dunham, 2003).

3.3.1. Descriptive data mining
Descriptive DM involves exploration of patterns and relationships that may exist in data. Basic descriptive functions are summarisation, clustering, association rule generation and sequence discovery (Dasu & Johnson, 2003; Giudici, 2003). These are also used for data exploration before a classification/prediction and/or optimisation DM function is implemented. In this review, sequence discovery is not observed due to the nature of problems within the scope of the study.
Summarisation is the presentation of general characteristics of a data set. Basic approaches are OLAP and attribute-oriented induction. There are numerous statistical methods available for data summarisation (Giudici, 2003). Descriptive statistics and graphical displays can effectively describe univariate data. For bivariate data, additional methods (e.g. correlation analysis (CA), scatter plots) can be used to determine the relationship that may exist between the variables. For describing multivariate data, however, dependency and association measures as well as multidimensional graphs such as scatter plot matrix and Andrews curves are needed (Martinez & Martinez, 2002).
Clustering is the process of grouping data into classes of similar objects. The similarity among objects is usually measured by distance measures. Major distance-based clustering methods can be organized in two categories (Han & Kamber, 2006). Partitioning methods classify the data into k parts in such a way that observations in each part are closely related to each other. Hierarchical methods group the data into a tree of clusters by either using bottom-up (agglomerative) or top-down (divisive) approaches. Besides, there are others classified as density-based, grid-based and model-based methods.
Association tries to identify groups of items that occur together. Assuming that a database consists of a set of records which contains a set of items, most algorithms accomplish the association task in two steps: finding frequent item sets, and then generating interesting if-then rules (Hand, Mannila, & Smyth, 2001).

3.3.2. Predictive data mining
Predictive modelling can be accomplished by performing classification or prediction functions to forecast future values of categorical or continuous type data, respectively. Major classification/prediction methods can be categorised into the following groups (Dunham, 2003): statistical based (S-based) methods use classical techniques which depend on statistical theory, and thus, provide statistical inference. DT-based algorithms construct DTs which look like a flowchart using a top-down recursive approach. DT-based algorithms automatically generate rules having ‘if-then’ type structures. ANN-based algorithms, on the other hand, consist of a set of connected input–output units each having a weight, which is updated by a learning algorithm such as backpropagation (BP) used with gradient descent (GD) optimisation technique, or the radial basis function (RBF). In addition, different classification methods can be combined to improve the model accuracy. Following a successful implementation of a classification/prediction function, or as a stand-alone DM function, optimisation can be performed to determine the settings of factors (or design parameters) that yield desired responses.

3.4. Evaluation and interpretation

The KDD process described above tries to uncover previously unknown structures that may reside in data. Depending on the data sets and research objectives, one can start at any step and continue with the others as long as there are research questions to be answered. Besides, one may try several methods for describing or modelling process/product quality. Therefore, assessing the utility and reliability, and then, interpreting the information discovered in the modelling should be the final stage in a DM process (Giudici, 2003). Evaluation of the DM methods to reach a final decision requires a comparison of results obtained from various DM methods using several measures including accuracy, time and resource requirements. It is obvious that to obtain reliable results, knowledge extracted should be evaluated and interpreted correctly (Dunham, 2003).

3.5. Implementation

The KDD process described above provides tools for better understanding the relations in quality data. The final KDD step involves implementation of the results obtained into the QI related decisions of the industry (Giudici, 2003). This is typically performed informally. Kusiak (2006) proposes a framework for more structured and transparent decision making based on decision making constructs called decision tables, decision maps, atlases, and library.

4. Analysis of the applications

In the following, we examine and discuss the reviewed literature from various points of view. Data handling or more specifically data preparation and data preprocessing before performing the DM functions in the quality tasks are discussed first. Next, DM applications in each of the quality tasks defined earlier are categorised with an emphasis on the manufacturing industries. Findings of these applications are presented subsequently, and then some DM patterns found in these applications are explained. Finally, software tools used in these applications are examined.

4.1. Data handling

Data preparation and data preprocessing are two essential steps that should be taken before performing any DM function. Nevertheless, these steps are either briefly explained or considerably less emphasis is given to them in presenting the applications. Below, we provide some analyses of the literature regarding data preparation and preprocessing steps.
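
As a rough illustration of the data cleaning step listed in Section 3.2 (removing redundant data and filling in missing values), a minimal Python sketch is given below. The record layout, the field names and the choice of median imputation are illustrative assumptions, not practices reported by the reviewed studies.

```python
from statistics import median


def clean_records(records, numeric_fields):
    """Drop exact duplicate records, then fill missing numeric values with the field median."""
    # Step 1: remove redundant (exactly duplicated) records, keeping the first occurrence.
    seen = set()
    unique = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(record))

    # Step 2: replace missing values (None) with the median of the observed values.
    for field in numeric_fields:
        observed = [r[field] for r in unique if r[field] is not None]
        if not observed:
            continue  # nothing to impute from; leave the field untouched
        fill_value = median(observed)
        for r in unique:
            if r[field] is None:
                r[field] = fill_value
    return unique


# Hypothetical process records: one missing temperature and one duplicated row.
raw = [
    {"temp": 200.0, "pressure": 5.1},
    {"temp": None, "pressure": 5.3},
    {"temp": 210.0, "pressure": 5.1},
    {"temp": 200.0, "pressure": 5.1},
]
cleaned = clean_records(raw, ["temp", "pressure"])
print(cleaned)
```

Median imputation is only one of several possible ways to fill missing values; it is used here because it is simple and less sensitive to outliers than the mean.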
4.1.1. Preparation of quality data
In the largest share of the QI applications reviewed (37% of all data sets), data were collected through statistically designed experiments (DOE). Experimental design settings used in these applications are provided in Table 4. The number of records in the experimental data sets ranged from 9 to 1323. Observational data sets are also common in the reviewed applications (28% of all data sets) and these were either obtained through online sensor measurements or from historical logs. The number of records in the online-observational data sets ranged from 27 to 16,381. On the other hand, historical-observational data sets (20% of all data sets) were typically obtained from two different databases, one for production and the other for quality data. The number of records in such observational data sets ranged from 27 to 58,076. Alternatively, process simulations and simulated data sets (15% of all data sets) also existed and these were used to test the performance of the algorithms used. The largest number of records observed in the reviewed applications is 250,000 and this belongs to a simulated data set.

4.1.2. Preprocessing of quality data
Data cleaning techniques used in the reviewed applications are presented in Table 5. Most of the cleaning techniques were applied to the historical-observational type data sets. In the applications, data transformations were also considered in the data preprocessing step. As can be seen in Table 6, normalisation of the data was the most common transformation used. Preprocessing of data through data reduction was also common in the reviewed studies (see Table 7). In particular, dimension reduction is observed to be a widely used form of data reduction, and there is a long list of techniques applied for that purpose, analysis of variance (ANOVA), subjective evaluation (SE) and ANN being the most noticeable among them. In addition, data compression techniques such as PCA to obtain manageable sizes of data are also common for data reduction. Finally, discretisation techniques used in the preprocessing of data are provided in Table 8. These were applied extensively on continuous type of data, especially before DT-based methods were used, and mostly the SE method was preferred.
Based on the analysis of the literature reviewed above, we can say that quality data can be either too small or too large in size. Besides, it may contain missing, outlying, inconsistent and incomplete observations (see Table 5). In addition, quality data may have several input and output variables of mixed type (discrete and continuous) (Köksal et al., 2008). Due to the enormous number of input variables, correlations among them are frequently observed, and are usually handled by using dimension reduction techniques (see Table 7). Moreover, quality data is usually imbalanced in terms of the number of records contained for defective/nondefective items.

4.2. Analysis of DM applications according to quality tasks

The number of DM applications for the covered quality tasks has been increasing for the last 10 years (see Fig. 1). A total of 127 application studies (presented in 130 papers) within the scope of this review have been found. In this section, these applications are reviewed according to the quality tasks performed, giving special attention to the manufacturing industries involved.
An analysis of the usage of quality tasks (shown in Fig. 2) indicates that predicting quality is the most frequently performed task
Table 4
Design types used in the QI literature.
Table 5
Data cleaning techniques used in the QI literature.
Table 6
Transformations applied in the QI literature.
Transformation | QI applications
Normalisation | Cool et al. (1997), Lewis and Ransing (1997), Tsai et al. (1999), Cherian et al. (2000), Suneel et al. (2002), Vasudevan et al. (2002), Rallo et al. (2002), Li et al. (2003a), Li et al. (2003b), Zuperl and Cus (2003), Shi et al. (2004), Perzyk et al. (2005), Hou et al. (2003), Ho et al. (2006), Yin and Yu (2006), Karim et al. (2006), Zhou et al. (2006), Zhang et al. (2007), Karnik et al. (2007), Lee and Dornfeld (2007), Chen et al. (2007) and Hung (2007)
Smoothing | Li et al. (2003b)
Logarithm | Umbrello et al. (2007)
Transforming categorical data to | Karim et al. (2006)
Centring data | Tsai et al. (1999)
Adjusting abnormal data | Dhond et al. (2000)
Unspecified | Shahbaz et al. (2006)
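
Two of the transformations in Table 6, normalisation and centring, can be sketched as follows. The reviewed papers are not said to share a single form of normalisation; min-max scaling to the interval [0, 1] is assumed here purely for illustration, and the function names and sample values are hypothetical.

```python
def min_max_normalise(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant attribute carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def centre(values):
    """Subtract the mean so that the transformed values average to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical measurements of one continuous process variable.
measurements = [9.0, 12.0, 15.0, 21.0]
normalised = min_max_normalise(measurements)
centred = centre(measurements)
print(normalised)
print(centred)
```

Min-max scaling keeps the relative spacing of the observations, which is why it is often applied before distance-based or ANN-based methods whose inputs are measured on very different scales.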
Table 7
Data reduction techniques used in the QI literature.
Table 8
Discretisation methods used in QI literature.
(42%). Applications involving classification of quality (25%) and parameter optimisation (23%) are also considerable in number. However, description of quality is performed relatively less (10%).
A classification of the reviewed applications according to the manufacturing industry codes of “International Standard Industrial Classification (ISIC) of All Economic Activities, Revision 4, Section C – Manufacturing” is provided in Fig. 3. Many DM applications in the literature regarding the QI activities appear to be from the metal product manufacturing industries (with operations such as forming, welding, casting and machining) (37%), as well as computer and electronic product manufacturing industries and device manufacturers for those (30%). Plastics (7%), basic metals (7%), coke and refined petroleum (4%) and chemical products (4%) manufacturing industries observed significantly less usage of DM methods. Moreover, motor vehicles, trailers, food, beverages, textile, paper and paper products, nonmetallic mineral products, wood products except furniture, pharmaceutical products and electrical equipment manufacturing industries had rare (2–0.8%) DM applications.
For many other industries, DM applications within the scope have not been observed. These include tobacco products, wearing apparel, leather and related products, printing and reproduction of recorded media, machinery and equipment (other than motor
Fig. 2. Number of times DM functions are used for quality tasks in the application studies.
vehicles), and furniture manufacturing industries, which can be considered as potential application areas.

4.2.1. Description of product and process quality
Product and process quality description is generally the first stage of a QI study in an industrial application. In the computer and electronic product manufacturing industry, self organizing maps (SOMs) have been used to identify critical poor yield factors (Gardner & Bieker, 2000) and a variant of SOM has been used to improve yield (Karim, Halgamuge, Smith, & Hsu, 2006) in wafer manufacturing. Kang, Choe, and Park (1999) also considered SOM for generating better operating conditions in semiconductor manufacturing. On the other hand, Tseng, JothiShankar, and Wu (2004a), Tseng, JothiShankar, Wu, Xing, and Jiang (2004b) utilised rough set theory (RST) to identify the features that produce solder ball defects and affect the quality in printed circuit board (PCB) manufacturing. Moreover, rule induction has been utilized to select attributes that represent the quality of motherboard assemblies (Huang, Li, & Peng, 2006) and for eliminating waste by improving the quality of wafers in integrated circuit manufacturing (Kusiak, 2000). Bertino, Catania, and Caglio (1999) considered DT and association analysis for determining possible causes of faulty wafers, whereas Hsu and Chien (2007) considered DT and ANN to extract patterns from wafer bin map to improve yield. Association analysis was also used for identification of root-cause machine set (Chen, Tseng, & Wang, 2005). In order to infer possible causes of faults and manufacturing process variations in semiconductor manufacturing, Chien, Wang, and Cheng (2007) considered k-means clustering. Furthermore, hierarchical clustering was used for describing the relationships between machines and wafer yield rates (Hu & Su, 2004), for determining the cause of low quality wafers (Skinner et al., 2002), and for improving yield (Baek, Jeong, & Han, 2005).
Metal product manufacturing industry applications also include several product and process quality description studies, where product/process attributes or variables affecting quality were determined, ranked based on significance, or grouped based on similarities. In packaging manufacturing De Abajo, Diez, Lobato, and Cuesta (2004) developed a tinplate quality diagnostic model, where SE, genetic algorithm (GA), SOM, and DT were used. Ali and Chen (1999) used multiple linear regression (MLR) and DT in an injection moulding process for QI. For a cutting process, Chang and Jiang (2002) applied PCA for dimension reduction, and for a hot rolling process Cser et al. (2001) used SOM for clustering. ANN was used by Dhond, Gupta, and Vadhavkar (2000) in a steel mill blast furnace process and by Han, Han, and Liu (1999) in a pressurized
computer numeric control (CNC) turning process (Suneel, Pandle, & Date, 2002), in oil methods manufacturing (Aloudat, 2006), in a stainless steel drilling process (Karnik, Gaitonde, & Davim, 2007), and in a hard machining process (Umbrello, Ambogio, Filice, & Shivpuri, 2007).
To develop prediction models for quality, other industries also made use of DM methods. ANNs were used in all of these applications. Plastics manufacturing industries utilised ANN, especially in plastic injection moulding processes (Kurtaran, Ozcelik, & Erzurumlu, 2005; Ozcelik & Erzurumlu, 2006; Sadeghi, 2000; Shen, Wang, & Li, 2007). Chemical process industries have a variety of ANN applications: in a thermal spray process (Guessasma, Salhi, Montavon, Gougeon, & Coddet, 2004), in a plasma spraying process (Wang, Fang, Zhao, & Zeng, 2007b), in a low density polyethylene process (Rallo et al., 2002), in an oxidative dehydrogenation of propane process (Holena & Baerns, 2003), in an ethylene pyrolysis process (Zhou et al., 2006), in a diamond-like carbon coating process (Ho, Lau, Lee, Ip, & Pun, 2006), and in a silicon compound manufacturing process (Li et al., 2003a). Other industry applications include a reaming process (Mathews & Shunmugam, 1999), a laser sintering process (Wang, Wang, Zhao, & Liu, 2006), a canned foods process (Chen & Ramaswamy, 2002), a beer fermentation process (Riverol & Cooney, 2007), a worsted spinning process (Yin & Yu, 2006), and a particle board manufacturing process (Cook, Ragsdale, & Major, 2000). Also for a glass coating process, Li et al. (2003b) used both ANN and DT.

4.2.3. Classification of quality
Classification of a quality characteristic or predicting the class of a quality output is also a common task in DM applications. Computer and electronic product manufacturing industries exploited several DM methods with the purpose of classification. RST rule induction was used in integrated circuit manufacturing (Kusiak, 2000), in printed circuit board (PCB) manufacturing (Kusiak & Kurasek, 2001), and in motherboard assembly (Huang et al., 2006). Furthermore, Yang and Tsai (2002) employed neuro-fuzzy systems (NFSs) in a surface mount technology assembly process and Georgilakis and Hatziargyriou (2002) utilized DT, ANN, and entropy network (EN) for increasing the classification success rate of transformer iron losses. In an effort to improve a wafer cleaning process, Braha and Shmilovici (2002) used SOM, DT, ANN, and combination of multiple classifiers (CMC). To analyse final product quality ANN was used in Chen et al. (2007). DT was used in Kang et al. (1999) to generate better operating conditions and to improve the yield (Baek et al., 2005; Chien, Li, & Jeang, 2006; Chien et al., 2007; Li et al., 2006). Moreover, DT was also used for drop

ing process. Kusiak (2002) used RST to derive associations among control parameters and the product quality in a metal forming process. RST was also employed together with fuzzy set theory (FST) (Hou & Huang, 2004) and together with ANN (Hou, Liu, & Lin, 2003) in conveyor belts manufacturing for rule induction.
Classification applications also have been observed in other industries to improve quality. DT was used in ultra precision manufacturing (Huang & Wu, 2005). Jemwa and Aldrich (2005) considered an integration of SVM and DT for a continuous stirred tank reactor. ANNs were considered in diagnosis of the causes of prestressed concrete damages (Tam, Tong, Lau, & Chan, 2004), and in predicting the quality of injected plastic parts (Sadeghi, 2000). Sarimveis, Doganis, and Alexandridis (2006) considered ANN and FST for classifying the product quality in paper manufacturing. Achiche, Baron, Balazinski, and Benaoudia (2007) characterized wood chip quality online to optimise a thermo mechanical pulp process through use of a fuzzy decision support system (FDSS) based on GA learning.

4.2.4. Parameter optimisation
Computer and electronic product manufacturing industries made use of ANN for simultaneously optimising multiple responses in an ion implantation process (Hsieh & Tong, 2001). GA was used for parameter optimisation of multiple quantum well avalanche photodiodes (Kim, Oh, Lee, Lee, & Yun, 2001). Taguchi method (TM) was considered in Teng and Hwang (2007) to develop more accurate predictions of the package warpage in electronic packages manufacturing. To optimise multiple quality responses in a surface mount electronic assembly operation, Lu and Antony (2002) utilised fuzzy rule based TM. Hung (2007) employed TM in combination with ANN and GA to optimise wire bond design parameters. Moreover, RSM was used for robust design of VLSI circuits (Ilumoka, 1998) and for optimising the manufacturing parameters in light-emitting diode (LED) packaging processes, where association rules were also used for dynamic parameter adjustments (He, Li, & Qi, 2007).
Metal product manufacturing industries employed GA for optimisation of a grinding process (Brinksmeier et al., 1998), optimisation of spot welding parameters (Hamedi et al., 2007; Tseng, 2006), for determination of cutting parameters in machining operations (Cus & Balic, 2003), for obtaining optimal process conditions in pressure die casting (Krimpenis et al., 2006), and for optimising burr size in drilling stainless steel (Karnik et al., 2007). Chong, Albin, and Jun (2007) used patient rule induction method (PRIM) based rule induction for determining optimal settings of process variables in a steel making process. ANN was used to control a
test analysis of portable electronic products (Zhou, Nelson, Xiao, hot rolling mill (Cser et al., 2001), to optimise a metal inert gas
Tripak, & Lane, 2001) and for classification in integrated circuit welding process (Meng & Butler, 1997; Tay & Butler, 1997), to opti-
manufacturing (Maimon & Rokach, 2001; Rokach & Maimon, mise cutting parameters in a turning process (Zuperl & Cus, 2003),
2006). Lu (2001) proposed ways of first reducing the massive data to optimise a steel manufacturing process (Liu, Tang, Fan, & Deng,
sets into smaller size data, then using traditional methods such as 2004), and to optimise a sintering process (Zhang, Xie, & Shen,
ANN for identifying and classifying semiconductor and electronics 2007). On the other hand, Lin and Wang (2000) used simulated
process quality problems. annealing (SA) for a surface roughness and error of roundness
Classification applications in the metal product manufacturing study in turning operations, Olabi et al. (2006) used TM for opti-
industries were not as many as those of prediction. DTs were used mising a laser welding process, Lee and Dornfeld (2007) used TM
in an aluminium coating process (Baek, Kim, & Kim, 2002), in in a cutting process. In optimising electrical discharge machining
assembly of sheet metal products (Lian et al., 2002), in casting pro- processes, Lin, Wang, Yan, and Tarng (2000) and Lin, Lin, and Ko
cesses (Bakır et al., 2006), and in assembly of automobiles (Wang, (2002) used fuzzy based TM. Furthermore, Lou and Huang (2003)
2007). Furthermore, Shahbaz, Srinivas, Harding, and Turner (2006) employed FL in an automotive coating process.
considered DT and association rules in a fan blade manufacturing Other studies optimising process parameters for use in QI stud-
process. ANN had also some applications; in hot rolling (Cser et ies are as follows: GA was used to optimise process parameters of a
al., 2001), in tinplate manufacturing (De Abajo et al., 2004), in pres- particle board manufacturing process (Cook et al., 2000), to deter-
sure die casting (Krimpenis et al., 2006), and in a cutting process mine the optimal values of process parameters in plastic injection
(Lee & Dornfeld, 2007). On the other hand, Perzyk et al. (2005) em- moulding (Kurtaran et al., 2005; Shen et al., 2007), to minimize
ployed naïve Bayesian classifier (NBC) in a steel casting process warpage of thin shell plastic parts (Ozcelik & Erzurumlu, 2006),
and Tseng et al. (2005) employed RST rule induction in a machin- and to determine parameter settings in a silicon compound manu-
facturing process (Li et al., 2003a). Other optimisation methods vantages. S-based methods used in the classification/predict-
used are, ANN for optimising multiple quality characteristics of a ing quality tasks take the advantage of statistical theory
polymerization process (Chiang et al., 2002) and for optimising which leads to ‘statistical inference’. Nevertheless, the theory
process parameter values in a wax model rapid prototyping pro- they are based on requires distributional assumptions, which
cess (Vosniakos, Maroulis, & Pantelis, 2007). For a slider manufac- may be hard to be validated for multidimensional data. This
turing process, Ho et al. (2006) also considered ANN together with may necessitate the help of a human expert. In contrast to
fuzzy rule sets (FRS), and expert systems (ESs). Holena and Baerns the S-based methods, computational methods provide ‘com-
(2003) considered sequential quadratic programming (SQP) for putational inference’ without expressing any probability. In
approximating the dependency of propane yield on catalyst com- spite of these shortcomings, it is easier with computational
position. MLR and nonlinear programming (NLP) for optimising a methods to develop automatic models without a significant
textiles batch dyeing process was also considered in Köksal, Smith, human intervention.
Fathi, Lu, and McGregor (1998). Other methods that are found to be more successful than
MLR are FAN (Jiao et al., 2004), RST (Tseng et al., 2005),
4.3. Reported performance of the DM methods TSA, CBR (Kim & Lee, 1997), and GSA combined with ANN
(Yin & Yu, 2006). Findings of the other comparative studies
The reviewed applications can be categorised into three groups for prediction related to the ANN can be listed as follows:
according to the performance of the DM methods reported by ANN with multi-layer perceptron (MLP) and BP outper-
them: forms RBF (Hamedi et al., 2007), but requires long com-
puting time.
1. In the majority of the studies, one or more of the method(s) ANN with multiple inputs is more successful than ANN
are used to accomplish the aimed quality task(s). In such with a single input (Mathews & Shunmugam, 1999).
studies, usually successful results are reported. These typi- According to Guessasma et al. (2004) for an ANN, the
cally involve a DM method such as ANN (e.g. Cool et al., most adequate architecture, learning paradigm, transfer
1997; Suneel et al., 2002; Yang et al., 2005), DT (e.g. Mieno function and the error function are the multilayer normal
et al., 1999; Tsuda et al., 2000), NLR (e.g. Feng & Wang, feedforward quick propagation, sigmoid functions and the
2002), FR (e.g. Ip et al., 2003; Xue et al., 2005), Bayesian neu- MSE, respectively.
ral network (BNN) (e.g. Vasudevan et al., 2002) or IFN (e.g. The most apparent finding based on a comparative study
Last & Kandel, 2001) for predicting quality; DT (e.g. Chien for classification related to the ANN is stated by Perzyk
et al., 2006; Kang et al., 1999), probabilistic neural network et al. (2005) as follows:
(PNN) (e.g. Tam et al., 2004), RST (e.g. Kusiak & Kurasek, NBC, an S-based method improved through Bayesian clas-
2001) or NFS (e.g. Yang & Tsai, 2002) for classifying quality; sifier (BC) by assuming class conditional independence,
TM (e.g. Chang et al., 2007; Teng & Hwang, 2007), GA (e.g. performs better than ANN.
Cus & Balic, 2003) or ANN (e.g. Hsieh & Tong, 2001) for In addition to ANN, performance of DT is also compared to
parameter optimisation; apriori (e.g. Da Cunha, Agard, & those of other methods. DT is found to be more successful
Kusiak, 2006) and an agglomerative clustering method (e.g. than MLR/GLZ for prediction/classification (Bakır et al.,
Hu & Su, 2004) for quality description. 2006; Skinner et al., 2002). The commonly used DT meth-
Alternatively, a method is used for achieving several quality ods (such as C4.5, C5.0) can deal with missing and contin-
tasks. For example, DT is used both for describing and classi- uous type data. In addition, these methods generate if-
fying quality (e.g. Baek et al., 2005); ANN is used for both then type of rules that are not provided by ANN methods.
predicting quality and parameter optimisation (e.g. Chiang However, they also suffer from ‘overfitting’ that may
et al., 2002) and also used for both classifying and predicting occur due to modelling the noise in the training data.
quality (e.g. Sadeghi, 2000). Other methods compared to DT for classification and their
In some other studies, several DM methods are used together respective performances are stated as follows:
to accomplish the quality tasks. For example, while ANN is GP is more successful than DT (Li et al., 2006).
used for predicting quality as a part of a parameter optimisa- Attribute decomposition approach (ADA), NBC, and DT
tion study, GA (e.g. Karnik et al., 2007; Shen et al., 2007), TM have the best, better and the worst performances, respec-
(e.g. Olabi et al., 2006), RSM (e.g. Ilumoka, 1998) or SQP (e.g. tively (Maimon & Rokach, 2001).
Holena & Baerns, 2003) is used for finalising the optimisa- Breadth-oblivious-wrapper (BOW) is more successful
tion. In another application, after SOM is used for classifying than NBC and DT (Rokach & Maimon, 2006).
quality, ANN is used for parameter optimisation (e.g. Cser et All DT algorithms mentioned above perform well for small
al., 2001). to medium sized databases. There are also algorithms
2. In another group of studies, several well-known DM meth- developed particularly for large databases such as SLIQ
ods are used as alternatives for accomplishing the same and SPRIT, which might be used in future QI studies.
function and their performances are compared. The most fre- Other findings of comparative studies for prediction can be
quently used DM method in this group is the ANN. Its perfor- summarised as follows:
mance is mostly compared to the performance of the MLR is more successful than SVM for prediction (Brudzew-
classical statistical modelling method MLR. These compari- ski, Kesik, Kolodziejczyk, Zborowska, & Ulaczyk, 2006).
son studies indicate that ANNs are more successful than FNN outperforms polynomial network (PN) and ANFIS for
MLR modelling for prediction (e.g. Deng & Liu, 2002; Kim prediction (Ho et al., 2002).
et al., 2003; Rallo et al., 2002). Better performance of ANNs Findings of comparative studies related to optimisation
can naturally be observed in multidimensional data since are as follows:
these are powerful tools in modelling nonlinear relationships ANN with BP learning algorithm is more accurate, but
(Fu, 1994). Yet, there is another study where both ANN and more time consuming than ANN with RBF (Zuperl & Cus,
MLR perform equally well (Dhond et al., 2000). We should 2003).
note here that both S-based and computational methods, PRIM-based rule induction is more robust compared to
such as ANN and DT, have their own advantages and disad- the functional approach (Chong et al., 2007).
   • Grey relational analysis (GRA) is more straightforward than fuzzy-based TM method (Lin, Lin, & Ko, 2002).
   • ANN (Gaussian, RBF, GD) is successful, but quite time consuming (Tay & Butler, 1997).
3. In a relatively smaller group of studies, either a new method is developed to obtain a better performing one (e.g. Baek et al., 2002; Liao et al., 1999a; Liao et al., 1999b) or two or more known methods are used in combination to improve the performance. Findings related to the combined methods are given as the following:
   • FST and RST combined is more successful than RST alone for classification (Hou & Huang, 2004).
   • SE is more successful than spatial statistics integrated with adaptive resonance theory neural network (ARTNN) for clustering (Hsu & Chien, 2007).
   • Support vector classifier (SVC) and DT combined is more rapid and SVC has better generalisation properties for classification (Jemwa & Aldrich, 2005).
   • GSA in combination with ANN is more successful than MLR for prediction (Yin & Yu, 2006).
   • CMC (boosting and stacked generalisation) is more successful than the performances of individual methods DT, ANN and SOM for classification (Braha & Shmilovici, 2002).
   • Hybrid DT and ANN classifier (HDTNNC) performs better in terms of optimal time and accuracy compared to EN, DT and ANN for classification (Georgilakis & Hatziargyriou, 2002).
   • Fuzzy-rule based TM is found successful for optimisation (Lu & Antony, 2002).
   • ANN used in combination with GA (e.g. Hamedi et al., 2007; Krimpenis et al., 2006; Vosniakos et al., 2007), used in combination with TM (Olabi et al., 2006) and used in combination with RSM (Ilumoka, 1998) produce good results.

In spite of the long list of DM methods used for different DM functions, there are still some methods that have not found a use in QI applications. Among these, robust regression (RR) and multivariate adaptive regression splines (MARS) are worth mentioning, since they are found to be successful, especially for modelling complex relationships (Kartal, 2007).
Knowledge obtained by DM applications might be hard to interpret and put in use, especially by industry people (Harding, Shahbaz, Srinivas, & Kusiak, 2006; Kusiak, 2006; Wang et al., 2007a). However, this has not been a concern raised strongly by the applications, since most of these applications were made by “experts” as can be deduced from the authors’ academic identities.

4.4. Patterns of DM function and method usage in quality tasks

The ordering (or hierarchy) of DM functions as well as DM methods used to perform a quality task is referred to as ‘patterns’. Different patterns followed in the applications are analysed and findings are presented in this section.
An analysis of DM function usage for each quality task (Fig. 2) indicates the following:

• Prediction is the most common (44%) DM function used in the applications.
• Classification is the second most commonly (25%) used DM function, and it is typically used (86%) for classification of quality. It is rarely considered for description of quality (8%), predicting quality (4%), and for parameter optimisation (1%).
• Clustering accounts for 15% of all DM function usage, and it is commonly used for quality description (37%) and predicting quality (39%). It is used relatively less for classification of quality (22%). Besides, there is only one study (2%) that utilises the clustering function for parameter optimisation.
• Optimisation is only used in performing the quality task parameter optimisation.
• Association is the most rarely used DM function (1%) compared to the others, and this is used for both description and classification of quality.

Table 9 presents the number of times each DM method is used to perform a DM function for a quality task in either a stand-alone or a combined manner. If a DM method is used as a part of a combined/hybrid approach to accomplish a DM function indirectly serving the eventual goal of the study, this method is counted in parentheses in the table. With the help of this type of display, one can easily see which (how many) DM functions (methods) are involved in performing each quality task directly or indirectly. Table 9 does not contain the summarisation DM function, since it is explicitly stated in only two of the applications (Chien et al., 2007; Ordieres Meré et al., 2004) where summarisation techniques such as descriptive statistics and scatter plot matrix are used to explore the data.
Regarding the description of quality task, it can be observed from Table 9 that clustering, especially by SOM, PCA, agglomerative clustering, and k-means (e.g. Batmaz, 2007; Sebzalli & Wang, 2001; Skinner et al., 2002) and classification, by DT and RST (e.g. Bertino et al., 1999; Huang et al., 2006) were relatively common. Furthermore, this task is generally followed by a classification of quality task using, for example, DT or SVM (e.g. Baek et al., 2005; Brudzewski et al., 2006), or by a predicting quality task using ANN or DT (e.g. Chiang et al., 2002; Chien et al., 2007). Predicting quality and classification of quality are the most frequent quality tasks observed in the literature. Typical classification/prediction methods are generally used correspondingly for the tasks classification/predicting quality (e.g. Baek et al., 2002; Bakır et al., 2006, etc./Cherian et al., 2000; Deng & Liu, 2002, etc.). However, it is observed that clustering and/or association methods are also used for classifying/predicting quality (Chien et al., 2007; Shahbaz et al., 2006; Skinner et al., 2002). Additionally, some other applications used clustering methods with classification methods for classifying quality. More specifically, Brudzewski et al. (2006) used a clustering method to determine the number of classes in data and then used this number in the SVM classification method. Hsu and Chien (2007) and Lian et al. (2002) used a clustering method to determine special patterns (groups) in the data, and then used DT methods to identify the root causes that lead to the identification of these groups. Tsai, Chiu, and Chen (2005) used a clustering algorithm in relation with CBR classification method to decrease processing time of the classification, and Sarimveis et al. (2006) used FCM to allocate the input data into different clusters and then an ANN structure is automatically formulated by assigning each cluster to a hidden layer node. Clustering methods also had some use with prediction methods for the predicting quality task. For instance, Ordieres Meré et al. (2004) determined nonhomogeneous groups by clustering methods and then developed different models such as MLR and ANN, for each group separately, whereas Rallo et al. (2002) buried a clustering method into the prediction methods (i.e. fuzzy predictive adaptive resonance theory neural network (Fuzzy ARTMAPNN), dynamic radial basis function neural network (Dynamic RBFNN)). On the other hand, some DM methods were used in combination for classification of quality. For example, ANN and DT (e.g. Georgilakis & Hatziargyriou, 2002), DT and SVM (Jemwa & Aldrich, 2005), RST and linear programming (LP) (Kusiak, 2000), and RST and FST (Hou & Huang, 2004) were used in combination.
Some patterns have also been observed in the parameter optimisation studies. For instance, several studies directly used ANN
13460 G. Köksal et al. / Expert Systems with Applications 38 (2011) 13448–13467
DM functions Approach Method Description of quality Classification of quality Predicting quality Parameter optimisation
Clustering S-Based PCA (1) 1 (1) (2)
NN-Based SOM (1) 1 (1)
Others Agglomerative 2 1 (4)
FCM (1) (2) (2)
k-means 2 (1) (3)
Modified k-means 1 (2)
MTM 1 (1)
PAM 1 (2)
Rule Induction (1)
SE (1) (1)
Single linkage 1
Spatial Stats integrated with ART ANN (1) (1)
Unspecified 1 (3)
Association Others Apriori 1
Unspecified 2 1
Classification S-Based GLZ 2
DT-Based C4.5 1 9
C5.0 1
CART 1 1
ID5R 1
Statistical Batch Based 1
Unspecified 1 7 (1)
NN-Based BNN 2
LVQ 1 (1)
MLP with BP (GD) 5
MLP with BP(LM) 1
Others ADA 1
Boosting 1
DNN-Based 1
DT and ANN 2
EN 1
FDSS based on GA 1
FST and RST 1
GA and RST 1
GP (1)
RST 2 6
RST and LP 1
SVM and DT 1
TM and ANN 1
Prediction S-Based ANOVA 1
BBN 1 1
MLR 16 (1)
NLR 6 (1)
DT-Based CART 4
ID3 1
Unspecified 2
NN-Based ADNN 1
ComputNN 1
GRNN 3 (2)
MANN 1 (1)
MLP with BP(GD) 29 (2)
MLP with BP (SA) 1
MLP with BP(CG) 1
MLP with BP(LM) 3 (3)
MLP with BP (unspecified) (6)
PN 1
RBF 5 (2)
Recurrent NN Feedforward BP 1
Unspecified 6 (3)
Others AN 1 (1)
FR 3
FST (1)
FST and ANN 1
GP 1
SO and MLP 1
TM and ANN 1
Optimisation S-Based RSM 1
TM 3
NN-Based ADNN 1
MLP with BP 4
Unspecified 3
Others ANN, Fuzzy Rule Sets and ES 1
ANN and GA 13
ANN and TM 1
TM, ANN and GA 1
TM and FL 3
RSM and MANN 1
Fuzzy Rule Sets 1
GA 1
GA and NLR 1
GA and FST 1
NLP and MLR 1
NLP and ANN 1
PRIM-Based rule induction 1
SA and ANN 1
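The counting convention described for Table 9 (a plain count when a method directly performs a DM function for a quality task, and a count in parentheses when it is used indirectly within a combined/hybrid approach) can be illustrated with a small tally. The Python sketch below is only an illustration under stated assumptions: the record layout, the `direct`/`indirect` role labels, the function names, and the sample records are all hypothetical and are not data from the reviewed studies.

```python
from collections import defaultdict

# Hypothetical usage records, invented for illustration only; they are NOT
# taken from the reviewed studies. Each record is
# (quality task, DM function, DM method, role), role in {"direct", "indirect"}.
RECORDS = [
    ("Predicting quality", "Prediction", "MLR", "direct"),
    ("Predicting quality", "Prediction", "MLR", "direct"),
    ("Parameter optimisation", "Prediction", "MLR", "indirect"),
    ("Predicting quality", "Clustering", "k-means", "indirect"),
    ("Description of quality", "Clustering", "k-means", "direct"),
]


def tally(records):
    """Count direct and indirect uses per (function, method, task) cell."""
    counts = defaultdict(lambda: {"direct": 0, "indirect": 0})
    for task, function, method, role in records:
        counts[(function, method, task)][role] += 1
    return counts


def format_cell(cell):
    """Render a cell as a plain direct count followed by the
    indirect count in parentheses; zero counts are omitted."""
    parts = []
    if cell["direct"]:
        parts.append(str(cell["direct"]))
    if cell["indirect"]:
        parts.append(f"({cell['indirect']})")
    return " ".join(parts)


if __name__ == "__main__":
    for key, cell in sorted(tally(RECORDS).items()):
        print(*key, format_cell(cell), sep=" | ")
```

With these hypothetical records, the cell for MLR used for prediction in the predicting quality task is rendered as `2`, while the cell for MLR used for prediction within a parameter optimisation study is rendered as `(1)`.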
for parameter optimisation (e.g. Hsieh & Tong, 2001; Meng & Butler, 1997; Zhang et al., 2007), while some others considered ANN for predicting the fitness function values in optimisation by GA (e.g. Brinksmeier et al., 1998; Hamedi et al., 2007; Kim et al., 2001; Ozcelik & Erzurumlu, 2006). In addition, ANNs were also used for predicting the response variable in optimisation by RSM (Ilumoka, 1998), SA (Lin & Wang, 2000), NLP (Lou & Huang, 2003), and TM (Hung, 2007; Olabi et al., 2006). Similarly, it was observed that TM for parameter optimisation was either used alone (e.g. Chang et al., 2007; Teng & Hwang, 2007) or combined with FL (e.g. Lin, Wang, Yan, & Tarng, 2000; Lu & Antony, 2002).
The use of more efficient clustering methods such as CLARA or CLARANS for handling large data sets has not been observed in the reviewed applications. Likewise, advanced clustering methods, such as density-based (e.g. DBSCAN, DENCLUE), grid-based (e.g. STING, WaveCluster), and combined methods (e.g. CLIQUE that adopts both grid- and density-based methods) have not been used. This might be due to the fact that these advanced methods require more elaborate approaches.

4.5. Software used for mining QI data

Different computational tools shown in Table 10 have been used with various purposes in mining quality data. These include spreadsheets, databases, statistical software, DM software, special purpose software, and high-level languages.
Well known statistical software packages (MINITAB™, SAS™, SPSS™ and STATISTICA™) were preferred for implementing S-based methods such as MLR, GLZ, ANOVA and PCA. Similarly, for data preparation and preprocessing, some studies preferred the well known spreadsheet application Excel™, or the database management systems MS ACCESS™ or ORACLE™. However, the other studies mostly utilised the general purpose software MATLAB™, high level languages such as C/C++, or various special purpose software packages. DM software packages such as SPSS Clementine™ and SAS/EM™ were only used in a few of the applications.
MATLAB™ was used in various applications for clustering, prediction, classification and optimisation, due to its appropriateness for developing domain specific solutions, supported by several open source toolboxes. Even though scalability might not be achieved with MATLAB™ (Kwon, Omitaomu, & Wang, 2008), the sizes of data in the QI applications were not too large to present this as an issue.
In some applications of DM methods, e.g. ANN and DT, the programs were developed from scratch by using high level languages such as C/C++ for flexibility in handling and analysing the data according to the particular methodologies followed by the
Special purpose software, on the other hand, may be beneficial and more affordable, if the quality problem is clearly defined and understood. For example, BrainMaker™ is a popular ANN tool which can be used with different data sources. It provides an optional package based on GA to assist users to find the best network. NeuralWorks™ and NeuroShell™ are the other special purpose software preferred for ANN modelling and analysis. For association analysis, Mine Set™ and Q-Yield™; for fuzzy applications, FuzzyTECH™, FormRules™ and Fuzzy-Flou; for clustering, Rosetta; and for DT applications, C5.0 were used. Furthermore, some special purpose software such as ProCAST™ and ATLaS allowed collecting data through simulation of the experimental conditions. These data were then analysed using some other appropriate software for the chosen DM methods.
Use of commercial or open source DM software seems to be limited compared to the general purpose and special purpose ones. Among them, SPSS Clementine™, SAS/EM™ and TANAGRA were used. SPSS Clementine™ and SAS/EM™ are the most complete packages and both dominate the market (Dunham, 2003; Goebel & Gruenwald, 1999; Haughton et al., 2003; Rexer Analytics, 2008). Their limited use observed within the scope of this review can be attributed to their being less affordable compared to general purpose and special purpose software packages, as well as their limitations in handling the quality problems in an unconventional way, which might be preferred by expert users.

5. Conclusions and future research directions

This paper presents the results of a comprehensive review of DM applications in manufacturing industries to the selected QI problems. Based on these results, the following concluding remarks can be stated, each remark addressing a research question of this study listed in the first section:

1. Data characteristics and data handling: An interesting observation regarding the DM applications in the literature is the amount of records used in the studies. Although some studies reported applications using over 10,000 records, in many cases the number of records was less than 1000, typically collected through statistically designed experiments. Even though effectiveness of most DM methods depends on the size of the data set, this has not been reported as a concern in the applications based on small data. Furthermore, in a considerable number of the cases, separately stored production and quality data had to be combined to obtain an appropriate data set. Manufacturing organisations with well established and integrated data collection systems benefit to a larger extent from DM.
   In many of the studies reviewed, data preprocessing and the use of some descriptive DM methods are not reported at all, even
   though the success of a DM study depends heavily on the success in preprocessing of the data. The comprehensive summary we provide on data handling practices can be used as a reference for future studies.
2. DM applications according to quality tasks and manufacturing industries: It has been observed that there is an increasing trend in the use of DM algorithms for QI. This can be attributed to the availability of massive data sets in some manufacturing domains and the need to improve processes with the intense competition in the industries.
   Analysis of the literature also indicates that DM applications within the context of this study were mostly encountered in the metal, computer and electronic products manufacturing industries, and relatively less observed in plastics, glass, paper, food processing and chemical manufacturing industries. DM applications can find more application domains in the other industries. However, technical training requirements to successfully implement a DM study and costs of software may be two major factors that prevent widespread use of the methods.
   In the literature regarding product and process quality description, knowledge extracted from these studies was frequently used to accomplish the eventual goals such as quality classification, prediction, and/or optimisation. Many applications that involve the predicting quality task were again from the metal, computer and electronic products manufacturing. Furthermore, ANN was the most widely used DM method in those domains. Applications involving classification of quality were not as many as those in the predicting quality category. ANN and DT were common methods in those applications. Parameter optimisation tasks were generally performed after predictive models were developed for a quality output. These tasks mostly aimed for finding optimal levels of process/product parameters that consistently yield target quality. ANN, GA, and TM were commonly used in parameter optimisation.
3. Reported performance of the DM methods: Performance comparisons of a few DM methods with each other, as well as with traditional statistical methods, are also common in the literature. With a relatively small number of records and variables, traditional statistical methods may still provide valuable information.
   In many of the comparative studies, for prediction ANN, DT, FAN, RST, TSA, CBR and GSA used in combination with ANN were found more successful compared to MLR. On the other hand, for classification of quality, NBC, GP, BOW, SVM, ADA and CMC outperformed ANN or DT. A few studies comparing performances of ANN and DT indicated that ANN performed better than DT for both prediction and classification. Moreover, findings support the fact that individual classifiers’ accuracy can be improved by combined/hybrid use of different DM methods. Furthermore, for parameter optimisation, ANN used in combination with GA was found to be successful.
   In QI studies, robustness (to noise in data, mixed types of variables and complexity) and ease of modelling are important considering that the practitioners are typically not experts in data handling and DM. Even though there exist potentially useful approaches such as RR and MARS, which are strong in these respects, they have not found a use in the reviewed literature, yet. Although it appeared only in a few studies, FR also seems to be a promising approach for modelling quality data based on measurements involving uncertainty or semantic scale usage due to SE involved in their collection procedure.
   Interpretation and usage of the knowledge obtained by DM applications might be hard. However, the application studies, mostly performed by academic experts, have not given this problem due consideration.
4. Patterns of DM function and method usage: DM methods are either used as stand-alone or more interestingly as combined (hybrid). Prediction is the most frequently used DM function observed in the literature, alone or as a prior/integral part of optimisation. In most of the applications ANNs were used for prediction purposes, as well as MLR in a considerable number of the others. DTs or ANNs, on the other hand, were preferred for classification. For optimisation, several studies directly used ANNs while some others considered ANN for predicting the fitness function values in optimisation by GA.
5. Software tools: Among various computational tools used in the reviewed applications, well known statistical software packages such as MINITAB™, SAS™, SPSS™ and STATISTICA™ were preferred for statistical-based methods. Similarly, well known spreadsheet applications (Excel™) and database management systems (MS ACCESS™, ORACLE™) were utilised for data handling. The other studies mostly preferred the general purpose software MATLAB™, high level languages such as C/C++, or various special purpose software packages, for their flexibility and adaptability.
6. Other conclusions and directions for future studies: An important necessity for industrial applications is speed in learning and developing the DM models and their solutions, due to continuously changing customer and technical requirements in manufacturing industries. 6σ projects have been largely successful mainly because of successful training of the industry people for the use of the methods with the help of affordable and user friendly software. For more effective and wide-spread use of the DM approaches, DM software with comparable qualities needs to be developed. This, in turn, necessitates availability of more robust, easy to learn and implement data handling and DM approaches for QI problems. Such software should also have the capability to help users to select the most appropriate methods for the problem, and to interpret the results obtained from the applications. This study can guide researchers and software producers in their effort to develop/further improve their methods and tools by providing them with critical information on typical characteristics of QI data collected, necessary/most preferred DM functions and methods, and expected results.

Acknowledgements

This study is supported by Turkish Scientific and Technological Research Institute through the contract 105M138 and by Middle East Technical University through the contract BAP-2006-07-02-06. The authors thank both project teams, especially Fatma Güntürkün and Berna Bakır, for their contributions to finding and classifying the relevant survey papers.

Appendix A. Abbreviations

ADA        attribute decomposition approach
ADNN       adaptive neural networks
AI         artificial intelligence
AN         abductive network
ANFIS      neural fuzzy inference systems
ANN        artificial neural network
ANN-based  artificial neural network based
ANOVA      analysis of variance
