
Statistical methods of the bankruptcy prediction in the logistics sector in Poland and Slovakia
Article in Transformations in Business and Economics · April 2016
3 authors:
Tomasz Pisula, Rzeszów University of Technology
Jacek Brożyna, Rzeszów University of Technology
Grzegorz Mentel, Rzeszów University of Technology

J. Brozyna, G. Mentel, T. Pisula
ISSN 1648-4460
Simulation and Evaluation of Business Economic Factor
TRANSFORMATIONS IN BUSINESS & ECONOMICS
© Vilnius University, 2002-2016
© Brno University of Technology, 2002-2016
© University of Latvia, 2002-2016
Brozyna, J., Mentel, G., Pisula, T. (2016), "Statistical Methods of the Bankruptcy Prediction in
the Logistics Sector in Poland and Slovakia", Transformations in Business & Economics, Vol. 15,
No 1 (37), pp.93-114.
STATISTICAL METHODS OF THE BANKRUPTCY PREDICTION IN THE LOGISTICS SECTOR IN POLAND
AND SLOVAKIA
1Jacek Brozyna
Department of Quantitative Methods
Faculty of Management
The Rzeszow University of Technology
Address: Poland
E-mail: jacek.brozyna@prz.edu.pl

2Grzegorz Mentel
Department of Quantitative Methods
Faculty of Management
The Rzeszow University of Technology
Address: Poland
E-mail: gmentel@prz.edu.pl

3Tomasz Pisula
Department of Quantitative Methods
Faculty of Management
The Rzeszow University of Technology
Address: Poland
E-mail: tpisula@prz.edu.pl
1
Jacek Brozyna was awarded a Master of Science degree in Engineering in 2001, a
Bachelor's degree in Marketing and Management in 2003, and a PhD in 2011.
He has been working in the Department of Quantitative Methods of the Faculty
of Management at the Rzeszów University of Technology since 2002. He is the
author of more than a dozen publications on the use of statistical and computer
methods in engineering and financial issues. He is a specialist in data
analysis and forecasting.
2
Grzegorz Mentel has worked in the Department of Quantitative
Methods of the Faculty of Management at the Rzeszów University of
Technology since 2000. He has been working as a lecturer since 2007.
In the years 2010-2012, he was the branch manager of Bank
Pocztowy S.A. in Rzeszow. He is the author of 43 publications in the
field of finance and capital markets, including 2 monographs about the
risk of financial instruments. He is a specialist in the risk management
of securities, fundamental analysis, technical analysis, multivariate
analysis and forecasting.
3
Tomasz Pisula earned a Master's degree in Mathematics in 1993 and
was awarded a PhD in 1999. He has been
working in the Department of Quantitative Methods of the Faculty of
Management at the Rzeszów University of Technology since 1993. He
is the author of more than ten publications in the field of finance and
capital markets. He is a specialist in the risk management of securities,
fundamental analysis, technical analysis, multivariate analysis and
forecasting.

Received: December, 2014
1st Revision: January, 2015
2nd Revision: February, 2015
Accepted: April, 2015

ABSTRACT. The fundamental subject matter of this publication is the analysis of the issue of
bankruptcy in the context of the appearance of possible threat signals. The presented research
aims to demonstrate the validity of the described models in predicting possible bankruptcy
signals and in evaluating the financial condition of entities from the TSL sector (transport,
spedition, i.e., freight forwarding, and logistics) in Poland and Slovakia. In order to predict
the bankruptcy risk of companies from the logistics sector, the following statistical models of
bankruptcy classification were used: classic linear discriminant analysis and logistic
regression. In addition, predictions based on the so-called classification trees and the method
of nearest neighbours were applied. The empirical verification of the correctness of
classification by the given groups of statistical bankruptcy analysis methods showed that these
methods are characterized by a high quality of bankruptcy prediction. The presented concepts
make it quite easy to evaluate the threat of bankruptcy for a given group of entities. One vital
advantage of the presented results is that the research sample was divided into the so-called
learning group, for which the parameters of the analysed models were estimated, and the test
sample, used to examine the effectiveness of correct classifications. All predictions were made
for a period of both one and two years before the bankruptcy.
KEYWORDS: bankruptcy prediction, artificial intelligence, risk modelling.
JEL classification: G33, F3.
Introduction
During recurring financial crises or periods of lesser or greater market turmoil, the
stable existence of a company and an income that grows with each year are often a
vision that cannot be brought to life. A large number of companies struggle with many
problems, from difficulties in obtaining a bank loan to bad debts. Certain
situations will eventually reveal the first symptoms of a company's crisis, which in the
course of time may lead to bankruptcy.
Bankruptcy itself is not something that appears suddenly; it is a process taking
place over a longer period of time. Hence, it is possible to observe the worsening financial
condition of the entities. Defining the above-mentioned symptoms of an incoming company
crisis can therefore help to detect it in advance. Such an approach is of great importance because it
provides the company with time to take proper measures.
Research on bankruptcy is very important. Therefore, all measures and
methods that make it possible to predict possible threats are highly desirable. However, it should be
remembered that the prepared research and applied predictive models should not introduce
excessive aversion to the conducted actions. Thus, publications of this type and attempts to
implement some models may give managers a tool for making better decisions.
In this article, four types of models from both the statistical and non-statistical groups of
models are described. Entities from the logistics sector active in the Podkarpacie region and
others active in the Slovakian market were analysed. The companies from this sector are of
interest to the authors of this study because of the sector's importance for the economy of the
region and the country. An essential division into two groups of companies was made: the
so-called healthy companies, not threatened with bankruptcy, and a group of ill companies, for
which bankruptcy has been announced or insolvency proceedings are being
conducted. The parameters of each model were estimated on a learning sample,
whereas their verification was conducted on the companies classified into the test
sample.
1. The Analysis of Chosen Literature Concerning the Research on the Companies’ Bankruptcy Risk
An exhaustive analysis of works concerning the issue of predicting companies’
bankruptcy risk can be found in the work of Kumar, Ravi (2007). The authors analysed 128
publications on this issue, which were
published in the period from 1968 to 2005. The publications were analysed from
the perspective of the use of statistical methods and artificial intelligence methods to solve
problems related to predicting companies’ and banks’ bankruptcy risk.
Most publications concerned research on the bankruptcy risk of companies (both listed
and non-listed). Out of the 128 analysed publications, only a dozen concerned research on bank
bankruptcy risk. In some publications, both company and bank bankruptcy models were
researched.
The size of the research sample used by various authors in their research is very diverse
(it ranges from 24 up to even 8977). Similarly, the time periods of the financial data used were
diversified and covered different time periods from 1997 to 2003, with a time horizon from one
year up to even a couple of years.
The research techniques used by various authors are also very diverse. In research
of this type, authors successfully use both statistical methods, such as discriminant analysis,
logistic regression models, decision trees and nearest neighbour methods, and
methods based on the optimization algorithms of operational research or methods of artificial
intelligence, such as neural networks, rough set theory, mathematical programming, genetic
algorithms, etc.
Source: created by the authors.

Figure 1. The Division of Research Methods Used for the Risk Assessment of Companies’ Bankruptcy

The statistical techniques most commonly used to research companies’ bankruptcy are
based on discriminant analysis, logit models and decision trees. Nowadays, they are very
rarely used as the sole model and only research method. They are used rather as a comparison
model in relation to other, non-statistical models, or as component models in hybrid
approaches. The division of research models used to predict bankruptcy is presented in
Figure 1.
One vital contribution to the application of statistical methods to predict companies’
bankruptcy is the work of Altman et al. (1977), in which the authors introduced for the first time
a new model for classifying bankrupt companies, which they named “Zeta analysis”. The
model was estimated on the basis of data from 111 companies and included 7 diagnostic
variables. The general effectiveness of correct classifications for this model was 96% for the data
one year before the bankruptcy period and only 70% for the data 5 years before the bankruptcy.
The use of statistical methods to evaluate bankruptcy risk can also be found in the
works of many other authors. Martin (1977) estimated a logistic regression model for predicting
bank bankruptcy on the basis of data derived from the USA Federal Reserve Bank.
Ohlson (1980) proposed a model of logistic regression to scrutinise company’s
bankruptcy risk. Financial data were taken from the databases of a few financial institutions
(i.e. Moody’s, COMPUSTAT). The effectiveness of proper classification for data one year
before bankruptcy was more than 96%, while for the data two years before the bankruptcy,
more than 95%.
Karels, Prakash (1987) researched the application of linear discriminant analysis
models to bankruptcy risk from the angle of fulfilling or not fulfilling the required
assumption of normality of distribution for the financial indicators that serve as bankruptcy
predictors. The model estimated by them on the basis of a random sample of 50 companies (data
from the COMPUSTAT database) had an effectiveness of correct classifications at the
level of 96% for the non-bankrupt companies and 55% for bankrupts.
Kolari et al. (2002) introduced the so-called system of early warning for bankruptcy risk
for large banks in the USA. The system was based on the logistic regression models and had
general effectiveness of correct classifications of 96% and 95%, respectively, for the data one
year and two years before bankruptcy period.
Jones, Hensher (2004) presented the so-called Mixed Logit Model to predict the
bankruptcy of a company. It was a three-state logit model. They examined the model with three
states: 0 – company not threatened with bankruptcy; 1 – company threatened with bankruptcy;
2 – company that has announced bankruptcy. They proved that the model estimated by them
has better predictive properties than a classic multi-state logit model. The application of
decision trees for bankruptcy classification problems can be found, for example, in the works
by Marais et al. (1984) and Frydman et al. (1985).
Bankruptcy risk analysis based on statistical and non-statistical models is
applied to companies from all over the world. For example, Tseng, Hu (2010) use four techniques
(logit model, quadratic interval logit model, backpropagation multi-layer perceptron and radial
basis function network) to predict bankrupt and non-bankrupt firms in England. Choi, Lee
(2013) use the back-propagation neural network and multivariate discriminant analysis
to present a multi-industry investigation of the bankruptcy of Korean companies. Fedorova et al.
(2013) apply combinations of modern learning algorithms to identify the most effective
approach to bankruptcy prediction for Russian manufacturing companies. Korol
(2012) compares the effectiveness of twelve models for forecasting the bankruptcy risk of Stock
Exchange companies from Central Europe and Latin America.

2. Characteristics of Financial Factors and Research Samples Used to Predict the Bankruptcy of
Logistics Companies
Information about bankruptcies of Polish companies was taken from the bankruptcy
database of Polish companies, i.e., Corporate Database EMIS information system (Emerging
Markets Information Service).
In order to predict the bankruptcy of logistics sector companies, 28 financial indicators
characterizing the financial condition and managing effectiveness of researched companies
have been chosen as bankruptcy predictors. The indicators have been divided into 5 groups:
financial liquidity indicators, profitability indicators (return on sales), indebtedness indicators
and financial leverage, operating effectiveness (proficiency) and other indicators of capital-
material structure of a company.
Statistical data of financial indicators for the Polish companies were taken from the
financial reports of companies. The following financial indicators were chosen for the research:
• Liquidity indicators (*100%):
  X1 - CURRENT LIQUIDITY INDICATOR: Current assets/Short-term liabilities;
  X2 - FAST LIQUIDITY INDICATOR: (Current assets – Stock)/Short-term liabilities;
  X3 - LIQUIDITY INDICATOR (KO/SB): Circulating capital (working capital)/Balance sheet total
       = (Current assets – Short-term prepayments and accruals – Short-term liabilities)/Balance sheet total;
  X4 - IMMEDIATELY DUE INDICATOR: (Current assets – Stock – Short-term receivables)/Short-term liabilities;
  X5 - CASH LIQUIDITY INDICATOR: Cash and cash equivalents/Short-term liabilities.
• Profitability indicators (*100%):
  X6 - OPERATING PROFIT MARGIN: Operating result (operating profit or loss)/Net sales income;
  X7 - PROFITABILITY: Net profit/(Equity capital – Net profit);
  X8 - RETURN ON ASSETS (asset profitability) (ROA): Net profit/Balance sheet total;
  X9 - RETURN ON EQUITY (profitability of equity capital) (ROE): Net profit/Equity capital;
  X10 - RETURN ON CAPITAL: Net profit/(Assets in total – Short-term liabilities);
  X11 - NET SALES PROFITABILITY (ROS): Net profit/Net sales income;
  X12 - GROSS PROFIT MARGIN: (Net income from sales of goods and products and equal to them –
        Operating expenses)/Net income from sales of goods and products and equal to them.
• Indebtedness indicators and financial leverage effect (*100%):
  X13 - GENERAL DEBT: (Short-term liabilities + Long-term liabilities)/Balance sheet total;
  X14 - DEBT ON EQUITY: Total liabilities/Equity capital;
  X15 - DEBT: (Equity capital + Long-term liabilities)/Fixed assets;
  X16 - ASSETS DEBT: Short-term liabilities/Balance sheet total;
  X17 - DEBT: Gross profit/Short-term liabilities;
  X18 - DEBT: (Net profit + Depreciation)/Total liabilities;
  X19 - LONG-TERM DEBT: Long-term liabilities/Equity capital;
  X20 - FINANCIAL LEVERAGE: Assets total/Equity capital;
  X21 - LEVERAGE (DEBT/COMPANY VALUE): Total liabilities/(Equity capital + Total liabilities –
        Cash and its equivalents).
• Operating effectiveness indicators:
  X22 - RECEIVABLES TURNOVER [in days]: Average short-term receivables/Net sales income * 360;
  X23 - ASSET TURNOVER: Net sales income/Assets * 100%;
  X24 - STOCK TURNOVER [in days]: Stock/Net sales income * 360;
  X25 - CASH CYCLE: Short-term receivables/Net sales income * 365 + Stock/Operating expenses * 365
        – Short-term liabilities (without special funds and other short-term financial
        liabilities)/Operating expenses (without other operating expenses) * 365.
• Financial indicators characterizing the companies’ capital and material structure (*100%):
  X26 - Equity capital/Balance sheet total;
  X27 - Fixed assets (without long-term prepayments and accruals)/Balance sheet total;
  X28 - Fixed assets/Current assets.

The research samples were created on the basis of the collected statistical data. The
dependent variable Y was a qualitative dichotomous variable defining whether a
company had declared bankruptcy (Y=1 – bankrupt) or was not
threatened with bankruptcy (Y=0 – non-bankrupt). The twenty-eight previously characterized
financial indicators were chosen as the set of input variables (bankruptcy predictors).
Two research samples were created. The first one included those bankrupted companies
from the logistics sector, and the healthy companies corresponding to them, for which statistical
data for one year before the bankruptcy period was available (1-year prediction horizon). The
second research sample included the bankrupted and healthy companies for which statistical
data for two years before the bankruptcy period was available (2-year prediction horizon). In each
of the research samples, one corresponding healthy company not threatened with bankruptcy
was chosen for each bankrupted company. In order to select healthy companies, ratio analysis
was applied, which is a generally accepted standard for assessing the functioning
of companies and has been used in practice for many years. Thanks to this thorough ratio
analysis, only those companies from the logistics sector were selected whose indicators
pointed to a good financial condition and the ability to pay their obligations.
The research sample for data one year before the bankruptcy period included 33
bankrupted companies and 33 healthy companies (statistical data for one year before the
bankruptcy was available for only that number of companies), whereas in the case of data for
2 years before the bankruptcy period, there were 57 healthy companies and 57 bankrupted
companies. Each research sample was divided randomly into two samples: the learning sample,
on the basis of which the prediction model parameters were estimated, and the test sample,
used to examine the effectiveness of correct classifications. The learning sample for the one-year
prediction horizon included 47 companies (23 bankrupts and 24 non-bankrupts), whereas the
test sample included 19 companies (10 bankrupts and 9 non-bankrupts). For the two-year prediction
horizon, the learning sample included 86 logistics companies (43 bankrupts and 43 non-bankrupts),
whereas the test sample included 28 companies (14 bankrupts and 14 non-bankrupts).
In order to scrutinize the influence of the chosen explanatory variables on the
explained variable identifying the companies’ bankruptcy, a ranking analysis of predictors was
conducted. A vital issue when choosing predictors is the necessity to
choose only those that have the best prognostic properties in terms of separation,
i.e., distinguishing between the bankrupt and healthy companies. When preparing a ranking of
predictors according to their classifying power, the following coefficients can be used in
practice: Information Value (IV), Gini and Cramer’s V.
IV coefficient, i.e., information value of a predictor is expressed by the formula:
$$IV = \sum_{i=1}^{k} \left( \frac{n_i^{NB}}{n_{NB}} - \frac{n_i^{B}}{n_{B}} \right) \cdot \ln\left( \frac{n_i^{NB} / n_{NB}}{n_i^{B} / n_{B}} \right), \tag{1}$$
where $k$ is the number of attributes (variability intervals) of the examined predictor,
$n_i^{NB}$ - the number of healthy companies for the i-th variability interval of the predictor’s
value, $n_i^{B}$ - the number of bankrupted companies for the i-th variability interval of the
predictor’s value, $n_{NB}$ - the total number of healthy companies, $n_{B}$ - the total number
of bankrupted companies.
The higher the value of the IV coefficient, the higher the predictive power of the
explanatory variable in differentiating between healthy and bankrupted companies.
It is assumed that IV values above 0.3 indicate a strong predictive power, while values
below 0.02 show a complete lack of such predictive power.
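
Formula (1) can be illustrated with a minimal Python sketch. The function name and the interval counts below are hypothetical (they are not data from the study), and the sketch assumes that every variability interval contains at least one healthy and one bankrupted company, since the logarithm is undefined otherwise.

```python
import math


def information_value(healthy_counts, bankrupt_counts):
    """Information Value of one binned predictor, following formula (1)."""
    n_nb = sum(healthy_counts)   # total number of healthy companies
    n_b = sum(bankrupt_counts)   # total number of bankrupted companies
    iv = 0.0
    for n_i_nb, n_i_b in zip(healthy_counts, bankrupt_counts):
        share_nb = n_i_nb / n_nb  # share of healthy companies in interval i
        share_b = n_i_b / n_b     # share of bankrupted companies in interval i
        iv += (share_nb - share_b) * math.log(share_nb / share_b)
    return iv


# Hypothetical counts for k = 5 intervals (24 healthy, 23 bankrupted companies).
healthy = [2, 3, 5, 6, 8]
bankrupt = [9, 6, 4, 3, 1]
iv = information_value(healthy, bankrupt)
print(f"IV = {iv:.2f}, strong predictor: {iv > 0.3}")
```

Each term of the sum is non-negative, so a predictor whose intervals split both groups in the same proportions gets an IV of zero.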

Gini coefficient is based on the Lorenz curve (the so-called ROC curve, i.e.,
Receiver Operating Characteristic). It expresses the ratio of areas on the graph of the ROC curve
(see Figure 2), which is expressed by the formula
$$Gini = \frac{A}{A+B} = \frac{A}{0.5} = 2 \cdot A = 2 \cdot (0.5 - B) = 1 - 2 \cdot B = 1 - \sum_{i=1}^{k-1} (y_{i+1} + y_i) \cdot (x_{i+1} - x_i), \tag{2}$$
where $k$ is the number of attributes (variability intervals) of the examined diagnostic
variable, $y_i = \sum_{j=1}^{i} \frac{n_j^{B}}{n_{B}}$ - the cumulated percent of bankrupts for the
i-th attribute value of the variable, $x_i = \sum_{j=1}^{i} \frac{n_j^{NB}}{n_{NB}}$ - the
corresponding cumulated percent of healthy companies.


It is assumed that the values of the Gini coefficient below 0.35 point out that the
predictor does not have a sufficient classifying ability to distinguish correctly between the
healthy and bankrupted companies.
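
The trapezoidal sum in formula (2) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' computation: the function name and the counts are hypothetical, the curve is assumed to start at the origin (0, 0), and the intervals are assumed to be ordered so that healthy companies accumulate faster than bankrupts.

```python
def gini_from_bins(healthy_counts, bankrupt_counts):
    """Gini coefficient as 1 minus the trapezoidal sum over the cumulated shares."""
    n_nb = sum(healthy_counts)
    n_b = sum(bankrupt_counts)
    x_prev = y_prev = 0.0  # assumption: the curve is anchored at (0, 0)
    cum_nb = cum_b = 0
    trapezoid_sum = 0.0
    for n_i_nb, n_i_b in zip(healthy_counts, bankrupt_counts):
        cum_nb += n_i_nb
        cum_b += n_i_b
        x = cum_nb / n_nb  # cumulated share of healthy companies
        y = cum_b / n_b    # cumulated share of bankrupts
        trapezoid_sum += (y + y_prev) * (x - x_prev)
        x_prev, y_prev = x, y
    return 1.0 - trapezoid_sum


# Hypothetical counts for k = 5 intervals, not taken from the study.
healthy = [8, 6, 5, 3, 2]
bankrupt = [1, 3, 4, 6, 9]
gini = gini_from_bins(healthy, bankrupt)
print(f"Gini = {gini:.2f}, sufficient classifying ability: {gini >= 0.35}")
```

With the intervals in the opposite order the same formula returns a negative value of equal magnitude, so the ordering of the intervals has to be fixed before values are compared with the 0.35 threshold.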
[Figure: ROC curve. Horizontal axis: Xi (cumulated percent of healthy companies); vertical axis: Yi (cumulated percent of bankrupts); the areas A and B are marked on the graph.]
Figure 2. An Example of ROC Curve
Cramer’s V coefficient measures the strength of dependence between the values of the dichotomous
(0-1) dependent variable defining a company’s bankruptcy and the values of the given diagnostic
variable. Values of this coefficient are contained in the interval between 0 and 1. It is based on
the Chi-square independence measure and calculated with the formula
$$V = \sqrt{\frac{\chi^2}{n}}, \tag{3}$$
where $n$ is the number of statistical observations, and $\chi^2$ is the statistic value of the
Chi-square independence test between the 0-1 variable defining a company’s bankruptcy and the
examined indicator (predictor) of bankruptcy.
The higher the value of Cramer’s V coefficient (the closer to 1), the better the predictive
power of the examined indicator in predicting companies’ bankruptcy.
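
As a minimal sketch of formula (3), the Chi-square statistic can be computed directly from a 2 x k table of counts (two classes, k variability intervals). The function name and the example counts are hypothetical, and the sketch assumes that no interval is empty.

```python
import math


def cramers_v(healthy_counts, bankrupt_counts):
    """Cramer's V for a 2 x k contingency table, following formula (3)."""
    rows = [list(healthy_counts), list(bankrupt_counts)]
    row_totals = [sum(row) for row in rows]
    col_totals = [a + b for a, b in zip(*rows)]
    n = sum(row_totals)  # number of statistical observations
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n  # count expected under independence
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / n)


# Hypothetical counts: perfect separation gives V = 1, identical shares give V = 0.
v_separated = cramers_v([10, 0], [0, 10])
v_identical = cramers_v([5, 5], [5, 5])
print(v_separated, v_identical)
```

Because the dependent variable has only two classes, the usual normalising factor min(rows - 1, columns - 1) equals 1, which is why the formula reduces to the square root of the Chi-square statistic divided by n.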
Table 1 and Table 2 present the values of the measures used for the ranking of predictors,
ordered by the value of the information value (IV) coefficient, for the
learning samples (one- and two-year bankruptcy prediction horizons).

Table 1. Indicators for rating predictors for the learning data, estimated on the basis of one year before the
bankruptcy period
Predictor   IV     Cramer’s V   Gini    Predictor   IV     Cramer’s V   Gini
X16         2,64   0,7          0,75    X13         1,4    0,8          0,82
X14         2,47   0,68         0,01    X6          1,37   0,52         0,58
X11         2,29   0,66         0,72    X3          1,24   0,8          0,85
X26         2,06   0,75         0,78    X4          1,23   0,67         0,75
X18         1,93   0,75         0,82    X22         1,13   0,47         0,21
X15         1,93   0,75         0,82    X5          1,04   0,77         0,82
X17         1,9    0,61         0,68    X7          0,77   0,59         0,34
X8          1,9    0,61         0,68    X19         0,67   0,5          0,04
X10         1,83   0,60         0,1     X2          0,63   0,74         0,71
X1          1,72   0,74         0,75    X12         0,61   0,58         0,58
X20         1,69   0,73         0,03    X27         0,61   0,37         0,2
X21         1,66   0,71         0,72    X24         0,53   0,35         0,26
X25         1,54   0,56         0,51    X28         0,1    0,15         0,1
X9          1,46   0,54         0,07    X23         0,03   0,08         0,07
Table 2. Indicators for rating predictors for the learning data, estimated on the basis of two years before the
bankruptcy period
Predictor   IV     Cramer’s V   Gini    Predictor   IV     Cramer’s V   Gini
X13         1,87   0,58         0,62    X27         0,48   0,34         0,31
X26         1,63   0,57         0,6     X2          0,46   0,46         0,33
X21         1,23   0,51         0,52    X5          0,42   0,47         0,5
X16         1,23   0,49         0,55    X17         0,42   0,31         0,33
X15         1,19   0,49         0,51    X6          0,4    0,3          0,31
X3          1,11   0,47         0,51    X19         0,38   0,29         0,21
X14         0,98   0,46         0,14    X28         0,33   0,28         0,24
X20         0,97   0,45         0,11    X18         0,32   0,44         0,45
X9          0,94   0,44         0,3     X24         0,31   0,27         0,1
X10         0,9    0,43         0,09    X1          0,29   0,42         0,31
X7          0,76   0,42         0,01    X22         0,27   0,26         0,25
X8          0,73   0,4          0,38    X4          0,21   0,42         0,4
X11         0,73   0,37         0,35    X23         0,19   0,21         0,16
X25         0,53   0,34         0,35    X12         0,08   0,14         0,14
The indicators which are potential bankruptcy predictors have been previously
anonymized and grouped into k=5 categories, according to the intervals of predictors
variability.
3. Characteristics of Models Used for Predicting Logistics Companies’ Bankruptcy

The following statistical models of bankruptcy classification were used to predict the
risk of company bankruptcy from the logistics sector, i.e., classic linear discriminant analysis
and logistic regression. What is more, the predictions based on the so-called classification trees
and the method of nearest neighbours were used. As it was previously mentioned, the chosen
methods represent a group of statistical methods.
3.1 Models of Linear Discriminant Analysis LDA
Before discriminant analysis is used in practice, it is necessary to conduct
a series of multidimensional analyses of the diagnostic variables to check whether they fulfil the basic
assumptions of the model.
Diagnostic variables in discriminant analysis should fulfil the following
assumptions (Witkowska, 2002):
• It is assumed that the discriminant variables have a multidimensional normal
distribution. Most of the diagnostic variables chosen to research bankruptcy in the analysed
research samples do not fulfil this assumption. However, as the research of other authors shows,
the multidimensional discriminant function is a good classifier even when this
assumption is violated. Thus, this assumption was not taken into account in the further analysis,
relying on the robustness of discriminant analysis to its non-fulfilment.
• Separability of the diagnostic variables, which shows in a systematic difference
between the group averages, is assumed (the Mann-Whitney U test of average differences is used
to eliminate the non-separating variables).
• It is also assumed that the covariance matrices are equal in the researched groups.
Research confirms that when the groups are large enough, this assumption can practically be
omitted, because differences in the covariance matrices do not have a significant influence on
type I classification errors. For those reasons, checking this assumption was
omitted.
The table below (Table 3) presents the test results for equality of averages in groups for
a research learning sample for 1 year before the bankruptcy period. Variables that do not fulfil
the assumption concerning significant difference in averages: X7, X9, X10, X14, X19, X20, X22,
X23, X24, X27, X28 were not taken into account in the further analysis of the discriminant model.

Table 3. Summary file from the U Mann-Whitney test results for diagnostic indicators, for which there are
no significant differences in group averages for the data one year before the bankruptcy period
Indicator   Sum of ranks,         Sum of ranks,      Statistics U   Statistics Z   Test probability
            class: non-bankrupt   class: bankrupt                                  (p-value)
X7          667                   461                185            1.926          0.054
X9          547                   581                247            -0.606         0.544
X10         597                   531                255            0.436          0.663
X14         586                   542                266            0.202          0.839
X19         578                   550                274            0.032          0.974
X20         587                   541                265            0.223          0.823
X22         501                   627                201            -1.585         0.113
X23         547                   581                247            -0.606         0.544
X24         647                   481                205            1.500          0.134
X27         634                   494                218            1.224          0.221
X28         624                   504                228            1.011          0.312
The U Mann-Whitney test was conducted in a similar way for the research sample for 2
years before the bankruptcy period. The variables X7, X10, X12, X14, X19, X20, X23, X24, X27,
X28 were not taken into account for the discriminant model in the further analysis because
they did not fulfil the assumption concerning significant differences in the group
averages.
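
The columns of Table 3 can be related to each other with a short Python sketch. It is a reconstruction under assumptions, not a description of the software used in the study: the group sizes (24 non-bankrupts and 23 bankrupts) are those of the one-year learning sample, the Z statistic is assumed to use the normal approximation with a 0.5 continuity correction and no correction for ties, and its sign is assumed to follow the non-bankrupt group. The function name is hypothetical.

```python
import math


def mann_whitney_from_rank_sums(rank_sum_nb, rank_sum_b, n_nb, n_b):
    """Return (U, Z, two-sided p) computed from the rank sums of the two groups."""
    u_nb = rank_sum_nb - n_nb * (n_nb + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    u = min(u_nb, u_b)
    mean_u = n_nb * n_b / 2
    sd_u = math.sqrt(n_nb * n_b * (n_nb + n_b + 1) / 12)
    diff = u_nb - mean_u
    # Assumed continuity correction: shrink the difference by 0.5 towards zero.
    z = 0.0 if diff == 0 else (diff - math.copysign(0.5, diff)) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, z, p_value


# Rank sums for X7 and X22 taken from Table 3.
u7, z7, p7 = mann_whitney_from_rank_sums(667, 461, n_nb=24, n_b=23)
u22, z22, p22 = mann_whitney_from_rank_sums(501, 627, n_nb=24, n_b=23)
print(u7, round(z7, 3), round(p7, 3))
print(u22, round(z22, 3), round(p22, 3))
```

Under these assumptions the sketch gives U = 185 and Z of about 1.926 for X7, and U = 201 and Z of about -1.585 for X22, in line with the corresponding rows of Table 3.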
Table 4. The selected factors and factorial loads for the financial indicators chosen for the LDA model for
the annual horizon of bankruptcy prediction
Indicator Factor 1 Factor 2 Factor 3 Factor 4
X1 0,202989 0,665655 0,122855 0,675759
X2 0,187557 0,661696 0,116269 0,679428
X3 0,963897 0,032657 0,157821 0,156122
X4 0,160394 0,044599 -0,032889 0,948049
X5 0,134304 0,062217 -0,025955 0,938738
X6 0,220255 -0,007286 0,960207 -0,011805
X8 0,863936 0,101991 0,282553 0,006641
X11 0,259115 0,495666 0,802406 -0,010217
X12 0,154252 0,027002 0,954167 -0,006586
X13 -0,975855 -0,076655 -0,139614 -0,123111
X15 0,355713 -0,026919 0,359925 0,093521
X16 -0,975831 -0,070230 -0,116049 -0,129125
X17 0,022439 0,981589 0,024921 -0,000348
X18 0,069037 0,955293 0,074854 0,140203
X21 -0,978005 -0,081108 -0,090674 -0,050183
X25 0,524968 0,070665 0,206248 0,258338
X26 0,974396 0,083771 0,135116 0,128659
Explained variance 6,152951 3,050959 2,849142 2,869047
Variance share 0,361938 0,179468 0,167597 0,168767
In order to eliminate variables with large mutual correlations (replicating the same
pieces of information in the model), a multidimensional factor analysis was used.
When choosing the representatives of the factors, the values from the predictor ranking tables were taken
into account so as to choose the most significant variables.
The principal components method of factor extraction was used. A variable was considered
significantly correlated with a factor when its factor loading reached the boundary value of 0.7. The variant
with factor rotation was used (normalized Varimax method). A minimum eigenvalue of 1 and
a maximum number of 7 extracted factors were set for the extraction of factors.
Table 4 presents the results of the factor analysis for the one-year prediction horizon. On this
basis, the following predictors were chosen for the LDA model for the one-year prediction horizon: X21
(strong significant correlation with factor 1), X18 (strong significant correlation with factor 2),
X11 (strong significant correlation with factor 3) and X4 (strong significant correlation with
factor 4). The indicators that are weakly correlated with the factors as well as with each other, X1,
X2, X15, X25, were also included in the model.
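
The selection rule can be checked against Table 4 with a small Python sketch. The loadings and explained variances are copied from Table 4 (with decimal commas written as points); comparing the absolute value of a loading with the 0.7 boundary and dividing the explained variance by the 17 indicators in the table are assumptions made for this illustration.

```python
# Rotated factor loadings of selected indicators, copied from Table 4.
loadings = {
    "X4": [0.160394, 0.044599, -0.032889, 0.948049],
    "X11": [0.259115, 0.495666, 0.802406, -0.010217],
    "X15": [0.355713, -0.026919, 0.359925, 0.093521],
    "X18": [0.069037, 0.955293, 0.074854, 0.140203],
    "X21": [-0.978005, -0.081108, -0.090674, -0.050183],
    "X25": [0.524968, 0.070665, 0.206248, 0.258338],
}
THRESHOLD = 0.7  # boundary value for a significant loading


def significant_factors(row):
    """Numbers (1-based) of the factors whose absolute loading reaches the threshold."""
    return [j + 1 for j, value in enumerate(row) if abs(value) >= THRESHOLD]


assignment = {name: significant_factors(row) for name, row in loadings.items()}

# Variance share of each factor: explained variance divided by the number of indicators.
explained_variance = [6.152951, 3.050959, 2.849142, 2.869047]
n_indicators = 17
variance_share = [value / n_indicators for value in explained_variance]

print(assignment)
print([round(share, 6) for share in variance_share])
```

In this sketch X21, X18, X11 and X4 are assigned to factors 1 to 4 respectively, while X15 and X25 exceed the boundary for no factor, which matches the way they are described in the text.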
Similar factor analysis was conducted for the indicator values for two years before the
bankruptcy period. In that case, the following variables were chosen to the LDA model for a
two-year prediction horizon: X5, X6, X8, X9, X15, X17, X21, X22, X25, X26.
In order to estimate the linear discriminant analysis (LDA) model, the generalized
discriminant analysis models module from the Statistica 10 package was used. The variant of
discriminant analysis used in the package is based on calculating, for each class j of the
dependent variable, the so-called classifying function described with the following formula
(Prusak, 2005):
$$FK_j = a_{0,j} + a_{1,j} X_1 + \ldots + a_{k,j} X_k, \tag{4}$$
where $a_{k,j}$ are the coefficients of the classifying function for the j-th classifying category and the k-th predictor.
The analysed object is classified into the class for which the value of the classifying
function for this object is the greatest.
In the estimated models, only those diagnostic variables were left for which the value
of the Wilks’s λ statistic was statistically significant at the level of p<0,1. Two variants
of diagnostic variables were estimated, for the one- and two-year prediction horizons.
Table 5 presents the values of multidimensional Wilks’s test for significance of
diagnostic variables in a model and the estimated factors of classifying functions of these
models.
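Table 5. Results of Wilks's test and estimation of classification functions for the LDA models
| Model | Predictor | Test | Value (λ) | F | Effect df | Error df | Test probability (p-value) | Discriminant function, class NB | Discriminant function, class B |
|---|---|---|---|---|---|---|---|---|---|
| LDA, 1 year before the bankruptcy | absolute term | Wilks's | 0.79 | 11.4 | 1 | 44 | 0.0014 | -2.23126 | -1.04695 |
| LDA, 1 year before the bankruptcy | X1 | Wilks's | 0.72 | 17.3 | 1 | 44 | 0.0001 | 0.01555 | 0.00411 |
| LDA, 1 year before the bankruptcy | X6 | Wilks's | 0.91 | 4.1 | 1 | 44 | 0.0481 | 0.00511 | -0.05231 |
| LDA, 2 years before the bankruptcy | absolute term | Wilks's | 0.80 | 20.4 | 1 | 83 | 0.0000 | -1.98475 | -1.24563 |
| LDA, 2 years before the bankruptcy | X2 | Wilks's | 0.96 | 3.0 | 1 | 83 | 0.0848 | 0.02028 | 0.01443 |
| LDA, 2 years before the bankruptcy | X26 | Wilks's | 0.89 | 9.7 | 1 | 83 | 0.0025 | 0.00313 | -0.01121 |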
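The classification rule of formula (4) can be illustrated with the one-year coefficients from Table 5. The sketch below is only an illustration: the function names and the sample indicator values are hypothetical, and it assumes that X1 and X6 are supplied in the same units as in the estimation sample.

```python
# Classifying-function coefficients copied from Table 5 (1 year before bankruptcy).
COEFFS_1Y = {
    "NB": {"const": -2.23126, "X1": 0.01555, "X6": 0.00511},
    "B": {"const": -1.04695, "X1": 0.00411, "X6": -0.05231},
}


def classifying_scores(x, coeffs=COEFFS_1Y):
    """Evaluate FK_j = a_0j + sum_k a_kj * X_k for every class j."""
    scores = {}
    for cls, a in coeffs.items():
        scores[cls] = a["const"] + sum(a[name] * value for name, value in x.items())
    return scores


def classify(x, coeffs=COEFFS_1Y):
    """Assign the object to the class with the greatest classifying function."""
    scores = classifying_scores(x, coeffs)
    return max(scores, key=scores.get)


# Hypothetical company; the indicator values are made up for illustration.
company = {"X1": 200.0, "X6": 50.0}
print(classifying_scores(company), classify(company))
```

With all indicators equal to zero only the absolute terms remain, so the rule falls back to the class with the larger constant.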
3.2 Models of Logistic Regression - Logit
The general form of the binary logistic regression model, which describes how the probability of bankruptcy of the examined companies depends on a set of factors influencing the occurrence of this event, is expressed by the function
$$P(Y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k)}}. \qquad (5)$$
To choose potential variables for the logit model, a factor analysis was used together with the values of the ranking statistics for the importance of predictors (Table 1 and Table 2). For the prediction horizon of 1 year, the variables X23 and X28 were discarded from the list of potential variables because they had low values of the ranking measures, whereas for the model with a prediction horizon of 2 years, the following variables were discarded: X12, X19, X22, X23, X24, X28.
After the factor analysis, the following variables were chosen for estimating the model with a one-year prediction horizon: X26, X18, X20, X11, X22, X5, X10, as well as other variables (weakly correlated with the factors and with each other): X1, X2, X7, X15, X24, X25, X27.
The list of potential diagnostic indicators for the model with a two-year prediction horizon was selected in a similar way and comprised the variables X3, X5, X7, X8, X9, X13, X17, X21, X25.
To estimate the parameters of the logistic regression model, the module of generalized linear and non-linear models was used (generalized logit model).
Only those diagnostic variables were retained in the estimated models for which the value of the Wald statistic was statistically significant at the level of p<0.05.
Table 6 presents the estimated coefficients and the values of the Wald statistic for both logit models, with 1-year and 2-year prediction horizons.
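Table 6. Estimation of parameters for logistic regression models
| Model | Predictor | Parameter estimate | Estimation error | Wald statistic | Test probability (p-value) |
|---|---|---|---|---|---|
| Logistic regression, 1 year before the bankruptcy | absolute term | 6.16642 | 2.142386 | 8.284 | 0.004 |
| Logistic regression, 1 year before the bankruptcy | X1 | -0.04938 | 0.015481 | 10.174 | 0.001 |
| Logistic regression, 1 year before the bankruptcy | X11 | -0.11751 | 0.061335 | 3.670 | 0.050 |
| Logistic regression, 1 year before the bankruptcy | X27 | -0.04283 | 0.021563 | 3.945 | 0.047 |
| Logistic regression, 2 years before the bankruptcy | absolute term | 1.4645 | 0.569849 | 6.605 | 0.0102 |
| Logistic regression, 2 years before the bankruptcy | X5 | -0.05544 | 0.023063 | 5.779 | 0.0162 |
| Logistic regression, 2 years before the bankruptcy | X11 | -0.0722 | 0.036052 | 4.011 | 0.0452 |
| Logistic regression, 2 years before the bankruptcy | X27 | -0.02084 | 0.008869 | 5.522 | 0.0188 |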
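As an illustration of formula (5), the sketch below evaluates the one-year logit model with the coefficients from Table 6 and recomputes the Wald statistic as the squared ratio of an estimate to its error. It assumes that Y=1 denotes bankruptcy and that the exponent carries a minus sign, as in the standard logistic function; the indicator values are hypothetical.

```python
import math

# (estimate, estimation error) pairs copied from Table 6, one-year model.
LOGIT_1Y = {
    "const": (6.16642, 2.142386),
    "X1": (-0.04938, 0.015481),
    "X11": (-0.11751, 0.061335),
    "X27": (-0.04283, 0.021563),
}


def bankruptcy_probability(x, model=LOGIT_1Y):
    """P(Y=1) = 1 / (1 + exp(-(b0 + b1*X1 + ... + bk*Xk)))."""
    z = model["const"][0] + sum(model[name][0] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-z))


def wald(estimate, std_error):
    """Wald statistic for a single coefficient: (estimate / error)^2."""
    return (estimate / std_error) ** 2


# Hypothetical indicator values, chosen only to show the call.
company = {"X1": 100.0, "X11": 20.0, "X27": 30.0}
print(round(bankruptcy_probability(company), 4))
for name, (est, err) in LOGIT_1Y.items():
    print(name, round(wald(est, err), 3))
```

The recomputed Wald values agree with the tabulated ones to the printed precision, which suggests that the estimation-error column holds standard errors.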
3.3 Classification Trees (C&RT)
C&RT (Classification and Regression Trees) is a statistical data analysis tool used to build classification and regression models. A tree is a graphical model created by recursively dividing the initial set of observations into subsets. The aim of such a division is to obtain subsets that are as homogeneous as possible with respect to the value of the dependent variable. The recursive partitioning algorithm can use a different independent variable at each stage of the division. All independent variables (predictors) are always taken into account, and the variable chosen is the one that guarantees the best division of the node, namely the division into the most homogeneous subsets.
Decision tree algorithms can be divided into 3 basic types:
• CLS (Concept Learning System);
• AID (Automatic Interaction Detection), an example of which are CHAID-type trees;
• C&RT (Classification and Regression Trees).
More about tree methods and algorithms for classification and regression can be found in Breiman et al. (1993).
Table 7. Classification trees for bankruptcy models
Prediction horizon: 1 year to bankruptcy
Selection rule B: (X15<=75.4 AND X21>63.1)
Selection rule NB: (X15>75.4) OR (X15<=75.4 AND X21<=63.1)
Effectiveness of correct classification: learning sample = 93.6 [%], test sample = 84.2 [%]
| Node number | Left branch node | Right branch node | Node size | Size of NB class | Size of B class | Chosen class | Division variable | Division constant |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 47 | 24 | 23 | Non-bankrupt | X15 | 78.4 |
| 2 | 4 | 5 | 26 | 4 | 22 | Bankrupt | X21 | 63.1 |
| 4 | | | 2 | 2 | 0 | Non-bankrupt | | |
| 5 | | | 24 | 2 | 22 | Bankrupt | | |
Prediction horizon: 2 years to bankruptcy
Selection rule B: (X13>89.4) OR (X13<=89.4 AND X24>13.3) OR (X13<=89.4 AND X24<=13.3 AND X7<=-51.0)
Selection rule NB: (X13<=89.4 AND X24<=13.3 AND X7>-51.0)
Effectiveness of correct classification: learning sample = 84.9 [%], test sample = 71.4 [%]
| Node number | Left branch node | Right branch node | Node size | Size of NB class | Size of B class | Chosen class | Division variable | Division constant |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 86 | 43 | 43 | Non-bankrupt | X13 | 89.4 |
| 2 | 4 | 5 | 56 | 39 | 17 | Non-bankrupt | X24 | 13.3 |
| 4 | 6 | 7 | 49 | 38 | 11 | Non-bankrupt | X7 | -51.0 |
| 6 | | | 3 | 0 | 3 | Bankrupt | | |
| 5 | | | 7 | 1 | 6 | Bankrupt | | |
| 3 | | | 30 | 4 | 26 | Bankrupt | | |
Source: created by the author.
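The selection rules printed with Table 7 translate directly into nested conditions. The following sketch is a plain transcription of those rules; the function names are hypothetical, and the thresholds are taken from the rule lines rather than from the division-constant column.

```python
def classify_1y(x15, x21):
    """One-year tree: B if X15 <= 75.4 and X21 > 63.1, otherwise NB."""
    if x15 <= 75.4 and x21 > 63.1:
        return "B"
    return "NB"


def classify_2y(x13, x24, x7):
    """Two-year tree: follow the three splits on X13, X24 and X7 in turn."""
    if x13 > 89.4:
        return "B"
    if x24 > 13.3:
        return "B"
    if x7 <= -51.0:
        return "B"
    return "NB"


# Hypothetical indicator values, used only to exercise each branch.
print(classify_1y(70.0, 65.0), classify_1y(80.0, 65.0))
print(classify_2y(50.0, 10.0, 0.0), classify_2y(50.0, 10.0, -60.0))
```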
C&RT tree algorithms were used in this publication to analyse the bankruptcy of logistics companies, using the Statistica package module General models of classification and regression trees. All 28 financial indicators were chosen as input variables. The Gini measure was used as the splitting criterion, whereas the best pruned tree was chosen by V-fold cross-validation with the one-standard-error rule. Minimization of the average cost of incorrect classification was used as the criterion of optimal tree pruning (the same cost of incorrect classification, equal to 1, was set for bankrupts and non-bankrupts).
The structure of the best classification trees for the one-year and two-year prediction horizons is presented in Table 7, which gives the rules of tree division and node creation as well as the classification effectiveness of the trees. For the one-year classification tree, the average cost of incorrect classification amounted to 0.106 for the learning sample and 0.162 for the test sample. For the two-year prediction, these costs amounted to 0.256 and 0.258, respectively.
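To make the Gini splitting criterion concrete, the sketch below computes the Gini impurity of the nodes of the two-year tree from the class counts in Table 7 and recomputes the learning-sample effectiveness from the leaf counts. The counts for node 7 are not printed in the table; they are derived here by subtracting node 6 from node 4, which is an assumption of this example.

```python
def gini_impurity(n_nb, n_b):
    """Gini impurity of a node: 1 - p_NB^2 - p_B^2."""
    total = n_nb + n_b
    p_nb, p_b = n_nb / total, n_b / total
    return 1.0 - p_nb ** 2 - p_b ** 2


def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = sum(parent)
    weighted = (sum(left) / n) * gini_impurity(*left) \
        + (sum(right) / n) * gini_impurity(*right)
    return gini_impurity(*parent) - weighted


# (NB count, B count) for the first split of the two-year tree (Table 7).
root, node2, node3 = (43, 43), (39, 17), (4, 26)
print(round(gini_impurity(*root), 3), round(split_gain(root, node2, node3), 3))

# Leaves of the two-year tree: predicted class and (NB, B) counts.
# Node 7 is derived as node 4 minus node 6: (38 - 0, 11 - 3).
leaves = {3: ("B", (4, 26)), 5: ("B", (1, 6)), 6: ("B", (0, 3)), 7: ("NB", (38, 8))}
correct = sum(nb if label == "NB" else b for label, (nb, b) in leaves.values())
accuracy = correct / sum(root)
print(round(100 * accuracy, 1))  # compare with the learning-sample figure in Table 7
```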
Figure 3 presents a graphic illustration of the classification tree used to classify the logistics companies threatened with bankruptcy within a one-year horizon.

Figure 3. Graphic Illustration of Tree Structure for Classifying the Logistics Companies for One Year Period Horizon
3.4 Method of K-Nearest Neighbours
The method of k-nearest neighbours can be briefly characterized as a statistical method of classifying new, unknown multidimensional objects (described by a set of statistical variables) on the basis of the class membership of the k other objects (so-called neighbours) that lie closest to the examined object in the multidimensional space, with respect to a chosen distance measure.
To measure the distance between objects in the multidimensional space, the most commonly used measures are as follows:
• Euclidean distance: $d(p, x) = \sqrt{\sum_{i=1}^{n} (p_i - x_i)^2}$, (6)
• Square of the Euclidean distance: $d(p, x) = \sum_{i=1}^{n} (p_i - x_i)^2$, (7)
• Manhattan distance: $d(p, x) = \sum_{i=1}^{n} |p_i - x_i|$, (8)
• Chebyshev distance: $d(p, x) = \max_i |p_i - x_i|$. (9)
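The four distance measures (6) to (9) can be written in a few lines each. The sketch below is a minimal illustration; the two points are hypothetical.

```python
import math


def euclidean(p, x):
    """Formula (6): square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, x)))


def squared_euclidean(p, x):
    """Formula (7): sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(p, x))


def manhattan(p, x):
    """Formula (8): sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, x))


def chebyshev(p, x):
    """Formula (9): largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, x))


# Hypothetical points; the coordinate differences are 3, 4 and 0.
p, x = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(euclidean(p, x), squared_euclidean(p, x), manhattan(p, x), chebyshev(p, x))
```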
More information about the nearest neighbour method and the related algorithms, e.g., for choosing the optimum number of neighbours, can be found in Matuszyk (2004).
The Statistica package module for other machine learning methods, which provides the method of k-nearest neighbours, was used to examine the bankruptcy of the logistics companies. All 28 financial indicators were chosen as input variables and were subjected to preliminary normalization in order to standardize their values. The best classification results for healthy and bankrupt companies were achieved when the Manhattan distance was used as the distance measure. The best model was chosen by V-fold cross-validation with a search for the optimum number of neighbours on a grid from 1 to 20 with a step of 1.
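The procedure described above (standardization, Manhattan distance, a grid search over k) can be sketched in pure Python. This is not the Statistica implementation: the data are hypothetical, leave-one-out validation stands in for V-fold cross-validation (it is the special case V = n), the grid is shortened to 1..5 because the toy sample has only eight objects, and the standardization is computed once on the whole toy sample for brevity.

```python
from collections import Counter
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every column; a constant column is left centred only."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def manhattan(p, x):
    return sum(abs(a - b) for a, b in zip(p, x))


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training objects closest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: manhattan(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: each object is classified by all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x, rest_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Hypothetical two-indicator sample: four healthy and four bankrupt companies.
raw = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5],
       [3.0, 30.0], [3.2, 29.0], [2.9, 31.0], [3.1, 30.5]]
labels = ["NB"] * 4 + ["B"] * 4
xs = standardize(raw)
best_k = max(range(1, 6), key=lambda k: loo_accuracy(xs, labels, k))
print(best_k, loo_accuracy(xs, labels, best_k))
```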

Table 8. Classification models using the method of the nearest neighbours
| Prediction horizon | Number of the nearest neighbours (k) | Efficiency of correct classifications, test sample |
|---|---|---|
| 1 year before the bankruptcy | k=10 | 89.5 [%] |
| 2 years before the bankruptcy | k=2 | 78.6 [%] |
4. Validations of Estimated Bankruptcy Models
The estimated bankruptcy prediction models were subjected to a thorough validation analysis in order to choose the best models for practical application, namely for predicting the bankruptcy of logistics companies from the Podkarpacie region and from Slovakia. The models were analysed from the perspective of their classifying properties, i.e., the correct recognition of companies threatened with bankruptcy and of healthy companies, as well as their proper calibration to the data from the learning samples.
The fundamental tool used to scrutinize the classifying effectiveness of classification models is the classification matrix (see Table 9). The number TN (True Negative) in the table denotes the number of healthy companies correctly classified by the model. Similarly, TP (True Positive) denotes the number of bankrupt companies correctly classified by the model. If healthy companies are classified by the model as bankrupts, such a classification error is called a type I error, and FP (False Positive) denotes the number of these incorrect classifications. Much more serious is the type II classification error, which is made when the model classifies bankrupts as not threatened with bankruptcy; FN (False Negative) denotes the number of such incorrect classifications.
Table 9. Matrix of correct classifications for bankruptcy prediction model
| True affiliation of company | Predicted affiliation: NB | Predicted affiliation: B |
|---|---|---|
| NB | TN (True Negative) | FP (False Positive), type I error |
| B | FN (False Negative), type II error | TP (True Positive) |
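The entries of the classification matrix lead directly to per-class efficiencies and error rates. The sketch below assumes that Eff1 and Eff2 in the validation tables denote the share of correctly classified NB and B companies, respectively. The counts are derived from the one-year C&RT tree in Table 7 (learning sample), with the unprinted node 3 obtained by subtraction; both points are assumptions of this example.

```python
def classification_measures(tn, fp, fn, tp):
    """Per-class efficiencies and error rates from the classification matrix."""
    return {
        "eff_nb": tn / (tn + fp),         # healthy companies recognised correctly
        "eff_b": tp / (tp + fn),          # bankrupts recognised correctly
        "type_I_rate": fp / (tn + fp),    # healthy classified as bankrupt
        "type_II_rate": fn / (tp + fn),   # bankrupt classified as healthy
        "accuracy": (tn + tp) / (tn + fp + fn + tp),
    }


# One-year tree, learning sample: leaves 3 and 4 predict NB, leaf 5 predicts B.
# Node 3 = node 1 - node 2 = (24 - 4 NB, 23 - 22 B) = (20 NB, 1 B).
m = classification_measures(tn=20 + 2, fp=2, fn=1 + 0, tp=22)
for name, value in m.items():
    print(name, round(100 * value, 1))
```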
In the validation of models for classifying companies threatened with bankruptcy, the most commonly used measures are the Information Value (IV), the Gini and Divergence coefficients, and the Kolmogorov-Smirnov and Hosmer-Lemeshow statistics.
The Information Value (IV) coefficient expresses the ability of the model to separate the distributions of results for the populations of bankrupts and non-bankrupts. It is calculated according to formula (1) after first sorting the objects in the sample in decreasing order of the model-estimated probability of belonging to the negative class (the probability of the company's bankruptcy).
The Gini coefficient is used to examine the superiority of the estimated model over a random model, i.e., over randomly made decisions. It is calculated using formula (2); first, however, the objects in the research samples should be sorted in decreasing order of bankruptcy probability. The index k in formulas (1) and (2) denotes in this case the number of different values or categories of bankruptcy probability in the research samples.
The values of the IV and Gini coefficients are interpreted as follows: the higher (closer to 1) the values of these coefficients, the better the model's ability to classify bankrupts and non-bankrupts correctly. For models with strong predictive power, they should take values of at least 0.35.
The value of the Kolmogorov-Smirnov statistic (KS statistic) is the maximal distance between the distribution functions of the conditional distributions in the populations of healthy companies (NB) and bankrupts (B), and it is calculated using the formula (Thomas, 2009):
$$KS = \max_x \left| F(x \mid B) - F(x \mid NB) \right|. \qquad (10)$$
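Formula (10) can be evaluated on empirical distribution functions built from the model scores. A minimal sketch follows; the bankruptcy probabilities are hypothetical.

```python
def empirical_cdf(sample, x):
    """Share of observations in the sample that do not exceed x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_statistic(scores_b, scores_nb):
    """Largest gap between the two empirical distribution functions."""
    points = sorted(set(scores_b) | set(scores_nb))
    return max(abs(empirical_cdf(scores_b, x) - empirical_cdf(scores_nb, x))
               for x in points)


# Hypothetical model-estimated bankruptcy probabilities.
bankrupt = [0.55, 0.70, 0.80, 0.90]
healthy = [0.05, 0.10, 0.30, 0.60]
print(ks_statistic(bankrupt, healthy))
```

The maximum only needs to be checked at observed scores, because both empirical distribution functions are constant between them.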
Divergence also measures the distance between the conditional distributions of bankruptcy probability for the two classes of companies, and it is described by the following formula (Thomas, 2009):
$$D = \frac{1}{2}\left(\frac{1}{\sigma_{NB}^2} + \frac{1}{\sigma_B^2}\right)\left(\mu_{NB} - \mu_B\right)^2 + \frac{\left(\sigma_{NB}^2 - \sigma_B^2\right)^2}{2\,\sigma_{NB}^2\,\sigma_B^2}, \qquad (11)$$
where $\mu_{NB} = \sum_x x\, f(x \mid NB)$ is the average value of bankruptcy probability in the population of healthy companies (NB), $\mu_B = \sum_x x\, f(x \mid B)$ is the average value of bankruptcy probability in the population of bankrupts (B), $\sigma_{NB}^2 = \sum_x (x - \mu_{NB})^2 f(x \mid NB)$ and $\sigma_B^2 = \sum_x (x - \mu_B)^2 f(x \mid B)$ are the variances of the bankruptcy probability distribution for the populations of healthy companies and bankrupts, respectively, and $f(x \mid NB)$, $f(x \mid B)$ are the percentages of healthy and bankrupt companies for a given category of bankruptcy probability.
It is assumed that the divergence should take values above 0.5 for the examined distributions to lie far enough from each other and for the model to have an acceptable ability to separate bankrupts from companies not threatened with bankruptcy.
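A short sketch of the divergence calculation follows, using the means and variances of two discrete score distributions as defined above. The closed form implemented here is the one commonly given for this measure, with the squared mean difference weighted by the inverse variances plus a variance-difference term. The category values and shares are hypothetical.

```python
def moments(dist):
    """Mean and variance of a discrete distribution {category value: share}."""
    mu = sum(x * f for x, f in dist.items())
    var = sum((x - mu) ** 2 * f for x, f in dist.items())
    return mu, var


def divergence(dist_nb, dist_b):
    """D = 1/2 (1/var_NB + 1/var_B)(mu_NB - mu_B)^2
           + (var_NB - var_B)^2 / (2 var_NB var_B)."""
    mu_nb, var_nb = moments(dist_nb)
    mu_b, var_b = moments(dist_b)
    mean_term = 0.5 * (1.0 / var_nb + 1.0 / var_b) * (mu_nb - mu_b) ** 2
    var_term = (var_nb - var_b) ** 2 / (2.0 * var_nb * var_b)
    return mean_term + var_term


# Hypothetical shares of companies per bankruptcy-probability category.
healthy = {0.1: 0.5, 0.3: 0.5}
bankrupt = {0.7: 0.5, 0.9: 0.5}
print(round(divergence(healthy, bankrupt), 6))
```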
The Hosmer-Lemeshow statistic is based on the Chi-squared statistic, and it is calculated using the following formula (Thomas, 2009):
$$HL = \sum_{i=1}^{N} \frac{\left(n_i p_i - NB_i\right)^2}{n_i p_i \left(1 - p_i\right)}, \qquad (12)$$
where $p_i$ is the average probability of belonging to the non-bankrupt class in rating category i, $n_i$ is the number of companies in rating category i, $NB_i$ is the number of healthy companies in that category, and N is the number of rating categories into which the range of bankruptcy probability has been divided. The Hosmer-Lemeshow statistic has a $\chi^2$ distribution with df = N - 2 degrees of freedom.
The higher the value of the H-L statistic, the better the model's ability to differentiate the distributions in the two populations (B and NB), and the better the classifying abilities of the model.
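Formula (12) sums, over the rating categories, the squared gap between the expected and observed number of healthy companies, scaled by the binomial variance. A minimal sketch follows; the rating categories are hypothetical, and n_i is taken to be the number of companies in category i.

```python
def hosmer_lemeshow(categories):
    """categories: iterable of (n_i, p_i, nb_i) tuples, one per rating category.

    n_i  - number of companies in the category (assumed meaning of n_i),
    p_i  - average probability of belonging to the non-bankrupt class,
    nb_i - observed number of healthy companies in the category.
    """
    return sum((n * p - nb) ** 2 / (n * p * (1.0 - p)) for n, p, nb in categories)


# Hypothetical rating categories.
categories = [(10, 0.9, 8), (10, 0.6, 7), (10, 0.3, 2)]
hl = hosmer_lemeshow(categories)
degrees_of_freedom = len(categories) - 2
print(round(hl, 3), degrees_of_freedom)
```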
The ROC curve is a graphical way of presenting the classification power of models in correctly separating bankrupt and healthy companies, in comparison with the perfect model (100% correct classification) and the random model (completely random classification). The measure of conformity with the perfect model is the area under the ROC curve, AUROC = 0.5 (Gini + 1). The higher (closer to 1) the area under the ROC curve, the better the predictive ability of the evaluated model.
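The relation between AUROC and the Gini coefficient can be checked against a few rows of Table 10. The sketch below is only such a check; the selection of rows is arbitrary, and the tolerance of 0.01 allows for the two-decimal rounding of the published values.

```python
def auroc_from_gini(gini):
    """AUROC = 0.5 * (Gini + 1)."""
    return 0.5 * (gini + 1.0)


# (model and sample, Gini, AUROC) as printed in Table 10.
TABLE_10_ROWS = [
    ("Logit, learning sample", 0.89, 0.95),
    ("Network MLP 26-8-2, test sample", 0.82, 0.91),
    ("C&RT tree, learning sample", 0.93, 0.97),
    ("C&RT tree, test sample", 0.67, 0.83),
]

for name, gini, auroc in TABLE_10_ROWS:
    implied = auroc_from_gini(gini)
    print(name, round(implied, 3), abs(implied - auroc) <= 0.01)
```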

The measures characterized above assess the discriminative quality of models. To examine both the discriminative quality and the precision of calibration of models to the learning and test data, the Brier Score and the likelihood of the model (LL) are used.
The Brier Score (BS) is calculated using the following formula (Löffler, Posch, 2007):
$$BS = \frac{1}{n} \sum_{i=1}^{n} \left(d_i - PD_i\right)^2, \qquad (13)$$
where n is the number of observations in the sample, $d_i$ is a dummy variable equal to 1 when the company is bankrupt and 0 otherwise, and $PD_i$ is the bankruptcy probability estimated by the model.
The lower the Brier Score, the better the model is calibrated to the data, and the better its prediction properties should be.
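Formula (13) is the mean squared difference between the bankruptcy indicator and the estimated probability. A minimal sketch with hypothetical outcomes and probabilities:

```python
def brier_score(outcomes, probabilities):
    """Mean of (d_i - PD_i)^2, where d_i is 1 for a bankrupt and 0 otherwise."""
    n = len(outcomes)
    return sum((d - pd) ** 2 for d, pd in zip(outcomes, probabilities)) / n


# Hypothetical sample: two bankrupts and two healthy companies.
d = [1, 1, 0, 0]
pd = [0.9, 0.6, 0.2, 0.1]
print(round(brier_score(d, pd), 4))
```

A model that always outputs 0.5 scores 0.25, so values well below that level indicate that the probabilities carry information.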
The likelihood of the model (LL) is defined by the following formula (Prusak, 2005):
$$LL = \prod_{i=1}^{n} P(Y_i \mid X_i) = \prod_{i=1}^{n} PD_i(X_i)^{Y_i} \left(1 - PD_i(X_i)\right)^{1 - Y_i}, \qquad (14)$$
where n is the number of observations, $PD_i(X_i)$ is the estimated bankruptcy probability at the given values of the input (independent) variables of the model, and $Y_i$ is a dummy variable with Y=1 for bankrupts and Y=0 for non-bankrupts.
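Formula (14) is a product of per-company probabilities, so it shrinks quickly with the sample size and becomes exactly zero as soon as one company receives probability 0 for the outcome that actually occurred. The sketch below uses hypothetical data; the log-likelihood is added only as a numerically convenient companion and is not part of the formula above.

```python
import math


def likelihood(y, pd):
    """Product of PD_i^Y_i * (1 - PD_i)^(1 - Y_i) over all observations."""
    return math.prod(p if yi == 1 else 1.0 - p for yi, p in zip(y, pd))


def log_likelihood(y, pd):
    """Sum of logs of the same terms; undefined if any term equals 0."""
    return sum(math.log(p if yi == 1 else 1.0 - p) for yi, p in zip(y, pd))


# Hypothetical outcomes (1 = bankrupt) and estimated bankruptcy probabilities.
y = [1, 1, 0, 0]
pd = [0.9, 0.6, 0.2, 0.1]
print(likelihood(y, pd), round(log_likelihood(y, pd), 4))

# A hard 0/1 classifier that misses one bankrupt yields a likelihood of 0.
print(likelihood([1, 0], [0.0, 0.0]))
```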
Table 10. Validation parameters of estimated models for a prediction horizon of 1 year
| Model | Sample | Eff1 (NB) | Eff2 (B) | IV | K-S | Gini | Divergence | H-L | AUROC | Brier Score | LL (model) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Logit | learning sample | 88% | 96% | 3.6 | 0.83 | 0.89 | 8.8 | 11.2 | 0.95 | 0.081 | 1.8·10^-6 |
| Logit | test sample | 78% | 90% | 2.8 | 0.80 | 0.91 | 5.7 | 3.3 | 0.95 | 0.108 | 3.0·10^-3 |
| Network MLP 26-8-2 | learning sample | 92% | 91% | 4.0 | 0.83 | 0.89 | 5.3 | 17.9 | 0.95 | 0.152 | 1.3·10^-10 |
| Network MLP 26-8-2 | test sample | 89% | 100% | 2.4 | 0.89 | 0.82 | 2.9 | 48.2 | 0.91 | 0.162 | 2.1·10^-5 |
| Network | learning sample | 92% | 83% | 2.6 | 0.75 | 0.86 | 4.2 | 13.1 | 0.93 | 0.135 | 1.5·10^-9 |
| Network | test sample | 89% | 90% | 2.8 | 0.79 | 0.91 | 7.0 | 3.6 | 0.96 | 0.111 | 8.6·10^-4 |
| C&RT tree | learning sample | 92% | 96% | 5.7 | 0.96 | 0.93 | 14.4 | 7.2 | 0.97 | 0.059 | 1.8·10^-5 |
| C&RT tree | test sample | 78% | 90% | 1.3 | 0.68 | 0.67 | 4.2 | 19.0 | 0.83 | 0.140 | 1.1·10^-4 |
| Method of k-nearest neighbours | test sample | 90% | 89% | 2.4 | 0.80 | 0.89 | 6.6 | 4.4 | 0.94 | 0.102 | 1.7·10^-3 |

The higher the value of the model likelihood for a learning sample, the better the model is calibrated to the input data. High values of the likelihood for the test sample should indicate good classifying value of the model for new, unknown cases as well.
Table 10 and Table 11 present the validation statistics for all the examined models of predicting bankruptcy of logistics companies.
Table 11. Validation parameters of estimated models for a prediction horizon of 2 years
| Model | Sample | Eff1 (NB) | Eff2 (B) | IV | K-S | Gini | Divergence | H-L | AUROC | Brier Score | LL (model) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Logit | learning sample | 74% | 81% | 1.9 | 0.58 | 0.65 | 2.2 | 6.0 | 0.82 | 0.172 | 1.1·10^-19 |
| Logit | test sample | 79% | 79% | 1.8 | 0.57 | 0.70 | 2.9 | 4.9 | 0.85 | 0.153 | 2.8·10^-6 |
| Network | learning sample | 88% | 84% | 3.6 | 0.74 | 0.87 | 5.9 | 9.6 | 0.94 | 0.103 | 4.8·10^-13 |
| Network | test sample | 100% | 86% | 3.7 | 0.86 | 0.92 | 10.2 | 2.4 | 0.96 | 0.087 | 3.7·10^-4 |
| Network | learning sample | 67% | 79% | 2.4 | 0.56 | 0.70 | 2.4 | 14.8 | 0.85 | 0.184 | 1.8·10^-21 |
| Network | test sample | 100% | 86% | 3.7 | 0.86 | 0.94 | 6.8 | 11.4 | 0.97 | 0.167 | 4.6·10^-7 |
| C&RT tree | learning sample | 88% | 81% | 2.8 | 0.70 | 0.75 | 4.2 | 8.8 | 0.88 | 0.127 | 2.6·10^-16 |
| C&RT tree | test sample | 71% | 71% | 1.3 | 0.50 | 0.56 | 0.8 | 19.0 | 0.78 | 0.229 | 0 |
| Method of k-nearest neighbours | test sample | 72% | 90% | 2.4 | 0.98 | 0.43 | 2.8 | 12.6 | 0.70 | 0.161 | 0 |
5. Bankruptcy Prediction for Logistics Companies from the Podkarpacie Region and
Slovakia
When making predictions of possible bankruptcy with the analysed models, it is worth separating them into two groups, as was done previously. The first comprises predictions of the models estimated on the basis of data one year before the bankruptcy period, and the second comprises predictions of the same group of models estimated on the basis of data two years before the bankruptcy period (Table 12). In the first case, the sample consists of 125 “healthy” logistics entities, of which 82 (65.6%) are companies from the Podkarpacie region and 43 (34.4%) are companies from Slovakia. In the second variant, the total number of companies is 104, of which 61 (58.7%) are from the Podkarpacie region and 43 (41.3%) from Slovakia.

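Table 12. Average values of predictions in section of analysed models for the Podkarpacie region and Slovakia
| Estimation basis | Model / prediction | Poland (Podkarpacie) | Slovakia |
|---|---|---|---|
| Data for one year before the bankruptcy period | LDA model | 0.391036 | 0.381273 |
| Data for one year before the bankruptcy period | logit model | 0.316692 | 0.220125 |
| Data for one year before the bankruptcy period | C&RT model | 0.234901 | 0.127353 |
| Data for one year before the bankruptcy period | k-nearest neighbours method | 0.270732 | 0.248837 |
| Data for one year before the bankruptcy period | One-year average prediction | 0.303340 | 0.244397 |
| Data for one year before the bankruptcy period | Two-year average prediction | 0.438284 | 0.391394 |
| Data for one year before the bankruptcy period | Three-year average prediction | 0.518581 | 0.491453 |
| Data for two years before the bankruptcy period | LDA model | 0.348442 | 0.354281 |
| Data for two years before the bankruptcy period | logit model | 0.354433 | 0.273509 |
| Data for two years before the bankruptcy period | C&RT model | 0.439643 | 0.413356 |
| Data for two years before the bankruptcy period | k-nearest neighbours method | 0.385246 | 0.395349 |
| Data for two years before the bankruptcy period | One-year average prediction | 0.381941 | 0.359124 |
| Data for two years before the bankruptcy period | Two-year average prediction | 0.564387 | 0.554978 |
| Data for two years before the bankruptcy period | Three-year average prediction | 0.666512 | 0.671540 |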
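The one-year average predictions in Table 12 appear to be simple arithmetic means of the four model predictions. The sketch below recomputes them and applies a 0.5 cut-off; the variable and function names are hypothetical, and how the two- and three-year averages were obtained is not reconstructed here.

```python
# Model predictions copied from Table 12, in the order LDA, logit, C&RT, k-NN.
PREDICTIONS = {
    ("1 year of data", "Podkarpacie"): [0.391036, 0.316692, 0.234901, 0.270732],
    ("1 year of data", "Slovakia"): [0.381273, 0.220125, 0.127353, 0.248837],
    ("2 years of data", "Podkarpacie"): [0.348442, 0.354433, 0.439643, 0.385246],
    ("2 years of data", "Slovakia"): [0.354281, 0.273509, 0.413356, 0.395349],
}


def average(values):
    return sum(values) / len(values)


def label(probability, threshold=0.5):
    """Probabilities above the threshold are read as a bankruptcy indication."""
    return "bankrupt" if probability > threshold else "non-bankrupt"


for key, values in PREDICTIONS.items():
    avg = average(values)
    print(key, round(avg, 6), label(avg))
```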
When these bankruptcy probabilities are divided into two categories, namely up to 0.5 (non-bankrupt) and above 0.5 (bankrupt), it should be underlined that in the group of active logistics companies from both the Podkarpacie region and Slovakia there are no negative indications for the sector as a whole. In the analyses based on data one and two years before the bankruptcy period, the C&RT results stand out. Among the predictions estimated on the first group of data, the C&RT probability of bankruptcy for companies from both Slovakia and the Podkarpacie region is relatively small. In the second analysed variant, the opposite holds: the C&RT estimates are the highest among the considered methods. An increase in the probability in that case is intuitive, but its scale may be of interest.
The average predictions across the described methods in the variant based on data one year before the bankruptcy period for the most part do not exceed the level of 0.5. However, for predictions based on data two years before the bankruptcy period, the probability values exceed the threshold already at the stage of two-year predictions.
An interesting comparison of the predictions of the eight examined models is given by the number of possible bankruptcy indications, broken down between companies from the Podkarpacie region and Slovakia (Table 13).

Table 13. Scale of bankruptcy threat in the context of number of possible bankruptcy indications by the analysed models
| Estimation basis | Region | 0 indications | 1 indication | 2 indications | 3 indications | 4 indications |
|---|---|---|---|---|---|---|
| Data for one year before the bankruptcy period | Poland, Podkarpacie region | 52 (63.41%) | 4 (4.88%) | 5 (6.10%) | 8 (9.76%) | 13 (15.85%) |
| Data for one year before the bankruptcy period | Slovakia | 29 (67.44%) | 5 (11.63%) | 2 (4.65%) | 5 (11.63%) | 2 (4.65%) |
| Data for one year before the bankruptcy period | Total | 81 (64.80%) | 9 (7.20%) | 7 (5.60%) | 13 (10.40%) | 15 (12.00%) |
| Data for two years before the bankruptcy period | Poland, Podkarpacie region | 15 (24.59%) | 20 (32.79%) | 11 (18.03%) | 6 (9.84%) | 9 (14.75%) |
| Data for two years before the bankruptcy period | Slovakia | 8 (18.60%) | 14 (32.56%) | 9 (20.93%) | 10 (23.26%) | 2 (4.65%) |
| Data for two years before the bankruptcy period | Total | 23 (22.12%) | 34 (32.69%) | 20 (19.23%) | 16 (15.38%) | 11 (10.58%) |
Scale of bankruptcy threat: from “Not very possible” (few indications) to “Possible” (many indications).
A large number of companies from both regions were placed in the so-called low-risk group as regards possible bankruptcy predictions. This is most visible for the estimations based on data one year before the bankruptcy: for almost 2/3 of the analysed entities, not one of the analysed models indicated a bankruptcy threat. The situation is slightly worse for the calculations based on data two years before the bankruptcy period. In this variant, there is an increased percentage of companies for which one of the four analysed models gave a negative signal concerning possible bankruptcy. The increase was more than fourfold and covered almost 1/3 of the entities.
When comparing companies from the Podkarpacie region and Slovakia, it can be observed that for the latter a downward tendency is maintained in the number of entities for which most models predict a bankruptcy risk. For the estimations based on data one year before the bankruptcy period, the group of companies most threatened with bankruptcy contains seven companies (16.28% of those examined); in the second case, when the estimations were based on data two years before the bankruptcy period, there are 12 companies of this type (27.91%). Applying the same reasoning to the Podkarpacie region, a similar tendency cannot be confirmed. The situation is somewhat different. Even though most companies are in the low-risk group, the high-risk group is also numerous, at least in the first considered variant. There, the percentage of threatened entities equals 25.61%; thus, every fourth examined company is included. In the second case, the situation is slightly better, since the overall percentage of companies with an increased risk of bankruptcy remains at an almost unchanged level of 24.59%.
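The shares quoted above can be recomputed from the counts in Table 13. The sketch below assumes that the most threatened group consists of companies with 3 or 4 bankruptcy indications. The assignment of counts to columns follows the reading of the table in which the regional counts add up to the printed totals; both points are assumptions of this example.

```python
# Number of companies with 0, 1, 2, 3 and 4 bankruptcy indications (Table 13).
COUNTS = {
    ("1 year of data", "Podkarpacie"): [52, 4, 5, 8, 13],
    ("1 year of data", "Slovakia"): [29, 5, 2, 5, 2],
    ("2 years of data", "Podkarpacie"): [15, 20, 11, 6, 9],
    ("2 years of data", "Slovakia"): [8, 14, 9, 10, 2],
}


def high_risk_share(counts, min_indications=3):
    """Percentage of companies flagged by at least min_indications models."""
    return 100.0 * sum(counts[min_indications:]) / sum(counts)


for key, counts in COUNTS.items():
    print(key, sum(counts), round(high_risk_share(counts), 2))
```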

Conclusions
Empirical verification of the correct classification achieved by the given groups of statistical bankruptcy analysis methods showed that these methods are characterized by a high quality of bankruptcy prediction. The presented concepts make it possible to evaluate the threat of bankruptcy for a given group of entities quite easily. Of particular importance is the systematic estimation of the probability of negative outcomes in the following years and the observation of emerging tendencies in that respect. It should be underlined that the results obtained with the above-described methods should not be treated as decisive when drawing conclusions. Properly used, the methods can only be an important aid in evaluating the actual financial condition of the examined companies. Hence, they are a signal supporting sound decisions rather than a mechanism that solves the problem definitively.
When analysing the validity of the methods used in bankruptcy analysis, closer attention should be paid to the difficulties that may appear in their direct application. A fundamental problem seems to be the period over which they remain effective, given changing economic or legal conditions. Changes occur, for example, in the accounting act or in the bankruptcy law itself. For this reason, when choosing the research sample, the authors applied only legal criteria, limiting the scope to those “bankrupts” that filed for bankruptcy or against which insolvency proceedings were initiated. Adopting this approach may allow the created models to be used even in spite of the above-mentioned changes in law.
Despite the good results, it should be remembered that all the analyses share one shortcoming. They are based only on financial data and do not take into account quantities that cannot be measured financially, such as development prospects, employee morale, the state of the economy or the quality of management. Introducing another group of indicators, based more on management processes and reflecting the situation of the industry, might fill these gaps.
In the light of the above, it should not be forgotten that identifying companies' financial risk, and the bankruptcy connected with it, by means of the analysed models has one substantial advantage: it is objective and quite easy, and this seems to be the most important thing.
When evaluating not only the presented methods but also their indications, attention should be paid to the fact that in the analyses based on data one year before the bankruptcy period, entities from Slovakia perform better. The percentage of negative indications across the described methods is in that case lower than for the companies from south-eastern Poland. However, when the same comparison is made for data two years before the bankruptcy period, Polish companies from the TSL sector perform better.
References
Altman, E.I., Haldeman, R.G., Narayanan, P. (1977), “ZETA ANALYSIS, A new model to identify bankruptcy risk of corporations”, Journal of Banking and Finance, pp.29-54.
Atiya, A.F. (2001), “Bankruptcy prediction for credit risk using neural networks: A survey and new results”, IEEE Transactions on Neural Networks, Vol. 12, No 4, pp.929-935.
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J. (1993), Classification and Regression Trees, Chapman and Hall.
Choi, W.S., Lee, S. (2013), “A multi-industry bankruptcy prediction model using back-propagation neural network and multivariate discriminant analysis”, Expert Systems with Applications, Vol. 40, No 8, pp.2941-2946.
Fedorova, E., Gilenko, E., Dovzhenko, S. (2013), “Bankruptcy prediction for Russian companies: Application of combined classifiers”, Expert Systems with Applications, Vol. 40, pp.7285-7293.
Fletcher, D., Goss, E. (1993), “Application forecasting with neural networks an application using bankruptcy data”, Information and Management, Vol. 24, pp.159-167.
Frydman, H., Altman, E.I., Kao, D. (1985), “Introducing recursive partitioning for financial classification: The case of financial distress”, Journal of Finance, Vol. 40, No 1, pp.269-291.
Hamrol, M., Chodakowski, J. (2008), “Prognozowanie zagrożenia finansowego przedsiębiorstwa. Wartość predykcyjna polskich modeli analizy dyskryminacyjnej”, Badania Operacyjne i Decyzje, No 3, pp.17-31, [Forecasting the financial threat to an enterprise. The predictive value of Polish discriminant analysis models, in Polish].
Jones, S., Hensher, D.A. (2004), “Predicting firm financial distress: A mixed logit model”, Accounting Review, Vol. 79, No 4, pp.1011–1038.
Karels, G.V., Prakash, A.J. (1987), “Multivariate normality and forecasting of business bankruptcy”, Journal of Business Finance and Accounting, Vol. 14, No 4.
Kaski, S., Sinkkonen, J., Peltonen, J. (2001), “Bankruptcy analysis with self-organizing maps in learning metrics”, IEEE Transactions on Neural Networks, Vol. 12, No 4, pp.936-947.
Kiviluoto, K. (1998), “Predicting bankruptcies with self-organizing map”, Neurocomputing, Vol. 21, pp.191-201.
Kolari, J., Glennon, D., Shin, H., Caputo, M. (2002), “Predicting large US commercial bank failures”, Journal of Economics and Business, Vol. 54, pp.361-387.
Korol, T. (2012), “Early warning models against bankruptcy risk for Central European and Latin American enterprises”, Economic Modelling, Vol. 31, pp.22-30.
Kumar, P.R., Ravi, V. (2007), “Bankruptcy prediction in banks and firms via statistical and intelligent techniques – A review”, European Journal of Operational Research, Vol. 180, pp.1-28.
Lam, M. (2004), “Neural networks techniques for financial performance prediction: integrating fundamental and technical analysis”, Decision Support Systems, Vol. 37, pp.567-581.
Lee, K., Booth, D., Alam, P. (2005), “A comparison of supervised and unsupervised neural networks in predicting bankruptcy of Korean firms”, Expert Systems with Applications, Vol. 29, pp.1-16.
Leshno, M., Spector, Y. (1996), “Neural network prediction analysis: The bankruptcy case”, Neurocomputing, Vol. 10, pp.125-147.
Löffler, G., Posch, P.N. (2007), Credit risk modeling using Excel and VBA, Wiley, Chichester, West Sussex, p.156.
Marais, M.L., Patel, J., Wolfson, M. (1984), “The experimental design of classification models: An application of recursive partitioning and bootstrapping to commercial bank loan classifications”, Journal of Accounting Research, Vol. 22, pp.87-113.
Martin, D. (1977), “Early warning of bank failure: A logit regression approach”, Journal of Banking and Finance, Vol. 1, pp.249-276.
Matuszyk, A. (2004), Credit scoring – metoda zarządzania ryzykiem kredytowym, Wydawnictwo CeDeWu, Warszawa, pp.119-122, [Credit scoring – a method of credit risk management, in Polish].
Ohlson, J.A. (1980), “Financial ratios and the probabilistic prediction of bankruptcy”, Journal of Accounting Research, Vol. 18, pp.109-131.
Prusak, B. (2005), Nowoczesne metody prognozowania zagrożenia finansowego przedsiębiorstw, Wydawnictwo Difin, Warszawa, [Modern methods of forecasting the financial risks of companies, in Polish].
Serrano-Cinca, C. (1996), “Self-organizing neural networks for financial diagnosis”, Decision Support Systems, Vol. 17, pp.227-238.
Tam, K.Y., Kiang, M. (1992), “Predicting bank failures: A neural network approach”, Decision Sciences, Vol. 23, pp.926-947.
Thomas, L.C. (2009), Consumer credit models. Pricing, Profit and Portfolios, Oxford University Press, Oxford, p.111.
Tseng, F.M., Hu, Y.C. (2010), “Comparing four bankruptcy prediction models: Logit, quadratic interval logit, neural and fuzzy neural networks”, Expert Systems with Applications, Vol. 37, No 3, pp.1846-1853.
Wilson, R.L., Sharda, R. (1994), “Bankruptcy prediction using neural networks”, Decision Support Systems, Vol. 11, pp.545-557.
Witkowska, D. (2002), Sztuczne sieci neuronowe i metody statystyczne. Wybrane zagadnienia finansowe, C.H. Beck, Warszawa, pp.86-87, [Artificial neural networks and statistical methods. Selected financial issues, in Polish].
Yu, L., Wang, S., Lai, K.K., Zhou, L. (2008), Bio-Inspired Credit Risk Analysis. Computational Intelligence with Support Vector Machines, Springer-Verlag, Berlin Heidelberg, pp.14-15.

STATISTICAL METHODS OF BANKRUPTCY PREDICTION IN THE LOGISTICS SECTOR OF POLAND AND SLOVAKIA
Jacek Brożyna, Grzegorz Mentel, Tomasz Pisula
SUMMARY
The main topic of this article is the problem of bankruptcy in the context of emerging threats. The presented study seeks to demonstrate the value of the described models in anticipating possible signs of bankruptcy and in assessing the financial condition of Polish and Slovak companies from the TSL (transport, shipping, logistics) sector. To predict the bankruptcy risk of companies in the logistics sector, statistical bankruptcy classification models such as classical linear discriminant analysis and logistic regression were used. The predictions also relied on so-called classification trees and the nearest-neighbour method. Empirical verification of correct classification, applied to the presented groups of statistical bankruptcy analysis methods in terms of their efficiency, revealed that these methods are characterized by high-quality bankruptcy prediction. The described concepts make it possible to assess easily the threat of bankruptcy in the analysed companies. One of the essential advantages of the presented results is the division of the research sample into so-called learning groups, for which the parameters of the examined models were estimated, and a test sample for examining the effectiveness of correct classifications, for which all predictions were determined for the periods one and two years before bankruptcy.
KEYWORDS: bankruptcy prediction, artificial intelligence, risk modelling.
