Towards Accurate Classification of Miniature Images


Congress Home Page: https://www.amerikakongresi.org/%C3%B6nceki-kongre-ki-taplari
Latin America 5th International Conference on Scientific Researches
March 17-19, 2023 - Medellin

TOWARDS ACCURATE CLASSIFICATION OF MINIATURE IMAGES

Congress Book: https://www.amerikakongresi.org/_files/ugd/797a84_42d94c1e33d641d4a0615d9494ee582c.pdf

ORCID ID: 0000-0002-1351-7565

ABSTRACT
Miniatures are small images drawn in manuscripts to depict their subject matter visually. They
strengthen the narration of a manuscript and help the reader grasp it. They can be considered
historical documents because they depict many contemporary subjects and events from the period
in which they were made, such as victories, dynastic histories, palace celebrations and patrons'
travels. Over the centuries, many Ottoman masters were trained and produced valuable works on a
rich variety of subjects. Most of these works have survived to the present day.

Image processing and analysis methods based on artificial intelligence make it possible to
extract meaningful and useful information from miniature art. This study aimed to classify
miniature works automatically, that is, to identify the master of each work from its image with
deep learning. To this end, a deep learning algorithm was developed and trained to identify the
masters of the works from images. The algorithm predicted the craftsmen of the miniatures with
high accuracy: it achieved 0.9722 categorical accuracy, 0.9706 Precision, 0.9167 Recall and
0.9968 AUC (area under the curve). The study shows that useful information can be uncovered by
processing and analyzing these historical artworks.

Keywords: Ottoman miniatures, craftsman identification, classification, deep learning.

1. INTRODUCTION
Miniatures are small images intended to describe and depict the content of manuscripts
visually. Miniature paintings are made to provide a better understanding of what is told
in the texts or to strengthen the narration.

CONFERENCE BOOK Academy Global Publishing House 181



They generally depict portraits, the lives of sultans, festivals, historical events, lifestyles,
nature and city views, literary works, religious subjects, traditions and customs, women and
men, and living things such as animals and plants [1].

Miniature art was used by the Ottomans from the 14th to the early 18th centuries, and by the
Safavid Empire in Iran in the 16th and 17th centuries [2]. Miniatures depict many
contemporary subjects from the period in which they were made, for example great victories,
dynastic history, palace celebrations or patrons' travels. In this respect, miniatures are
historical documents that carry these events to the present, and they can therefore be used as
evidence in the study of social and political history.

During the Ottoman period, miniature artists were trained over many centuries. These artists
produced very valuable works on a rich variety of subjects: sometimes court entertainment and
activities, sometimes wars, nature, daily life or social events. Today, images of a
significant portion of these works are widely accessible through various studies and
sources.

Meanwhile, new methods and techniques in software and artificial intelligence have made it
possible to process and analyze images, produce meaningful information, draw various
inferences and obtain many other useful results. Many tasks, such as character recognition
from images [3], text reading [4], transcription [5], translation [6], object and pattern
recognition [7] and image classification [8], can be performed quickly, easily and
automatically. Miniature works can benefit from these methods and techniques as well. With
them, much work that would otherwise be very difficult, time-consuming or almost impossible
can be carried out with little effort.

For example, automatic detection of the craftsmen of miniatures from images can be very
useful for information retrieval and automatic data processing. Such a system would pave the
way not only for obtaining the artist's identity, but also for obtaining much more
information about a work from its visuals.

This study attempts to identify the craftsmen of Ottoman miniatures automatically from
images. For this purpose, a deep learning algorithm was designed and tuned for the best
performance. The algorithm was trained and tested with images of miniature works by four
different craftsmen. It reached a categorical accuracy of 0.9722 on the test images, which
shows that the artists of the miniatures can be identified from miniature images with high
accuracy.

2. EXPERIMENT
Miniatures differ depending on factors such as the period in which they were made, the
subject depicted and the style of the artist. For example, artists have created works using
different textures, motifs and techniques, in naturalistic and/or realistic styles [9]. These
differences are the intrinsic, distinctive features that set an artist's works apart from
others.

This study aims to identify the artists of the miniatures by exploiting these distinctive
features. For this purpose, a convolutional neural network (CNN) with several different kinds
of layers was designed and fine-tuned to achieve the best performance in identifying the
masters of the miniatures. The architecture of the model is visualized in Figure 1.


The proposed network mainly consists of Conv2D layers that perform the convolution operation,
each followed by a MaxPooling2D layer and a BatchNormalization layer, in that order. This
structure is repeated 3 times until the Flatten layer, which vectorizes the 2D convolution
output and adapts the output of the network to the classification problem. The Flatten layer is
followed by another BatchNormalization layer and finally a Dense layer with 4 neurons. Because
this is a multi-class classification problem, the last layer uses the 'softmax' activation.
Each Conv2D layer has 8 filters with (3,3) kernels and a 'relu' activation. All MaxPooling2D
layers take the maximum value within an 8x8 window. The network has a total of 1,556
parameters, of which 1,492 are trainable and 64 are non-trainable. Categorical cross-entropy
is used as the cost function with the ADAM optimizer.
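The reported parameter totals can be checked by hand. The sketch below recomputes the per-layer counts under three assumptions that the text does not state: 3-channel (RGB) input, 'same' padding in the Conv2D layers, and a pooling stride equal to the 8x8 window, so that a 512x512 input shrinks to 64, 8 and then 1 pixel per side. The function name `layer_params` is illustrative and not taken from the study.

```python
def layer_params(size=512, channels=3, filters=8, kernel=3, pool=8, classes=4):
    """Return [(layer name, parameter count)] for the described layer stack.

    Assumes RGB input, 'same' convolution padding and pooling stride == pool size.
    """
    rows = []
    for _ in range(3):
        # Conv2D: one kernel x kernel x channels weight block per filter, plus a bias each.
        rows.append(("Conv2D", kernel * kernel * channels * filters + filters))
        channels = filters
        size //= pool  # MaxPooling2D shrinks each side by the window size
        rows.append(("MaxPooling2D", 0))
        # BatchNormalization: gamma, beta (trainable) and moving mean, variance (not trainable).
        rows.append(("BatchNormalization", 4 * channels))
    features = size * size * channels
    rows.append(("Flatten", 0))
    rows.append(("BatchNormalization", 4 * features))
    rows.append(("Dense", features * classes + classes))
    return rows


rows = layer_params()
total = sum(count for _, count in rows)
# Half of every BatchNormalization layer's parameters are moving statistics.
non_trainable = sum(count // 2 for name, count in rows if name == "BatchNormalization")

for name, count in rows:
    print(f"{name:<20}{count:>6}")
print("total", total, "trainable", total - non_trainable, "non-trainable", non_trainable)
```

Under these assumptions the counts add up to the 1,556 parameters (1,492 trainable, 64 non-trainable) stated above, which suggests the assumptions are at least consistent with the network that was actually trained.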

Normally, images vary considerably in size. Therefore, they had to be resized to fixed
dimensions. In the experiments, a size of 512x512 pixels gave good results. After the images
were resized, they were subjected to data augmentation and then normalization. In the data
augmentation procedure, the following operations were performed, in order, through a Keras
Sequential layer:

RandomRotation(0.10)
RandomRotation(0.15)
RandomRotation(0.25)

In the normalization layer, the images were normalized to the range 0.0 to 1.0 by dividing the
intensities by 255.0.
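In Keras, a single float passed to RandomRotation is a fraction of a full turn (2π), so a factor f samples angles between -f·360 and +f·360 degrees. The sketch below uses only the standard library to convert the three listed factors to degrees and to show the division by 255.0 on a few 8-bit intensities. The helper names are illustrative and not taken from the study.

```python
def rotation_range_degrees(factor):
    """Angle range (in degrees) sampled for a single-float RandomRotation factor."""
    limit = factor * 360.0
    return (-limit, limit)


def normalize(intensities):
    """Scale 8-bit pixel intensities to the range [0.0, 1.0]."""
    return [value / 255.0 for value in intensities]


for factor in (0.10, 0.15, 0.25):
    low, high = rotation_range_degrees(factor)
    print(f"factor {factor:.2f}: {low:.0f} to {high:.0f} degrees")

print(normalize([0, 51, 255]))
```

Read this way, the three factors correspond to rotations of up to about 36, 54 and 90 degrees in either direction.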

2.1. Dataset

The images were downloaded from https://www.turkishculture.org. The dataset is composed of a
total of 380 images belonging to these 4 artists: Levni, [name illegible in source], Rumuzi
and Seyyid Lokman. Each artist has an almost equal number of images (around 95). 9 randomly
chosen works of each artist (approximately 10%) were reserved for testing. During training,
10% of the training set was used for validation. The details of the dataset are given in
Table 1. In addition, a pair of miniature works by each artist is given in Figure 2.

Figure 1. Architecture of the Algorithm. [Layer diagram not reproduced; its Params# column
reads 0, 0, 0, 224, 0, 32, 584, 0, 32, 584, 0, 32, 0, 32, 36.]


Table 1. Details of the Dataset

Craftsman                     Training   Test   Total
Levni                               85      9      94
[name illegible in source]          87      9      96
Rumuzi                              84      9      93
Seyyid Lokman                       88      9      97
Total                              344     36     380
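The split described in this section can be sketched with the standard library as follows. The per-artist totals come from Table 1; the file names, the `artist_2` placeholder for the illegible name, the random seed and the choice of a plain (non-stratified) validation split are all assumptions made for illustration.

```python
import random

TEST_PER_ARTIST = 9         # works held out per artist for testing
VALIDATION_FRACTION = 0.10  # share of the training set used for validation


def split_dataset(images_by_artist, seed=0):
    """Hold out 9 works per artist for testing, then 10% of the rest for validation."""
    rng = random.Random(seed)
    train, test = [], []
    for artist, images in images_by_artist.items():
        shuffled = images[:]
        rng.shuffle(shuffled)
        test += [(artist, image) for image in shuffled[:TEST_PER_ARTIST]]
        train += [(artist, image) for image in shuffled[TEST_PER_ARTIST:]]
    rng.shuffle(train)
    n_val = round(VALIDATION_FRACTION * len(train))
    return train[n_val:], train[:n_val], test


# Per-artist totals from Table 1; file names are placeholders.
totals = {"Levni": 94, "artist_2": 96, "Rumuzi": 93, "Seyyid Lokman": 97}
images = {artist: [f"{artist}_{i:03d}.jpg" for i in range(n)] for artist, n in totals.items()}

train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

Note that the "Training" column of Table 1 (344 images) includes the images that are later set aside for validation.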
2.2. Training

The training curves (categorical accuracy and loss) are shown in Figure 3. The network was
trained for a maximum of 1000 epochs with a batch size of 16. The learning rate was initialized
to 1E-3 and halved whenever the performance of the network did not improve for 100 epochs.
Training was terminated when the network made no progress for 500 epochs. At the end of each
epoch, a validation test was performed with 1/10 of the training data to determine whether the
network was over-fitting or learning efficiently.

[Figure 2. Surviving panel labels: Levni, Rumuzi, Seyyid Lokman]
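The learning-rate and stopping rules of Section 2.2 can be sketched as a small replay over per-epoch scores. The monitored quantity (validation loss here), the way the counters reset and the synthetic loss sequence are assumptions; the text only gives the initial rate, the halving factor and the two patience values. In Keras this behaviour is commonly obtained with the ReduceLROnPlateau and EarlyStopping callbacks, but the source does not say whether they were used.

```python
def run_schedule(val_losses, lr=1e-3, lr_patience=100, stop_patience=500, factor=0.5):
    """Replay per-epoch validation losses; return (final learning rate, last epoch run)."""
    best = float("inf")
    since_best = 0      # epochs since the loss last improved
    since_lr_drop = 0   # epochs since the last improvement or learning-rate reduction
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_best = 0
            since_lr_drop = 0
        else:
            since_best += 1
            since_lr_drop += 1
            if since_lr_drop >= lr_patience:
                lr *= factor  # halve the learning rate after 100 stagnant epochs
                since_lr_drop = 0
        if since_best >= stop_patience:
            return lr, epoch  # stop after 500 epochs without progress
    return lr, len(val_losses)


# Synthetic example: the loss improves for 400 epochs, then stagnates (at most 1000 epochs).
losses = [1.0 / epoch for epoch in range(1, 401)] + [0.5] * 600
final_lr, last_epoch = run_schedule(losses)
print(final_lr, last_epoch)
```

With this synthetic sequence the replay stops at epoch 900 after five halvings; the numbers illustrate the rule only and are not a reconstruction of the actual training run.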

3. RESULTS
Figures 3(a) and 3(b) show the categorical accuracy and the loss, respectively, for training
and validation. In the graphs, the blue and orange lines represent training and validation,
respectively. Because the network could no longer improve, training was terminated around the
900th epoch. The curves indicate successful training: as training continued, the loss
approached 0 and the categorical accuracy approached 1, and both remained fairly stable at
these levels. The validation curve fluctuated strongly at the beginning of training,
stabilized in the later stages and was flat at the end. The closeness of the training and
validation scores further supports the conclusion that the network learned successfully.

Figure 3. Training Performances of Algorithm. (a) Categorical Accuracy, (b) Categorical Loss.
Horizontal axis in both panels: epoch.


To measure the classification capability of the network quantitatively, the categorical
accuracy, Precision, Recall, Area Under Curve (AUC), True Positive Rate (TPR), False
Positive Rate (FPR), True Negative Rate (TNR) and False Negative Rate (FNR) metrics were
calculated. Some of the results are given in Table 2, and the confusion matrix is given in
Figure 4. According to the confusion matrix, the algorithm classified 35 of the 36 test images
correctly. The one misclassified painting was attributed to Rumuzi but is actually the work of
Seyyid Lokman.

Table 2. Classification Performance of the Model in Categorical Accuracy, Precision, Recall and
Area Under Curve (AUC)

Categorical Accuracy   Precision   Recall   AUC
0.9722                 0.9706      0.9167   0.9968
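The description of the confusion matrix (9 test works per artist, 35 of 36 correct, one Seyyid Lokman work predicted as Rumuzi) is enough to rebuild it and recompute some of the metrics. In the sketch below, the class order and the `artist_2` placeholder are assumptions. The reported Precision and Recall do not follow from this matrix alone: 0.9706 and 0.9167 coincide with 33/34 and 33/36, which is what metrics thresholded at 0.5 on the softmax outputs would give if two correct predictions did not exceed the threshold. That reading is an assumption, not something the source states.

```python
LABELS = ["Levni", "artist_2", "Rumuzi", "Seyyid Lokman"]
# Rows are true artists, columns are predicted artists.
CONFUSION = [
    [9, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 9, 0],
    [0, 0, 1, 8],  # one Seyyid Lokman work predicted as Rumuzi
]


def accuracy(matrix):
    """Share of test images on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(map(sum, matrix))


def rates(matrix, k):
    """One-vs-rest TPR, FPR, TNR and FNR for class index k."""
    total = sum(map(sum, matrix))
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    return {"TPR": tp / (tp + fn), "FPR": fp / (fp + tn),
            "TNR": tn / (fp + tn), "FNR": fn / (tp + fn)}


print(f"accuracy = {accuracy(CONFUSION):.4f}")
for k, label in enumerate(LABELS):
    print(label, {name: round(value, 4) for name, value in rates(CONFUSION, k).items()})

# Ratios that round to the reported Precision and Recall (an observation only).
print(f"{33 / 34:.4f} {33 / 36:.4f}")
```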

Figure 4. Confusion Matrix.

The effect of removing the Batch Normalization and Max Pooling techniques on the performance
of the algorithm was also investigated. The results are given in Table 3. Accuracy drops from
0.9722 to between 0.7500 and 0.7778 when either technique, or both, are removed. In
particular, without Max Pooling the number of parameters of the network grows from thousands
to millions, and the memory and computational requirements grow accordingly.

Table 3. Classification Performance of the Model with and without Batch Normalization and
Max Pooling.

BatchNorm.   MaxPooling   Accuracy   Params#
Yes          Yes          0.9722     1,556
Yes          No           0.7778     16.78M
No           Yes          0.7500     1,428
No           No           0.7778     8.39M
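The parameter counts in Table 3 can be reproduced by arithmetic. The sketch assumes RGB input, 'same' convolution padding and a pooling stride equal to the 8x8 window, none of which is stated in the text; `total_params` is an illustrative name. Without pooling, the Flatten layer outputs 512 x 512 x 8 = 2,097,152 features, so the final Dense layer (and the BatchNormalization layer before it, when present) dominates the count.

```python
def total_params(use_batchnorm, use_pooling, size=512, channels=3, filters=8, classes=4):
    """Total parameter count of the 3-block CNN with either technique switched on or off."""
    total = 0
    for _ in range(3):
        total += 3 * 3 * channels * filters + filters  # Conv2D weights and biases
        channels = filters
        if use_pooling:
            size //= 8              # 8x8 max pooling shrinks each side by 8
        if use_batchnorm:
            total += 4 * channels   # gamma, beta, moving mean, moving variance
    features = size * size * channels  # length of the Flatten output
    if use_batchnorm:
        total += 4 * features
    return total + features * classes + classes  # Dense layer with 4 outputs


for use_batchnorm in (True, False):
    for use_pooling in (True, False):
        count = total_params(use_batchnorm, use_pooling)
        print(f"BatchNorm={use_batchnorm!s:<5} MaxPooling={use_pooling!s:<5} {count:>12,}")
```

Rounded to two decimals in millions, the no-pooling variants come to 16.78M and 8.39M, matching the table.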

To examine the classification quality further, the Receiver Operating Characteristic (ROC)
and Precision-Recall curves are given in Figure 5. In both graphs the values are very close
to 1, in line with the AUC of 0.9968 reported in Table 2.


Figure 5. (a) Receiver Operating Characteristic (ROC) and (b) Precision-Recall Curves for
Classification.
4. CONCLUSION
This study set out to classify miniature works of the Ottoman period with deep learning
techniques, focusing on identifying the craftsmen of the works from their images. A deep
network developed specifically for this classification task was trained on sample miniature
images and then asked to predict the craftsmen of miniatures it had not seen before. In the
experiments, the algorithm achieved 0.9722 categorical accuracy, 0.9706 Precision, 0.9167
Recall and 0.9968 area under the curve (AUC). These results show that the miniatures can be
classified automatically with high accuracy.

More broadly, the study shows that useful information can be extracted from these valuable
historical artworks with the image processing and analysis techniques of artificial
intelligence.

REFERENCES
[1] Akdeniz Sanat, 6, 11, 2013.
[2] Asl, M. M., A Comparative Analysis of Factors Influencing the Evolution of Miniature
[...], 3, 2, 484-493, 2017.
[3] Koyuncu, B. Koyuncu, B. Handwritten Character Recognition by using Convolutional
Deep Neural Network; Review, International Journal of Engineering Technologies
IJET, 5, 1, 1-5, 2019.
[4] Weinman, J. J., Butler, Z., Knoll, D., Feild, J. Toward Integrated Scene Text Reading,
IEEE Trans. Pattern Anal. Mach. Intell., 36, 2, 375-387, 2014, doi:
10.1109/TPAMI.2013.126.
[5] Cour, T., Jordan, C., Miltsakaki, E., Taskar, B. Movie/script: Alignment and parsing of
video and text transcription, Computer Vision ECCV 2008: 10th European
Conference on Computer Vision, October 12-18, 2008, Proceedings, Part IV 10, 158-171,
Marseille, France, 2008.


[6] Lopez, A. Statistical Machine Translation, ACM Comput. Surv., 40, 3, 2008, doi:
10.1145/1380584.1380586.
[7] Liu, J., Sun, J., Wang, S. Pattern recognition: An overview, IJCSNS Int. J. Comput.
Sci. Netw. Secur., 6, 6, 57-61, 2006.
[8] Temiz, H. Automatic and Accurate Classification of Hotel Bathrooms from Images
with Deep Learning, International Journal of Engineering Research and Development,
14, 3, 211-218, 2022. https://doi.org/10.29137/umagd.1217004.
[9] [...]iyet Art, 7, 1, 21-40, 2021. https://doi.org/10.46641/medeniyetsanat.810829.
