Paper 8863
IJARSCT
International Journal of Advanced Research in Science, Communication and Technology (IJARSCT)
Abstract: Devanagari Character Recognition is a system in which a handwritten image is recognized and
converted into digital form. The Devanagari handwritten character recognition system is based on deep
learning techniques and handles the recognition of Devanagari script, particularly Hindi. This
recognition system has five main stages: Pre-processing, Segmentation, Feature Extraction, Prediction,
and Post-processing. This paper analyzes approaches to the recognition of handwritten Devanagari
characters. Several approaches exist; some of these methods are discussed here along with their accuracy and
the techniques used. The techniques differ depending on the dataset and the accuracy achieved for each
character.
I. INTRODUCTION
Handwriting recognition has been one of the most fascinating and demanding research areas in today’s digitalized
world, and it has evolved through the combination of artificial intelligence and machine learning. It contributes
exceptionally to the advancement of the interface between humans and machines. Handwritten Character Recognition
is the ability of a system to identify human handwritten input. In general, it is classified into two types: on-line
and off-line handwriting recognition systems. When the handwriting comes from sources such as images, paper
documents, or other devices, the system is considered off-line. Recognition systems for non-Indian languages, such as
English, Chinese, German, Japanese, and Korean, are already mature compared to those for Indian scripts. Indic scripts
pose additional challenges in handwriting recognition compared with Latin, Chinese, and Japanese because of variations
in the order of strokes or symbols, half consonants, etc.
Fig 1. (a) Vowels of Devanagari Script (b) Numerals of Devanagari Script (c) Consonants of Devanagari Script
A. Image Acquisition
Image acquisition plays a major role in the OCR problem. No matter how accurate a model is in testing, real-world
images will never be the same as the test images. Real-world images contain noise, blur, and many other forms of
quality degradation.
B. Data Pre-processing
In this stage the image is converted into grayscale, and a NumPy array is prepared to store the image pixels. The next
step is to find the foreground and background colors. Removing noise and applying a threshold makes it easier to
recognize text in the image and to find the foreground color. Here a combination of Otsu and binary thresholding is
used. Thresholding is a way to create a binary image from a grayscale or full-color image. It is mainly done to
separate “object” or foreground pixels from background pixels to aid image processing.
Pre-processing involves various steps, such as:
1. Binarization
2. Noise Elimination
3. Skew Correction
4. Size Normalization
5. Thinning.
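The Otsu and binary thresholding step described above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in any of the surveyed systems. The function names otsu_threshold and binarize are hypothetical, the image is assumed to be a 2-D list of 8-bit grayscale values, and mapping pixels brighter than the threshold to 1 is an assumption. In practice the same step is typically done with OpenCV's cv2.threshold using the flags cv2.THRESH_BINARY | cv2.THRESH_OTSU.

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of 8-bit grayscale values."""
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with intensity <= t
    sum0 = 0  # sum of intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the upper class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor of 1 / total**2).
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(gray, threshold):
    """Map pixels brighter than the threshold to 1 and all others to 0."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


if __name__ == "__main__":
    # Tiny hypothetical image: three dark and three bright pixels.
    image = [[10, 12, 200],
             [11, 205, 210]]
    t = otsu_threshold(image)
    print(binarize(image, t))  # [[0, 0, 1], [0, 1, 1]]
```

Otsu's method picks the threshold that maximizes the variance between the two pixel classes, so no threshold has to be chosen by hand for each scanned image.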
F. Classification
Each segment is passed to the prediction process. Before the actual prediction, the segment must be resized to match
the input shape of the neural network. Thus, each segment is converted into a 30 by 30 image, and a 1-pixel border in
the background color is added around it. The segment then has a shape of 32 by 32, which is the input shape of our
model, and it is passed to the neural network. If the prediction for the segment has high confidence, it is assumed
that the character should be shown. Predictions can also be wrong, depending on the image quality, because of false
segmentation.
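The resizing and bordering of a segment, together with the confidence check, can be sketched as below. This is only an illustration under stated assumptions: the helper names (resize_nearest, pad_border, prepare_segment, accept_prediction) are hypothetical, nearest-neighbour resizing stands in for whatever interpolation the original system uses, the background value 0 is assumed, and the 0.8 confidence cut-off is an arbitrary example value since the text does not state one.

```python
def resize_nearest(img, out_h, out_w):
    """Resize a 2-D list to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def pad_border(img, border=1, background=0):
    """Surround the image with a border of background-coloured pixels."""
    width = len(img[0]) + 2 * border
    padded = [[background] * width for _ in range(border)]
    for row in img:
        padded.append([background] * border + list(row) + [background] * border)
    padded.extend([background] * width for _ in range(border))
    return padded


def prepare_segment(segment, background=0):
    """Resize to 30 x 30, then add a 1-pixel border to reach 32 x 32."""
    return pad_border(resize_nearest(segment, 30, 30), border=1,
                      background=background)


def accept_prediction(probabilities, labels, min_confidence=0.8):
    """Return the top label if its probability is high enough, else None.

    min_confidence is an assumed example value, not taken from the paper.
    """
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] >= min_confidence:
        return labels[best]
    return None


if __name__ == "__main__":
    segment = [[1] * 10 for _ in range(20)]    # hypothetical 20 x 10 segment
    model_input = prepare_segment(segment)
    print(len(model_input), len(model_input[0]))  # 32 32
```

Adding the border after resizing keeps the character from touching the edge of the network input, which matches the 30 by 30 plus 1-pixel border layout described above.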
Character Recognition techniques can be classified as:
1. K-nearest neighbors [24,35]
2. Support Vector Machine [11,14,15]
3. Convolutional Neural Networks [1,32]
4. Hybrid Network [18]
5. Artificial Neural Network [28]
These are the most commonly used methods, but other methods could also be considered.
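As an illustration of the simplest classifier in the list, a k-nearest neighbors prediction can be sketched as follows. This is a generic sketch and not the method of any cited paper: the function name knn_predict, the Euclidean distance, the value k=3, and the two-dimensional toy feature vectors are all assumptions made for the example. Real systems would use feature vectors extracted from the character images.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest training vectors."""
    # Pair every training vector's distance to the query with its label.
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    # Count the labels of the k closest training samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for two character classes.
    vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["ka", "ka", "ka", "kha", "kha", "kha"]
    print(knn_predict(vectors, labels, (0.2, 0.1)))  # ka
    print(knn_predict(vectors, labels, (5.5, 5.5)))  # kha
```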
G. Prediction
This stage brings together the results of the previous processes. The recognized result is shown as a border drawn
around each detected character together with its corresponding label. The final detection takes some time and gives an
accurate prediction when the image quality is good. If a poor-quality image is given, it gives false predictions, so a
high-resolution image is recommended.
B. Deep Learning
Deep learning [1, 27, 40] is a branch of machine learning in artificial intelligence (AI) that uses neural networks
with many layers to learn features directly from data, including data that is unlabeled. Deep learning is also
referred to as deep neural networks. Convolutional Neural Networks (CNN), along with other neural networks such as
ANN, RNN, and hybrid models, are used in deep learning techniques.
[3] A Novel SVM-based Handwritten Tamil character recognition system
    Pre-processing: Operations are performed on the digitized image to enhance the quality of the image.
    Technique: Support vector machine (SVM) method
    Dataset: Data samples are collected from different writers on A4 sized documents and then scanned.
    Accuracy (%): 82.04
[4] Fuzzy Model Based Recognition of Handwritten Hindi Characters
    Pre-processing: -
    Technique: Fuzzy Model Based
    Dataset: 4750 samples
    Accuracy (%): 90.65
[5] A Bilingual OCR for Hindi-Telugu Documents and its Applications
    Pre-processing: Scanned document is filtered and binarized.
    Technique: Based on Principal Component Analysis followed by support vector classification
    Dataset: 200000 characters
    Accuracy (%): 96.7
Copyright to IJARSCT DOI: 10.48175/IJARSCT-8863
www.ijarsct.co.in
ISSN (Online) 2581-9429
[Figure: Distribution of papers by source (IEEE, Springer, Elsevier, ACM, Conference, Others)]
[Figure: Number of Papers vs Years (< 2010, 2010-2015, 2015-2018, 2019-2021)]
[Figure: Deep Learning Models. Percentage of Papers vs Models (SVM, CNN, Fuzzy, KNN, Bayes, PCA, Gaussian, Other)]
V. CONCLUSION
Character recognition is one of the most common applications of image processing. Due to the complexities of Indian
languages, it has been recognized as one of the challenging research problems in the field of computer vision and
pattern recognition. A lot of research is still being done on large datasets of these languages to handle the
complexities and other issues.
This paper presents a study of handwritten character recognition using deep learning, based on a number of related
papers. It also presents a survey of preprocessing techniques, classifiers, and recognition techniques for handwritten
Devanagari character recognition. Deep learning techniques are commonly used for character recognition because of
their high tolerance and lower error rates. This survey helps researchers and developers understand the various
techniques and the way they are implemented for recognition. The image is first scanned, and then the data is
preprocessed. Preprocessing involves various techniques such as binarization, noise removal, and normalization. After
that, features are extracted from the preprocessed data so that the relevant data can be used to train the model.
Various classification techniques are applied, and the best approach is chosen based on accuracy. Various models that
use a multilayer perceptron compare the input image with the trained set to achieve high accuracy. This paper has
focused on various approaches in order to identify an effective algorithm. The study concludes that the SVM and CNN
classifiers both provide better results, with accuracies of 99.6% and 98.47% respectively. The study points out that
work on Devanagari scripts is still in progress, so a lot more can be done in this domain.
REFERENCES
[1]. Shailesh Acharya, Ashok Kumar Pant, Prashnna Kumar Gyawali. Deep Learning Based Large Scale
Handwritten Devanagari Character Recognition, 2015 9th International Conference on Software, Knowledge,
Information Management and Applications (SKIMA).
[2]. C. V. Lakshmi, R. Jain, and C. Patvardhan. Handwritten Devnagari numerals recognition with higher
accuracy, in Proc. Int. Conf. Comput. Intell. Multimedia Appl., 2007.
[3]. Shanthi N and Duraiswami K. A Novel SVM-based Handwritten Tamil character recognition system,
Springer Pattern Analysis & Applications, Vol. 13, No. 2, pp. 173-180, 2010.
[4]. M. Hanmandlu, O. V. R. Murthy, and V. K. Madasu. Fuzzy Model based recognition of handwritten Hindi
characters, in Proc. Int. Conf. Digital Image Comput. Tech. Appl., 2007.
[5]. C. V. Jawahar, M. N. S. S. K. Pavan Kumar, S. S. Ravi Kiran. A Bilingual OCR for Hindi-Telugu Documents
and its Applications, Proc. of the 11th ICPR, vol. II, pp. 200-203, 1992.
[6]. C. Chandra Sekhar. Online Handwritten Character Recognition of Devanagari and Telugu Characters using
Support Vector Machines, Tenth International Workshop on Frontiers in Handwriting Recognition, Université
de Rennes 1, Oct 2006, La Baule (France). inria-00104402.