Sanskrit As Inter-Lingua Language in Machine Translation
Sunita Chand
Abstract This paper examines the role of Sanskrit as an inter-lingua in
multi-language machine translation. Inter-lingua and direct transformation
based approaches have been in use for a long time, sometimes complementing and
sometimes competing with each other. The inter-lingua based approach is
efficient for multi-lingual machine translation; for example, the Angla-Bharati
system uses a pseudo-lingua for Indian languages (PLIL) as the inter-lingua for
translation from English to Indian regional languages. It is proposed to use
Sanskrit as an inter-lingua in multi-language machine translation.
Keywords Inter-lingua ⋅ Machine translation ⋅ Natural language ⋅ Corpus based ⋅ Sanskrit ⋅ Hindi
1 Introduction
Natural language processing has evolved from the field of artificial intelligence
in order to provide linguistic intelligence to machines, so that they can
unambiguously interpret sentences given as input in a regional language and do
the desired processing. Machine translation (MT) is a task that requires knowledge
of various other disciplines such as computational linguistics, cognitive science
and computer science. MT has been made possible by various approaches, classified
as follows:
S. Chand (✉)
University of Delhi, New Delhi, India
e-mail: sunitamk@gmail.com
In the direct approach, the words or phrases of the source language are translated
as they are into the target language by using a dictionary. It needs a comprehensive
dictionary of all the words and their phrases. It seems unrealistic that this
method can cope with the complexity and ambiguous nature of a natural language.
An example of a system following this approach is Anusaaraka.
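The limitation of word-for-word substitution can be seen in a minimal sketch. The dictionary entries (romanized Hindi) and the function name `translate_direct` are hypothetical and chosen only for illustration; they do not describe how Anusaaraka or any other real system works.

```python
# Toy bilingual dictionary (hypothetical entries, romanized Hindi).
TOY_DICTIONARY = {
    "ram": "raam",
    "eats": "khaata hai",
    "fruit": "phal",
}


def translate_direct(sentence, dictionary):
    """Replace each source word by its dictionary entry, keeping source word order."""
    output = []
    for word in sentence.lower().split():
        # Words missing from the dictionary are passed through in brackets.
        output.append(dictionary.get(word, f"[{word}]"))
    return " ".join(output)


print(translate_direct("Ram eats fruit", TOY_DICTIONARY))
# prints: raam khaata hai phal
```

The output keeps the English subject-verb-object order, whereas the natural Hindi order is "raam phal khaata hai". Any word outside the dictionary is simply left untranslated.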
The rule based approach involves generating a database of the rules used by the
source language as well as the target language, and obtaining a parse of the
source language for mapping to the target language structure using these rules.
Many systems have been developed using this approach, e.g. Angla-Bharati, Matra
and Anubaad. The limitation of this approach is that all the rules cannot be
incorporated in a system, so such systems are inadequate, provide limited
coverage and sometimes produce incorrect translations.
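A rule based transfer can be sketched as three steps: parse, reorder by rule, then substitute words. The single rule, the toy lexicon and the three-word parser below are assumptions made for illustration and are not taken from any of the systems named above.

```python
# Hypothetical transfer rule: English subject-verb-object (S, V, O) order
# is mapped to Hindi subject-object-verb (S, O, V) order.
TRANSFER_RULES = {("S", "V", "O"): ("S", "O", "V")}

# Toy lexicon (hypothetical entries, romanized Hindi).
TOY_LEXICON = {"ram": "raam", "eats": "khaata hai", "fruit": "phal"}


def parse_svo(sentence):
    """Toy parser: assumes the input is a three-word subject-verb-object sentence."""
    words = sentence.lower().split()
    if len(words) != 3:
        raise ValueError("toy parser only handles three-word S-V-O sentences")
    return dict(zip(("S", "V", "O"), words))


def translate_rule_based(sentence):
    """Parse the source, apply the reordering rule, then translate each word."""
    parse = parse_svo(sentence)
    source_pattern = tuple(parse)  # role labels in source order
    target_pattern = TRANSFER_RULES[source_pattern]
    return " ".join(TOY_LEXICON.get(parse[role], parse[role]) for role in target_pattern)


print(translate_rule_based("Ram eats fruit"))
# prints: raam phal khaata hai
```

A sentence that falls outside the rule set, such as "Ram eats a fruit", raises an error here, which mirrors the limited coverage of a rule base that cannot list every construction.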
Corpus based systems are further divided into example based MT systems, e.g.
ANUBHARATI, and statistical MT systems (SBMT), e.g. Google Translate and
Bing Translator. These systems learn how to translate by analyzing existing
human translations (known as bilingual text corpora). The success of these systems
depends on the availability of representative parallel corpora with
wide and adequate coverage in the domain of application.
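How a system can learn word translations from a parallel corpus may be illustrated with IBM Model 1, a classic word-alignment model. It is used here only as a well-known teaching example; it is not claimed to be the method behind any of the systems named above. The three-sentence corpus (romanized Hindi) and the function names are hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(parallel_corpus, iterations=10):
    """Estimate t[(f, e)] = P(target word f | source word e) by expectation-maximization."""
    source_vocab = {e for src, _ in parallel_corpus for e in src}
    target_vocab = {f for _, tgt in parallel_corpus for f in tgt}
    # Start from a uniform distribution over target words.
    t = {(f, e): 1.0 / len(target_vocab) for f in target_vocab for e in source_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected alignment counts.
        for src, tgt in parallel_corpus:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize the counts into probabilities.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = [f for (f, e) in t if e == source_word]
    return max(candidates, key=lambda f: t[(f, source_word)])


# Hypothetical toy corpus: (English words, romanized Hindi words).
TOY_CORPUS = [
    (["red", "fruit"], ["laal", "phal"]),
    (["red", "flower"], ["laal", "phool"]),
    (["sweet", "fruit"], ["meetha", "phal"]),
]

table = train_ibm_model1(TOY_CORPUS)
print(best_translation(table, "red"), best_translation(table, "fruit"))
# prints: laal phal
```

No dictionary is supplied; the word pairs emerge only from which words co-occur across sentence pairs, which is why coverage of the corpus decides the quality of the result.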
In the inter-lingua approach, translation between the source language and the
target language is accomplished by using an intermediate language which is capable
of presenting the whole information contained in the source language sentences in
unambiguous form. The intermediate language is then translated into target language
text. This approach is very efficient for designing multilingual translation systems
with minimum additional effort. The quality and success of this approach depend
on the ‘virtues’ of the intermediate language and the intermediate structure obtained
from the source text. PLIL is one such intermediate structure, used by the
Angla-Bharati system [1]. Other inter-lingua representations include UNL (Universal
Networking Language) [2] and the one used by the KANT system [3].
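The "minimum additional effort" can be made concrete by counting translation modules. With n languages, direct translation needs one module per ordered language pair, while an inter-lingua design needs one analyzer and one generator per language. The function names below are hypothetical; the counts follow from this standard argument and not from any figure reported in the source.

```python
def direct_modules(n_languages):
    """One translator for every ordered (source, target) pair of distinct languages."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages):
    """One analyzer (language to inter-lingua) and one generator per language."""
    return 2 * n_languages


for n in (3, 5, 10):
    print(n, direct_modules(n), interlingua_modules(n))

# Extra modules needed when an 11th language joins a 10-language system.
print(direct_modules(11) - direct_modules(10))            # prints: 20
print(interlingua_modules(11) - interlingua_modules(10))  # prints: 2
```

Under these assumptions, each new language costs two modules in the inter-lingua design regardless of how many languages are already supported.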
2 Features of an Inter-Lingua
The entire Sanskrit grammar, known as Ashtadhyayi, was created by the sage Panini
with the help of fourteen distinctive sounds that he conceived from God Shiva’s
damru (the small hand-drum which God Shiva holds in His hand).
The perfection of Sanskrit grammar is illustrated by the
extensiveness of its grammatical tenses: one form for the present tense, three
forms for the past tense and two forms for the future tense. There are also
exclusive representations for the potential mood, the imperative mood, the
benedictive mood (called asheerling, which is used for indicating blessing), and
the conditional. There are separate forms for each of the three grammatical persons
(first, second and third person), and each further distinguishes among ekvachan,
dvi-vachan and bahu-vachan, i.e., whether it refers to one, two or more than
two people. In addition, the three categories of verbs, known as atmanepadi,
parasmaipadi and ubhaipadi, signify that the outcome of the action is
related to the doer, to the other person, or to both, respectively.
In this way there are ninety forms of one single verb (ten tenses and moods,
each in three persons and three numbers).
For example, the root word (known as dhatu) ‘kri’ means ‘to do’. Sanskrit has
ninety forms of this verb, e.g., karoti, kurutah, kurvanti, etc., whereas in
English there are only a few forms of each word, e.g., do, doing, and
done, as shown in the figure below. Additional words, e.g. is, was, will, has been,
have been, had, had been, etc., are added to these forms of the verb to distinguish
the tenses. In Sanskrit, by contrast, there are distinct single words for all kinds
of uses and situations. There are words for all three genders of nouns
and pronouns, and each word has twenty-one forms of its own to cover all
situations.
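The counts of ninety and twenty-one can be checked by enumerating the grammatical categories. The tense and mood list follows the passage above; the English labels for the three past and two future tenses are the conventional ones and are not given in the source. Reading twenty-one as seven cases (vocative left out) times three numbers is likewise an assumption about how that figure arises.

```python
from itertools import product

# One present, three past, two future, plus four moods (labels are conventional).
TENSES_AND_MOODS = [
    "present",
    "past (imperfect)", "past (perfect)", "past (aorist)",
    "future (simple)", "future (periphrastic)",
    "potential", "imperative", "benedictive", "conditional",
]
PERSONS = ["first", "second", "third"]
NUMBERS = ["ekvachan", "dvi-vachan", "bahu-vachan"]

# Assumption: seven cases, with the vocative not counted separately.
CASES = [
    "nominative", "accusative", "instrumental", "dative",
    "ablative", "genitive", "locative",
]

# Every verb form fills one (tense or mood, person, number) slot.
verb_slots = list(product(TENSES_AND_MOODS, PERSONS, NUMBERS))
# Every noun form fills one (case, number) slot.
noun_slots = list(product(CASES, NUMBERS))

print(len(verb_slots))  # prints: 90
print(len(noun_slots))  # prints: 21
```

Each slot corresponds to a single inflected word, so grammatical information that English spreads over auxiliary words is carried by the word form itself.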
4 Conclusion
Considering the points explained above, it is quite evident that Sanskrit
is the source of all other languages of the world and not a derivation of any
language. As such, it has the capacity to represent the content of any source
language and thus qualifies as an inter-lingua that can be used in the
mapping of multiple source languages to multiple target languages. As opposed to
the KANT system [7], which produces a source F-structure as the inter-lingua,
the proposed system may be easier to implement, as each transfer from the
source language to Sanskrit will be governed by the well-known grammatical
rules of Sanskrit, and the Sanskrit representation can then be transferred to any
other language.
References
1. Sinha RMK, Jain A (2003) AnglaHindi: an English to Hindi machine aided translation system.
http://anglahindi.iitk.ac.in, MTS-2003
2. Dave S, Parikh J, Bhattacharyya P (2001) Inter-lingua-based English-Hindi machine translation
and language divergence. Mach Transl 16:251–304
3. Nyberg EH (1996) Controlled Language and Knowledge Based Machine Translation:
Principles and Practice. In: First International workshop on controlled language applications,
Katholieke University, Leuven, 26–27 March 1996
4. Adusumilli KK Natural languages translation using an intermediate language. IAENG Int J
Comput Sci 33:1, IJCS_33_1_20
5. Lampert A (2004) Inter-lingua in machine translation. Technical Report, 2004
6. Al Ansary S Inter-lingua-based machine translation Systems: UNL versus other inter-linguas
7. Mitamura T, Nyberg EH, Carbonell JG (1991) An efficient inter-lingua translation system for
multi-lingual document production. In: Proceedings of machine translation summit III,
Washington D.C., 2–4 July 1991