Sentiment Analysis
NILEEMA PATIL
Dept. of Physics, LTCE, Affiliated to Mumbai University, Mumbai, India. patil.nileema@gmail.com
SURABHI ROTIWAR
Department of Information Technology, FRCRCE, Affiliated to Mumbai University, Mumbai, India. surabhi.rotiwar@gmail.com
JASON NUNES
Department of Information Technology, FRCRCE, Affiliated to Mumbai University, Mumbai, India. jason.nunes@gmail.com
NOVATEUR PUBLICATIONS
International Journal of Research Publications in Engineering and Technology [IJRPET]
ISSN: 2454-7875
VOLUME 3, ISSUE 6, Jun.-2017
• To create a system that uses a corpus-based approach and can carry out advanced Sentiment Analysis for a vernacular language, i.e. Marathi, detecting hidden sentiments in text and analyzing the content accordingly.
B. ALGORITHM
The algorithm first finds the individual polarity of each word in the sentence and then finds the cumulative polarity to determine whether the sentiment is positive, negative or neutral, and to what degree. A sentence in Marathi is the input to the algorithm. The steps are as follows:
• Elimination of stop words. Words that do not attribute any specific polarity to the sentence are redundant when calculating the overall polarity of the sentence. Hence they are known as stop words.
• Obtaining the polarity of relevant keywords from the corpus. Here, the polarities are +0.75 and +0.5.
• Calculating cumulative polarity. The cumulative polarity P is calculated as P = (a + b + c + ...)/n, where a, b, c are the individual polarities mapped from the corpus and n is the number of words.
Figure 3: Negative polarity output
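The three steps of the algorithm (stop-word elimination, polarity lookup, cumulative polarity) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word set, the two-entry corpus, and the function names are hypothetical, n is taken to be the number of keywords left after stop-word removal, and words missing from the corpus are simply given polarity 0.0 instead of being translated. Degree thresholds such as "Highly Positive" are not specified in the text, so the sketch labels by sign only.

```python
from typing import Dict, List, Set

# Hypothetical data for illustration only; this is not the paper's corpus.
STOP_WORDS: Set[str] = {"मी", "ती", "आहे"}
CORPUS: Dict[str, float] = {"सुंदर": 0.75, "आनंदी": 0.5}

# Basic punctuation stripped from the edges of each token.
PUNCTUATION = ",.!?;:"


def keywords(sentence: str, stop_words: Set[str]) -> List[str]:
    """Split on whitespace, strip basic punctuation, and drop stop words."""
    tokens = (tok.strip(PUNCTUATION) for tok in sentence.split())
    return [tok for tok in tokens if tok and tok not in stop_words]


def cumulative_polarity(polarities: List[float]) -> float:
    """P = (a + b + c + ...) / n; returns 0.0 when there are no polarities."""
    if not polarities:
        return 0.0
    return sum(polarities) / len(polarities)


def analyze(sentence: str, corpus: Dict[str, float], stop_words: Set[str]) -> float:
    """Return the cumulative polarity of one sentence."""
    words = keywords(sentence, stop_words)
    # Assumption: n counts the keywords remaining after stop-word removal,
    # and a keyword absent from the corpus contributes 0.0.
    polarities = [corpus.get(word, 0.0) for word in words]
    return cumulative_polarity(polarities)


def label(polarity: float) -> str:
    """Sign-based label; degree thresholds are not given in the text."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"


if __name__ == "__main__":
    p = analyze("ती सुंदर आहे", CORPUS, STOP_WORDS)
    print(p, label(p))
```

With the keyword polarities +0.75 and +0.5 from the example, the formula gives P = (0.75 + 0.5)/2 = 0.625. If n were instead meant to count every word in the sentence, only the divisor in `cumulative_polarity` would change.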
Table 2 lists the test cases and their polarities.
Table 2: Test cases with Result
Input | Output
मला वाटते ती, अतिशय सुंदर आहे | 0.84, Highly Positive
किती सुंदर आहे ती! | 0.75, Highly Positive
ती आहे म्हणून दुःखी? | 0.0, Neutral Polarity
मी खूप आनंदी आहे | 0.14, Positive Polarity
मी आनंदी नाही | -0.12, Negative Polarity
मी नाही धन्य किंवा सुखी | -0.125, Negative Polarity
तो नाही आहे की, मी आनंदी नाही | 0.125, Positive Polarity
"तुम्ही कसे आहात?" | 0.0, Neutral Polarity
"मी दुःखी आहे" | 0.0, Neutral Polarity
After careful analysis of the test cases, the few discrepancies in the obtained results were found to be mainly due to the following 3 factors:
1) Corpus word limit and translation accuracy: If a word is not present in the corpus built, the online Yandex translator is used and the sentiment is analyzed with the English SentiWordNet. The freely available Yandex translator that the system uses provides an accuracy in the range of 60-70%. Thus, especially for Asian vernacular languages, the translation is only partially accurate. These inaccurate translations affect the overall polarity of the sentences, which can result in minor discrepancies in the overall output. However, for shorter and simpler sentences, the translation accuracy is comparatively higher than for complex compound sentences. It was observed that the framing of the sentences makes a difference in the sentiment analysis.
2) Limited scope of English SentiWordNet: The English SentiWordNet 3.0, which the system uses to obtain the polarities of the respective words, also needs a lot of optimization. Many words that are actually of neutral polarity are misclassified as adjectives having higher or lower polarity, and hence either provide wrong polarities or tamper with the algorithm.
3) Non-acceptance of special characters: The system correctly parses and analyzes basic punctuation such as commas, question marks and exclamation marks, but a special character such as “” is parsed along with the word. The system then fails to recognize and process the word and does not give any output. Hence there is a need to include a separate exception class to eliminate these special characters.

VI. CONCLUSION:
Sentiment Analysis has become quite popular and has led to the building of better products, a better understanding of users' opinions, and better execution and management of business decisions. With rapidly advancing technology, the early approach of word-of-mouth has shifted towards mass opinion, i.e. what the majority of people like and appreciate. The rise in user-generated content in Marathi across various genres (news, culture, arts, sports, etc.) has opened up data to be explored and mined effectively, to provide better services and facilities in terms of sentiment analysis. The scarcity of resources is one of the biggest challenges in sentiment analysis for the Marathi language. This system focused on resource creation, which includes building an up-to-date corpus for Marathi. The proposed algorithm gives the cumulative polarity using the WordNet. This sentiment analysis model proposes a novel and effective approach to achieve the desired functionality for Marathi. The system mainly focused on the creation of a vast and diverse up-to-date corpus, efficient mapping of data, and generation of accurate sentiments for the data, to erase the language barriers faced in the field of sentiment analysis for Marathi.

VII. FUTURE SCOPE:
The scope of this system is limited to the sentence level. It can be extended to analyze the sentiment of paragraphs and even larger text documents. This can be done by first finding the individual polarity of each sentence using our suggested algorithm and then finding the net cumulative polarity of all sentences to obtain the overall polarity of the paragraph or document. An optimized and accurate algorithm is needed to do this. Further, the limited size of the corpus can be increased by considering not only adjectives but also verbs and nouns, which are already present in the English SentiWordNet. Thus, better accuracy and results can be obtained when evaluating the sentiment. The dictionary size can also be increased by regularly updating it with new words. Real-time updating of the dictionary can be a future research direction in the field of sentiment analysis of the Marathi language.

REFERENCES:
1) Report ‘Indian Languages - Defining India’s Internet’, a study by KPMG in India and Google, April 2017.
2) Sentiment analysis, https://en.wikipedia.org/wiki/Sentiment_analysis, accessed in Aug. 2016.
3) Alessia D’Andrea, Fernando Ferri, Patrizia Grifoni, “Approaches, Tools and Applications for Sentiment Analysis Implementation”, International Journal of Computer Applications (0975-8887), Volume 125, No. 3, September 2015.
4) H. K. Walaa Medhat, Ahmed Hassan, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Engineering Journal (2014) 5, 1093–111, 2014.
5) S. S. Namrata Godbole, Manjunath Srinivasaiah, “Large-scale sentiment analysis for news and blogs,” ICWSM’2007, Boulder, Colorado, USA, 2007.
6) G. Q. Soha Ahmed, Michael Pasquier, “Key issues in conducting sentiment analysis on Arabic social media text,” IIT’13, 2013.
7) D. J. Hatem Ghorben, “Sentiment analysis of French movie reviews,” Proceedings of the 7th Atlantic Web Intelligence Conference, AWIC 2011, pp. 19-28, 2011.
8) M. K. Yakshi Sharma, Veenu Mangat, “A practical approach to sentiment analysis of Hindi tweets,” 1st International Conference on Next Generation Computing Technologies (NGCT), pp. 677-680, 2015.
9) Kerstin Denecke, “Using SentiWordNet for multilingual sentiment analysis,” Data Engineering Workshop, ICDEW 2008, IEEE 24th International Conference on, IEEE, 2008.