Sentiment Analysis On Online Product Reviews
Dr. Rajesh Bose1, Raktim Kumar Dey2, Sandip Roy3, *, Dr. Debabrata Sarddar4
1, 2 Simplex Infrastructures Ltd., 3 Brainware Group of Institutions, 4 University of Kalyani
Kolkata, India.
{bose.raj00028, deyraktim, *sandiproy86, dsarddar1}
Abstract. Today people exchange their thoughts through online Web
forums, blogs, and social media platforms. They often post reviews and
opinions on products, brands, and services. These reviews not only help
to improve product quality but also influence consumers' purchase
decisions. Product review analysis is therefore a widely accepted means
by which consumers can learn whether a product meets their requirements.
In this experiment, we track 568,454 fine food reviews of 74,258
products by 256,059 users on Amazon over a period of more than ten years.
To analyze the results, we select the six most popular products and users
based on their plain-text reviews and apply the NRC emotion lexicon, which
categorizes words into eight basic emotions and two sentiments. Word
clouds also help us to compare the eight emotion categories. Our results
show how sentiment analysis helps to identify consumers' behavior and to
overcome risks to consumer satisfaction.
1 Introduction
Today, electronic word-of-mouth (e-WOM) is one of the most important factors in
digital marketing [1]. Companies use digital platforms to promote their
products, and customers' online reviews can influence the decision to purchase
a product. E-commerce companies such as Amazon, eBay, Flipkart, and Walmart
analyze their customers' reviews in different ways [2, 3].
Customers post unstructured data such as text, audio, photos, and videos on
social media websites, Web forums, blogs, and online review platforms
[4, 5]. Natural Language Processing (NLP) provides automated techniques
to analyze and comprehend large volumes of customer opinions, and
companies also use it to build chatbots that support online customer service
interactions [6, 7, 8].
Sentiment classification plays an important role in classifying unstructured data.
Cui et al. classified online product reviews into positive and negative classes [9]. In
their experiment, they used different machine learning algorithms to evaluate the
trade-offs on 100K online product reviews. Fang et al. tackled the sentiment polarity
categorization problem in online product reviews [10]. They
categorized their outcomes into two levels: a) sentence-level categorization and
b) review-level categorization.
In this manuscript, we use the NRC emotion lexicon to categorize customers'
reviews into eight emotions (anger, fear, trust, anticipation, sadness, surprise, disgust,
and joy) and two sentiments (positive and negative) [11]. Our proposed analysis
system can enhance the review technique discussed by Escalona [12].
The rest of the manuscript is organized as follows. Section 2 presents the
literature survey. Section 3 deals with data preparation and review analysis.
Section 4 discusses the results, and Section 5 draws conclusions and
examines possibilities for future work.
2 Literature Survey
The dataset was collected from Kaggle and consists of reviews of fine foods from
Amazon from October 1999 to October 2012 [29, 30]. It includes 568,454
reviews, 74,258 products, 256,059 users, and 260 users with more than 50 reviews.
The dataset, in the form of plain-text reviews, is mainly unstructured data [31]. We
first apply data pre-processing to the dataset [32, 33], following the steps
below [34]:
Remove all URLs, screen names (e.g. @username), and hashtags (e.g. #topic)
Remove all symbols, punctuation, and numbers
Remove stop words
Substitute any non-UTF-8 characters with a space
Replace all the emoticons with their sentiment
Change text to lowercase
Replace words with their stems or roots
Remove the retweets
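The steps above can be sketched in Python as follows. This is a minimal illustration rather than the pipeline used in the study: the stop-word list, the emoticon mapping, the `crude_stem` helper, and the function names are all assumptions introduced here, the order of the steps is one reasonable choice, and the non-UTF-8 substitution is approximated by replacing non-ASCII characters with a space.

```python
import re

# Small illustrative stop-word list (assumption; not the list used in the study).
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "i", "my", "for"}

# Hypothetical emoticon-to-sentiment mapping, for illustration only.
EMOTICONS = {":)": "positive", ":-)": "positive", ":(": "negative", ":-(": "negative"}


def crude_stem(word):
    """Strip a few common suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Turn one raw review into a list of cleaned, stemmed tokens."""
    for emoticon, label in EMOTICONS.items():            # emoticons -> sentiment word
        text = text.replace(emoticon, f" {label} ")
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"[@#]\w+", " ", text)                 # screen names, hashtags
    text = re.sub(r"[^\x00-\x7F]", " ", text)            # non-ASCII -> space
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)                # symbols, punctuation, digits
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return [crude_stem(t) for t in tokens]


def preprocess_all(texts):
    """Drop retweets (texts starting with 'RT ') and preprocess the rest."""
    return [preprocess(t) for t in texts if not t.startswith("RT ")]


print(preprocess("I LOVED this coffee :) see http://example.com @bob #yum 100%!"))
```

Emoticons are replaced before punctuation is stripped, since otherwise they would be deleted along with the other symbols.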
In our experiment, the NRC emotion lexicon, a large word list (like other lexicons such
as AFINN, ANEW, EmoLex, LabMT, General Inquirer, and SentiWordNet), is used for
sentiment analysis [11, 35]. This lexicon (version 0.92) has about 14,200 unigrams
(word types) annotated with word-level emotion associations [11, 36].
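A lexicon-based tally of this kind can be sketched as below. The sketch assumes a word-level lexicon file in which each line holds a word, a category, and a 0/1 association flag separated by tabs; the entries in `TOY_LINES` are made up for illustration and are not taken from the NRC lexicon, and the function names are hypothetical.

```python
from collections import Counter

CATEGORIES = ["anger", "anticipation", "disgust", "fear", "joy",
              "sadness", "surprise", "trust", "negative", "positive"]


def load_lexicon(lines):
    """Parse 'word<TAB>category<TAB>flag' lines into {word: set of categories}."""
    lexicon = {}
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 3:
            continue                      # skip blank or malformed lines
        word, category, flag = parts
        if flag == "1":                   # keep only positive associations
            lexicon.setdefault(word, set()).add(category)
    return lexicon


def emotion_counts(tokens, lexicon):
    """Count how many tokens are associated with each category."""
    counts = Counter({c: 0 for c in CATEGORIES})
    for token in tokens:
        for category in lexicon.get(token, ()):
            counts[category] += 1
    return counts


# Toy entries in the assumed format (illustrative, not real lexicon content).
TOY_LINES = [
    "delicious\tjoy\t1",
    "delicious\tpositive\t1",
    "delicious\tanger\t0",
    "stale\tdisgust\t1",
    "stale\tnegative\t1",
]

lexicon = load_lexicon(TOY_LINES)
counts = emotion_counts(["delicious", "stale", "coffee", "delicious"], lexicon)
print(counts)
```

Note that stemmed tokens may no longer match lexicon entries, so in practice the lookup might be applied to unstemmed words.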
In Fig. 1 and Fig. 3, bar charts represent the eight emotions (anger, fear, trust,
anticipation, sadness, surprise, disgust, and joy) and two sentiments (positive and
negative). We track the six most popular users and products for our sentiment analysis.
In the user-centric approach, we identify prolific reviewers, each of whom has given
an opinion on m products on Amazon. Our aim is to determine the sentiment expressed
in their reviews, which helps to identify the sentiment of the customers. Similarly,
in the product-centric approach, we identify the most-reviewed products, each of which
has been reviewed by n customers, and analyze the sentiment or emotion of those
n customers towards the product.
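Selecting the most active users and the most-reviewed products can be sketched as follows. The record keys `UserId`, `ProductId`, and `Text` are assumed here to mirror the columns of the review file, and the function names and sample records are hypothetical.

```python
from collections import Counter, defaultdict


def top_entities(reviews, key, n=6):
    """Return the n most frequent values of `key` (e.g. 'UserId' or 'ProductId')."""
    counts = Counter(review[key] for review in reviews)
    return [value for value, _ in counts.most_common(n)]


def texts_by_entity(reviews, key, entities):
    """Group review texts under each selected user or product."""
    wanted = set(entities)
    grouped = defaultdict(list)
    for review in reviews:
        if review[key] in wanted:
            grouped[review[key]].append(review["Text"])
    return dict(grouped)


# Made-up records for illustration only.
reviews = [
    {"UserId": "u1", "ProductId": "p1", "Text": "great tea"},
    {"UserId": "u1", "ProductId": "p2", "Text": "stale chips"},
    {"UserId": "u1", "ProductId": "p2", "Text": "still stale"},
    {"UserId": "u2", "ProductId": "p2", "Text": "fine snack"},
    {"UserId": "u2", "ProductId": "p1", "Text": "lovely aroma"},
    {"UserId": "u3", "ProductId": "p2", "Text": "too salty"},
]

top_users = top_entities(reviews, "UserId", n=2)        # user-centric view
top_products = top_entities(reviews, "ProductId", n=1)  # product-centric view
print(top_users, top_products)
print(texts_by_entity(reviews, "UserId", top_users))
```

The same two functions serve both views; only the grouping key changes.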
In Fig. 2 and Fig. 4, word clouds visualize the review data and illustrate the
keywords used most frequently by the customers. We categorize the review text
into eight emotion categories [37, 38].
In each category, the highest-frequency words give the most insight into the
customers [39]. In our experiment, frequent keywords in Amazon customers' reviews
include “good” or “wonderful” in the surprise category and “flavor”, “organic”,
or “smell” in the disgust category.
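The per-category word frequencies that underlie such a comparison can be sketched as below. The toy lexicon reuses two of the keywords reported above purely as an illustration, the extra "joy" association is made up, and the function name is hypothetical; the resulting frequency tables are what a word cloud renderer would take as input.

```python
from collections import Counter, defaultdict

EMOTIONS = {"anger", "fear", "trust", "anticipation",
            "sadness", "surprise", "disgust", "joy"}


def words_by_emotion(tokens, lexicon, top_n=5):
    """Return the top_n most frequent words for each of the eight emotions."""
    freq = defaultdict(Counter)
    for token in tokens:
        for category in lexicon.get(token, ()):
            if category in EMOTIONS:      # ignore positive/negative sentiment
                freq[category][token] += 1
    return {emotion: counter.most_common(top_n) for emotion, counter in freq.items()}


# Toy lexicon for illustration only (not actual lexicon content).
toy_lexicon = {
    "good": {"surprise", "joy", "positive"},
    "smell": {"disgust", "negative"},
}

top_words = words_by_emotion(["good", "good", "smell", "tea"], toy_lexicon)
print(top_words)
```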
Fig. 1. Distribution of emotion of six most popular product reviews of Amazon
Fig. 2. Word Cloud of six most popular product reviews of Amazon users.
Fig. 3. Distribution of emotion of six most popular product reviews of Amazon
Fig. 4. Word Cloud of six most popular product reviews of Amazon products
4 Result Analysis
In our research, we performed sentiment and emotion classification using the NRC
emotion lexicon [11, 36]. Our experimental results show how important customers'
reviews are for digital marketing research [40, 41]. Sometimes the ratings that
customers give differ from their comments [42, 43, 44]. Other researchers have also
worked on sentiment analysis of customers' reviews on different e-commerce websites
[45, 46].
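One way to surface reviews whose rating disagrees with the comment is sketched below. This is an assumption-laden illustration, not a procedure from the study: the thresholds (4 or more stars as positive, 2 or fewer as negative), the record keys `Id`, `Score`, and `tokens`, the toy lexicon, and the function names are all introduced here.

```python
def text_polarity(tokens, lexicon):
    """Label a token list by comparing positive and negative lexicon hits."""
    positive = sum("positive" in lexicon.get(t, ()) for t in tokens)
    negative = sum("negative" in lexicon.get(t, ()) for t in tokens)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


def rating_polarity(score):
    """Map a 1-5 star rating to a coarse label (thresholds are an assumption)."""
    if score >= 4:
        return "positive"
    if score <= 2:
        return "negative"
    return "neutral"


def find_mismatches(reviews, lexicon):
    """Return the ids of reviews whose rating and text point in opposite directions."""
    flagged = []
    for review in reviews:
        from_text = text_polarity(review["tokens"], lexicon)
        from_stars = rating_polarity(review["Score"])
        if "neutral" not in (from_text, from_stars) and from_text != from_stars:
            flagged.append(review["Id"])
    return flagged


# Toy lexicon and records, for illustration only.
toy_lexicon = {"great": {"positive"}, "awful": {"negative"}}
sample = [
    {"Id": 1, "Score": 5, "tokens": ["great"]},
    {"Id": 2, "Score": 5, "tokens": ["awful", "awful", "great"]},
    {"Id": 3, "Score": 1, "tokens": ["great"]},
    {"Id": 4, "Score": 3, "tokens": ["awful"]},
]

print(find_mismatches(sample, toy_lexicon))
```

Reviews where either signal is neutral are left unflagged, so only clear disagreements are reported.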
Our research is not free from constraints. As future work, we will add topic
modeling based sentiment analysis features to the plain-text reviews of customers
on e-commerce websites in order to predict the best product for a customer's
needs [47]. We will also perform perplexity analysis and lexicon quality
assessment on the plain-text review data in the near future [48].
