Project Report
As online marketplaces have grown in popularity over the past decades, online
sellers and merchants ask their purchasers to share their opinions about the
products they have bought. As a result, millions of reviews are generated
daily, which makes it difficult for a potential consumer to decide whether to
buy a product. Analyzing this enormous amount of opinions is also hard and
time consuming for product manufacturers. This project considers the problem
of classifying reviews by their overall sentiment (positive or negative). To
conduct the study, two different techniques, the lexicon-based VADER model
(via NLTK) and the transformer-based RoBERTa model, were applied to fine-food
product reviews from Amazon, and their sentiment classifications were then
compared.
Introduction
As online marketplaces have grown in popularity over the past decades, online
sellers and merchants ask their purchasers to share their opinions about the
products they have bought. Every day, millions of reviews are generated all
over the Internet about different products, services, and places. This has
made the Internet the most important source of ideas and opinions about a
product or a service.
Project Description
This project analyzes Amazon customer reviews of fine foods using sentiment
analysis. In essence, we analyze the emotion behind the text: we compute a
sentiment score for each review and compare the results of two models. For
this we use the VADER and RoBERTa models to calculate polarity scores.
Problem Statement
Prerequisites
Dataset Description
The dataset we use is the Amazon Fine Food Reviews dataset. These reviews
include information about the product and the user, the rating given by the
customer, and a plain-text review. The dataset covers reviews on 74,258
products.
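As a minimal sketch of what working with this data looks like (the real file is typically loaded with pandas from the downloaded CSV; the column names below follow the public Kaggle release, and the two inline rows are invented stand-ins for illustration):

```python
import io
import pandas as pd

# Two toy rows mimicking the Amazon Fine Food Reviews CSV layout.
# In practice you would call pd.read_csv() on the downloaded file instead.
sample = io.StringIO(
    "Id,ProductId,UserId,Score,Text\n"
    "1,B001,A1,5,Great taffy at a great price.\n"
    "2,B002,A2,1,Arrived labeled as jumbo salted peanuts but was small.\n"
)
df = pd.read_csv(sample)

print(df.shape)              # (2, 5)
print(df['Score'].tolist())  # [5, 1]
```

Each row carries the identifiers, the star rating (the Score column), and the plain-text review that the models score.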
Sentiment Analysis
Sentiment analysis is a technique for examining text data to determine its
sentiment. The aim is to automatically recognize and categorize the opinions
stated in a text and estimate its overall emotion. Using machine learning and
text analytics, sentiment analysis algorithms categorize sentences as
positive, neutral, or negative. Many companies, including Amazon and Twitter,
use sentiment analysis to analyze customer reviews of their products and
improve them based on the results.
Challenges
Methodology
Vader Sentiment Analysis
Text sentiment analysis is carried out using the VADER (Valence Aware
Dictionary and sEntiment Reasoner) model, which is sensitive to both the
polarity (positive/negative) and the intensity (strength) of emotion. In
addition to positivity and negativity scores, VADER reports an overall
sentiment for a statement. The sentiment score of a text is obtained by
summing the valence score of each word in the text and normalizing the
result. Because it scores individual words, VADER is good at recognizing
positive and negative sentences from the words they contain. The compound
score in VADER aggregates all the lexicon ratings and is normalized to lie
between -1 (most extreme negative) and +1 (most extreme positive).
Here the customer complains that the product was labeled as large-sized, but
the actual product is small-sized, so the review is most likely negative. As
expected, the VADER model also produced a negative compound score,
classifying it as a negative sentence.
We computed polarity scores for all the texts in the data frame, obtaining
negative, neutral, positive, and compound scores for each. We then merged
these scores into the original data frame to create the VADER data frame.
RoBERTa Model
RoBERTa is a transformer model pretrained in a self-supervised fashion on a
huge corpus of English data. This means it was pretrained on raw text only,
without any human labeling, using an automatic procedure that generates
inputs and labels from the texts themselves. RoBERTa differs from BERT mainly
in that it was trained on a larger dataset with a more efficient training
procedure: RoBERTa was trained on 160 GB of text, more than ten times the
size of the dataset used to train BERT. Given a string, RoBERTa produces a
dictionary of scores in three categories: negative, neutral, and positive.
Analogous to the VADER polarity scores, we compute RoBERTa polarity scores
for each text and add them to the data frame.
# Assumed setup (not shown in the original): a pretrained sentiment
# checkpoint such as 'cardiffnlp/twitter-roberta-base-sentiment' is a
# common choice for this kind of analysis.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from scipy.special import softmax

MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def roberta_polarity_scores(sentence):
    # Tokenize, run the model, and softmax the raw logits into probabilities
    encoded_text = tokenizer(sentence, return_tensors='pt')
    output = model(**encoded_text)
    scores = output[0][0].detach().numpy()
    scores = softmax(scores)
    return {
        'roberta_neg': scores[0],
        'roberta_neu': scores[1],
        'roberta_pos': scores[2],
    }
# Assumed setup: SIA is NLTK's SentimentIntensityAnalyzer and df is the
# reviews data frame loaded earlier.
import pandas as pd
from tqdm import tqdm
from nltk.sentiment import SentimentIntensityAnalyzer
SIA = SentimentIntensityAnalyzer()

res = {}
for i, row in tqdm(df.iterrows(), total=len(df)):
    try:
        text = row['Text']
        myid = row['Id']
        # VADER scores, renamed with a vader_ prefix
        vader_result = SIA.polarity_scores(text)
        vader_result_rename = {}
        for key, value in vader_result.items():
            vader_result_rename[f"vader_{key}"] = value
        # RoBERTa scores, then both dictionaries combined per review
        roberta_result = roberta_polarity_scores(text)
        both = {**vader_result_rename, **roberta_result}
        res[myid] = both
    except RuntimeError:
        # RoBERTa raises RuntimeError on texts longer than its input limit
        print(f'Broke at id {myid}')

results_df = pd.DataFrame(res).T
results_df = results_df.reset_index().rename(columns={'index': 'Id'})
results_df = results_df.merge(df, how='left')  # join back on the Id column
Let's view the final data frame and its columns.
results_df
# To view columns
results_df.columns
This sounds like a positive sentence, with positive words like LOVE. But
looking at the details, it is a negative text complaining about plastic found
in the food, so RoBERTa classified it incorrectly.
In the second example, we took a text that VADER classified as positive even
though the customer rating is 1. The review is negative, but the customer may
have written it sarcastically, phrasing it as a positive note, so the model
analyzed it as positive text.
The last two examples are the same: both the VADER and RoBERTa models
classified the text as negative even though it was rated 5. Here the customer
actually loved the food but wrote the review complaining about weight gain,
so the models classified it as negative.
Conclusion
The invention of transformers made it possible for researchers to vectorize
each word and model how it relates to other concepts. Words can now be
described along many dimensions that capture how closely they relate to the
meanings and usage of other words. Transformers have made it simpler than
ever to model the relationships between words, and they are used in many
applications: virtual assistants, marketing, analyzing medical records, and
more.