
Mini Project BDA


Mini Project

Social Media Sentiment Analysis for Brand Monitoring


This mini-project uses sentiment analysis to evaluate customer opinions about a brand,
product, or service based on social media posts and reviews. It applies Natural Language
Processing (NLP) techniques through libraries such as VADER and TextBlob, and considers
more advanced models such as BERT, to process and analyze social media data. The goal is
to monitor customer sentiment in real time and provide insights that help improve brand
image, marketing strategies, and customer engagement.

1. Abstract
Social media has become an essential platform for brands to engage with customers and gain
valuable insights. One of the most powerful tools to monitor customer feedback is sentiment
analysis, which involves determining the emotional tone behind social media posts, reviews,
or comments. In this project, we implement sentiment analysis to evaluate public opinion
about a brand or product by analyzing posts from social media platforms like Twitter,
Facebook, and product reviews from e-commerce platforms like Amazon. By using
Natural Language Processing (NLP) techniques such as VADER and TextBlob, the
project categorizes sentiment into positive, negative, or neutral classes. Additionally, deep
learning models such as BERT will be explored to enhance the accuracy and depth of
sentiment classification. The end result is a sentiment analysis dashboard that provides a
clear view of customer feedback over time, along with recommendations for improving
customer satisfaction and brand perception.

2. Introduction
Social media is a rich source of customer opinions, feedback, and discussions. Brands across
various industries leverage platforms such as Twitter, Facebook, and Instagram to engage
with customers and monitor their experiences. However, manually analyzing the vast volume
of customer comments and posts is an overwhelming task.

Sentiment analysis, a subfield of Natural Language Processing (NLP), automates the process
of identifying and extracting subjective information from text. By classifying the emotional
tone of a social media post or review, sentiment analysis helps brands to gauge customer
opinions and make data-driven decisions. This project aims to implement a sentiment
analysis model for brand monitoring, where sentiment analysis is applied to social media
posts and product reviews to gain insights into customer perceptions.

The project utilizes popular sentiment analysis libraries such as VADER (Valence Aware
Dictionary and sEntiment Reasoner) and TextBlob, as well as advanced models like BERT
to provide accurate sentiment predictions.

3. Problem Statement
The main objective of this project is to develop a sentiment analysis system that allows
brands to understand how customers perceive their products or services through social media
and review platforms. The key problems addressed include:

1. Real-time monitoring of customer feedback: Given the massive volume of posts on
platforms like Twitter and product reviews on e-commerce websites, there is a need
for an automated system to process and analyze the data in real time.
2. Identifying sentiment trends: By understanding whether customer sentiment is
predominantly positive, negative, or neutral, brands can adjust their marketing or
customer service strategies.
3. Providing actionable insights: The sentiment analysis model will generate insights
that can inform decisions regarding customer service improvements, brand
messaging, product modifications, and targeted marketing campaigns.

4. Literature Review
 Sentiment Analysis: The concept of sentiment analysis, which involves determining
the sentiment behind text data, dates back to early research on opinion mining.
Initially, sentiment analysis was based on simple rule-based systems, using keyword
matching and predefined sentiment lexicons (e.g., SentiWordNet). Over the years,
machine learning techniques, particularly supervised learning models like Naive
Bayes, Support Vector Machines (SVM), and Logistic Regression, have improved
accuracy.
 VADER Sentiment Analysis: VADER (Valence Aware Dictionary and sEntiment
Reasoner) is a popular lexicon and rule-based model specifically designed for social
media text. It captures sentiment nuances, such as emoticons and slang, making it a
suitable choice for analyzing Twitter posts and social media comments.
 TextBlob: TextBlob is another sentiment analysis tool built on top of NLTK
(Natural Language Toolkit). Its default analyzer returns a polarity score from -1 to
+1 and a subjectivity score from 0 to 1; the polarity score can then be mapped to
positive, negative, or neutral. TextBlob is particularly user-friendly for
beginner-level sentiment analysis tasks.
 BERT and Deep Learning Models: While traditional sentiment analysis methods
rely on feature extraction and hand-crafted rules, recent advancements in Deep
Learning (particularly models like BERT—Bidirectional Encoder Representations
from Transformers) have significantly enhanced sentiment analysis. BERT's
transformer architecture captures context better by processing text in a bidirectional
manner, improving sentiment classification accuracy.
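
To make the lexicon and rule-based idea above concrete, here is a minimal sketch of a toy scorer. It is not VADER's actual algorithm: the word list, the weights, and the names TOY_LEXICON and toy_lexicon_score are made up for illustration, and the only rule applied is a sign flip after a negation word.

```python
import re

# Hypothetical mini-lexicon; real lexicons such as VADER's hold thousands of rated entries.
TOY_LEXICON = {
    "love": 2.0, "great": 1.5, "comfortable": 1.0,
    "horrible": -2.0, "disappointed": -1.5, "broke": -1.0,
}
NEGATIONS = {"not", "never", "no"}


def toy_lexicon_score(text):
    """Sum word valences, flipping the sign of a word that directly follows a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        valence = TOY_LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence  # simple negation rule
        score += valence
    return score


print(toy_lexicon_score("The campaign was horrible, I am disappointed."))
print(toy_lexicon_score("These shoes are not great"))
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment. Rule-based tools add many more rules (intensifiers, punctuation, emoticons) on top of this basic lookup.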
5. Methodology
The methodology for this project involves several key steps:

1. Data Collection:
o Data will be collected from social media platforms such as Twitter (using
Twitter's API) and product reviews from platforms like Amazon or Yelp.
o Twitter API will be used to gather tweets related to a specific brand or
product.
o Web scraping techniques will be employed for collecting product reviews
from websites like Amazon or Yelp.

2. Text Preprocessing:
o Cleaning the data: Removing URLs, mentions, hashtags, stopwords, and
other irrelevant information.
o Tokenization: Breaking the text into smaller units (words or phrases).
o Lemmatization: Converting words to their base or root form.
o Removing noise: Handling spelling errors, special characters, and numbers
that do not contribute to sentiment analysis.

3. Sentiment Classification:
o VADER Sentiment Analysis: VADER is a lexicon-based model that assigns
scores for positivity, negativity, and neutrality to each text, plus a single
compound score between -1 and +1.
o TextBlob: TextBlob’s polarity score (from -1 to +1) can be used to classify text
as positive, negative, or neutral.
o BERT: The BERT model will be fine-tuned for sentiment analysis using a
pre-labeled dataset. It will leverage contextual embeddings to classify text
based on the sentiment.

4. Data Visualization:
o Sentiment results will be visualized using matplotlib and seaborn libraries to
generate trends over time.
o A sentiment dashboard will be developed using tools like Plotly or Dash to
display live sentiment scores, trends, and insights.
o Word clouds can be used to visualize the most common positive and negative
terms related to the brand/product.

5. Model Evaluation:
o The performance of sentiment analysis models (VADER, TextBlob, BERT)
will be evaluated using metrics such as Accuracy, Precision, Recall, and F1-
Score.
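
The text preprocessing step in the methodology above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the function name preprocess and the tiny STOPWORDS set are hypothetical, lemmatization and spelling correction are left out, and a real project would likely use a fuller stopword list (for example from NLTK).

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "my", "so"}


def preprocess(text):
    """Clean a post and return a list of lowercase tokens without stopwords."""
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)        # remove mentions and hashtags
    text = re.sub(r"[^A-Za-z\s']", " ", text)   # drop digits, punctuation, emoji
    tokens = text.lower().split()               # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]


print(preprocess("Just got my new @Nike shoes! So comfortable #NikeLove https://t.co/abc 123"))
```

This kind of aggressive cleaning suits feature extraction for machine learning models. VADER, by contrast, uses punctuation, capitalization, and emoticons as sentiment cues, so it is often applied to lightly cleaned text instead.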
6. Implementation
The following Python program demonstrates social media sentiment analysis using VADER
and TextBlob. The example analyzes Twitter data and plots the sentiment distribution for
each tool along with a sentiment trend over time.

Step A: Install Required Libraries

Before starting, make sure you have the necessary libraries installed. You can do this by
running the following commands in your terminal or command prompt:

CODE:
pip install tweepy pandas matplotlib seaborn vaderSentiment textblob

Step B: Program for Twitter Sentiment Analysis

This example collects tweets related to a brand (e.g., "Nike") using the Twitter API,
performs sentiment analysis using VADER and TextBlob, and visualizes the results.

CODE:
import tweepy
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

# Step 1: Twitter API Authentication


# Replace the placeholders with your own Twitter API credentials
consumer_key = 'your_consumer_key'
consumer_secret = 'your_consumer_secret'
access_token = 'your_access_token'
access_token_secret = 'your_access_token_secret'

# Authenticate to Twitter API


auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# Step 2: Collect tweets related to a specific brand (e.g., Nike)


brand_name = "Nike"
tweets = tweepy.Cursor(api.search_tweets, q=brand_name, lang="en").items(100)

# Step 3: Create a DataFrame to store the tweet data


tweets_data = pd.DataFrame([[tweet.created_at, tweet.text] for tweet in tweets],
columns=["Date", "Tweet"])

# Step 4: Sentiment Analysis using VADER and TextBlob

# Initialize VADER Sentiment Analyzer


analyzer = SentimentIntensityAnalyzer()

# VADER Sentiment Analysis


tweets_data['VADER_Sentiment'] = tweets_data['Tweet'].apply(
    lambda x: analyzer.polarity_scores(x)['compound'])
tweets_data['VADER_Sentiment_Class'] = tweets_data['VADER_Sentiment'].apply(
    lambda x: 'Positive' if x > 0 else ('Negative' if x < 0 else 'Neutral'))

# TextBlob Sentiment Analysis


tweets_data['TextBlob_Sentiment'] = tweets_data['Tweet'].apply(
    lambda x: TextBlob(x).sentiment.polarity)
tweets_data['TextBlob_Sentiment_Class'] = tweets_data['TextBlob_Sentiment'].apply(
    lambda x: 'Positive' if x > 0 else ('Negative' if x < 0 else 'Neutral'))

# Step 5: Data Visualization


plt.figure(figsize=(10, 6))

# VADER Sentiment Distribution


sns.countplot(x='VADER_Sentiment_Class', data=tweets_data, palette='Set1')
plt.title('VADER Sentiment Distribution for Nike Tweets')
plt.show()

# TextBlob Sentiment Distribution


plt.figure(figsize=(10, 6))
sns.countplot(x='TextBlob_Sentiment_Class', data=tweets_data, palette='Set2')
plt.title('TextBlob Sentiment Distribution for Nike Tweets')
plt.show()

# Step 6: Sentiment Trend over Time (VADER)


tweets_data['Date'] = pd.to_datetime(tweets_data['Date'])
tweets_data.set_index('Date', inplace=True)

# VADER Sentiment Trend


plt.figure(figsize=(10, 6))
tweets_data['VADER_Sentiment'].resample('D').mean().plot()
plt.title('VADER Sentiment Trend Over Time')
plt.xlabel('Date')
plt.ylabel('Average Sentiment')
plt.show()

# Step 7: Output some sample data


# 'Date' became the index in Step 6, so reset_index() restores it as a column first
print(tweets_data.reset_index()[['Date', 'Tweet', 'VADER_Sentiment', 'TextBlob_Sentiment']].head(10))

Step C: Explanation of the Code

1. Authentication:
o We use the tweepy library to authenticate and access the Twitter API using
OAuth keys. You'll need to create a Twitter Developer account to get your
credentials.
2. Data Collection:
o We use tweepy.Cursor with api.search_tweets to collect English-language tweets
that match the brand name (e.g., Nike). The call .items(100) retrieves at most
100 tweets.
3. Sentiment Analysis:
o VADER: The VADER sentiment analyzer returns a compound score that ranges from
-1 (most negative) to +1 (most positive). The program then labels each tweet as
"Positive" if the score is above 0, "Negative" if it is below 0, and "Neutral" if
it is exactly 0.
o TextBlob: TextBlob calculates a sentiment polarity between -1 and +1. The program
applies the same rule: values greater than 0 are labeled positive, values less
than 0 are labeled negative, and a value of exactly 0 is labeled neutral.
4. Visualization:
o We use Seaborn and Matplotlib to plot:
 The distribution of positive, negative, and neutral sentiments for both
VADER and TextBlob.
 A Sentiment Trend Over Time using VADER scores, resampled by
day.
5. Output:
o The program outputs sample data showing the tweet text and its corresponding
sentiment scores from both VADER and TextBlob.
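
The program labels a tweet by the sign of its score, so only a score of exactly 0 counts as neutral. The VADER project's README suggests a small neutral band instead (compound score of 0.05 or more is positive, -0.05 or less is negative). A minimal sketch of that rule follows; the function name classify_compound and its threshold parameter are illustrative, not part of the VADER library.

```python
def classify_compound(score, threshold=0.05):
    """Map a compound score in [-1, 1] to a label using a neutral band around 0."""
    if score >= threshold:
        return "Positive"
    if score <= -threshold:
        return "Negative"
    return "Neutral"


for value in (0.78, -0.704, 0.01):
    print(value, classify_compound(value))
```

In the program, this function could be passed to .apply() in place of the lambda that builds the VADER_Sentiment_Class column. Whether the same band suits TextBlob polarity is a judgment call that depends on the data.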

SAMPLE OUTPUT:

Let's assume that the program ran successfully and you received the following outputs:

VADER Sentiment Distribution for Nike Tweets:

 Positive: 45 tweets
 Negative: 30 tweets
 Neutral: 25 tweets

(You will see a bar plot showing these counts)

TextBlob Sentiment Distribution for Nike Tweets:

 Positive: 50 tweets
 Negative: 20 tweets
 Neutral: 30 tweets

(Another bar plot for TextBlob sentiment)

Sentiment Trend Over Time (VADER):

This plot will show how the average sentiment score changes over time (e.g., positive spikes
when there are favorable tweets about Nike's products or promotions).
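
The daily averaging behind this trend plot can be sketched without pandas, which may help when checking the numbers by hand. The function name daily_mean_sentiment and the sample records are hypothetical. Unlike pandas resampling by day, this version only returns days that actually contain posts.

```python
from collections import defaultdict
from datetime import datetime


def daily_mean_sentiment(records):
    """Group (timestamp, score) pairs by calendar day and average the scores."""
    buckets = defaultdict(list)
    for timestamp, score in records:
        buckets[timestamp.date()].append(score)
    # Sort by day so the result reads as a time series
    return {day: sum(scores) / len(scores) for day, scores in sorted(buckets.items())}


# Made-up records for illustration only
sample_records = [
    (datetime(2024, 11, 10, 8, 21), 0.5),
    (datetime(2024, 11, 10, 9, 5), -0.5),
    (datetime(2024, 11, 11, 7, 30), 1.0),
]
for day, mean_score in daily_mean_sentiment(sample_records).items():
    print(day, mean_score)
```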
Sample Data Output (printed to the console):

   Date                 Tweet                                                                   VADER_Sentiment  TextBlob_Sentiment
0  2024-11-10 08:21:14  @Nike is launching a new collection tomorrow! Exciting times ahead! 😊    0.7800           0.5000
1  2024-11-10 08:22:18  Just got my new Nike shoes. They’re so comfortable! #NikeLove            0.8310           0.6250
2  2024-11-10 08:23:42  Nike's recent campaign was horrible. I am disappointed.                 -0.7040          -0.5000
3  2024-11-10 08:25:05  I love Nike but their prices are getting way too high.                   0.1000          -0.2000
4  2024-11-10 08:26:30  Nike should focus more on sustainability.                               -0.2500          -0.2000
5  2024-11-10 08:27:47  Nike's new app update is a huge improvement! Well done!                  0.7200           0.4000
6  2024-11-10 08:29:11  Nike's shoes aren't as durable as they used to be.                      -0.3700          -0.4500
7  2024-11-10 08:30:23  Nike’s designs have been amazing lately!                                 0.6400           0.7500
8  2024-11-10 08:32:10  Not happy with my Nike sneakers, they broke after a month of use.       -0.8200          -0.6000
9  2024-11-10 08:33:45  I really appreciate how Nike supports athletes with disabilities.        0.9000           0.8500

This output shows the Date, Tweet, VADER Sentiment, and TextBlob Sentiment for each
of the first 10 tweets related to "Nike".

7. Results
 Sentiment Scores: A breakdown of the sentiment analysis results will be displayed,
showing the percentage of positive, negative, and neutral posts about the brand.
 Trend Over Time: The sentiment of posts over time (e.g., daily, weekly) will be
visualized to show whether the sentiment is improving or declining.
 Actionable Insights: For example, if negative sentiment spikes during a product
launch, it may indicate issues with the product or marketing strategy.
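
The percentage breakdown mentioned above can be computed directly from the class labels. This is a minimal sketch: the function name sentiment_percentages is illustrative, and the example labels reuse the assumed VADER counts from the sample output (45 positive, 30 negative, 25 neutral).

```python
from collections import Counter


def sentiment_percentages(labels):
    """Return the share of each sentiment label as a percentage of all posts."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * count / total for label, count in counts.items()}


# Labels built from the assumed sample counts, not from real data
sample_labels = ["Positive"] * 45 + ["Negative"] * 30 + ["Neutral"] * 25
print(sentiment_percentages(sample_labels))
```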
8. Discussion and Analysis
 Effectiveness of VADER and TextBlob: These models perform well for basic
sentiment analysis tasks and are computationally cheaper than transformer models.
 BERT Model Performance: A fine-tuned BERT model generally gives more
context-aware sentiment predictions, but it requires more computational resources
and a labeled dataset for fine-tuning. The implementation in Section 6 covers only
VADER and TextBlob.
 Real-World Application: This sentiment analysis system can be used for real-time
monitoring of brand reputation, allowing businesses to respond to negative feedback
proactively.
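
To compare the models on the metrics named in the methodology (Accuracy, Precision, Recall, F1-Score), predictions need to be checked against hand-labeled posts. The sketch below shows the arithmetic in plain Python. The function names and the four example labels are hypothetical, and the program in Section 6 does not include a labeled set.

```python
def accuracy(y_true, y_pred):
    """Fraction of posts whose predicted label matches the hand label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for illustration only
hand_labels = ["Positive", "Positive", "Negative", "Neutral"]
model_labels = ["Positive", "Negative", "Negative", "Neutral"]
print(accuracy(hand_labels, model_labels))
print(per_class_metrics(hand_labels, model_labels, "Positive"))
```

Running the same functions on VADER, TextBlob, and BERT predictions for one labeled set would give a like-for-like comparison.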

9. Conclusion
This mini-project demonstrates how sentiment analysis can be applied to social media data to
monitor customer opinions about a brand or product. By using NLP techniques like VADER,
TextBlob, and BERT, we can classify customer sentiment as positive, negative, or neutral
and visualize trends over time. The results provide businesses with valuable insights that can
guide marketing strategies, customer service improvements, and product development.

10. References
1. Hutto, C. J., & Gilbert, E. E. (2014). VADER: A Parsimonious Rule-based Model
for Sentiment Analysis of Social Media Text. Proceedings of the 8th International
Conference on Weblogs and Social Media (ICWSM 2014).
2. Liu, B. (2012). Sentiment Analysis and Opinion Mining. Morgan & Claypool
Publishers.
3. Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of
Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805.
4. TextBlob Documentation: https://textblob.readthedocs.io/en/dev/
5. VADER Sentiment Analysis: https://github.com/cjhutto/vaderSentiment
