Sentiment Analysis

This project aims to conduct a comparative study of sentiment analysis through text using machine learning techniques, focusing on emotions like sadness, joy, love, anger, fear, and ego. Three algorithms — Decision Tree, Random Forest, and Long Short-Term Memory — are employed to classify emotions from a dataset, with LSTM showing superior accuracy. The project includes stages for requirement gathering, design, coding, result analysis, and documentation, planned over a total duration of eight weeks.

Uploaded by

mme940623

SENTIMENT ANALYSIS THROUGH TEXT

ABSTRACT:
The objective of this project is to conduct a comparative study of sentiment
analysis through text using machine learning techniques. The emotions under
consideration are sadness, joy, love, anger, fear, and ego, which are
commonly expressed in text data. To achieve this goal, three machine
learning algorithms, namely Decision Tree (DT), Random Forest (RF), and
Long Short-Term Memory (LSTM), are employed. The study uses a dataset of
diverse text samples expressing different emotions, which is cleaned to
remove irrelevant information and then split into training and testing sets.
The DT, RF, and LSTM models are trained on the training set and evaluated on
the testing set using standard evaluation metrics. The experimental results
provide insight into the performance of the three algorithms for emotion
classification from text: DT and RF achieve comparable accuracy levels,
while LSTM outperforms both in overall classification accuracy.
Key Words: Decision Tree, Random Forest, Long Short-Term Memory

INPUTS CONSIDERED:

● Dataset of emotions:
A labelled set of text samples expressing emotions, sourced from “kaggle.com”.
● Data Pre-processing:
Techniques for data cleaning and preparation of training data.
● Machine Learning Libraries:
Machine learning libraries: NumPy, IO, Keras
Framework: Django
Database: SQL
● Software Tools: PyCharm, SQL Workbench
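The cleaning and train/test-split step listed above can be sketched as follows. This is a minimal illustration using only the Python standard library; the `clean_text` rules (dropping URLs and non-letter characters) and the 80/20 split ratio are assumptions, not the project's actual pipeline.

```python
import random
import re

def clean_text(text):
    """Lowercase, strip URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"http\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)  # keep letters and spaces only
    return re.sub(r"\s+", " ", text).strip()

def train_test_split(samples, test_ratio=0.2, seed=42):
    """Shuffle and split (text, label) pairs into training and testing sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

samples = [("I am so happy today! :)", "joy"),
           ("This makes me furious", "anger"),
           ("I miss her so much", "sadness"),
           ("You mean everything to me", "love"),
           ("I'm terrified of the dark", "fear")]
cleaned = [(clean_text(text), label) for text, label in samples]
train, test = train_test_split(cleaned)
```

In the real project the same split would be produced from the Kaggle dataset rather than hand-written samples.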

EXPECTED OUTPUT:

● Emotion Identification:
Identifies the emotions from the input data.
● Emotion Classification:
Predicts the type of emotion.

LITERATURE STUDY:

• P. S. Sreeja and G. S. Mahalakshmi, ‘‘Emotion recognition from poems
by maximum posterior probability,’’ Int. J. Comput. Sci. Inf. Secur.,
vol. 14, pp. 36–43, 2016.
• J. Kaur and J. R. Saini, ‘‘Punjabi poetry classification: The test of 10
machine learning algorithms,’’ in Proc. 9th Int. Conf. Mach. Learn.
Comput. (ICMLC), 2017, pp. 1–5.
• G. Mohanty and P. Mishra, ‘‘Sad or glad? Corpus creation for Odia
poetry with sentiment polarity information,’’ in Proc. 19th Int. Conf.
Comput. Linguistics Intell. Text Process. (CICLing), Hanoi, Vietnam,
2018.
• Y. Hou and A. Frank, ‘‘Analysing sentiment in classical Chinese
poetry,’’ in Proc. 9th SIGHUM Workshop Lang. Technol. Cultural
Heritage, Social Sci., Hum. (LaTeCH), 2015, pp. 15–24.
• A. Ghosh, G. Li, T. Veale, P. Rosso, E. Shutova, J. Barnden, and A.
Reyes, ‘‘SemEval-2015 task 11: Sentiment analysis of figurative
language in Twitter,’’ in Proc. 9th Int. Workshop Semantic Eval.
(SemEval), 2015, pp. 470–478.
• G. Rashid, A. Ghosh, P. Bhattacharyya, and G. Haffari, ‘‘Automated
analysis of Bangla poetry for classification and poet identification,’’ in
Proc. 12th Int. Conf. Natural Lang. Process., Dec. 2015, pp. 247–253.
• K. Bischoff, C. S. Firan, R. Paiu, W. Nejdl, C. Laurier, and M. Sordo,
‘‘Music mood and theme classification: A hybrid approach,’’ in Proc.
ISMIR, Oct. 2009, pp. 657–662.
• O. Alsharif, D. Alshamaa, and N. Ghneim, ‘‘Emotion classification in
Arabic poetry using machine learning,’’ Int. J. Comput. Appl., vol. 65,
p. 16, May 2013.
• A. Zehe, M. Becker, F. Jannidis, and A. Hotho, ‘‘Towards sentiment
analysis on German literature,’’ in Proc. Joint German/Austrian Conf.
Artif. Intell. Cham, Switzerland: Springer, 2017, pp. 387–394.
• L. Barros, P. Rodriguez, and A. Ortigosa, ‘‘Automatic classification of
literature pieces by emotion detection: A study on Quevedo’s poetry,’’
in Proc. Humaine Assoc. Conf. Affect. Comput. Intell. Interact., Sep.
2013, pp. 141–146.

HARDWARE AND SOFTWARE REQUIREMENTS:


Hardware Configuration:
Processor : Intel Core i3 or above

Hard Disk : 160 GB

RAM : 8 GB

Software Configuration:
Operating System : Windows 7/8/10

Front-End Scripts : HTML, CSS & JS

IDE : PyCharm

Libraries Used : NumPy, IO, OS, Django, Keras

Technology : Python 3.6+

TECHNIQUES & ALGORITHMS:


1. Random Forest:
A random forest is a machine learning technique used to solve both
regression and classification problems. It relies on ensemble learning, a
technique that combines many classifiers to solve complex problems. A random
forest consists of many decision trees, and the ‘forest’ it generates is
trained through bagging (bootstrap aggregating), an ensemble meta-algorithm
that improves the accuracy of machine learning models.
The random forest algorithm establishes its outcome from the
predictions of the individual decision trees: for classification it takes a
majority vote over the trees, and for regression it averages their outputs.
Increasing the number of trees generally improves the stability and
precision of the outcome. A random forest overcomes key limitations of a
single decision tree: it reduces overfitting and increases precision, and it
produces good predictions without requiring much configuration in packages
such as Scikit-learn.
Features of the Random Forest Algorithm:
• It is generally more accurate than a single decision tree.
• It provides an effective way of handling missing data.
• It can produce a reasonable prediction without hyper-parameter tuning.
• It mitigates the overfitting problem of decision trees.
• At every split point in every tree of the forest, a subset of features
is selected randomly.
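Since the text above names Scikit-learn, a minimal sketch of emotion classification with a random forest might look like this. The toy texts, labels, and the bag-of-words featurization are illustrative assumptions; the project's real dataset and preprocessing would replace them.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

texts = ["i feel so happy and glad", "i am filled with joy",
         "this makes me angry", "i am furious about this",
         "i feel scared and afraid", "that was terrifying"]
labels = ["joy", "joy", "anger", "anger", "fear", "fear"]

# Bag-of-words features; each tree is then fit on a bootstrap sample
# and considers a random subset of features at every split.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, labels)

# The ensemble prediction is a majority vote over the trees.
pred = forest.predict(vectorizer.transform(["i am so glad and happy"]))[0]
```

With real data, `n_estimators` and other hyper-parameters would be tuned on a validation split rather than left at these illustrative values.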
2. Decision Tree:

Trees have many analogies in real life, and the idea has
influenced a wide area of machine learning, covering both classification and
regression. In decision analysis, a decision tree can be used to represent
decisions and decision making visually and explicitly. As the name suggests,
it uses a tree-like model of decisions, and it is a commonly used tool in
data mining for deriving a strategy to reach a particular goal. A decision
tree is drawn upside down, with its root at the top. Each internal node
holds a condition on which the tree splits into branches (edges); the end of
a branch that does not split any further is the decision, or leaf. In the
classic Titanic example, each leaf states whether a passenger died or
survived. A real dataset has many more features, and such a tree would be
only a branch of a much bigger one, but the simplicity of the algorithm is
hard to ignore: feature importance is clear, and relations can be inspected
easily. This methodology is known as learning a decision tree from data; a
tree whose target is a class label (e.g., survived or died) is called a
classification tree. Regression trees are represented in the same manner,
except that they predict continuous values such as the price of a house. In
general, decision tree algorithms are referred to as CART, or Classification
and Regression Trees.

3. LSTM (long short term memory):

• Why Recurrent Neural Networks?


• Recurrent neural networks were created because the feed-forward neural
network has a few limitations:
• It cannot handle sequential data.
• It considers only the current input.
• It cannot memorize previous inputs.
• The solution to these issues is the Recurrent Neural Network (RNN). An
RNN can handle sequential data, accepting the current input together
with previously received inputs, and it can memorize previous inputs
thanks to its internal memory.
• LSTM is a variety of RNN capable of learning long-term dependencies,
especially in sequence prediction problems. An LSTM has feedback
connections.
• The central component of an LSTM is a memory cell known as the ‘cell
state’, which maintains its state over time. In the standard LSTM
diagram, the cell state is the horizontal line running along the top of
the cell; it can be visualized as a conveyor belt along which
information flows largely unchanged.
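A minimal sketch of the LSTM classifier described above, using Keras (one of the project's listed libraries). `VOCAB_SIZE` and `MAX_LEN` are assumed placeholder values standing in for the tokenizer's vocabulary and padding length; six output units cover the six emotion classes.

```python
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE = 10000   # assumed vocabulary size after tokenization
MAX_LEN = 50         # assumed padded sequence length
NUM_EMOTIONS = 6     # sadness, joy, love, anger, fear, ego

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),   # token id -> dense word vector
    layers.LSTM(128),                   # cell state carries long-range context
    layers.Dense(NUM_EMOTIONS, activation="softmax"),  # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# A dummy forward pass: one batch of two padded token-id sequences.
dummy_batch = np.zeros((2, MAX_LEN), dtype="int32")
probs = model.predict(dummy_batch, verbose=0)
```

Training would call `model.fit` on the tokenized, padded training split; the softmax output gives one probability per emotion, and the argmax is the predicted class.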

PROJECT PLAN WITH TIMELINES:


Stage 1: Gathering Requirements and Literature Survey
Time Duration: 2 weeks
Project scope, data collection, and literature review.

Stage 2: High-Level and Detailed Design
Time Duration: 1 week
System architecture, model selection, and pseudocode.

Stage 3: Coding
Time Duration: 3 weeks
Model implementation, preprocessing pipelines, and integration.

Stage 4: Result Analysis
Time Duration: 1 week
Model evaluation, performance metrics, and insights.

Stage 5: Documentation
Time Duration: 1 week
Comprehensive report with methodology, results, and future scope.

Total Duration: 8 weeks

SIGNATURE OF TEAM MEMBERS


1.
2.
3.
4.
SIGNATURE OF GUIDE
