Project Report: Sentiment Analysis in Hindi Language
on
Sentiment Analysis in Hindi Language
for
Digipodium
towards partial fulfillment of the requirement
for the award of degree of
I Floor, H-Block, BBDU, BBD City, Faizabad Road, Lucknow (U. P.) INDIA 226028
PHONE: HEAD: 0522-3911127, 3911321 Dept. Adm. & Exam Cell: 0522-3911326 Dept. T&P Cell: 0522-3911128; E-Mail: head.sca@gmail.com
Babu Banarasi Das University
Lucknow
CERTIFICATE
This is to certify that Project Report entitled
I have taken efforts in this project. However, it would not have been
possible without the kind support and help of many individuals and
organizations. I would like to extend my sincere thanks to all of them.
I would like to express my special gratitude and thanks to my group members for
giving me their attention and time.
Signature of Student :
Himani Raj
Deepak Vishwakarma
BCA VIth Semester
Date:
ABSTRACT
Sentiment analysis in the Hindi language is the field of study that analyzes people's
opinions, sentiments, evaluations, attitudes, and emotions from written Hindi text.
Sentiment analysis is one of the most active research areas in natural language
processing and is also widely studied in Data mining, Web mining, and Text mining.
In fact, this research has spread outside of computer science to the management
sciences and social sciences due to its importance to business and society as a whole.
The growing importance of sentiment analysis coincides with the growth of social
media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social
networks.
In the implemented system, Hindi sentences are collected and sentiment analysis is
performed on them to get the polarity of the given Hindi sentences as positive,
negative or neutral.
TABLE OF CONTENTS
1. Problem Statement
2. Introduction
2.1. What is Sentiment?
2.2. Examples of Sentiment
3. Objectives
4. Scope
5. Proposed System
6. Feasibility Study
7. Planning Steps
8. Development Life Cycle Model
9. Module Description
10. Resources To be used:
10.1. Hardware Requirements
10.2. Software Requirements
11. Use Case Diagram
12. Class Diagram
13. Activity Diagram
14. Screenshots
15. Conclusions
INTRODUCTION OF PROJECT
Sentiment analysis is the process of using natural language processing, text analysis, and statistics
to analyze customer sentiment. The best businesses understand the sentiment of their customers—
what people are saying, how they’re saying it, and what they mean.
Customer sentiment can be found in tweets, comments, reviews, or other places where people
mention your brand. Sentiment Analysis is the domain of understanding these emotions with
software, and it’s a must-understand for developers and business leaders in a modern workplace.
Since customers express their thoughts and feelings more openly than ever before, sentiment
analysis is becoming an essential tool to monitor and understand that sentiment.
In short, sentiment analysis is a natural language processing task that deals with the automated
extraction of subjective content from digital text and with predicting its polarity as positive,
negative, or neutral.
NEED OF IDENTIFICATION
Objective:
This project aims to analyze and predict the sentiment of Hindi
sentences as positive, negative or neutral.
To evaluate a person's opinion.
To determine the emotional tone behind a series of words.
To implement an algorithm for automatic classification of text into
positive or negative.
To teach the machine to analyse various grammatical nuances.
Scope:
It can be very effective in predicting the sentiment of movie reviews, website reviews, etc.
Brand monitoring
Keeping an eye on your competition
Reviews from digital platforms can also provide useful data
that can be used to make future predictions.
Proposed System
Dataset as input: In this step the data set is given as input and the model is trained on it.
Filtering using Model: In this step the text whose sentiment is to be identified is filtered
using the trained model.
Classifying the data: In this step the text is classified according to whether its sentiment is
positive, negative, or neutral.
The majority of existing work in this field is in English, but our work is a foray into
sentiment analysis for Hindi.
The problem in Hindi sentiment analysis is classifying the polarity of a given text at the
document level, sentence level, etc.
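The three steps of the proposed system can be sketched in a few lines of Python. This is only a minimal illustration, not the project's actual implementation: it assumes a multinomial Naive Bayes classifier with add-one smoothing (the report does not name its algorithm), and the training sentences, the names `TRAIN`, `tokenize`, `train` and `classify`, and the whitespace tokenizer are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy training data, invented purely for illustration (not the project's data set).
TRAIN = [
    ("यह फिल्म बहुत अच्छी है", "positive"),
    ("मुझे यह गाना पसंद है", "positive"),
    ("यह फिल्म बहुत खराब है", "negative"),
    ("मुझे यह खाना पसंद नहीं", "negative"),
    ("आज सोमवार है", "neutral"),
    ("वह बाजार गया", "neutral"),
]

def tokenize(sentence):
    """Split on whitespace; real Hindi text would need more careful handling."""
    return sentence.split()

def train(examples):
    """Step 1 (dataset as input): count words per label to build the model."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sentence, label in examples:
        label_counts[label] += 1
        for tok in tokenize(sentence):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def classify(model, sentence):
    """Steps 2 and 3: score the sentence under each label and pick the best one."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        # Log prior plus add-one smoothed log likelihood of each known word.
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(sentence):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify(model, "यह फिल्म अच्छी है"))
```

With a data set this small the result depends heavily on the few words seen in training; a real system would need a much larger labelled Hindi corpus.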
System Design
The spiral model is a combination of the sequential and prototype models. This model is best
used for large projects which involve continuous enhancements. There are specific
activities which are done in one iteration (spiral), where the output is a small prototype
of the large software. The same activities are then repeated for all the spirals until the
entire software is built.
Spiral Model
FEASIBILITY STUDY
A feasibility study is an assessment of the practicality of a proposed project or system.
A feasibility study aims to objectively and rationally uncover the strengths and
weaknesses of an existing project.
The feasibility report of the project describes the advantages and flexibility of
the project.
It is divided into three sections:
Technical Feasibility
Economic Feasibility
Operational Feasibility
Technical Feasibility
This assessment focuses on the technical resources available to the organization. It
helps the organization determine whether the technical resources meet capacity and
whether the technical team is capable of converting the ideas into a working system.
Economic Feasibility
This assessment typically involves a cost/benefits analysis of the project, helping
organizations determine the viability, cost, and benefits associated with a project
before financial resources are allocated.
Operational Feasibility
This assessment involves undertaking a study to analyse and determine whether—and
how well—the organization’s needs can be met by completing the project. Operational
feasibility studies also examine how a project plan satisfies the requirements identified
Modules
1. Data collection
2. Text Processing and tokenization
3. Lemmatization & Vectorization
4. Feature Extraction
5. Model Creation
6. Model Training
7. Model Evaluation and Testing
8. Database Manager
9. View Display Manager
10. Setting system
Two members are involved in this project, so we have divided the whole
project into two groups:
■ Data collection
■ Feature Extraction
■ Model Creation
■ Model Training
■ Database Manager
■ Setting System
Use case Diagram
Class Diagram
Activity Diagram
Module Description
• Data Collection: A data set is a collection of data. This module gathers the training data
set, which is the actual data used to train the model.
• Text Processing and Tokenization: This module contains the code for cleaning and
analyzing the text and for breaking it into segments such as words.
• Lemmatization & Vectorization: In this module all the words are reduced to their simple
(base) forms by identifying their inflection, such as tense, and are then converted into
numeric form by the vectorization process.
• Feature Extraction: This module selects the important columns which are
necessary for prediction.
• Model Creation: In this module we write the code for the model.
• Model Training: Model training is the process of feeding an ML algorithm with data to
help it identify and learn good values for all attributes involved. We will be using several
types of machine learning models, the most common of which are supervised.
• Model Evaluation and Testing: Model evaluation aims to estimate the generalization
accuracy of a model on future data. This module tests model performance in terms of
accuracy and precision.
• Database Manager: As the name suggests, the database manager manages the database. It can
create the database and add and store data.
• View Display Manager: This module contains the code for the frontend of the
application we are creating.
• Setting System: This module contains all the settings for the project.
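To make the text processing, vectorization and evaluation modules more concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the project's code: the stop-word list, the punctuation set, and the function names `tokenize`, `build_vocabulary`, `vectorize`, `accuracy` and `precision` are hypothetical, vectorization is shown as a simple bag-of-words count, and lemmatization is left out because it needs a Hindi language resource that the report does not specify.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"है", "यह", "और"}

def tokenize(text):
    """Replace punctuation (including the danda) with spaces, split, drop stop words."""
    cleaned = re.sub(r"[।,!?.()-]", " ", text)
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def build_vocabulary(token_lists):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(tokens, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are dropped."""
    vec = [0] * len(vocab)
    for tok, n in Counter(tokens).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision(y_true, y_pred, label):
    """Of the items predicted as `label`, the fraction that truly are `label`."""
    true_of_predicted = [t for t, p in zip(y_true, y_pred) if p == label]
    if not true_of_predicted:
        return 0.0
    return sum(t == label for t in true_of_predicted) / len(true_of_predicted)

docs = [tokenize("यह फिल्म बहुत अच्छी है।"), tokenize("यह फिल्म खराब है!")]
vocab = build_vocabulary(docs)
vectors = [vectorize(tokens, vocab) for tokens in docs]
```

Each sentence becomes a fixed-length list of word counts, which is the numeric form a supervised model can be trained on.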
SOFTWARE REQUIREMENT SPECIFICATION
Gantt Chart
A Gantt chart is a popular type of chart that illustrates a project schedule. A Gantt
chart illustrates the start and finish dates of the terminal elements and
summary elements of a project. Terminal elements and summary elements comprise the
work breakdown structure of the project.
Phase            Duration
Analysis         10 days
Designing        30 days
Coding           34 days
Unit Testing     5 days
Implementation   5 days
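The start and finish days that a Gantt chart would show can be derived from the phase durations above. The sketch below assumes the phases run strictly one after another with no overlap, which the report does not state; the names `PHASES` and `build_schedule` are hypothetical.

```python
# Phase durations in days, taken from the schedule above.
PHASES = [
    ("Analysis", 10),
    ("Designing", 30),
    ("Coding", 34),
    ("Unit Testing", 5),
    ("Implementation", 5),
]

def build_schedule(phases):
    """Return (name, start_day, end_day) tuples, assuming phases run back to back."""
    schedule, day = [], 1
    for name, duration in phases:
        schedule.append((name, day, day + duration - 1))
        day += duration
    return schedule

schedule = build_schedule(PHASES)
for name, start, end in schedule:
    print(f"{name:<15} day {start:>2} to day {end:>2}")
print("Total:", sum(duration for _, duration in PHASES), "days")
```

Under that assumption the project spans 84 days in total, with coding occupying days 41 to 74.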
Functional Requirements:
1. R1: Start application
3. R3: Prediction
4. R4: View records
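The prediction and view-records requirements could be backed by a small SQLite table, in line with the Database Manager module. The following is a rough sketch using Python's built-in `sqlite3` module. The table name `predictions`, its columns, and the helper names `open_database`, `save_prediction` and `view_records` are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

def open_database(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "sentence TEXT NOT NULL, "
        "polarity TEXT NOT NULL)"
    )
    return conn

def save_prediction(conn, sentence, polarity):
    """Store one classified sentence; the 'with' block commits the insert."""
    with conn:
        conn.execute(
            "INSERT INTO predictions (sentence, polarity) VALUES (?, ?)",
            (sentence, polarity),
        )

def view_records(conn):
    """Return all stored (sentence, polarity) pairs in insertion order."""
    cursor = conn.execute("SELECT sentence, polarity FROM predictions ORDER BY id")
    return cursor.fetchall()

conn = open_database()
save_prediction(conn, "यह फिल्म बहुत अच्छी है", "positive")
save_prediction(conn, "यह फिल्म बहुत खराब है", "negative")
for sentence, polarity in view_records(conn):
    print(polarity, sentence)
```

Using `?` placeholders rather than string formatting keeps user-entered sentences from being interpreted as SQL.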
Client Side
Web Browser (Google Chrome, Firefox, IE9 or above)
Developer Side
Web Browser (Google Chrome, Firefox, IE9 or above)
Python 3.7 or above
VS Code
SQLite Manager
Libraries
8. Hardware Requirements:
CLIENT SIDE
SERVER SIDE
Processor    Dual Core or above
RAM          4 GB or above
Disk space   500 GB
Monitor      14"
Others       Keyboard, mouse, Internet connection
Screenshots
(Database)