
Digital Object Identifier 10.1109/ACCESS.2019.2909919

Sentiment Analysis of Comment Texts Based on BiLSTM

Guixian Xu (1), Yueting Meng (1), Xiaoyu Qiu (2), Ziheng Yu (1), and Xu Wu (1)

(1) College of Information Engineering, Minzu University of China, Beijing 100081, China
(2) The Library, Shandong University of Traditional Chinese Medicine, Jinan 250355, China

Corresponding author: Guixian Xu (e-mail: guixian_xu@muc.edu.cn).

This work was supported by the Humanities and Social Science Research Planning Fund of the Ministry of Education (Grant 18YJA740059).

ABSTRACT With the rapid development of Internet technology and social networks, a large number of comment texts are generated on the Web. In the era of big data, mining the emotional tendency of comments through artificial intelligence technology is helpful for the timely understanding of network public opinion. Sentiment analysis is a branch of artificial intelligence, and its research is very meaningful for obtaining the sentiment trend of comments. The essence of sentiment analysis is a text classification task, and different words make different contributions to classification. Current sentiment analysis research mostly uses distributed word representation. However, distributed word representation considers only the semantic information of a word and ignores its sentiment information. In this paper, an improved word representation method is proposed, which integrates the contribution of sentiment information into the traditional TF-IDF algorithm and generates weighted word vectors. The weighted word vectors are input into a BiLSTM (Bidirectional Long Short-Term Memory) network to capture the context information effectively, so the comment vectors are better represented. The sentiment tendency of the comment is then obtained by a feedforward neural network classifier. Under the same conditions, the proposed sentiment analysis method is compared with the RNN, CNN, LSTM and NB sentiment analysis methods. The experimental results show that the proposed method achieves higher precision, recall and F1 score, proving it effective, with high accuracy, on comments.

INDEX TERMS Sentiment analysis, artificial intelligence, social network, weighted word vectors, BiLSTM.

I. INTRODUCTION

In recent years, with the rapid development of the Internet and social networks, more and more users freely express their opinions on web pages, so big data of user comments is generated on the Internet. For example, product comments are generated on e-commerce websites such as Jingdong and Taobao, and hotel comments are generated on travel websites such as Ctrip and eLong. With the explosive increase in comments, it is difficult to analyze them manually. In the era of big data, mining the emotional tendencies of comment texts through artificial intelligence technology is helpful for the timely understanding of network public opinion, and the research of sentiment analysis is very meaningful for obtaining the sentiment trend of comments.

Sentiment analysis is a kind of text classification involving natural language processing, machine learning, data mining, information retrieval and other research fields [1]. Sentiment analysis of comments mainly focuses on the sentiment orientation of a comment corpus, which indicates whether users express positive, negative or neutral sentiments towards products or events. Sentiment analysis can be divided into news comment analysis [2], product comment analysis [3], film comment analysis [4] and other types. These comments convey the views of Internet users about products, hot events, etc.; merchants can gauge user satisfaction from the relevant product comments, and potential users can evaluate products by viewing them.

The essence of sentiment analysis is a text classification task, and different words contribute differently to classification. For sentiment classification tasks, learning a low-dimensional, non-sparse word vector representation for each word is a key step [5].
The widely used word representation is the distributed word vector obtained by Word2vec technology [6]. Such a word vector has a low dimension and contains the semantic information of the word. However, distributed word vectors do not contain the sentiment information of words. In this paper, the contribution of a word's sentiment information to text sentiment classification is embedded into the traditional TF-IDF algorithm, and a weighted word vector is generated.

In this paper, a sentiment analysis method for comments based on BiLSTM is proposed. The remainder of this article consists of four parts. First, the research background of text sentiment analysis and word vector representation is expounded. Second, the proposed sentiment analysis method for comments is described in detail. Third, the experiments are carried out, and the experimental results are analyzed and discussed. Finally, the proposed method is summarized and the next research direction is introduced.

II. BACKGROUND

A. TEXT SENTIMENT ANALYSIS TECHNOLOGY
Text sentiment analysis technology mines text emotions through computer technology. According to the object of sentiment analysis, it can be divided into three levels: words [7], sentences [8] and chapters [9]. According to the classification of sentiment orientation, it can be divided into binary sentiment classification [10], ternary sentiment classification [11] and multi-class sentiment classification [12].

At present, text sentiment analysis methods fall into three main categories: methods based on a sentiment dictionary, methods based on machine learning, and methods based on deep learning.

The method based on a sentiment dictionary uses the dictionary to identify sentiment words in the text and obtain sentiment values; then, according to sentiment calculation rules, the sentiment tendency of the text is obtained. The literature [13, 14] introduced representative research based on sentiment dictionaries. Text sentiment analysis based on a sentiment dictionary does not require manual labeling of samples and is easy to implement. However, the quality of the analysis is highly dependent on the dictionary, and most sentiment dictionaries suffer from insufficient coverage of sentiment words and a lack of domain words.

The earliest research on text sentiment analysis based on machine learning was by Pang et al. [15]. They used the naive Bayesian, maximum entropy and SVM algorithms to analyze the sentiment of film reviews, and the experimental results showed that SVM worked best for the sentiment classification of movie reviews. Goldberg [16] proposed a graph-based semi-supervised classification algorithm which scored 0-4 stars for positive and negative comments. Wang Yizhen et al. [17] studied the sentiment analysis of short texts and, based on multiple dimensions such as sentiment features, negation features and emoji, proposed a high-dimensional mixed-feature sentiment analysis model based on SVM. Sentiment analysis based on machine learning tends to be more accurate, but it relies on the quality of the polarity-labeled corpus.

In recent years, many scholars have introduced deep learning into sentiment analysis and achieved good results. The RNTN (Recursive Neural Tensor Network) model proposed by Socher [18] introduced a sentiment treebank, synthesized semantics on the syntactic tree of binary sentiment polarity, and obtained good sentiment analysis results on a movie review data set. The CharSCNN (Character to Sentence Convolutional Neural Network) model [19] used two convolutional layers to extract the features of related words and sentences and mined semantic information to improve the sentiment analysis of short texts such as tweets. Irsoy et al. [20] used a Recurrent Neural Network based on time-series information to obtain sentence representations, which further improved the accuracy of sentiment classification. Tai et al. [21] proposed a Tree-Structured Long Short-Term Memory Network model, which achieved good results in semantic relatedness and sentiment classification. Baziotis et al. [22] introduced the attention mechanism into the LSTM and achieved good results in the sentiment analysis of SemEval-2017 Task 4 for Twitter.

Considering that the feature words of comments are sparse, and in order to better capture context information, Bidirectional Long Short-Term Memory [23] is used in this paper to obtain the comment representation.

B. WORD REPRESENTATION
In natural language processing, the words in sentences or documents are usually used as features [24-26]. Currently, there are two widely used word vector representations: one-hot representation and distributed representation.

The vector dimension of the one-hot representation equals the number of words in the dictionary, which contains a large number of words. The vector of a word has the value 1 only in the dimension corresponding to the position of the word in the dictionary, and the remaining dimensions are 0. This method has the following problems: (1) the vector dimension becomes too large if the dictionary contains too many words; (2) the vector has too many 0 values, which makes it sparse; (3) the method ignores the semantic association between words.
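As a concrete illustration (a toy sketch, not from the paper), the following Python snippet builds one-hot vectors over a made-up five-word dictionary and exhibits problems (1)-(3): the dimension equals the dictionary size, almost every entry is zero, and any two distinct words are equally dissimilar.

```python
import numpy as np

vocabulary = ["hotel", "environment", "good", "bad", "room"]  # toy dictionary

def one_hot(word, vocab):
    """Return a |vocab|-dimensional vector with a single 1 at the word's index."""
    vec = np.zeros(len(vocab))          # problem (1): dimension = dictionary size
    vec[vocab.index(word)] = 1.0        # problem (2): all other entries stay 0
    return vec

v_good = one_hot("good", vocabulary)    # [0, 0, 1, 0, 0]
v_bad = one_hot("bad", vocabulary)      # [0, 0, 0, 1, 0]

# Problem (3): the dot product is 0 for every pair of distinct words,
# so no semantic association is captured.
print(v_good @ v_bad)  # 0.0
```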
Distributed representation was proposed by Hinton in 1986 [27]. It maps each word into a low-dimensional real vector, which solves the problem of the excessive dimensionality of one-hot word vectors. All word representations constitute a word vector space, so semantic similarity can be judged by calculating the distance between words.

Bengio et al. [28] first introduced distributed word representation into a neural network language model and proposed the Neural Network Language Model (NNLM). In the NNLM, the context of the word w_t is represented as context = {w_{t-n}, w_{t-(n-1)}, ..., w_{t-1}}. The NNLM consists of four layers: the input layer, the mapping layer, the hidden layer and the output layer. In order to obtain an efficient training model, Mikolov et al. [29] removed the nonlinear hidden layer of the NNLM and proposed Word2vec technology, which contains two new log-linear models: Continuous Bag-of-Words (CBOW) and Skip-gram (SG). These models not only improved the accuracy of the word vectors but also greatly improved the training speed.

At present, word vectors have been applied to sentiment analysis. For example, Kim [30] took the lead in using word vectors as features, inputting them into convolutional neural networks for sentiment classification; the model achieved good classification results. Tang et al. [31] introduced several neural networks to effectively encode context and sentiment information into word embeddings, and the effectiveness of the word embedding learning algorithm was verified on a Twitter dataset. Chen et al. [32] integrated user and product information into a hierarchical neural network to implement sentiment analysis of user product reviews. Liu Jinshuo et al. [33] used news about food safety as the training corpus to obtain word vectors; the trained word vectors were input into an underlying Recursive Neural Network to obtain sentence representations, and sentiment analysis of the news was then achieved through a high-level Recurrent Neural Network.

However, the word representations in current sentiment analysis research do not comprehensively consider the sentiment information contained in words and its contribution to classification. In this paper, the sentiment information is integrated into the traditional TF-IDF algorithm to calculate the weight of words, so that the word vectors are better represented.

III. RESEARCH METHODS

A. THE CONSTRUCTION OF THE WEIGHTED WORD VECTOR
In this paper, the Word2vec model is used to obtain distributed representations of words. Word2vec technology includes the CBOW model and the Skip-gram model; both include an input layer, a projection layer and an output layer. The CBOW model predicts the target word based on the context distribution. For the word w_k, the context is expressed as follows:

context(w_k) = \{w_{k-t}, w_{k-(t-1)}, ..., w_{k+(t-1)}, w_{k+t}\}    (1)

Conversely, the Skip-gram model predicts the context based on the target word w_k.

TF-IDF is a combination of the TF and IDF weight calculation methods and is the most commonly used weight calculation method in text categorization. It considers both the frequency of a word in a single document and the distribution of the word across documents, so it can better reflect the importance of a feature for classification. The formulas of TF-IDF are as follows:

w(t_i, d) = \frac{tf(t_i, d) \cdot idf(t_i)}{\sqrt{\sum_{t_i \in d} [tf(t_i, d) \cdot idf(t_i)]^2}}    (2)

idf(t_i) = \log(N / n_{t_i}) + 1    (3)

where w(t_i, d) denotes the weight of the word t_i in document d, tf(t_i, d) denotes the frequency of the word t_i in document d, N denotes the total number of documents, and n_{t_i} denotes the number of documents in which the word t_i appears.

In this paper, whether a word contains sentiment information is determined by matching sentiment dictionaries. At present, the Hownet sentiment dictionary, the National Taiwan University Sentiment Dictionary (NTUSD) and Li Jun's Chinese commendatory and derogatory term dictionary of Tsinghua University [34] are three commonly used sentiment dictionaries in Chinese sentiment analysis. If a sentiment dictionary contains too many words, it will include a large number of words with low sentiment information; if it contains too few, it will miss a large number of sentiment words in the text and reduce the accuracy of sentiment classification. After analysis and comparison, Li Jun's Chinese commendatory and derogatory term dictionary of Tsinghua University is used in this paper. It contains a moderate number of words with high sentiment information, 10035 sentiment words in total. The related information is shown in TABLE I and TABLE II.

TABLE I
THE RELATED INFORMATION OF POSITIVE WORDS
Examples: 光荣 (glory), 动听 (enchanting), 勇敢 (brave), 祝福 (bless), 谦虚 (humble), ...
Number: 5567

TABLE II
THE RELATED INFORMATION OF NEGATIVE WORDS
Examples: 下流 (obscene), 伤悲 (sorrowful), 谗言 (calumny), 焦虑 (anxious), 怒斥 (rebuke), 出卖 (betray), ...
Number: 4468

The weight calculation method for word vectors is as follows:

w_i = tf\text{-}idf_i \times e    (4)

e = \begin{cases} \alpha, & t_i \text{ is a sentiment word} \\ 1, & t_i \text{ is a non-sentiment word} \end{cases}    (5)


where t_i is the word, tf-idf_i is the TF-IDF value of the feature word calculated by equation (2), w_i is the weight of the word, and e is the weight assigned according to whether the word contains sentiment information. \vec{a} is defined as the distributed word vector trained by Word2vec. The weighted word vector \vec{v} constructed in this paper is defined as follows:

\vec{v} = w_i \cdot \vec{a}    (6)
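The following Python sketch shows one way equations (2)-(6) could be implemented; the function names, the toy lexicon and the stand-in vectors are illustrative assumptions, not the authors' code.

```python
import math
import numpy as np

ALPHA = 2.0  # α from equation (5); the experiments later select α = 2

def tfidf_weights(doc_tokens, doc_freq, n_docs):
    """Equations (2)-(3): length-normalized TF-IDF for one document.
    doc_freq maps each word to the number of documents containing it."""
    tf = {t: doc_tokens.count(t) for t in set(doc_tokens)}
    raw = {t: tf[t] * (math.log(n_docs / doc_freq[t]) + 1) for t in tf}  # tf * idf
    norm = math.sqrt(sum(v * v for v in raw.values()))                   # denominator of (2)
    return {t: v / norm for t, v in raw.items()}

def weighted_vector(word, tfidf, w2v, lexicon):
    """Equations (4)-(6): scale the Word2vec vector a by w_i = tf-idf_i * e."""
    e = ALPHA if word in lexicon else 1.0              # equation (5)
    return (tfidf[word] * e) * np.asarray(w2v[word])   # v = w_i * a, equations (4) and (6)

# Toy usage: "不错" is in the sentiment lexicon, so its vector is scaled by α.
lexicon = {"不错"}
w2v = {"酒店": np.ones(4), "不错": np.ones(4)}  # stand-ins for 300-d Word2vec vectors
tfidf = tfidf_weights(["酒店", "不错"], {"酒店": 30, "不错": 50}, n_docs=100)
v = weighted_vector("不错", tfidf, w2v, lexicon)
```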

B. RECURRENT NEURAL NETWORKS
The traditional neural network model is ineffective for sequence learning because it cannot describe the correlation between earlier and later parts of a sequence. An RNN (Recurrent Neural Network) is a sequence learning model that connects the nodes between hidden layers and can learn sequence features dynamically. An RNN applied to Chinese text sentiment analysis is shown in FIGURE 1. In the figure, the input text is "酒店的环境不错" (The environment of the hotel is good). After word segmentation, it becomes "酒店/的/环境/不错". Each word is converted into the corresponding word vector (w_1, w_2, w_3, w_4), and the word vectors are sequentially input into the RNN.

The calculation process of the RNN is as follows:
1) At time t, w_t is input to the hidden layer.
2) s_t is the output of the hidden layer at step t and is based on w_t and s_{t-1}: s_t = f(U * w_t + W * s_{t-1}), where f is usually a non-linear function such as tanh or ReLU.
3) Finally, the output is calculated as o_t = softmax(V * s_t).

[Figure: the word vectors w1-w4 of 酒店/的/环境/不错 enter the RNN layer through weights U; the hidden states S1-S4 are connected through recurrent weights W and feed the output layer through weights V.]
FIGURE 1. The sentiment analysis model of RNN
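A minimal NumPy sketch of steps 1)-3) follows; the dimensions and random weights are illustrative assumptions, not the trained model.

```python
import numpy as np

# Toy dimensions; in the paper the inputs are 300-dimensional word vectors.
d_in, d_hidden, n_classes = 300, 128, 2
rng = np.random.default_rng(0)
U = rng.normal(size=(d_hidden, d_in))       # input-to-hidden weights
W = rng.normal(size=(d_hidden, d_hidden))   # hidden-to-hidden (recurrent) weights
V = rng.normal(size=(n_classes, d_hidden))  # hidden-to-output weights

def rnn_forward(word_vectors):
    """Steps 1)-3): s_t = tanh(U*w_t + W*s_{t-1}), then o_t = softmax(V*s_t)."""
    s = np.zeros(d_hidden)
    for w in word_vectors:           # w_1 ... w_n fed in sequence order
        s = np.tanh(U @ w + W @ s)   # step 2: hidden state update
    logits = V @ s                   # step 3: map final state to class scores
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()           # softmax over the sentiment classes

probs = rnn_forward(rng.normal(size=(4, d_in)))  # four word vectors -> class probabilities
```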
C. LONG SHORT TERM MEMORY MODEL
The traditional recurrent neural network model cannot capture long-distance semantic connections, even though it can transfer semantic information between words: during parameter training, the gradient decreases gradually until it disappears, so the length of sequential data it can handle is limited. Long Short-Term Memory (LSTM) overcomes the gradient disappearance problem by introducing an input gate i, an output gate o, a forget gate f and a memory cell. The LSTM network structure [35] is shown in FIGURE 2.

FIGURE 2. The diagram of LSTM network structure

The forget gate f determines which information in the memory cell from the last moment is forgotten. Its inputs are h_{t-1} and x_t, and its output value is between 0 and 1. The calculation method is as follows:

f_t = \sigma(W_f x_t + U_f h_{t-1} + V_f c_{t-1} + b_f)    (7)

Among them, h_{t-1} and x_t are the inputs of the LSTM unit; W_f is the connecting weight between x_t and the forget gate f; U_f is the connecting weight between h_{t-1} and the forget gate f; c_{t-1} is the state of the memory cell at the last moment; V_f is the connecting weight between c_{t-1} and the forget gate f; b_f is the bias term; and \sigma(\cdot) is the sigmoid activation function.

The input gate i determines the information to be updated in the memory cell at the current time. The calculation method is as follows:

i_t = \sigma(W_i x_t + U_i h_{t-1} + V_i c_{t-1} + b_i)
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + V_c c_{t-1} + b_c)    (8)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t

Among them, W_i is the connecting weight between x_t and i_t; U_i is the connecting weight between h_{t-1} and i_t; V_i is the connecting weight between c_{t-1} and i_t; W_c, U_c and V_c are the corresponding connecting weights for the candidate state \tilde{c}_t; \tanh(\cdot) is the tanh activation function; f_t and i_t act as the weights of c_{t-1} and \tilde{c}_t; and b_i and b_c are the bias terms.

The output gate o determines the output value of the LSTM unit. The calculation method is as follows:

o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o c_{t-1} + b_o)    (9)
h_t = o_t \odot \tanh(c_t)

Among them, W_o is the connecting weight between x_t and o_t; U_o is the connecting weight between h_{t-1} and o_t; V_o is the connecting weight between c_{t-1} and o_t; and b_o is the bias term.
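The gate equations (7)-(9) transcribe almost directly into code. The sketch below performs a single LSTM step with the peephole connections V_f, V_i, V_c and V_o of the paper's formulation; the parameter dictionary p and the toy dimensions are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following equations (7)-(9); p holds the weight
    matrices W*, U*, the peephole weights V*, and the biases b*."""
    f = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["Vf"] @ c_prev + p["bf"])    # (7)
    i = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["Vi"] @ c_prev + p["bi"])    # (8)
    c_in = np.tanh(p["Wc"] @ x_t + p["Uc"] @ h_prev + p["Vc"] @ c_prev + p["bc"]) # candidate state
    c = f * c_prev + i * c_in                                                     # (8) cell update
    o = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["Vo"] @ c_prev + p["bo"])    # (9)
    h = o * np.tanh(c)                                                            # (9) hidden state
    return h, c

# Toy usage with 4-dimensional states and random parameters.
d = 4
rng = np.random.default_rng(0)
p = {k: rng.normal(size=(d, d))
     for k in ("Wf", "Uf", "Vf", "Wi", "Ui", "Vi", "Wc", "Uc", "Vc", "Wo", "Uo", "Vo")}
p.update({k: np.zeros(d) for k in ("bf", "bi", "bc", "bo")})
h, c = lstm_step(rng.normal(size=d), np.zeros(d), np.zeros(d), p)
```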

D. SENTIMENT ANALYSIS OF COMMENTS BASED ON BILSTM
To overcome the shortcomings of current comment sentiment analysis methods, a sentiment analysis method for comments based on BiLSTM is proposed in this paper.

In the traditional recurrent neural network model and the LSTM model, information can only be propagated forward, so the state at time t depends only on the information before time t. In order to make every moment contain the full context information, BiLSTM, which combines bidirectional recurrent neural network (BiRNN) models with LSTM units, is used to capture the context.

The BiLSTM model treats all inputs equally, but for the task of sentiment analysis, the sentiment polarity of a text largely depends on the words carrying sentiment information. In this paper, the sentiment word vectors are therefore reinforced. Sentiment analysis tasks are essentially text categorization tasks, and distributed word vectors do not take into account the contributions of different words to the categorization task. In Section III-A, weighted word vectors containing sentiment information and classification contribution were constructed.

First, the weighted word vectors are used as the inputs of the BiLSTM model, and the outputs of the BiLSTM model are used as the representations of the comment texts. Then, the comment text vectors are input into the feedforward neural network classifier, and finally the sentiment tendency of the comments is obtained. The activation function of the feedforward neural network is the ReLU function. In order to prevent over-fitting during training, a dropout mechanism is introduced, with a dropout rate of 0.5.

The schematic diagram of the proposed sentiment method is shown in FIGURE 3. The left subgraph is the comment text feature extraction process; the right subgraph is the process of obtaining the sentiment polarity of the comment text. NodeNum refers to the number of nodes in the LSTM hidden layer.
[Figure: the word embeddings w1 ... wn pass through the forward and backward LSTMs of the BiLSTM layer to form the document representation; the feedforward neural network classifier has NodeNum*2 input nodes, NodeNum/2 hidden nodes and 2 output nodes followed by a softmax.]
FIGURE 3. The comment sentiment analysis method proposed in this paper
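A minimal sketch of the FIGURE 3 architecture, written against the Keras API of TensorFlow with the hyperparameters reported later in TABLE VI; this is an assumed reconstruction, not the authors' released code.

```python
import tensorflow as tf

MAX_LEN, VECTOR_SIZE, NODE_NUM = 100, 300, 128  # values from Table VI

model = tf.keras.Sequential([
    # Inputs are precomputed weighted word vectors, so no Embedding layer is used.
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(NODE_NUM),                 # concatenated output: NodeNum*2
        input_shape=(MAX_LEN, VECTOR_SIZE)),
    tf.keras.layers.Dropout(0.5),                       # dropout rate from the paper
    tf.keras.layers.Dense(NODE_NUM // 2, activation="relu"),  # hidden layer: NodeNum/2
    tf.keras.layers.Dense(2, activation="softmax"),     # positive / negative
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.2),  # mini-batch GD, lr 0.2
              loss="categorical_crossentropy",          # expects one-hot labels
              metrics=["accuracy"])
# model.fit(X_train, y_train, batch_size=120, epochs=70)  # Table VI settings
```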


IV. EXPERIMENT

A. EXPERIMENTAL ENVIRONMENT
The experimental hardware platform is an Intel Xeon E5 (6 cores) with 32 GB of memory and a GTX 1080 Ti. The software platform is the Ubuntu 16.04 operating system, and the development environment is the Python 3.5 programming language. The TensorFlow library and the scikit-learn library are used to build the proposed sentiment analysis method and the comparative experiments.

B. DATA SET
The experimental corpus includes 15000 hotel comment texts (the Data set) crawled from Ctrip (https://www.ctrip.com/), with equal numbers of positive and negative texts. The polarities of the comment texts were already labeled on the Ctrip website. Examples from the Data set are shown in TABLE III.
TABLE III
EXAMPLES OF THE DATA SET

Positive:
- 不错，下次还考虑入住。交通也方便，在餐厅吃的也不错。 (Not bad; I will consider staying here again next time. The traffic is convenient, and the food in the restaurant is also good.)
- 挺好的，住的是公寓房，环境不错哦，我很喜欢！下次来还打算住那里。 (Very good. I stayed in an apartment room; the environment is nice and I like it very much. I plan to stay there again next time.)
- 住宿方便，房间清洁，价格实惠，服务尚可。 (The accommodation is convenient, the room is clean, the price is affordable, and the service is acceptable.)

Negative:
- 酒店比较旧，不符合四星标准，出行不是很方便。 (The hotel is rather old, does not meet the four-star standard, and travel is not very convenient.)
- 停车要收费，房间没有窗户，贵，周边餐饮不方便，性价比差！ (Parking is charged, the room has no windows, it is expensive, there are few dining options nearby, and the cost performance is poor.)
- 房间小，进去还一股霉味，电视是老款，问题是还放不出来，这个房间真对不起这个价格。 (The room is small and smells musty; the TV is an old model and does not even work. The room is really not worth the price.)
The distributed representations of words are the 300-dimensional word vectors trained by the Skip-Gram model provided by DataScience (https://mlln.cn). The training parameters are shown in TABLE IV.

TABLE IV
TRAINING PARAMETERS FOR WORD2VEC
Window size: 5
Dynamic window: yes
Sub-sampling: 1e-5
Low-frequency word threshold: 10
Iterations: 5
Negative sampling: 5
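For readers who want to reproduce comparable vectors, the TABLE IV parameters map onto gensim's Word2Vec roughly as follows; gensim and the toy corpus are assumptions on my part, since the paper itself uses pretrained vectors.

```python
from gensim.models import Word2Vec

# Toy corpus of pre-segmented comments, repeated so words clear min_count.
sentences = [["酒店", "的", "环境", "不错"], ["房间", "很", "干净"]] * 50

model = Word2Vec(
    sentences,
    vector_size=300,      # VectorSize used in the paper
    sg=1,                 # Skip-Gram
    window=5,             # window size
    shrink_windows=True,  # dynamic window (gensim >= 4.1)
    sample=1e-5,          # sub-sampling threshold
    min_count=10,         # drop low-frequency words
    epochs=5,             # iterations
    negative=5,           # negative sampling
)
vector = model.wv["不错"]  # a 300-dimensional word vector
```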

C. EVALUATION INDICATORS
The evaluation indicators in this paper are precision (P), recall (R) and F1 score. The relevant parameters are shown in TABLE V.

TABLE V
THE RELEVANT PARAMETERS
a: texts of the class that the classifier recognizes as the class
b: texts of other classes that the classifier recognizes as the class
c: texts of the class that the classifier recognizes as other classes
d: texts of other classes that the classifier recognizes as other classes

The formulas for calculating precision (P), recall (R) and F1 score are as follows:

P = \frac{a}{a + b}    (10)

R = \frac{a}{a + c}    (11)

F1 = \frac{2 P R}{P + R}    (12)
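Equations (10)-(12) transcribe directly into code; the counts in the usage line are made-up values for illustration only.

```python
def precision_recall_f1(a, b, c):
    """Equations (10)-(12); a, b, c are the counts defined in Table V."""
    p = a / (a + b)            # precision
    r = a / (a + c)            # recall
    f1 = 2 * p * r / (p + r)   # harmonic mean of P and R
    return p, r, f1

# e.g. 1380 true positives, 120 false positives, 100 false negatives:
print(precision_recall_f1(1380, 120, 100))  # ≈ (0.920, 0.932, 0.926)
```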
D. HYPERPARAMETERS SETTING OF MODEL
The hyperparameters of the proposed sentiment model include the number of epochs, the α value, the learning rate, MaxLen, NodeNum, and so on. The hyperparameters giving the best classification performance are studied here. The Data set is randomly divided into a test set and a training set at a ratio of 1:4; the other network parameters are kept unchanged while each hyperparameter is varied.

1) EPOCHS
Epochs is the number of iterations over the training set. As Epochs increases, the generalization ability of the model improves. However, if the number of epochs is too large, over-fitting occurs easily and the generalization ability of the model drops, so it is important to choose the right number of epochs. FIGURE 4 shows the classification performance of the model at different Epochs.

[Figure: F1 (%) vs. Epoch, 0-200.]
FIGURE 4. Relationship between Epochs and F1 score

It can be seen from FIGURE 4 that as Epochs grows, the F1 score of the model gradually increases, and it becomes stable when Epochs reaches 70.

2) α VALUE
In this paper, the weight of a word with sentiment information is α, which measures the contribution of sentiment information to the sentiment classification task. If the α value is too small, it cannot fully reflect the difference between sentiment words and non-sentiment words, reducing the effect of sentiment classification; if the α value is too large, it over-measures the contribution of sentiment information and reduces the accuracy of sentiment classification. FIGURE 5 shows the classification performance of the model at different α values.

[Figure: F1 (%) vs. α ∈ {1, 2, 3, 4, 5, 10, 15, 20}.]
FIGURE 5. Relationship between α values and F1 score

It can be seen from FIGURE 5 that when the α value is 1, the F1 score of the model is 90.87%; as the α value increases, the F1 score first increases and then decreases. When the α value is 2, 3 or 4, the contribution of the sentiment information is integrated into the weight of the word vector, so the F1 score is higher than the base value of 90.87%; the F1 score is highest when the α value is 2, reaching 91.90%. When the α value is 5 or above, the weight difference between sentiment words and non-sentiment words is too large, and the F1 score falls below 90.87%.

3) LEARNING RATE
The appropriate choice of learning rate is important for the optimization of the weights and biases. If the learning rate is too large, it is easy to overshoot the extreme point, making the system unstable; if it is too small, the training time is too long. FIGURE 6 shows the classification performance of the model at different learning rates.

[Figure: F1 (%) vs. learning rate ∈ {0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6}.]
FIGURE 6. Relationship between learning rate and F1 score

It can be seen from FIGURE 6 that the F1 scores of the model are all around 92%. Moreover, the F1 score reaches its maximum of 92.2% when the learning rate is 0.2.

4) MAXLEN
In this paper, MaxLen is the number of word vectors input into the BiLSTM. If the length of a data item is greater than MaxLen, the data is truncated; if it is less than MaxLen, zero vectors are appended at the end until the length reaches MaxLen.

The value of MaxLen is related to the input data of the model. If MaxLen is too large, too many zero vectors are padded into the data; if MaxLen is too small, too much information is lost. Thus, MaxLen has a great influence on the performance of the model. The distribution of data lengths is shown in FIGURE 7.

[Figure: histogram of text lengths in the dataset.]
FIGURE 7. Text length distribution in the dataset

It can be seen from FIGURE 7 that the data are short and most lengths are less than 200, so the maximum value of MaxLen is set to 200. FIGURE 8 shows the classification performance of the model at different MaxLen values.

[Figure: F1 (%) vs. MaxLen, 0-220.]
FIGURE 8. Relationship between MaxLen and F1 score

It can be seen from FIGURE 8 that when MaxLen is 20, the F1 score is only 90.27% because too much valid information in the data is discarded. As MaxLen increases, the F1 score shows an upward trend, reaching its maximum of 92.20% at MaxLen = 100. As MaxLen continues to increase, the F1 score trends downward, because too many zero vectors are padded in.
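A small sketch of the truncation and zero-padding rule just described, using the selected MaxLen of 100; the function name is illustrative.

```python
import numpy as np

MAX_LEN, VECTOR_SIZE = 100, 300  # the selected MaxLen and the word vector size

def pad_or_truncate(word_vectors):
    """Truncate sequences longer than MaxLen; zero-pad shorter ones at the end."""
    seq = np.asarray(word_vectors, dtype=np.float32).reshape(-1, VECTOR_SIZE)[:MAX_LEN]
    pad = np.zeros((MAX_LEN - seq.shape[0], VECTOR_SIZE), dtype=np.float32)
    return np.vstack([seq, pad])  # always shape (MAX_LEN, VECTOR_SIZE)

x = pad_or_truncate(np.random.randn(23, VECTOR_SIZE))  # -> shape (100, 300)
```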

5) NODENUM
The number of hidden layer nodes influences the complexity and performance of the model. If the number of nodes is too small, the network's learning ability is limited; if it is too large, the network structure becomes complex, it becomes easier to fall into local minima during training, and the learning speed decreases. FIGURE 9 shows the classification performance of the model at different NodeNum values.

[Figure: F1 (%) vs. NodeNum ∈ {32, 64, 96, 128, 160, 192, 224, 256}.]
FIGURE 9. Relationship between NodeNum and F1 score

It can be seen from FIGURE 9 that when the number of nodes increases from 32 to 128, the F1 score improves slightly; when the number of nodes exceeds 128, the F1 score trends downward. Therefore, the number of hidden layer nodes in the model is set to 128.

E. COMPARATIVE EXPERIMENTS

1) COMPARISON OF THE SENTIMENT ANALYSIS FOR DIFFERENT WORD REPRESENTATIONS
In order to verify the validity of the word representation proposed in this paper, different word representations are input into the BiLSTM model, and the sentiment analysis results are compared experimentally. The specific parameters of the sentiment analysis model are shown in TABLE VI.

TABLE VI
HYPERPARAMETER LIST
Epochs: 70
Learning rate: 0.2
Optimization function: Mini-Batch Gradient Descent
Loss function: Cross-Entropy Loss
MaxLen: 100
Dropout: 0.5
BatchSize: 120
NodeNum: 128
VectorSize: 300
α value: 2.0

Here, "vec" refers to the distributed word representation generated by Word2vec, which contains the semantic information of the words. "TF-IDF" refers to the distributed word vectors weighted with TF-IDF, which embodies the contribution of different words to the classification task. "Seninfo" refers to the distributed word vectors weighted with sentiment information, which embodies the difference between sentiment words and other words. "Seninfo+TF-IDF" refers to the distributed word vectors weighted with both TF-IDF and sentiment information.

The Data set is randomly divided into a test set and a training set at a ratio of 1:4. The F1 scores of ten repeated experiments for the different word vector representations are shown in FIGURE 10.

[Figure: F1 (%) per run (1-10) for seninfo+tfidf, seninfo, tfidf and vec.]
FIGURE 10. F1 scores of ten experiments for different word representations

It can be seen from FIGURE 10 that the F1 score is only around 88.5% when neither seninfo nor tfidf is integrated into the word representation. When either tfidf or seninfo is integrated into the weighted word vector, the sentiment analysis of comments improves significantly, reaching around 91%, with tfidf slightly better than seninfo. After integrating both tfidf and seninfo into the weight of the word vector, the sentiment analysis performs best, with F1 scores essentially above 92%.

The average precision, recall and F1 score of the ten repeated experiments for the different word vector representations are shown in TABLE VII and FIGURE 11.

TABLE VII
COMPARISON OF VARIOUS WORD VECTOR REPRESENTATIONS
Word representation | Precision | Recall | F1 score
Seninfo+TF-IDF      |   91.50   |  92.87 |  92.18
Seninfo             |   90.85   |  90.63 |  90.74
TF-IDF              |   91.18   |  90.92 |  91.05
vec                 |   89.06   |  88.14 |  88.60

[Figure: bar chart of precision, recall and F1 for the four word representations.]
FIGURE 11. Comparison of the sentiment analysis of different word vector representations

It can be seen from TABLE VII and FIGURE 11 that the precision, recall and F1 score of the word representation proposed in this paper (Seninfo+TF-IDF) are superior to those of the other word representation methods.

In particular, compared with the distributed word vector trained by Word2vec, the proposed representation method increases precision by 2.44 percentage points, recall by 4.73 percentage points, and F1 score by 3.58 percentage points. The reason is that the distributed word vector trained by Word2vec mainly contains the semantic information of words but not their sentiment information; at the same time, it cannot reflect the different importance of different words to the classification task. When such word vectors are input into the BiLSTM model for sentiment analysis, the discrimination between words is relatively weak, which reduces the accuracy of sentiment analysis. The word representation method proposed in this paper takes into account both the sentiment information contained in the words and their contribution to the classification task, which alleviates the above problems to some extent.

2) COMPARISON OF THE SENTIMENT ANALYSIS FOR DIFFERENT SENTIMENT ANALYSIS METHODS
In order to further prove the effectiveness of the proposed sentiment analysis method, it is compared with other traditional sentiment analysis methods (LSTM, RNN, CNN, Naive Bayesian). The inputs to all models are the weighted word vectors proposed in this paper. The hyperparameters of RNN and LSTM are as shown in TABLE VI. The CNN method uses a single channel with a convolution filter size of 5. The Naive Bayesian method uses MultinomialNB with alpha set to 2.0.
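A sketch of how the Naive Bayesian baseline could be set up with scikit-learn, using MultinomialNB with alpha=2.0 and the 1:4 split used throughout the experiments. The TF-IDF bag-of-words features and the toy corpus are assumptions on my part (the paper feeds its weighted word vectors to all models); the alpha value and split ratio come from the text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import precision_recall_fscore_support

comments = ["酒店 的 环境 不错", "房间 小 有 霉味"] * 100  # pre-segmented toy corpus
labels = [1, 0] * 100                                       # 1 = positive, 0 = negative

X = TfidfVectorizer().fit_transform(comments)               # non-negative features for NB
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2)                               # 1:4 test/training split

clf = MultinomialNB(alpha=2.0).fit(X_train, y_train)        # the paper's NB setting
p, r, f1, _ = precision_recall_fscore_support(
    y_test, clf.predict(X_test), average="binary")
print(p, r, f1)
```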
The Data set is randomly divided into a test set and a training set at a ratio of 1:4. The F1 scores of ten repeated experiments for the different sentiment analysis methods are shown in FIGURE 12.

[Figure: F1 (%) per run (1-10) for the proposed method, RNN, CNN, LSTM and NB.]
FIGURE 12. F1 scores of ten experiments for different sentiment analysis methods

It can be seen from FIGURE 12 that the F1 scores of the above methods are relatively stable over the ten experiments. Among the five methods, the sentiment classification performance of CNN and Naive Bayesian is poor, with F1 scores of only around 85%. With RNN and LSTM, which are suitable for sequence modeling, the F1 score reaches around 87%. The method proposed in this paper introduces the BiLSTM structure, which captures the semantic information of the context more effectively, so its sentiment analysis works best: the F1 score is significantly improved, reaching around 92%.

The average precision, recall and F1 score of the ten repeated experiments for the different sentiment analysis methods are shown in TABLE VIII and FIGURE 13.

TABLE VIII
EXPERIMENTAL RESULTS OF THE PROPOSED METHOD AND OTHER TRADITIONAL METHODS
Method          | Precision | Recall | F1 score
Proposed method |   91.54   |  92.82 |  92.18
RNN             |   87.18   |  86.60 |  86.89
CNN             |   85.10   |  84.12 |  84.61
LSTM            |   88.46   |  88.03 |  88.24
Naive Bayesian  |   86.02   |  84.13 |  85.06

[Figure: bar chart of precision, recall and F1 for the five methods.]
FIGURE 13. Comparison of experimental results between the proposed method and other traditional methods

It can be seen from TABLE VIII and FIGURE 13 that the precision of the proposed method reaches 91.54%, its recall reaches 92.82%, and its F1 score reaches 92.18%. The F1 scores of the other methods range from 84% to 89%, obviously lower than that of the proposed method. In addition, the F1 scores of RNN and LSTM, which suit sequential processing tasks, are higher than those of CNN and Naive Bayesian. The reasons are: ① the RNN deep learning model can effectively transfer semantics between words, but it suffers from the gradient disappearance problem; ② the CNN deep learning model can mine local information, but it cannot effectively model the semantic information passed along sequences; ③ the LSTM deep learning model alleviates the gradient disappearance problem to some extent, but it cannot capture the full context because information is transmitted only from front to back; ④ the Naive Bayesian machine learning model has a certain error rate because it determines posterior probabilities from prior knowledge and data.

The proposed sentiment analysis method uses the improved word representation as input, so the model can better learn the sentiment information contained in the words and their contribution to the classification task. The introduced BiLSTM model includes the LSTM unit, whose gating mechanism solves the gradient disappearance problem to some extent; in addition, both the forward and the reverse sequence information are considered, and the semantic information of the context is captured more effectively.

V. CONCLUSION
In the era of rapid development of Internet technology and social networks, it is very meaningful to explore the emotional tendency of comments through artificial intelligence technology. In this paper, a sentiment analysis method for comments based on BiLSTM is proposed and applied to the comment sentiment analysis task. To address the deficiency of the word representation methods in current research, the sentiment information contribution degree is integrated into the TF-IDF term weight computation, and a new word vector representation method based on the improved term weight computation is proposed. In addition, the BiLSTM model fully considers the context information and can better obtain the text representation of the comments. Finally, the sentiment tendency of the text is obtained through the feedforward neural network and a softmax mapping. The experiments on different word representation methods prove the validity of the proposed word representation, and the comparison experiments with other traditional sentiment analysis methods show that the accuracy of the proposed comment sentiment analysis method is improved. However, the BiLSTM-based method takes a long time to train the model; in future work, methods to effectively accelerate the training process will be studied.

REFERENCES
[1] L. Wang, D. Miao, and Z. Zhang, "Emotional Analysis on Text Sentences Based on Topic," Computer Science, vol. 41, no. 3, pp. 32-35, 2014.
[2] S. Krishnamoorthy, "Sentiment analysis of financial news articles using performance indicators," Knowledge & Information Systems, vol. 56, no. 2, pp. 373-394, 2018.
[3] N. Shelke, S. Deshpande, and V. Thakare, "Domain independent approach for aspect oriented sentiment analysis for product reviews," in Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, Singapore, 2017, pp. 651-659.
[4] P. Sharma and N. Mishra, "Feature level sentiment analysis on movie reviews," in 2016 2nd International Conference on Next Generation Computing Technologies (NGCT), IEEE, Dehradun, India, 2016, pp. 306-311.
[5] Q. Zhang, S. Zhang, and Z. Lei, "Chinese sentiment classification based on improved convolutional neural network," Computer Engineering and Applications, vol. 53, no. 22, pp. 116-120, 2017.
[6] D. Zhang et al., "Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf," Computer Science, vol. 43, no. 6A, pp. 418-421, 447, 2016.
[7] L. Kang, L. Xu, and J. Zhao, "Co-Extracting Opinion Targets and Opinion Words from Online Reviews Based on the Word Alignment Model," IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 3, pp. 636-650, 2015.
[8] Z. Hao et al., "A Dynamic Conditional Random Field Based Framework for Sentence-Level Sentiment Analysis of Chinese Microblog," in 2017 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), Guangzhou, China, 2017, pp. 135-142.
[9] Z. U. Rehman and I. S. Bajwa, "Lexicon-based sentiment analysis for Urdu language," in 2016 Sixth International Conference on Innovative Computing Technology (INTECH), IEEE, Dublin, Ireland, 2016, pp. 497-501.
[10] A. S. Manek et al., "Aspect term extraction for sentiment analysis in large movie reviews using Gini Index feature selection method and SVM classifier," World Wide Web, vol. 20, no. 2, pp. 135-154, 2017.
[11] M. S. Mubarok, Adiwijaya, and M. D. Aldhi, "Aspect-based sentiment analysis to review products using Naïve Bayes," in AIP Conference Proceedings, Budapest, Hungary, 2017, pp. 1-8.
[12] M. Bouazizi and T. Ohtsuki, "A pattern-based approach for multi-class sentiment analysis in Twitter," IEEE Access, vol. 5, pp. 20617-20639, 2017.
[13] P. Turney and M. L. Littman, "Measuring praise and criticism: Inference of semantic orientation from association," ACM Transactions on Information Systems, vol. 21, no. 4, pp. 315-346, 2003.
[14] M. Taboada et al., "Lexicon-based methods for sentiment analysis," Computational Linguistics, vol. 37, no. 2, pp. 267-307, 2011.
[15] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up?: sentiment classification using machine learning techniques," in Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, 2002, pp. 79-86.
[16] A. B. Goldberg and X. Zhu, "Seeing stars when there aren't many stars: graph-based semi-supervised learning for sentiment categorization," in Proceedings of the Workshop on Graph-Based Methods for Natural Language Processing, Association for Computational Linguistics, Sydney, Australia, 2006, pp. 45-52.
[17] Y. Wang, X. Zheng, D. Hou, and W. Hu, "Short Text Sentiment Classification of High Dimensional Hybrid Feature Based on SVM," Computer Technology and Development, vol. 28, no. 2, pp. 88-93, 2018.
[18] R. Socher et al., "Recursive deep models for semantic compositionality over a sentiment treebank," in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, United States, 2013, pp. 1631-1642.
[19] C. N. D. Santos and M. Gatti, "Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts," in Proceedings of the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, 2014, pp. 69-78.
[20] O. Irsoy and C. Cardie, "Opinion Mining with Deep Recurrent Neural Networks," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 720-728.
[21] K. S. Tai, R. Socher, and C. D. Manning, "Improved Semantic Representations from Tree-Structured Long Short-Term Memory Networks," in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 2015, pp. 1556-1566.
[22] C. Baziotis, N. Pelekis, and C. Doulkeridis, "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis," in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, 2017, pp. 747-754.
[23] F. Zhang et al., "Multi-Aspect-Aware Bidirectional LSTM Networks for Synthetic Aperture Radar Target Recognition," IEEE Access, vol. 5, pp. 26880-26891, 2017.
[24] L. Ying et al., "Document representation based on semantic smoothed topic model," in IEEE/ACIS International Conference on Software Engineering, Beijing, China, 2016, pp. 65-69.
[25] L. Zhu, G. Wang, and X. Zou, "A Study of Chinese Document Representation and Classification with Word2vec," in International Symposium on Computational Intelligence & Design, IEEE, Hangzhou, China, 2017, pp. 298-302.
[26] J. Zhao and X. Gui, "Deep Convolution Neural Networks for Twitter Sentiment Analysis," IEEE Access, vol. 6, pp. 23253-23260, 2018.
[27] G. E. Hinton, "Learning distributed representations of concepts," in Proceedings of the Eighth Conference of the Cognitive Science Society, Massachusetts, United States, 1986, pp. 1-12.
[28] Y. Bengio et al., "A neural probabilistic language model," Journal of Machine Learning Research, vol. 3, no. 6, pp. 932-938, 2003.
[29] T. Mikolov et al., "Distributed Representations of Words and Phrases and their Compositionality," in Advances in Neural Information Processing Systems, Nevada, United States, 2013, pp. 3111-3119.
[30] Y. Kim, "Convolutional Neural Networks for Sentence Classification," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 1746-1751.
[31] D. Tang, F. Wei, and B. Qin, "Sentiment Embeddings with Applications to Sentiment Analysis," IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 2, pp. 496-509, 2016.
[32] H. Chen et al., "Neural Sentiment Classification with User and Product Attention," in Proceedings of the Conference on Empirical Methods in Natural Language Processing, Austin, Texas, USA, 2016, pp. 1650-1659.
[33] J. Liu and Z. Zhang, "Sentiment Analysis on Food Safety News using Joint Deep Neural Network Model," Computer Science, vol. 43, no. 12, pp. 277-280, 2016.
[34] J. Li, "Chinese sentiment dictionary," Research Institute of Information Technology, Tsinghua University, No. 30 Shuangqing Road, Haidian District, Beijing, China. [Online]. Available: http://nlp.csai.tsinghua.edu.cn/site2/index.php/zh/people?catid=13&id=13:v10. Accessed: Oct. 4, 2018.
[35] S. Wang et al., "Sentiment classification of Uyghur text based on BLSTM," Computer Engineering and Design, vol. 38, no. 10, pp. 2879-2886, 2017.

Guixian Xu was born in Changchun, Jilin Province, China, in 1974. She received the B.S. and M.S. degrees from Changchun University of Technology in 1998 and 2002, and the Ph.D. degree in computer software and theory from Beijing Institute of Technology in 2010. Since 2002, she has been a teacher at the College of Information Engineering, Minzu University of China, where she is an associate professor. Her research interests are data mining and machine learning.

Yueting Meng was born in Shijiazhuang, Hebei, China, in 1996. She received the B.S. degree in computer science and technology from Hebei University of Science and Technology in 2018. She is currently pursuing the master's degree in software engineering with the Minzu University of China. Her research interests include artificial intelligence, natural language processing, and data mining.

Xiaoyu Qiu received his M.S. in Computer Science (2008) from Shandong Normal University. He is now a Librarian at the Library of Shandong University of Traditional Chinese Medicine. His current research interests include different aspects of pattern recognition, artificial intelligence and distributed systems.

Ziheng Yu was born in Taizhou, Zhejiang, China, in 1994. He received the B.S. degree in software engineering from Beijing Union University in 2017 and is currently studying for a master's degree in software engineering at Minzu University of China. His research interests include data mining, natural language processing, and artificial intelligence.

Xu Wu was born in Fenghuang, Hunan, China, in 1993. He received the B.S. degree in software engineering from Chongqing University of Posts and Telecommunications in 2017. He is currently pursuing the master's degree in modern education technology with the Minzu University of China. His research interests include data mining, natural language processing, and artificial intelligence.
