Sentiment Analysis of Comment Texts Based On BiLSTM


SPECIAL SECTION ON ARTIFICIAL INTELLIGENCE AND COGNITIVE COMPUTING FOR COMMUNICATION AND NETWORK

Received March 15, 2019, accepted March 31, 2019, date of publication April 9, 2019, date of current version April 29, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2909919

Sentiment Analysis of Comment Texts Based on BiLSTM
GUIXIAN XU1, YUETING MENG1, XIAOYU QIU2, ZIHENG YU1, AND XU WU1
1 College of Information Engineering, Minzu University of China, Beijing 100081, China
2 The Library, Shandong University of Traditional Chinese Medicine, Jinan 250355, China

Corresponding author: Guixian Xu (guixian_xu@muc.edu.cn)


This work was supported by Project of Humanities and Social Science, Ministry of Education of China, under Grant 18YJA740059.

ABSTRACT With the rapid development of Internet technology and social networks, a large number of comment texts are generated on the Web. In the era of big data, mining the emotional tendency of comments through artificial intelligence technology helps in the timely understanding of network public opinion. Sentiment analysis is a part of artificial intelligence, and research on it is valuable for obtaining the sentiment trend of comments. The essence of sentiment analysis is a text classification task, and different words contribute differently to the classification. Current sentiment analysis studies mostly use distributed word representations. However, a distributed word representation considers only the semantic information of a word and ignores its sentiment information. In this paper, an improved word representation method is proposed, which integrates the contribution of sentiment information into the traditional TF-IDF algorithm and generates weighted word vectors. The weighted word vectors are input into a bidirectional long short-term memory (BiLSTM) network to capture context information effectively, so that the comment vectors are better represented. The sentiment tendency of a comment is then obtained by a feedforward neural network classifier. Under the same conditions, the proposed sentiment analysis method is compared with sentiment analysis methods based on RNN, CNN, LSTM, and NB. The experimental results show that the proposed method achieves higher precision, recall, and F1 score, which indicates that it is effective and accurate on comment texts.

INDEX TERMS Sentiment analysis, artificial intelligence, social network, weighted word vectors, BiLSTM.

I. INTRODUCTION

In recent years, with the rapid development of the Internet and social networks, more and more users have begun to express their opinions freely on web pages, which generates a large volume of user comments on the Internet. For example, product comments are generated on e-commerce websites such as Jingdong and Taobao, and hotel comments are generated on travel websites such as Ctrip and ELong. With the explosive growth of comments, it is difficult to analyze them manually. In the era of big data, mining the emotional tendencies of comment texts through artificial intelligence technology helps in the timely understanding of network public opinion. Research on sentiment analysis is therefore valuable for obtaining the sentiment trend of the comments.

Sentiment analysis is a kind of text classification, involving natural language processing, machine learning, data mining, information retrieval and other research fields [1]. Sentiment analysis of comments mainly focuses on the sentiment orientation of a comment corpus, that is, whether users express positive, negative or neutral sentiments towards products or events. In addition, sentiment analysis can be divided into news comment analysis [2], product comment analysis [3], film comment analysis [4] and other types. These comments convey the views of Internet users about products, hot events, etc. Merchants can gauge user satisfaction from the relevant product comments. Potential users can evaluate products by viewing these product comments.

The essence of sentiment analysis is a text classification task, and different words contribute differently to the classification. For sentiment classification tasks, learning a low-dimensional, non-sparse word vector representation for a word is a key step [5]. The widely used word representation is the distributed word vector obtained by Word2vec technology [6]. The word vector has a low dimension and contains the semantic information of the word. However, distributed

The associate editor coordinating the review of this manuscript and approving it for publication was Yin Zhang.

2169-3536 2019 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
VOLUME 7, 2019
G. Xu et al.: Sentiment Analysis of Comment Texts Based on BiLSTM

word vectors do not contain sentiment information about words. In this paper, the contribution of a word's sentiment information to text sentiment classification is embedded into the traditional TF-IDF algorithm, and the weighted word vector is generated.

In this paper, a sentiment analysis method of comments based on BiLSTM is proposed. The remainder of this article consists of four parts. Firstly, the research background of text sentiment analysis methods and word vector representations is presented. Secondly, the proposed sentiment analysis method of comments is described in detail. Thirdly, the experiments are carried out and the experimental results are analyzed and discussed. Finally, the proposed method is summarized and the next research direction is introduced.

II. BACKGROUND
A. TEXT SENTIMENT ANALYSIS TECHNOLOGY
Text sentiment analysis technology mines the emotions expressed in text through computer technology. According to the object of analysis, text sentiment analysis can be divided into three levels: words [7], sentences [8] and chapters [9]. According to the classification of sentiment orientation, it can be divided into binary sentiment classification [10], ternary sentiment classification [11] and multi-sentiment classification [12].

At present, text sentiment analysis methods are mainly divided into three categories: methods based on a sentiment dictionary, methods based on machine learning, and methods based on deep learning.

The method based on a sentiment dictionary uses the dictionary to identify sentiment words in the text and obtain sentiment values. Then, according to the sentiment calculation rules, the sentiment tendency of the text is obtained. References [13], [14] introduced representative research based on sentiment dictionaries. Text sentiment analysis based on a sentiment dictionary does not require manual labeling of samples and is easy to implement. However, the quality of the analysis is highly dependent on the sentiment dictionary. Most sentiment dictionaries have problems such as insufficient coverage of sentiment words and a lack of domain words.

The earliest research on text sentiment analysis based on machine learning was by Pang et al. [15]. They used the naive Bayes algorithm, the maximum entropy algorithm and the SVM algorithm to analyze the sentiment of film reviews. The experimental results showed that the SVM algorithm worked best for the sentiment classification of movie reviews. Goldberg and Zhu [16] proposed a graph-based semi-supervised classification algorithm which scored 0-4 stars for the positive and negative comments. Wang et al. [17] studied the sentiment analysis of short texts. Based on multiple dimensions such as sentiment features, negative features and emoji, a high-dimensional mixed feature sentiment analysis model based on SVM was proposed. Sentiment analysis based on machine learning tends to be more accurate, but it relies on the quality of the corpus labeled with polarity.

In recent years, many scholars have introduced deep learning into sentiment analysis and achieved good results. The RNTN (Recursive Neural Tensor Network) model proposed by Socher et al. [18] introduced a sentiment tree library, which synthesized semantics on the syntactic tree of binary sentiment polarity and obtained good sentiment analysis results on a data set of movie reviews. The CharSCNN [19] (Character to Sentence Convolutional Neural Network) model used two convolutional layers to extract the features of related words and sentences, and mined semantic information to improve the sentiment analysis of short texts such as Twitter posts. Irsoy and Cardie [20] used a Recurrent Neural Network based on time series information to obtain sentence representations, which further improved the accuracy of sentiment classification. Tai et al. [21] proposed the Tree-Structured Long Short-Term Memory Networks model, which achieved good results in semantic association and sentiment classification. Baziotis et al. [22] introduced the attention mechanism into the LSTM, which achieved good results in the sentiment analysis of SemEval-2017 Task 4 for Twitter.

Considering that the feature words of comments are sparse, and in order to better capture the context information, Bidirectional Long Short Term Memory [23] is used to obtain the comment representation in this paper.

B. WORD REPRESENTATION
In natural language processing, words in sentences or documents are usually used as features [24]–[26]. Currently, there are two widely used word vector representations: one-hot representation and distributed representation.

The dimension of a one-hot vector equals the number of words in the dictionary, which is usually large. The vector of a word has the value 1 only in the dimension corresponding to the position of the word in the dictionary, and all other dimensions are 0. The method has the following problems: (1) the vector dimension will be too large if the dictionary contains too many words; (2) the vector has too many 0 values, which makes it sparse; (3) the method ignores the semantic association between words.

Distributed representation was proposed by Hinton in 1986 [27]. It maps each word into a low-dimensional real vector, which solves the problem that the dimension of the one-hot word vector is too large. All word representations constitute a word vector space, so semantic similarity can be judged by calculating the distance between words.

Bengio et al. [28] first introduced distributed word representation into neural network language modeling and proposed the Neural Network Language Model (NNLM). For the NNLM model, the context of the word $w_t$ was


represented as $context = \{w_{t-n}, w_{t-(n-1)}, \ldots, w_{t-1}\}$. The NNLM model consisted of four layers: the input layer, the mapping layer, the hidden layer, and the output layer. In order to obtain an efficient training model, Mikolov et al. [29] removed the nonlinear hidden layer of NNLM and proposed Word2vec technology. It contained two new log-linear models: Continuous Bag-of-Words (CBOW) and Skip-gram (SG). These models not only improved the accuracy of the word vector, but also greatly improved the training speed.

At present, word vectors have been applied to sentiment analysis. For example, Kim [30] took the lead in using word vectors as features and input word vectors into convolutional neural networks for sentiment classification. The model achieved good classification results. Tang et al. [31] introduced several neural networks to effectively encode context and sentiment information into word embeddings. The effectiveness of the word embedding learning algorithm was verified on a Twitter dataset. Chen et al. [32] integrated user and product information into a hierarchical neural network to implement sentiment analysis of user product reviews. Liu and Zhang [33] used news about food safety as the training corpus to obtain the word vectors. The trained word vectors were input into the underlying Recursive Neural Network to obtain the sentence representation. Then, sentiment analysis of the news was achieved through a high-level Recurrent Neural Network.

However, the word representations in current sentiment analysis research do not comprehensively consider the sentiment information contained in the words and its contribution to the classification. In this paper, the sentiment information is integrated into the traditional TF-IDF algorithm to calculate the weight of the words, so that the word vector can be better represented.

III. RESEARCH METHODS

A. THE CONSTRUCTION OF THE WEIGHTED WORD VECTOR
In this paper, the Word2vec model is used to obtain distributed representations of words. Word2vec technology includes the CBOW model and the Skip-gram model. Both models include an input layer, a projection layer and an output layer. The CBOW model predicts the target word based on its context. For the word $w_k$, the context is expressed as follows:

$$context(w_k) = w_{k-t}, w_{k-(t-1)}, \ldots, w_{k+(t-1)}, w_{k+t} \tag{1}$$

In contrast, the Skip-gram model predicts the context based on the target word $w_k$.

TF-IDF is a combination of the TF and IDF weight calculation methods. It is the most commonly used weight calculation method in text categorization. The method considers both the frequency of the word in a single document and the distribution of the word across the documents, so it can better reflect the importance of a feature in classification. The formulas of TF-IDF are as follows:

$$w(t_i, d) = \frac{tf(t_i, d) \times idf(t_i)}{\sqrt{\sum_{t_i \in d} [tf(t_i, d) \times idf(t_i)]^2}} \tag{2}$$

$$idf(t_i) = \log(N / n_{t_i}) + 1 \tag{3}$$

$w(t_i, d)$ denotes the weight of the word $t_i$ in document $d$, $tf(t_i, d)$ denotes the frequency of the word $t_i$ in document $d$, $N$ denotes the total number of documents and $n_{t_i}$ denotes the number of documents in which the word $t_i$ appears.

In this paper, whether a word contains sentiment information is determined by matching it against a sentiment dictionary. At present, the Hownet sentiment dictionary, the National Taiwan University Sentiment Dictionary (NTUSD) and Li Jun's Chinese commendatory term and derogatory term Dictionary of Tsinghua University [34] are three commonly used sentiment dictionaries in Chinese sentiment analysis. If the number of words in the sentiment dictionary is too large, it will contain a large number of words with low sentiment information. If the number of words in the sentiment dictionary is too small, it will miss a large number of sentiment words in the text and reduce the accuracy of sentiment classification. After analysis and comparison, Li Jun's Chinese commendatory term and derogatory term Dictionary of Tsinghua University is used in this paper. The dictionary contains a moderate number of words with high sentiment information, 10035 sentiment words in total. The related information is shown in TABLE 1 and TABLE 2.

TABLE 1. The related information of positive words.

TABLE 2. The related information of negative words.

The weight calculation method for word vectors is as follows:

$$w_i = \text{tf-idf}_i \cdot e \tag{4}$$

$$e = \begin{cases} \alpha, & t_i \text{ is a sentiment word} \\ 1, & t_i \text{ is a non-sentiment word} \end{cases} \tag{5}$$

where $t_i$ is the word, $\text{tf-idf}_i$ is the TF-IDF value of the feature word calculated by equation (2), $w_i$ is the weight of the word, and $e$ is the weight assigned according to whether the word contains sentiment information, with $\alpha > 1$.
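As a minimal sketch of equations (2) and (3), the following Python function computes the normalized TF-IDF weight of each word in one tokenized document. The function name `tfidf_weights`, the toy English corpus and the use of the natural logarithm are assumptions made for illustration (the paper does not state the base of the logarithm), and the document is assumed to belong to the corpus so that every word has a nonzero document count.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """Normalized TF-IDF weights (equations (2) and (3)) for one tokenized document.

    Assumes `doc` is one of the documents in `corpus`.
    """
    n_docs = len(corpus)
    # n_t: number of documents in which each word appears
    doc_freq = Counter(word for d in corpus for word in set(d))
    # tf(t, d): raw frequency of each word in the document
    tf = Counter(doc)
    # Numerator of equation (2), with idf taken from equation (3)
    raw = {t: tf[t] * (math.log(n_docs / doc_freq[t]) + 1.0) for t in tf}
    # Denominator of equation (2): Euclidean norm over the words of the document
    norm = math.sqrt(sum(v * v for v in raw.values()))
    return {t: v / norm for t, v in raw.items()}


# Hypothetical toy corpus of three tokenized comments
corpus = [
    ["room", "clean", "staff", "friendly"],
    ["room", "dirty", "noisy"],
    ["staff", "friendly", "breakfast", "good"],
]
weights = tfidf_weights(corpus[1], corpus)
```

Because of the normalization in equation (2), the squared weights of a document sum to 1, and words that occur in fewer documents (here "dirty" and "noisy") receive a larger weight than a word shared with other documents ("room").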


$\vec{E}$ is defined as the distributed word vector trained by Word2vec. The weighted word vector $\vec{E}_v$ constructed in this paper is defined as follows:

$$\vec{E}_v = w_i \cdot \vec{E} \tag{6}$$
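Equations (4) to (6) can be sketched as follows. The helper `weighted_word_vector`, the three-dimensional toy embeddings, the TF-IDF values and the small sentiment dictionary are hypothetical; α = 2 is used because it is the value that the experiments in Section IV report as giving the highest F1 score.

```python
def weighted_word_vector(embedding, tfidf, is_sentiment_word, alpha=2.0):
    """Scale a distributed word vector by its sentiment-aware TF-IDF weight."""
    e = alpha if is_sentiment_word else 1.0  # equation (5)
    w = tfidf * e                            # equation (4)
    return [w * x for x in embedding]        # equation (6)


# Hypothetical inputs: a tiny sentiment dictionary, embeddings and TF-IDF values
sentiment_dictionary = {"clean", "friendly", "dirty"}
embeddings = {"room": [0.2, -0.1, 0.4], "dirty": [-0.3, 0.5, 0.1]}
tfidf = {"room": 0.5, "dirty": 0.5}

weighted = {
    word: weighted_word_vector(vec, tfidf[word], word in sentiment_dictionary)
    for word, vec in embeddings.items()
}
```

With equal TF-IDF values, the sentiment word "dirty" keeps its full embedding (0.5 × 2 = 1), while the non-sentiment word "room" is scaled by 0.5, so sentiment words contribute more strongly to the comment representation.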

B. RECURRENT NEURAL NETWORKS

The traditional neural network model is ineffective for sequence learning because it cannot describe the correlation between earlier and later elements of a sequence. The RNN (Recurrent Neural Network) is a sequence learning model that connects nodes between hidden layers and can learn sequence features dynamically. An RNN applied to Chinese text sentiment analysis is shown in FIGURE 1. In the figure, the input text is a Chinese sentence meaning "The environment of the hotel is good". After word segmentation, it becomes a sequence of four words. Each word is converted into the corresponding word vector ($w_1$, $w_2$, $w_3$, $w_4$), and the word vectors are then input into the RNN sequentially, one vector $w_t$ per time step.

FIGURE 1. The sentiment analysis model of RNN.

The calculation process of RNN is as follows:
1) At time $t$, $w_t$ is input to the hidden layer.
2) $s_t$ is the output of the hidden layer at step $t$. It is based on $w_t$ and $s_{t-1}$: $s_t = f(U * w_t + W * s_{t-1})$, where $f$ is usually a non-linear function such as tanh or ReLU.
3) Finally, the output $o_t$ is calculated according to $o_t = \mathrm{softmax}(V * s_t)$.

C. LONG SHORT TERM MEMORY MODEL
The traditional recurrent neural network model cannot capture long-distance semantic connections, even though it can transfer semantic information between words. In the process of parameter training, the gradient decreases gradually until it disappears. As a result, the length of sequential data that can be handled is limited. Long Short Term Memory (LSTM) overcomes the problem of gradient disappearance by introducing the input gate i, the output gate o, the forget gate f and the memory cell. The LSTM network structure [35] is shown in FIGURE 2.

FIGURE 2. The diagram of LSTM network structure.

The forget gate f determines which information in the memory cell of the previous moment is forgotten. Its inputs are $h_{t-1}$ and $x_t$, and its output value is between 0 and 1. The calculation method is as follows:

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + V_f c_{t-1} + b_f) \tag{7}$$

Here, $h_{t-1}$ and $x_t$ are the inputs of the LSTM unit. $W_f$ is the connecting weight of $x_t$ and forget gate f. $U_f$ is the connecting weight of $h_{t-1}$ and forget gate f. $c_{t-1}$ is the state of the memory cell at the previous moment. $V_f$ is the connecting weight of $c_{t-1}$ and forget gate f. $b_f$ is the bias term. $\sigma(\cdot)$ is the sigmoid activation function.

The input gate i determines the information to be updated in the memory cell at the current time. The calculation method is as follows:

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + V_i c_{t-1} + b_i)$$
$$c\_in_t = \tanh(W_c x_t + U_c h_{t-1} + V_c c_{t-1} + b_c) \tag{8}$$
$$c_t = f_t \cdot c_{t-1} + i_t \cdot c\_in_t$$

Here, $W_i$ is the connecting weight of $x_t$ and $i_t$. $U_i$ is the connecting weight of $h_{t-1}$ and $i_t$. $V_i$ is the connecting weight of $c_{t-1}$ and $i_t$. $W_c$ is the connecting weight of $x_t$ and $c\_in_t$. $U_c$ is the connecting weight of $c\_in_t$ and $h_{t-1}$. $\tanh(\cdot)$ is the tanh activation function. $f_t$ and $i_t$ act as the weights of $c_{t-1}$ and $c\_in_t$. $b_i$ and $b_c$ are the bias terms.

The output gate o determines the output value of the LSTM unit. The calculation method is as follows:

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + V_o c_{t-1} + b_o)$$
$$h_t = o_t \cdot \tanh(c_t) \tag{9}$$

Here, $W_o$ is the connecting weight of $x_t$ and $o_t$. $U_o$ is the connecting weight of $h_{t-1}$ and $o_t$. $V_o$ is the connecting weight of $c_{t-1}$ and $o_t$. $b_o$ is the bias term.

D. SENTIMENT ANALYSIS OF COMMENTS BASED ON BILSTM
To overcome the shortcomings of current comment sentiment analysis methods, a sentiment analysis method of comments based on BiLSTM is proposed in this paper.

In the traditional recurrent neural network model and the LSTM model, information can only be propagated forward, so the state at time $t$ depends only on the information before time $t$. In order to make every moment contain the full context information, BiLSTM, which combines the bidirectional recurrent neural network (BiRNN) model with LSTM units, is used to capture the context information.

The BiLSTM model treats all inputs equally. For the task of sentiment analysis, the sentiment polarity of the text largely depends on the words with sentiment information. In this


FIGURE 3. The comment sentiment analysis method proposed in this paper.

TABLE 3. Examples of the data set.

paper, the sentiment information of sentiment word vectors is therefore reinforced. Sentiment analysis tasks are essentially text categorization tasks, and distributed word vectors do not take into account the contributions of different words to the categorization task. In Section III-A, the weighted word vectors containing sentiment information and classification contribution are constructed.

Firstly, the weighted word vectors are used as the inputs of the BiLSTM model, and the outputs of the BiLSTM model are used as the representations of the comment texts. Then, the comment text vectors are input into the feedforward neural network classifier. Finally, the sentiment tendency of the comments is obtained. The activation function of the feedforward neural network is the ReLU function. In order to prevent over-fitting in the training process, the dropout mechanism is introduced, and the dropout rate is set to 0.5.

The schematic diagram of the sentiment analysis method proposed in this paper is shown in FIGURE 3. The left subgraph is the process of comment text feature extraction. The right subgraph is the process of obtaining the sentiment polarity of the comment text. NodeNum refers to the number of nodes of the LSTM hidden layer.

IV. EXPERIMENT
A. EXPERIMENTAL ENVIRONMENT
In this paper, the experimental hardware platform is an Intel Xeon E5 (6 cores) with 32 GB of memory and a GTX 1080 Ti. The experimental software platform is the Ubuntu 16.04 operating system, and the development environment is the Python 3.5 programming language. The TensorFlow library and the Scikit-learn library of Python are used to build the proposed sentiment analysis method and the comparative experiments.

B. DATA SET
The experimental corpus (Data set) includes 15000 hotel comment texts crawled from Ctrip (https://www.ctrip.com/), with an equal number of positive and negative texts. The polarities of the comment texts have been labeled on the Ctrip website. Examples of the Data set are shown in Table 3.
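The LSTM unit of equations (7) to (9) and its bidirectional use described in Section III-D can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' TensorFlow implementation: the parameter layout, the random initialization, the tiny dimensions, the treatment of the $V$ weights as full matrices, and the choice to concatenate the final forward and backward hidden states as the comment representation are all assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(row, vec):
    return sum(r * v for r, v in zip(row, vec))


def affine(W, x, U, h, V, c, b):
    """Compute W x + U h + V c + b for small matrices stored as nested lists."""
    return [dot(W[k], x) + dot(U[k], h) + dot(V[k], c) + b[k] for k in range(len(b))]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step following equations (7)-(9); every gate also sees c_{t-1}."""
    def gate(name, act):
        z = affine(p["W" + name], x, p["U" + name], h_prev, p["V" + name], c_prev, p["b" + name])
        return [act(v) for v in z]

    f = gate("f", sigmoid)        # forget gate, equation (7)
    i = gate("i", sigmoid)        # input gate, equation (8)
    c_in = gate("c", math.tanh)   # candidate memory c_in_t, equation (8)
    c = [fk * cp + ik * ck for fk, cp, ik, ck in zip(f, c_prev, i, c_in)]
    o = gate("o", sigmoid)        # output gate, equation (9)
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_params(input_dim, hidden_dim, rng):
    """Hypothetical random initialization of the W, U, V weights and zero biases."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for g in "fico":
        params["W" + g] = mat(hidden_dim, input_dim)
        params["U" + g] = mat(hidden_dim, hidden_dim)
        params["V" + g] = mat(hidden_dim, hidden_dim)
        params["b" + g] = [0.0] * hidden_dim
    return params


def run_lstm(sequence, params, hidden_dim):
    """Run the LSTM over a sequence and return the final hidden state."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


def bilstm_encode(sequence, p_fwd, p_bwd, hidden_dim):
    """Assumed comment representation: final forward state + final backward state."""
    return run_lstm(sequence, p_fwd, hidden_dim) + run_lstm(sequence[::-1], p_bwd, hidden_dim)


rng = random.Random(0)
input_dim, hidden_dim = 3, 4
p_fwd = init_params(input_dim, hidden_dim, rng)
p_bwd = init_params(input_dim, hidden_dim, rng)
# Hypothetical comment of three weighted word vectors
comment = [[0.1, -0.05, 0.2], [-0.3, 0.5, 0.1], [0.0, 0.2, -0.1]]
representation = bilstm_encode(comment, p_fwd, p_bwd, hidden_dim)
```

The resulting vector has twice the hidden size, because the forward pass summarizes the words read left to right and the backward pass summarizes them right to left. In the method described above, such a vector would then be passed to the feedforward classifier.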


TABLE 4. Training parameters for Word2vec.

TABLE 5. The relevant parameters.

The distributed representations of words are the 300-dimensional word vectors trained by the Skip-gram model and provided by DataScience (https://mlln.cn). The parameters are shown in Table 4.

FIGURE 4. Relationship between Epochs and F1 score.


C. EVALUATION INDICATORS
The evaluation indicators in this paper are precision (P), recall (R) and F1 score. The relevant parameters are shown in Table 5.
The formulas for calculating precision (P), recall (R) and F1 score are as follows:

$$P = \frac{a}{a + b} \tag{10}$$

$$R = \frac{a}{a + c} \tag{11}$$

$$F1 = \frac{2 \times P \times R}{P + R} \tag{12}$$
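Equations (10) to (12) translate directly into code. The sketch below assumes the usual reading of the counts implied by the formulas: a is the number of texts correctly predicted as belonging to the class, b the number wrongly predicted as belonging to it, and c the number of texts of the class that were missed. The function name and the example counts are hypothetical.

```python
def precision_recall_f1(a, b, c):
    """Precision, recall and F1 score from the counts a, b and c."""
    p = a / (a + b)            # equation (10)
    r = a / (a + c)            # equation (11)
    f1 = 2 * p * r / (p + r)   # equation (12)
    return p, r, f1


# Hypothetical counts: 90 correct hits, 10 false alarms, 30 misses
p, r, f1 = precision_recall_f1(a=90, b=10, c=30)
```

F1 is the harmonic mean of precision and recall, so in this example it lies between the two values and closer to the lower one (recall).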

D. HYPERPARAMETERS SETTING OF MODEL

The hyperparameters of the proposed sentiment model include the number of epochs, the α value, the learning rate, MaxLen, NodeNum, and so on. The hyperparameter values that give the best classification effect are studied. The Data set is randomly divided into a test set and a training set at a ratio of 1:4. The hyperparameter under study is varied while the other network parameters are kept unchanged.

1) EPOCHS
Epochs is the number of passes over the training set. As Epochs increases, the generalization ability of the model improves. However, if the number of epochs is too large, over-fitting easily occurs and the generalization ability of the model declines. Therefore, it is important to choose the right number of epochs. FIGURE 4 shows the classification effect of the model at different Epochs.

It can be seen from FIGURE 4 that as Epochs grows, the F1 score of the model gradually increases. It tends to be stable when Epochs is 70.

FIGURE 5. Relationship between α values and F1 score.

2) α VALUE
In this paper, the weight of a word with sentiment information is α, which measures the contribution of sentiment information to the sentiment classification task. If the α value is too small, it cannot fully reflect the difference between sentiment words and non-sentiment words, which weakens the sentiment classification; if the α value is too large, it overstates the contribution of sentiment information and reduces the accuracy of sentiment classification. FIGURE 5 shows the classification effect of the model at different α values.

It can be seen from FIGURE 5 that when the α value is 1, the F1 score of the model is 90.87%; as the α value increases, the F1 score first increases and then decreases. When the α value is 2, 3 or 4, the contribution of the sentiment information is integrated into the weight of the word vector, so the F1 score of the model is higher than the base value of 90.87%; the F1 score is highest, at 91.90%, when the α value is 2. When α


value is above 5, the weight difference between sentiment words and non-sentiment words is too large, so the F1 score falls below 90.87%.

3) LEARNING RATE
An appropriate choice of learning rate is important for the optimization of the weights and biases. If the learning rate is too large, the updates easily overshoot the extreme point, making the system unstable. If the learning rate is too small, the training time is too long. FIGURE 6 shows the classification effect of the model at different learning rates.
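The trade-off described above can be illustrated on a toy problem. The sketch below runs plain gradient descent on the one-dimensional function f(x) = x², whose minimum is at x = 0; it is not the optimizer or the loss used in the paper, and the three learning rates are hypothetical values chosen only to show the three regimes.

```python
def gradient_descent(learning_rate, steps=50, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x               # derivative of x**2
        x = x - learning_rate * grad
    return x


x_small = gradient_descent(0.001)  # too small: still far from 0 after 50 steps
x_good = gradient_descent(0.1)     # moderate: converges close to 0
x_large = gradient_descent(1.1)    # too large: overshoots and diverges
```

Each step multiplies x by (1 − 2 × learning rate). With a rate of 0.1 the factor is 0.8 and x shrinks steadily, with 0.001 it shrinks very slowly, and with 1.1 the factor is −1.2, so x jumps across the minimum with growing magnitude.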

FIGURE 6. Relationship between learning rate and F1 score.

It can be seen from FIGURE 6 that the F1 scores of the model are around 92%. Moreover, the F1 score reaches a maximum value of 92.2% when the learning rate is 0.2.

FIGURE 7. Text length distribution in the dataset.
4) MAXLEN
In this paper, MaxLen is the number of word vectors input into BiLSTM. If the length of the data is greater than MaxLen, the data is truncated. If the length of the data is less than MaxLen, zero vectors are added at the end of the data until the length reaches MaxLen.

The value of MaxLen is related to the input data of the model. If MaxLen is too large, the data is filled with too many zero vectors. If MaxLen is too small, too much information is lost. Thus, MaxLen has a great influence on the performance of the model. The distribution of the data lengths is shown in FIGURE 7.

It can be seen from FIGURE 7 that the data lengths are short and most are less than 200. Therefore, the maximum value of MaxLen is set to 200. FIGURE 8 shows the classification effect of the model at different MaxLen values.

FIGURE 8. Relationship between MaxLen and F1 score.

It can be seen from FIGURE 8 that when MaxLen is 20, the F1 score is only 90.27% because too much valid information is discarded. As MaxLen increases, the F1 score shows an upward trend. When MaxLen = 100, the F1 score reaches a maximum value of 92.20%. As MaxLen continues to increase, the F1 score shows a downward trend, because too many zero vectors are filled in.

5) NODENUM
The number of hidden layer nodes influences the complexity and effect of the model. If the number of nodes is too small, the learning ability of the network is limited. If the number of nodes is too large, the network structure becomes complex. At the same time, it is easier to fall into local minimum points during training, and the learning speed of the network decreases. FIGURE 9 shows the classification effect of the model at different NodeNum values.

It can be seen from FIGURE 9 that when the number of nodes increases from 32 to 128, the F1 score improves slightly. When the number of nodes exceeds 128, the F1 score shows a downward trend. Therefore, the number of hidden layer nodes in the model is set to 128.

E. COMPARATIVE EXPERIMENTS
1) COMPARISON OF THE SENTIMENT ANALYSIS FOR DIFFERENT WORD REPRESENTATIONS
In order to verify the validity of the word representation proposed in this paper, different word representations are input into the BiLSTM model. The effects of sentiment analysis


FIGURE 9. Relationship between NodeNum and F1 score.

FIGURE 10. F1 scores of ten experiments for different word representations.

TABLE 6. Hyperparameter list.

TABLE 7. Comparison of various word vector representations.

are compared through experiments. The specific parameters of the sentiment analysis model are shown in Table 6.
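Of the hyperparameters studied in Section IV-D, MaxLen has the most mechanical definition: longer comments are truncated and shorter ones are padded with zero vectors at the end. A minimal sketch of that rule is given below; the function name `pad_or_truncate` and the toy vectors are hypothetical, and a real pipeline would apply the same rule with the MaxLen value chosen for the model.

```python
def pad_or_truncate(vectors, max_len, dim):
    """Return exactly max_len word vectors: truncate long inputs, zero-pad short ones."""
    clipped = vectors[:max_len]
    # Zero vectors are appended at the end until the length reaches max_len
    padding = [[0.0] * dim for _ in range(max_len - len(clipped))]
    return clipped + padding


# Hypothetical comments of 2 and 6 word vectors with dimension 3
short_comment = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
long_comment = [[float(i)] * 3 for i in range(6)]

padded = pad_or_truncate(short_comment, max_len=4, dim=3)
truncated = pad_or_truncate(long_comment, max_len=4, dim=3)
```

Both results contain exactly four vectors, which is what allows comments of different lengths to be fed to the same network; the cost is either discarded words or uninformative zero vectors, which matches the trade-off reported for FIGURE 8.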
In this paper, vec refers to the distributed word representation generated by Word2vec, which contains the semantic information of the words. TF-IDF refers to the distributed word vectors weighted with TF-IDF, which reflects the contribution of different words to the classification task. Seninfo refers to the distributed word vectors weighted with sentiment information, which reflects the difference between sentiment words and other words. Seninfo+TF-IDF refers to the distributed word vectors weighted with both TF-IDF and sentiment information.

The Data set is randomly divided into a test set and a training set at a ratio of 1:4. The F1 scores of ten repeated experiments for the different word vector representations are shown in FIGURE 10.

It can be seen from FIGURE 10 that the F1 score is only around 88.5% when neither Seninfo nor TF-IDF is integrated into the word representation. When either TF-IDF or Seninfo is integrated into the weighted word vector, the sentiment analysis of comments improves significantly, reaching around 91%, and integrating TF-IDF is slightly better than integrating Seninfo. After both TF-IDF and Seninfo are integrated as the weight of the word vector, the sentiment analysis has the best effect, and the F1 score is basically above 92%.

The average precision, recall and F1 score of ten repeated experiments for the different word vector representations are shown in Table 7 and FIGURE 11.

FIGURE 11. Comparison of the sentiment analysis of different word vector representations.

It can be seen from Table 7 and FIGURE 11 that the precision, recall and F1 score of the word representation (Seninfo+TF-IDF) proposed in this paper are superior to those of the other word representation methods. In particular, compared with the distributed word vector trained by Word2vec, the proposed representation increases the precision by 2.44 percentage points, the recall by 4.73 percentage points, and the F1 score by 3.58 percentage points. The reason is that the distributed word vector trained by Word2vec mainly contains the semantic information of words, but does not contain their sentiment information. At the same time, the distributed word vector trained by Word2vec cannot reflect the different importance of different words to the classification task. When such word vectors are input into the BiLSTM model for sentiment analysis, the words are only weakly discriminated, which reduces the accuracy of sentiment analysis. The word


representation method proposed in this paper takes into account the sentiment information contained in the words and the contribution to the classification task, which alleviates the above problems to some extent.
TABLE 8. Experimental results of the proposed method and other traditional methods.
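The comparison above is stated in terms of precision, recall and F1 score. As a minimal sketch, assuming the usual positive-class definitions (the helper names below are hypothetical and not taken from the paper), the three metrics and their relationship can be written as follows. The last lines check that the precision and recall reported later in this section for the proposed method (91.54% and 92.82%) are consistent with the reported F1 score of 92.18%. Since these are averages over ten runs, the identity need only hold approximately.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for the positive class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical confusion counts, for illustration only.
p, r = precision_recall(tp=8, fp=2, fn=2)
example_f1 = f1_from(p, r)

# Consistency check on the averaged figures reported for the proposed method.
reported_f1 = f1_from(91.54, 92.82)
print(round(reported_f1, 2))
```

Because F1 is a harmonic mean, it stays close to the smaller of the two values, which is why a method must raise both precision and recall to gain several F1 points.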

2) COMPARISON OF THE SENTIMENT ANALYSIS FOR DIFFERENT SENTIMENT ANALYSIS METHODS
To further demonstrate the effectiveness of the sentiment analysis method proposed in this paper, it is compared with other traditional sentiment analysis methods (LSTM, RNN, CNN, Naive Bayesian). The inputs of all models are the weighted word vectors proposed in this paper. The hyperparameters of RNN and LSTM are shown in Table 6. The CNN method uses a single channel, and the convolution filter size is set to 5. The Naive Bayesian method uses MultinomialNB with alpha set to 2.0.
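The name MultinomialNB presumably refers to the scikit-learn class, in which alpha is the additive (Laplace/Lidstone) smoothing parameter. The following pure-Python sketch illustrates what alpha = 2.0 does in a multinomial Naive Bayes classifier. It is not the authors' implementation: the function names and toy comments are hypothetical, and it works on raw word counts rather than the weighted word vectors used in the paper.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=2.0):
    """Fit multinomial Naive Bayes with additive smoothing `alpha`."""
    vocab = sorted({w for doc in docs for w in doc})
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    model = {}
    for c in class_docs:
        # Smoothed estimate: P(w | c) = (count(w, c) + alpha) / (total(c) + alpha * |V|)
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(class_docs[c] / len(labels)),
            "log_likelihood": {
                w: math.log((word_counts[c][w] + alpha) / denom) for w in vocab
            },
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    def score(c):
        ll = model[c]["log_likelihood"]
        return model[c]["log_prior"] + sum(ll[w] for w in doc if w in ll)
    return max(model, key=score)


# Toy, made-up tokenized comments (not from the paper's data set).
docs = [["good", "room"], ["great", "service", "good"],
        ["bad", "room"], ["dirty", "bad", "service"]]
labels = ["pos", "pos", "neg", "neg"]
model = train_multinomial_nb(docs, labels, alpha=2.0)
```

A larger alpha pulls every word probability toward the uniform value 1/|V|, so a word that never occurs in a class still receives a non-zero probability instead of vetoing that class.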
The data set is randomly divided into a test set and a training set at a ratio of 1:4. The F1 scores of ten repeated experiments for the different sentiment analysis methods are shown in FIGURE 12.
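The protocol described above (a random 1:4 test/training split, repeated ten times, with the scores averaged) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the use of seeds 0 to 9, and the placeholder evaluation function are hypothetical, and a real run would train a classifier on the training split and return its F1 score on the test split.

```python
import random
from statistics import mean


def split_1_to_4(samples, seed):
    """Shuffle a copy of `samples` and return (train, test) with test:train = 1:4."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = len(shuffled) // 5  # one fifth for testing, four fifths for training
    return shuffled[n_test:], shuffled[:n_test]


def repeated_evaluation(samples, evaluate, runs=10):
    """Call `evaluate(train, test)` on `runs` independent random splits."""
    scores = [evaluate(*split_1_to_4(samples, seed)) for seed in range(runs)]
    return scores, mean(scores)


def dummy_evaluate(train, test):
    """Placeholder score: the fraction of positive labels in the test split."""
    return sum(label for _, label in test) / len(test)


# Made-up labelled comments, for illustration only.
data = [(f"comment {i}", i % 2) for i in range(100)]
scores, average = repeated_evaluation(data, dummy_evaluate)
```

Using a different seed for each run gives ten different splits, so the spread of the ten scores indicates how sensitive a method is to the particular partition of the data.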
FIGURE 12. F1 scores of ten experiments for different sentiment analysis methods.
It can be seen from FIGURE 12 that the F1 scores of the above methods are relatively stable across the ten experiments. Among the five methods, the sentiment classification results of CNN and Naive Bayesian are poor, with an F1 score of only around 85%. With RNN and LSTM, which are suitable for sequence modeling, the F1 score reaches around 87%. The method proposed in this paper introduces the BiLSTM structure, which captures the semantic information of the context more effectively, so its sentiment analysis works best: the F1 score improves significantly, reaching around 92%.
The average precision, recall and F1 score of ten repeated experiments for the different sentiment analysis methods are shown in Table 8 and FIGURE 13.
FIGURE 13. Comparison of experimental results between the proposed method and other traditional methods.
It can be seen from Table 8 and FIGURE 13 that the precision of the proposed method reaches 91.54%, the recall reaches 92.82%, and the F1 score reaches 92.18%. The F1 scores of the other methods range from 84% to 89%, clearly lower than that of the proposed method. In addition, the F1 scores of RNN and LSTM, which are suitable for sequential processing tasks, are higher than those of CNN and Naive Bayesian. The reasons are: (1) the RNN deep learning model can effectively transfer the semantics between words, but it suffers from the vanishing gradient problem; (2) the CNN deep learning model can mine local information, but it cannot effectively model the semantic information passed along the sequence; (3) the LSTM deep learning model alleviates the vanishing gradient problem to some extent, but it cannot capture the semantic information of the following context, because information is transmitted only from front to back; (4) the Naive Bayesian machine learning model has a certain error rate because it determines the posterior probability from prior knowledge and data.
The proposed sentiment analysis method uses the improved word representation as input, so that the model can better learn the sentiment information contained in the words and their contribution to the classification task. The introduced BiLSTM model includes the LSTM unit, whose gating mechanism solves the vanishing gradient problem to some extent. In addition, both the forward sequence information and the reverse sequence information are considered, so the semantic information of the context is captured more effectively.
V. CONCLUSION
In the era of rapid development of Internet technology and social networks, it is very meaningful to explore the emotional tendency of comments through artificial intelligence technology. In this paper, a sentiment analysis method of comments based on BiLSTM is proposed and applied to the comment sentiment analysis task. To address the deficiencies of the word representation methods in current research, the sentiment information contribution degree is integrated into the TF-IDF algorithm of the term weight
computation, and a new word vector representation method based on the improved term weight computation is proposed. In addition, the BiLSTM model fully considers the context information and can better obtain the text representation of the comments. Finally, through the feedforward neural network and softmax mapping, the sentiment tendency of the text is obtained. The experiments with different word representation methods demonstrate the validity of the word representation method proposed in this paper. The comparison experiments with other traditional sentiment analysis methods show that the proposed comment sentiment analysis method achieves higher accuracy. However, the sentiment analysis method of comments based on BiLSTM takes a long time to train. In future work, methods to effectively accelerate the training process of the model will be studied.

GUIXIAN XU was born in Changchun, Jilin, China, in 1974. She received the B.S. and M.S. degrees from the Changchun University of Technology, in 1998 and 2002, respectively, and the Ph.D. degree in computer software and theory from the Beijing Institute of Technology, in 2010. Since 2002, she has been a Teacher with the Information Engineering College, Minzu University of China. She is currently an Associate Professor. Her research interests include data mining and machine learning.


YUETING MENG was born in Shijiazhuang, Hebei, China, in 1996. She received the B.S. degree in computer science and technology from the Hebei University of Science and Technology, in 2018. She is currently pursuing the master’s degree in software engineering with the Minzu University of China. Her research interests include artificial intelligence, natural language processing, and data mining.

XIAOYU QIU received the M.S. degree in computer science from Shandong Normal University, in 2008. He is currently a Librarian with the Library of Shandong University of Traditional Chinese Medicine. His current research interests include different aspects of pattern recognition, artificial intelligence, and distributed systems.

ZIHENG YU was born in Taizhou, Zhejiang, China, in 1994. He received the B.S. degree in software engineering from Beijing Union University, in 2017. He is currently pursuing the master’s degree in software engineering with the Minzu University of China. His research interests include data mining, natural language processing, and artificial intelligence.

XU WU was born in Fenghuang, Hunan, China, in 1993. He received the B.S. degree in software engineering from the Chongqing University of Posts and Telecommunications, in 2017. He is currently pursuing the master’s degree in modern education technology with the Minzu University of China. His research interests include data mining, natural language processing, and artificial intelligence.
