Computer Organization: National Institute of Technology Hamirpur
CSD-221
COMPUTER ORGANIZATION
Submitted By :
185010 Akshay Kumar akshaychoudharyac01@gmail.com
185019 Prateek Bharat Sharma prateek21112@gmail.com
185036 Aman Garg amangarg3april@gmail.com
{ B. Tech CSE 4yr }
Submitted To :
Dr. Jatoth Chandrashekhar
Assistant Professor
Computer Science & Engineering Dept.
NIT Hamirpur
Text Sentiment Predictor 185010,185019,185036
Contents
1 Introduction
1.1 LSTM (Long Short-Term Memory)
3 Code
3.1 Train
3.2 Test
1 Introduction :
A recurrent neural network (RNN) is a class of artificial neural networks
in which connections between nodes form a directed graph along a temporal sequence.
This allows it to exhibit temporal dynamic behavior. Derived from feedforward
neural networks, RNNs can use their internal state (memory) to process
variable-length sequences of inputs. This makes them applicable to tasks such as
unsegmented, connected handwriting recognition [2] or speech recognition.
The term "recurrent neural network" is used indiscriminately to refer to two
broad classes of networks with a similar general structure: finite impulse and
infinite impulse. Both classes of networks exhibit temporal dynamic behavior.
A finite impulse recurrent network is a directed acyclic graph that can be
unrolled and replaced with a strictly feedforward neural network, while an infinite
impulse recurrent network is a directed cyclic graph that cannot be unrolled.
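To make the idea of an internal state concrete, here is a minimal sketch of a one-unit recurrent cell run over sequences of different lengths. It is only an illustration: the function name `rnn_forward` and the weight values `w_x`, `w_h` and `b` are assumptions chosen for the example, not trained parameters and not part of our project code. The loop over time steps corresponds to unrolling the network over a finite input sequence.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit recurrent cell over a sequence of scalar inputs.

    The weights are arbitrary illustrative values (assumption),
    not parameters learned from data.
    """
    h = 0.0  # internal state (memory), empty before the first input
    states = []
    for x in inputs:
        # The same weights are reused at every time step; the new state
        # depends on the current input and on the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# The same cell handles sequences of different lengths.
short = rnn_forward([1.0, 0.0])
longer = rnn_forward([1.0, 0.0, 0.0, 0.0, 1.0])
print(len(short), len(longer))  # prints: 2 5
```

Even though the second input of `short` is zero, its second state is non-zero, because the state from the first step is carried forward.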
3 Code :
The code for our train and test files is as follows:
3.1 Train
Listing 1: train.py
import pandas as pd
import numpy as np
import emoji

# Load the labelled sentences; the CSV has a 'Text' and a 'Label' column.
data_set = pd.read_csv('emoji_data.csv', engine='python')

text = data_set['Text']

# Emoji glyph for each sentiment label (glyphs not reproduced here).
emoji_dict = {"joy": "", "fear": "", "anger": "", "sadness": "",
              "disgust": "", "shame": "", "guilt": ""}
# Class index for each sentiment label.
# Note: this rebinds the name `emoji`, shadowing the imported module.
emoji = {"joy": 0, "fear": 1, "anger": 2, "sadness": 3,
         "disgust": 4, "shame": 5, "guilt": 6}

# Replace each label string with its class index.
label = data_set['Label'].map(emoji)

# One-hot encode the labels: one row per sentence, one column per class.
Y = np.zeros((len(label), 7))
for i in range(len(label)):
    Y[i, label[i]] = 1


def read_glove_vecs(glove_file):
    """Read a GloVe text file and build the word/index/vector lookups."""
    with open(glove_file, 'r', encoding='utf-8') as f:
        words = set()
        word_to_vec_map = {}
        for line in f:
            line = line.strip().split()
            curr_word = line[0]
            words.add(curr_word)
            word_to_vec_map[curr_word] = np.array(line[1:], dtype=np.float64)

    # Indices start at 1 so that 0 can be used for padding.
    i = 1
    words_to_index = {}
    index_to_words = {}
    for w in sorted(words):
        words_to_index[w] = i
        index_to_words[i] = w
        i = i + 1
    return words_to_index, index_to_words, word_to_vec_map

# embeddings, sentence_indices and X_set, as well as the Keras layer imports,
# come from the part of train.py that is not reproduced in this listing.

# First recurrent layer: bidirectional LSTM that returns the full sequence.
X = Bidirectional(LSTM(units=128, return_sequences=True))(embeddings)

X = Dropout(rate=0.4)(X)

# Second LSTM layer returns only its final output.
X = LSTM(units=128, return_sequences=False)(X)

X = Dropout(rate=0.4)(X)

# One output unit per sentiment class, turned into probabilities by softmax.
X = Dense(units=7)(X)

X = Activation('softmax')(X)

model = Model(inputs=sentence_indices, outputs=X)

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Hold out 10% of the data; it is used as validation data during training.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X_set, Y, test_size=0.10)

model.fit(X_train, y_train, epochs=20, batch_size=32,
          shuffle=True, validation_data=(X_test, y_test))

model.save('Emoji-Predictor.h5')
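The label handling at the start of train.py turns each sentiment name into a class index and then into a one-hot row, which is the target format expected by the categorical cross-entropy loss. The following is a small stand-alone sketch of that step in plain Python. The label-to-index mapping is the one from the listing; the names `LABEL_TO_INDEX` and `one_hot` are hypothetical and used only for this illustration.

```python
# Same label-to-index mapping as in train.py.
LABEL_TO_INDEX = {"joy": 0, "fear": 1, "anger": 2, "sadness": 3,
                  "disgust": 4, "shame": 5, "guilt": 6}


def one_hot(labels, mapping=LABEL_TO_INDEX):
    """Return one row per label with a single 1.0 at the label's class index."""
    rows = []
    for name in labels:
        row = [0.0] * len(mapping)  # one column per class
        row[mapping[name]] = 1.0
        rows.append(row)
    return rows


encoded = one_hot(["anger", "joy"])
print(encoded[0])  # [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(encoded[1])  # [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Each row sums to 1, so it can be compared directly with the seven probabilities produced by the softmax layer.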
3.2 Test
Listing 2: test.py
import pandas as pd
import numpy as np
import emoji
from nltk.tokenize import TweetTokenizer
import keras
import tensorflow

# Emoji glyph for each sentiment label (glyphs not reproduced here).
emoji_dict = {"joy": "", "fear": "", "anger": "", "sadness": "",
              "disgust": "", "shame": "", "guilt": ""}
# Class index for each sentiment label.
# Note: this rebinds the name `emoji`, shadowing the imported module.
emoji = {"joy": 0, "fear": 1, "anger": 2, "sadness": 3,
         "disgust": 4, "shame": 5, "guilt": 6}


def read_glove_vecs(glove_file):
    """Same helper as in train.py (Listing 1), repeated so test.py runs on its own."""
    with open(glove_file, 'r', encoding='utf-8') as f:
        words = set()
        word_to_vec_map = {}
        for line in f:
            line = line.strip().split()
            curr_word = line[0]
            words.add(curr_word)
            word_to_vec_map[curr_word] = np.array(line[1:], dtype=np.float64)

    i = 1
    words_to_index = {}
    index_to_words = {}
    for w in sorted(words):
        words_to_index[w] = i
        index_to_words[i] = w
        i = i + 1
    return words_to_index, index_to_words, word_to_vec_map


def preprocess(X, words_to_index, max_len):
    """Convert one sentence into a (1, max_len) array of word indices."""
    # Unused positions stay 0 (padding).
    X_indices = np.zeros((1, max_len))
    tknz = TweetTokenizer()
    sentence_words = [word.lower() for word in tknz.tokenize(X)]
    j = 0
    for w in sentence_words:
        # Stop once max_len known words have been stored.
        if j >= max_len:
            break
        try:
            X_indices[0, j] = words_to_index[w]
            j = j + 1
        except KeyError:
            # Skip words that are not in the GloVe vocabulary.
            continue
    return X_indices


max_len = 40
words_to_index, index_to_words, word_to_vec_map = read_glove_vecs('glove.6B.300d.txt')

# Read the sentence to classify.
with open('input.txt', 'r') as file:
    inp = file.read()

x = preprocess(inp, words_to_index, max_len)
model = keras.models.load_model('Emoji-Predictor.h5')

# Pick the class with the highest predicted probability.
y = model.predict(x)
y = np.argmax(y, axis=-1)
arr = list(emoji_dict.keys())

ans = "Predicted Sentiment: " + arr[y[0]]
ans += "\nPredicted Emoji: " + emoji_dict[arr[y[0]]]

# Write as UTF-8 so the emoji character is stored correctly.
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write(ans)
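The two non-model steps of test.py, turning words into a fixed-length row of indices and turning the output probabilities back into a label, can be tried without GloVe or Keras. The sketch below does this in plain Python under stated assumptions: `toy_vocab`, the token list and the probability vector are made-up values, and the helper names `to_padded_indices` and `decode` are hypothetical; only the label order is taken from the listing.

```python
def to_padded_indices(tokens, words_to_index, max_len):
    """Map tokens to indices, skip unknown words, pad the rest with 0."""
    indices = [0] * max_len
    j = 0
    for token in tokens:
        if j >= max_len:
            break
        index = words_to_index.get(token.lower())
        if index is None:
            continue  # word not in the vocabulary
        indices[j] = index
        j += 1
    return indices


def decode(probabilities, labels):
    """Return the label whose probability is highest (argmax)."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best]


# Toy vocabulary (assumption); the real one is built from the GloVe file.
toy_vocab = {"happy": 1, "i": 2, "am": 3}
print(to_padded_indices(["I", "am", "sooo", "happy"], toy_vocab, 6))
# prints: [2, 3, 1, 0, 0, 0]

# Label order as in the listing; the probabilities are made up.
labels = ["joy", "fear", "anger", "sadness", "disgust", "shame", "guilt"]
print(decode([0.1, 0.05, 0.6, 0.1, 0.05, 0.05, 0.05], labels))
# prints: anger
```

Index 0 is reserved for padding, which is why the vocabulary built by read_glove_vecs starts counting at 1.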