REPORT
Rishi Nayak
2023201004
[ M.Tech CSE ]
The following report describes the implementation of Part-of-Speech (POS) tagging using a Feed-Forward Neural Network (FNN) and a Recurrent Neural Network (LSTM). The models are trained on a variety of English datasets and validated and tested on the given datasets. This report briefly explains the implementation of these neural networks and how the models are trained. It includes results for the different hyperparameter settings used to evaluate the models, along with graphs of accuracy against those hyperparameters.
INTRODUCTION
Code Overview (FNN)
The system's core is the FNN_POS_Tagger class, a PyTorch neural network module. It uses an embedding layer to convert input words into dense vectors, followed by fully connected layers that perform the classification. The system loads data from CoNLL-U files, which contain parsed sentences in the Universal Dependencies format; these sentences are processed to extract the words and their POS tags.
Helper functions generate input-output pairs for training the model. Each pair consists of an input context (previous and successive tokens around the focus word) and the corresponding POS tag. The model is trained on the provided training dataset by minimizing a cross-entropy loss with the Adam optimizer, while a validation dataset is used to monitor loss and accuracy. After training, the model is evaluated on separate test datasets to assess its generalization performance, and the test accuracy is reported for each. Finally, the trained model's parameters are saved to a file for future use.
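The feed-forward tagger described above can be sketched roughly as follows. The layer sizes, the p = s = 2 window, and the ReLU nonlinearity here are illustrative assumptions about the configuration, not the exact implementation:

```python
import torch
import torch.nn as nn

class FNN_POS_Tagger(nn.Module):
    """Hypothetical reconstruction: embed a fixed context window of word
    indices, concatenate the embeddings, and classify with two linear layers."""

    def __init__(self, vocab_size, tagset_size,
                 embedding_dim=100, hidden_dim=128, p=2, s=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        context_len = p + 1 + s  # p previous tokens + focus word + s successive tokens
        self.fc1 = nn.Linear(context_len * embedding_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, tagset_size)

    def forward(self, window):
        # window: (batch, context_len) word indices
        emb = self.embedding(window)          # (batch, context_len, embedding_dim)
        flat = emb.view(emb.size(0), -1)      # concatenate the context embeddings
        return self.fc2(torch.relu(self.fc1(flat)))  # (batch, tagset_size) tag scores
```

Training such a module would pair these tag scores with `nn.CrossEntropyLoss` and `torch.optim.Adam`, as the text describes.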
Hyperparameters
Code Overview (LSTM)
Functions generate input-output pairs for training the model. Each pair consists of an input sequence (words represented as vocabulary indices) and the corresponding sequence of POS tags. Data loaders handle batching and padding of these pairs for efficient training.
The model is trained on the training dataset by minimizing the cross-entropy loss with the Adam optimizer; validation on a separate dataset is used to monitor loss and accuracy. After training, the model is evaluated on multiple test datasets to assess its generalization performance, and the test accuracy is reported for each. The trained model's parameters and the vocabulary (word-to-index and tag-to-index mappings) are saved for future use.
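The sequence model and the padded batching described above can be sketched as follows. The class layout and the `collate` helper are assumptions for illustration (the hyperparameter names `lstm_stacks` and `bidirectional` are taken from the configurations later in this report):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

class LSTM_POS_Tagger(nn.Module):
    """Hypothetical sketch: embed each token, run an (optionally stacked,
    optionally bidirectional) LSTM over the sentence, and score tags per token."""

    def __init__(self, vocab_size, tagset_size, embedding_dim=100,
                 hidden_dim=128, lstm_stacks=1, bidirectional=False):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=lstm_stacks,
                            bidirectional=bidirectional, batch_first=True)
        out_dim = hidden_dim * (2 if bidirectional else 1)
        self.fc = nn.Linear(out_dim, tagset_size)

    def forward(self, batch):
        # batch: (batch, seq_len) padded word indices
        out, _ = self.lstm(self.embedding(batch))
        return self.fc(out)  # (batch, seq_len, tagset_size) per-token tag scores

def collate(batch):
    """Pad variable-length (word indices, tag indices) pairs to the batch
    maximum; index 0 is reserved for PAD."""
    words, tags = zip(*batch)
    return pad_sequence(words, batch_first=True), pad_sequence(tags, batch_first=True)
```

A `DataLoader` built with `collate_fn=collate` would then yield uniformly shaped batches, with PAD positions masked out of the loss.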
Hyperparameters
FNN REPORTS
Testing accuracies:
Using the entire given English dataset, I tested the model with the context window sizes listed below, under the condition p = s (equal numbers of previous and successive tokens), training on the training data and then measuring accuracy on the validation set as specified in the task. Here are the observations -
Case 1: p = s = 0
Case 2: p = s = 1
Case 3: p = s = 2
Case 4: p = s = 3
Case 5: p = s = 4
Here is the graph plotted between the context window size and the validation set
accuracies -
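The context windows behind these cases can be generated along the following lines. This is a minimal sketch; the `<PAD>` sentinel and the function name are illustrative choices, not necessarily those of the actual implementation:

```python
PAD = "<PAD>"

def make_pairs(sentence, tags, p, s):
    """Build (context window, tag) training pairs: pad the sentence edges so
    every token gets a full window of p previous and s successive tokens."""
    padded = [PAD] * p + sentence + [PAD] * s
    pairs = []
    for i, tag in enumerate(tags):
        window = padded[i : i + p + 1 + s]  # p before, the word itself, s after
        pairs.append((window, tag))
    return pairs
```

With p = s = 0 (Case 1) each window is just the word itself, so the tagger sees no context at all, which is why that case serves as the baseline.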
LSTM REPORTS
hyperparameters = [
{"epochs": 10, "lstm_stacks": 1, "bidirectional": False,
"hidden_dim": 128, "embedding_dim": 100, "activation": nn.Tanh()},
{"epochs": 5, "lstm_stacks": 2, "bidirectional": True,
"hidden_dim": 256, "embedding_dim": 150, "activation": nn.ReLU()},
{"epochs": 8, "lstm_stacks": 1, "bidirectional": False,
"hidden_dim": 100, "embedding_dim": 200, "activation": nn.Sigmoid()}
]
1. Set 1 Results -
2. Set 2 Results -
3. Set 3 Results -
Performance of the best model on the test data:
1. ud-treebanks-v2.13/UD_English-ESLSpok/en_eslspok-ud-test.conllu:
Micro Precision: 0.9871, Micro Recall: 0.9871, Micro F1-score: 0.9871
Macro Precision: 0.8622, Macro Recall: 0.8176, Macro F1-score: 0.8296
2. ud-treebanks-v2.13/UD_English-EWT/en_ewt-ud-test.conllu:
Micro Precision: 0.9803, Micro Recall: 0.9803, Micro F1-score: 0.9803
Macro Precision: 0.7962, Macro Recall: 0.7341, Macro F1-score: 0.7560
3. ud-treebanks-v2.13/UD_English-GUM/en_gum-ud-test.conllu:
Micro Precision: 0.9699, Micro Recall: 0.9699, Micro F1-score: 0.9699
Macro Precision: 0.7949, Macro Recall: 0.7160, Macro F1-score: 0.7449
4. ud-treebanks-v2.13/UD_English-Pronouns/en_pronouns-ud-test.conllu:
Micro Precision: 0.8916, Micro Recall: 0.8916, Micro F1-score: 0.8916
Macro Precision: 0.7730, Macro Recall: 0.7514, Macro F1-score: 0.7486
5. ud-treebanks-v2.13/UD_English-GENTLE/en_gentle-ud-test.conllu:
Micro Precision: 0.9804, Micro Recall: 0.9804, Micro F1-score: 0.9804
Macro Precision: 0.7539, Macro Recall: 0.6903, Macro F1-score: 0.7087
6. ud-treebanks-v2.13/UD_English-LinES/en_lines-ud-test.conllu:
Micro Precision: 0.9688, Micro Recall: 0.9688, Micro F1-score: 0.9688
Macro Precision: 0.8180, Macro Recall: 0.7548, Macro F1-score: 0.7767
7. ud-treebanks-v2.13/UD_English-PUD/en_pud-ud-test.conllu:
Micro Precision: 0.9444, Micro Recall: 0.9444, Micro F1-score: 0.9444
Macro Precision: 0.7771, Macro Recall: 0.7095, Macro F1-score: 0.7301
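The micro- and macro-averaged scores above can be computed from per-tag counts; a minimal sketch is given below. Note that for single-label tagging, micro precision, recall, and F1 all reduce to plain token accuracy, which is why the three micro values coincide in every row above:

```python
from collections import Counter

def micro_macro(gold, pred):
    """Return (micro score, macro precision, macro recall, macro F1)
    for two equal-length lists of gold and predicted tags."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag p where it was wrong
            fn[g] += 1  # gold tag g that was missed
    # Micro: pool counts over all tags; with one tag per token this
    # equals accuracy, so micro precision = recall = F1.
    micro = sum(tp.values()) / len(gold)
    # Macro: compute precision/recall/F1 per tag, then average unweighted.
    tags = set(gold) | set(pred)
    precs, recs, f1s = [], [], []
    for t in tags:
        prec = tp[t] / (tp[t] + fp[t]) if tp[t] + fp[t] else 0.0
        rec = tp[t] / (tp[t] + fn[t]) if tp[t] + fn[t] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec); recs.append(rec); f1s.append(f1)
    n = len(tags)
    return micro, sum(precs) / n, sum(recs) / n, sum(f1s) / n
```

Because macro averaging weights every tag equally, rare tags that the model handles poorly pull the macro scores well below the micro scores, as seen in the tables above.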
MODEL PERFORMANCE -
LSTM -
FNN -
CONCLUSION
FNN REPORT
Accuracy: Larger context windows can improve accuracy by providing more contextual cues, while windows that are too small can reduce accuracy because the model sees insufficient context.
Efficiency: Smaller window sizes improved training efficiency but may sacrifice accuracy, while larger window sizes improved accuracy at the cost of increased computation.
RNN REPORT
Hyperparameters:
{"epochs": 10, "lstm_stacks": 1, "bidirectional": False, "hidden_dim": 128, "embedding_dim": 100,
"activation": nn.Tanh()}
This configuration resulted in moderate training and validation losses. The model converged quickly, possibly indicating that the network's capacity was not fully utilized.
Hyperparameters:
{"epochs": 15, "lstm_stacks": 2, "bidirectional": True, "hidden_dim": 256, "embedding_dim": 150,
"activation": nn.ReLU()}
This configuration led to slightly higher training and validation losses compared to the
first configuration but with an improved accuracy on the validation set. The deeper
network with bidirectional LSTM layers might have contributed to capturing more
complex patterns in the data.
Hyperparameters:
{"epochs": 12, "lstm_stacks": 1, "bidirectional": False, "hidden_dim": 100, "embedding_dim": 200,
"activation": nn.Sigmoid()}
With this configuration, the model exhibited a similar performance to the first
configuration on the validation set. The higher embedding dimension might have
allowed the model to capture more nuanced semantic information, but it did not
significantly improve the overall performance.