
BITI 3413: NATURAL LANGUAGE PROCESSING

SEM 1, 2023/2024

ASSIGNMENT 2

LECTURER’S NAME:

NAME MATRIC NO

Muhammad Adam Hafizi bin Hashim Tee B032110306

Muhammad Fakhrul Hazwan Bin Fahrurazi B032110357


i) Who is the creator and when was it introduced?

The Text-to-Text Transfer Transformer (T5) was created by a team of researchers at Google: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. It was introduced in 2020 in their paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer".

ii) Purpose of the LLM model in NLP

In natural language processing (NLP), T5 was developed to provide a unified framework for many NLP tasks by transforming them into a text-to-text format, where both the input and the output are expressed as natural language text. This design simplifies the execution of various NLP tasks because a single model and a single training objective are used for all of them. T5 handles tasks such as translation, summarization, question answering and many more. The goal of T5 is to unify multiple NLP tasks in one framework, improving performance across diverse NLP applications and speeding up the process of building and deploying models.
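
As a concrete illustration of this text-to-text interface, the short sketch below runs two different tasks through one model by changing only the task prefix in the input string. It assumes the Hugging Face transformers library and the publicly released "t5-small" checkpoint; it is an illustrative example rather than part of the original T5 codebase.

from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Two different tasks, expressed purely as input text with a task prefix.
prompts = [
    "translate English to German: The house is wonderful.",
    "question: What does T5 stand for? context: T5 is the "
    "Text-to-Text Transfer Transformer introduced by Raffel et al.",
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    # The answer to every task comes back as decoded text.
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))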


iii) Model architecture (with diagram, if any)
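
T5 follows the standard encoder-decoder Transformer layout described by Raffel et al. (2020): a stack of encoder blocks reads the input text and a stack of decoder blocks generates the output text token by token, with each block built from attention and feed-forward sublayers. In place of a diagram, the structure can be inspected directly from a loaded checkpoint; the sketch below assumes the Hugging Face transformers library and the public "t5-small" weights.

from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Printing the model lists the shared token embedding, the encoder and
# decoder stacks, the attention and feed-forward sublayers inside each
# block, and the language-model head that maps back to vocabulary tokens.
print(model)

# The config summarises the main architectural hyperparameters
# (d_model, number of layers, number of attention heads, ...).
print(model.config)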

iv) The methodologies of the LLM model development

a) Transformer Architecture

- The Transformer architecture, first presented by Vaswani et al. in their paper "Attention Is All You Need," serves as the foundation for T5. The Transformer is well suited to capturing long-range dependencies in sequential data such as natural language because it processes input sequences in parallel using a self-attention mechanism (sketched below).
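
The following is a minimal NumPy sketch of the scaled dot-product attention at the heart of that mechanism; the single head, the tiny dimensions, and the identity projections are simplifications of what a full Transformer block uses.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # token-to-token scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

# Toy input: a sequence of 4 tokens with 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

# In self-attention the queries, keys and values all come from the same
# sequence, so every token can attend to every other token in parallel.
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)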

b) Pre-training

- T5 is pre-trained on a large corpus containing diverse styles of text. During pre-training the model learns to predict missing segments of the input sequence, which teaches it to generate text that is consistent and contextually appropriate (an example of this input/target format is given below). This pre-training phase is essential for the model to capture general language patterns and semantic understanding.
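
A concrete way to picture this "predict the missing segments" objective is the span-corruption format described in the T5 paper, where dropped spans are replaced by sentinel tokens and the target reproduces only the missing text. The snippet below simply writes out such an input/target pair by hand; it illustrates the data format rather than the actual preprocessing pipeline.

# A sentence from the pre-training corpus.
original = "Thank you for inviting me to your party last week."

# Randomly chosen spans are replaced with sentinel tokens in the input,
# and the target lists each sentinel followed by the text it replaced,
# terminated by a final sentinel.
corrupted_input = "Thank you <extra_id_0> me to your party <extra_id_1> week."
target = "<extra_id_0> for inviting <extra_id_1> last <extra_id_2>"

print("pre-training input :", corrupted_input)
print("pre-training target:", target)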

c) Text-to-Text Framework

- T5 stands out due to its text-to-text framework: instead of relying on task-specific architectures, all NLP tasks share a common text generation format, so every task takes natural language text as input and produces natural language text as output. This unified approach simplifies training and enables a single model to handle a wide variety of NLP tasks, as illustrated below.
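
The illustrative (input text, target text) pairs below show how quite different tasks reduce to the same string-to-string format; the prefixes follow the conventions used for the public T5 checkpoints, and the targets are made up for illustration.

# Illustrative training pairs: every task is just "text in, text out".
examples = [
    ("translate English to German: That is good.", "Das ist gut."),
    ("summarize: <long news article> ...", "<short summary> ..."),
    ("cola sentence: The course is jumping well.", "not acceptable"),
    ("question: Who created T5? context: T5 was introduced by researchers "
     "at Google in 2020.", "researchers at Google"),
]

for source, target in examples:
    print(f"{source!r} -> {target!r}")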

d) Task Formulation

- For fine-tuning on specific NLP tasks, T5 uses task-specific prompts that frame each task as a text generation problem. This framing allows T5 to adapt to different tasks with a consistent methodology: the model is guided to approach every task as if it were generating text, even when the desired output is not strictly free-form text (for example, a class label). By framing tasks in this way, T5 can leverage its core text generation capabilities to tackle a wide range of NLP challenges; a minimal fine-tuning sketch follows.
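
The sketch below shows one such fine-tuning step, assuming PyTorch and the Hugging Face transformers library: a made-up sentiment example is framed as text generation by tokenizing a prefixed input and a short text target, and the model's built-in sequence-to-sequence loss is optimized.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One (input, target) pair for a sentiment task, both expressed as text.
source = "sst2 sentence: this movie was surprisingly touching"
target = "positive"

batch = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

# With labels provided, the model returns the cross-entropy loss over the
# target tokens, so one fine-tuning step is an ordinary optimization step.
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(float(outputs.loss))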

e) Multi-Task and Large-Scale Learning

- T5, or Text-To-Text Transfer Transformer, demonstrates improved performance through a combination of multi-task learning and large-scale training. Multi-task learning is employed in both the pre-training and fine-tuning stages, enabling the model to tackle multiple tasks simultaneously. This approach capitalizes on the knowledge shared across tasks, enhancing the model's overall capabilities (one common way of mixing tasks is sketched below). Additionally, T5 leverages the advantages of large-scale training, involving extensive datasets and powerful hardware such as GPUs or TPUs. Exposure to a diverse range of data allows the model to learn intricate patterns and relationships. The synergy of multi-task learning and large-scale training contributes to T5's effectiveness in understanding and generating human-like text across various language tasks.
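
One simple way to realise this multi-task setup, sketched below under simplified assumptions, is to mix examples from several task-specific datasets into a single training stream, sampling each task in proportion to its size (the T5 paper also studies other mixing strategies).

import random

# Toy task datasets; in practice these are large corpora of
# (input text, target text) pairs, one collection per task.
tasks = {
    "translation":   [("translate English to German: Hello.", "Hallo.")] * 100,
    "summarization": [("summarize: <document> ...", "<summary> ...")] * 300,
    "qa":            [("question: ... context: ...", "<answer> ...")] * 50,
}

def sample_mixed_batch(tasks, batch_size=8, seed=0):
    """Examples-proportional mixing: bigger tasks are sampled more often."""
    rng = random.Random(seed)
    names = list(tasks)
    weights = [len(tasks[name]) for name in names]
    batch = []
    for _ in range(batch_size):
        name = rng.choices(names, weights=weights, k=1)[0]
        batch.append(rng.choice(tasks[name]))
    return batch

for source, target in sample_mixed_batch(tasks):
    print(source[:45], "->", target)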

f) Evaluation and Iterative Improvement

- The development of the model follows an iterative process of continuous evaluation and refinement. Researchers assess the model's performance on benchmark datasets covering diverse NLP tasks, pinpoint areas that need improvement, and iteratively adjust both the model architecture and the training methodology.
v) Advantages and Weakness of the LLM

The main advantage of T5 is its flexibility: the text-to-text design can handle different kinds of natural language processing tasks simply by changing the input and output formats. This makes model development easier, since there is no need for task-specific architectures. T5 can translate, summarize, answer questions, and classify text, which shows that it is versatile and efficient at solving many language problems.

Another key strength of T5 lies in its extensive pre-training on massive datasets, allowing it to glean insights from a wide range of language patterns and structures. This large-scale pre-training contributes significantly to the model's proficiency in capturing nuanced linguistic features, thereby enhancing its overall performance on downstream tasks. The foundational knowledge acquired during pre-training positions T5 as a robust and effective language model, capable of understanding and generating coherent text across diverse contexts.

T5's strength is further exemplified by its consistently strong performance, achieving state-of-the-art results on prominent NLP benchmarks such as GLUE and SuperGLUE. This indicates an exceptional ability to grasp complex language structures and patterns, translating into high-quality outputs across a multitude of tasks. The model's success on these benchmarks underscores its effectiveness and competitiveness in the rapidly evolving landscape of NLP research and applications.


Moreover, T5 leverages transfer learning as a key methodology to bolster its performance on downstream tasks. By first pre-training on a vast corpus of data, T5 acquires a broad understanding of general language patterns and is then fine-tuned for specific applications. This transfer learning approach enhances T5's adaptability, allowing it to apply previously gained knowledge to new, task-specific challenges. Its versatility in handling various NLP tasks positions it as a powerful tool for researchers and practitioners seeking a comprehensive and adaptable solution.

Despite its impressive performance in natural language processing, the T5 model also introduces notable challenges. One significant drawback is its substantial size, with the largest variant more than thirty times larger than models like BERT; this hinders accessibility for researchers and practitioners relying on commodity GPU hardware and increases cost and difficulty. In addition, the model's susceptibility to brittleness and un-human-like failures underscores the ongoing complexity of achieving robust, human-like language understanding, particularly in real-world applications.

Additionally, the success of T5 highlights the pressing need for improved evaluation methodologies in the NLP community. Creating clean, challenging, and realistic test datasets remains difficult, which emphasizes the necessity of establishing fair benchmarks that accurately assess the capabilities of these advanced language models. This recognition of evaluation shortcomings is a call for continued efforts to improve the reliability of assessments and to drive progress in the field.


Furthermore, the ethical implications of biases present in the training data of models like T5 are a significant concern. Learned biases related to race, gender, and nationality can make the deployment of such models in real-world applications potentially illegal or unethical, necessitating careful debiasing efforts by product engineers. This underscores the importance of addressing biases in a task-independent manner, which remains a substantial open problem in NLP, and emphasizes the critical role of ethical considerations in the deployment of advanced language models.

In conclusion, T5 represents a groundbreaking advance in natural language processing, showcasing remarkable flexibility with its text-to-text design. Through extensive pre-training on massive datasets, T5 attains a deep understanding of linguistic nuances and consistently achieves state-of-the-art performance on benchmarks like GLUE and SuperGLUE. While recognizing these strengths, it is crucial to acknowledge the challenges tied to its substantial size and the ethical considerations surrounding bias. As T5 shapes the NLP landscape, its successes and challenges propel ongoing research, fostering progress and ethical deployment in the dynamic field of large language models.

vi) Include one NLP application that uses the LLM

One application that uses the T5 large language model is text summarization, which involves generating concise and coherent summaries that capture the important information from longer pieces of text. When T5 is used for text summarization, the model is fine-tuned on a dataset that contains pairs of longer documents and their corresponding human-written summaries. During training, the input is the document and the output is the summary; the model learns to understand the content of the document and to produce a summary that captures the key information in a human-like way.
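
As a sketch of what inference looks like for such a summarization model, assuming the Hugging Face transformers library and the public "t5-small" checkpoint (whose training mixture already included a "summarize:" prefix), the document is passed in as prefixed text and the summary is decoded from the generated tokens.

from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

document = (
    "summarize: The Text-to-Text Transfer Transformer (T5) frames every NLP "
    "task as mapping an input string to an output string. It is pre-trained "
    "on a large corpus and can then be fine-tuned on task-specific data such "
    "as pairs of documents and human-written summaries."
)

input_ids = tokenizer(document, return_tensors="pt", truncation=True).input_ids
summary_ids = model.generate(
    input_ids,
    num_beams=4,          # beam search tends to give more fluent summaries
    max_new_tokens=60,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))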

T5 is powerful, but the quality of its summaries depends on the training data and the fine-tuning process. Continuous evaluation and refinement are necessary to ensure that the generated summaries meet high standards of accuracy and informativeness.
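
One standard way to carry out that evaluation is to compare generated summaries against the human-written references with ROUGE. The sketch below assumes the rouge_score package (installable as rouge-score) and uses made-up strings in place of real model output.

from rouge_score import rouge_scorer

reference = ("T5 casts every NLP task as text-to-text and is fine-tuned on "
             "pairs of documents and human-written summaries.")
generated = ("T5 treats all NLP tasks as text-to-text and is fine-tuned for "
             "summarization on document-summary pairs.")

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, generated)

# Each score has precision, recall and F1; higher overlap with the human
# reference generally indicates a more faithful summary.
for name, value in scores.items():
    print(name, round(value.fmeasure, 3))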

vii) References (include 2-5 article papers that you referred when preparing your article)

Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21(140), 1–67. https://jmlr.org/papers/volume21/20-074/20-074.pdf

T5 - a lazy data science guide. (n.d.). https://mohitmayank.com/a_lazy_data_science_guide/natural_language_processing/T5/

Mishra, P. (2021, December 14). Understanding T5 Model: Text to Text Transfer Transformer model. Medium. https://towardsdatascience.com/understanding-t5-model-text-to-text-transfer-transformer-model-69ce4c165023

Bahani, M., Ouaazizi, A. E., & Maalmi, K. (2023). The effectiveness of T5, GPT-2, and BERT on text-to-image generation task. Pattern Recognition Letters, 173, 57–63. https://doi.org/10.1016/j.patrec.2023.08.001

T5. (n.d.). Hugging Face Transformers documentation. https://huggingface.co/docs/transformers/model_doc/t5
