
English to Yoruba Translation using RNN


A beginner's approach to Sequence to Sequence modeling


Table of Contents

Overview
Dataset
Model
Result

Overview

This project is a beginner's approach to Sequence to Sequence (Seq2Seq) modeling with Recurrent Neural Networks (RNNs), specifically LSTMs. My aim here is to understand the foundations of Seq2Seq modeling and to progressively build my understanding of the NLP workflow.

Some of my past projects on natural language processing include:

In this project, I implemented a Seq2Seq model using an RNN (LSTM). The approach is deliberately simple: a basic word-level tokenizer, with no subword techniques such as Byte Pair Encoding (BPE). The goal is to get familiar with the typical Seq2Seq modeling workflow rather than to chase state-of-the-art results.
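For illustration, here is a minimal sketch of what such a word-level pipeline can look like. The helper names (`tokenize`, `build_vocab`, `numericalize`) and the special tokens are illustrative assumptions, not the notebook's exact code:

```python
# Minimal word-level tokenization sketch (hypothetical helper names).
from collections import Counter

def tokenize(sentence: str) -> list[str]:
    # Lowercase and split on whitespace -- no subword methods like BPE.
    return sentence.lower().strip().split()

def build_vocab(sentences, min_freq: int = 2) -> dict[str, int]:
    # Reserve ids for the special tokens a Seq2Seq model needs.
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2, "<unk>": 3}
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    for tok, freq in counts.items():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def numericalize(sentence: str, vocab: dict[str, int]) -> list[int]:
    # Wrap each sentence in <sos>/<eos> and map unknown words to <unk>.
    unk = vocab["<unk>"]
    ids = [vocab.get(tok, unk) for tok in tokenize(sentence)]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]
```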

Dataset

The dataset used for this project was obtained from the Zindi AI4D Yoruba Machine Translation Challenge. It consists of 10,000 Yoruba to English parallel sentence pairs.
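Loading the data is straightforward with pandas. A hedged sketch follows; the file name (`Train.csv`) and column names (`Yoruba`, `English`) are assumptions and may differ from the actual Zindi download:

```python
# Hypothetical loading sketch; file and column names are assumptions,
# not verified against the actual Zindi download.
import pandas as pd

df = pd.read_csv("Train.csv")                   # assumed file name
pairs = list(zip(df["Yoruba"], df["English"]))  # assumed column names
print(f"{len(pairs)} parallel sentence pairs")  # expect ~10,000
```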

Model

In the notebook, I followed a tutorial by Bentrevett based on the paper Sequence to Sequence Learning with Neural Networks (Sutskever et al., 2014). While the original paper uses a 4-layer architecture, the tutorial uses a simpler 2-layer setup for both the encoder and the decoder, focusing on building a fundamental understanding of how Seq2Seq models work. Training was done for 10 epochs.
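As a rough picture of that setup, here is a condensed PyTorch sketch of a 2-layer LSTM encoder and decoder in the spirit of the tutorial. The class layout follows Bentrevett's structure, but the hyperparameters (embedding size, hidden size, dropout) are illustrative, not the notebook's exact values:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_dim, emb_dim, hid_dim, n_layers=2, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(input_dim, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, n_layers, dropout=dropout)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src):
        # src: [src_len, batch] of token ids
        embedded = self.dropout(self.embedding(src))
        _, (hidden, cell) = self.rnn(embedded)
        # hidden/cell: [n_layers, batch, hid_dim] -- the context handed to the decoder
        return hidden, cell

class Decoder(nn.Module):
    def __init__(self, output_dim, emb_dim, hid_dim, n_layers=2, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(output_dim, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, n_layers, dropout=dropout)
        self.fc_out = nn.Linear(hid_dim, output_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, trg_token, hidden, cell):
        # trg_token: [batch] -- the decoder processes one target token at a time
        embedded = self.dropout(self.embedding(trg_token.unsqueeze(0)))
        output, (hidden, cell) = self.rnn(embedded, (hidden, cell))
        prediction = self.fc_out(output.squeeze(0))  # [batch, output_dim]
        return prediction, hidden, cell
```

The encoder compresses the entire source sentence into its final hidden and cell states, which become the decoder's initial states; this fixed-size context is the main bottleneck of attention-free Seq2Seq models.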

Result

Given the limited dataset size and the simple model architecture, I was not expecting much in terms of translation quality. However, this project helped me learn how to work with Seq2Seq models end to end: preparing the data, building the encoder and decoder, and training and evaluating the model. In the future, I aim to explore techniques such as pre-trained word embeddings and different tokenization methods.
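For completeness, here is a hedged sketch of how those pieces fit together at training time, using teacher forcing as in the tutorial. It builds on the Encoder/Decoder sketch above; the 0.5 teacher-forcing ratio is an assumption:

```python
import random
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    # Builds on the Encoder/Decoder sketch above.
    def __init__(self, encoder, decoder, device):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.device = device

    def forward(self, src, trg, teacher_forcing_ratio=0.5):
        # src: [src_len, batch], trg: [trg_len, batch]
        trg_len, batch_size = trg.shape
        output_dim = self.decoder.fc_out.out_features
        outputs = torch.zeros(trg_len, batch_size, output_dim, device=self.device)
        hidden, cell = self.encoder(src)  # encode source into context vectors
        inp = trg[0]                      # first decoder input is the <sos> token
        for t in range(1, trg_len):
            prediction, hidden, cell = self.decoder(inp, hidden, cell)
            outputs[t] = prediction
            # Feed the gold token (teacher forcing) or the model's own guess.
            teacher_force = random.random() < teacher_forcing_ratio
            inp = trg[t] if teacher_force else prediction.argmax(1)
        return outputs
```

At evaluation time the teacher-forcing ratio is set to 0, so the model must feed back its own predictions at every step.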

See you in the next one 🙂
