Transformer Neural Machine Translation (TensorFlow 2)

Self-contained implementation of a Transformer for neural machine translation in TensorFlow 2. The project covers tokenization, positional encodings, masking (padding & look-ahead), multi-head self-attention, encoder–decoder stacks, training, and inference/decoding.

Highlights

  • Tokenization & vocabulary construction (with tensorflow-text) and sinusoidal positional encodings (see the sketch after this list).
  • Masks: a padding mask for the loss and attention, and a look-ahead mask for the decoder (mask sketch below).
  • Transformer blocks: scaled dot-product attention (sketch below), multi-head attention, position-wise FFN, residual connections + layer norm.
  • Training loop with cross-entropy loss and accuracy; the loss is masked so padding tokens are ignored.
  • Inference with greedy decoding by default; extendable to beam search.
  • Reproducibility: seeds are set in the notebook, with notes on deterministic decoding.
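
Because attention is order-agnostic, position must be injected explicitly. A minimal sketch of the standard sinusoidal encoding from "Attention Is All You Need" (the function name and shapes are illustrative; the notebook may organize this differently):

import numpy as np
import tensorflow as tf

def positional_encoding(length, depth):
    # PE(pos, 2i)   = sin(pos / 10000^(2i/depth))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i/depth))
    positions = np.arange(length)[:, np.newaxis]      # (length, 1)
    dims = np.arange(depth)[np.newaxis, :]            # (1, depth)
    angle_rates = 1 / np.power(10000, (2 * (dims // 2)) / np.float32(depth))
    angle_rads = positions * angle_rates              # (length, depth)
    angle_rads[:, 0::2] = np.sin(angle_rads[:, 0::2]) # even dims: sine
    angle_rads[:, 1::2] = np.cos(angle_rads[:, 1::2]) # odd dims: cosine
    return tf.cast(angle_rads, dtype=tf.float32)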
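
The two masks in one place. This sketch assumes the padding token id is 0 (a common convention, but an assumption here), with mask value 1 marking positions to block:

import tensorflow as tf

def padding_mask(seq):
    # 1.0 where the token is padding; broadcastable over attention logits
    mask = tf.cast(tf.math.equal(seq, 0), tf.float32)
    return mask[:, tf.newaxis, tf.newaxis, :]   # (batch, 1, 1, seq_len)

def look_ahead_mask(size):
    # Strictly upper-triangular 1s: hide future positions from the decoder
    return 1 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)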
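
And the attention core the blocks are built around; a sketch using the same mask convention (1 = blocked):

def scaled_dot_product_attention(q, k, v, mask=None):
    # softmax(QK^T / sqrt(d_k)) V, with masked positions pushed toward -inf
    matmul_qk = tf.matmul(q, k, transpose_b=True)   # (..., seq_q, seq_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    logits = matmul_qk / tf.math.sqrt(dk)
    if mask is not None:
        logits += mask * -1e9                       # blocked positions get ~0 weight
    weights = tf.nn.softmax(logits, axis=-1)        # attention distribution
    return tf.matmul(weights, v), weights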

What I learned (from this build)

  • How a Transformer uses self-attention, multi-head attention, and positional encodings to model sequences.
  • Why positional encodings are needed: attention itself is permutation-invariant, so word order must be injected explicitly.
  • How the padding and look-ahead masks shape both the attention weights and the loss (masked-loss sketch after this list).
  • Encoder–decoder structure; teacher forcing at training time vs. auto-regressive decoding at inference (decoding sketch below).
  • Practical setup: tokenization, vocabularies, the training loop, and decoding.
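
The masked loss mentioned above, as a sketch: per-token cross-entropy is zeroed wherever the target is padding (again assuming pad id 0), then averaged over real tokens only:

import tensorflow as tf

loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none')

def masked_loss(real, pred):
    # mask is 1.0 for real tokens, 0.0 for padding
    mask = tf.cast(tf.math.not_equal(real, 0), tf.float32)
    loss = loss_object(real, pred) * mask
    return tf.reduce_sum(loss) / tf.reduce_sum(mask)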
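
A sketch of greedy auto-regressive decoding: feed the tokens generated so far back in, take the argmax at each step, stop at the end token. The transformer((encoder_input, output)) call signature, start_id/end_id, and max_len are assumptions for illustration, not necessarily the notebook's interface:

import tensorflow as tf

def greedy_decode(transformer, encoder_input, start_id, end_id, max_len=64):
    output = tf.constant([[start_id]], dtype=tf.int64)  # decoder starts with [START]
    for _ in range(max_len):
        # hypothetical call signature; training=False disables dropout
        logits = transformer((encoder_input, output), training=False)  # (1, t, vocab)
        next_id = tf.argmax(logits[:, -1, :], axis=-1)  # best token at the last step
        output = tf.concat([output, next_id[:, tf.newaxis]], axis=-1)
        if next_id[0] == end_id:
            break
    return output[0, 1:]   # strip the start token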

How to run

python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
jupyter lab transformer-nmt.ipynb
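
A quick sanity check that the pinned packages resolved correctly before opening the notebook:

import tensorflow as tf
import tensorflow_text   # registers the custom ops tensorflow-text provides
print(tf.__version__)    # expected: 2.14.0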

Requirements

Versions are pinned for stability; tensorflow-text must match the tensorflow minor version:

tensorflow==2.14.0
tensorflow-text==2.14.0
tensorflow-datasets
numpy
matplotlib
jupyterlab

Notes

  • Greedy decoding is deterministic once the weights are fixed and dropout is disabled
    (training=False). For repeatable translations, set seeds (sketch below) and avoid
    sampling at inference.
  • Swap tokenizers/vocabs + final projection size to use a different language pair; core architecture stays the same.
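
For the seed-setting mentioned above, a minimal sketch (SEED is an arbitrary choice):

import random
import numpy as np
import tensorflow as tf

SEED = 42                  # any fixed value works
random.seed(SEED)          # Python's RNG
np.random.seed(SEED)       # NumPy's RNG
tf.random.set_seed(SEED)   # TensorFlow's global RNG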

License

MIT — see LICENSE.
