ittybittyGPT

Coding up and training my own GPT from scratch. A learning experience for me, and hopefully a tutorial for others in the future!

This repository was created as an exercise to gain a deeper understanding of LLMs, transformers, and, more concretely, the attention mechanism (which is implemented here from scratch).
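
To make that centerpiece concrete, here is a minimal sketch of causal scaled dot-product attention of the kind a GPT block uses. This is an illustrative PyTorch version, not the exact code from this repository; the class name, hyperparameters, and single-head simplification are my assumptions.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """One attention head with a causal mask (illustrative sketch, not this repo's code)."""

    def __init__(self, d_model: int, d_head: int, max_len: int = 512):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head, bias=False)
        self.k_proj = nn.Linear(d_model, d_head, bias=False)
        self.v_proj = nn.Linear(d_model, d_head, bias=False)
        # Lower-triangular mask so each position can only attend to itself and earlier tokens.
        mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        T = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Attention scores, scaled by sqrt(d_head): (batch, T, T)
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
        scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return weights @ v  # (batch, T, d_head)
```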

We code up our own "itty bitty GPT" and train it on a chunk of the TinyStories dataset.
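
For reference, one common way to grab a chunk of TinyStories is through the Hugging Face datasets library, as sketched below; this repository may load the data differently, and the dataset name, field, and chunk size here are assumptions.

```python
from datasets import load_dataset

# TinyStories is hosted on the Hugging Face Hub; streaming avoids
# downloading the full dataset when we only want a slice of it.
stories = load_dataset("roneneldan/TinyStories", split="train", streaming=True)

# Take the first 10,000 stories as a training chunk (the size is arbitrary).
chunk = [example["text"] for _, example in zip(range(10_000), stories)]
print(chunk[0][:200])
```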

This is far from the most efficient implementation, but I think it is quite readable and easy to follow. This exercise is inspired by an excellent LLM interpretability course that I sat in on.


The tutorial.py notebook houses the model, dataset creation, training, and prompting in a single file. Working through it sequentially takes you from the attention mechanism all the way to prompting a model you trained on your own device.
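
The end state of that walkthrough, prompting the trained model, boils down to an autoregressive sampling loop. Here is a hedged sketch: it assumes a model that maps token ids to next-token logits and a tokenizer with encode/decode methods, which may not match the notebook's actual interfaces.

```python
import torch

@torch.no_grad()
def generate(model, tokenizer, prompt: str, max_new_tokens: int = 100, temperature: float = 1.0):
    """Sample tokens one at a time from the model's next-token distribution."""
    model.eval()
    ids = torch.tensor([tokenizer.encode(prompt)])  # (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)                       # (1, T, vocab_size), assumed output shape
        logits = logits[:, -1, :] / temperature   # distribution over the next token only
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)    # append the sample and feed it back in
        # Note: cropping to the model's context window is omitted for brevity.
    return tokenizer.decode(ids[0].tolist())
```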

The modular folder has a more conventional structure: the model and text dataset are defined in separate files, with separate notebooks for training and prompting.
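
For orientation, a split like that usually looks something along these lines; the file names below are illustrative, not an exact listing of the modular folder.

```
modular/
├── model.py        # GPT module: embeddings, attention blocks, LM head
├── dataset.py      # text dataset: tokenization and batching
├── train.ipynb     # training loop and checkpointing
└── prompt.ipynb    # load a checkpoint and generate stories
```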

This is my first stab at this, and it is very much a work in progress. At the moment, the stories generated by the model are not coherent, so there is much work to be done!
