ittybittyGPT

Coding up and training my own GPT from scratch. A learning experience for me, and hopefully a tutorial for others in the future!

This repository was created as an exercise to gain a deeper understanding of LLMs, transformers, and, more concretely, the attention mechanism (which is implemented here from scratch).
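
To make that centerpiece concrete, here is a minimal sketch of causal scaled dot-product attention of the kind a GPT block uses. This is an illustrative PyTorch version, not the exact code from this repository; the class name, hyperparameters, and single-head simplification are my assumptions.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """One attention head with a causal mask (illustrative sketch, not this repo's code)."""

    def __init__(self, d_model: int, d_head: int, max_len: int = 512):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head, bias=False)
        self.k_proj = nn.Linear(d_model, d_head, bias=False)
        self.v_proj = nn.Linear(d_model, d_head, bias=False)
        # Lower-triangular mask so each position can only attend to itself and earlier tokens.
        mask = torch.tril(torch.ones(max_len, max_len, dtype=torch.bool))
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        T = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Attention scores, scaled by sqrt(d_head): (batch, T, T)
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
        scores = scores.masked_fill(~self.mask[:T, :T], float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return weights @ v  # (batch, T, d_head)
```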

We code up our own "itty bitty GPT" and train it on a chunk of the TinyStories dataset.
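
For reference, one common way to grab a chunk of TinyStories is through the Hugging Face datasets library, as sketched below; this repository may load the data differently, and the dataset name, field, and chunk size here are assumptions.

```python
from datasets import load_dataset

# TinyStories is hosted on the Hugging Face Hub; streaming avoids
# downloading the full dataset when we only want a slice of it.
stories = load_dataset("roneneldan/TinyStories", split="train", streaming=True)

# Take the first 10,000 stories as a training chunk (the size is arbitrary).
chunk = [example["text"] for _, example in zip(range(10_000), stories)]
print(chunk[0][:200])
```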

This is far from the most efficient implementation, but I think it is quite readable and easy to follow. This exercise is inspired by an excellent LLM interpretability course that I sat in on.


The tutorial.py notebook houses the model, dataset creation, training, and prompting in a single file. Working through it sequentially takes you from the attention mechanism all the way to prompting a model you trained on your own device.
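
The end state of that walkthrough, prompting the trained model, boils down to an autoregressive sampling loop. Here is a hedged sketch: it assumes a model that maps token ids to next-token logits and a tokenizer with encode/decode methods, which may not match the notebook's actual interfaces.

```python
import torch

@torch.no_grad()
def generate(model, tokenizer, prompt: str, max_new_tokens: int = 100, temperature: float = 1.0):
    """Sample tokens one at a time from the model's next-token distribution."""
    model.eval()
    ids = torch.tensor([tokenizer.encode(prompt)])  # (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids)                       # (1, T, vocab_size), assumed output shape
        logits = logits[:, -1, :] / temperature   # distribution over the next token only
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)    # append the sample and feed it back in
        # Note: cropping to the model's context window is omitted for brevity.
    return tokenizer.decode(ids[0].tolist())
```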

The modular folder has a more conventional structure: the model and text dataset are defined in separate files, with separate notebooks for training and prompting.
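
For orientation, a split like that usually looks something along these lines; the file names below are illustrative, not an exact listing of the modular folder.

```
modular/
├── model.py        # GPT module: embeddings, attention blocks, LM head
├── dataset.py      # text dataset: tokenization and batching
├── train.ipynb     # training loop and checkpointing
└── prompt.ipynb    # load a checkpoint and generate stories
```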

This is my first stab at this, and it is very much a work in progress. At the moment, the stories generated by the model are not coherent, so there is much work to be done!
