2024-10-30

Logs

language_models
google_colab

I wanted to get more hands-on with the language model trained in chapter 12 of the FastAI course, so I got some Google Colab credits and actually ran the training on an A100. It cost about $2.50 and took about 1:40, but generally worked quite well. There was a minor issue with auto-saving the notebook, probably due to my use of this workaround to avoid needing to give Colab full Google Drive access. Regardless, I was still able to train the language model, run sentence completions, then using the fine-tuned language model as an encoder to build a sentiment classifier. Seeing how long this process took, then seeing it work helped me build a bit more intuition about what to expect when training models. I was also a bit surprised how fast the next token prediction and classification inference were. I might try out a smaller fine-tune on my local machine now that I have a better sense of what this process looks like end to end.

✎ Edit

Raw