In this notebook/post, we’re going to use the markdown content from my blog to try out a language model. We’ll attempt to prompt the model to generate a post on a topic I might write about. Let’s import fastai and disable warnings, since warnings pollute the notebook when I’m converting these notebooks into posts (I am writing this as a notebook and converting it to a markdown file with this script).
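As a rough sketch of that first cell (which fastai submodule gets imported is an assumption on my part, not shown in the excerpt):

```python
# Silence warnings so they don't clutter the notebook-to-markdown conversion.
import warnings
warnings.filterwarnings("ignore")

# Assumed import; a language-model notebook would typically pull in
# fastai's text API.
from fastai.text.all import *
```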
I recently found Joe’s article, We All Know AI Can’t Code, Right?. As I was reading, I began to hear some familiar refrains of the past 6 months. Raise your hand if you’ve ever used GitHub or Stack Overflow or any other kind of example code or library or whatever to help you get started on the foundational solution to the business problem that your code needs to solve. Now, put your hand down if you’ve never once had to spend hours, sometimes days, tweaking and modifying that sample code a million times over to make it work like you need it to work to solve your unique problem.
I had the idea to try and use a language model as a random number generator. I didn’t expect it to actually work as a uniform random number generator but was curious to see what the distribution of numbers would look like. My goal was to prompt the model to generate a number between 1 and 100. I could also vary the temperature to see how that changed the distribution of the numbers.
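A minimal sketch of that experiment, assuming an OpenAI-style chat API (the actual notebook may use a different model or client, and the model name here is a placeholder):

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()

def sample_number(temperature: float) -> str:
    """Ask the model for a 'random' number between 1 and 100."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{
            "role": "user",
            "content": "Generate a random number between 1 and 100. "
                       "Respond with only the number.",
        }],
        temperature=temperature,
    )
    return response.choices[0].message.content.strip()

# Sample repeatedly at a fixed temperature and inspect the distribution.
counts = Counter(sample_number(temperature=1.0) for _ in range(100))
print(counts.most_common(10))
```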
In this notebook, we train two similar neural nets on the classic Titanic dataset, using techniques from fastbook chapter 1 and chapter 4. The first we train using mostly PyTorch APIs; the second, with fastai APIs. A few cells output warnings. I kept those because I wanted to preserve the printouts of the models’ accuracy. The Titanic dataset can be downloaded from the link above or with:
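The command itself is cut off in this excerpt; assuming it refers to the Kaggle Titanic competition, the official kaggle package can fetch the data like so:

```python
# Assumption: the dataset is the Kaggle "titanic" competition data.
# Requires Kaggle API credentials in ~/.kaggle/kaggle.json.
import kaggle

kaggle.api.competition_download_files("titanic", path="data/")
```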
I use direnv to manage my shell environment for projects. When using a Jupyter notebook within a project, I realized that the environment variables in my .envrc file were not being made available to my notebooks. The following worked for me as a low-effort way to load my environment into the notebook in a way that wouldn’t risk secrets being committed to source control, since I gitignore the .envrc file.
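A sketch of one such low-effort approach, assuming python-dotenv (which happily parses the `export FOO=bar` lines of a simple .envrc); the actual snippet from the post isn’t shown in this excerpt:

```python
# Load variables from the gitignored .envrc into the notebook's environment.
# This works for plain `export KEY=value` assignments; direnv-specific
# stdlib calls (e.g. `use nix`) won't be evaluated. The secret values never
# appear in notebook output, so nothing sensitive gets committed.
from dotenv import load_dotenv

load_dotenv(".envrc")
```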
I upgraded to macOS Sequoia a few weeks ago. I had a feeling this update wasn’t going to be trivial with my Nix setup, but after trying to upgrade to a newer package version on unstable, I got a message that seemed to imply I needed to upgrade the OS, so I went for it. I was also at least confident I wouldn’t lose too much of my setup, given it’s all committed to version control in my nix-config repo.
I added some configuration to this Hugo site to allow access to the raw Markdown versions of posts. This enables you to hit URLs such as this to get the raw markdown of this post. You can find the same Raw link at the bottom of all my posts as well. This addition was made possible with the following config changes:

```toml
[outputs]
  # ...
  page = ["HTML", "Markdown"]

[mediaTypes]
  [mediaTypes."text/markdown"]
    suffixes = ["md"]

[outputFormats]
  [outputFormats.Markdown]
    # The excerpt was truncated here in the original; this table presumably
    # ties the Markdown output format to the media type defined above.
    mediaType = "text/markdown"
```
Hugo allows you to store your images alongside your content using a feature called page bundles. I was loosely familiar with the feature, but Claude explained to me how I could use it to better organize posts on this site and the images I add to them. Previously, I defined a _static directory at the root of this site and mirrored my entire content folder hierarchy inside _static/img. This approach works ok and is pretty useful if I want to share images across posts, but jumping between these two mirrored hierarchies became a bit tedious while I was trying to add images to the markdown file I generated from a Jupyter notebook (.ipynb).
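For illustration, a leaf page bundle keeps a post and its images together in one directory (the paths here are hypothetical):

```
content/posts/my-post/
├── index.md    # the post itself
└── cover.png   # referenced from index.md as ![cover](cover.png)
```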
I was listening to episode 34 of AI & I, in which Dan Shipper interviews Simon Eskildsen. Simon described one of the processes he uses with language models to learn new words and concepts. In practice, he has a prompt template that instructs the model to explain a word to him, use it in a few sentences, and give synonyms; he then injects the specific word or phrase into this template.
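A minimal sketch of that kind of template (the wording here is my own, not Simon’s):

```python
# Hypothetical vocabulary-learning prompt template; {word} is the word or
# phrase to learn, injected before the prompt is sent to the model.
PROMPT_TEMPLATE = (
    'Explain the word or phrase "{word}" to me. '
    "Use it in a few example sentences, and give some synonyms."
)

def build_prompt(word: str) -> str:
    return PROMPT_TEMPLATE.format(word=word)

print(build_prompt("perspicacious"))
```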
The following is the notebook I used to experiment with training an image model to classify types of rowing shells (with people rowing them), and to classify the same dataset by rowing technique (sweep vs. scull). A few cells output a batch of the data. I decided not to include these because the rowers in those images didn’t ask to be on my website. I’ll keep this in mind when selecting future datasets, as I think showing the data batches in the notebook/post is helpful for understanding what is going on.
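As a rough sketch of the training setup (the path, architecture, and data layout here are assumptions, not the notebook’s exact code):

```python
from fastai.vision.all import *

# Assumed layout: one folder per class of rowing shell.
path = Path("rowing_shells")
dls = ImageDataLoaders.from_folder(path, valid_pct=0.2, item_tfms=Resize(224))

# Fine-tune a small pretrained model, as in fastbook chapter 1.
# (dls.show_batch() is the kind of cell whose output was omitted above.)
learn = vision_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(3)
```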