I’ve started posting more on Bluesky and I noticed that articles from my site didn’t have social image previews 😔

A screenshot showing a Bluesky post with no social preview image

I looked into Poison’s code (the theme this site is based on) and found that it supports social image previews defined at the site level or stored in the site’s assets folder.

This approach didn’t quite work for me. I recently switched to using page bundles, which keep a post’s markdown and related resources like images in the same folder and make linking to images from the markdown straightforward. With a few modifications, I was able to adapt the theme’s code to pull social preview images from page bundles as well.

I explored how embeddings cluster by visualizing LLM-generated words across different categories. The visualizations helped build intuition about how these embeddings relate to each other in vector space. Most of the code was generated using Sonnet.

!pip install --upgrade pip
!pip install openai
!pip install matplotlib
!pip install scikit-learn
!pip install pandas
!pip install plotly
!pip install "nbformat>=4.2.0"

We start by setting up functions to call ollama locally to generate embeddings and words for several categories. The generate_words function occasionally doesn’t adhere to instructions, but the end results are largely unaffected.
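
A rough sketch of what those helpers might look like, using the OpenAI client pointed at ollama’s OpenAI-compatible endpoint. The model names, prompts, the embedding helper’s name, and the PCA/matplotlib plotting at the end are my assumptions rather than the notebook’s exact code:

from openai import OpenAI

# ollama serves an OpenAI-compatible API locally; the api_key value is required but ignored
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def generate_words(category, n=20):
    # Ask a local chat model for n words in a category, one per line
    response = client.chat.completions.create(
        model="llama3.1",  # assumed model name
        messages=[{
            "role": "user",
            "content": f"List {n} single words in the category '{category}'. "
                       "Respond with one word per line and nothing else.",
        }],
    )
    return [w.strip() for w in response.choices[0].message.content.splitlines() if w.strip()]

def generate_embedding(text):
    # Embed a single word or phrase with a local embedding model
    response = client.embeddings.create(
        model="nomic-embed-text",  # assumed embedding model
        input=text,
    )
    return response.data[0].embedding

# Embed words from a few example categories and project them to 2D for plotting
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

categories = ["fruits", "animals", "emotions"]
words = [(category, word) for category in categories for word in generate_words(category)]
points = PCA(n_components=2).fit_transform([generate_embedding(w) for _, w in words])

for category in categories:
    xs = [p[0] for p, (c, _) in zip(points, words) if c == category]
    ys = [p[1] for p, (c, _) in zip(points, words) if c == category]
    plt.scatter(xs, ys, label=category)
plt.legend()
plt.show()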

Having completed lesson 5 of the FastAI course, I prompted Claude to give me some good datasets upon which to train a random forest model. This housing dataset from Kaggle seemed like a nice option, so I decided to give it a try. I am also going to try something that Jeremy Howard recommended for this notebook/post, which is to not refine or edit my process very much. I am mostly going to try things out and if they don’t work, I’ll try and write up why and continue, rather than finding a working path and adding commentary at the end.
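
For context, here is a minimal sketch of fitting a random forest regressor with scikit-learn on a Kaggle-style housing CSV. The file name, target column, and the crude feature handling are assumptions, not necessarily what I ended up doing:

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("train.csv")  # hypothetical file name from the Kaggle download

# Numeric columns only and naive NA filling, just to get a baseline model running
X = df.select_dtypes("number").drop(columns=["SalePrice"]).fillna(0)
y = df["SalePrice"]  # assumed target column

X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=42)
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print(mean_absolute_error(y_valid, model.predict(X_valid)))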

In this notebook/post, we’re going to use the markdown content from my blog to train a language model. We’ll then prompt the model to generate a post on a topic I might write about.

Let’s import fastai and disable warnings since these pollute the notebook a lot when I’m trying to convert these notebooks into posts (I am writing this as a notebook and converting it to a markdown file with this script).
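
Something like the following (the exact imports in the original notebook may differ):

import warnings
warnings.filterwarnings("ignore")  # silence warning output that clutters the converted markdown

from fastai.text.all import *  # pulls in fastai's text/language-model tooling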

In this notebook, we train two similar neural nets on the classic Titanic dataset using techniques from fastbook chapter 1 and chapter 4.

We train the first using mostly PyTorch APIs and the second with FastAI APIs. A few cells output warnings; I kept those because I wanted to preserve the printouts of the models’ accuracy.

The Titanic data set can be downloaded from the link above or with:

!kaggle competitions download -c titanic

To start, we install and import the dependencies we’ll need:
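
Something along these lines, for example (the exact dependencies in the original notebook may differ, and the installs would go in a separate cell):

import numpy as np
import pandas as pd
import torch                      # PyTorch, for the first model
from fastai.tabular.all import *  # FastAI tabular API, for the second model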

I use direnv to manage my shell environment for projects. When using a Jupyter notebook within a project, I realized that the environment variables in my .envrc file were not being made available to my notebooks. The following worked for me as a low-effort way to load my environment into the notebook in a way that wouldn’t risk secrets being committed to source control, since I gitignore the .envrc file.

The code below assumes an .envrc file exists in the project root, containing environment variables declared with plain export statements.
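
Here is a minimal sketch of this kind of loader using only the standard library. The helper name is mine, and it only handles simple export KEY=value lines:

import os

def load_envrc(path=".envrc"):
    # Read lines of the form `export KEY=value` and copy them into os.environ
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line.startswith("export ") or "=" not in line:
                continue
            key, _, value = line[len("export "):].partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")

load_envrc()  # after this, os.environ contains e.g. MY_API_TOKEN from the .envrc (hypothetical name)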

I upgraded to macOS Sequoia a few weeks ago. I had a feeling this update wasn’t going to be trivial with my Nix setup, but after trying to upgrade to a newer package version on unstable, I got a message that seemed to imply I needed to upgrade the OS, so I went for it. Also, I was at least confident I wouldn’t lose too much of my setup, given it’s all committed to version control in my nix-config repo.

I added some configuration to this Hugo site to allow access to the raw Markdown versions of posts. This enables you to hit URLs such as this to get the raw markdown of this post. You can find the same Raw link at the bottom of all my posts as well.

This addition was made possible with the following config changes:

[outputs]
# ...
page = ["HTML", "Markdown"]

[mediaTypes]
[mediaTypes."text/markdown"]
suffixes = ["md"]

[outputFormats]
[outputFormats.Markdown]
mediaType = "text/markdown"
isPlainText = true
isHTML = false
baseName = "index"
rel = "alternate"

which rebuilds the original post’s markdown according to the definition in layouts/_default/single.md.

Hugo allows you to store your images with your content using a feature called page bundles. I was loosely familiar with the feature, but Claude explained to me how I could use it to better organize posts on this site and the images I add to them. Previously, I defined a _static directory at the root of this site and mirrored my entire content folder hierarchy inside _static/img. This approach works ok and is pretty useful if I want to share images across posts, but jumping between these two mirrored hierarchies became a bit tedious while I was trying to add images to the markdown file I generated from a Jupyter notebook (.ipynb file). Using page bundles, I could store the images right next to the content like this:
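
(An illustrative layout; the actual post and file names will differ.)

content/
  posts/
    my-post/
      index.md
      chart.png

The markdown in index.md can then reference the image with a relative path like ![A chart](chart.png), and Hugo publishes the image alongside the page.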