I was listening to episode 34 of AI & I, in which Dan Shipper interviews Simon Eskildsen. Simon was describing one of the processes he uses with language models to learn new words and concepts. In practice, he has a prompt template that instructs the model to explain a word to him by using it in a few sentences and giving synonyms, then injects the specific word or phrase into this template.
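
The template-and-inject idea can be sketched in a few lines of Python. This is only an illustration: the template wording and the names `WORD_PROMPT` and `build_prompt` are my assumptions, not the prompt described in the episode.

```python
from string import Template

# Hypothetical template text; the actual prompt wording is an assumption.
WORD_PROMPT = Template(
    "Explain the word or phrase '$term' to me. "
    "Use it in a few sentences and list some synonyms."
)


def build_prompt(term: str) -> str:
    """Inject a specific word or phrase into the fixed template."""
    return WORD_PROMPT.substitute(term=term.strip())


prompt = build_prompt("perspicacious")
print(prompt)
```

The resulting string would then be sent to whichever model you use; only the injected term changes between requests.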

The following is the notebook I used to experiment with training an image model to classify types of rowing shells (with people rowing them) and to classify the same dataset by rowing technique (sweep vs. scull). There are a few cells that output a batch of the data. I decided not to include these because the rowers in these images didn’t ask to be on my website. I’ll keep this in mind when selecting future datasets as I think showing the data batches in the notebook/post is helpful for understanding what is going on.

I set out to do a project using my learnings from the first chapter of the fast.ai course. My first idea was to try and train a Ruby/Python classifier. ResNets are not designed to do this, but I was curious how well it would perform.

Classifying images of source code by language

My first idea was to download a bunch of source code from GitHub, sort it by language type, then convert it to images with Carbon. After working through some GitHub rate limiting issues, I eventually had a list of the top 10 repositories for several different languages. From here, I created a list of files in these repos, filtering by the extension of the programming language I wanted to download.
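
The extension filtering step might look something like the sketch below. It works on a plain list of path strings; the function name `filter_by_extension` and the sample paths are made up for illustration and are not taken from the original notebook.

```python
from pathlib import PurePosixPath


def filter_by_extension(paths, extension):
    """Keep only the paths whose file suffix equals extension (e.g. '.rb')."""
    return [p for p in paths if PurePosixPath(p).suffix == extension]


# Made-up file listing standing in for the files found in a repo.
sample = ["lib/app.rb", "README.md", "src/main.py", "spec/app_spec.rb"]
ruby_files = filter_by_extension(sample, ".rb")
print(ruby_files)
```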

I’ve enjoyed using FastHTML to deploy small, easily hosted webpages for little apps I’ve been building. I’m still getting used to it, but it takes almost no effort at all to deploy. Recently, I built an app that would benefit from having a loading spinner upon submitting a form, but I couldn’t quite figure out how I would do that with htmx in FastHTML, so I built a small project to experiment with various approaches. This is what I came up with:

I revisited Eugene’s excellent work, “Prompting Fundamentals and How to Apply Them Effectively”. From this I learned about the ability to prefill Claude’s responses. Using this technique, you can quickly get Claude to output JSON without any negotiation and avoid issues with leading codefences (e.g. ```json).

While JSON isn’t as good an example as XML, which ends less ambiguously, here’s a quick script showing the concept:

import anthropic


message = anthropic.Anthropic().messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """<status>Today is Tuesday, September 3rd, 2024 at 8:46pm ET, in New York, NY</status>
Extract the <day_of_week>, <month>, <day>, <year> and <location> from the <status> as JSON.
""",
        },
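        # Prefill the assistant turn so the reply continues from an opening brace.
        {"role": "assistant", "content": "{"},
    ],
    # Stop at the first closing brace; it is not included in the returned text.
    stop_sequences=["}"],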
)
print(message.content[0].text)

The script outputs
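
The returned text contains neither the prefilled `{` nor the `}` that triggered the stop sequence, so both need to be added back before parsing. A minimal sketch, assuming a flat JSON object; `parse_prefilled_json` is a hypothetical helper and the sample string is a made-up stand-in, not the script's real output.

```python
import json


def parse_prefilled_json(body: str) -> dict:
    """Re-wrap text generated between a '{' prefill and a '}' stop sequence."""
    return json.loads("{" + body + "}")


# Made-up stand-in for message.content[0].text; not the script's real output.
sample_body = '\n  "key": "value"\n'
parsed = parse_prefilled_json(sample_body)
print(parsed)
```

Because generation stops at the first `}`, this approach would cut a nested object short, which is one reason XML is the less ambiguous example.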

One challenge I’ve continued to have is figuring out how to use the models on Huggingface. There are usually Python snippets to “run” models, but they often seem to require GPUs, and I always seem to run into some sort of issue when installing the various Python dependencies. Today, I learned how to run model inference on a Mac with an M-series chip using llama-cpp and a gguf file built from safetensors files on Huggingface.

I’ve been experimenting with FastHTML for making quick demo apps, often involving language models. It’s a pretty simple but powerful framework, which allows me to deploy a client and server in a single main.py, something I appreciate a lot for little projects I want to ship quickly. I currently use it the way you might use Streamlit.

I ran into an issue where I was struggling to submit a form with multiple images.

I spent a bit of time configuring WezTerm to my liking. This exercise was similar to rebuilding my iTerm setup in Alacritty. I found WezTerm to be more accessible and strongly appreciated the built-in terminal multiplexing because I don’t like using tmux.

I configured WezTerm to provide the following experience. Getting this working probably took me 30 minutes spread across a few sessions as I noticed things I was missing.

  • Monokai-like theme
  • Horizontal and vertical pane splitting
  • Dimmed inactive panes
  • Steady cursor
  • Immediate pane closing with confirmation if something is still running
  • Pane full screening
  • Command+arrow navigation between panes
  • Command+option+arrow navigation between tabs
  • Moving between words in the command prompt with option-arrow
  • Hotkey to clear terminal

What went well

I found achieving these configurations to be much easier in WezTerm than Alacritty, or at least, it took me less time. The blend of native UI with dotfile-style configurable settings hits a sweet spot for my preferences as well, and I haven’t even scratched the surface of scripting things with Lua.

In Python, the most straightforward path to implementing a gRPC server for a Protobuf service is to use protoc to generate code that can be imported in a server, which then defines the service logic.

Let’s take a simple example Protobuf service:

syntax = "proto3";

package simple;

message HelloRequest {
  string name = 1;
}

message HelloResponse {
  string message = 1;
}

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloResponse);
}

Next, we run some variant of python -m grpc_tools.protoc to generate code (assuming we’ve installed grpcio and grpcio-tools). Here’s an example for .proto files in a protos folder:
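
One way such an invocation could be assembled, sketched in Python so the flags are explicit. The helper names `build_protoc_command` and `generate` are hypothetical, and the output directory is an assumption; `-I`, `--python_out` and `--grpc_python_out` are the standard flags accepted by `grpc_tools.protoc`.

```python
import subprocess
import sys
from pathlib import Path


def build_protoc_command(proto_dir="protos", out_dir="."):
    """Build a grpc_tools.protoc command for every .proto file in proto_dir."""
    proto_files = sorted(str(p) for p in Path(proto_dir).glob("*.proto"))
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_dir}",                # where imports are resolved from
        f"--python_out={out_dir}",       # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",  # service stubs (*_pb2_grpc.py)
        *proto_files,
    ]


def generate(proto_dir="protos", out_dir="."):
    """Run the command; requires grpcio-tools to be installed."""
    subprocess.run(build_protoc_command(proto_dir, out_dir), check=True)
```

Calling `generate()` from the project root would write the generated modules next to the server code, under the assumptions above.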

Temporal provides helpful primitives called Workflows and Activities for orchestrating processes. A common pattern I’ve found useful is the ability to run multiple “child workflows” in parallel from a single “parent” workflow.
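
The fan-out shape of this pattern can be sketched with plain asyncio before bringing Temporal in. This is a stand-in only: it does not call the Temporal SDK, and `child` and `parent` are hypothetical names for the child and parent workflows.

```python
import asyncio


async def child(name: str) -> str:
    """Hypothetical stand-in for a child workflow."""
    await asyncio.sleep(0)  # yield control, as real work would
    return f"done:{name}"


async def parent(names: list[str]) -> list[str]:
    """Start every child, then wait for all of them together."""
    return await asyncio.gather(*(child(n) for n in names))


results = asyncio.run(parent(["a", "b", "c"]))
print(results)
```

`asyncio.gather` returns results in the order the awaitables were passed, which keeps each result easy to match to its input.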

Let’s say we have the following activity and workflow (imports omitted for brevity)

Activity code

@dataclass
class MyGoodActivityArgs:
    arg1: str
    arg2: str


@dataclass
class MyGoodActivityResult:
    arg1: str
    arg2: str
    random_val: float


@activity.defn
async def my_good_activity(args: MyGoodActivityArgs) -> MyGoodActivityResult:
    activity.logger.info("Running my good activity")
    return MyGoodActivityResult(
        arg1=args.arg1,
        arg2=args.arg2,
        random_val=random.random(),
    )

Workflow code