FastHTML Loading Spinner

[TIL] September 8, 2024

fasthtml

I’ve enjoyed using fasthtml to deploy small, easily hosted webpages for little apps I’ve been building. I’m still getting used to it, but it takes almost no effort at all to deploy. Recently, I built an app that would benefit from having a loading spinner upon submitting a form, but I couldn’t quite figure out how I would do that with htmx in FastHTML, so I built a small project to experiment with various approaches. This is what I came up with:

Prefill And Stop Sequences

[TIL] September 3, 2024

I revisited Eugene’s excellent work, “Prompting Fundamentals and How to Apply Them Effectively”. From this I learned about the ability to prefill Claude’s responses. Using this technique, you can quickly get Claude to output JSON without any negotiation and avoid issues with leading codefences (e.g. ```json).

While JSON isn’t as good an example as XML, which ends less ambiguously, here’s a quick script showing the concept:

import anthropic


message = anthropic.Anthropic().messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """<status>Today is Tuesday, September 3rd, 2024 at 8:46pm ET, in New York, NY</status>
Extract the <day_of_week>, <month>, <day>, <year> and <location> from the <status> as JSON.
""",
        },
        {"role": "assistant", "content": "{"},
    ],
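    # Stop at the first "}". Neither this stop sequence nor the prefilled "{"
    # appears in the returned text.
    stop_sequences=["}"],
)
# Prints the body of the JSON object, without the surrounding braces.
print(message.content[0].text)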

The script outputs
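Note that the returned text contains neither the prefilled `{` nor the `}` stop sequence, so both have to be added back before parsing. Here is a minimal sketch of that step, assuming a flat object (a nested object would be cut off at its first inner `}`). The helper name `rebuild_json` and the sample body are hypothetical, for illustration only, and are not real model output:

```python
import json


def rebuild_json(body: str, prefill: str = "{", stop: str = "}") -> dict:
    """Wrap a prefilled, stop-truncated completion and parse it as JSON."""
    # Only the text generated after the prefill and before the stop
    # sequence comes back, so re-attach both ends before parsing.
    return json.loads(prefill + body + stop)


# Hypothetical response body, for illustration only (not real model output).
sample_body = '"day_of_week": "Tuesday", "year": 2024'
parsed = rebuild_json(sample_body)
print(parsed["day_of_week"])
```

In the script above, you would pass `message.content[0].text` as `body`.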

Running Huggingface Models with Llama.cpp and ollama

[TIL] August 30, 2024

One challenge I’ve continued to have is figuring out how to use the models on Huggingface. There are usually Python snippets to “run” models, but they often seem to require GPUs and always seem to hit some sort of issue when installing the various Python dependencies. Today, I learned how to run model inference on a Mac with an M-series chip using llama-cpp and a gguf file built from safetensors files on Huggingface.

Upload Multiple Images with FastHTML

[TIL] August 28, 2024

I’ve been experimenting with FastHTML for making quick demo apps, often involving language models. It’s a pretty simple but powerful framework, which allows me to deploy a client and server in a single main.py – something I appreciate a lot for little projects I want to ship quickly. I currently use it how you might use streamlit.

I ran into an issue where I was struggling to submit a form with multiple images.

Rebuilding My iTerm Setup In Wezterm

[TIL] August 26, 2024

I spent a bit of time configuring WezTerm to my liking. This exercise was similar to rebuilding my iTerm setup in Alacritty. I found WezTerm to be more accessible and strongly appreciated the built-in terminal multiplexing because I don’t like using tmux.

I configured WezTerm to provide the following experience. Getting this working probably took me 30 minutes spread across a few sessions as I noticed things I was missing.

Monokai-like theme
Horizontal and vertical pane splitting
Dimmed inactive panes
Steady cursor
Immediate pane closing with confirmation if something is still running
Pane full screening
Command+arrow navigation between panes
Command+option+arrow navigation between tabs
Moving between words in the command prompt with option-arrow
Hotkey to clear terminal

What went well

I found achieving these configurations to be much easier in WezTerm than Alacritty, or at least, it took me less time. The blend of native UI with dotfile-style configurable settings hits a sweet spot for my preferences as well, and I haven’t even scratched the surface of scripting things with Lua.

VLMs Hallucinate

[posts] August 16, 2024

I’ve done some experimentation extracting structured data from documents using VLMs. A summary of one approach I’ve tried can be found in my repo, impulse. I’ve found using Protobufs to be a relatively effective approach for extracting values from documents. The high-level idea is you write a Protobuf as your target data model then use that Protobuf itself as most of the prompt (I really need a name for this as I reference the concept so frequently). I discussed the approach in more detail in this post so I am going to jump right into it.

Structured Output, Functions and Prompting

[posts] August 12, 2024

I’ve been prompting models to output JSON for about as long as I’ve been using models. Since text-davinci-003, getting valid JSON out of OpenAI’s models didn’t seem like that big of a challenge, but maybe I wasn’t seeing the long tails of misbehavior because I hadn’t massively scaled up a use case. As adoption has picked up, OpenAI has released features to make it easier to get JSON output from a model. Here are three examples using structured outputs, function calling and just prompting respectively.

VLM data extraction with Protobufs

[posts] August 3, 2024

In light of OpenAI releasing structured output in the model API, let’s move output structuring another level up the stack to the microservice/RPC level.

A light intro to Protobufs

Many services (mostly in microservice land) use Protocol Buffers (protobufs) to establish contracts for what data an RPC requires and what it will return. If you’re completely unfamiliar with protobufs, you can read up on them here.

Here is an example of a message that a protobuf service might return.

Protobuf Zip Imports in Python

[TIL] August 3, 2024

In Python, the most straightforward path to implementing a gRPC server for a Protobuf service is to use protoc to generate code that can be imported in a server, which then defines the service logic.

Let’s take a simple example Protobuf service:

syntax = "proto3";

package simple;

message HelloRequest {
  string name = 1;
}

message HelloResponse {
  string message = 1;
}

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloResponse);
}

Next, we run some variant of python -m grpc_tools.protoc to generate code (assuming we’ve installed grpcio and grpcio-tools). Here’s an example for .proto files in a protos folder:
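As a rough sketch of what such an invocation can look like when driven from Python rather than the shell, the snippet below builds the argument list and hands it to `grpc_tools.protoc.main`. The `generated` output folder, the `simple.proto` file name, and the helper names `protoc_args` and `generate` are assumptions for illustration, not necessarily the layout used in the post:

```python
from pathlib import Path


def protoc_args(proto_dir: str, out_dir: str, proto_files: list[str]) -> list[str]:
    """Build the argv list for grpc_tools.protoc (paths are hypothetical)."""
    return [
        "grpc_tools.protoc",             # argv[0] placeholder
        f"-I{proto_dir}",                # where imports are resolved from
        f"--python_out={out_dir}",       # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",  # service stubs (*_pb2_grpc.py)
        *proto_files,
    ]


def generate(proto_dir: str = "protos", out_dir: str = "generated") -> int:
    """Run protoc over every .proto file in proto_dir; returns the exit code."""
    from grpc_tools import protoc  # requires grpcio-tools to be installed

    Path(out_dir).mkdir(exist_ok=True)
    files = sorted(str(p) for p in Path(proto_dir).glob("*.proto"))
    return protoc.main(protoc_args(proto_dir, out_dir, files))


# Example argument list for a single, hypothetically named proto file.
example = protoc_args("protos", "generated", ["protos/simple.proto"])
print(" ".join(example))
```

Calling `generate()` from the project root would then write the generated modules into the output folder.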

Making Your Vision Real with Models

[posts] July 21, 2024

Using models daily for a variety of purposes has been a satisfying endeavor for me because they can be used as tools to help make your vision for something come to life. Models are powerful generators that can produce code, writing, images and more based on a user’s description of what they want. But models “fill in the gaps” on behalf of the user to resolve ambiguity in the user’s prompt.