promptfoo is a JavaScript library and CLI for testing and evaluating LLM output quality. It’s straightforward to install and get up and running quickly. As a first experiment, I’ve used it to compare the output of three similar prompts that specify their output structure using different modes of schema definition. To get started: mkdir prompt_comparison cd prompt_comparison promptfoo init The scaffold creates a prompts.txt file, and this is where I wrote a parameterized prompt to classify and extract data from a support message.

Nix Language

To broaden my knowledge of Nix, I’m working through an Overview of the Nix Language. Most of the data types and structures are relatively self-explanatory in the context of modern programming languages. Double single quotes strip leading spaces. '' s '' == "s " Functions are a bit unexpected visually, but simple enough with an accompanying explanation. For example, the following is a named function f with two arguments x and y.

Zero to Nix

I started working through the Zero to Nix guide. This is a light introduction that touches on a few of the command line tools that come with Nix and how they can be used to build local and remote projects and enter developer environments. While many of the examples are high-level concepts you’d probably apply when developing with Nix, flake templates are one thing I could imagine returning to often.
I’ve been following the “AI engineering framework” marvin for several months now. In addition to openai_function_call, it’s currently one of my favorite abstractions built on top of a language model. The docs are quite good, but as a quick demo, I’ve ported over a simplified version of an example from an earlier post, this time using marvin. import json import marvin from marvin import ai_model from pydantic import ( BaseModel, ) from typing import ( List, ) marvin.
Go introduced modules several years ago as part of a dependency management system. My Hugo site is still using git submodules to manage its theme. I attempted to migrate to Go modules but eventually ran into a snag when trying to deploy the site. To start, remove the submodule git submodule deinit --all and then remove the themes folder git rm -r themes To finish the cleanup, remove the theme key from config.
The threading macro in Clojure provides a more readable way to compose functions together. It’s a bit like a Bash pipeline. The following function takes a string, splits on a : and trims the whitespace from the result. The threading macro denoted by -> passes the threaded value as the first argument to the functions. (defn my-fn [s] (-> s (str/split #":") ;; split by ":" second ;; take the second element (str/trim) ;; remove whitespace from the string ) ) There is another threading macro denoted by ->> which passes the threaded value as the last argument to the functions.
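As a rough analogy for readers who do not write Clojure, the difference between -> and ->> can be sketched in Python. This is only an illustration, not how Clojure implements the macros: the helper names thread_first and thread_last are made up for this example, and each step is either a plain callable or a tuple of a function plus its extra arguments.

```python
def thread_first(value, *steps):
    """Rough analogue of Clojure's ->: the threaded value becomes the FIRST argument."""
    for step in steps:
        if callable(step):
            value = step(value)
        else:
            fn, *args = step
            value = fn(value, *args)
    return value


def thread_last(value, *steps):
    """Rough analogue of Clojure's ->>: the threaded value becomes the LAST argument."""
    for step in steps:
        if callable(step):
            value = step(value)
        else:
            fn, *args = step
            value = fn(*args, value)
    return value


def my_fn(s):
    # Mirrors the Clojure my-fn: split on ":", take the second element, trim whitespace.
    return thread_first(
        s,
        (str.split, ":"),
        lambda parts: parts[1],
        str.strip,
    )


def even_times_ten(numbers):
    # filter and map take the collection last, so thread_last is the natural fit.
    return thread_last(
        numbers,
        (filter, lambda n: n % 2 == 0),
        (map, lambda n: n * 10),
        list,
    )
```

Sequence functions such as filter and map take the collection as their final argument, which is the case ->> exists for.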
This past week, OpenAI added function calling to their SDK. This addition is exciting because it now incorporates schema as a first-class citizen in making calls to OpenAI chat models. As the example code and naming suggest, you can define a list of functions and schema of the parameters required to call them and the model will determine whether a function needs to be invoked in the context of the completion, then return JSON adhering to the schema defined for the function.
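A minimal sketch of the idea, with no network call: a function is described by a name, a description, and a JSON Schema for its parameters, and the model's reply names a function and supplies its arguments as a JSON string. The get_current_weather function, the REGISTRY lookup, and the sample_call payload below are all hypothetical stand-ins for illustration. A real response from the API would supply the function_call part.

```python
import json


def get_current_weather(location, unit="celsius"):
    # Hypothetical local function; a real one would query a weather service.
    return {"location": location, "unit": unit, "forecast": "sunny"}


# Function definitions in the shape passed to the chat model: name, description,
# and a JSON Schema describing the parameters.
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]

# Hypothetical mapping from function names to local callables.
REGISTRY = {"get_current_weather": get_current_weather}


def dispatch(function_call):
    """Invoke the local function named by the model, using its JSON arguments."""
    fn = REGISTRY[function_call["name"]]
    kwargs = json.loads(function_call["arguments"])
    return fn(**kwargs)


# Stand-in for the function_call portion of a model response (assumed values).
sample_call = {
    "name": "get_current_weather",
    "arguments": '{"location": "Boston", "unit": "fahrenheit"}',
}
result = dispatch(sample_call)
```

Because the arguments arrive as a string, they need to be parsed before use, and ideally validated against the schema.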
I was interested to learn more about the developer experience of Cloudflare’s D1 serverless SQL database offering. I started with this tutorial. Using wrangler you can scaffold a Worker and create a D1 database. The docs were straightforward up until the Write queries within your Worker section. For me, wrangler scaffolded a worker with a different structure than the docs discuss. I was able to progress through the rest of the tutorial by doing the following:
I tried out jsonformer to see how it would perform with some of the structured data use cases I’ve been exploring. Setup python -m venv env . env/bin/activate pip install jsonformer transformers torch Code ⚠️ Running this code will download 10+ GB of model weights ⚠️ from jsonformer import Jsonformer from transformers import AutoModelForCausalLM, AutoTokenizer model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-12b") tokenizer = AutoTokenizer.from_pretrained("databricks/dolly-v2-12b") json_schema = { "$schema": "http://json-schema.org/draft-07/schema#", "title": "RestaurantReview", "type": "object", "properties": { "review": { "type": "string" }, "sentiment": { "type": "string", "enum": ["UNKNOWN", "POSITIVE", "MILDLY_POSITIVE", "NEGATIVE", "MILDLY_NEGATIVE"] }, "likes": { "type": "array", "items": { "type": "string" } }, "dislikes": { "type": "array", "items": { "type": "string" } } }, "required": ["review", "sentiment"] } prompt = """From the provided restaurant review, respond with JSON adhering to the schema.
Imagine we have a query to an application that has become slow under load. We have several options to remedy this issue. If we settle on using a cache, consider the following failure domains when we design an architecture to determine whether using a cache actually is a good fit for the use case. Motivations for using a cache When the cache is available and populated it will remove load from the database.
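To make the trade-off concrete, here is a small in-memory sketch. It is not a production design: the ReadThroughCache class, its available flag, and the counting fake_db loader are all invented for this example. It shows that a populated, available cache absorbs repeated reads, while an unavailable cache sends every read back to the database.

```python
class ReadThroughCache:
    """Toy read-through cache with a switch to simulate the cache being down."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # fallback used on a miss or an outage
        self._store = {}
        self.available = True      # flip to False to simulate a cache failure

    def get(self, key):
        if self.available and key in self._store:
            return self._store[key]      # cache hit: no database load
        value = self._load(key)          # miss or outage: the database takes the hit
        if self.available:
            self._store[key] = value
        return value


db_calls = []


def fake_db(key):
    # Hypothetical slow query; records each call so the load is visible.
    db_calls.append(key)
    return f"row-for-{key}"


cache = ReadThroughCache(fake_db)

cache.get("user:1")
cache.get("user:1")
calls_with_cache = len(db_calls)    # only the first read reached the database

cache.available = False
cache.get("user:1")
cache.get("user:1")
calls_after_outage = len(db_calls)  # every read during the outage reached the database
```

If the database can only survive while the cache absorbs most reads, a cache outage turns into a database outage, which is worth weighing before adopting a cache.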