Today, I set out to add an llms.txt to this site.
I’ve made a few similar additions in the past with raw post markdown files and a search index.
Every time I try to change something with outputFormats in Hugo, I forget one of the steps, so I'm finally writing this up so I'll have it for next time.
Steps
First, I added a new output format in my config.toml file:
[outputFormats.TXT]
mediaType = "text/plain"
baseName = "llms"
isPlainText = true
Then, I added this format to my home outputs:
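The exact list depends on the site, but with the default HTML and RSS outputs kept, the addition looks something like:
[outputs]
home = ["HTML", "RSS", "TXT"]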
Today, Anthropic entered the LLM code tools party with Claude Code.
Coding with LLMs is one of my favorite activities these days, so I’m excited to give it a shot.
As a CLI tool, it seems most similar to aider and goose, at least among the projects I am familiar with.
Be forewarned: agentic coding tools like Claude Code use a lot of tokens, which are not free.
Monitor your usage carefully as you go, or know that you may spend more than you expect.
An LLM stop sequence is a sequence of tokens that tells the LLM to stop generating text.
I previously wrote about stop sequences and prefilling responses with Claude.
As a reference, here’s how to use a stop sequence with the OpenAI API in Python:
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stop=["Paris"],
)
print(response.choices[0].message.content)
which outputs something like
'The capital of France is '
Notice the LLM never outputs the word “Paris”.
This is due to the stop sequence.
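The same idea works with Anthropic's Messages API via the stop_sequences parameter; a minimal sketch (the model name here is just an example):
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=100,
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stop_sequences=["Paris"],
)
# Generation halts before "Paris" is emitted; stop_reason will be "stop_sequence".
print(response.content[0].text)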
I built an Astro component called CodeToggle.astro for my experimental site.
The idea was to create a simple wrapper around a React component (or other interactive component) in an MDX file, so that the source of that rendered component could be nicely displayed as a highlighted code block on the click of a toggle.
Usage looks like this:
import { default as TailwindCalendarV1 } from "./components/TailwindCalendar.v1";
import TailwindCalendarV1Source from "./components/TailwindCalendar.v1?raw";
<CodeToggle source={TailwindCalendarV1Source}>
  <TailwindCalendarV1 client:load />
</CodeToggle>
The implementation of CodeToggle.astro looked like this:
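(What follows is a minimal sketch of the idea rather than the original source, assuming a plain <details>/<summary> toggle in place of the highlighted code block.)
---
// CodeToggle.astro: wrap a rendered component and reveal its source on demand
interface Props {
  source: string;
}
const { source } = Astro.props;
---
<div class="code-toggle">
  <!-- the live, rendered component -->
  <slot />
  <details>
    <summary>View source</summary>
    <pre><code>{source}</code></pre>
  </details>
</div>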
Deepseek is getting a lot of attention with the releases of V3 and recently R1.
Yesterday, they also released a “Pro 7B” version of Janus, a “Unified Multimodal” model that can generate images from text and text from images.
Most models I’ve experimented with can only do one of the two.
The 7B model requires about 15GB of hard disk space.
It also seemed to almost max out the 64GB of memory my machine has.
I’m not deeply familiar with the hardware requirements for this model so your mileage may vary.
The llm package uses a plugin architecture to support numerous different language model API providers and frameworks.
Per the documentation, these plugins are installed using a version of pip, the popular Python package manager:
Use the llm install command (a thin wrapper around pip install) to install plugins in the correct environment:
llm install llm-gpt4all
Because this approach makes use of pip, occasionally we run into familiar issues, like pip being out of date and complaining about it on every use.
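Since llm install is a thin wrapper around pip install, my assumption is that the usual remedy works here too: upgrade pip inside llm's own environment with the same command.
llm install --upgrade pip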
Today, Anthropic released Citations for Claude.
In the release, Anthropic disclosed the following customer case study:
“With Anthropic’s Citations, we reduced source hallucinations and formatting issues from 10% to 0% and saw a 20% increase in references per response. This removed the need for elaborate prompt engineering around references and improved our accuracy when conducting complex, multi-stage financial research,” said Tarun Amasa, CEO, Endex.
I decided to kick the tires on this feature as I thought it could slot in very nicely with a project I am actively working on.
Also, I couldn’t quickly find Python code I could copy and run so I conjured some.
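A rough sketch of a request with a plain-text document and citations enabled (the model name and document contents are placeholders; the exact block shape is worth checking against Anthropic's docs):
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "The grass is green. The sky is blue.",
                    },
                    "title": "Example document",
                    "citations": {"enabled": True},
                },
                {"type": "text", "text": "What color is the grass?"},
            ],
        }
    ],
)

# Text blocks in the response may carry citations pointing back into the document.
for block in response.content:
    if block.type == "text":
        print(block.text)
        for citation in getattr(block, "citations", None) or []:
            print(f'  cited: "{citation.cited_text}"')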
Today, I needed to turn SVGs into PNGs.
I decided to use Deno to do it.
Some cursory searching showed Puppeteer should be up to the task.
I also found deno-puppeteer, which seemed like it would provide a reasonable way to make this work.
To start, let’s set up a deno project:
deno init deno-browser-screenshots
cd deno-browser-screenshots
Using puppeteer
Now, add some code to render an SVG with Chrome via puppeteer.
import puppeteer from "https://deno.land/x/puppeteer@16.2.0/mod.ts";
const svgString = `
<svg width="512" height="512" xmlns="http://www.w3.org/2000/svg">
  <rect width="100%" height="100%" fill="#87CEEB"/>
  <circle cx="256" cy="256" r="100" fill="#FFD700"/>
  <path d="M 100 400 Q 256 300 412 400" stroke="#1E90FF" stroke-width="20" fill="none"/>
</svg>`;

if (import.meta.main) {
  try {
    const browser = await puppeteer.launch({
      headless: true,
      args: ["--no-sandbox"],
    });
    const page = await browser.newPage();
    await page.setViewport({ width: 512, height: 512 });
    await page.setContent(svgString);
    await page.screenshot({
      path: "output.png",
      clip: {
        x: 0,
        y: 0,
        width: 512,
        height: 512,
      },
    });
    await browser.close();
  } catch (error) {
    console.error("Error occurred:", error);
    console.error("Make sure Chrome is installed and the path is correct");
    throw error;
  }
}
When we run this code, we get the following error
About 6 months ago, I experimented with running a few different multi-modal (vision) language models on my Macbook.
At the time, the results weren’t so great.
An experiment
With a slight modification to the script from that post, I tested out llama3.2-vision 11B (~8GB in size between the model and the projector).
Using uv and inline script dependencies, the full script looks like this:
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "ollama",
# ]
# ///
import os
import sys

import ollama

PROMPT = "Describe the provided image in a few sentences"


def run_inference(model: str, image_path: str):
    stream = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": PROMPT, "images": [image_path]}],
        stream=True,
    )
    for chunk in stream:
        print(chunk["message"]["content"], end="", flush=True)


def main():
    if len(sys.argv) != 3:
        print("Usage: python run.py <model_name> <image_path>")
        sys.exit(1)

    model_name = sys.argv[1]
    image_path = sys.argv[2]

    if not os.path.exists(image_path):
        print(f"Error: Image file '{image_path}' does not exist.")
        sys.exit(1)

    run_inference(model_name, image_path)


if __name__ == "__main__":
    main()
We run it with uv, passing a model name and an image path.
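Assuming the script is saved as run.py and using llama3.2-vision as the Ollama model tag (my assumption for the 11B model), the invocation looks something like:
uv run run.py llama3.2-vision path/to/image.jpg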
Deepseek V3 was recently released: a cheap, reliable, supposedly GPT-4 class model.
Quick note upfront, according to the docs, there will be non-trivial price increases in February 2025:
- Input price (cache miss) is going up to $0.27 / 1M tokens from $0.14 / 1M tokens (~2x)
- Output price is going up to $1.10 / 1M tokens from $0.28 / 1M tokens (~4x)
From now until 2025-02-08 16:00 (UTC), all users can enjoy the discounted prices of DeepSeek API