I spent some time looking for low latency, image-to-image APIs.
I look around a fair bit and think I’ve settled on together.ai.
My main needs are very similar to Krea
- < 1 sec latency
- ability to specify the starting a prompt and image
I’m still validating this is the case but latency seems nearly there and I’m still trying to confirm how to make a start image work.
I’ve been playing around a bunch with image generation models/tools, notably Recraft and Krea.
Recently, I’ve been trying to use these tools to design a logo and favicon for Thought Eddies.
I’ve also been experimenting with using D3 to build visuals for LLM chat conversation branching.
It’s harder to work with than React Flow, unsurprisingly since it is a much more general tool/framework.
I am aiming to create a few different interactive visuals to showcase a few different ideas I have for working with LLMs on a canvas.
Maybe I will end up going back to React Flow if I keep having challenges with D3.
I tried two separate ways to configure Cursor to point to an alternative OpenAI compliant API endpoint by modifying the “OpenAI API Key > Override OpenAI Base URL” section of the Cursor settings.
My first attempt was with Deepseek, using learnings from wiring that up to llm
.
I got to the point where Cursor failed to validate the API endpoint (don’t forget to save the url override), but the curl
command it output for me to check manually worked if I switched the model to deepseek-chat
.
A day in the mind of Claude Sonnet
— Dan Corin (@danielcorin.com) 2024-12-19T21:21:16.724Z
To create this animation for a day in the mind of Claude Sonnet, I used Sonnet to write the following code
- to generate the HTML with this Python script
- capture PNGs with
puppeteer
- use
ffmpeg
to stitch them together
import llm
import requests
import time
from pathlib import Path
from datetime import datetime, timedelta
MODELS = ["claude-3-5-sonnet-latest"]
def get_temp_weather(hour):
if hour < 14:
temp = 30 + (24 * (hour / 14))
else:
temp = 54 - (27 * ((hour - 14) / 10))
if hour < 6:
weather = "cold"
elif hour < 18:
weather = "sunny"
else:
weather = "snowing"
return round(temp), weather
now = str(int(time.time()))
out_dir = Path("out") / now
out_dir.mkdir(parents=True, exist_ok=True)
start_time = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
end_time = start_time + timedelta(days=1)
current = start_time
start_time = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
specific_times = [
start_time.replace(hour=2, minute=30),
start_time.replace(hour=3, minute=0),
start_time.replace(hour=5, minute=0),
]
for current in specific_times:
hour = current.hour
minute = current.minute
temp, weather = get_temp_weather(hour)
time_str = current.strftime("%I:%M %p")
PROMPT = f"""It is {time_str}. {temp}°F and {weather}.
Generate subtle, abstract art using SVG in an HTML page that fills 100% of the browser viewport.
Take inspiration for the style, colors and aesthetic using the current weather and time of day.
Prefer subtle colors but it's ok to use intense colors sparingly.
No talk, no code fences. Code only."""
timestamp = current.strftime("%H_%M")
prompt_file = out_dir / f"prompt_{timestamp}.txt"
prompt_file.write_text(PROMPT)
for m in MODELS:
model = llm.get_model(m)
response = model.prompt(PROMPT, temperature=1.0)
out_file = out_dir / f"{timestamp}_{m}.html"
out_file.write_text(response.text())
current += timedelta(minutes=30)
const puppeteer = require('puppeteer');
async function createGif() {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({
width: 800,
height: 600
});
const htmlFiles = [
// paths to html files
];
for (let i = 0; i < htmlFiles.length; i++) {
await page.goto(`file://${__dirname}/${htmlFiles[i]}`);
await page.screenshot({
path: `frame${i}.png`
});
}
await browser.close();
}
createGif();
ffmpeg -framerate 2 -i timestamped_frame%d.png -c:v libx264 -pix_fmt yuv420p output.mp4
I’ve been setting up the foundations to add node summaries to Delta.
Ideally, I will use the same model to create the node summaries as I use to generate the responses since this will keep the model dependencies minimal.
However, my early experiments have yielded some inconsistency in how a shared prompt behaves across models. To try and understand this and smooth it out as much as possible, I plan to set up evals to ensure the summaries are
Found this nifty Bluesky Hugo shortcode written by Bryce.
Inspired by Simon’s pelican-bicycle
repo, I played around a bit with using LLMs to generate visuals with SVGs, Three.js and pure CSS.
I posted about some of the results here.
Inspired by @simonwillison.net's pelican-bicycle repo, here are some CSS sunsets by gpt-4o, claude-3-5-sonnet and gemini-2.0-flash-exp. Sonnet and Gemini animate the birds and clouds.
Prompt: Generate pure HTML/CSS art of an extremely detailed, beautiful sunset. No talk or code fences. Code only.
Got Delta working on other machines.
It took a lot longer than I expected.
I spent most of the time dealing with build issues regarding:
- missing dependencies
- different paths between dev and production
- loading
vec.dylib
with sqlite-vec
- dependencies compiled for the wrong architecture
I tried to write some of this up but it’s been challenging to extract and/or remember the specific circumstances and how I solved it in the context of some minimal example.
There are lots of parts that feel a bit wrong or regrettable but were compromises to getting the thing working.
I don’t know anything about rice disease but apparently these are various rice diseases and this is what they look like.
- Jeremy Howard, Fast.ai Course Lesson 6
I have no idea if Jeremy had this in mind when he said this (alluding to the fact he doesn’t know about the subject area, but when building with ML that doesn’t necessarily matter), but this sentiment is how I feel when building with and learning about anything new with language models, at least to start.
I use LLMs to bootstrap my understanding of whatever situation I find myself in, and from there, try to orient myself and gain a better understanding.
It’s incredible to be able to learn from whatever your starting point is rather than needing to try and read less relevant introductory material to start to understand an area.
It’s akin to having a CSV and wanting to run SQL queries on your data and needing to start by reading the pandas
or sqlite
documentation.
The developer space is moving so fast.
I went from thinking Cursor’s cmd+k was the greatest thing, to using Windsurf’s Cascade, the new Cursor composer and bolt.new in a week.
Powerful tools.