2024-08-31

Language models can’t: generate instructions for knitting patterns; generate crossword puzzles from scratch.
Language models can: generate Connections puzzles.

2024-08-29

Incredible read: https://eieio.games/essays/the-secret-in-one-million-checkboxes/
I failed in many attempts to get Sonnet to write code that displays the folder structure from the output of a tree -F command using shortcodes. After a lot of prompting, I wrote a mini-design doc on how the feature needed to be implemented and used it as context for Sonnet. I tried several variants of the instructions in the design doc, including asking the model itself to improve them for clarity.
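For a sense of what the parsing half of that feature involves, here is a rough Python sketch. It is not the shortcode implementation from the design doc. It only turns tree -F output into (depth, name, is_dir) entries, assuming the default UTF-8 box-drawing output where each nesting level takes four characters. The names parse_tree, CONNECTOR, and sample are made up for illustration, and only the trailing "/" that -F adds to directories is handled.

```python
import re

# Hypothetical helper: matches the branch connector that precedes every entry name.
CONNECTOR = re.compile(r"[├└]── ")


def parse_tree(output: str) -> list[tuple[int, str, bool]]:
    """Return (depth, name, is_dir) for each entry in `tree -F` output."""
    entries = []
    for line in output.splitlines():
        match = CONNECTOR.search(line)
        if match is None:
            # Root line, blank line, or the "N directories, M files" summary.
            continue
        # Assumption: each level of nesting is a 4-character prefix.
        depth = match.start() // 4 + 1
        name = line[match.end():]
        # -F marks directories with a trailing slash.
        entries.append((depth, name.rstrip("/"), name.endswith("/")))
    return entries


# Made-up sample input, not taken from a real project.
sample = """\
.
├── content/
│   ├── posts/
│   │   └── hello.md
│   └── about.md
└── config.toml

2 directories, 3 files
"""

for depth, name, is_dir in parse_tree(sample):
    print("  " * (depth - 1) + name + ("/" if is_dir else ""))
```

Computing depth from the position of the connector, rather than matching the exact prefix characters, sidesteps differences in how the vertical bars are padded.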

2024-08-25

I tried Townie. As has become tradition, I tried to build a writing editor for myself. Townie got a simple version of this working with the ability to send a highlighted selection of text to the backend and run it through a model along with a prompt. This experience was relatively basic, using a textarea and a popup. From here, I got Townie to add the ability to show diffs between the model proposal and original text.
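The diff step can be sketched in a few lines. This is not the code Townie generated; it is a minimal Python illustration of a word-level diff between the original text and a model proposal, using the standard library's difflib. The function name word_diff and the ("=", "-", "+") marker format are assumptions made for this example.

```python
import difflib


def word_diff(original: str, proposal: str) -> list[tuple[str, str]]:
    """Word-level diff: '=' unchanged, '-' only in original, '+' only in proposal."""
    a, b = original.split(), proposal.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(("=", word) for word in a[i1:i2])
        else:
            # 'replace', 'delete', and 'insert' all reduce to removals plus additions.
            out.extend(("-", word) for word in a[i1:i2])
            out.extend(("+", word) for word in b[j1:j2])
    return out


# Made-up example strings.
for marker, word in word_diff("the quick brown fox", "the fast brown fox"):
    print(marker, word)
```

An editor UI could render the "-" words as strikethrough and the "+" words as highlights, and accepting the proposal amounts to keeping everything not marked "-".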

2024-08-23

I’ve been trying out Cursor’s hyped composer mode with Sonnet. I am a bit disappointed. Maybe I shouldn’t be. I think it’s not as good as I expected because I hold Cursor to a higher bar than the other developer tools out there. It’s possible it’s over-hyped or that I am using it suboptimally, but it’s more or less the same quality as most tools at the same level of abstraction, like aider.
I tried out OpenRouter for the first time. My struggles to find an API that hosted llama3.1-405B motivated me to try this out. There are too many companies providing inference APIs to keep track of. OpenRouter seems to be aiming to make all of these available from a single place, sort of like AWS Bedrock, but not locked in cloud configuration purgatory. The first thing I tried was playing a game of Connections with nousresearch/hermes-3-llama-3.

2024-08-21

An interesting read about how the world works through an economic lens. But what is success? You can quantify net worth, but can you quantify the good you have brought to others’ lives? It is not all about the TAM monster. Doing cool things that are NOT ECONOMICALLY VALUABLE, but ARTISTICALLY VALUABLE, is equally important.
I downloaded Pile, a journal app with a first-class language model integration and an offline ollama integration. For personal data, running the model offline is a must for me. I use DayOne sporadically, but I’m intrigued by the potential of a more conversational format as I write. The concept of a journal writing partner appears to be capturing mindshare. I found another similar app called Mindsera today as well. I also learned about Lex, which puts collaborative and AI features at the heart of document authorship, a concept I played around with a bit in Write Partner.

2024-08-19

I set up WezTerm and experimented a bit. It’s a nice terminal emulator. I like the built-in themes and Lua as a configuration language. These days, I largely rely on the Cursor integrated terminal. It’s not the greatest, but having cmd+k is a bit of a killer feature.

2024-08-18

Andriy Burkov (@burkov), August 18, 2024: "I can't believe we're back to discussing LLMs' ability to reason. Where have you been these past two years? In a bunker? If you'd actually worked with LLMs during this time, you'd know by now that they're obviously pattern-matching machines. Try asking one to write incorrect… pic.twitter.com/KPcDCI2cjD"
I haven’t viewed the LLMs-can, LLMs-can’t discourse through this lens explicitly. On "they’re obviously pattern-matching machines": I’m not sure I understand at what point pattern matching and reasoning become different things.
I tried to run florence-2 and colpali using the Hugging Face serverless inference API. Searching around, there seems to be pretty sparse support for image-text-to-text models. On GitHub, I only found a few projects that even reference these types of models. I didn’t really know what I was doing, so I copied the example code and then tried to use a model to augment it to call florence-2. Initially, it seemed like it was working: