I tried out OpenRouter for the first time. My struggles to find an API that hosted llama3.1-405B motivated me to try it. There are too many companies providing inference APIs to keep track of. OpenRouter seems to be aiming to make all of these available from a single place, sort of like AWS Bedrock, but without being locked in cloud configuration purgatory.

The first thing I tried was playing a game of Connections with nousresearch/hermes-3-llama-3.1-405b (called through OpenRouter's OpenAI-compatible endpoint; see the sketch below). It didn't get any categories correct for the 2024-08-21 puzzle.

OpenRouter's app showcase list is an interesting window into how people are using models. The dominant themes are
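A minimal sketch of that call, assuming the standard openai client pointed at OpenRouter's OpenAI-compatible endpoint (the puzzle prompt here is just an illustrative stand-in):

```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API, so the regular client works
# once base_url and an OpenRouter key are swapped in.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter API key
)

response = client.chat.completions.create(
    model="nousresearch/hermes-3-llama-3.1-405b",
    messages=[
        {
            "role": "user",
            "content": "Group these 16 Connections words into four categories of four: ...",
        }
    ],
)
print(response.choices[0].message.content)
```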

2024-08-21

An interesting read about how the world works through an economic lens.

But what is success? You can quantify net worth, but can you quantify the good you have brought to others' lives?

It is not all about the TAM monster: doing cool things that are NOT ECONOMICALLY VALUABLE but ARTISTICALLY VALUABLE is equally important.

I downloaded Pile, a journal app with first-class language model integration and an offline ollama integration. For personal data, running the model offline is a must for me. I use DayOne sporadically, but I'm intrigued by the potential of a more conversational format as I write.

The concept of a journal writing partner appears to be capturing mindshare. I found another similar app called Mindsera today as well. I also learned about Lex, which puts collaborative and AI features at the heart of document authorship, a concept I played around with a bit in Write Partner.

2024-08-19

I set up WezTerm and experimented a bit. It's a nice terminal emulator. I like the built-in themes and Lua as a configuration language. These days, I largely rely on the Cursor integrated terminal. It's not the greatest, but having cmd+k is a bit of a killer feature.

2024-08-18

I haven't viewed the LLMs-can, LLMs-can't discourse through this lens explicitly.

they’re obviously pattern-matching machines

I'm not sure I understand at what point these become different things. Maybe it's a consequence of how I learn, but I generally develop skills by seeing and understanding how someone more skilled than me solves a problem.

2024-08-09

I figured out the issue with adding mistral-large. After a bit of debugging, manually calling llm_mistral.refresh_models() revealed that something was wrong with how I had added the secret on Modal. It turns out the environment variable name for the Mistral API key needs to be LLM_MISTRAL_KEY. I'm going to try to make a PR to the repo to document this behavior.
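A minimal sketch of the working setup, assuming a Modal secret I'm calling "mistral-secret" (my name, not from the repo) that exposes the key under the env var llm-mistral expects:

```python
# Create the secret with the env var name llm-mistral looks for:
#   modal secret create mistral-secret LLM_MISTRAL_KEY=<your-key>
import modal

app = modal.App("llm-mistral-check")
image = modal.Image.debian_slim().pip_install("llm", "llm-mistral")

@app.function(image=image, secrets=[modal.Secret.from_name("mistral-secret")])
def check_mistral_models():
    import llm
    import llm_mistral

    # Fails (or finds nothing) unless LLM_MISTRAL_KEY is present in the environment.
    llm_mistral.refresh_models()
    print([m.model_id for m in llm.get_models() if "mistral" in m.model_id])
```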


I've been trying to run models locally, specifically colpali and florence-2. This has not been easy. It's possible these require GPUs and might not be macOS friendly. I've ended up deep in GitHub threads and dependency hell trying to get basic inference running. I might need to start with something simpler and smaller and build up from there.
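For reference, this is roughly the shape of the basic inference I was trying to get working with florence-2, sketched from the Hugging Face model card and assuming the transformers remote-code path runs on CPU (untested on my machine):

```python
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Any local image works; the URL here is a placeholder.
image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)

# Florence-2 selects its task via prompt tokens like <CAPTION>.
inputs = processor(text="<CAPTION>", images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=128,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=False)[0])
```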

I did some experimentation deriving a data model iteratively (something I am currently calling “data model distillation”) by sequentially passing multiple images (could work with text as well) to a language model and prompting it to improve the schema using any new learnings from the current image. Results so far have been unimpressive.
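The loop itself is simple. Here's a rough sketch of what I mean, where ask_vision_model is a stand-in for whatever multimodal API call you use (not a real library function):

```python
import json

def ask_vision_model(prompt: str, image_path: str) -> str:
    """Placeholder: send the prompt plus one image to a vision-capable LLM, return its text reply."""
    raise NotImplementedError

def distill_data_model(image_paths: list[str]) -> dict:
    schema: dict = {}  # start from an empty data model
    for path in image_paths:
        prompt = (
            "Here is the current JSON Schema for this document type:\n"
            f"{json.dumps(schema, indent=2)}\n\n"
            "Study the attached image and return an improved JSON Schema that also "
            "covers any fields visible in this image. Return only the JSON."
        )
        # Each iteration threads the latest schema back through the prompt.
        schema = json.loads(ask_vision_model(prompt, path))
    return schema
```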

I’ve been hearing good things about mistral-large-2. I’m working on adding it to bots-doing-things but have had a bit of dependency trouble so far.

I watched Jeremy Howard's interview with Carson Gross, the author of htmx. As someone who learned my first bits of web dev with jQuery, I feel like I appreciate the foundations of the library's approach, but I'm still early in fully developing my mental model. Jeremy built a Python wrapper on top of htmx called FastHTML, and the combination of these technologies is pretty well aligned with the tools I like to work with.
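A minimal sketch of what a FastHTML app looks like, based on my reading of the project's hello-world examples (the names and structure come from its docs, not from anything I've built yet):

```python
from fasthtml.common import *

# fast_app() returns the app plus a route decorator; routes return Python
# objects (Titled, Div, P, ...) that render to HTML, with htmx wired in
# for interactivity.
app, rt = fast_app()

@rt("/")
def get():
    return Titled("Hello", P("Hello from FastHTML + htmx"))

serve()
```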