espanso

Just to try something new, I tried out espanso for text expansions instead of Alfred. This is the PR that adds it to my Nix configurations. What I added so far is a toy configuration; the tool seems to support far more complex configurations, which I still need to look into further.
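
For a sense of what the toy configuration looks like, here is a minimal sketch of an espanso match file (illustrative only, not the exact contents of the PR; the triggers are made up). Espanso reads YAML match files like this from its match/ directory:

```yaml
# Minimal espanso match file (sketch, not the config from the PR)
matches:
  # Expand a short trigger into a longer snippet
  - trigger: ":email"
    replace: "you@example.com"

  # Insert the current date wherever ":date" is typed
  - trigger: ":date"
    replace: "{{today}}"
    vars:
      - name: today
        type: date
        params:
          format: "%Y-%m-%d"
```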


gpt-4o-mini

Admittedly, I have not gone too deep with gpt-4o-mini yet. I’ve been quite happy with claude-3.5-sonnet, and I typically use the most advanced model I can, since I’m not running a huge number of model inferences per day at the moment. However, swyx’s point

intelligence too cheap to meter, but also too ephemeral to support >5 players doing R&D? is there an angle here i’m missing?

makes a lot of sense to me.

How can several companies continue spending R&D dollars at the current rate when it’s possible for one of them to marginalize almost the entire body of work of the rest with a single model release? I need to use gpt-4o-mini more to understand whether that is in fact what’s happening. One interesting thing I learned as I started to try it out: it seems to think it’s built on the GPT-3 architecture. Does this mean OpenAI backported other improvements to an older model architecture, dataset, or training approach? I’m trying to figure out whether there is any concrete information on this.

gpt-4o-mini
user
what model architecture do you use?
assistant
I’m based on the GPT-3 architecture, which is a transformer-based model. The transformer architecture employs mechanisms like self-attention and feedforward neural networks to process and generate text. If you have specific questions about the architecture or how it works, feel free to ask!
user
when is your knowledge cutoff?
assistant
My knowledge cutoff date is October 2021. This means I have information and data available up to that point, but I don’t have access to events or developments that occurred after that date. If you have questions about anything up to that time, feel free to ask!
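
The transcript above is straightforward to reproduce. Here is a minimal sketch using the openai Python SDK (assuming the v1+ client and an OPENAI_API_KEY set in the environment):

```python
# Ask gpt-4o-mini the same two questions as the transcript above.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

for question in [
    "what model architecture do you use?",
    "when is your knowledge cutoff?",
]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    print(f"user: {question}")
    print(f"assistant: {response.choices[0].message.content}\n")
```

Of course, the model’s self-description isn’t authoritative; it’s just what it says about itself, which is part of why I’m looking for concrete information elsewhere.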

I think what Danny said (in the first tweet) is interesting, but I’m not yet sure what to make of it.

people frame this like it’s somehow a win over llama, when in fact the goal of llama has wildly succeeded: [sic] commoditize models and drive token cost to zero

It seems like the inevitable ebb and flow of the current dynamic with models. If, as swyx implies, this drives R&D consolidation because the market can’t support ~5 major competitors in this space, the token price may go up again with less competition. Something to revisit.