2024-08-23

Logs

I’ve been trying out Cursor’s hyped composer mode with Sonnet. I am a bit disappointed. Maybe I shouldn’t be. I think it’s not as good as I expected because I hold Cursor to a higher bar than the other developer tools out there. It’s possible it’s over-hyped or that I am using it suboptimally. But it’s more or less of the same quality as most of the tools of the same level of abstraction like aider, etc. I am trying to create a multipane, React-based writing app. It’s possible I need to provide more detailed description than I am giving so far. However, my main complaint after running it is that now I have a ton of code that isn’t quite right and I don’t know where or why it’s sort of broken. Now, I need to read all the code. This approach is notably less productive than slowly building up an app with LLM-code generation, because after each generation I can test the new code and make sure it does what I intended (or write automated tests to do that). The code I get out of Composer doesn’t do what I want, but the LLM doesn’t know why, either because my high level task is under-specified, it doesn’t have enough context, or the ask is too vague. I don’t usually run into this issue when I use cmd+k. Maybe, I need to watch some videos of folks using it.

I’ve been playing around with different UXs for LLM-augmented text editors. There is a lot out there but nothing I love. I’ve been circling around the idea of a configurable prompt library that I can invoke on an arbitrary selection of a document. I’ve tried playing around with Monaco and augmenting the command palette with LLM-specific actions. This works well enough, but these end up sitting alongside the standard actions and that doesn’t feel quite right. I played around with AIEditor and I like some of the ideas, but the UX isn’t right for writing entries like these which is my top priority.

I think I’m realizing the LLM-commands need to be native to the OS, so that they are portable across applications and can be provided multi-modal context to be most useful. I was (re-?)introduced to a radial menu the other day. They’re unintrusive, always available popovers, that can be given access to arbitrary context assuming they’re given the right permissions. This context could include everything from what app is open, what text is highlighted or even a screenshot of everything that is visible.

I spent some time trying to implement a radial selector in Xcode. I’ve spent maybe a couple hours in my life writing Swift. This skill deficiency proved problematic even with the help of a model. Even with gpt-4o, I couldn’t figure out how to get a hotkey event monitor working. Also, I am still not a fan of Xcode. Maybe, I will attempt this idea with web tech.

✎ Edit

Raw