I spent some time experimenting with OpenDevin using claude-3-opus (I couldn’t find an easy way to use claude-3.5-sonnet). The agentic capabilities were not bad. I gave a prompt and behind the scenes, the agent iterated, created files, ran code and course corrected. I didn’t love that there wasn’t an obvious way to interrupt or help course correct. My first attempt was with the same prompt I sent to Sonnet to build Tactic. The result wasn’t bad. OpenDevin (running in a container, which makes me feel better) setup several files in a project then seemed to install dependencies, iterate on issues and inspect the site running in the browser. At least, this was what it claimed to be doing. It was hard to validate what was happening, especially because the tool’s built in browser never loaded anything. Maybe this was a bug or something unusual happening on the model side.

After this first attempt, I tried to be less prescriptive, hoping to do a bit more back and forth with the agent. After I submitted my first message, off the agent went. I tried to interrupt its progress by sending another message, but that didn’t seem to work. Eventually, it found a stopping point. It had created a Python version of the game, which was not what I wanted. I hadn’t had a chance to add any additional instructions. I could have followed up with more at this point, but we had already gone pretty far in the wrong direction.