I’m working on a conversation branching tool called “Delta” (for now). The idea first came from chatting with Llama 3.2 and experimenting with different system prompts. At the time, I was trying to build myself a local version of an app I’ve been fascinated by called Dot.

I noticed that as conversations with models progressed, they became more interesting. A friend made a point that really stuck in my head: you “build trust” with a model over multiple conversation turns. You can write a system prompt to steer the model toward more detailed, longer responses, but I observed that regardless of the system prompt, longer conversations (more turns, a longer message history) produced responses that were more interesting and better calibrated to what I was looking for. The model seemed to display a more coherent understanding of what I was talking about.

This led me to ask: could I use that message history as an augmented system prompt? For example, if I have a conversation with six messages (three turns), could I add several different fourth turns to that conversation and see what happens? Delta was created to let you build a conversation and then branch it from different points.
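To make that concrete, here’s a minimal sketch of the branching idea. The `complete` function is a hypothetical stand-in for whatever model client you use (local or API), and the message contents are placeholders: a shared three-turn history acts as a common prefix, and each candidate fourth turn starts its own branch.

```python
def complete(messages: list[dict]) -> str:
    """Hypothetical stand-in: send `messages` to a local model or an API
    and return the assistant's reply. Replace with a real client."""
    return f"(model reply to: {messages[-1]['content']!r})"

# Six messages, i.e. three user/assistant turns, acting as a shared prefix.
shared_history = [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."},
]

# Several different fourth turns, each beginning a new branch.
candidate_fourth_turns = [
    "Can you go deeper on the second point?",
    "Summarize what we've covered as a list of steps.",
    "What would a skeptic say about all of this?",
]

for prompt in candidate_fourth_turns:
    # Every branch shares the same prefix but diverges at the new turn,
    # so the accumulated history acts like an augmented system prompt.
    branch = shared_history + [{"role": "user", "content": prompt}]
    print(complete(branch))
```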

The project was also partly inspired by Simon Willison’s llm CLI tool, which saves all LLM conversation history to a local SQLite database. I took that idea and augmented it with pointers from each message to its predecessor, which form the tree structure used in Delta. Lastly, I wanted to be able to experiment with multiple models, both local ones and ones behind APIs. Trying the same prompts against several models in Delta helps me develop an intuition for the quality and kinds of responses each one gives, and provides a helpful playground for these experiments.
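If you’re curious what that pointer idea looks like, here’s a rough sketch of a parent-pointer schema (the table and column names are illustrative, not Delta’s actual schema): each message row points at its parent, a branch point is just two rows sharing the same parent, and a recursive query rebuilds the history for any leaf.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE messages (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES messages(id),  -- NULL for a root message
        role      TEXT NOT NULL,                    -- 'user' or 'assistant'
        content   TEXT NOT NULL,
        model     TEXT                              -- which model replied, if any
    )
""")

# A short conversation with two alternative follow-ups, i.e. a branch point.
conn.execute("INSERT INTO messages VALUES (1, NULL, 'user', 'First question', NULL)")
conn.execute("INSERT INTO messages VALUES (2, 1, 'assistant', 'First answer', 'llama3.2')")
conn.execute("INSERT INTO messages VALUES (3, 2, 'user', 'Branch A follow-up', NULL)")
conn.execute("INSERT INTO messages VALUES (4, 2, 'user', 'Branch B follow-up', NULL)")

def history(conn, leaf_id):
    """Walk parent pointers upward to reconstruct one branch's full history."""
    rows = conn.execute("""
        WITH RECURSIVE ancestry(id, parent_id, role, content) AS (
            SELECT id, parent_id, role, content FROM messages WHERE id = ?
            UNION ALL
            SELECT m.id, m.parent_id, m.role, m.content
            FROM messages m JOIN ancestry a ON m.id = a.parent_id
        )
        SELECT role, content FROM ancestry
    """, (leaf_id,)).fetchall()
    return list(reversed(rows))  # root first, leaf last

print(history(conn, 3))  # the shared prefix plus Branch A's new turn
```

The nice property of this layout is that branching costs nothing: a new branch is just one more row pointing at an existing message, and every branch shares the stored prefix instead of duplicating it.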