Language models are more than chatbots - they’re tools for thought. The real value lies in using them as intellectual sounding boards to brainstorm, refine and challenge our ideas.
What if you could explore every tangent in a conversation without losing the thread? What if you could rewind discussions to explore different paths? Language models make this possible. This approach unlocks your next level of creativity and productivity.
Context Quality Counts#
I’ve found language models useful for iterating on ideas and articulating thoughts. Here’s an example conversation (feel free to skip; this conversation is used in the examples later on):
These conversations often get more interesting the longer they go on, but not every thread stays interesting or useful. Sometimes I respond in a way that confuses the model or sends it in a different direction than I intended. In other cases, there are several paths I want to explore, but having conversations about multiple ideas in parallel is confusing and unintuitive.
When you’re having a conversation with another person and there are multiple things you want to talk about, you need to pick one topic and continue. If you’re organized or well-practiced, maybe you’ll weave in the other topics you wanted to explore or bring the conversation back to them when it makes sense. In conversations with a model, you also have to pick a topic to continue with. If you eventually want to move on to another topic, there are limits and downsides, including carrying information you don’t want in the context1.
Conversation branching#
All that said, conversation branching provides a straightforward solution to many of these challenges for individual conversations. Branching imposes a tree structure on top of otherwise linear conversations. To take a different path or “branch” the conversation, you rewind to the last node you think is relevant, then continue in a different direction. You can retain the relevant parts of the conversation while pruning ambiguous responses and typos, or take the conversation in a whole new direction, without the less relevant parts biasing the model.
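To make the structure concrete, here’s a minimal sketch of a conversation tree in Python (the message content is illustrative, not from the conversation above): each node stores one message and a link to its parent, and branching is just attaching a new child to an earlier node. The context sent to the model is rebuilt by walking from a node back to the root.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One message in a conversation tree; branching attaches children to earlier nodes."""
    role: str                      # "user" or "assistant"
    content: str
    parent: Optional["Node"] = None

    def thread(self) -> list[dict]:
        """Walk back to the root to rebuild the linear context for this branch."""
        messages, node = [], self
        while node is not None:
            messages.append({"role": node.role, "content": node.content})
            node = node.parent
        return list(reversed(messages))


# Original path: a question and the model's reply.
root = Node("user", "How should I structure long brainstorming sessions with a model?")
reply = Node("assistant", "You could keep one long thread, or...", parent=root)

# Branch: rewind to the reply and take the conversation somewhere new.
# Only the messages on this branch are sent to the model, so the pruned
# parts of the original thread can't bias the response.
branch = Node("user", "Let's explore the branching idea in more detail.", parent=reply)
print(branch.thread())
```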
Considering our conversation from earlier, here’s a conversation tree that branches off from the first message pair.
But we may also want different perspectives (variation). We could try increasing the temperature from 0.7 to 1.0 with the same prompt.
Or send the same prompt to different models.
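As a rough sketch of what those two variations look like in code (assuming the OpenAI Python SDK here; the prompt and model names are placeholders, and the same pattern applies to other providers):

```python
from openai import OpenAI

client = OpenAI()
prompt = [{"role": "user", "content": "Help me think through branching conversations."}]

# Same prompt, more variation: raise the temperature from 0.7 to 1.0.
warmer = client.chat.completions.create(model="gpt-4o", messages=prompt, temperature=1.0)

# Same prompt, different model for a different perspective.
other = client.chat.completions.create(model="gpt-4o-mini", messages=prompt, temperature=0.7)

print(warmer.choices[0].message.content)
print(other.choices[0].message.content)
```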
We could also explore multiple paths simultaneously while keeping the original premise as our starting point. In the next example, we ask for options for storing key/value pairs in Python. Then we explore two of the suggestions in parallel conversations.
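Mechanically, each parallel branch is just the shared premise plus a different follow-up, sent as its own request. Here’s a hedged sketch (again assuming the OpenAI SDK; the follow-up topics are illustrative, not necessarily the suggestions explored in the original conversation):

```python
from openai import OpenAI

client = OpenAI()

# Shared premise: the original question and the model's list of suggestions.
base = [
    {"role": "user", "content": "What are my options for storing key/value pairs in Python?"},
    {"role": "assistant", "content": "A dict, the shelve module, sqlite3, ..."},  # placeholder reply
]

# Two parallel branches, each extending the same premise with a different follow-up.
branches = [
    base + [{"role": "user", "content": "Tell me more about shelve."}],
    base + [{"role": "user", "content": "When would sqlite3 be a better fit?"}],
]

for messages in branches:
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    print(response.choices[0].message.content)
```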
Each of these responses takes us in a different direction. For the most part, it’s not the model coming up with ideas for us – it’s prompting us to think further about the concept.
Using this approach in long conversations with models, I’ve found surprising depth on most topics I’m familiar with. It’s quite possible I’m being fooled by the “stochastic parrot”, but I’ve perceived some of these conversations as genuinely insightful. They’ve given me ideas for experiments to try and features to add to projects I’m working on.
Could friends of mine give me these same ideas or insight? Probably.
Could I search the internet and glean similar insights from products that already exist? Sure.
But my friends are tired of hearing about my latest LLM project, the model responds faster than either of those approaches, and most of the internet is already crammed into these models, so it’s much faster to unearth things starting with the model than with most search engines. It’s the fastest idea-iteration feedback loop I’ve ever used.
The model doesn’t replace the conversations I have with people. It’s another tool and ideation partner that helps me get unstuck and wade through some of the fuzziness when I am trying new things.
Where next?#
From here, my inclination is to investigate what it could look like to summarize some of these longer conversations, to see whether the same insight can be distilled from the longer-form exploration or whether the value comes from the conversational journey itself. If valuable, these summaries could then serve as the premises of future conversations and exploration, with the model primed to continue discussing similar topics.
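As a hedged sketch of what that could look like (the prompts, model name and placeholder conversation are all assumptions): distill the long conversation into a summary, then seed a new conversation with it.

```python
from openai import OpenAI

client = OpenAI()

# Placeholder for a long exploration like the ones above.
long_conversation = [
    {"role": "user", "content": "Help me think through conversation branching."},
    {"role": "assistant", "content": "...a long, winding exploration..."},
]

# Distill the conversation into its key ideas and open questions.
summary = client.chat.completions.create(
    model="gpt-4o",
    messages=long_conversation
    + [{"role": "user", "content": "Summarize the key ideas and open questions from this conversation."}],
).choices[0].message.content

# Use the summary as the premise for a fresh conversation, with the model
# primed to continue exploring the same topics.
new_conversation = [
    {"role": "system", "content": f"Context from a previous discussion:\n{summary}"},
    {"role": "user", "content": "Which of the open questions is most worth exploring next?"},
]
reply = client.chat.completions.create(model="gpt-4o", messages=new_conversation)
print(reply.choices[0].message.content)
```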
My experience has been that it is generally better to fix your prompt than to correct the model with a follow-up message, if you’re looking for the highest quality response. While models like Claude (200K) and Gemini (2M) have large context windows, there are still practical limits to how much we can fit in a single conversation, especially if you start including documents, audio, images and video. Additionally, there’s some evidence that recall degrades as the amount of context grows, so limiting what you include to the relevant context seems to help with response quality as well. The size of the context window used also affects in-context learning with RAG (retrieval-augmented generation). ↩︎