I’m working on a conversation branching tool called “Delta” (for now).
The first thing that led me to this idea was chatting with Llama 3.2 and experimenting with different system prompts.
I was actually trying to build myself a local version of an app I’ve been fascinated by called Dot.
I noticed that as conversations with models progressed, the responses became more interesting.
A friend made a point that really stuck in my head about how you “build trust” with a model over multiple conversation turns.
While you can write system prompts to steer the model toward more detailed responses and longer paragraphs, I observed that regardless of the system prompt, as conversations went on longer (more turns, more exchanges, a longer message history), the responses became more interesting and better calibrated to what I was looking for.
The model seemed to display a more coherent understanding of what I was talking about.
Tried out Letta.
Unsure where to try and go with it.
Would another day of editing fundamentally change the value readers get?
Probably not.
Ship it and move on to your next idea while you’re still energized.
Trying out Windsurf
When you are curious about something, you have the right cocktail of neurotransmitters present to make that information stick.
If you get the answer to something in the context of your curiosity, then it’s going to stay with you.
We can improve the accuracy of nearly any kind of machine learning algorithm by training it multiple times, each time on a different random subset of the data, and averaging its predictions.
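That line is describing bagging (bootstrap aggregating). A toy sketch of the idea in Python, where the dataset, base model, and ensemble size are arbitrary choices of mine purely for illustration:

# A toy illustration of the quoted idea (bagging); the dataset, model,
# and ensemble size below are arbitrary choices for demonstration only.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
rng = np.random.default_rng(0)

preds = []
for _ in range(25):
    # Train each model on a different random subset (a bootstrap sample)
    idx = rng.integers(0, len(X), len(X))
    model = DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx])
    preds.append(model.predict(X))

# Average the predictions of the independently trained models
bagged = np.stack(preds).mean(axis=0)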
I wanted to get more hands-on with the language model trained in chapter 12 of the FastAI course, so I got some Google Colab credits and actually ran the training on an A100.
It cost about $2.50 and took about 1:40, but generally worked quite well.
There was a minor issue with auto-saving the notebook, probably due to my use of this workaround to avoid needing to give Colab full Google Drive access.
Regardless, I was still able to train the language model, run sentence completions, and then use the fine-tuned language model as an encoder to build a sentiment classifier.
Seeing how long this process took, and then seeing it work, helped me build a bit more intuition about what to expect when training models.
I was also a bit surprised by how fast next-token prediction and classification inference were.
I might try out a smaller fine-tune on my local machine now that I have a better sense of what this process looks like end to end.
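For a sense of what those inference calls look like, here is a rough sketch using fastai's high-level API; it assumes the trained learners had been exported with learn.export(), and the file names and prompts are hypothetical:

from fastai.text.all import *

# Hypothetical export file names; substitute whatever learn.export() produced
learn_lm = load_learner("imdb_lm.pkl")      # fine-tuned language model
learn_clas = load_learner("imdb_clas.pkl")  # classifier built on its encoder

# Sentence completion via next-token prediction
print(learn_lm.predict("I liked this movie because", n_words=40, temperature=0.75))

# Sentiment classification: returns (label, label index, class probabilities)
print(learn_clas.predict("I really liked that movie!"))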
The following code allowed me to successfully download the IMDB dataset with fastai to a Modal volume:
import os

import modal

# Set FASTAI_HOME before importing fastai so downloads land on the mounted volume
os.environ["FASTAI_HOME"] = "/data/fastai"
from fastai.text.all import *

app = modal.App("imdb-dataset-train")
vol = modal.Volume.from_name("modal-llm-data", create_if_missing=True)


@app.function(
    gpu="any",
    image=modal.Image.debian_slim().pip_install("fastai"),
    volumes={"/data": vol},
)
def download():
    # untar_data respects FASTAI_HOME, so the dataset ends up on the volume
    path = untar_data(URLs.IMDB)
    print(f"Data downloaded to: {path}")
    return path
Run it with:
modal run train.py::download
Next, I tried to run one epoch of language model training with fastai on Modal.
First, I attempted it in a standalone Modal script: I wrote a script to unpack the data to a volume, then ran the fit_one_cycle function with the learner.
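A sketch of what such a training function might look like on Modal, reusing the download setup above; the app name, hyperparameters, and timeout here are my own assumptions rather than the actual script:

import os

import modal

os.environ["FASTAI_HOME"] = "/data/fastai"
from fastai.text.all import *

app = modal.App("imdb-lm-train")
vol = modal.Volume.from_name("modal-llm-data", create_if_missing=True)


@app.function(
    gpu="any",
    image=modal.Image.debian_slim().pip_install("fastai"),
    volumes={"/data": vol},
    timeout=60 * 60,
)
def train_lm():
    # The IMDB data was already unpacked onto the volume by the download step
    path = untar_data(URLs.IMDB)
    dls_lm = TextDataLoaders.from_folder(path, is_lm=True, valid_pct=0.1)
    learn = language_model_learner(dls_lm, AWD_LSTM, metrics=[accuracy, Perplexity()])
    learn.fit_one_cycle(1, 2e-2)  # one epoch of language model fine-tuning
    vol.commit()                  # persist anything written under /data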
I ran into an issue with counter.pkl, sort of similar to this issue, but I haven’t figured out how to resolve it yet.
On a whim, I checked to see if I could run a Jupyter notebook on Modal.
Apparently, you can!
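One way this can work is through Modal's tunnels feature; here is a minimal sketch, where the image contents, port, and token are my own assumptions:

import os
import subprocess

import modal

app = modal.App("jupyter-on-modal")
image = modal.Image.debian_slim().pip_install("jupyter")


@app.function(image=image, timeout=3600)
def run_jupyter():
    # Expose the notebook server through a Modal tunnel and print its URL
    with modal.forward(8888) as tunnel:
        print(f"Jupyter available at: {tunnel.url}")
        subprocess.run(
            ["jupyter", "notebook", "--no-browser", "--allow-root",
             "--ip=0.0.0.0", "--port=8888"],
            env={**os.environ, "JUPYTER_TOKEN": "my-secret-token"},
        )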
Jon wrote an interesting blog on top of Cloudflare Workers and KV.
I’ve been seeing more and more notebook-like products and UX.
A few I’ve seen recently: