2024-11-20

[logs] November 20, 2024

Tried out Letta. Unsure where to try and go with it.

2024-11-16

[logs] November 16, 2024

Would another day of editing fundamentally change the value readers get? Probably not. Ship it and move on to your next idea while you’re still energized.

2024-11-15

[logs] November 15, 2024

Trying out Windsurf

2024-11-10

[logs] November 10, 2024

When you are curious about something, you have the right cocktail of neurotransmitters present to make that information stick. If you get the answer to something in the context of your curiosity, then it’s going to stay with you.

David Eagleman, https://freakonomics.com/podcast/feeling-sound-and-hearing-color/

2024-11-09

[logs] November 9, 2024

we can improve the accuracy of nearly any kind of machine learning algorithm by training it multiple times, each time on a different random subset of the data, and averaging its predictions

Fastbook Chapter 9

2024-10-30

[logs] October 30, 2024

I wanted to get more hands-on with the language model trained in chapter 12 of the FastAI course, so I got some Google Colab credits and actually ran the training on an A100. It cost about $2.50 and took about 1:40, but generally worked quite well. There was a minor issue with auto-saving the notebook, probably due to my use of this workaround to avoid needing to give Colab full Google Drive access. Regardless, I was still able to train the language model, run sentence completions, then using the fine-tuned language model as an encoder to build a sentiment classifier. Seeing how long this process took, then seeing it work helped me build a bit more intuition about what to expect when training models. I was also a bit surprised how fast the next token prediction and classification inference were. I might try out a smaller fine-tune on my local machine now that I have a better sense of what this process looks like end to end.

2024-10-29

[logs] October 29, 2024

The following code allowed me to successfully download the IMDB dataset with fastai to a Modal volume:

import os

os.environ["FASTAI_HOME"] = "/data/fastai"

from fastai.text.all import *

app = modal.App("imdb-dataset-train")
vol = modal.Volume.from_name("modal-llm-data", create_if_missing=True)


@app.function(
    gpu="any",
    image=modal.Image.debian_slim().pip_install("fastai"),
    volumes={"/data": vol},
)
def download():
    path = untar_data(URLs.IMDB)
    print(f"Data downloaded to: {path}")
    return path

run with

modal run train.py::download

Next, I tried to run one epoch of training of the language model

2024-10-28

[logs] October 28, 2024

I tried training a language model with fastai on Modal.

First I attempted it in a standalone Modal script. I first wrote a script to unpack the data to a volume, then ran the fit_one_cycle function with the learner. I ran into an issue with counter.pkl sort of similar to this issue but I haven’t figured out how to resolve it yet.

On a whim, I checked to see if I could run a Jupyter notebook on Modal. Apparently, you can!

2024-10-26

[logs] October 26, 2024

Jon wrote an interesting blog on top of Cloudflare Workers and KV.

I’ve been seeing more and more notebook-like products and UX. A few I’ve seen recently:

2024-10-21

[logs] October 21, 2024

Ran several experiments using local LLMs (~7b parameter models) like llama3.2 and phi3 to generate a random number between 1 and 100. The exact prompt was

llama3.2

system

You are a random number generator that provides a number between 1 and 100.

user

Generate a random number between 1 and 100. Provide the output in the following format: ‘Random number: X’, where X is the generated number. Ensure the number is an integer and do not include any additional text or explanations.

I didn’t expect this approach to work as a uniform number generator, but it was interesting to see how it doesn’t work. At lower temperatures, most models only output a few different values in the range of 40-60. There was little to no variability. With increases in temperature (between 1-3), the distribution begins to look bi-model for several models. After this threshold, most model outputs start to breakdown and output only single digit numbers at temperature 7 and higher (I am aware this is general not a recommended model of using a language model).