2024-10-20

[logs] October 20, 2024

Reading a bunch. Also got inspired to play around with generating random numbers with language models across different temperatures to see their distributions.

2024-10-16

[logs] October 16, 2024

There is an enormous amount of jargon in deep learning, including terms like rectified linear unit. The vast vast majority of this jargon is no more complicated than can be implemented in a short line of code, as we saw in this example. The reality is that for academics to get their papers published they need to make them sound as impressive and sophisticated as possible. One of the ways that they do that is to introduce jargon. Unfortunately, this has the result that the field ends up becoming far more intimidating and difficult to get into than it should be. You do have to learn the jargon, because otherwise papers and tutorials are not going to mean much to you. But that doesn’t mean you have to find the jargon intimidating. Just remember, when you come across a word or phrase that you haven’t seen before, it will almost certainly turn out to be referring to a very simple concept.
Read More…

2024-10-12

[logs] October 12, 2024

Reading fastbook, I get the sense we could teach math more effectively if we did so through spreadsheets and Python code.

2024-10-11

[logs] October 11, 2024

Current theory on why nbdev and notebooks in general make sense and can work: Writing code for most software is actually pretty similar to writing code for models, but usually you pay less of a cost for not knowing what you’re doing yet (aka exploring). You still pay some cost, but it isn’t totally deal-breaking in most software settings, unlike if you needed to run the entire data load and clean job as you’re training a model. For the latter, a full re-run would slow down the feedback loop too much to still move quickly. Notebooks reduce the feedback loop to about as low as imaginable while also allowing you to experiment while still not totally knowing where you’re going yet. Because you can still go forward(ish), you can sketch in the general direction of where you want to go without as many of the constraints of abstraction, structure, or slow parts of your workflow, which can easily be memoized by the notebook.

2024-10-10

[logs] October 10, 2024

There are many tools for doing evals. I used ell and braintrust together for fun and disaster. The integration is actually not terrible, though I’m not 100% whether they’d be obvious things to try and link together. It seems ell is striving to build its own eval capabilities as well.

2024-10-09

[logs] October 9, 2024

course.fast.ai

Some quotes from Lesson 3 of course.fast.ai by Jeremy Howard.

I remember a few years ago when I said something like this in a class somebody on the forum was like “this reminds me of that thing about how to draw an owl”. Jeremy’s basically saying okay step one draw two circles, step two draw the rest of the owl. The thing I find I have a lot of trouble explaining to students is when it comes to deep learning, there’s nothing between these two steps. When you have ReLUs getting added together and gradient descent to optimize the parameters and samples of inputs and of what you want, the computer draws the owl. That’s it.
Read More…

2024-10-08

[logs] October 8, 2024

aider

I tried to use aider to build a crossword generator in Python. Even with a preselected set of words, this proved difficult. Or perhaps the preselected set of words was why it was difficult. Either way, the AI model doesn’t really understand the concept of word overlap in the context of a crossword. That seemed solvable. Instead, I had it write code to precalculate word overlap from the word list, then use that to place the words. The program seemed to hang indefinitely. Adding debug statements revealed extensive looping and attempts to place words that likely couldn’t fit within the constraints of the puzzle. I had selected a group of words around the theme “celestial bodies” without considering their potential for overlap. It was probably impossible to place them all.

2024-10-07

[logs] October 7, 2024

One of the most painful lessons beginners have to learn is just how often everyone is wrong about everything.

Imagine a spreadsheet where every time you change something you must open a terminal, run the compiler and scan through the cell / value pairs in the printout to see the effects of your change. We wouldn’t put up with UX that appalling in any other tool but somehow that is still the state of the art for programming tools.
Read More…

2024-09-27

[logs] September 27, 2024

Erik wrote about how it’s hard to write code for humans.

Getting started is the product!

Found a cool script by David to allow streaming output using glow. Can’t seem to figure out why it strips away the terminal colors for me.

2024-09-26

[logs] September 26, 2024

Cool article by Jacob on a blog re-write to Astro. I’ve been getting a bit of a re-write itch lately but I don’t want it to be a distraction. Might need to wait until the end of the FastAI course with just a little exploration on the side.