Reading a bunch. Also got inspired to play around with generating random numbers with language models across different temperatures to see their distributions.
There is an enormous amount of jargon in deep learning, including terms like rectified linear unit. The vast vast majority of this jargon is no more complicated than can be implemented in a short line of code, as we saw in this example. The reality is that for academics to get their papers published they need to make them sound as impressive and sophisticated as possible. One of the ways that they do that is to introduce jargon.
Reading fastbook, I get the sense we could teach math more effectively if we did so through spreadsheets and Python code.
Current theory on why nbdev and notebooks in general make sense and can work: Writing code for most software is actually pretty similar to writing code for models, but usually you pay less of a cost for not knowing what you’re doing yet (aka exploring). You still pay some cost, but it isn’t totally deal-breaking in most software settings, unlike if you needed to run the entire data load and clean job as you’re training a model.
There are many tools for doing evals. I used ell and braintrust together for fun and disaster. The integration is actually not terrible, though I’m not 100% sure whether they’d be obvious things to try to link together. It seems ell is striving to build its own eval capabilities as well.
Some quotes from Lesson 3 of course.fast.ai by Jeremy Howard.
I remember a few years ago when I said something like this in a class somebody on the forum was like “this reminds me of that thing about how to draw an owl”. Jeremy’s basically saying okay step one draw two circles, step two draw the rest of the owl. The thing I find I have a lot of trouble explaining to students is when it comes to deep learning, there’s nothing between these two steps.
I tried to use aider to build a crossword generator in Python. Even with a preselected set of words, this proved difficult. Or perhaps the preselected set of words was why it was difficult. Either way, the AI model doesn’t really understand the concept of word overlap in the context of a crossword. That seemed solvable: instead of relying on the model to reason about overlap, I had it write code to precalculate word overlap from the word list, then use that to place the words.
One of the most painful lessons beginners have to learn is just how often everyone is wrong about everything.
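
For the crossword note above, here is a minimal sketch of what precalculating word overlap could look like. This is not the code aider produced; the function name `precompute_overlaps` and the sample word list are made up for illustration. It assumes that "overlap" means two words share a letter at some pair of positions, so they could cross there.

```python
from collections import defaultdict
from itertools import permutations


def precompute_overlaps(words):
    """Map each ordered pair of distinct words (a, b) to the list of
    (i, j) index pairs where a[i] == b[j], i.e. where they could cross.

    Pairs with no shared letter are left out of the result.
    """
    overlaps = defaultdict(list)
    for a, b in permutations(words, 2):
        for i, ch_a in enumerate(a):
            for j, ch_b in enumerate(b):
                if ch_a == ch_b:
                    overlaps[(a, b)].append((i, j))
    return dict(overlaps)


# Hypothetical word list, purely for illustration.
words = ["python", "aider", "model"]
overlaps = precompute_overlaps(words)

for (a, b), positions in sorted(overlaps.items()):
    print(a, b, positions)
```

A placement step could then look up `overlaps[(placed_word, candidate)]` to find candidate crossing points rather than rescanning letters each time. Checking that a candidate placement does not collide with other words on the grid would still be a separate step.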
Imagine a spreadsheet where every time you change something you must open a terminal, run the compiler and scan through the cell / value pairs in the printout to see the effects of your change. We wouldn’t put up with UX that appalling in any other tool but somehow that is still the state of the art for programming tools.
Erik wrote about how it’s hard to write code for humans.
Getting started is the product!
Found a cool script by David to allow streaming output using glow. Can’t seem to figure out why it strips away the terminal colors for me.
Cool article by Jacob on a blog re-write to Astro. I’ve been getting a bit of a re-write itch lately but I don’t want it to be a distraction. Might need to wait until the end of the FastAI course with just a little exploration on the side.