2024-02-04

[logs] February 4, 2024

I pivoted to using home-manager using the standalone installation. It seems to provide a reasonable starting point and a found a fair bit of prior art when searching Github, so my hope was there would be good examples to model off of. It took me a frustrating time to realize after running home-manager switch that only some changes took effect immediately. Knowing this probably would have saved me an hour and could have possibly solved a number of my problems from yesterday as well – I don’t have a quick way to check.

2024-02-03

[logs] February 3, 2024

Today, was a first day setting up a new computer. I spent most of the time installing applications and building up my setting with declarative configurations. This site has been helpful. I also started this walkthrough for setting up nix on a Mac.

I had to temporarily disable my nix.conf that was generated from the installer then run

nix --extra-experimental-features nix-command --extra-experimental-features flakes run nix-darwin -- switch --flake .

to run the flake.nix file.

2024-01-31

[logs] January 31, 2024

A new thing I am trying is sending thanks to folks who write articles or build projects that I find useful. I got a taste of this after writing an article on fine-tuning gpt-3.5 to solve the Connections word game, even though the results didn’t turn out that well. Getting the positive feedback was quite motivating and my hope is to give others the same appreciation for the positive impact their work makes on me.

2024-01-28

[logs] January 28, 2024

I’m currently working on building a language model based chatbot that can answer questions about the contents of a database. There are a lot of products and libraries making efforts at this problem. To start, I tried out the Vanna.ai open source library. I followed this guide to get started with ChromaDB for the indices and OpenAI as the language model to query a Postgres database. I also set up a Postgres database with Docker and the Chinook dataset. I downloaded the Chinook dataset for Postgres from this repo. The dataset is described in detail here. To start up Docker and load the data, I ran the following from my host machine (not inside a Docker container)

2024-01-26

[logs] January 26, 2024

For several days now, I’ve been looking into recording audio in a browser and streaming it to a backend over a websocket with the intent to do speech to text translation with an AI model. I know the pieces are all there and I’ve done something like this before (streamed audio from a Twilio IVR to a node backend, the send that to a Google Dialogflow CX agent). The current challenge is finding which pieces I want to connect. I’ve used a lot of Next.js lately. I like the developer experience. It enjoyable to use to build frontends. It also has route handlers, which are backend functions that are deployed on Lambda if you deploy on Vercel. These route handlers can’t really support a websocket backend because they aren’t designed to be long lived, something I learned when I worked around if by creating a secondary route handler as an async function. Apparently, these can now run for up to five minutes.¹ Route handlers on Vercel can now run for a maximum of five minutes, which is an increase from the previous limit. This allows for more complex operations to be handled directly within these functions. I would need to stand up a separate backend. That seemed fine and fair enough, so I started looking at Deno, which I’ve also used recently and enjoyed. Deno supports websockets out of the box. It also supports importing npm modules – I plan to use @google-cloud/speech to do speech to text conversion. The remaining question is how I can stream audio captured in the browser with navigator.getUserMedia over a websocket to forward to Google to convert to text.

2024-01-23

[logs] January 23, 2024

Hardly seemed with a TIL post because it was too easy, but I learned gpt-4 is proficient at building working ffmpeg commands. I wrote the prompt

convert m4a to mp3 with ffmpeg

and it responsed with

ffmpeg -i input.m4a -codec:v copy -codec:a libmp3lame -q:a 2 output.mp3

Since the problem at hand was low stakes, I just ran the command and, to my satisfaction, it worked. Language models can’t solve every problem but they can be absolutely delightful when they work.

2024-01-16

[logs] January 16, 2024

I spent another hour playing around with different techniques to try and teach and convince gpt-4 to play Connections properly, after a bit of exploration and feedback. I incorporated two new techniques

Asking for on category at a time, then giving the model feedback (correct, incorrect, 3/4)
Using the chain of thought prompting technique

Despite all sorts of shimming and instructions, I still struggled to get the model to

only suggest each word once, even when it already got a category correct
only suggest words from the 16 word list

Even giving a followup message with feedback that the previous guess was invalid didn’t seem to help. This was the prompt I ended up with. It wasn’t all that effective.

2024-01-10

[logs] January 10, 2024

After some experimentation with GitHub Copilot Chat, my review is mixed. I like the ability to copy from the sidebar chat to the editor a lot. It makes the chat more useful, but the chat is pretty chatty and thus somewhat slow to finish responding as a result. I’ve also found the inline generation doesn’t consistently respect instructions or highlighted context, which is probably the most common way I use Cursor, so that was a little disappointing. To get similar behavior with Copilot, sometimes I needed to run a generation for the whole file, but the lack of specific highlighted context meant I had to write more specific instructions, which was more time-consuming than highlighting and giving shorter, more contextual instructions. It is easy to edit the prompt and resubmit it if the completion is close, but not quite right, so that is helpful.

2024-01-07

[logs] January 7, 2024

I worked through a basic SwiftUI 2 tutorial to build a simple Mac app. Swift and SwiftUI are an alternative to accomplish the same things Javascript and React do for web. I could also use something like Electron to build a cross-platform app using web technology, but after reading Mihhail’s article about using macOS native technology to develop Paper, I was curious to dip my toe in and see what the state of the ecosystem looked like. He opted to use Objective-C, for performance reasons. I decided to try Swift because I’ve written a bit of Objective-C years ago. I like the ergonomics of Swift as a language well enough. I can’t say I’m a huge fan of Xcode. My hardware is almost certainly too old, but Xcode is sluggish and not fun to use in a way that the web development tools I use are not (at least on my machine). Seeing all the things that PWAs can do today, I’m unsure whether it makes sense to invest in learning SwiftUI unless I want to build native mac apps.

2024-01-05

[logs] January 5, 2024

I enjoyed this article by Robin about writing software for yourself. I very much appreciate the reminder of how gratifying it can be to build tools for yourself.