2023-10-18

I spent some time experimenting with Inngest, a cloud platform for running async jobs, workflows, crons, and more. It’s quite similar to Temporal, which I’m a big advocate for when running durable workflows and gracefully handling failures. There are some drawbacks, but Inngest feels simpler to get started with than Temporal and has most of the same topline capabilities. The main feature gaps I noticed in about an hour of research were less granular retry and timeout configuration, and no support for query handlers to inspect the status of a running function/workflow. The list of supported features is otherwise impressive.
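
For a sense of the programming model, here’s a rough sketch of an Inngest function using their Python SDK. The app id, event name, and handler are made up, and since I’ve only skimmed the docs, the exact signatures may differ:

```python
import inngest

inngest_client = inngest.Inngest(app_id="demo-app")


def fetch_user(user_id: str) -> dict:
    # Hypothetical stand-in for a lookup against a real user store.
    return {"id": user_id, "email": f"{user_id}@example.com"}


@inngest_client.create_function(
    fn_id="sync-user",
    trigger=inngest.TriggerEvent(event="app/user.created"),
)
async def sync_user(ctx: inngest.Context, step: inngest.Step) -> dict:
    # Each step is durably recorded; on retry, completed steps replay
    # from memoized results instead of re-executing.
    user = await step.run("fetch-user", lambda: fetch_user(ctx.event.data["user_id"]))
    await step.run("send-welcome", lambda: print(f"welcomed {user['email']}"))
    return user
```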

I had a bunch of fun following along with this post, using my own Hugo blog to construct a SQLite database of post metadata. While building the database indices, I found a mistake I’d made years ago in defining a post’s alias (it was a duplicate), so I fixed that. I’ve read a lot of praise for SQLite lately and wanted to get more familiar with its tools and ecosystem, and this was a nice way to start.
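
A minimal sketch of the idea (not the post’s exact script): walk a Hugo content directory, parse the YAML front matter, and load post metadata into SQLite. The paths and schema are assumptions about a typical Hugo layout.

```python
import sqlite3
from pathlib import Path

import yaml  # pip install pyyaml

conn = sqlite3.connect("site.db")
conn.executescript(
    """
    CREATE TABLE IF NOT EXISTS posts (slug TEXT PRIMARY KEY, title TEXT, date TEXT);
    CREATE TABLE IF NOT EXISTS aliases (alias TEXT, slug TEXT);
    -- A unique index like this is what surfaced my duplicate alias.
    CREATE UNIQUE INDEX IF NOT EXISTS idx_alias ON aliases (alias);
    """
)

for path in Path("content/posts").glob("**/*.md"):
    text = path.read_text()
    if not text.startswith("---"):
        continue
    front_matter = yaml.safe_load(text.split("---", 2)[1])
    slug = path.stem
    conn.execute(
        "INSERT OR REPLACE INTO posts VALUES (?, ?, ?)",
        (slug, front_matter.get("title"), str(front_matter.get("date", ""))),
    )
    for alias in front_matter.get("aliases") or []:
        # Raises sqlite3.IntegrityError if two posts claim the same alias.
        conn.execute("INSERT INTO aliases VALUES (?, ?)", (alias, slug))

conn.commit()
```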

Further investigation with Open Interpreter today reaffirmed certain strengths but also revealed a number of weaknesses. The tool is excellent at parsing structured data like JSON or CSV, doing analysis with tools like pandas and numpy, and plotting the results with matplotlib. However, it falls short on more complex data-fetching tasks. It seems to struggle to scrape websites or make use of less common libraries, at least when I tried without providing any additional documentation. At one point, I had it scrape a relatively simple HTML website; it was able to parse that into a dataframe, then clean up the data to the point where I was close to ready to do analysis. Unfortunately, the REPL seemed to hang at that point. I’m not sure if I maxed out the model’s token context or if something else happened, but I had to hard-exit the process. There isn’t an easy way to “jump back to where you were in a session” and I didn’t have the patience to try again from the start. I need to look into whether some kind of step memoization is possible.
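
This is roughly the kind of scraping-and-cleanup code Open Interpreter was writing for me; the URL is a placeholder and the table index is an assumption.

```python
import pandas as pd

# read_html returns a list of DataFrames, one per <table> on the page.
# Requires lxml or html5lib to be installed.
tables = pd.read_html("https://example.com/some-table-page")
df = tables[0]

# Typical cleanup before analysis: normalize headers, drop empty rows.
df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
df = df.dropna(how="all")
print(df.head())
```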

I did some more experimentation with open-interpreter today. The first use case I tried was creating, organizing, and reorganizing files. It didn’t generate interesting content, but it was fluent at writing Python code to organize and rename files. When I prompted it to generate a fake dataset, it installed faker and created a CSV with the columns I requested. When I asked it to plot those data points, it installed matplotlib and did so without issue.
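
A sketch of the kind of script it produced: fake data with faker, written to a CSV, then plotted. The column names and row count are illustrative, not the exact ones from my session.

```python
import csv
import random

import matplotlib.pyplot as plt
from faker import Faker

fake = Faker()
rows = [
    {
        "name": fake.name(),
        "signup_date": fake.date_this_year(),
        "spend": round(random.uniform(10, 500), 2),
    }
    for _ in range(100)
]

with open("fake_users.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)

plt.hist([r["spend"] for r in rows], bins=20)
plt.xlabel("spend ($)")
plt.ylabel("count")
plt.savefig("spend_histogram.png")
```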

It’s much easier to test Temporal Workflows in Python by invoking the contents of the individual Activities first, in the shell or via a separate script, then composing them into a Workflow. I need to see if there’s a better way to surface exceptions and failures through Temporal directly to make the feedback loop faster.
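
A minimal sketch of the pattern: because an Activity is just a plain async function, you can exercise its contents directly before wiring it into a Workflow. The names here are illustrative.

```python
import asyncio
from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def fetch_greeting(name: str) -> str:
    # Imagine a flaky network call here; test it in isolation first.
    return f"Hello, {name}!"


@workflow.defn
class GreetingWorkflow:
    @workflow.run
    async def run(self, name: str) -> str:
        return await workflow.execute_activity(
            fetch_greeting,
            name,
            start_to_close_timeout=timedelta(seconds=10),
        )


# Fast feedback loop: invoke the Activity directly in a shell or script,
# no worker or server required.
if __name__ == "__main__":
    print(asyncio.run(fetch_greeting("world")))
```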

From this paper:

62% of the generated code contains API misuses, which would cause unexpected consequences if the code is introduced into real-world software

Language models and prompts are magic in a world of deterministic software. As prompts change and use cases evolve, it can be difficult to maintain confidence in the output of a model. Building a library of example inputs for your model+prompt combination, with annotated expected outputs, is critical to evolving the prompt in a controlled way, ensuring performance and outcomes don’t drift or regress as you try to improve overall performance.
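
A hedged sketch of what that can look like in practice: a small library of annotated examples run against the current prompt on every change. `call_model` is a hypothetical stand-in for whatever model client you use, and the examples are made up.

```python
import json

PROMPT_V2 = "Extract the city mentioned in the text. Reply with the city only."

# Annotated examples: an input plus the output we expect for it.
EXAMPLES = [
    {"input": "I flew into Tokyo on Monday.", "expected": "Tokyo"},
    {"input": "We're relocating the office to Berlin.", "expected": "Berlin"},
]


def call_model(prompt: str, text: str) -> str:
    # Hypothetical stand-in: swap in your actual model client call.
    for city in ("Tokyo", "Berlin"):
        if city in text:
            return city
    return ""


def run_evals() -> None:
    failures = []
    for ex in EXAMPLES:
        got = call_model(PROMPT_V2, ex["input"]).strip()
        if got != ex["expected"]:
            failures.append({**ex, "got": got})
    print(f"{len(EXAMPLES) - len(failures)}/{len(EXAMPLES)} passed")
    if failures:
        print(json.dumps(failures, indent=2))


if __name__ == "__main__":
    run_evals()
```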

I’ve been doing a bit of work with Temporal using its Python SDK. Temporal remains one of my favorite pieces of technology to work with. The team is very thoughtful with their API design, and it provides a clean abstraction for building distributed, resilient workflows. It’s a piece of technology that is difficult to understand until you build with it, and once you do, you find applications for it everywhere you look. I highly recommend experimenting with it if you’re unfamiliar.

2023-08-07

🎧 Velocity over everything: How Ramp became the fastest-growing SaaS startup of all time | Geoff Charles (VP of Product)

This conversation between Lenny and Geoff was particularly noteworthy for me because it hit on so many of the traits I’ve seen in the most effective organizations and teams I’ve been a part of, as well as ways to realign incentives to solve a number of problems I’ve experienced that hold teams back.

We report back operational overhead, meaning the percentage of tickets that come from your product area normalized by the number of users that are using that product.