This conversation between Lenny and Geoff was particularly noteworthy for me because it touched on so much of what I’ve seen in the most effective organizations and teams I’ve been a part of, and because it showed how realigning incentives can solve a number of problems I’ve seen hold teams back.
We report back operational overhead, meaning the percentage of tickets that come from your product area normalized by the number of users that are using that product
Having worked on a support eng team, this point resonates deeply. We often saw support tickets being opened in our systems because of bugs and issues in the product. The product teams were often unaware of these issues, and the number of these problems only grew over time. Occasionally, if an issue was driving enough ticket volume, it would make it back to the product team, who would fix the bug. But I always advocated that we attribute support tickets to the product teams responsible, report those numbers back to them, and make the numbers public. To hear this approach corroborated in the discussion was validating.
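As a rough sketch of what such a report could look like, here is one possible reading of the metric in Python: each product area's share of all tickets, plus its ticket count per 1,000 users. The function name, the sample areas, and the per-1,000 scaling are my own assumptions for illustration; the conversation doesn't spell out the exact formula.

```python
from collections import Counter


def operational_overhead(ticket_areas, active_users):
    """Return {area: (share_of_all_tickets, tickets_per_1000_users)}.

    ticket_areas: one product-area label per support ticket.
    active_users: {area: number of users of that area}.
    Hypothetical helper: the exact formula is an assumption.
    """
    counts = Counter(ticket_areas)
    total = sum(counts.values())
    report = {}
    for area, users in active_users.items():
        n = counts.get(area, 0)
        # Share of total ticket volume attributed to this area.
        share = n / total if total else 0.0
        # Normalize by usage so heavily used areas aren't unfairly penalized.
        per_1k = 1000 * n / users if users else 0.0
        report[area] = (share, per_1k)
    return report


# Made-up sample data, purely for illustration.
tickets = ["billing", "billing", "search", "billing", "search", "onboarding"]
users = {"billing": 2000, "search": 500, "onboarding": 1000}
report = operational_overhead(tickets, users)

# Rank areas by normalized ticket load, worst first.
for area, (share, per_1k) in sorted(report.items(), key=lambda kv: -kv[1][1]):
    print(f"{area}: {share:.0%} of tickets, {per_1k:.1f} per 1,000 users")
```

In this made-up data, billing generates the most tickets in absolute terms, but search generates more tickets per user. That difference is the reason to normalize.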
We don’t have a bug backlog. We fix every bug once they’re surfaced almost.
This approach is simultaneously an engineer’s dream and nightmare. On the surface, it sounds like you would get nothing done because your work would constantly be interrupted by bugs that need fixing. It wasn’t entirely clear to me how the workload breaks down between “core” and “production” engineers. However, to me, this approach seems to be the best shot you have at keeping your product as close to bug-free as you can. It’s an idealistic goal, but I like it. Bug backlogs rot rapidly. The codebase continues to change, and the engineers with context move on to other things. The best time to fix a bug is as soon as you learn about it, when the context is freshest and you have the reporter available or a current stack trace. The effort required to fix a bug weeks or months later always seems to be higher than fixing it immediately, because you have to ramp up on context and find a way to reproduce it, and things may have changed since it was reported.
[S]upport reports into me. And the first principle there was saying, “Well, every support ticket is a failure of our product.” We literally have that as a quote just posted on all those channels. It’s a failure. And if the product works perfectly, no one should ever have to contact our support team. And what better way of holding the product team accountable for support other than having support report into product.
Here, Geoff further reinforces the point that with a dedicated approach, a team can build quickly while still maintaining a bug-free experience. I love this mentality. I personally feel a lot of dissonance when I find a bug in my code. The first thing I want to do is fix it, particularly when I know it has customer impact. On one hand, this can interrupt your flow if you’re working on something else. And that something else may also be positioned to deliver a lot of customer value, so there is an opportunity cost to the interruption. On the other hand, the bug is negatively impacting customers today. Having the organization’s buy-in to prioritize these fixes makes a strong statement about its culture and values.
I really enjoyed listening to Geoff’s ideas and approach to product. This conversation has helped me challenge some of my priors and assumptions about how teams are built and run.
Bonus quote, included without further comment
[W]e moved from quarterly, very expensive quarterly planning, which took one month every three months, so basically 33% of the time was planning, to a biannual one-pager on, these are the company priorities and it’s much more smooth and much faster