Fail Safe, Fail Smart… Succeed! (Part 1 of 3)
S'mae byd! (hello world in Welsh)
Astute readers noticed I accidentally sent the same chapter twice in the newsletter four weeks apart. Very embarrassing! I was on vacation using different tools than usual, but that is just an excuse. Because I deprived you of new content in the last newsletter, I’ve included a chapter I've been saving. It's one of my favorites and one of the longest. Given my mistake in the previous newsletter, it's also highly appropriate! I like to think it might make some good beach reading as well. It’s actually so long that your mail reader may truncate it, so I will have to publish it in three parts. The second part will come tomorrow, and the third part will come out on Tuesday.
About this newsletter
If you are new to this newsletter, here's what makes it unique. Every two weeks, I share a chapter from my book It Depends: Writing on Technology Leadership 2012-2022, which Unit Circle Press released in March. These chapters are not in sequential order, but each one is a standalone piece. In addition, a podcast serializes the audiobook in order, released in the alternate weeks from this newsletter.
The Latest Podcast
The latest episode of the podcast is "A Resignation Can Be an Opportunity." In it, I discuss how you can handle resignations from your team to improve both your team and yourself.
You can get the episode at Apple Podcasts, Spotify, YouTube, directly via the RSS Feed, or wherever you get your podcasts (see pod.link for a more extensive list).
Drop me a line
I'm always eager to hear from you. If you have questions, want a signed copy, or simply want to say 'hi!', please email me at contact@itdependsbook.net. You can also share your thoughts on Substack or the various podcast platforms. Your feedback is invaluable, and I look forward to hearing from you.
About this week's chapter
In this edition of the newsletter, we have chapter one of the book, which originally started as a keynote talk that I presented at several conferences. There are multiple recordings of the talk and the slides linked on my website. In this chapter, I share one of the most important lessons I learned in my career: that failure isn’t necessarily a bad thing and can even be a good thing if you learn to fail well.
If you subscribe to Lean Startup or Agile methodologies (which I do), you know that both share the idea of learning from mistakes and leveraging those lessons to improve your product, process, and company. This was really cemented for me when I was at Spotify, and in the chapter, I share multiple things I learned there (I originally wrote this talk while I was still working at the company in Stockholm). I also shared this chapter in the podcast in January, if you would prefer to listen to it.
Fail Safe, Fail Smart… Succeed!
Originally published on December 30, 2020
The importance of failure in software development
How we approach failure is critical in any industry, but it is especially crucial in building software.
Why?
The answer is simple: invention requires failure.
We don't acknowledge that fact enough as an industry. Not broadly. It is something we should recognize and understand more. Technologists continually look for ways to transform existing businesses or build new products. We are an industry that grows on innovation and invention.
Real innovation is creating something uniquely new. If you can create something genuinely novel without failing a few times along the way, it probably isn't very innovative. As Albert Einstein said:
"Anyone who has never made a mistake has never tried anything new." - Albert Einstein
In his own words, Thomas Edison said that he created three thousand theories before finding suitable materials for his electric light.
Filmmaker Kevin Smith says, "Failure is success training." I like that sentiment. It frames failure as leading to success.
Failure teaches you the things you need to know to succeed. Stated more strongly: failure is a requirement for success.
Creating a fail-safe environment
To achieve success, what's important isn't avoiding failure; it is handling failure when it comes. The handling of failure makes the difference between eventual success and never succeeding. Creating conditions conducive to learning from failure means creating a fail-safe environment.
In the software industry, we used to define a fail-safe environment as an environment with many processes to avoid failure. Instead, we should ensure that when the inevitable failure happens, we handle it well and reduce its impact. We want to fail smart.
When I was at Spotify, a company that worked hard to create a fail-smart environment, we described this as "minimizing the blast radius." This quote from Mikael Krantz, the head architect at Spotify during that time, sums up the idea nicely:
"We want to be an internal combustion engine, not a fuel-air bomb. Many small, controlled explosions, propelling us in a generally ok direction, not a huge blast leveling half the city." - Mikael Krantz
So, let us plan for failure. Let's embrace the mistakes that will come in the most thoughtful way possible. Then, we can use those failures to move us forward and ensure they are small enough not to take out the company. I like the combustion engine analogy because it embraces that a well-handled failure still pushes us in the right direction. If we anticipate failure, we can course-correct and continue to move forward.
One way to create these small, controlled explosions is to fail fast. Find the fastest, most straightforward path to learning. Can you validate your idea quickly? Can you reduce the scope so that you can get it in front of real people immediately and get feedback before investing in a bunch of work?
A side benefit of small failures is that they are easier to understand. You can identify what happened and learn from it. With a big failure, you must unpack much more and dig deeper to learn where things went wrong.
The Lesson of Clippy
Even if you've never used the Office Assistant feature of Microsoft Office, you are likely aware of it. It was a software product flop so massive that it became a part of pop culture.
I worked at Microsoft when the company created Office Assistant. Although I didn't work on that team, I knew a few people who did.
It is easy to think that the Office Assistant was a horrible idea created by a group of poor-performing developers and product people, but that couldn't be further from the truth. Clippy was built by highly talented developers, product leads, researchers with fantastic track records, and PhDs from top-tier universities: people who thought they understood the market and their users. These world-class people were working on one of (if not THE) most successful software products of all time at the apex of its popularity. Microsoft spent millions of dollars and many person-years on the development of Clippy.
So, what happened?
What happened is that those brilliant people were wrong. Very wrong, as all of us are from time to time. How could they have found their mistake before releasing widely? It wasn't easy at the time to test product assumptions. It was much harder then to validate hypotheses about users and their needs than it is today.
How we used to release software
Before we could assume high-bandwidth internet connections, we wrote and shipped software in a very different way.
Software products were manufactured, transcribed onto plastic and foil discs. For a release like Microsoft Office, those discs were manufactured in countries worldwide, put into boxes, then put onto trucks and trains and shipped to warehouses, like TV sets. From there, trucks would take them to stores where people would purchase them in person, take them home and spend an afternoon swapping the discs in and out of their computers, installing the software.
With a release like Office, Microsoft would need massive disc pressing capability. It required dozens of CD/DVD plants across the world to work simultaneously. That capability had to be booked years in advance. Microsoft would pay massive sums to take over the capacity of the entire CD/DVD pressing industry. This monopolization of disc manufacturing was booked for a fixed window of time. Moving or growing that window was monstrously expensive.
It was challenging to validate a new feature in that atmosphere, particularly if it was a significant part of a release that you didn't want to leak to the press.
That was then; this is now
Today, the world is very different. There is no excuse for not validating your ideas.
You can now deploy your website every time you hit save in your editor. You can ship your mobile app multiple times per week. You can try ideas almost as fast as you can think of them. You can try and fail, learn from the failure, and improve your product continuously.
"If you want to increase your success rate, double your failure rate." - Thomas J. Watson (CEO of IBM from 1914 to 1956)
If it takes you years and millions of dollars to fail, and you want to double your failure rate, your company will not survive to see eventual success. Failing fast minimizes the impact of your failure by reducing the cost and delay in learning.
I worked at an IBM research lab a long time ago. I was a developer on a project building early versions of synchronized streaming media. After over a year of effort, we arranged to publish our work. As we prepared, we learned that two other IBM labs were working on the same problems. Our work was complete; it was too late to collaborate. At the time, it seemed to me like big-company stupidity, not realizing that three different teams were working on the same thing. Later, I realized that this was a deliberate choice. It was how IBM failed fast. Since it took too long to fail serially, IBM had become good at failing in parallel.
Building a fail-safe culture
If building an innovative product or company requires failure, then how your culture handles the inevitable failures is key to creating a fail-safe environment.
Many companies still punish projects or features that do not succeed. The same companies then wonder why their employees are so risk-averse. Punishing failure can take many forms, both obvious and subtle. For example, punishment can mean firing the team or leader who created an unsuccessful release or project.
Sanctions can be more subtle:
Moving resources away from innovative efforts that don't yield immediate successes.
Allowing people to ridicule failed efforts.
Continuing to invest in the slow, steady growth projects instead of the more innovative but risky efforts. The innovator's dilemma is just the most well-known aspect of this.
Breeding innovation out
I spent several years working at a company whose leadership constantly encouraged employees to be more innovative and take more risks. It kept creating new incentives to draw new products out of the organization. It was also a company that had consistently grown through acquisition. Every year, it would acquire new companies. At the start of the following year's budget process, there would inevitably be the realization that the company had grown too large. Nearly every year, there would be a layoff.
Where would you look if you are a senior leader and need to trim ten percent of your organization? In previous years, you likely had already eliminated your lowest performers. Should you reduce the funding of the products that bring in your revenue or kill the new products struggling to make their first profit? The answer is clear if your bonus and salary depend on hitting revenue targets.
No matter what the intentions of the company were, through its actions, it communicated that taking risks was detrimental to a career. So, the company lost its most entrepreneurial employees through voluntary or involuntary attrition. Because it could not innovate within, innovation could only happen through acquisitions, perpetuating the cycle.
If you overtly or subtly punish failure, and failure is necessary for innovation, then you are disincentivizing innovation.
Don't punish failure. Punish not learning from failure. Punish failing big when you could have failed small first. Better yet, don't punish at all. Instead, reward the failures that the team handles well and that produce essential lessons for the company. Reward risk-taking if you want to encourage innovation.
If you worry about employees taking risks without accountability, give them participation in the revenue that they bring in (see the chapter "The Myth of a Startup in a Large Company").
Part 2 of this chapter will go out tomorrow.