Why do prototypes suck?

Hey there! I'm Nat Bennett, software engineer, writer, and general systems thinker. You're reading Simpler Machines, which is currently an irregular newsletter about making software (and dealing with work in general) that goes out when I get a minute to write.

I build little prototypes or proofs of concept a lot. I find that they're a really efficient way to make persuasive technical arguments. Rather than writing a long document about why I think my team should try something, I'll spend a few hours making a small demonstration and show it to folks. These demos are usually profoundly hacky – no tests, hardcoded values, that kind of thing – but they're real, tangible implementations, and those are much easier to discuss and reason about than a totally hypothetical proposal.

A key part of the practice is that these prototypes are small, and if we decide to actually build the real thing, we throw the prototype away.

Whenever I recommend this strategy, I always get at least a little bit of pushback. Something along the lines of, "Prototypes are dangerous, because no one actually throws them away. Once you've got it working it's too tempting just to put it into production, and then– ugh!" Often this comes from ops-y folks who have had the pleasure of maintaining other people's "quick hacks" for years.

That got me wondering: why is it, exactly, that prototypes are so miserable to maintain and operate? And how can we avoid putting them into production? How they end up in production seems obvious enough – once you've got something that technically works, there's a huge temptation to just ship it, especially under any kind of delivery pressure – but those first two questions were interesting enough that I've had a bunch of conversations with people about them over the last few years. This post is an attempt to share some notes from those conversations.

Prototypes are hard to change without breaking

They usually don't have tests. Tests do lots of things, but one of the things they do is programmatically encode which features of a system are important and intentional, and separate those features from the ones that are merely incidental.
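As a tiny sketch of what that encoding looks like in practice – the `slugify` helper here is hypothetical, invented just for illustration:

```python
# Hypothetical example: a helper that turns a post title into a URL slug.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

# The test pins down which behavior is *intentional*: lowercasing and
# hyphen-joining. Anything the test doesn't assert (say, how punctuation
# is handled) is merely incidental, and a future maintainer knows they're
# free to change it.
def test_slugify():
    assert slugify("Why Do Prototypes Suck") == "why-do-prototypes-suck"
```

Without the test, every behavior of the prototype looks equally load-bearing to whoever inherits it.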

Likewise they often don't have design documents or other documentation. Often "what does <x> do" ends up as an oral tradition. (This can be okay on a small team that stays small, but if you're at a startup that goes into scaling mode, oh boy.)

In my own prototypes, I often hardcode values that I would normally make configurable. This is especially sticky when those values need to be shared between systems – it'd be really easy to end up with values that have to be updated in tandem across several systems. Add in "no documentation" and, well, you've got a mess.
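A sketch of the difference – the retention-window value and the `RETENTION_DAYS` variable name are made up for illustration:

```python
import os

# Prototype style: a value that another system also depends on,
# hardcoded in place. Updating it means hunting down every copy.
RETENTION_DAYS = 30

# Production style: the shared value comes from one configuration
# source, and the service fails loudly if it's missing, rather than
# silently drifting out of sync with its neighbors.
def retention_days(env=os.environ) -> int:
    try:
        return int(env["RETENTION_DAYS"])
    except KeyError:
        raise RuntimeError(
            "RETENTION_DAYS is not set; this value is shared with "
            "other systems and must come from config"
        )
```

The failure mode the second version prevents is the quiet one, where two systems disagree about the value and nobody notices until the data does.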

Prototypes or hack day projects are also often a chance for the author to experiment with something new, which means they don't behave like all the other projects. They might be the one Django service in a Flask shop, a NodeJS project in a Python shop, or a gRPC system instead of REST. They might handle data migrations differently from everything else.

They also often don't have a team that owns them. If one person built them and then that person leaves the company – whoops. So they might not have a real on-call rotation, dependency updates, or anyone watching their metrics.

One SRE I talked to put it this way:

Some of it is also what can feel like engineers creating a service out of spite - "the infra team won't give us <x> so i'm just gonna go off and write a Java service that does something", and then throwing it over the fence. Combine this with AWS console access and you have a lot of funtimes for the SRE team to go clean up things and put them into IaC.

How to keep prototypes from getting into production?

Or: How to tell when you need to rewrite something before putting it into production.

If you're an individual engineer asking yourself, "Should I rewrite this?", I think the answer is basically the inverse of "Could someone else on my team change it?" The full checklist probably looks something like:

  • Does it have tests?
  • Does it have a "normal" deployment process?
  • Is there anything weird or unique about it, and is that documented?
  • Are the values it needs to share with other systems provided by a config system?
  • How would we tell if it broke or wasn't working correctly?
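For that last question, the cheapest answer is usually an external smoke check that someone actually looks at. A minimal sketch – the `/healthz` path is an assumption; substitute whatever health endpoint the service really exposes:

```python
import urllib.request

def smoke_check(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the service's health endpoint answers with 200.

    Cheap enough to run from a cron job or an uptime checker. The point
    isn't sophistication; it's that *something* notices when the
    prototype stops working, instead of a customer.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

Even a check this crude moves a prototype from "we find out it's down when someone complains" to "we find out first."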

Keeping people across a company from doing this is harder, since it requires engineering leadership to set expectations and hold people accountable. I tend to like Charity Majors's concept of "The Golden Path" – basically, devs can put whatever they want into production, but if they want specialist support from the operations team, it needs to be built according to the organization's operability standards.

I think "how prototype-y is too prototype-y for production" probably depends a lot on your context. Something might be fine to ship on a tiny team, where everyone is very experienced and the dev team is doing its own ops, at a company that is still looking for product-market fit, but dangerous at a large multi-team organization with a separate operations team that's rapidly scaling both staff and customers.

One last trick – especially if you're building a prototype that has any kind of GUI – is to make prototypes deliberately, visibly janky. Don't add any CSS or visual polish. Especially if you're showing something to stakeholders outside the engineering team, it helps to communicate "this isn't finished" really clearly.

What have I missed?

This is a topic I expect to return to so if you have any thoughts please share 'em. You can send me an e-mail or leave a comment on the website.