Deployment - Assess the risk


In an ideal scenario, deployments should be painless. However, some deployments are always going to be risky. You’ll see the upcoming change and have that knot in your gut. You’ll hear the voice saying “what about this and this and this and this?” I don’t think that entirely goes away. There’s always something that can go wrong, but ideally you’ll be prepared for it.

This section attempts to help you assess the risk of a deployment and hopefully start you on the path to identifying precautions.

What’s the reasonable worst case scenario?

You should try to identify what could potentially go wrong with the deployment. Yes, quantum mechanics says that any deployment could potentially drop your database, but if your change has zero possibility of executing a DROP or DELETE command, that isn’t a reasonable worst case. This process asks you to be familiar with not only your change, but also the project you’re working on. You may need to ask another developer for their advice if you’re new to the project.

If your deployment involves updating a large amount of data, a potential worst case scenario is that all the data is updated incorrectly. The next step is to identify what would need to happen to recover from that. Is it restoring from a backup? If so, then maybe your deployment should start with creating a backup.
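As a rough illustration of starting with a backup, here is a minimal sketch using Python’s built-in sqlite3 module. The customers table, the bulk_update_plans function, and the sanity check are hypothetical stand-ins; a real deployment would rely on whatever backup tooling your database provides.

```python
import sqlite3


def snapshot(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the whole database into a separate in-memory connection."""
    copy = sqlite3.connect(":memory:")
    conn.backup(copy)
    return copy


def restore(copy: sqlite3.Connection, conn: sqlite3.Connection) -> None:
    """Overwrite conn with the contents of the snapshot."""
    conn.commit()  # the target must not have an open transaction
    copy.backup(conn)


def bulk_update_plans(conn: sqlite3.Connection) -> None:
    """Hypothetical buggy change: the WHERE clause was forgotten."""
    conn.execute("UPDATE customers SET plan = 'free'")
    conn.commit()


def plans(conn: sqlite3.Connection) -> list:
    """Return every customer's plan, ordered by id."""
    return [row[0] for row in conn.execute("SELECT plan FROM customers ORDER BY id")]


# Hypothetical data set standing in for production.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, plan TEXT)")
db.executemany("INSERT INTO customers (plan) VALUES (?)", [("pro",), ("free",), ("pro",)])
db.commit()

backup = snapshot(db)      # step 1: back up before touching data
bulk_update_plans(db)      # step 2: run the risky change

# step 3: sanity check (assumed rule: paying customers must still exist)
if "pro" not in plans(db):
    restore(backup, db)    # step 4: recover from the worst case
```

The point is the ordering rather than the tooling: the recovery path exists before the risky step runs.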

Or maybe you’re changing your SECRET_KEY. A potential worst case scenario might be that you’d invalidate all your sessions or signed tokens, requiring every user to re-authenticate.
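To see why a key change has that effect, here is a simplified sketch of signing with a secret. It uses Python’s standard hmac module and is not how any particular framework implements signing; the sign and verify helpers and the key values are made up for illustration.

```python
import hashlib
import hmac


def sign(value: str, secret_key: str) -> str:
    """Append an HMAC-SHA256 signature derived from the secret key."""
    digest = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    return f"{value}:{digest}"


def verify(token: str, secret_key: str) -> bool:
    """Check that the signature matches the value under the given key."""
    value, _, digest = token.rpartition(":")
    expected = hmac.new(secret_key.encode(), value.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(digest, expected)


# Hypothetical keys and session value.
old_key = "old-secret"
new_key = "new-secret"
token = sign("session-for-user-42", old_key)

still_valid_before_change = verify(token, old_key)  # True
valid_after_change = verify(token, new_key)         # False: user must log in again
```

Every token issued under the old key fails verification at the same moment, which is why the impact lands on all users at once.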

This process asks you to think outside the box and contemplate how problems can occur. From there, you’ll need to see where each problem falls on the scale from mild inconvenience to catastrophe. The worse a problem is, the more you should test the deployment and prepare mitigations beforehand.

Where will the most likely errors occur?

A similar exercise is to identify the most likely error scenarios. This focuses on the probability of a problem rather than the scale of the problem. Perhaps you manage an application with hundreds of configuration options, resulting in tens of thousands of permutations. You may not have 100% test coverage of every configuration combination. If your change involves modifying a single option, it’s certainly possible that a customer is using an obscure configuration that breaks because of this change. The goal is to determine how likely that possibility is and how bad that problem could be.
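A small sketch shows how quickly combinations pile up and how to list the ones a single-option change touches. The option names, their values, and the set of tested combinations are invented for the example.

```python
from itertools import product
from math import prod

# Hypothetical configuration options and their allowed values.
options = {
    "cache": ["none", "redis", "memcached"],
    "auth": ["password", "sso", "ldap"],
    "storage": ["local", "s3"],
    "locale": ["en", "de", "fr", "ja"],
}

# Total combinations is the product of the number of choices per option.
total = prod(len(choices) for choices in options.values())  # 3 * 3 * 2 * 4 = 72

# Combinations assumed to be covered by the test suite.
tested = {
    ("redis", "password", "s3", "en"),
    ("none", "sso", "local", "de"),
}

# Suppose the change only modifies how auth="ldap" behaves.
auth_index = list(options).index("auth")
affected = [combo for combo in product(*options.values()) if combo[auth_index] == "ldap"]
untested_affected = [combo for combo in affected if combo not in tested]

print(f"{total} combinations, {len(affected)} affected, {len(untested_affected)} untested")
```

Four options already give 72 combinations; the useful number is the untested slice that your change actually touches.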

Depending on your answers, the result may be to move forward with the deployment as designed. Or perhaps you need to do some additional testing. Everything is a balancing act. Gather the information you can find and make the best decision possible right now.
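One way to make that balancing act explicit is to put severity and likelihood on small scales and combine them. The scales, thresholds, and recommendations below are illustrative assumptions, not a standard; tune them to your own project.

```python
from enum import IntEnum


class Severity(IntEnum):
    MILD_INCONVENIENCE = 1
    DEGRADED = 2
    CATASTROPHE = 3


class Likelihood(IntEnum):
    UNLIKELY = 1
    POSSIBLE = 2
    LIKELY = 3


def recommend(severity: Severity, likelihood: Likelihood) -> str:
    """Map a severity/likelihood pair to a suggested next step."""
    score = severity * likelihood  # assumed scoring: simple product, 1 to 9
    if score >= 6:
        return "test further and prepare mitigations before deploying"
    if score >= 3:
        return "do some additional testing"
    return "deploy as designed"


print(recommend(Severity.MILD_INCONVENIENCE, Likelihood.UNLIKELY))
print(recommend(Severity.CATASTROPHE, Likelihood.UNLIKELY))
print(recommend(Severity.CATASTROPHE, Likelihood.POSSIBLE))
```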

Does the change impact customers?

When assessing the risk of a deployment, you need to consider the impacted parties. The more important that party is, the higher the risk of the change. If your change is to a reporting tool that only you use, then it’s a pretty small risk. If your change is to the account creation flow for your application, that’s a major risk. You can’t acquire new customers if they can’t sign up.

Sometimes changes don’t impact customers, but they do impact coworkers or colleagues. Think about the extent of the problems you could potentially cause them and what their role is. You never want to cause frustration for a coworker, except for maybe Tim; he’s a bit sheisty. But it can and does happen. The more consideration you give up front, the more likely you are to test and evaluate your changes appropriately.