Advice for Companies Redoing Their Environments

DevOps

erik

Are you planning to redo your environments (Dev, QA, Integration, Stage, and Prod) on AWS? Maybe you're looking for advice or good articles on best practices for quality control and DevOps. Maybe you have a small team, so whatever you do has to be easy; going multi-cloud, for example, is the last thing on your mind. Well, you're in luck! We're here to help you figure it out.

The great thing is that today even small teams can manage an enormous amount of infrastructure with modern tools and techniques. You just need to know where to start. Just as in business, much of the secret to success is automation. But it's very hard to get all of this "right", or even close to it, if you haven't done it before. If you've tried Googling for answers to these questions, you probably found nothing practical enough and were quickly overwhelmed by the amount of information and conflicting advice. One popular book on the subject is The DevOps Handbook, which details some of the things Etsy has done. That's great, but if your team is small by comparison, it's probably overkill for what you need at this stage of the game.

Let's assume that most of your tech stack is written in languages like PHP, Ruby, Python, and Node.js, with frameworks like Rails, Django, and React. You might have a few single-page applications (SPAs), but the rest runs in Docker, and you're familiar with Docker Compose.

Here are some things you should consider.

How many environments will you need? (E.g. dev, qa, stage, and prod)

For the most part, each of these should be nothing but a name. Sure, there can be a few "pet" environments, but most of them are cattle, and you should be able to have any number of them. What we really like is "preview environments", where any branch or PR brings up a complete environment (typically in a new namespace on Kubernetes). The trick is how to do this for microservices or "distributed monoliths".
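
One small piece of the preview-environment idea can be sketched in code: turning a branch name into a namespace name that Kubernetes will accept. This is a minimal sketch, not a prescribed implementation. The function name `preview_namespace` and the `preview` prefix are assumptions made for illustration; the 63-character limit and the allowed character set come from the DNS label rules that Kubernetes applies to namespace names.

```python
import hashlib
import re

MAX_LABEL_LENGTH = 63  # namespace names must be valid DNS labels


def preview_namespace(branch: str, prefix: str = "preview") -> str:
    """Turn a Git branch name into a valid Kubernetes namespace name.

    Hypothetical helper; `prefix` is assumed to already be a valid label.
    """
    # Lowercase, then collapse every run of disallowed characters into one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    name = f"{prefix}-{slug}" if slug else prefix
    if len(name) > MAX_LABEL_LENGTH:
        # Truncate and append a short hash of the full branch name so that
        # two long branches sharing a common prefix do not collide.
        digest = hashlib.sha256(branch.encode("utf-8")).hexdigest()[:8]
        name = name[: MAX_LABEL_LENGTH - 9].rstrip("-") + "-" + digest
    return name


if __name__ == "__main__":
    print(preview_namespace("feature/ABC-123_add-login"))
```

A CI job could pass the result to whatever tool creates the namespace, so every branch gets its own disposable copy of the stack.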

What kind of test database should you use? (e.g. local or shared)

Since testing includes running migrations, the best approach in the long run is local databases. Either use fixtures to populate the data or use an anonymized/randomized dataset; never use real email addresses, and avoid data dumps from production that contain any PII/CHD/PHI/etc. Where fixtures don't get you far enough, what we've often done is run a nightly pipeline that generates a database container image with a working dataset. This container is then run in local Docker Compose setups.
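
To make the "never use real email addresses" rule concrete, here is a minimal sketch of one way a nightly pipeline might scrub emails before baking a dataset into an image. The function names, the `salt` parameter, and the row shape (dicts with an `email` key) are assumptions for illustration. Hashing keeps the mapping stable, so the same user gets the same fake address in every table, and `example.com` is a domain reserved for documentation and testing.

```python
import hashlib


def anonymize_email(email: str, salt: str = "change-me") -> str:
    """Replace a real address with a stable fake one on a reserved domain."""
    normalized = email.strip().lower()
    # A salted hash keeps the mapping deterministic without exposing the input.
    digest = hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()[:12]
    return f"user-{digest}@example.com"


def anonymize_rows(rows, salt: str = "change-me"):
    """Return copies of the rows with the 'email' field anonymized."""
    return [{**row, "email": anonymize_email(row["email"], salt)} for row in rows]


if __name__ == "__main__":
    sample = [{"id": 1, "email": "Jane.Doe@Corp.test"}]
    print(anonymize_rows(sample))
```

Other sensitive columns (names, phone numbers, card data) would need their own treatment; this only covers the email case called out above.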

How should software be promoted/graduate between environments and what should your approval strategy be?

In modern CI/CD systems (e.g. Codefresh), you can add an approval step (with ACLs) between any two steps. The first approval step is merging into a branch (à la GitHub branch protections), but what happens after that? It's nice to have a pipeline that visually depicts where a deployment stands across multiple stages, with each stage as a column and an approval step at the top of each stage column.
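
The stage-by-stage approval idea can be modeled in a few lines. This is only a sketch of the logic, not how Codefresh or any other CI/CD system implements it. The stage names, the `Release` class, and the ACL shape (a dict mapping each target stage to a set of allowed approvers) are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

STAGES = ["dev", "qa", "stage", "prod"]  # assumed stage names, in promotion order


@dataclass
class Release:
    version: str
    stage: str = STAGES[0]
    approvals: dict = field(default_factory=dict)  # target stage -> approver


def next_stage(release: Release) -> Optional[str]:
    """Return the stage after the current one, or None at the end."""
    index = STAGES.index(release.stage)
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def approve(release: Release, approver: str, acl: dict) -> None:
    """Record an approval for the next stage if the ACL allows this approver."""
    target = next_stage(release)
    if target is None:
        raise ValueError("release is already in the final stage")
    if approver not in acl.get(target, set()):
        raise PermissionError(f"{approver} may not approve promotion to {target}")
    release.approvals[target] = approver


def promote(release: Release) -> str:
    """Move the release one stage forward, but only past an approved gate."""
    target = next_stage(release)
    if target is None:
        raise ValueError("release is already in the final stage")
    if target not in release.approvals:
        raise PermissionError(f"promotion to {target} has not been approved")
    release.stage = target
    return target
```

Each promotion moves exactly one column to the right, and each column has its own gate, which mirrors the visual pipeline described above.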

What should happen when tests break?

This is pretty open-ended, but the first thing that comes to mind is "branch protections". Ensure that branches cannot be merged until all essential tests pass.
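
The rule behind branch protections fits in one function. This is an illustrative sketch of the decision, not the API of GitHub or any other provider; the check names and the `"success"` status string are assumptions. Note that a required check with no result at all blocks the merge just like a failed one.

```python
def can_merge(required_checks, results):
    """Decide whether a branch may merge.

    required_checks: iterable of check names that must pass (assumed names).
    results: dict mapping check name -> status string, e.g. "success".
    Returns (ok, blocking), where blocking lists the checks preventing a merge.
    """
    # A check that is missing from `results` counts as not passed.
    blocking = sorted(
        check for check in required_checks if results.get(check) != "success"
    )
    return (not blocking, blocking)


if __name__ == "__main__":
    required = {"unit-tests", "lint"}
    print(can_merge(required, {"unit-tests": "success", "e2e": "failure"}))
```

Non-essential checks (here `e2e`) can fail without blocking, which is what lets you keep slow or flaky suites advisory while the essential ones stay mandatory.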

How will you handle rollbacks?

Automated rollbacks come up again under continuous deployment strategies, but this is a broad topic. One thing that is often ignored: what do you do about database migrations?

Will you need to mock API services for testing? How will you handle integrating 3rd party API and services?

What are the general continuous deployment strategies?

The first goal is to get continuous deployment (CD) working to some environment, but not prod. CD to prod comes after you have complete confidence in your automated testing and monitoring. You can get there faster by introducing things like feature flagging (e.g. LaunchDarkly) and more advanced deployment strategies (canary or blue/green). Check out harness.io for this. But to be able to do any kind of automated rollback, you'll also need metrics and monitoring (e.g. server errors). The other important consideration is ensuring backward compatibility of APIs and schemas between any two adjacent deployments/releases. For example, doing a canary or rolling update means that for some period of time you'll have multiple versions of the same app online.
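
To show why automated rollbacks depend on metrics, here is a minimal sketch of a canary check driven by server error counts. It is not how harness.io or any specific tool decides; the function names and all thresholds (`min_requests`, `max_absolute_rate`, `max_ratio`) are assumed values that you would tune for your own traffic.

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that ended in a server error."""
    return errors / requests if requests else 0.0


def should_rollback(
    baseline_errors: int,
    baseline_requests: int,
    canary_errors: int,
    canary_requests: int,
    min_requests: int = 100,          # assumed: wait for this much canary traffic
    max_absolute_rate: float = 0.05,  # assumed: hard ceiling on canary error rate
    max_ratio: float = 2.0,           # assumed: allowed canary/baseline multiple
) -> bool:
    """Return True when the canary looks clearly worse than the baseline."""
    if canary_requests < min_requests:
        return False  # not enough data to judge yet
    canary = error_rate(canary_errors, canary_requests)
    baseline = error_rate(baseline_errors, baseline_requests)
    if canary > max_absolute_rate:
        return True
    # With a zero baseline rate only the absolute ceiling above applies.
    return baseline > 0 and canary > baseline * max_ratio


if __name__ == "__main__":
    print(should_rollback(100, 10_000, 30, 1_000))
```

While this check runs, the old and new versions serve traffic side by side, which is exactly the window in which API and schema backward compatibility matters.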

What is your version control branching strategy (e.g. Git Flow or GitHub Flow)?

What is your strategy for managing secrets?

I'm happy to jump on a call anytime to answer all of these and more; I always enjoy talking shop. Without a thread for each of these questions, it's hard to start a discussion around them here. https://calendly.com/cloudposse (probably an hour)

What changes will we need to make to our applications?

Here's our practical 12-factor checklist: /12-factor-app/

What are some other considerations?

12-factorization of your apps
Exception tracking (e.g. Sentry)
Log aggregation (e.g. Sumo Logic, Splunk, or EFK)
Feature flagging
Securing dev/QA/staging environments
Remote access management (e.g. how to get out of the business of managing SSH keys?)
Automated updates of dependencies (e.g. Dependabot)

If this sounds interesting, I welcome you to book a time on my scheduling page. Please take our quiz and then follow the instructions to add an event to my calendar.
