Should You Run Stateful Systems via Container Orchestration?

Erik Osterman | Cloud Architecture & Platforms, DevOps

Recently it was brought up that ThoughtWorks now says that:

We recommend caution in managing stateful systems via container orchestration platforms such as Kubernetes. Some databases are not built with native support for orchestration — they don’t expect a scheduler to kill and relocate them to a different host. Building a highly available service on top of such databases is not trivial, and we still recommend running them on bare metal hosts or a virtual machine (VM) rather than to force-fit them into a container orchestration platform

This is just more FUD that deserves to be cleared up. First, not all container management platforms are the same. I can only speak from experience about what this means for Kubernetes. Kubernetes is ideally suited to run these kinds of workloads when used properly.

NOTE: Just so we're clear: our recommendation for production-grade infrastructure is to always use a fully managed service like RDS, Kinesis, MSK, ElastiCache, etc. rather than self-hosting it, whether on Kubernetes or bare metal/VMs. Of course, that only works if these services meet your requirements.

To set the record straight, Kubernetes won't randomly kill Pods and relocate them to a different host if configured correctly. First, by setting requested resources equal to the limits, the pods get the Guaranteed QoS (Quality of Service) class, which makes them the last ones evicted when a node comes under resource pressure. Then, by setting a PodDisruptionBudget, we can be very explicit about what sort of “SLA” we want on our pods during voluntary disruptions such as node drains.
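As a rough illustration of the two settings above, here is a minimal Python sketch. It classifies a pod's QoS class from its container resources (simplified to cpu and memory, comparing quantities as plain strings) and builds a PodDisruptionBudget manifest as a dictionary. The function names and the `app` label are assumptions made for this example, not part of any Kubernetes client library.

```python
def qos_class(containers):
    """Approximate the Kubernetes QoS class for a pod (cpu and memory only).

    Each container is a dict with optional "requests" and "limits" dicts.
    Quantities are compared as strings, so "500m" and "0.5" count as different.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req, lim = requests.get(resource), limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # A missing request defaults to the limit, so only a missing
            # limit or a mismatched request rules out Guaranteed.
            if lim is None or (req is not None and req != lim):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def pod_disruption_budget(name, app_label, min_available):
    """Build a PodDisruptionBudget manifest (policy/v1) as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


database = [{
    "requests": {"cpu": "2", "memory": "4Gi"},
    "limits": {"cpu": "2", "memory": "4Gi"},
}]
print(qos_class(database))
print(pod_disruption_budget("db-pdb", "postgres", 2))
```

Keep in mind that a PodDisruptionBudget only constrains voluntary disruptions; it does nothing if the node itself fails.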

The other recommendation is to use the appropriate workload controller for the Pods. For databases, it's typically recommended to use StatefulSets (formerly called PetSets for a good reason!). With StatefulSets, we get the same kinds of lifecycle semantics as when working with discrete VMs. We get stable network identities, assurances that there won't ever be 2 concurrent pods (“Pets”) with the same name, etc. We've experienced first hand how some applications like Kafka hate it when their address changes. StatefulSets solve that by giving each pod a stable hostname.
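To make the naming guarantee concrete: each StatefulSet pod is named after the set plus an ordinal and, when paired with a headless Service, gets a predictable DNS name. The sketch below just computes those names. The helper name and the example values (`kafka`, `kafka-headless`, `data`) are made up for illustration, and `cluster.local` is assumed as the cluster domain.

```python
def statefulset_pod_dns(name, service, namespace, replicas,
                        cluster_domain="cluster.local"):
    """Return the stable DNS names for the pods of a StatefulSet.

    Pods are named <name>-<ordinal>, and a headless Service exposes each as
    <pod>.<service>.<namespace>.svc.<cluster domain>.
    """
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


for host in statefulset_pod_dns("kafka", "kafka-headless", "data", 3):
    print(host)
```

Clients that address brokers by these names keep working after a pod is rescheduled, because the name stays the same even if the underlying pod IP changes.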

If StatefulSets are not enough of a guarantee, we can provision dedicated node pools. These node pools can even run on bare metal to assuage even the staunchest critics of virtualization. Using taints and tolerations, we can ensure that the databases run exactly where we want them. There's no risk that the “spot instance” will randomly nuke the pod. Then, using affinity rules, we can ensure that the Kubernetes scheduler spreads the workloads as well as possible across different physical nodes.
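The way taints keep other workloads off a dedicated pool can be modeled in a few lines. This is a simplified sketch of the matching rules, not the scheduler's actual code. The `dedicated=database` taint is a hypothetical example, and `PreferNoSchedule` taints are ignored because they are only a soft preference.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An unset effect on the toleration matches any taint effect.
    effect_ok = toleration.get("effect") in (None, "", taint["effect"])
    key = toleration.get("key")
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with no key tolerates every taint.
        return effect_ok and (not key or key == taint["key"])
    return (
        effect_ok
        and key == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(pod_tolerations, node_taints):
    """A pod fits a node only if every hard taint on it is tolerated."""
    return all(
        any(tolerates(toleration, taint) for toleration in pod_tolerations)
        for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )


db_taint = {"key": "dedicated", "value": "database", "effect": "NoSchedule"}
db_toleration = {"key": "dedicated", "operator": "Equal",
                 "value": "database", "effect": "NoSchedule"}

print(can_schedule([db_toleration], [db_taint]))  # database pod
print(can_schedule([], [db_taint]))               # any other pod
```

A toleration only allows a pod onto the tainted nodes; to make sure the database pods land there and nowhere else, pair it with a node selector or node affinity rule for the same pool.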

Lastly, Kubernetes above all else is a framework for consistent cloud operations. It exposes all the primitives that developers need to codify the business logic required to operate even the most complex business applications. Contrast this to ThoughtWorks' recommendation of running applications on bare metal hosts or a virtual machine (VM) rather than to “force-fit” them into a container orchestration platform: when you “roll your own”, almost no organization possesses the in-house skill set to orchestrate and automate this system effectively. In fact, this kind of skill set used to be possessed only by technology companies like Google and Netflix. Kubernetes has leveled the playing field.

Using Kubernetes Operators, the business logic of how to operate a highly available legacy application or cloud-native application can be captured and codified. There's an ever-growing list of operators. Companies have popped up whose whole business model is around building robust operators to manage databases in Kubernetes. Because this business logic is captured in code, it can be iterated and improved upon. As companies encounter new edge cases, those can be addressed by the operator, so that everyone benefits. With the traditional “snowflake” approach, every company implements its own kind of Rube Goldberg apparatus, hard lessons learned are not shared, and we're back in the dark ages of cloud computing.
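At the heart of an operator is a reconcile loop: compare the desired state with what is observed, then decide what to do. The toy sketch below shows that idea for a replicated database. The `ClusterState` type and the action names are invented for illustration and do not correspond to any particular operator or SDK.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Observed state of a hypothetical replicated database."""
    replicas: int
    primary_healthy: bool


def reconcile(desired_replicas, observed):
    """Return the actions needed to move the observed state toward the desired one."""
    actions = []
    if not observed.primary_healthy:
        # Operational knowledge captured in code: fail over before scaling.
        actions.append("promote-replica")
    if observed.replicas < desired_replicas:
        actions.extend(["add-replica"] * (desired_replicas - observed.replicas))
    elif observed.replicas > desired_replicas:
        actions.extend(["remove-replica"] * (observed.replicas - desired_replicas))
    return actions


print(reconcile(3, ClusterState(replicas=2, primary_healthy=False)))
```

A real operator runs this kind of function every time the watched resources change, which is what lets fixes for newly discovered edge cases be shared through code.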

As with any tool, it's the operator's responsibility to know how to operate it. There are a lot of ways to blow one's leg off using Kubernetes. Kubernetes is a tool that, when used the right way, will unlock the superpowers your organization needs.

Fun Facts About the Kubernetes Ingress

Jeremy | DevOps

Here are some important things to know about the Kubernetes Ingress resource as implemented by ingress-nginx (which is one of the Ingress controllers we support at Cloud Posse).

  • An Ingress must send traffic to a Service in the same Namespace as the Ingress
  • An Ingress, if it uses a TLS secret, must use a Secret from the same Namespace as the Ingress
  • It is completely legal and supported to have multiple ingresses defined for the same host, and this is how you can have one host refer to 2 services in different namespaces
  • Multiple ingresses for the same host are mostly merged together as if they were one ingress, with some exceptions:
    • While the ingresses must refer to resources in their own namespaces, the multiple ingresses can be in different namespaces
    • Annotations that can be applied to only one ingress are applied only to that ingress
    • In case of conflicts, such as multiple TLS certificates or server-scoped annotations, the oldest rule wins
    • These rules are defined more rigorously in the documentation
  • Because of the way multiple ingresses are merged, you can have an ingress in one namespace that defines the TLS secret and external DNS name target and not have that defined at all in the other ingresses and yet they will all appear with the same TLS certificate
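The “oldest rule wins” behavior for conflicting TLS settings can be pictured with a small model. This is only a sketch of the rule as described above, not ingress-nginx's implementation. The dictionary fields (`hosts`, `tls_secret`, `created`) are assumptions for this example, with `created` holding ISO 8601 UTC timestamps so that string order equals time order.

```python
def tls_secret_for_host(ingresses, host):
    """Pick the TLS secret for a host from the oldest ingress that defines one.

    Returns (namespace, secret_name), or None if no ingress sets TLS for the host.
    """
    candidates = [
        ingress for ingress in ingresses
        if host in ingress["hosts"] and ingress.get("tls_secret")
    ]
    if not candidates:
        return None
    oldest = min(candidates, key=lambda ingress: ingress["created"])
    return (oldest["namespace"], oldest["tls_secret"])


ingresses = [
    {"namespace": "web", "hosts": ["example.com"],
     "tls_secret": "web-cert", "created": "2020-03-01T10:00:00Z"},
    {"namespace": "api", "hosts": ["example.com"],
     "tls_secret": None, "created": "2020-01-15T09:00:00Z"},
    {"namespace": "blog", "hosts": ["example.com"],
     "tls_secret": "blog-cert", "created": "2020-06-20T12:00:00Z"},
]
print(tls_secret_for_host(ingresses, "example.com"))
```

In this model the `api` ingress defines no TLS secret of its own, yet requests for its paths are still served with the certificate chosen for the shared host.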

The paths section of the Ingress deserves some special attention, too:

  • The interpretation of the path is implementation-dependent. GCE ingress treats the path as an exact match, while Nginx treats it as a prefix. Starting with Kubernetes 1.18, there is a pathType field that can be either Exact or Prefix, but the default remains implementation-dependent. Generally, helm charts appear to expect the path to be interpreted the way Nginx does.
  • The general rule is that the longest matching path wins, but it gets complicated with regular expressions (more below)
  • Prior to Kubernetes 1.18 (and maybe even then), there is no way for an Ingress to specify the native Nginx exact path match. The closest you can come is to use a regex match, but regex matches are case-insensitive. Furthermore, adding a regex path to an ingress makes all the paths of that ingress case-insensitive regexes, by default rooted as prefixes.
  • The catch here is that it is still the longest rule that wins, even over an exact match: /[abc]pi/ will take precedence over /api/
  • There is a simple explainer of priorities and gotchas with multiple path rules in the ingress-nginx documentation and a fuller explanation in this tutorial from Digital Ocean.
  • With Nginx, if a path ends in /, then it creates an implied 301 permanent redirect from the path without the trailing / unless that path is also defined. path: /api/ will cause https://host/api to redirect to https://host/api/. (Not sure if this applies to regex paths.)
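The “longest rule wins” behavior for regex paths is easier to see in a small model. The sketch below assumes every path on the host is treated as a case-insensitive regex anchored at the start of the request path, and that paths are tried in order of descending length. It mimics the behavior described above and is not ingress-nginx code.

```python
import re


def match_location(paths, request_path):
    """Return the configured path that would handle request_path, or None."""
    # Longest configured path first; the first one that matches wins.
    for path in sorted(paths, key=len, reverse=True):
        # re.match anchors at the start, so each path acts as a prefix.
        if re.match(path, request_path, re.IGNORECASE):
            return path
    return None


paths = ["/api/", "/[abc]pi/"]
print(match_location(paths, "/api/users"))
print(match_location(["/api/"], "/API/users"))
```

Because `/[abc]pi/` is the longer string, it is tried first and captures `/api/users`, even though `/api/` looks like the more specific rule to a human reader.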

Effortless Blue/Green Deployments on Kubernetes with Helm

admin | Meetup

Last night was our first ever Pasadena “DevOps Mastermind” meetup.

First speaker up was Dan Garfield. He talked about how to achieve Blue/Green deployments. Blue/Green has been around for a long time, but what are the “best practices” when using Kubernetes? How does it change when using Helm? Last night we learned the differences from Dan as he demonstrated how to pull it off effectively and repeatably using Codefresh. When using Helm, the picture changes slightly: keeping a release history so that rollbacks work properly is critical, and it requires structuring your Helm Chart accordingly. Check out the slides!

Dan Garfield is a Google Developer Expert, Chief Evangelist of Codefresh, and Kubernetes, Helm, Istio, and Docker meetup organizer. His talks have been featured at Kubecon, Swampup, DeveloperWeek, and many other places. He focuses on DevOps, and Deployment Strategies in a micro-service world.

Unlimited Staging Environments with Kubernetes

Erik Osterman | Meetup, Release Engineering & CI/CD

Last week we had the pleasure of listening to David Huie present at the DevOps Mastermind at WeWork Promenade. David is an infrastructure engineer at Dollar Shave Club, where he’s helping DSC shave the world using Kubernetes. He presented how they've achieved the Holy Grail of QA automation: running “Unlimited Staging Environments with Kubernetes.”


In modern micro-services architectures, there is a serious need for ad-hoc staging environments since it's often infeasible for developers to run the entire stack on their laptops. At the same time, static staging environments can be difficult to scale as an organization's infrastructure and engineering team grow.


To counter this effect, Dollar Shave Club created a Kubernetes-based system to enable an unlimited number of environments, bounded only by the capacity of the underlying Kubernetes cluster, which runs some 38 nodes! At its core is an open source project called Furan, which rapidly builds Docker containers in Docker (DnD). Using their CI/CD system and an in-house tool called Amino, they are then able to automatically spawn environments composed of many independent projects, where each project is pegged to a specific version (e.g. branch or tag).
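To picture what “an environment composed of many projects, each pegged to a version” might look like as data, here is a purely hypothetical sketch. It is not Amino's format or API. The `Environment` class, its fields, and the `env-` namespace prefix are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Environment:
    """A hypothetical ad-hoc staging environment definition."""
    name: str
    # Maps each project to the branch or tag it is pinned to.
    projects: dict = field(default_factory=dict)

    def namespace(self):
        """Derive a Kubernetes namespace name (lowercase DNS label, max 63 chars)."""
        slug = re.sub(r"[^a-z0-9-]+", "-", self.name.lower()).strip("-")
        return f"env-{slug}"[:63].rstrip("-")


env = Environment(
    name="Feature/New_Checkout",
    projects={"web": "feature/new-checkout", "api": "v1.4.2"},
)
print(env.namespace())
for project, version in env.projects.items():
    print(f"deploy {project} at {version}")
```

Giving each environment its own namespace is one common way to keep ad-hoc environments isolated on a shared cluster and easy to tear down.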


The company is now able to iterate much faster, which has sped up application delivery at DSC.

About the Speaker

Prior to joining Dollar Shave Club, David worked at Splice, NationBuilder, and Yelp. David has a degree in Computer Science from Harvey Mudd College.

Slides from the presentation are below. We'll be posting video & transcripts shortly.

Unlimited Staging Environments with Kubernetes

Join us at the next Santa Monica DevOps Mastermind Meetup!

Register here:

The Paradigm Shift

Erik Osterman | DevOps

Over the last year, we've seen yet another massive transformation in how software is delivered take hold. I will call this a “Paradigm Shift”: containers are replacing virtual machines as the fundamental unit of software delivery at an unprecedented rate.

Apparently, Moore’s Law applies to the rate of adoption of new technologies as much as it does to the density of transistors. The adoption rate of public cloud is twice what we saw with Virtual Machines, and now we're seeing the same thing with container adoption. Enterprises are an interesting species to study because they are the slowest to move and therefore a consistent barometer of change. Enterprises are learning to be more tolerant of change. This is an awesome trend.

What are the ingredients for a paradigm shift? Let’s begin by looking at a few examples.

The concept of “Virtual Machines” had been around since the 60s, but it took until the late 90s for the technology to catch up. It wasn’t until VMware came out with their “VMware Workstation” product in 1999 that the concept got popularized and we saw mass adoption. What did they do? They made it easy, first and foremost for developers to run multiple environments on their desktops. Then they conquered the enterprise with tools.

The other prime example is “Cloud Computing.” It was not a new concept; it’s just that no one had really cracked the nut to show us how to do it properly. That was until Amazon came along. With EC2 they made it accessible and showed us the possibilities; they let us write infrastructure as code. The possibilities blew our minds! So everyone tried to copy what Amazon did, but unfortunately, it was a little too late.

That's because now we have the container movement. The concept of “Containers” is also nothing new. In Linux, the core functionality has existed since 2008, when the cgroups work contributed by Google engineers landed in the Linux kernel and LXC was built on top of it. However, it wasn’t until Docker came along circa 2013 (5 years later!) and made it brain-dead easy for developers to run them that we started seeing an uptick in their adoption. Now Docker is taking a page out of VMware's playbook by following up with enterprise tools for production with the release of the Universal Control Plane (“UCP”) and the Docker Datacenter (“DDC”).

The secret?

  1. Make it easy.
  2. Target developers.
  3. Let it percolate throughout the enterprise until resistance is futile.

In the wake of all these transitions is some collateral damage. These are shims or training wheels we used to get from bare metal to containers. It's the result of the natural process of innovation.

  • A dozen or more hypervisor technologies like VMware, Xen, and KVM will lose massive market share.
  • Elaborate Configuration Management tools like Puppet and Chef that were created to address the broken ways we used to configure software (basically emulated what humans would do by hand) will no longer be needed because we don’t write software as broken anymore.
  • EC2 private-cloud knockoffs like OpenStack, vCloud, Eucalyptus, CloudStack, etc. that were designed to let you run your own Amazon-like private cloud on-prem are now overkill or at the very least passé (R.I.P.)

So why is the move to containers happening so quickly?

Hint: It’s not strictly technological.

First, we can agree that the second iteration is easier, better, and faster than the first anytime we iterate. Simply put, everything is less scary the second time around. Moving from the classic “bare metal” paradigm to a “virtualized” one was a massive endeavor. It was the “first” major paradigm shift of its kind. It took convincing C-level execs and wrangling Operations teams. Since it was a foreign concept, there was severe skepticism and pushback at all stages. Fast forward 15 years, and there’s now fresh blood at the top. There’s a new guard who has moved up through the ranks that’s more accepting of new technology. Enterprises have gotten better at accepting change. Moreover, the tools of the trade have improved. We’re better at writing software, and that software is more cloud friendly (aka “cloud native”).

Here are my predictions for what we'll see over the next few years.

  1. Containers will become first-class citizens, replacing VMs as the de facto unit of the cloud.
  2. If you still need a VM, that’s cool; you’ll have a couple of options:
    • Leverage a VM running inside a container. There's a project by Rancher called “VM Containers” which does exactly this. Sound absurd? Not to Google. They run their entire Public Cloud – VMs & all – on top of Borg.
    • Use Clear Containers by Intel which have minimal overhead, full machine-level isolation and can leverage the VT technology of modern CPU chipsets. Not to mention, it's fully Open Source!
    • The brave will attempt using some sort of Unikernel, but it’s still too early to know for sure if that will be the way to go.
  3. Interest behind OpenStack (et al) will wane, and innovation will cease – they were ahead of their time. We learned A LOT from the experience – both what worked well and what didn't. As a result, we'll see a significant brain drain, with key contributors moving over to the Kubernetes camp.
  4. Kubernetes will replace OpenStack as the platform du jour, and as a result we'll see a resurgence of bare metal in the Enterprise.
  5. Amazon’s ECS will be EOL’d and replaced with offerings of Kubernetes & Swarm.
  6. Kubernetes and Swarm will be battling it out for #1 because the competition is good.
  7. The best features of Mesos will be cherry-picked by both Kubernetes & Swarm, but Mesos will fail to gain a bigger foothold in the market.