Public “Office Hours” (2020-05-27)

Erik OstermanOffice Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-27.

We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CICD related.

These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.

Register here: cloudposse.com/office-hours

Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.

Public “Office Hours” 2020-05-27

Cloud Posse holds public

Should You Run Stateful Systems via Container Orchestration?

Erik OstermanDevOpsLeave a Comment

Recently it was brought up that ThoughtWorks now says that:

We recommend caution in managing stateful systems via container orchestration platforms such as Kubernetes. Some databases are not built with native support for orchestration — they don’t expect a scheduler to kill and relocate them to a different host. Building a highly available service on top of such databases is not trivial, and we still recommend running them on bare metal hosts or a virtual machine (VM) rather than to force-fit them into a container orchestration platform

https://www.thoughtworks.com/radar/techniques/managing-stateful-systems-via-container-orchestration

This is just more FUD that deserves to be cleared up. First, not all container management platforms are the same. I can only address from experience, what it means for Kubernetes. Kubernetes is ideally suited to run these kinds of workloads when used properly.

NOTE: Just so we're clear–our recommendation for production-grade infrastructure is to always use a fully-managed service like RDS, Kinesis, MSK, Elasticache, etc rather than self-hosting it, whether it be on Kubernetes or bare-metal/VMs. Of course, that only works if these services meet your requirements.

To set the record straight, Kubernetes won't randomly kill Pods and relocate them to a different host if configured correctly. First, by setting requested resources equal to the limits, the pods will have a Guaranteed QoS (Quality of Service) – the highest scheduling priority and be the last ones evicted. Then by setting a PodDisruptionBudget, we can be very explicit on what sort of “SLA” we want on our pods.

The other recommendation is to use the appropriate replication controller for the Pods. For databases, it's typically recommended to use StatefulSets (formerly called PetSets for a good reason!). With StatefulSets, we get the same kinds of lifecycle semantics when working with discrete VMs. We can get static IPs, assurances that there won't ever be 2 concurrent pods (“Pets”) with the same name, etc. We've experienced first hand how some applications like Kafka hate it when their IP changes. StatefulSets solve that.

If StatefulSets are not enough of a guarantee, we can provision dedicated node pools. These node pools can even run on bare-metal to assuage even the staunchest critics of virtualization. Using taints and tolerations, we can ensure that the databases on run exactly where want them. There's no risk that the “spot instance” will randomly nuke the pod. Then using affinity rules, we can ensure that the Kubernetes scheduler places the workloads as best as possible on different physical nodes.

Lastly, Kubernetes above all else is a framework for consistent cloud operations. It exposes all the primitives that developers need to codify the business logic required to operate even the most complex business applications. Contrast this to ThoughtWorks' recommendation of running applications on bare metal hosts or a virtual machine (VM) rather than to “force-fit” into a container orchestration platform: when you “roll your own”, almost no organization posses the in-house skillsets to orchestrate and automate this system effectively. In fact, this kind of skillset used to only be posses by technology like Google and Netflix. Kubernetes has leveled the playing field.

Using Kubernetes Operators, the business logic of how to operate a highly available legacy application or cloud-native application can be captured and codified. There's an ever-growing list of operators. Companies have popped up whose whole business model is around building robust operators to manage databases in Kubernetes. Because this business logic is captured in code, it can be iterated and improved upon. As companies encounter new edge-cases those can be addressed by the operator, so that everyone benefits. With the traditional “snowflake” approach where every company implements its own kind of Rube Goldberg apparatus. Hard lessons learned are not shared and we're back in the dark ages of cloud computing.

As with any tool, it's the operator's responsibility to know how to operate it. There are a lot of ways to blow one's leg off using Kubernetes. Kubernetes is a tool that when used the right way, will unlock the superpowers your organization needs.

Public “Office Hours” (2020-05-20)

Erik OstermanOffice Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-20.

We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CICD related.

These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.

Register here: cloudposse.com/office-hours

Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.

Public “Office Hours” 2020-05-20

Cloud Posse holds public

Public “Office Hours” (2020-05-13)

Erik OstermanOffice Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-13.

We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CICD related.

These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.

Register here: cloudposse.com/office-hours

Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.

Public “Office Hours” 2020-05-13

Cloud Posse holds public

Public “Office Hours” (2020-05-06)

Erik OstermanOffice Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-06.

We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CICD related.

These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.

Register here: cloudposse.com/office-hours

Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.

Public “Office Hours” 2020-05-06

Cloud Posse holds public