Public “Office Hours” (2020-05-27)

Erik Osterman | Office Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-27.

We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CI/CD related.

These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.

Register here: cloudposse.com/office-hours

Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.

Should You Run Stateful Systems via Container Orchestration?

Erik Osterman | DevOps

Recently it was brought up that ThoughtWorks now says that:

We recommend caution in managing stateful systems via container orchestration platforms such as Kubernetes. Some databases are not built with native support for orchestration — they don’t expect a scheduler to kill and relocate them to a different host. Building a highly available service on top of such databases is not trivial, and we still recommend running them on bare metal hosts or a virtual machine (VM) rather than to force-fit them into a container orchestration platform

https://www.thoughtworks.com/radar/techniques/managing-stateful-systems-via-container-orchestration

This is just more FUD that deserves to be cleared up. First, not all container management platforms are the same. I can only speak from experience about what this means for Kubernetes. Kubernetes is ideally suited to run these kinds of workloads when used properly.

NOTE: Just so we're clear: our recommendation for production-grade infrastructure is to always use a fully-managed service like RDS, Kinesis, MSK, ElastiCache, etc. rather than self-hosting it, whether it be on Kubernetes or bare-metal/VMs. Of course, that only works if these services meet your requirements.

To set the record straight, Kubernetes won't randomly kill Pods and relocate them to a different host if configured correctly. First, by setting each container's resource requests equal to its limits, the pods get the Guaranteed QoS (Quality of Service) class, which makes them the last ones evicted when a node runs short on resources. Then, by setting a PodDisruptionBudget, we can be very explicit about what sort of “SLA” we want for our pods during voluntary disruptions such as node drains.
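As a rough illustration of the two settings above, here is a minimal Python sketch. The `qos_class` function mirrors the documented QoS classification rules in simplified form, and `pod_disruption_budget` builds a manifest as a plain dict that could be dumped to JSON for `kubectl apply`. The function names and the `postgres` example values are assumptions made for illustration; they do not come from the original post.

```python
def qos_class(containers):
    """Classify a pod's QoS class from its containers' resource settings.

    Each container is a dict like {"requests": {...}, "limits": {...}}.
    Simplification: quantities are compared as strings, so "500m" and
    "0.5" would be treated as different values here.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit when a limit is set.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def pod_disruption_budget(name, match_labels, min_available):
    """Build a PodDisruptionBudget manifest as a plain dict."""
    return {
        # policy/v1 is the current API version; clusters from the era of
        # this post would have used policy/v1beta1.
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


# Hypothetical database container with requests equal to limits.
database = {
    "requests": {"cpu": "2", "memory": "8Gi"},
    "limits": {"cpu": "2", "memory": "8Gi"},
}
print(qos_class([database]))
print(pod_disruption_budget("postgres", {"app": "postgres"}, 2))
```

Note that a PodDisruptionBudget only constrains voluntary disruptions; it does not protect against a node failing outright.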

The other recommendation is to use the appropriate workload controller for the Pods. For databases, it's typically recommended to use StatefulSets (formerly called PetSets for a good reason!). With StatefulSets, we get the same kinds of lifecycle semantics as when working with discrete VMs. We get stable network identities, assurances that there won't ever be 2 concurrent pods (“Pets”) with the same name, etc. We've experienced first hand how some applications like Kafka hate it when their address changes. StatefulSets solve that by giving every pod a stable name and DNS record.
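To make the naming guarantee concrete: pods in a StatefulSet are named `<statefulset>-<ordinal>`, and when the StatefulSet is paired with a headless Service each pod gets a matching DNS record. The sketch below just computes those names. The `kafka`, `kafka-headless`, and `data` values are hypothetical, and `cluster.local` is the default cluster domain, which a given cluster may override.

```python
def statefulset_pod_dns_names(name, service, namespace, replicas,
                              cluster_domain="cluster.local"):
    """Return the stable DNS name of every pod in a StatefulSet.

    Assumes the StatefulSet is governed by a headless Service called
    `service` in the same namespace.
    """
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical 3-broker Kafka StatefulSet.
for dns_name in statefulset_pod_dns_names("kafka", "kafka-headless", "data", 3):
    print(dns_name)
```

Clients that address brokers by these names keep working when a pod is rescheduled, because the name stays the same even if the pod's IP does not.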

If StatefulSets are not enough of a guarantee, we can provision dedicated node pools. These node pools can even run on bare-metal to assuage even the staunchest critics of virtualization. Using taints and tolerations, we can ensure that the databases only run exactly where we want them. There's no risk that the “spot instance” will randomly nuke the pod. Then using affinity rules, we can ensure that the Kubernetes scheduler places the workloads as best as possible on different physical nodes.
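The taint and toleration matching that makes a dedicated node pool work can be sketched in a few lines of Python. This is a simplified model of the documented matching rules, not the scheduler's actual code, and the `dedicated=database` taint is a hypothetical example.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    effect = toleration.get("effect", "")
    # An empty effect on the toleration matches every effect.
    if effect and effect != taint["effect"]:
        return False
    if not key:
        # An empty key is only meaningful with Exists and matches any key.
        return operator == "Exists"
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod_tolerations, node_taints):
    """True if every hard taint on the node is tolerated by the pod.

    PreferNoSchedule taints are soft preferences, so they are ignored here.
    """
    return all(
        any(tolerates(toleration, taint) for toleration in pod_tolerations)
        for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )


# Hypothetical dedicated database node pool.
node_taints = [{"key": "dedicated", "value": "database", "effect": "NoSchedule"}]
db_tolerations = [{"key": "dedicated", "operator": "Equal",
                   "value": "database", "effect": "NoSchedule"}]

print(can_schedule(db_tolerations, node_taints))  # database pod
print(can_schedule([], node_taints))              # ordinary pod
```

Taints only keep other pods off the pool. To keep the database pods from landing elsewhere, the toleration is normally paired with a node selector or node affinity rule.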

Lastly, Kubernetes above all else is a framework for consistent cloud operations. It exposes all the primitives that developers need to codify the business logic required to operate even the most complex business applications. Contrast this with ThoughtWorks' recommendation of running applications on bare metal hosts or a virtual machine (VM) rather than to “force-fit” them into a container orchestration platform: when you “roll your own”, almost no organization possesses the in-house skillsets to orchestrate and automate this system effectively. In fact, this kind of skillset used to be possessed only by technology companies like Google and Netflix. Kubernetes has leveled the playing field.

Using Kubernetes Operators, the business logic of how to operate a highly available legacy application or cloud-native application can be captured and codified. There's an ever-growing list of operators. Companies have popped up whose whole business model is around building robust operators to manage databases in Kubernetes. Because this business logic is captured in code, it can be iterated and improved upon. As companies encounter new edge-cases, those can be addressed by the operator, so that everyone benefits. With the traditional “snowflake” approach, where every company implements its own kind of Rube Goldberg apparatus, hard lessons learned are not shared and we're back in the dark ages of cloud computing.
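At its core, an operator is a control loop that compares desired state with observed state and acts on the difference. The toy sketch below shows that shape only. The member names, the `healthy`/`failed` statuses, and the action tuples are all hypothetical and do not correspond to any real operator framework.

```python
def reconcile(desired_replicas, members):
    """Return the actions needed to move a database cluster toward its spec.

    `members` maps a member name to "healthy" or "failed". A real operator
    would observe this through the Kubernetes API and then carry out the
    actions; here we only compute them.
    """
    actions = []
    # Operational knowledge lives here: failed members are replaced first.
    for name in sorted(members):
        if members[name] == "failed":
            actions.append(("replace", name))
    # Then the cluster is grown until it reaches the desired size.
    for ordinal in range(len(members), desired_replicas):
        actions.append(("add", f"db-{ordinal}"))
    return actions


observed = {"db-0": "healthy", "db-1": "failed"}
print(reconcile(3, observed))
```

Because the loop is plain code, a newly discovered edge case becomes one more branch in `reconcile` that every user of the operator picks up on upgrade.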

As with any tool, it's the operator's responsibility to know how to operate it. There are a lot of ways to blow one's leg off using Kubernetes. Kubernetes is a tool that, when used the right way, will unlock the superpowers your organization needs.

Public “Office Hours” (2020-05-20)

Erik Osterman | Office Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-20.

Public “Office Hours” (2020-05-13)

Erik Osterman | Office Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-13.

Public “Office Hours” (2020-05-06)

Erik Osterman | Office Hours

Here's the recording from our DevOps “Office Hours” session on 2020-05-06.

Fun Facts About the Kubernetes Ingress

Jeremy | DevOps

Here are some important things to know about the Kubernetes Ingress resource as implemented by ingress-nginx (which is what we use at Cloud Posse).

  • An Ingress must send traffic to a Service in the same Namespace as the Ingress
  • An Ingress, if it uses a TLS secret, must use a Secret from the same Namespace as the Ingress
  • It is completely legal and supported to have multiple ingresses defined for the same host, and this is how you can have one host refer to 2 services in different namespaces
  • Multiple ingresses for the same host are mostly merged together as if they were one ingress, with some exceptions:
    • While the ingresses must refer to resources in their own namespaces, the multiple ingresses can be in different namespaces
    • Annotations that can be applied to only one ingress are applied only to that ingress
    • In case of conflicts, such as multiple TLS certificates or server-scoped annotations, the oldest rule wins
    • These rules are defined more rigorously in the documentation
  • Because of the way multiple ingresses are merged, you can have an ingress in one namespace that defines the TLS secret and external DNS name target and not have that defined at all in the other ingresses and yet they will all appear with the same TLS certificate
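The merge behavior described in the list above can be modeled with a short Python sketch. This is a simplified illustration of the "oldest rule wins" idea, not the controller's actual implementation. The dict layout (`namespace`, `created`, `host`, `tls_secret`, `paths`) and the example names are assumptions made for the example.

```python
def merge_ingresses(ingresses):
    """Merge Ingress-like dicts that share a host into one server per host.

    Ingresses are processed oldest first, so on a conflict (two TLS
    secrets, or the same path defined twice) the oldest definition wins.
    """
    servers = {}
    for ingress in sorted(ingresses, key=lambda item: item["created"]):
        server = servers.setdefault(ingress["host"], {"tls": None, "paths": {}})
        # Only the first (oldest) TLS secret seen for a host is kept.
        if ingress.get("tls_secret") and server["tls"] is None:
            server["tls"] = (ingress["namespace"], ingress["tls_secret"])
        for path, service in ingress["paths"].items():
            # Each backend stays in the namespace of the Ingress that named it.
            server["paths"].setdefault(path, (ingress["namespace"], service))
    return servers


# Hypothetical ingresses for one host, living in two namespaces.
ingresses = [
    {"namespace": "api", "created": "2020-05-02T00:00:00Z",
     "host": "example.com", "paths": {"/api": "backend"}},
    {"namespace": "web", "created": "2020-05-01T00:00:00Z",
     "host": "example.com", "tls_secret": "example-tls",
     "paths": {"/": "frontend"}},
]
print(merge_ingresses(ingresses))
```

In this model the `api` Ingress defines no TLS secret, yet its path is served under the certificate that the `web` Ingress declared for the same host.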

The paths section of the Ingress deserves some special attention, too:

  • The interpretation of the path is implementation-dependent. GCE ingress treats the path as an exact match, while Nginx treats it as a prefix. Starting with Kubernetes 1.18, there is a pathType field that can be either Exact or Prefix, but the default remains implementation-dependent. Generally, Helm charts appear to expect path to be interpreted the way Nginx does.
  • The general rule is that the longest matching path wins, but it gets complicated with regular expressions (more below)
  • Prior to Kubernetes 1.18 (and maybe even then), there was no way for an Ingress to specify the native Nginx exact path match. The closest you can come is to use a regex match, but regex matches are case-insensitive. Furthermore, adding a regex path to an ingress makes all the paths of that ingress case-insensitive regexes, by default rooted as prefixes.
  • The catch here is that it is still the longest rule that wins, even over an exact match: /[abc]pi/ will take precedence over /api/
  • There is a simple explainer of priorities and gotchas with multiple path rules in the ingress-nginx documentation and a fuller explanation in this tutorial from DigitalOcean.
  • With Nginx, if a path ends in /, then it creates an implied 301 permanent redirect from the path without the trailing / unless that path is also defined. path: /api/ will cause https://host/api to redirect to https://host/api/. (Not sure if this applies to regex paths.)
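The path rules above are easier to see with a small model. The sketch below assumes an ingress where regex matching is enabled, so every path is treated as a case-insensitive regex anchored at the start, and longer path strings are tried first. It is a simplified illustration of the gotchas, not Nginx's real location-matching algorithm, and the function names are made up for the example.

```python
import re


def pick_rule(paths, request_path):
    """Return the path rule that would win for a request, or None.

    Longer rules are tried before shorter ones, and each rule is matched
    as a case-insensitive regex rooted at the start of the request path.
    """
    for path in sorted(paths, key=len, reverse=True):
        if re.match(path, request_path, re.IGNORECASE):
            return path
    return None


def implied_redirect(paths, request_path):
    """Return the redirect target when only the trailing-slash path exists."""
    with_slash = request_path + "/"
    if request_path not in paths and with_slash in paths:
        return with_slash
    return None


rules = ["/api/", "/[abc]pi/"]
print(pick_rule(rules, "/api/users"))     # the longer regex rule wins
print(pick_rule(["/api/"], "/API/users")) # matching ignores case
print(implied_redirect(["/api/"], "/api"))
```

The first call shows the catch from the list: the rule `/[abc]pi/` is a longer string than `/api/`, so it takes the request even though `/api/` looks like the more exact match.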