20 messages
Brad McCoy almost 5 years ago
Hi everyone, I have two online events coming up that will be recorded, covering CKAD/CKA study and exam tips. Here are the links for those interested:
Matt Gowie almost 5 years ago
Not exactly a Kubernetes question, but I figured folks in this channel would know whether what I'm talking about exists. Does anyone know of a network/TCP proxy tool out there that will do a manage-and-forward pattern (my own made-up term for describing this) for long-lived TCP connections?
I have a client running on K8s, and one of their primary microservices holds long-lived TCP socket connections with many thousands of clients through an AWS NLB. The problem is that whenever we do a deployment and update those pods, the TCP connections require a reconnection, which causes problems on the client side. So to provide a better experience for the clients, we're looking at what we can do to keep those TCP connections alive. My first thought is a proxy layer that manages the socket connections with the clients and then forwards socket connections to the actual service pods. That way, even if the pods are swapped out behind the scenes, the original socket connection is still up and there are no adverse effects on the clients.
Shreyank Sharma almost 5 years ago
Hi All,
we have a 4-node Kubernetes cluster in production, deployed using kops on AWS:
3 worker nodes and one master
nodes are c4.2xlarge, 16 GB memory each
Along with other pods we have Elasticsearch deployed using Helm; there we have:
3 elasticsearch-data pods consuming 4000 MB of memory each
3 elasticsearch-master pods consuming 2600 MB of memory each
3 elasticsearch-client pods consuming 2600 MB of memory each
All are distributed among the nodes, but on one node one of the elasticsearch-data pods keeps restarting, about 2-3 times daily, always on that same node.
I described the restarted pod, which just says:
Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137
  Started:   Tue, 02 Mar 2021 20:31:07 +0530
  Finished:  Wed, 03 Mar 2021 17:46:02 +0530
and there are no events.
When I checked the syslog of the node on which the pod restarted, it shows:
C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
C2 CompilerThre cpuset=6126d0823d683f51d04603c4c6464c030464d3748c916c1a46621936846aac01 mems_allowed=0
CPU: 2 PID: 7743 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
..............
The version of Elasticsearch is 6.7.0.
Has anyone experienced the same issue? How can we solve these pod restarts?
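A minimal sketch of how to compare the container limit against the JVM heap on the affected pod (the pod name is an example, and ES_JAVA_OPTS is how the 6.x Elasticsearch images usually receive -Xms/-Xmx):
# memory limit Kubernetes enforces on the container (first container assumed)
kubectl get pod elasticsearch-data-0 -o jsonpath='{.spec.containers[0].resources.limits.memory}'
# heap options the Elasticsearch JVM was started with
kubectl exec elasticsearch-data-0 -- env | grep -i java_opts
If the heap plus the JVM's off-heap usage comes close to the 4000 MB limit, the OOMKilled/137 pattern above is the expected outcome.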
Shreyank Sharma almost 5 years ago
Hi All,
Under what conditions will a pod exceed its memory limit?
In my Kubernetes cluster I have Elasticsearch deployed using Helm; the Elasticsearch version is 6.7.0.
We have 3 elasticsearch-data pods, 2 elasticsearch-master pods, and 1 client.
The memory limit for the elasticsearch-data pods is 4 GB, but one of the data pods gets restarted about 5-6 times every day (OOM killed). When I checked the pod's memory and CPU usage in Grafana, I can see that one of the elasticsearch-data pods is using twice the memory limit (8 GB).
So I wanted to know under what conditions a pod will exceed its memory limit.
Also, in the syslog when the oom_kill happened:
C2 CompilerThre invoked oom-killer: gfp_mask=0x24000c0(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=901
[28621138.637578] C2 CompilerThre cpuset=441fa5603f64f86888937bc911269fca47dfcdb318648cc1ac0832cdfb07134d mems_allowed=0
[28621138.639850] CPU: 5 PID: 7749 Comm: C2 CompilerThre Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1
[28621138.641757] Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
[28621138.643152] 0000000000000000 ffffffff85335284 ffffa53882de7dd8 ffff8dda11dec040
.........
[28621138.662399] [<ffffffff85615f82>] ? schedule+0x32/0x80
[28621138.663485] [<ffffffff8561bc48>] ? async_page_fault+0x28/0x30
[28621138.669097] memory: usage 4096000kB, limit 4096000kB, failcnt 383494862
Here it shows around 4 GB,
but at the end:
[ pid ]   uid  tgid  total_vm     rss  nr_ptes  nr_pmds  swapents  oom_score_adj  name
[28621138.876368] [24691]    0  24691      256       1        4        2         0  -998  pause
[28621138.878201] [ 7436] 1000   7436  2141564  989996     2342       11         0   901  java
[28621138.879983] Memory cgroup out of memory: Kill process 7436 (java) score 1870 or sacrifice child
[28621138.881978] Killed process 7436 (java) total-vm:8566256kB, anon-rss:3941732kB, file-rss:18252kB, shmem-rss:0kB
here it is showing total-vm as 8 GB.
I am confused why it shows 4 GB in one place and 8 GB in another.
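A rough way to see what the kernel's 4 GB figure refers to, assuming cgroup v1 paths (which match the Debian 4.9 kernel in the log) and an example pod name:
# the limit and current usage the OOM killer compares; this counts resident memory and page cache, not virtual memory like total_vm
kubectl exec elasticsearch-data-0 -- cat /sys/fs/cgroup/memory/memory.limit_in_bytes
kubectl exec elasticsearch-data-0 -- cat /sys/fs/cgroup/memory/memory.usage_in_bytes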
Issif almost 5 years ago
I just released a kubectl plugin I developed at my day job, maybe someone will find it useful: https://github.com/qonto/kubectl-duplicate
btai almost 5 years ago (edited)
What's the recommendation for pod-level IAM roles? I know the ones used the most initially were kube2iam and kiam, but IIRC one or both of them had rate-limiting issues (which is why I avoided them initially). I know AWS came out with one here, and I was just curious what people are using nowadays, or whether there's a general consensus on the best one: https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/ Personally I'm using kops, so I'm curious if there are issues integrating with AWS IRSA.
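For context, the IRSA approach from that AWS post attaches the role at the service-account level rather than at the node, roughly like this (the role name and ARN are hypothetical, and on kops the OIDC provider and pod identity webhook would have to be set up separately):
kubectl annotate serviceaccount my-app eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/my-app-irsa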
Mr.Devops almost 5 years ago
Hi, does anyone have a recommended approach for injecting passwords into kops templates using cloud-init?
Fernanda Martins almost 5 years ago
Hello everyone, I am configuring federated Prometheus to monitor multiple clusters for the first time. Any tips on how to organize the operators, etc.? Thanks!
Azul almost 5 years ago
Anyone using EKS with Fargate profiles? I am on a project where we started using it, and we had to submit a support request to increase the maximum number of profiles in the cluster from 10 to 20. A Fargate profile maps to a Kubernetes namespace, so I'm essentially looking at 20 namespaces in these EKS clusters. That is, in my view, a fairly small number, and I expect it to grow as we add more apps onto the cluster. It may be worth mentioning at this point that by default I can launch about 1000 Fargate nodes on an EKS cluster, so EKS was designed to scale. Anyway, the docs list the Fargate profiles quota as one that can be raised through the console, but that's incorrect, so I raised a support request to do this. The feedback I received was that they would raise it with the Fargate service team as it was a fairly large increase. My thought here: is he serious? We're talking about 20 namespaces/Fargate profiles. What exactly is large about this request? A Google search didn't show any relevant posts about the number of Fargate profiles, so I thought of coming here to ask: who here is using Fargate on EKS, and how many namespaces are you using?
M Hunter almost 5 years ago
Hi. Can anyone recommend articles/videos on configuring k8s on an air-gapped system? The OS is CentOS 7. Thanks!
Eric Berg almost 5 years ago
Anybody have thoughts on where I should use .Release.Name vs .Chart.Name?
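A quick illustration of the difference, assuming a chart directory named myapp (both names below are placeholders):
# the release name is chosen at install time; the chart name comes from Chart.yaml
helm install prod-api ./myapp
# inside the templates, .Release.Name renders as "prod-api" and .Chart.Name as "myapp",
# so resource names that must be unique per install usually build on .Release.Name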
Shreyank Sharma almost 5 years ago
Hi All, we have an Elasticsearch cluster running in our Kubernetes cluster, deployed using a Helm chart, and fluentd is sending logs from each node.
We have 2 data nodes, 2 master nodes, and a client node, and since yesterday the data nodes have been in a not-ready state; because of that the client keeps getting restarted, as do fluentd and kibana:
elastichq-7cf55c6bbc-998pq              1/1   Running   0     1y
elasticsearch-client-5dbccbd776-7kpwk   1/1   Running   79    1d
elasticsearch-data-0                    0/1   Running   18    1d
elasticsearch-data-1                    0/1   Running   21    1d
elasticsearch-master-0                  1/1   Running   0     1d
elasticsearch-master-1                  1/1   Running   0     1d
fluentd-fluentd-elasticsearch-hhh8v     1/1   Running   147   1y
fluentd-fluentd-elasticsearch-ksfnx     1/1   Running   110   1y
fluentd-fluentd-elasticsearch-lnbll     1/1   Running   94    1y
kibana-b7768db9d-r57st                  1/1   Running   347   1y
logstash-0                              1/1   Running   6     1y
After describing the fluentd pod, I found:
Killing container with id <docker://fluentd-fluentd-elasticsearch>: Container failed liveness probe.. Container will be killed and recreated.
After referring to some links I found:
------------
Data nodes — store data and execute data-related operations such as search and aggregation
Master nodes — in charge of cluster-wide management and configuration actions such as adding and removing nodes
Client nodes — forward cluster requests to the master node and data-related requests to data nodes
-------------------------------------------------------
The kubectl get events output says the readiness probe failed for the elasticsearch-data pods (we increased the timeout values and recreated all pods again),
so I am assuming the client is failing because the elasticsearch-data pods are in a not-ready state. Also, in one of the data pods I can see:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to data/java_pid1.hprof ...
Unable to create data/java_pid1.hprof: File exists
The data pod's memory limit is 4 GB and the heap is 1.9 GB, which I think is fine.
Since the master node is responsible for adding and removing nodes, I went inside the master pod and ran:
curl localhost:9200/_cat/nodes
elasticsearch-client-5dbccbd776-7kpwk
* elasticsearch-master-1
elasticsearch-master-0
The data pods are not listed here. After checking the logs of the master node, I can see a lot of:
[INFO ][o.e.c.s.ClusterApplierService] [elasticsearch-master-0] removed {{elasticsearch-data-1}
so the master keeps adding and removing the data pods, and in the master-1 logs I can see:
org.elasticsearch.transport.NodeDisconnectedException:
We did a
helm upgrade <chartname> -f custom_valuefile.yaml --recreate-pods
which did not work.
Is there any workaround or solution for this behaviour? Thanks in advance.
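A few checks that might narrow down why the data pods never become ready, assuming the same pod names and the default port 9200 used above:
# overall cluster state and which nodes have actually joined
kubectl exec elasticsearch-master-0 -- curl -s 'localhost:9200/_cluster/health?pretty'
kubectl exec elasticsearch-master-0 -- curl -s 'localhost:9200/_cat/nodes?v'
# what the readiness probe checks and why it fails on the data pods
kubectl describe pod elasticsearch-data-0 | grep -A3 Readiness
kubectl get events --field-selector involvedObject.name=elasticsearch-data-0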
btai almost 5 years ago
I wasn't aware of this, but the ridiculously low allowed pod count on EKS (e.g. 29 pods on an m4.large) is tied specifically to the AWS VPC CNI. Apparently we can skirt around that issue by uninstalling the default CNI and installing a different one. Has anyone tried doing this? https://docs.projectcalico.org/getting-started/kubernetes/managed-public-cloud/eks
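For reference, the linked Calico doc describes the swap roughly as follows for a cluster with no existing workloads (the manifest URL may have changed since, and the kubelet max-pods setting on the nodes still has to be raised separately):
# remove the AWS VPC CNI and install Calico in VXLAN mode
kubectl delete daemonset -n kube-system aws-node
kubectl apply -f https://docs.projectcalico.org/manifests/calico-vxlan.yaml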
Padarn almost 5 years ago
We’re looking for a nice way to orchestrate performance tests in a k8s cluster, any suggestions?
An example scenario: we want to test the performance of using Redis vs using MinIO as an object cache. We would like to be able to easily set up, run the test, and tear down.
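One lightweight pattern, as a sketch: treat each run as an ephemeral namespace with the system under test installed via Helm and the load generator run as a Job (the chart choice and the load-generator image below are placeholders):
# setup
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install cache-under-test bitnami/redis -n perf --create-namespace
# run the test as a Job and wait for it to finish
kubectl create job -n perf cache-bench --image=example/load-generator:latest
kubectl wait -n perf --for=condition=complete job/cache-bench --timeout=30m
kubectl logs -n perf job/cache-bench
# teardown
helm uninstall -n perf cache-under-test
kubectl delete namespace perf
Swapping Redis for MinIO would only change the chart and the endpoint the load generator targets.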
Andrea almost 5 years ago
Hi, to anyone who's running Windows worker nodes, can you please share/suggest how to collect the pod logs? On the Linux nodes I've been fairly happy with fluent-bit (deployed as a Helm chart): fluent-bit collects the logs and sends them to Elasticsearch. I'm not having much luck with the same procedure on Windows though...
Christian almost 5 years ago
Do people generally use managed node groups now?