#sre - February 2021 | Slack Archive


6 messages

Archive: https://archive.sweetops.com/monitoring/


Kareem, about 5 years ago

Has anyone had success trying out Datadog Real User Monitoring (RUM)? Considering it and just curious about anybody's experiences. Also open to alternatives for tracking user events and behavior, more so for troubleshooting client-side interactions rather than analytics.

Joan Porta, about 5 years ago

Hi guys! In k8s, I want to use the OpenTelemetry Collector to gather logs. The cluster runs multiple apps. Is it possible to avoid a sidecar OpenTelemetry agent in each app pod and rely on the DaemonSet alone? I don't want the extra overhead of adding a sidecar to every app pod.

Pierre Humberdroz, about 5 years ago (edited)

Does anyone know of a Helm chart for a good, up-to-date query exporter?
https://github.com/albertodonato/query-exporter
https://github.com/free/sql_exporter
https://github.com/justwatchcom/sql_exporter

Shtrull, about 5 years ago (edited)

HELP
I have the following Prometheus query (to reduce the results while testing, I limited it to a specific reader_id):
irate(nexite_reader_all_packets_per_channel_total{reader_id="10000"}[1m])

And here is a cleaned-up result (I have manually removed ns, container, svc, etc.):

{branch_id="3689", chain_id="3390", channel="37", instance="10.4.2.236:8188",  pod="my-super-pod-67cd5b7d5f-pn754", reader_id="10000" } = 108
{branch_id="3689", chain_id="3390", channel="37", instance="10.4.2.40:8188",  pod="my-super-pod-67cd5b7d5f-dwmkw", reader_id="10000" } = 163
{branch_id="3689", chain_id="3390", channel="38", instance="10.4.2.236:8188",  pod="my-super-pod-67cd5b7d5f-pn754", reader_id="10000" } = 77
{branch_id="3689", chain_id="3390", channel="38", instance="10.4.2.40:8188",  pod="my-super-pod-67cd5b7d5f-dwmkw", reader_id="10000" } = 121
{branch_id="3689", chain_id="3390", channel="39", instance="10.4.2.236:8188",  pod="my-super-pod-67cd5b7d5f-pn754", reader_id="10000" } = 86
{branch_id="3689", chain_id="3390", channel="39", instance="10.4.2.40:8188",  pod="my-super-pod-67cd5b7d5f-dwmkw", reader_id="10000" } = 131

Before k8s they had a single "pod", so sum by (branch_id) gave the right results. Now that the pods are dynamic and there is more than one, the results come out doubled.
Each pod also queries their database at a slightly different time, so the values are off by a bit.

Is there an elegant way to first run avg across both pods, and then run sum by branch_id?
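
A sketch of the "avg first, then sum" idea, not a confirmed answer from the thread. In PromQL, aggregations can be nested, so one candidate is the expression below. It assumes pod and instance are the only labels that differ between the duplicate series; any other per-pod label would have to be added to the without list.

sum by (branch_id) (avg without (pod, instance) (irate(nexite_reader_all_packets_per_channel_total{reader_id="10000"}[1m])))

The same arithmetic, replayed in Python on the six sample values above to show why a plain sum doubles the total. The function names avg_then_sum and naive_sum are made up for this illustration.

```python
from collections import defaultdict
from statistics import mean

# (branch_id, channel, pod, value) copied from the cleaned-up result above.
samples = [
    ("3689", "37", "my-super-pod-67cd5b7d5f-pn754", 108),
    ("3689", "37", "my-super-pod-67cd5b7d5f-dwmkw", 163),
    ("3689", "38", "my-super-pod-67cd5b7d5f-pn754", 77),
    ("3689", "38", "my-super-pod-67cd5b7d5f-dwmkw", 121),
    ("3689", "39", "my-super-pod-67cd5b7d5f-pn754", 86),
    ("3689", "39", "my-super-pod-67cd5b7d5f-dwmkw", 131),
]


def naive_sum(rows):
    """sum by (branch_id): every pod's value is counted, so totals double."""
    totals = defaultdict(float)
    for branch_id, _channel, _pod, value in rows:
        totals[branch_id] += value
    return dict(totals)


def avg_then_sum(rows):
    """avg without (pod, instance), then sum by (branch_id)."""
    # Step 1: collect the per-pod values for each (branch_id, channel) pair.
    per_channel = defaultdict(list)
    for branch_id, channel, _pod, value in rows:
        per_channel[(branch_id, channel)].append(value)
    # Step 2: average across pods, then add the channel averages per branch.
    totals = defaultdict(float)
    for (branch_id, _channel), values in per_channel.items():
        totals[branch_id] += mean(values)
    return dict(totals)


print(naive_sum(samples))     # {'3689': 686.0}
print(avg_then_sum(samples))  # {'3689': 343.0}
```

Averaging per channel before summing keeps the result stable when the number of pods changes, which a plain sum does not.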

Gareth, about 5 years ago

Good Afternoon all,
Can anybody recommend a way to filter wanted or unwanted lines out of logs on an EC2 instance in AWS when using the unified CloudWatch agent?
As far as I'm aware, there isn't a way to filter before ingestion into CloudWatch.

I believe AWS's recommendation is to filter the log into another log and then consume the filtered log. So, before I have to write something for CentOS and Windows, I wonder if anybody can recommend an app that could be used to transform / filter the logs?
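
For reference, the "filter into another log" approach described above can be sketched in a few lines of Python, which runs on both CentOS and Windows. This is only an illustration under stated assumptions: filter_lines and filter_file are hypothetical names, the regex patterns are placeholders, and the script does a single pass over the file rather than tailing it or handling log rotation.

```python
import re


def filter_lines(lines, include=None, exclude=None):
    """Yield lines that match `include` (if given) and do not match `exclude`."""
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    for line in lines:
        if inc and not inc.search(line):
            continue  # an include pattern was given and this line lacks it
        if exc and exc.search(line):
            continue  # this line matches the exclude pattern
        yield line


def filter_file(src_path, dst_path, include=None, exclude=None):
    """Append the filtered lines of src_path to dst_path (single pass)."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "a", encoding="utf-8") as dst:
        dst.writelines(filter_lines(src, include, exclude))


if __name__ == "__main__":
    demo = ["INFO started\n", "DEBUG heartbeat\n", "ERROR disk full\n"]
    # Drop DEBUG noise, keep everything else.
    print(list(filter_lines(demo, exclude=r"^DEBUG")))
```

The agent would then be pointed at the filtered file (the dst_path argument here) instead of the original log. A production version would also need to follow the source file as it grows and survive rotation.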

btai, about 5 years ago

I’ve always struggled w/ this, but my use case is kind of special, so I'm curious whether anyone has run into it. We have a ton of Kubernetes deployments in our production cluster (maybe 15-20k). We run deployments nightly, where thousands of deployments roll out new pods. When this happens I get a ton of alerts for replica pods going down & unavailable deployment replicas detected. I believe this is somewhat normal as the pods get rotated. I wish I didn’t have to resolve all the alerts, but at the same time I don’t want to disable alerting during deployment time either. Anyone have a good workaround for this? (I’m testing out Datadog currently)
