r/OpenTelemetry


I added dedicated AWS / EKS support to KubeShark.

Mini recap:

KubeShark is my Kubernetes skill for Claude Code and Codex.

It helps AI agents generate, review, and refactor Kubernetes manifests without falling into the usual LLM traps: missing security contexts, deprecated API versions, broken selectors, wildcard RBAC, unsafe probes, missing resource requests, and rollout configs that look okay but fail under real traffic.

The important part is that KubeShark is failure-mode-first. It does not just tell the model “write good Kubernetes”. It forces the model to reason about what can go wrong before it generates YAML, and then return validation and rollback guidance as part of the answer.

That matters a lot with Kubernetes, because many bad manifests are accepted by the API server and only fail later at runtime.

Repo: https://github.com/LukasNiessen/kubernetes-skill

---

Now what’s new:

KubeShark now has dedicated AWS / EKS support.

When the task involves EKS, AWS, IRSA, EKS Pod Identity, AWS Load Balancer Controller, EBS/EFS CSI, AWS VPC CNI, or Karpenter, KubeShark switches into EKS-aware guidance.

That matters because EKS is “just Kubernetes” until identity, load balancing, storage, pod networking, and node provisioning enter the picture.

Common LLM mistakes include:

  • putting AWS access keys into Kubernetes Secrets
  • mixing IRSA and EKS Pod Identity assumptions
  • using nginx annotations with AWS Load Balancer Controller
  • treating EBS like ReadWriteMany storage
  • recommending Karpenter while omitting resource requests
  • assuming NetworkPolicy works without checking the CNI/policy engine

Example guidance KubeShark now keeps in mind:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payments-app

It also knows that EBS is usually RWO and zone-sensitive, EFS is the RWX option, and Karpenter depends heavily on good workload requests.
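To make the EBS/EFS point concrete, here is a minimal Go sketch of what one such review rule could look like. It is only an illustration, not KubeShark's actual logic: the function `checkAccessMode` and its shape are hypothetical, and the only real-world names used are the CSI driver names `ebs.csi.aws.com` and `efs.csi.aws.com`.

```go
package main

import "fmt"

// Provisioner names of the AWS EBS and EFS CSI drivers.
const (
	ebsProvisioner = "ebs.csi.aws.com"
	efsProvisioner = "efs.csi.aws.com"
)

// checkAccessMode is a hypothetical review rule (an assumption for
// illustration): it flags a volume claim that asks for ReadWriteMany
// on an EBS-backed storage class, since EBS is usually ReadWriteOnce.
func checkAccessMode(provisioner, accessMode string) error {
	if provisioner == ebsProvisioner && accessMode == "ReadWriteMany" {
		return fmt.Errorf("%s volumes are usually ReadWriteOnce; consider %s for ReadWriteMany",
			ebsProvisioner, efsProvisioner)
	}
	return nil
}

func main() {
	// An EBS-backed claim asking for RWX is flagged.
	fmt.Println(checkAccessMode(ebsProvisioner, "ReadWriteMany"))
	// The same request against EFS passes and prints <nil>.
	fmt.Println(checkAccessMode(efsProvisioner, "ReadWriteMany"))
}
```

A real check would also need to look at zone placement, which this sketch ignores.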

So instead of generic Kubernetes advice, you get EKS-aware manifest generation and review.

u/trolleid — 4 days ago

Cleanup SQL query

I have a Go app that queries a database and is instrumented with OTel.
I want to clean up the query as recorded in telemetry, without changing the code.

The Go code (screenshot below) produces this value:
"\n\t\tSELECT p.id, p.name, p.description, p.picture, \n\t\t p.price_currency_code, p.price_units, p.price_nanos, p.categories\n\t\tFROM catalog.products p\n\t\tWHERE p.id = $1\n\t"

This SQL query is recorded as a span attribute "db.query.text".

Q: How can I collapse each run of escaped whitespace into a single space, in the Collector or elsewhere?
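For reference, the normalization being asked for can be sketched in Go as below. This only illustrates the target behavior; it is not a Collector-side fix, and `normalizeQuery` is a hypothetical helper name. On the Collector side, the transform processor's OTTL `replace_pattern` function with a `\s+` pattern looks like the closest equivalent, though that is an untested assumption and it would likely leave a leading and a trailing space.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeQuery collapses every run of whitespace (spaces, tabs,
// newlines) into a single space and trims both ends.
func normalizeQuery(q string) string {
	// strings.Fields splits on any run of whitespace and drops empty
	// fields, so joining with " " gives the collapsed form.
	return strings.Join(strings.Fields(q), " ")
}

func main() {
	// A shortened version of the query text from the post.
	raw := "\n\t\tSELECT p.id, p.name\n\t\tFROM catalog.products p\n\t\tWHERE p.id = $1\n\t"
	fmt.Println(normalizeQuery(raw))
	// prints: SELECT p.id, p.name FROM catalog.products p WHERE p.id = $1
}
```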

[Screenshot: Go code]

u/Ordinary_Squirrel291 — 7 days ago

I built a repo of ready-to-run OpenTelemetry Collector configs (Prometheus, Jaeger, Dynatrace, Datadog, Loki, k8s), feedback welcome

I just open-sourced a collection of ready-to-run OpenTelemetry Collector configurations, because finding complete, working configs for your specific backend always takes hours of trial and error.

It now includes examples for:

  • Prometheus
  • Jaeger
  • Grafana Loki
  • Dynatrace
  • Datadog
  • Kubernetes Operator
  • Kubernetes Pod Annotation Scraping (with full relabeling)
  • Debug (no backend needed, perfect for local dev)

Each example includes Docker Compose so you can run it in 60 seconds.

The k8s pod annotation scraping example includes relabeling for the prometheus.io/scrape, prometheus.io/port, and prometheus.io/path annotations, the config everyone googles when setting up k8s monitoring.
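As a rough illustration of what that relabeling achieves, the Go sketch below derives a scrape URL from a pod's annotations. It is not taken from the repo: `scrapeTarget` is a hypothetical helper, the `/metrics` default follows the common convention, and the sketch assumes the port annotation is set (real relabeling rules usually fall back to the discovered container port).

```go
package main

import (
	"fmt"
	"net"
)

// scrapeTarget mimics the outcome of annotation-based relabeling:
// it returns the URL to scrape and whether the pod opted in.
func scrapeTarget(podIP string, annotations map[string]string) (string, bool) {
	// Only pods annotated with prometheus.io/scrape: "true" are kept.
	if annotations["prometheus.io/scrape"] != "true" {
		return "", false
	}
	// Assumption: the port annotation is required in this sketch.
	port := annotations["prometheus.io/port"]
	if port == "" {
		return "", false
	}
	// The path annotation is optional and defaults to /metrics.
	path := annotations["prometheus.io/path"]
	if path == "" {
		path = "/metrics"
	}
	return "http://" + net.JoinHostPort(podIP, port) + path, true
}

func main() {
	url, ok := scrapeTarget("10.0.0.7", map[string]string{
		"prometheus.io/scrape": "true",
		"prometheus.io/port":   "9102",
	})
	fmt.Println(url, ok) // prints: http://10.0.0.7:9102/metrics true
}
```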

I also actively contribute to the OpenTelemetry open source project: I recently got PRs merged into open-telemetry/otel-arrow and have PRs open in opentelemetry-android, opentelemetry-helm-charts, and opentelemetry-dotnet-instrumentation.

https://github.com/Cloud-Architect-Emma/opentelemetry-collector-examples

Feedback and contributions welcome! ⭐ if it's useful.

#OpenTelemetry #DevOps #Observability #Kubernetes #SRE #Monitoring #CloudNative #OpenSource

u/EmmaOpu — 11 days ago

Decomposing OpenTelemetry Collector Configuration for Maintainability | OllyGarden Blog

This is one trick that surprises most people when I tell them about it: "the Collector can do this?"

This one took a while to write. The idea came during OTel Night here in Berlin, when I noticed that decomposing the config helps not only with keeping your sanity but also with testing small chunks in isolation.

ollygarden.com
u/jpkroehling — 10 days ago

How to convert Prometheus Remote Write metrics from Kafka into OTEL semantic conventions?

I’m trying to get OpenShift metrics into OTEL semantic conventions while keeping an OTel Collector after Kafka.

My understanding is that if Prometheus Remote Write data is received directly by the OTel Prometheus Remote Write receiver and exported as OTLP, the metrics are converted into OTEL metric format/semantic conventions where applicable.

However, our current pipeline is:

OpenShift Prometheus Remote Write -> Metricbeat -> Kafka -> OTel Kafka Receiver -> OTLP Exporter

The problem is that I don’t think the OTel Kafka receiver can decode Prometheus Remote Write payloads the same way the Prometheus Remote Write receiver does.

Has anyone implemented this architecture successfully with Kafka in the middle?

Specifically:
- Can the Kafka receiver process Prometheus Remote Write payloads correctly?
- Is there a way to preserve/convert to OTEL semantic conventions after Kafka?
- Should the data be converted to OTLP before it reaches Kafka instead?

TL;DR:
How do you convert Prometheus Remote Write metrics coming from Kafka into proper OTEL metrics/semantic conventions using an OTel Collector after Kafka?

u/13hyperdragoons — 11 days ago

CNCF TOC votes in favor of OTel Graduation

The CNCF technical oversight committee has voted to approve the OTel due diligence document.

This is one of the final steps towards graduation: the thorough due diligence, which included interviews with end users and resolution of the recommendations given in previous steps, has been finished and approved by the TOC 🎉

github.com
u/jpkroehling — 12 days ago