OpenTelemetry
A vendor-neutral standard for generating metrics, logs, and distributed traces.
OpenTelemetry (OTel) is a vendor-neutral standard and toolkit for producing all three signals (metrics, logs, and traces) with one set of APIs and SDKs. You instrument once and export to any compatible backend (for example Prometheus for metrics, Jaeger for traces, or a vendor such as Datadog), avoiding lock-in.
Its standout feature is distributed tracing: a trace follows one request across services as a tree of spans, and each service passes the trace context (trace ID and parent span ID) to the next one, typically in HTTP headers such as the W3C Trace Context traceparent header. This shows where time goes in a microservice call chain. The OTel Collector is a separate component that receives, processes, and exports telemetry centrally.
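To make the propagation step concrete, here is a minimal standard-library sketch of writing and reading a W3C Trace Context traceparent header. The helper names (`inject`, `extract`, `new_span_id`) are hypothetical and chosen for illustration; they are not the OpenTelemetry SDK API, which handles this through its own propagators.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags (lowercase hex).
TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_span_id() -> str:
    """Return a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> None:
    """Write the current trace context into outgoing HTTP headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers: dict):
    """Return (trace_id, parent_span_id) from incoming headers, or None if absent or malformed."""
    match = TRACEPARENT_RE.fullmatch(headers.get("traceparent", ""))
    if match is None:
        return None
    _version, trace_id, parent_span_id, _flags = match.groups()
    return trace_id, parent_span_id


# Caller side (say, api-gateway): start a trace and attach context to the request.
trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
gateway_span_id = new_span_id()
outgoing_headers = {}
inject(outgoing_headers, trace_id, gateway_span_id)

# Callee side (say, orders-service): continue the same trace under a new span.
child_trace_id, parent_span_id = extract(outgoing_headers)
child_span_id = new_span_id()
```

The callee keeps the trace ID unchanged and records the caller's span ID as its parent, which is what lets a backend stitch spans from different services into one tree.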
A trace across services (one user request):
Trace abc123
├─ span: api-gateway 12ms
│ └─ span: auth-service 4ms
└─ span: orders-service 85ms ← the slow span
   └─ span: db query 80ms ← root of the latency
Pipeline:
App (OTel SDK) → OTel Collector → backend (Jaeger / Prometheus / vendor)
- Define a trace and a span and how they relate.
- Explain how trace context propagates across services.
- Describe what problem distributed tracing solves that metrics alone can’t.
- What is the OTel Collector for?
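As a rough illustration of how a trace and its spans relate, the example trace above can be modelled as a small tree and searched for the span that accounts for the latency. This is a sketch under assumptions: the `Span` class and `walk` helper are hypothetical (not OTel types), the db query is taken to be a child of orders-service, and child spans are assumed to run one after another without overlapping.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    duration_ms: int
    children: list = field(default_factory=list)

    def self_time_ms(self) -> int:
        """Time spent in this span itself, excluding its child spans."""
        return self.duration_ms - sum(c.duration_ms for c in self.children)


def walk(span):
    """Yield a span and all of its descendants, depth first."""
    yield span
    for child in span.children:
        yield from walk(child)


# The example trace, with durations copied from the diagram.
trace_abc123 = [
    Span("api-gateway", 12, [Span("auth-service", 4)]),
    Span("orders-service", 85, [Span("db query", 80)]),
]

all_spans = [s for root in trace_abc123 for s in walk(root)]
slowest = max(all_spans, key=lambda s: s.self_time_ms())
print(slowest.name, slowest.self_time_ms())  # db query 80
```

orders-service has the largest total duration, but only 5ms of it is its own work. The per-span breakdown points at the database query, which an aggregate latency metric for the request would not reveal.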
Cheat Sheet
| Concept | Detail |
|---|---|
| OpenTelemetry | Vendor-neutral telemetry standard |
| Signals | Metrics, logs, traces |
| Trace | One request across services |
| Span | A unit of work in a trace |
| Context propagation | Pass trace IDs in headers |
| Collector | Receive/process/export telemetry |
| Benefit | No vendor lock-in |
Common Interview Questions
What is distributed tracing and what does it solve?
It follows a single request across multiple services as a tree of spans, revealing where latency and errors occur in a call chain — something aggregate metrics can’t pinpoint.
Why use OpenTelemetry instead of a vendor SDK?
It’s a vendor-neutral standard: instrument once with OTel APIs and export to any backend, avoiding lock-in and re-instrumentation if you switch tools.
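The "instrument once, export anywhere" idea can be sketched as application code that depends only on a small tracing interface, with the backend chosen separately. Everything here (`Tracer`, `InMemoryExporter`, `ConsoleExporter`, `handle_order`) is a hypothetical stand-in written for illustration and is not the OpenTelemetry API; the real SDK follows a similar split between the tracing API and pluggable exporters.

```python
import time
from contextlib import contextmanager


class InMemoryExporter:
    """Stand-in backend that keeps finished spans in a list."""

    def __init__(self):
        self.spans = []

    def export(self, span: dict) -> None:
        self.spans.append(span)


class ConsoleExporter:
    """Stand-in backend that prints finished spans."""

    def export(self, span: dict) -> None:
        print(f"{span['name']}: {span['duration_ms']:.1f}ms")


class Tracer:
    """Times a block of work and hands the finished span to whichever exporter it was given."""

    def __init__(self, exporter):
        self.exporter = exporter

    @contextmanager
    def span(self, name: str):
        start = time.perf_counter()
        try:
            yield
        finally:
            duration_ms = (time.perf_counter() - start) * 1000
            self.exporter.export({"name": name, "duration_ms": duration_ms})


def handle_order(tracer: Tracer) -> str:
    """Application code: instrumented once, unaware of the backend."""
    with tracer.span("orders-service"):
        return "ok"


# Switching backends changes only the wiring, not handle_order.
memory = InMemoryExporter()
result = handle_order(Tracer(memory))
handle_order(Tracer(ConsoleExporter()))
```

Because `handle_order` never names a backend, moving from one exporter to another requires no re-instrumentation, which is the lock-in argument in miniature.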