Open source · MIT licensed

A telemetry graph agents can actually debug with.

Debug AI agents, and help AI agents debug software. Observability Unified turns traces, logs, replay, AI cost, profiles, agent actions, tool calls, evals, and instrumentation gaps into one evidence graph. The evidence retrieval layer applies CCR — compressed context retrieval — so agents start with compact bundles, citations, confidence, compaction provenance, and explicit refs they can expand only when needed. Start locally with one Docker image. For production, choose Cloudflare Workers with D1/R2, or run the Node collector on any cloud with Postgres and S3-compatible storage.

SDKs + MCP TypeScript · Go · Rust · browser · agents SDK docs

Identity propagated end-to-end: user_id → session_id → interaction_id → trace_id → span_id → action_id

The interaction ID follows a frontend action into backend spans, logs, AI calls, actions, evals, and MCP tools; CPU/off-CPU profiles join through the trace it caused.

Observability Unified dashboard showing an agent action graph connected to traces, logs, replay, AI costs, and CPU profile evidence
Agent Action Graph

See what an agent did, what each step caused, which evidence is explicit or inferred, and which trace, log, replay, AI cost, tool, eval, or profile explains the result.

Product proof

The same demo, seen through every signal.

Captured from the OpenTelemetry Astronomy Shop flowing through Observability Unified, including the Agent Action Graph, interaction ID path, connected trace evidence, AI cost, and profile join point.

Agent Action Graph

The agent run, plan, tool calls, evals, trace ID, and interaction ID are visible in one causal view.

Interaction ID path

The same interaction_id follows a frontend action into backend traces, logs, replay, AI spans, and profiles.

Astronomy Shop service map

Real demo traffic across 21 services, with edge volume, latency, and errors.

AI cost and LLM spans

Model, token, cost, latency, and error signals stay queryable together.

Unified timeline

User, trace, log, replay, and usage events line up in one incident story.

Trace to CPU profile

Trace detail keeps the profile join point visible so agents can carry a root-cause path down to CPU evidence.

One collector · one identity chain · dashboard + MCP

Unified observability means every signal knows its neighbors.

Observability Unified replaces the patchwork of APM + logging + product analytics + session replay + LLM observability + alerting with one unified stack, correlated through a single identity chain and explorable through the dashboard, action graph, structured evidence references, or MCP tools.

Distributed tracing

OTLP-native ingest. Spans, baggage, and propagation work with first-party Node, Go, and Rust SDKs — and any OpenTelemetry SDK over OTLP.

Structured logs

JSON logs with severity, automatic trace correlation, and per-module loggers. Auto-attached to the active span — no copy-paste IDs.

AI and agent actions

LLM calls, retrievals, tool calls, agent runs, eval cases, tokens, USD cost, latency, and failure category — all linked into one causal action graph.

Agent-readable evidence

Evidence bundles compact repeated logs, rank spans, preserve citations, and return retrieval refs so agents can expand raw logs, profiles, replay, AI calls, or tool calls on demand.

Session replay

Browser sessions recorded with rrweb, stored as DOM-mutation chunks in R2, and replayed inline next to the trace they belong to.

Usage analytics

Page views, interactions, errors, UTM parameters, and identity stitching — the product-analytics layer, in the same store as your traces.

Interaction ID to CPU

A click mints one interaction_id. Backend spans, logs, AI calls, action graphs, MCP tools, and CPU profiles join through the resulting trace.

Alerts

Alert rules over any signal — latency, error rate, AI spend, custom usage events. One rules engine and one notification surface across the stack.

User profiles

Identity linking from anonymous visitor to logged-in user, with traits and event history. Every signal can be filtered by user and replayed from their perspective.

SDKs and MCP server

First-party SDKs for TypeScript, Go, and Rust plus an MCP server that lets agents inspect traces, logs, replays, profiles, agent runs, actions, tools, and evals.
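As a rough illustration of the identity chain behind these features, the sketch below gathers every signal tied to one interaction_id, then pulls in CPU profiles through the trace IDs it found. The record shapes and the collectEvidence helper are hypothetical; they are not the collector's schema or API.

```typescript
// Hypothetical record shapes: only the ID field names come from the identity chain.
type Signal =
  | { kind: "span"; interaction_id: string; trace_id: string; span_id: string }
  | { kind: "log"; interaction_id: string; trace_id: string; message: string }
  | { kind: "profile"; trace_id: string; profile_id: string };

// Collect signals that carry the interaction ID directly, then join profiles
// through the traces that interaction caused.
function collectEvidence(interactionId: string, signals: Signal[]): Signal[] {
  const direct = signals.filter(
    (s) => s.kind !== "profile" && s.interaction_id === interactionId,
  );
  const traceIds = new Set(direct.map((s) => s.trace_id));
  const profiles = signals.filter(
    (s) => s.kind === "profile" && traceIds.has(s.trace_id),
  );
  return [...direct, ...profiles];
}

const evidence = collectEvidence("int-1", [
  { kind: "span", interaction_id: "int-1", trace_id: "t-1", span_id: "s-1" },
  { kind: "log", interaction_id: "int-1", trace_id: "t-1", message: "charge.starting" },
  { kind: "profile", trace_id: "t-1", profile_id: "p-1" },
  { kind: "span", interaction_id: "int-2", trace_id: "t-2", span_id: "s-2" },
  { kind: "profile", trace_id: "t-2", profile_id: "p-2" },
]);
console.log(evidence.map((s) => s.kind)); // the span, the log, and the profile for t-1
```

The profile record carries no interaction_id of its own in this sketch, which mirrors the "profiles join through the resulting trace" wording.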

Evidence retrieval · CCR

Give the agent investigation-ready context, not a telemetry dump.

CCR means compressed context retrieval: the collector clusters repetitive logs, selects exemplars, ranks failed spans and critical paths, correlates traces/actions/profiles/replay, and hands the agent explicit retrieval refs for deeper inspection. The agent still controls expansion; the platform just makes the first packet useful.
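The log-clustering step can be pictured with a minimal sketch like the one below. It is not the collector's algorithm: the LogRow shape, the signatureOf normalization (digit runs replaced with a placeholder), and the choice of the first row as exemplar are all assumptions made for illustration.

```typescript
// Hypothetical log row shape; the real collector schema may differ.
interface LogRow {
  status: number;
  message: string;
}

interface LogCluster {
  signature: string;
  count: number;
  exemplar: LogRow; // first row seen for this signature
}

// Replace digit runs so rows that differ only by a numeric ID share one signature.
function signatureOf(row: LogRow): string {
  return `${row.status} ${row.message.replace(/\d+/g, "<n>")}`;
}

// Group rows by signature, keeping one exemplar and a count per group.
function clusterLogs(rows: LogRow[]): LogCluster[] {
  const clusters = new Map<string, LogCluster>();
  for (const row of rows) {
    const signature = signatureOf(row);
    const existing = clusters.get(signature);
    if (existing) {
      existing.count += 1;
    } else {
      clusters.set(signature, { signature, count: 1, exemplar: row });
    }
  }
  // Largest clusters first, so the noisiest pattern is summarized at the top.
  return Array.from(clusters.values()).sort((a, b) => b.count - a.count);
}

const rows: LogRow[] = [];
for (let i = 0; i < 500; i++) {
  rows.push({ status: 404, message: `GET /cart/${i} not found` });
}
rows.push({ status: 502, message: "payment gateway timeout" });

const clustered = clusterLogs(rows);
console.log(clustered.map((c) => `${c.count} x ${c.signature}`));
```

Here 501 rows shrink to two clusters, each with a count and one exemplar an agent can cite before deciding whether to expand the raw rows.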

Before CCR

Raw-tool debugging makes the agent ask for everything.

Without CCR, an agent tends to page through broad logs, full traces, profile metadata, replay metadata, and action records separately. Repeated messages like hundreds of identical 404s consume context even though one exemplar plus count is enough to start.

search_logs({ traceId, limit: 500 })
get_trace(traceId)
get_action(actionId)
get_profile(profileId)
get_replay(sessionId)

After CCR

Evidence bundles summarize first, expand second.

With CCR, the agent asks for one compact bundle anchored on a trace, action, agent run, or tool call. The response includes summaries, evidence refs, compaction records, suggested pivots, and retrieval refs for raw data that remains available but explicit.

get_evidence_bundle({ anchor, targetTokens })
retrieve_evidence_ref(refId)
search_evidence_ref(refId, "checkout 404")
get_evidence_stats()
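A bundle-first investigation loop might look like the sketch below. Only the tool names get_evidence_bundle and retrieve_evidence_ref come from the calls above; the CallTool transport, the named-argument shape, the anchor object, and the EvidenceBundle fields are assumptions, and the stub stands in for a real MCP client.

```typescript
// Hypothetical transport: any function that invokes an MCP tool by name.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

// Assumed response shape; the real bundle fields may be named differently.
interface RetrievalRef {
  refId: string;
  kind: string; // e.g. "logs", "profile", "replay"
}
interface EvidenceBundle {
  summary: string;
  retrievalRefs: RetrievalRef[];
}

// Ask for the compact bundle first, then expand only the refs the caller selects.
async function investigate(
  callTool: CallTool,
  anchor: Record<string, unknown>,
  targetTokens: number,
  shouldExpand: (ref: RetrievalRef) => boolean,
): Promise<{ bundle: EvidenceBundle; expanded: unknown[] }> {
  const bundle = (await callTool("get_evidence_bundle", { anchor, targetTokens })) as EvidenceBundle;
  const expanded: unknown[] = [];
  for (const ref of bundle.retrievalRefs) {
    if (shouldExpand(ref)) {
      expanded.push(await callTool("retrieve_evidence_ref", { refId: ref.refId }));
    }
  }
  return { bundle, expanded };
}

// Stub transport for illustration; a real agent would call the MCP server here.
const calls: string[] = [];
const fakeCallTool: CallTool = async (name, args) => {
  calls.push(name);
  if (name === "get_evidence_bundle") {
    return {
      summary: "payment span failed",
      retrievalRefs: [
        { refId: "ref-logs", kind: "logs" },
        { refId: "ref-profile", kind: "profile" },
      ],
    };
  }
  return { refId: args["refId"], rows: [] };
};

// Expand only the logs ref; the profile ref stays unexpanded and out of context.
const pending = investigate(fakeCallTool, { type: "trace", id: "t-1" }, 2000, (ref) => ref.kind === "logs");
```

The point of the shape is that raw data enters the agent's context only when shouldExpand says so, which keeps the first packet small.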

Benchmark recipe

Benchmark by running the same incident twice: CCR off, then CCR on.

Use fixed anchors, the same agent prompt, and the same token budget. In the off run, let the agent use raw tools such as log search and trace detail. In the on run, start with get_evidence_bundle and only expand retrieval refs the agent chooses. Compare token input, tool calls, raw rows read, cited evidence, and whether the same root cause is reached.

Run it locally: pnpm benchmark:ccr
Read the benchmark methodology and raw output.

Repeated 404 burst

CCR off: Search logs around the failing trace and return the broad result window.
CCR on: Cluster matching log signatures, include exemplars, and attach a logs retrieval ref.
Measure: Raw log rows read vs. exemplar count, prompt tokens, and whether the answer cites the trace and exemplar log.

Slow checkout trace

CCR off: Load the full span tree, then ask the model to infer the critical path.
CCR on: Return failed spans, critical path summaries, and connected profile refs up front.
Measure: Spans included in context, top latency span agreement, and profile ref expansion rate.

Agent tool regression

CCR off: Fetch agent runs, actions, tool calls, eval rows, AI calls, traces, and logs separately.
CCR on: Anchor on the agent run or tool call and return causal path, side-effecting tool refs, failed evals, and connected traces.
Measure: Tool calls made by the debugging agent, correct failed tool/eval identification, and evidence citations.

Replay/profile deep dive

CCR off: Include replay/profile metadata early even when the agent may not need it.
CCR on: Expose replay event-window and profile-frame refs but require explicit expansion.
Measure: How often agents expand deep refs, payload bytes read, and whether unnecessary raw replay/profile data stays out of context.

Executed benchmark

Measured CCR run: 500 repeated 404 logs collapsed to 3 exemplars.

Local benchmark run on June 4, 2026 against the shipped evidence retrieval route. Scenario: a checkout trace with a failed payment span and 500 correlated 404 log rows. CCR preserved the failed payment span reference while reducing the first context packet.

| Metric | CCR off | CCR on |
| --- | --- | --- |
| JSON bytes | 202,406 | 5,274 |
| Estimated tokens | 50,602 | 1,319 |
| Log rows in first packet | 500 | 3 exemplars |
| Reduction | baseline | 97.4% |
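The reported figures can be checked with a few lines of arithmetic. The four-bytes-per-token estimator below is an assumption: the page does not name its estimator, but this common heuristic happens to reproduce both reported token counts.

```typescript
// Figures copied from the executed benchmark above.
const bytesOff = 202406;
const bytesOn = 5274;

// Assumption: roughly four bytes per token, rounded up.
const estimateTokens = (bytes: number): number => Math.ceil(bytes / 4);

// Percentage reduction of the first context packet, to one decimal place.
const reductionPct = (before: number, after: number): string =>
  ((1 - after / before) * 100).toFixed(1);

console.log(estimateTokens(bytesOff), estimateTokens(bytesOn)); // prints 50602 1319
console.log(reductionPct(bytesOff, bytesOn)); // prints 97.4
```

The same 97.4% falls out whether the reduction is computed on bytes or on estimated tokens.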

Quick preview

From zero to correlated signals in three steps

Start the local stack, pick SDKs for your runtime, and initialize each service. Backend spans, frontend interactions, and AI spans share the same identity chain automatically.

1 Run the GHCR image

Pull the all-in-one image from GHCR, or build the same image from a clone.

bash
# Fastest first run
docker run --rm -p 5173:5173 -p 8790:8790 \
  ghcr.io/obs-unified/local:latest

# Editable local repo
git clone https://github.com/obs-unified/obs-unified.git
cd obs-unified
pnpm install
pnpm local:image
pnpm local:run

2 Pick SDKs

Runnable examples and recipes live in Examples and SDK docs.

text
Backend:
  TypeScript  @obs-unified/* on GitHub Packages
  Go          sdks/go
  Rust        sdks/rust

Browser:
  React/vanilla  @obs-unified/analytics-sdk

3 Instrument backend and frontend

Short path shown. Full TypeScript, Go, and Rust examples live in Getting started.

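typescript
// Backend (imports omitted in this short path; see Getting started)
initObservability({ serviceName: "checkout-api" });
const log = createLogger("checkout");            // per-module logger
const llm = startLLMSpan("checkout.assistant");  // open an LLM span
// Assumes interaction_id is already in scope for this request
log.info("charge.starting", { interaction_id });
llm.end();                                       // close the LLM span

// Frontend
trackInteraction("checkout_click");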

Architecture

One collector. Your telemetry graph.

Instrumented services write telemetry into Observability Unified. Humans and agents read the same connected graph through dashboard APIs, compact evidence bundles, structured evidence references, and MCP tools, while ingest credentials stay separate from investigation access.

Instrumented systems
Frontend app

interactions, replay, errors

Backend services

traces, logs, profiles

Workers

edge requests, jobs

AI / LLM calls

agent runs, tokens, cost, tools

Write boundary: SDKs + OTLP ingest

Write-only keys send telemetry without read access.

Observability Unified
Collector

Normalizes every signal into one identity chain.

Owned storage

D1/R2 or Postgres/S3

Connected graph

evidence IDs, compact bundles, traces, logs, replay, AI cost, actions, CPU

Read boundary: Dashboard + APIs + MCP

Agents inspect telemetry with MCP investigation tools.

Investigation clients
Dashboard users

inspect sessions, traces, logs, replay, alerts, costs

Debugging agents

follow compact bundles, refs, confidence, and pivots

Incident workflows

follow evidence from action to root cause

How it compares — snapshot as of May 2026

One stack instead of three (or nine)

Most teams glue an APM, a product-analytics tool, an error/session tool, and now an LLM-observability tool together. Observability Unified brings those workflows under one identity chain and one dashboard, so humans and agents can traverse from user action to backend trace, logs, replay, AI cost, and CPU profile while keeping the data plane in your infrastructure.

| Capability | Observability Unified | Datadog | Sentry | PostHog | Honeycomb | New Relic | Grafana Cloud | SigNoz | Uptrace | HyperDX |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hosting model | Self-host on your infra | SaaS only [1] | SaaS or Fair Source self-host [2] | Cloud-first · OSS self-host (hobby) [3] | SaaS · Private Cloud AWS [4] | SaaS only [5] | Cloud or self-host LGTM [6] | Cloud or OSS self-host [7] | Cloud or OSS self-host [8] | Cloud · OSS · ClickStack [9] |
| Pricing model | Free · pay your own infra | Per host + per signal [10] | Tier + per-unit overage [11] | Per-unit, per product [12] | Per event volume [13] | $/GB + per-user seat [14] | Per series + $/GB [15] | $0.30/GB ingest [16] | $0.10/GB ingest [17] | $20 flat + $0.40/GB [18] |
| Traces / APM | OTLP-native (HTTP) | Yes · OTLP in Preview [19] | Performance · OTLP beta [20] | LLM-scoped only [21] | OTLP-native [22] | Native OTLP [23] | Tempo · OTLP [24] | OTLP-native [25] | OTLP-native [26] | OTLP-native [27] |
| Structured logs | Trace-correlated | Yes [28] | Yes · trace-connected [29] | Yes (GA Jan 2026) [30] | Modeled as wide events [31] | Yes · Grok-parsed [32] | Loki [33] | Yes · Logs Explorer [34] | Yes [35] | Yes · ClickHouse-backed [36] |
| AI / LLM observability | Built-in | LLM Observability [37] | Agent Mon · Seer [38] | LLM Analytics [39] | Agent Obs (Early Access) [40] | AI Monitoring [41] | AI Observability (preview) [42] | LLM Observability [43] | [44] | Via OpenLLMetry [45] |
| Session replay | rrweb | Yes (RUM) [46] | Yes (web + mobile) [47] | rrweb [48] | [49] | Yes · DOM-based [50] | Yes · Frontend Obs [51] | [52] | [53] | Yes · auto-linked [54] |
| Product analytics | Yes | Yes [55] | [56] | Flagship [57] | [58] | Browser-only [59] | [60] | [61] | [62] | [63] |
| Alerts | All signals · one engine | Many · Watchdog ML [64] | Issues · uptime · crons [65] | On trends only [66] | Triggers · BubbleUp [67] | NRQL conditions [68] | Unified Alerting [69] | 5 alert types [70] | Metric + Error [71] | Search + chart-based [72] |
| Cross-signal pivots | One traversable graph | Within platform [73] | Within platform · trace-id [74] | Within event store [75] | Unified wide events [76] | Via NRQL [77] | Per-data-source plumbing [78] | Trace ↔ logs [79] | Within platform · UQL [80] | Auto-linked across signals [81] |
| Data ownership | Your D1/R2 or Postgres/S3 | Datadog cloud [1] | Sentry cloud or yours [2] | PostHog cloud or yours [3] | Honeycomb or your AWS [82] | New Relic US/EU [5] | Grafana cloud or yours [6] | SigNoz cloud or yours [83] | Uptrace cloud or yours [8] | Cloud (US) or yours [84] |

Numbered superscripts link to the underlying vendor source. Full methodology, vendor profiles, and quoted citations live in the comparison research doc (last reviewed 2026-05-31 · re-reviewed quarterly).

FAQ

Operational questions, answered up front

The questions evaluators ask in the first call — also exposed as structured data for AI search.

What is Observability Unified?

Observability Unified is an open-source observability platform for humans and AI agents debugging software. A single collector ingests traces, logs, AI calls, frontend events, session replays, alerts, profiles, analyses, and Agent Action Graph records. The dashboard helps engineers debug AI agents and production systems; the MCP server gives AI agents access to the same evidence graph with compact evidence bundles, stable IDs, confidence, citations, retrieval refs, and next pivots. The fastest first run is one local Docker image with Postgres, the collector, dashboard, blob storage, and seed data.

What is the Agent Action Graph?

The Agent Action Graph shows what an agent did and what each step caused. It links browser actions, agent runs, LLM calls, retrievals, tool calls, guardrails, backend traces, logs, profiles, and eval cases through stable action IDs. Engineers see it in the dashboard; AI agents can traverse the same graph through MCP.

Is this for debugging AI agents or for agents debugging software?

Both. Observability Unified debugs AI agents by showing LLM calls, retrievals, tool calls, agent runs, evals, costs, latency, and failures in an Agent Action Graph. It also helps AI agents debug software by exposing MCP tools for traces, logs, service maps, users, replays, connected signals, agent runs, actions, and tool calls.

Can AI agents inspect Observability Unified through MCP?

Yes. The Observability Unified MCP server exposes tools for status, traces, logs, service maps, AI sessions, users, replays, connected signals, profiles, evals, agent runs, actions, tool calls, evidence bundles, retrieval refs, and evidence expansion stats. Agents can start from a failing trace, AI cost spike, user session, profile frame, analysis result, or action ID and gather the same evidence an engineer sees in the dashboard.

What is CCR?

CCR means compressed context retrieval. Instead of sending an agent every correlated log, span, replay chunk, profile frame, AI call, and tool payload immediately, Observability Unified returns a compact evidence bundle first: clustered log exemplars, ranked spans, citations, compaction provenance, suggested pivots, and retrieval refs. The raw evidence remains available, but the agent has to expand it intentionally.

How is it different from Datadog, Sentry, or PostHog?

Its primary difference is unification for debugging. APM traces, logs, product analytics, session replay, AI observability, alerting, profiles, agent action graphs, and analyses live in one collector and one dashboard, correlated through a single identity chain and exposed to agents through MCP. Instead of leaving agents to scrape dashboards or prose, the platform returns compact evidence bundles, concrete evidence references, confidence, exemplar pivots, retrieval refs, and connected signals. It also runs on your own infrastructure, so no external telemetry vendor sits in the data path.

What's the data retention model?

Retention is controlled by the RETENTION_HOURS environment variable on the collector and defaults to 72 hours. Profile blobs have a separate PROFILE_RETENTION_HOURS override because they're larger per record. Because everything lives in your storage account, you set the policy and pay the storage directly; there's no per-event retention tier to negotiate.
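One way to picture how the two variables interact is the sketch below. The variable names and the 72-hour default come from the answer above; the fallback of PROFILE_RETENTION_HOURS to the general value when unset, the handling of invalid values, and the helper names are assumptions rather than the collector's actual behavior.

```typescript
// Environment-like record, so the sketch does not depend on a specific runtime.
type Env = Record<string, string | undefined>;

const DEFAULT_RETENTION_HOURS = 72; // default stated in the answer above

// Accept only positive, finite numbers; anything else counts as unset.
function parseHours(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const hours = Number(raw);
  return Number.isFinite(hours) && hours > 0 ? hours : undefined;
}

// Assumption: the profile override falls back to the general retention when unset.
function resolveRetention(env: Env): { generalHours: number; profileHours: number } {
  const generalHours = parseHours(env["RETENTION_HOURS"]) ?? DEFAULT_RETENTION_HOURS;
  const profileHours = parseHours(env["PROFILE_RETENTION_HOURS"]) ?? generalHours;
  return { generalHours, profileHours };
}

// Records older than the returned timestamp would be eligible for deletion.
function cutoff(now: Date, hours: number): Date {
  return new Date(now.getTime() - hours * 60 * 60 * 1000);
}

const retention = resolveRetention({ RETENTION_HOURS: "168", PROFILE_RETENTION_HOURS: "24" });
console.log(retention); // a week for general signals, one day for profile blobs
```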

When do I outgrow D1, and what's the upgrade path?

D1 is the default low-ops hosted path for small and medium deployments. The practical ceiling depends on event volume, cardinality, and retention, so heavy installs should move to the Node collector with Postgres plus S3-compatible blob storage before D1 becomes the bottleneck.

How does it handle PII and GDPR?

Self-hosting is the headline answer: data never leaves your infrastructure, so residency and processor questions reduce to where you deploy. On top of that, the usage-event pipeline applies default-redact scrubbing on ingest; fields named like email, token, password, authorization, or cookie are stripped from context and properties JSON before storage. Session replays use rrweb, which masks input fields by default and supports per-element block/mask attributes.
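Key-based scrubbing of this kind can be sketched as follows. The five field names come from the answer above; the case-insensitive substring match, the recursive walk, and the helper names are assumptions, not the pipeline's actual rules.

```typescript
// Field-name fragments named in the answer above; the matching rule is an assumption.
const SENSITIVE = ["email", "token", "password", "authorization", "cookie"];

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function isSensitive(key: string): boolean {
  const lower = key.toLowerCase();
  return SENSITIVE.some((fragment) => lower.includes(fragment));
}

// Recursively drop sensitive keys from a JSON value without mutating the input.
function scrub(value: Json): Json {
  if (Array.isArray(value)) return value.map((item) => scrub(item));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      if (!isSensitive(key)) out[key] = scrub(child);
    }
    return out;
  }
  return value;
}

const scrubbed = scrub({
  plan: "pro",
  user_email: "a@example.com",
  headers: { Authorization: "Bearer abc", accept: "application/json" },
});
console.log(JSON.stringify(scrubbed)); // only plan and headers.accept remain
```

Dropping the key entirely, rather than masking the value, matches the "stripped ... before storage" wording; a masking variant would be an equally plausible reading.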

Does it support SSO or multi-user dashboards?

Not today. Two auth boundaries ship: a write-only ingest API key for SDKs and a single password for the dashboard. Multi-user, RBAC, and SSO are out of scope and tracked separately. Most teams put the dashboard behind their existing identity proxy, such as Cloudflare Access or Tailscale, in the meantime.

Can I migrate from Datadog, Sentry, or PostHog?

Sentry, PostHog, Honeycomb, and older @obs/* package migrations are covered in the docs. For Datadog, OTLP-native ingest accepts the standard OpenTelemetry SDK over OTLP HTTP, so traces and logs are usually a configuration change; a dedicated walkthrough is tracked separately.

Does it work with my existing OpenTelemetry SDK?

Yes. The collector accepts OTLP over HTTP using JSON or protobuf, with gzip. gRPC is intentionally not supported because Cloudflare Workers cannot host it, and OTLP HTTP covers every official SDK. The first-party SDKs are thin wrappers that point the standard OpenTelemetry SDK at the collector and add OpenInference helpers for LLM and tool spans.
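For a sense of what travels over the wire, the sketch below builds a minimal OTLP/HTTP JSON trace payload by hand. The field layout follows the OpenTelemetry OTLP JSON encoding (hex-encoded IDs, nanosecond timestamps as strings, integer enums); the helper names and the Math.random ID generator are illustrative assumptions, and a real service would let an OpenTelemetry SDK do this.

```typescript
// Random lowercase hex string of the given byte length (not cryptographically strong).
function randomHex(bytes: number): string {
  let out = "";
  for (let i = 0; i < bytes; i++) {
    out += Math.floor(Math.random() * 256).toString(16).padStart(2, "0");
  }
  return out;
}

// OTLP JSON encodes 64-bit nanosecond timestamps as strings.
// Expects whole milliseconds, as returned by Date.now().
function msToNanos(ms: number): string {
  return `${ms}000000`;
}

// Build a one-span OTLP/HTTP JSON payload.
function buildTracePayload(serviceName: string, spanName: string, startMs: number, endMs: number) {
  return {
    resourceSpans: [
      {
        resource: {
          attributes: [{ key: "service.name", value: { stringValue: serviceName } }],
        },
        scopeSpans: [
          {
            scope: { name: "manual-example" },
            spans: [
              {
                traceId: randomHex(16), // 32 hex characters
                spanId: randomHex(8), // 16 hex characters
                name: spanName,
                kind: 2, // SPAN_KIND_SERVER
                startTimeUnixNano: msToNanos(startMs),
                endTimeUnixNano: msToNanos(endMs),
              },
            ],
          },
        ],
      },
    ],
  };
}

const payload = buildTracePayload("checkout-api", "POST /checkout", 1000, 1250);
console.log(JSON.stringify(payload, null, 2));
```

Under the OTLP/HTTP convention this body is POSTed to the collector's /v1/traces path with Content-Type: application/json. The collector's base URL and the header that carries the write-only ingest key are deployment details not shown here.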

Is it free and open source?

Yes. Observability Unified is MIT-licensed. You pay only for the infrastructure you run it on. For production, choose either Cloudflare Workers with D1/R2, or the Node collector on AWS, GCP, Azure, Fly.io, Render, Kubernetes, or any cloud that can provide Postgres and S3-compatible object storage.