Agent observability
Agent observability is the tooling that makes an agent’s behavior in production visible by tracing prompts, tool calls, costs, and failures.
Agent observability captures what an agent actually did: each model request, tool call, input, and output, plus latency and cost. Without it, a misbehaving agent in production is a black box.
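As a rough illustration of what capturing each step could look like, the Python sketch below wraps a model request or tool call and records its input, output, error, latency, and cost. The names Span, Trace, and record are hypothetical, and the cost figure is assumed to be supplied by the caller. It is a minimal sketch, not the API of any particular observability product.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Span:
    """One recorded step: a model request or a tool call (hypothetical schema)."""
    kind: str                      # "model" or "tool"
    name: str                      # e.g. the tool name or model identifier
    input: Any
    output: Any = None
    error: Optional[str] = None
    latency_s: float = 0.0
    cost_usd: float = 0.0          # assumed to be supplied by the caller


@dataclass
class Trace:
    """All spans recorded during one agent run."""
    spans: list = field(default_factory=list)

    def record(self, kind: str, name: str, fn: Callable[[Any], Any],
               payload: Any, cost_usd: float = 0.0) -> Any:
        """Call fn(payload) and store what happened, even if it raises."""
        span = Span(kind=kind, name=name, input=payload, cost_usd=cost_usd)
        start = time.perf_counter()
        try:
            span.output = fn(payload)
            return span.output
        except Exception as exc:
            span.error = repr(exc)
            raise                  # the failure is recorded, not swallowed
        finally:
            span.latency_s = time.perf_counter() - start
            self.spans.append(span)

    def total_cost(self) -> float:
        return sum(s.cost_usd for s in self.spans)

    def failures(self) -> list:
        return [s for s in self.spans if s.error is not None]
```

A caller would create one Trace per agent run and route every model request and tool call through record, so that a failed run still leaves behind the full sequence of steps that led to the error.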
Beyond tracing, mature observability adds prompt versioning, regression detection, and alerting on quality or cost drift, turning incidents into data you can act on.
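One simple way to alert on cost drift, sketched here under assumptions, is to compare the mean cost per run in a recent window against a baseline window and flag a relative increase above a threshold. The function name cost_drift and the 25% default threshold are illustrative choices; a real deployment would likely use more robust statistics.

```python
from statistics import mean


def cost_drift(baseline, recent, threshold=0.25):
    """Compare mean cost per run across two windows.

    Returns (drifted, relative_change), where drifted is True when the
    recent mean exceeds the baseline mean by more than `threshold`
    (0.25 means 25%). The threshold is an illustrative assumption.
    """
    if not baseline or not recent:
        raise ValueError("both windows need at least one value")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    change = (mean(recent) - base) / base
    return change > threshold, change
```

The same comparison could be applied to a quality score instead of cost, with the sign flipped so that a drop rather than a rise triggers the alert.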
It pairs naturally with evals: observability surfaces real-world failures, and those failures become new eval cases.
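The feedback loop from failures to evals can be sketched as a small conversion step. The record fields ("input", "output", "error") and the eval-case layout below are hypothetical; they stand in for whatever schema a given tracing and eval setup actually uses.

```python
def failure_to_eval_case(record: dict, expected: str) -> dict:
    """Turn one failed production record into an eval case.

    `record` is assumed to be a dict with an "input" key and optional
    "output" and "error" keys. `expected` is a human-written description
    of the behavior the agent should have shown.
    """
    return {
        "input": record["input"],
        "observed_output": record.get("output"),
        "observed_error": record.get("error"),
        "expected": expected,
        "source": "production-trace",
    }
```

Keeping the observed output and error alongside the expected behavior makes it easier to confirm later that a fix addresses the original failure rather than a nearby one.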