Observability & Receipts

Eve Horizon provides built-in observability through structured logging, correlation IDs, execution receipts, org-level analytics, and OpenTelemetry integration. Every completed job attempt produces a receipt with timing, token usage, and cost data, giving you full visibility into what ran, how long it took, and what it cost.

Observability overview

The observability stack is designed for CLI-first debugging. Rather than requiring separate dashboards, Eve surfaces the data you need through CLI commands and API endpoints.

Correlation IDs

Every request that enters the Eve API receives a correlation ID via the x-eve-correlation-id header. If the caller provides one, it is preserved; otherwise a UUID is generated and echoed back in the response.

Correlation IDs propagate across the full request chain:

API --> Orchestrator --> Worker --> Runner Pod

This means you can trace a single job execution from the initial API call through to the harness output using one identifier.
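
As a client-side illustration of the preserve-or-generate rule above, here is a minimal sketch. The helper name `with_correlation_id` is hypothetical and not part of any Eve SDK; only the header name comes from the text above.

```python
import uuid

# Header name taken from the text above.
HEADER = "x-eve-correlation-id"


def with_correlation_id(headers: dict) -> dict:
    """Return a copy of `headers` that always carries a correlation ID.

    A caller-supplied ID is preserved; otherwise a fresh UUID is generated,
    mirroring the behaviour described for the Eve API.
    """
    result = dict(headers)
    # HTTP header names are case-insensitive, so accept any casing.
    existing = next((k for k in result if k.lower() == HEADER), None)
    if existing is None or not result[existing].strip():
        if existing is not None:
            del result[existing]  # drop a blank value before replacing it
        result[HEADER] = str(uuid.uuid4())
    return result
```

Logging the resulting ID on the caller's side makes it possible to search for the same value later in Eve's service logs.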

Structured logging

All Eve services emit JSON logs with a consistent set of standard fields:

Field           Description
timestamp       ISO 8601 timestamp
level           Log level (info, warn, error)
service         Emitting service (api, orchestrator, worker)
message         Human-readable log message
correlation_id  Request correlation ID
trace_id        OpenTelemetry trace ID (when OTEL is enabled)
job_id          Associated job ID (when available)
attempt_id      Associated attempt ID (when available)

Job execution lifecycle events are also written to execution_logs with correlation fields embedded in the lifecycle metadata, allowing you to reconstruct the full timeline of any job attempt.
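
Because the logs are JSON with a shared `correlation_id` field, a request timeline can be rebuilt with a few lines of scripting. The sketch below assumes one JSON object per line and uses only the standard fields listed above; the function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def logs_for_correlation(lines: Iterable[str], correlation_id: str) -> Iterator[dict]:
    """Yield parsed log records whose correlation_id matches."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(record, dict) and record.get("correlation_id") == correlation_id:
            yield record


def timeline(records: Iterable[dict]) -> list:
    """Return (timestamp, service, message) tuples ordered by timestamp.

    Sorting the raw strings works only if all timestamps share one
    ISO 8601 format and timezone, which is assumed here.
    """
    ordered = sorted(records, key=lambda r: r.get("timestamp", ""))
    return [
        (r.get("timestamp", ""), r.get("service", ""), r.get("message", ""))
        for r in ordered
    ]
```

Feeding it the combined output of several services yields one ordered view across API, orchestrator, and worker.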

Execution receipts

Every completed job attempt produces an execution receipt -- an immutable snapshot of what happened during execution. Receipts are the primary tool for understanding job performance and cost.

What receipts contain

Section      Data
Timing       Billable milliseconds, phase durations
LLM usage    Total input/output tokens, model breakdown
Base cost    Cost in USD from rate card pricing
Billed cost  Cost in org currency (after exchange rates)
Compute      Resource class usage

Receipts are assembled from two sources:

  1. Lifecycle events -- timing and phase transitions recorded by the orchestrator
  2. llm.call events -- usage-only events (no content) emitted by harnesses after each provider call
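
To illustrate how usage-only events can roll up into the "LLM usage" section of a receipt, here is a rough sketch. The event shape (`type`, `model`, `input_tokens`, `output_tokens`) is an assumption made for this example, not Eve's documented schema.

```python
from collections import defaultdict


def summarize_llm_usage(events):
    """Aggregate token counts from llm.call events, in total and per model.

    Field names are hypothetical; events of any other type are ignored.
    """
    totals = {"input_tokens": 0, "output_tokens": 0}
    by_model = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0})
    for event in events:
        if event.get("type") != "llm.call":
            continue
        model = event.get("model", "unknown")
        for key in totals:
            count = int(event.get(key, 0))
            totals[key] += count
            by_model[model][key] += count
    return {"total": totals, "by_model": dict(by_model)}
```

Since the events carry no prompt or completion content, an aggregation like this needs nothing beyond the counters.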

Viewing receipts

# Receipt for the latest attempt on a job
eve job receipt <job-id>

# Receipt for a specific attempt
eve job receipt <job-id> --attempt 2

# Compare two attempts (with receipt data)
eve job compare <job-id> 1 2 --receipt

The eve job follow command also displays live cost totals as llm.call events stream during execution.

Receipt API endpoints

Endpoint                                                 Purpose
GET /jobs/{job_id}/receipt                               Receipt for latest attempt
GET /jobs/{job_id}/attempts/{attempt_id}/receipt         Receipt for specific attempt
GET /jobs/{job_id}/compare?a=1&b=2&include_receipt=true  Compare attempts with receipts
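
For scripted access, the paths above can be built with small helpers. This sketch produces request paths only; the API base URL and authentication are deployment-specific and are not covered here. The helper names are illustrative.

```python
from urllib.parse import quote, urlencode


def receipt_path(job_id, attempt_id=None):
    """Path for the latest-attempt receipt, or for a specific attempt."""
    job = quote(str(job_id), safe="")
    if attempt_id is None:
        return f"/jobs/{job}/receipt"
    attempt = quote(str(attempt_id), safe="")
    return f"/jobs/{job}/attempts/{attempt}/receipt"


def compare_path(job_id, a, b, include_receipt=True):
    """Path for comparing attempts `a` and `b` of one job."""
    query = {"a": a, "b": b}
    if include_receipt:
        query["include_receipt"] = "true"
    job = quote(str(job_id), safe="")
    return f"/jobs/{job}/compare?{urlencode(query)}"
```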

Receipts are immutable snapshots. Recomputation is only needed for backfills or pricing corrections:

eve admin receipts recompute --since 7d --project proj_xxx --dry-run

Analytics dashboard

Eve provides org-level analytics for operational reporting across jobs, pipelines, and environments. These are read-only endpoints designed for dashboards and health checks.

Analytics summary

The summary endpoint gives a high-level view of org activity within a time window:

eve analytics summary --org org_xxx --window 7d

Returns:

{
  "as_of": "2026-02-12T12:00:00Z",
  "window": "7d",
  "projects": 3,
  "jobs": { "created": 12, "completed": 9, "failed": 1, "active": 2 },
  "pipelines": { "runs": 4, "success_rate": 75, "avg_duration_seconds": 420 },
  "environments": { "total": 5, "healthy": 4, "degraded": 1, "unknown": 0 }
}

Job analytics

Drill into job-level metrics across the org:

eve analytics jobs --org org_xxx --window 7d

Returns individual job records with phase, duration, and outcome data. The window parameter accepts 1d, 7d, 30d, or 90d.

Metric definitions

Metric                          Definition
jobs.created                    Jobs created within the time window
jobs.completed                  Jobs that reached the done phase
jobs.failed                     Jobs that failed (any attempt)
jobs.active                     Jobs currently in an active phase
pipelines.success_rate          succeeded / total for pipeline runs in the window (shown as a percentage in the sample response, e.g. 75)
pipelines.avg_duration_seconds  Mean duration from started_at to completed_at
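
The two pipeline metrics can be reproduced from raw run records as follows. This is a sketch under assumptions: the `status` value `"succeeded"` and the run record shape are hypothetical, and the rate is scaled to a percentage to match the sample response above.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pipeline_metrics(runs):
    """Compute run count, success rate (percent), and mean duration (seconds)."""
    total = len(runs)
    if total == 0:
        # Zeroed result for an empty window rather than an error.
        return {"runs": 0, "success_rate": 0, "avg_duration_seconds": 0}
    succeeded = sum(1 for run in runs if run.get("status") == "succeeded")
    durations = [
        (parse_ts(run["completed_at"]) - parse_ts(run["started_at"])).total_seconds()
        for run in runs
        if run.get("started_at") and run.get("completed_at")
    ]
    avg = sum(durations) / len(durations) if durations else 0
    return {
        "runs": total,
        "success_rate": round(100 * succeeded / total),
        "avg_duration_seconds": round(avg),
    }
```

With four runs of which three succeeded, this gives a success rate of 75, the same figure as in the sample summary.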

Environment health

Monitor the health of all environments across the org:

eve analytics env-health --org org_xxx

Returns the latest known deploy and health snapshot per environment:

{
  "environments": [
    { "name": "staging", "project_id": "proj_xxx", "status": "healthy" },
    { "name": "production", "project_id": "proj_xxx", "status": "healthy" }
  ]
}

Environment status values: healthy, degraded, or unknown (based on the latest health snapshot).
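
A response in the shape shown above can be reduced to the same counters that the analytics summary reports under `environments`. The sketch below treats any unrecognised status as `unknown`, which is an assumption made for this example.

```python
from collections import Counter

STATUSES = ("healthy", "degraded", "unknown")


def summarize_env_health(response: dict) -> dict:
    """Count environments per status from an env-health style response."""
    counts = Counter()
    for env in response.get("environments", []):
        status = env.get("status")
        # Assumption: anything outside the three documented values is "unknown".
        counts[status if status in STATUSES else "unknown"] += 1
    summary = {"total": sum(counts.values())}
    summary.update({status: counts[status] for status in STATUSES})
    return summary
```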

Pipeline analytics

Track pipeline performance and reliability:

eve analytics pipelines --org org_xxx --window 30d

Returns per-pipeline metrics including run count, success rate, and average duration.

Cost tracking

Eve tracks costs at two levels: per-job (via receipts) and per-org (via the balance ledger).

Pricing model

Costs are driven by rate cards and exchange-rate snapshots:

  • Rate cards are immutable versioned documents (name + version + effective date)
  • Exchange-rate snapshots are stored for auditable currency conversions
  • Pricing is resolved per attempt based on the effective rate card at execution time
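
Resolving "the effective rate card at execution time" can be pictured as picking the newest card whose effective date is not after the execution timestamp. The `RateCard` type and the tie-break on version below are assumptions for illustration, not Eve's internal model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Sequence


@dataclass(frozen=True)  # frozen mirrors the immutability of rate cards
class RateCard:
    name: str
    version: int
    effective_at: datetime


def resolve_rate_card(
    cards: Sequence[RateCard], name: str, executed_at: datetime
) -> Optional[RateCard]:
    """Return the latest card named `name` already in effect at `executed_at`."""
    candidates = [
        card for card in cards
        if card.name == name and card.effective_at <= executed_at
    ]
    # Latest effective date wins; a higher version breaks ties (assumed rule).
    return max(candidates, key=lambda c: (c.effective_at, c.version), default=None)
```

Because cards are versioned and never edited in place, re-running this lookup for an old attempt returns the same card, which keeps receipts auditable.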

Per-job budgets

Jobs can set per-attempt budgets via scheduling hints in the manifest or at job creation:

x-eve:
  defaults:
    hints:
      max_cost:
        currency: usd
        amount: 5
      max_tokens: 200000
      resource_class: job.c1

The worker tracks llm.call events during execution and terminates attempts with BUDGET_EXCEEDED when limits are breached. Budget enforcement is fail-open -- if pricing configuration cannot be resolved, the job continues rather than blocking.
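
The fail-open behaviour can be sketched as follows. This is not the worker's actual code: the event fields, the `price_event` callback, and the choice to treat "breached" as strictly greater than the limit are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

BUDGET_EXCEEDED = "BUDGET_EXCEEDED"


def check_budget(
    events: Iterable[dict],
    price_event: Callable[[dict], float],
    max_tokens: Optional[int] = None,
    max_cost: Optional[float] = None,
) -> Optional[str]:
    """Return BUDGET_EXCEEDED once a limit is breached, otherwise None.

    `price_event` is a hypothetical callback that prices one llm.call event
    and raises LookupError when no pricing configuration can be resolved.
    """
    tokens = 0
    cost = 0.0
    pricing_available = True
    for event in events:
        tokens += event.get("input_tokens", 0) + event.get("output_tokens", 0)
        if max_tokens is not None and tokens > max_tokens:
            return BUDGET_EXCEEDED
        if max_cost is None or not pricing_available:
            continue
        try:
            cost += price_event(event)
        except LookupError:
            # Fail open: stop enforcing the cost limit instead of blocking.
            pricing_available = False
            continue
        if cost > max_cost:
            return BUDGET_EXCEEDED
    return None
```

In this sketch the token limit is still enforced when pricing is unavailable, since it does not depend on a rate card.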

Org balance and usage

Org balances are tracked via an immutable ledger. Non-job resources (services, PVCs, managed databases) are periodically metered into usage_records and charged against org balances.

# View org balance
eve admin balance show <org_id>

# Credit an org
eve admin balance credit <org_id> --amount 100 --currency usd --reason "Monthly allocation"

# View transaction history
eve admin balance transactions <org_id> --since 2026-01-01

# View non-job resource usage
eve admin usage summary --org <org_id>
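
Conceptually, an immutable ledger means the balance is never stored as a mutable number; it is derived by summing append-only entries. The classes below are a hypothetical model for illustration and do not reflect Eve's storage schema.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)  # entries cannot be modified after creation
class LedgerEntry:
    amount: Decimal  # positive for credits, negative for charges
    currency: str
    reason: str


class Ledger:
    """Append-only list of entries; corrections are new entries, not edits."""

    def __init__(self):
        self._entries = []

    def append(self, entry: LedgerEntry) -> None:
        self._entries.append(entry)

    def balance(self, currency: str) -> Decimal:
        """Sum every entry recorded in the given currency."""
        return sum(
            (e.amount for e in self._entries if e.currency == currency),
            Decimal("0"),
        )
```

Using `Decimal` rather than floats avoids rounding drift when many small metered charges accumulate.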

Environment suspension

When org balances fall below thresholds, the suspension controller can suspend environments. Suspended environments block deploys and job creation until resumed:

eve env suspend <project> <env> --reason "Balance depleted"
eve env resume <project> <env>

Provider and model discovery

Eve surfaces available LLM providers and models through public endpoints:

# List providers
eve providers list

# List models from a specific provider
eve providers models <name>

# List all available managed models
eve models list

Managed models are used by setting harness_options.model to managed/<name> in the manifest or job creation.

OpenTelemetry integration

Eve supports OpenTelemetry (OTEL) for integration with external observability platforms. Telemetry is exported with the OTLP HTTP exporter and automatic Node.js instrumentation.

Configuration

Variable                     Purpose
OTEL_ENABLED                 Enable OTEL (true / false)
OTEL_DISABLED                Hard-disable OTEL (set to true to override)
OTEL_EXPORTER_OTLP_ENDPOINT  Collector endpoint (e.g., http://otel-collector:4318)

OTEL is automatically enabled when OTEL_EXPORTER_OTLP_ENDPOINT is set. Traces include correlation IDs and job context, allowing you to link Eve operations to your existing observability stack.
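
One plausible reading of how the three variables combine is sketched below. The precedence shown (hard disable first, then explicit enable, then endpoint presence) is an assumption inferred from the table; the platform's actual resolution logic may differ.

```python
from typing import Mapping


def otel_enabled(env: Mapping[str, str]) -> bool:
    """Decide whether OTEL should be active for the given environment."""

    def is_true(name: str) -> bool:
        return env.get(name, "").strip().lower() == "true"

    if is_true("OTEL_DISABLED"):
        return False  # the hard disable overrides everything else
    if is_true("OTEL_ENABLED"):
        return True
    # Otherwise OTEL turns on automatically when a collector endpoint is set.
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip())
```

Passing `os.environ` as `env` evaluates the current process environment.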

Real-time monitoring

Job-level monitoring

# Stream harness logs as they happen (SSE)
eve job follow <job-id>

# Combined status + logs streaming
eve job watch <job-id>

# Stream K8s runner pod logs
eve job runner-logs <job-id>

# Wait with status updates
eve job wait <job-id> --verbose

System-level monitoring

# Quick health check
eve system health

# Platform service logs
eve system logs api
eve system logs orchestrator
eve system logs worker
eve system logs postgres

CLI reference

Command                                     Purpose
eve job receipt <job-id>                    View execution receipt
eve job compare <job-id> <a> <b> --receipt  Compare attempts with receipts
eve job follow <job-id>                     Stream logs with live cost totals
eve analytics summary --org <id>            Org-wide analytics summary
eve analytics jobs --org <id> --window 7d   Job analytics for time window
eve analytics pipelines --org <id>          Pipeline performance metrics
eve analytics env-health --org <id>         Environment health snapshot
eve system health                           Platform health check
eve system logs <service>                   Platform service logs
eve admin balance show <org_id>             View org balance
eve admin usage summary --org <org_id>      View resource usage
eve providers list                          List LLM providers
eve models list                             List available models

Note: Analytics endpoints require orgs:read permission. Empty orgs return zeroed summaries rather than 404 errors. The window parameter accepts 1d, 7d, 30d, or 90d.

See CLI Commands for the full command reference.