Stack Metrics
Choose the right RunsOn observability surface
RunsOn exposes four distinct monitoring surfaces. They solve different problems, and earlier documentation blurred the distinctions between them.
Pick the right page
| If you need | Open |
|---|---|
| The authoritative OTEL reference and full generated metric inventory | /monitoring/opentelemetry/ |
| Per-job ASCII charts for right-sizing and troubleshooting | /monitoring/job-metrics/ |
| A high-level overview of all monitoring options | This page |
Signal matrix
| Surface | What it gives you | Best for | Important limits |
|---|---|---|---|
| OpenTelemetry | Server logs, metrics, and traces, plus runner host metrics, bootstrap logs, and RunsOn-emitted traces when extras=otel is enabled | Centralized observability backends such as Grafana, Datadog, SigNoz, or New Relic | extras=otel does not automatically export the full GitHub job log stream or create per-step spans |
| Prometheus /metrics | Legacy server metrics compatibility endpoint | Existing scrape-based Prometheus setups | No runner signals, no traces, and no OTLP transport |
| CloudWatch dashboard and logs | Built-in AWS-native views from structured server logs plus instance log groups | Quick operational visibility inside AWS | The dashboard is mostly log-derived, so it is less flexible and more brittle than a first-class OTEL backend |
| Inline job metrics | ASCII charts in GitHub Actions job steps; the built-in runner path also uploads raw metrics.jsonl to S3 | Right-sizing and debugging a single job | Separate from remote OTLP export; runs-on/action@v2 and the built-in runner path are different variants |
OpenTelemetry
If you configure OtelExporterEndpoint, RunsOn exports server-side OTLP logs, metrics, and traces. Current runners already run a local collector for built-in job metrics; if you also enable extras=otel on a job, or on the runner spec used by a pool, that collector exports runner host metrics, bootstrap logs, and RunsOn-emitted traces to the same OTLP destination.
The full behavior, exact emitted signals, and generated inventory live on the dedicated OpenTelemetry page.
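As a sketch of what enabling the runner-side signals might look like, here is a hypothetical workflow job with extras=otel in its runs-on label. The runner size and the exact label segments (runner=, extras=) are assumptions for illustration; verify the label syntax against the OpenTelemetry page for your RunsOn version.

```yaml
# Hypothetical snippet: opt one job into runner-side OTEL export.
# Label segments shown here are assumed, not authoritative.
jobs:
  build:
    runs-on: runs-on=${{ github.run_id }}/runner=2cpu-linux-x64/extras=otel
    steps:
      - uses: actions/checkout@v4
      - run: make build
```

With extras=otel set, the runner's local collector forwards host metrics, bootstrap logs, and RunsOn-emitted traces to the same OTLP endpoint the server uses.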
Prometheus
RunsOn still supports a legacy Prometheus /metrics endpoint on the server. This is useful when you already have Prometheus scraping in place, but it is a compatibility surface rather than the preferred long-term observability model.
The endpoint is enabled by setting ServerPassword. For new installs, prefer OTLP export unless you specifically need Prometheus scraping.
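For teams keeping an existing scrape setup, a minimal Prometheus scrape config might look like the sketch below. The hostname is a placeholder, and the authentication scheme (shown here as HTTP basic auth carrying ServerPassword) is an assumption; confirm how the endpoint expects the password before relying on this.

```yaml
# Hypothetical scrape config for the legacy /metrics endpoint.
# Target address and auth scheme are assumptions, not authoritative.
scrape_configs:
  - job_name: runs-on-server
    metrics_path: /metrics
    static_configs:
      - targets: ["runs-on.example.com"]
    basic_auth:
      username: admin            # placeholder
      password: "<ServerPassword>"
```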
CloudWatch
RunsOn creates an embedded CloudWatch dashboard and instance log groups for AWS-native troubleshooting. This is useful for quick checks, alarms, and instance-level investigation.
The important limitation is structural: the dashboard widgets are built from structured logs and AWS-native metrics rather than from a first-class OTEL time-series backend. That makes CloudWatch handy for operational visibility, but less expressive for long-lived observability, correlation, and custom analysis.
Inline job metrics
RunsOn now has two inline job-metrics variants. The built-in runner path renders ASCII charts in the "Complete runner" step and uploads metrics.jsonl to your RunsOn S3 bucket. The optional runs-on/action@v2 path renders CloudWatch-backed charts in the "Post Run runs-on/action@v2" step and does not generate metrics.jsonl.
That flow is intentionally separate from OTLP export. You can have inline ASCII charts without sending runner metrics to your OTEL backend. See Job Metrics for the step-level metrics flow and OpenTelemetry for remote OTLP behavior.
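As a sketch of the optional variant, a job using runs-on/action@v2 might look like this. The runner label and the absence of action inputs are assumptions for illustration; see the Job Metrics page for the authoritative setup.

```yaml
# Hypothetical job using the optional runs-on/action@v2 variant.
# Charts render in the "Post Run runs-on/action@v2" step;
# no metrics.jsonl is written with this path.
jobs:
  test:
    runs-on: runs-on=${{ github.run_id }}/runner=4cpu-linux-x64
    steps:
      - uses: runs-on/action@v2
      - uses: actions/checkout@v4
      - run: make test
```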