Live Observability Infrastructure

Analytics Worker

Observability worker for uptime checks, incident tracking, analytics proxying, and runbook triggers.

Analytics Worker monitors sites in the network, sends downtime alerts, proxies analytics through first-party routes, and exposes operational APIs for Golem and dashboards.

← All projects

Architecture

How it's wired.

How It's Built

Implementation notes.

PostHog analytics proxy

Every channel site gets a ph.<domain> subdomain. The Worker proxies requests to PostHog EU endpoints so analytics stays first-party in the browser.

Uptime monitoring with alerting

A cron trigger runs every 5 minutes and checks target status, body validity, latency, and forms. State changes trigger Discord alerts and Golem webhooks.

Job heartbeat supervision

Background jobs POST heartbeats to /ops/jobs/heartbeat. If a job misses its SLA window, an incident is opened. This catches silent background failures.

Observability REST API

Authenticated endpoints expose summaries, incidents, anomalies, target details, and operational actions. These APIs are consumed by Golem and dashboards.

Runbook execution (approval-gated)

Runbook execution opens an incident and sends a webhook to Golem for approval. Only allowlisted runbook IDs are accepted, with Discord confirmation before execution.

Frontend crash logging

POST /_log accepts frontend crash reports with a shared token and stores them in KV. Critical crashes can then be forwarded to incidents and Discord.

Primitives Used

Cloudflare primitives in this project.

Workers Single worker handling all proxy, monitoring, and API functions

KV Monitoring state (target status, incidents), job heartbeat timestamps, crash logs

Cron Triggers 5-minute monitoring cycle

Zone Routes 13 domain-specific route bindings for analytics proxy (ph.<domain>)

Why This Design

Why I built it this way.

"The interesting part is scope: multiple observability responsibilities in one worker with explicit state and APIs. It keeps operational plumbing small while still giving enough visibility for day-to-day incident response."

← All projects Why Cloudflare