Analytics Worker
Observability infrastructure for the whole network. Uptime monitoring, incident management, analytics proxying, and runbook execution — one Worker, zero servers.
The Analytics Worker is the operational backbone. It monitors every site in the network, alerts on downtime, proxies analytics past ad blockers, and exposes observability APIs that Golem and dashboards consume. It's the thing that wakes me up when something's broken.
How It's Built
Architecture and implementation.
PostHog analytics proxy
Ad blockers kill third-party analytics. Every channel site gets a ph.<domain> subdomain. The Worker proxies requests to PostHog's EU servers: static assets to PostHog's CDN, API calls to the ingestion endpoint. The browser only ever sees a first-party subdomain, so blockers have nothing to match.
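A minimal sketch of the proxy's routing rule. The upstream hostnames follow PostHog's documented EU reverse-proxy setup (`eu-assets.i.posthog.com` for static assets, `eu.i.posthog.com` for ingestion); the handler shape is an assumption about this Worker's implementation.

```typescript
// Requests under /static/ go to PostHog's asset CDN; everything else
// (event capture, decide, etc.) goes to the EU ingestion endpoint.
function rewriteToPostHog(url: URL): string {
  const upstream = url.pathname.startsWith("/static/")
    ? "eu-assets.i.posthog.com" // static assets such as array.js
    : "eu.i.posthog.com";       // ingestion and API calls
  return `https://${upstream}${url.pathname}${url.search}`;
}

export default {
  async fetch(request: Request): Promise<Response> {
    const target = rewriteToPostHog(new URL(request.url));
    // Forward method, headers, and body upstream; the browser only ever
    // talks to the first-party ph.<domain> subdomain.
    return fetch(target, {
      method: request.method,
      headers: request.headers,
      body: request.body,
    });
  },
};
```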
Uptime monitoring with alerting
Cron trigger every 5 minutes. Checks all targets: HTTP status, response body validity, latency (flags anomalies above 3.5s), form verification. State transitions trigger Discord alerts and Golem webhooks.
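The per-target check described above can be sketched as a small classifier. The 3.5 s latency threshold is from the monitor's anomaly flagging; the state names, the "degraded" tier, and the no-alert-on-first-observation rule are assumptions.

```typescript
type Health = "up" | "degraded" | "down";

const LATENCY_ANOMALY_MS = 3500; // latency above this is flagged as anomalous

function classify(status: number, latencyMs: number, bodyOk: boolean): Health {
  // Non-2xx/3xx responses and invalid bodies both count as down.
  if (status < 200 || status >= 400 || !bodyOk) return "down";
  // Slow but serving: flag an anomaly without declaring an outage.
  if (latencyMs > LATENCY_ANOMALY_MS) return "degraded";
  return "up";
}

// Only a state transition fires the Discord alert and Golem webhook;
// a target that stays in the same state between cron runs stays quiet.
function shouldAlert(previous: Health | null, current: Health): boolean {
  return previous !== null && previous !== current;
}
```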
Job heartbeat supervision
Background jobs POST heartbeats to /ops/jobs/heartbeat. Configurable SLA per job — if a job hasn't checked in within the allowed window, an incident is created. Catches the silent failures that cron monitoring misses.
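The staleness check behind heartbeat supervision is simple to sketch. The per-job config shape and field names are assumptions; the endpoint path is from the text above.

```typescript
// Each job POSTs to /ops/jobs/heartbeat; the cron pass compares the stored
// timestamp against that job's SLA window.
interface JobConfig {
  id: string;
  slaMs: number; // how long the job may go silent before an incident opens
}

function isStale(lastBeatMs: number | null, slaMs: number, nowMs: number): boolean {
  // A job that has never checked in is treated as stale immediately.
  if (lastBeatMs === null) return true;
  return nowMs - lastBeatMs > slaMs;
}

// Hypothetical example: a nightly backup allowed 26 hours of silence,
// leaving slack for a slow run before an incident is created.
const backup: JobConfig = { id: "nightly-backup", slaMs: 26 * 60 * 60 * 1000 };
```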
Observability REST API
Authenticated endpoints: GET /ops/summary, /ops/incidents, /ops/anomalies, /ops/target/:id. POST /ops/targets, /ops/runbook/:id/execute, /ops/restart/:target. Consumed by Golem and dashboards.
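A sketch of the routing layer such endpoints imply: bearer-token auth plus a tiny matcher for parameterized paths like /ops/target/:id. The header scheme and the matcher itself are assumptions, not the Worker's actual router.

```typescript
// Match a pattern like "/ops/target/:id" against a concrete pathname,
// returning extracted params on success and null on a miss.
function matchRoute(pattern: string, pathname: string): Record<string, string> | null {
  const p = pattern.split("/");
  const a = pathname.split("/");
  if (p.length !== a.length) return null;
  const params: Record<string, string> = {};
  for (let i = 0; i < p.length; i++) {
    if (p[i].startsWith(":")) params[p[i].slice(1)] = a[i];
    else if (p[i] !== a[i]) return null;
  }
  return params;
}

// Every /ops endpoint is authenticated; a shared bearer token is assumed here.
function authorized(request: Request, token: string): boolean {
  return request.headers.get("Authorization") === `Bearer ${token}`;
}
```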
Runbook execution (approval-gated)
Runbook execution creates an incident and sends a webhook to Golem for approval. Only allowlisted runbook IDs accepted. Actual execution flows through Golem's policy engine with Discord confirmation required.
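The gate can be sketched as a vetting step that never executes anything itself. The runbook IDs below are hypothetical; the real flow hands off to Golem's policy engine for Discord-confirmed execution.

```typescript
// Only allowlisted runbook IDs are accepted; anything else is rejected
// before an incident or webhook is ever created.
const RUNBOOK_ALLOWLIST = new Set(["restart-service", "flush-cache"]); // hypothetical IDs

function vetRunbook(id: string): { ok: boolean; reason?: string } {
  if (!RUNBOOK_ALLOWLIST.has(id)) {
    return { ok: false, reason: "unknown runbook id" };
  }
  // On success the handler would create an incident in KV and POST to the
  // Golem gateway; actual execution waits for Discord confirmation there.
  return { ok: true };
}
```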
Frontend crash logging
POST /_log endpoint accepts crash reports from frontend apps, authenticated via shared token. Logs to KV with plans to forward critical crashes to the incident system and Discord.
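A sketch of the crash sink under stated assumptions: the token header name, KV binding name, and key scheme are all illustrative, not the Worker's actual names.

```typescript
// Minimal Env shape: a shared token plus a KV binding for crash reports.
interface Env {
  LOG_TOKEN: string;
  CRASH_KV: { put(key: string, value: string): Promise<void> };
}

function validToken(header: string | null, secret: string): boolean {
  return header !== null && header === secret;
}

async function handleLog(request: Request, env: Env): Promise<Response> {
  // Frontends authenticate with a shared token; header name is assumed.
  if (!validToken(request.headers.get("X-Log-Token"), env.LOG_TOKEN)) {
    return new Response("forbidden", { status: 403 });
  }
  const report = await request.text();
  // Store the raw report keyed by arrival time; forwarding critical crashes
  // to the incident system and Discord is planned, not implemented.
  await env.CRASH_KV.put(`crash:${Date.now()}`, report);
  return new Response("ok", { status: 202 });
}
```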
Architecture Map
Request flow and service topology
Browser analytics → ph.<domain> → Worker → PostHog EU API / CDN
Cron (every 5 min) → Monitor all targets
├── HTTP checks (status, body, latency)
├── State transitions → KV
├── Incidents → KV + Discord notification
└── Webhook → Golem gateway
Background jobs → POST /ops/jobs/heartbeat → KV timestamp
↓ (if stale past SLA)
Incident created
Golem / Dashboard → GET /ops/* → Health summaries, incidents, anomalies
→ POST /ops/runbook/:id/execute → Incident + Golem webhook
Frontend → POST /_log → KV crash log
Primitives Used
Every Cloudflare binding in this project.
What Makes This Interesting
The architectural angle worth paying attention to.
This is a complete observability platform — monitoring, alerting, incident management, job supervision, runbook execution, analytics proxying — running as a single Cloudflare Worker with KV for state. No Prometheus. No Grafana. No PagerDuty. No dedicated monitoring infrastructure. The cron trigger fires every 5 minutes, checks everything, updates KV, sends alerts if states changed, and goes back to sleep.