Golem
A personal AI operations agent. Chat-driven, policy-gated, with durable memory and real infrastructure side effects — all on Cloudflare.
Golem is an AI assistant that lives in Discord and manages operational tasks across my businesses. It's not a chatbot — it's an operational agent with side effects, confirmation workflows, rate limiting, circuit breakers, and a full audit trail. Built entirely on Workers, Durable Objects, Workflows, D1, and R2.
How It's Built
Architecture and implementation.
Ack-fast webhook pattern
The Gateway Worker answers the Discord webhook with HTTP 200 immediately. Processing happens asynchronously in a Session Durable Object. This prevents Discord timeouts, duplicate processing, and retry cascades.
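A minimal sketch of the ack-fast idea, assuming the slow work can be handed off without being awaited. The `Ack`, `ContextLike`, and `SessionStub` types and the `handleWebhook` name are illustrative stand-ins, not Golem's actual code. In a real Worker the acknowledgement would be a `Response` and the hand-off would presumably go through the runtime's `ctx.waitUntil`.

```typescript
// Minimal stand-ins so the sketch runs outside the Workers runtime.
// In a real Worker these would be the runtime's Response and ExecutionContext.
interface Ack { status: number; body: string }
interface ContextLike { waitUntil(work: Promise<unknown>): void }

// Hypothetical handle to the session object that does the slow processing.
interface SessionStub { process(payload: string): Promise<void> }

function handleWebhook(payload: string, session: SessionStub, ctx: ContextLike): Ack {
  // Hand the slow work to the runtime without awaiting it...
  ctx.waitUntil(session.process(payload));
  // ...and acknowledge right away so the sender never waits on processing.
  return { status: 200, body: "ok" };
}
```

The handler never awaits the session, so its latency does not depend on how long the agent takes to respond.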
Session coordination with Durable Objects
Each Discord chat gets its own Durable Object (DO) instance, which handles deduplication (24-hour replay protection), session state management, and pending action coordination.
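The replay protection can be sketched as a small guard keyed by message ID. This is an in-memory illustration only: the `ReplayGuard` class and its `accept` method are hypothetical names, and a real Durable Object would presumably keep this state in its durable storage instead of a plain `Map`.

```typescript
const REPLAY_WINDOW_MS = 24 * 60 * 60 * 1000; // 24 hours, as described above

class ReplayGuard {
  // message id -> time first seen (ms since epoch)
  private seen = new Map<string, number>();

  // Returns true if the message is new and should be processed.
  accept(messageId: string, now: number): boolean {
    // Drop entries older than the window so the map does not grow forever.
    this.seen.forEach((firstSeen, id) => {
      if (now - firstSeen >= REPLAY_WINDOW_MS) this.seen.delete(id);
    });
    if (this.seen.has(messageId)) return false; // replay inside the window
    this.seen.set(messageId, now);
    return true;
  }
}
```

Because each chat has its own instance, the guard only ever sees one chat's messages, so no cross-chat locking is needed.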
Agent execution with Workflows
GolemAgentWorkflow orchestrates the multi-step loop: load soul context → load history → parse commands or call Claude → evaluate policy → execute or confirm → feed result → next iteration. Up to 10 turns.
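The loop above can be sketched as a bounded tool-use loop. Everything named here (`ModelStep`, `CallModel`, `RunTool`, `runAgentLoop`) is an assumption for illustration. Context loading, command parsing, and policy evaluation are reduced to comments, and in a real Workflow each turn would presumably run inside its own checkpointed step.

```typescript
const MAX_TURNS = 10; // turn cap stated above

// Hypothetical shapes: the model either answers or asks for a tool.
type ModelStep =
  | { kind: "final"; text: string }
  | { kind: "tool"; name: string; input: string };

type CallModel = (transcript: string[]) => Promise<ModelStep>;
type RunTool = (name: string, input: string) => Promise<string>;

async function runAgentLoop(
  prompt: string,
  callModel: CallModel,
  runTool: RunTool,
): Promise<string> {
  // Soul context and history would be loaded into the transcript here.
  const transcript = [prompt];
  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const step = await callModel(transcript);
    if (step.kind === "final") return step.text;
    // Policy evaluation would sit here, before the tool runs.
    const result = await runTool(step.name, step.input);
    transcript.push(`${step.name} -> ${result}`); // feed the result into the next turn
  }
  return "Stopped: turn limit reached.";
}
```

The hard cap means a model that keeps requesting tools cannot run indefinitely.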
Policy engine
Every capability has a D1 policy record: confirmation required? Rate limited? Cooldown? Circuit breaker tripped? Mutations frozen? All enforced before any tool executes.
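As a rough sketch, the checks above can be expressed as one pure function that runs before any tool call. The field names, the per-hour rate window, and the order of the checks are assumptions. The real D1 schema is not shown here.

```typescript
// Hypothetical policy row; the actual D1 columns may differ.
interface CapabilityPolicy {
  requiresConfirmation: boolean;
  maxCallsPerHour: number;
  cooldownMs: number;
  circuitOpen: boolean;
  mutationsFrozen: boolean;
}

// Hypothetical usage data for the call being evaluated.
interface CallContext {
  isMutation: boolean;
  callsInLastHour: number;
  lastCallAt: number | null; // ms since epoch, or null if never called
}

type Decision =
  | { action: "execute" }
  | { action: "confirm" }
  | { action: "reject"; reason: string };

function evaluatePolicy(policy: CapabilityPolicy, call: CallContext, now: number): Decision {
  // Hard stops first: nothing runs while a breaker is open or mutations are frozen.
  if (policy.circuitOpen) return { action: "reject", reason: "circuit breaker open" };
  if (policy.mutationsFrozen && call.isMutation) return { action: "reject", reason: "mutations frozen" };
  if (call.callsInLastHour >= policy.maxCallsPerHour) return { action: "reject", reason: "rate limited" };
  if (call.lastCallAt !== null && now - call.lastCallAt < policy.cooldownMs) {
    return { action: "reject", reason: "cooldown active" };
  }
  // Only a call that passes every limit can reach the confirmation step.
  if (policy.requiresConfirmation) return { action: "confirm" };
  return { action: "execute" };
}
```

Keeping the decision pure makes it easy to test each rule without touching D1 or a skill worker.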
Isolated skill workers
Each capability runs in its own Worker via service bindings: web-fetch, email, hosting, server-admin, observability. A skill can crash or be redeployed without affecting the gateway.
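One way to picture the isolation is a small dispatcher that calls a skill through its binding and turns any failure into a plain result. The `SkillBinding` and `SkillReply` interfaces are minimal stand-ins for a service binding and its `Response`. The `callSkill` name and the placeholder URL are assumptions.

```typescript
// Minimal stand-in for a service binding: a real binding exposes fetch()
// and returns a Response, of which only ok/status/text() are used here.
interface SkillReply { ok: boolean; status: number; text(): Promise<string> }
interface SkillBinding {
  fetch(url: string, init: { method: string; body: string }): Promise<SkillReply>;
}

type SkillResult = { ok: true; output: string } | { ok: false; error: string };

// Skill names follow the architecture map; the URL is a hypothetical placeholder.
async function callSkill(
  skills: Record<string, SkillBinding>,
  name: string,
  input: string,
): Promise<SkillResult> {
  const skill = skills[name];
  if (!skill) return { ok: false, error: `unknown skill: ${name}` };
  try {
    const reply = await skill.fetch("https://skill.internal/run", { method: "POST", body: input });
    if (!reply.ok) return { ok: false, error: `skill returned ${reply.status}` };
    return { ok: true, output: await reply.text() };
  } catch (err) {
    // A failing skill becomes an error result instead of taking down the gateway.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

The error result can be fed back to the model like any other tool output, so one broken skill degrades a single capability and leaves the rest of the agent working.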
Durable memory
Facts stored in D1 with confidence scores and source attribution. Memory injected into Claude's context on every turn. Redaction system prevents future extraction of sensitive topics.
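A minimal sketch of how facts might be gated by redaction and rendered for injection. The `MemoryFact` shape, the per-fact `topic` field, and the `minConfidence` cutoff are assumptions made for illustration. They are not the actual D1 schema or injection format.

```typescript
// Hypothetical fact shape; the actual D1 columns are not given in this document.
interface MemoryFact { topic: string; text: string; confidence: number; source: string }

// Keep a candidate fact only if its topic has not been redacted.
function shouldStore(fact: MemoryFact, redactedTopics: string[]): boolean {
  return redactedTopics.indexOf(fact.topic) === -1;
}

// Render stored facts as a context block, most confident first.
function renderMemory(facts: MemoryFact[], minConfidence: number): string {
  return facts
    .filter((f) => f.confidence >= minConfidence) // filter() copies, so the input is not reordered
    .sort((a, b) => b.confidence - a.confidence)
    .map((f) => `- ${f.text} (confidence ${f.confidence.toFixed(2)}, source: ${f.source})`)
    .join("\n");
}
```

Checking redaction at write time means a redacted topic never reaches storage, so it cannot resurface in a later context window.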
Architecture Map
Request flow and service topology
Discord webhook → Gateway Worker → HTTP 200 (immediate)
↓ (async)
Session Durable Object
├── Dedup check (24h replay protection)
├── Upsert session + message → D1
└── Kick GolemAgentWorkflow
↓
Load context:
├── Soul files ← R2
├── Chat history ← D1
└── Memory facts ← D1
↓
Claude API (tool-use loop, max 10 turns)
↓
Policy check per tool call
├── Needs confirmation → Pending action + Discord buttons
├── Rate limited → Reject
└── Clear → Execute via skill worker (service binding)
    ├── skills-web-fetch
    ├── skills-email
    ├── skills-hosting
    ├── skills-server-admin
    └── skills-observability
↓
Final text response → Discord API
Cron (every 5 min) → Expire pending actions past TTL
Primitives Used
Every Cloudflare binding in this project.
What Makes This Interesting
The architectural angle worth paying attention to.
Golem is an AI agent with real side effects — restarting servers, sending emails, executing runbooks — and all the safety infrastructure is built on Cloudflare primitives. The policy engine, rate limiting, circuit breakers, confirmation workflows, and audit trails aren't bolted-on middleware. They're D1 records, Durable Object state, and Workflow checkpoints. The soul files in R2 mean personality and behavioral rules can change without redeploying code.