Orchestrator
A content pipeline that turns YouTube videos into published articles. AI-assisted, editorially controlled, running on 10 microservices — all on Cloudflare Workers.
The Orchestrator is the production backbone for the Channel Sites network. It discovers YouTube videos, matches them to scripts, selects images from approved sources, generates SEO-optimized articles with AI, suggests internal links, and publishes to the sites — all through durable Workflows with human-in-the-loop checkpoints.
How It's Built
Architecture and implementation.
YouTube discovery + script matching
Fetches all channel videos via YouTube Data API, stores metadata in D1. Scripts matched to videos via fuzzy Jaccard similarity (≥0.9) against Trello cards. Handles .docx, .pdf, and Google Docs.
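As a rough illustration of the matching step, the sketch below computes Jaccard similarity over word tokens and keeps the best card at or above 0.9. The tokenizer, the Card shape, and the function names are assumptions made for this example. The document does not say how titles are normalized or which fields are compared.

```typescript
// Hypothetical helper: lowercase the text and keep alphanumeric word tokens.
function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, a value in [0, 1].
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  a.forEach((token) => {
    if (b.has(token)) intersection++;
  });
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Assumed minimal shape of a Trello card for this sketch.
interface Card {
  id: string;
  name: string;
}

// Returns the highest-scoring card at or above the threshold, or null.
function matchCard(videoTitle: string, cards: Card[], threshold = 0.9): Card | null {
  const titleTokens = tokenize(videoTitle);
  let best: Card | null = null;
  let bestScore = 0;
  for (const card of cards) {
    const score = jaccard(titleTokens, tokenize(card.name));
    if (score >= threshold && score > bestScore) {
      best = card;
      bestScore = score;
    }
  }
  return best;
}
```

A threshold this high tolerates differences in case and punctuation but rejects titles that differ by more than a word or so, which keeps false matches rare at the cost of leaving some videos unmatched.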
AI image selection with editorial allowlist
Claude analyzes scripts to extract image search queries. Sources: Wikimedia, UN Photos, NATO, DVIDS, government Flickr. Each candidate scored for relevance, authenticity, dimensions (≥1200px), and licensing.
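One way to picture the scoring step is a set of hard gates (allowlisted source, acceptable license, minimum size) followed by a weighted score. Everything specific in the sketch below is an assumption: the hostnames are illustrative, the 1200px minimum is applied to width, and the weights and acceptance cutoff are invented for the example.

```typescript
interface ImageCandidate {
  url: string;
  width: number;        // pixels
  relevance: number;    // 0..1, e.g. a model's judgment against the search query
  authenticity: number; // 0..1
  licenseOk: boolean;
}

// Illustrative hostnames only; the project's real allowlist is not shown here.
const ALLOWED_HOSTS = ["commons.wikimedia.org", "upload.wikimedia.org", "www.dvidshub.net"];
const MIN_WIDTH = 1200;

function isAllowedSource(url: string): boolean {
  return ALLOWED_HOSTS.indexOf(new URL(url).hostname) !== -1;
}

type Verdict = "reject" | "needs_review" | "accept";

// Hard gates first, then a weighted score (weights are assumptions).
function evaluate(c: ImageCandidate, acceptAt = 0.8): { score: number; verdict: Verdict } {
  if (!isAllowedSource(c.url) || !c.licenseOk || c.width < MIN_WIDTH) {
    return { score: 0, verdict: "reject" };
  }
  const score = 0.6 * c.relevance + 0.4 * c.authenticity;
  return { score, verdict: score >= acceptAt ? "accept" : "needs_review" };
}
```

Keeping provenance and licensing as gates rather than score components means a highly relevant image from an unapproved source can never outscore its way into an article.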
SEO article generation pipeline
Multi-step AI pipeline: cleanup → draft from source script (no hallucinated facts) → enrich with key takeaways and FAQ → optimize metadata → assemble final markdown.
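The staged shape of this pipeline can be sketched as an ordered list of steps that each take and return the article state. The step bodies below are plain string stubs standing in for model calls, the ArticleState fields are assumptions, and the FAQ part of the enrich step is left out for brevity.

```typescript
// Assumed state shape passed from step to step.
interface ArticleState {
  script: string;
  draft: string;
  takeaways: string[];
  title: string;
  markdown: string;
}

type Step = { name: string; run: (state: ArticleState) => Promise<ArticleState> };

// Runs the steps strictly in order; each step sees the previous step's output.
async function runPipeline(steps: Step[], script: string): Promise<ArticleState> {
  let state: ArticleState = { script, draft: "", takeaways: [], title: "", markdown: "" };
  for (const step of steps) {
    state = await step.run(state);
  }
  return state;
}

// Stub steps: each one stands in for a model call in the real pipeline.
const steps: Step[] = [
  { name: "cleanup", run: async (s) => ({ ...s, script: s.script.replace(/\s+/g, " ").trim() }) },
  // The draft is derived only from the source script, mirroring the fact-preservation constraint.
  { name: "draft", run: async (s) => ({ ...s, draft: s.script }) },
  { name: "enrich", run: async (s) => ({ ...s, takeaways: s.draft.split(". ").slice(0, 1) }) },
  { name: "optimize", run: async (s) => ({ ...s, title: s.draft.slice(0, 60) }) },
  {
    name: "assemble",
    run: async (s) => ({
      ...s,
      markdown: `# ${s.title}\n\n${s.draft}\n\n## Key takeaways\n\n${s.takeaways.map((t) => `- ${t}`).join("\n")}`,
    }),
  },
];
```

Because every step has the same signature, a step can be retried, timed, or swapped without touching its neighbours.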
Link suggestions via Vectorize
All published articles indexed as embeddings. New articles query for semantically similar content, generating internal and cross-site link suggestions stored in D1 for editorial review.
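A minimal sketch of the suggestion query follows, assuming the index exposes a query(vector, { topK }) call that returns scored matches, as the Vectorize binding does. VectorIndexLike is a deliberately small stand-in interface so the example stays self-contained. The LinkSuggestion shape, the minimum score, and the "pending_review" status are assumptions rather than the project's schema.

```typescript
interface VectorMatch {
  id: string;
  score: number;
}

// Structural subset of a vector index binding, declared locally for this sketch.
interface VectorIndexLike {
  query(vector: number[], options: { topK: number }): Promise<{ matches: VectorMatch[] }>;
}

// Assumed row shape for a suggestion awaiting editorial review.
interface LinkSuggestion {
  fromId: string;
  toId: string;
  score: number;
  status: "pending_review";
}

async function suggestLinks(
  index: VectorIndexLike,
  embedding: number[],
  articleId: string,
  topK = 5,
  minScore = 0.75, // assumed cutoff
): Promise<LinkSuggestion[]> {
  // Ask for one extra match in case the article itself is already indexed.
  const result = await index.query(embedding, { topK: topK + 1 });
  return result.matches
    .filter((m) => m.id !== articleId && m.score >= minScore)
    .slice(0, topK)
    .map((m) => ({ fromId: articleId, toId: m.id, score: m.score, status: "pending_review" as const }));
}
```

Storing suggestions with a review status rather than inserting links directly keeps the final linking decision with an editor.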
Human-in-the-loop publishing
Low-confidence images get flagged on Trello for manual approval with a 72-hour polling window. Final articles committed to GitHub as markdown, triggering CI/CD.
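The 72-hour polling window reduces to a small decision function that a workflow could call on each poll. The sketch below is an assumption about how that decision might look. The names are hypothetical, and the document does not say what the pipeline does once the window lapses, so the "timed_out" outcome is left for the caller to interpret.

```typescript
const APPROVAL_WINDOW_MS = 72 * 60 * 60 * 1000; // 72 hours

// null means no reviewer has acted on the Trello card yet.
type Decision = "approved" | "rejected" | null;
type PollOutcome = "approved" | "rejected" | "keep_polling" | "timed_out";

// Decide what to do on one poll, given when the image was flagged and the current time.
function pollOutcome(flaggedAtMs: number, nowMs: number, decision: Decision): PollOutcome {
  if (decision !== null) return decision;
  return nowMs - flaggedAtMs >= APPROVAL_WINDOW_MS ? "timed_out" : "keep_polling";
}
```

Passing the clock in as an argument keeps the function deterministic, which matters when a durable workflow replays steps after a restart.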
Idempotency at every stage
Post existence check before generation. Every pipeline execution recorded in workflow_runs. Per-phase timing, decision data, and AI outputs logged with correlation IDs.
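The check-then-generate pattern can be sketched as follows. PipelineStore is a hypothetical interface standing in for the D1-backed persistence layer. The RunRecord fields are assumptions loosely modelled on the workflow_runs description above, not the actual table schema.

```typescript
// Assumed shape of one row in workflow_runs.
interface RunRecord {
  correlationId: string;
  videoId: string;
  status: "skipped" | "completed";
  durationMs: number;
}

// Hypothetical persistence interface; a real version would wrap D1 queries.
interface PipelineStore {
  postExists(videoId: string): Promise<boolean>;
  recordRun(run: RunRecord): Promise<void>;
}

// Generates a post only if none exists, and records the run either way.
async function generateOnce(
  store: PipelineStore,
  videoId: string,
  correlationId: string,
  generate: () => Promise<void>,
): Promise<RunRecord> {
  const started = Date.now();
  const exists = await store.postExists(videoId);
  if (!exists) {
    await generate();
  }
  const run: RunRecord = {
    correlationId,
    videoId,
    status: exists ? "skipped" : "completed",
    durationMs: Date.now() - started,
  };
  await store.recordRun(run);
  return run;
}
```

Recording skipped runs as well as completed ones means the audit trail shows why a video produced no new article. Note that a check followed by a write is not atomic on its own, so concurrent runs for the same video would also need a uniqueness constraint at the storage layer.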
Architecture Map
Request flow and service topology
orchestrator-sync (hourly cron trigger)
→ orchestrator-youtube (video discovery)
→ orchestrator-trello (script card matching)
→ orchestrator-store (D1/R2 persistence)
→ orchestrator-script (script extraction + normalization)
→ orchestrator-match (video-to-card matching)
→ orchestrator-workflow (durable orchestration)
→ orchestrator-image (image discovery + scoring)
→ orchestrator-seo (article generation pipeline)
→ orchestrator-linker (semantic link suggestions)
Primitives Used
Every Cloudflare binding in this project.
What Makes This Interesting
The architectural angle worth paying attention to.
The Orchestrator is a content pipeline built like a production system: idempotency at every stage, durable execution with Workflows, human-in-the-loop approval via Trello with async polling, fact-preservation constraints on AI generation, source allowlisting for image provenance, and full audit trails with timing and decision data. Decomposing the pipeline into Workers connected by service bindings means each stage scales, deploys, and fails independently. Total infrastructure: 10 Workers, a D1 database, two R2 buckets, and a Vectorize index.