Orchestrator
Content pipeline that turns YouTube videos into published articles with AI assistance and editorial controls.
Orchestrator powers publishing for the Channel Sites network. It handles discovery, script matching, image selection, article generation, link suggestions, and the publishing flow, with checkpoints and review steps along the way.
Architecture
How it's wired.
How It's Built
Implementation notes.
YouTube discovery + script matching
Pulls channel videos from the YouTube Data API and stores their metadata in D1. Scripts are matched against Trello cards using fuzzy Jaccard similarity, accepting only matches that score 0.9 or higher. Scripts can be .docx files, .pdf files, or Google Docs.
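As a rough illustration of the matching step, here is a minimal sketch of Jaccard similarity with the 0.9 threshold. It assumes the comparison is made on word-level tokens of the video title and the Trello card name; the source does not say which fields or tokenization are used, and `tokenize`, `jaccard`, `matchScript`, and `Card` are hypothetical names.

```typescript
// Hypothetical shape of a Trello card; only the fields this sketch needs.
interface Card {
  id: string;
  name: string;
}

// Split text into a set of lowercase alphanumeric words (assumed tokenization).
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|. Returns 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let intersection = 0;
  for (const token of a) {
    if (b.has(token)) intersection++;
  }
  const union = a.size + b.size - intersection;
  return intersection / union;
}

// Return the best-scoring card at or above the threshold, or null if none qualifies.
function matchScript(videoTitle: string, cards: Card[], threshold = 0.9): Card | null {
  const titleTokens = tokenize(videoTitle);
  let best: Card | null = null;
  let bestScore = 0;
  for (const card of cards) {
    const score = jaccard(titleTokens, tokenize(card.name));
    if (score >= threshold && score > bestScore) {
      best = card;
      bestScore = score;
    }
  }
  return best;
}
```

With a threshold this high, titles that differ only in case or punctuation still match, while a title with even one extra word out of five falls below the cutoff.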
AI image selection with editorial allowlist
Claude extracts image search queries from scripts. Sources are allowlisted (Wikimedia, UN Photos, NATO, DVIDS, government Flickr). Candidates are scored for relevance, authenticity, size (>=1200px), and licensing.
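The allowlist and scoring step might look something like the sketch below. Several things here are assumptions rather than details from the project: the host list is illustrative, the 1200px minimum is treated as a hard gate on width, licensing is a boolean, and the 0.6/0.4 weights and all type and function names are hypothetical.

```typescript
// Hypothetical candidate shape; relevance and authenticity are assumed to be 0..1 scores.
interface ImageCandidate {
  url: string;
  widthPx: number;
  relevance: number;
  authenticity: number;
  licenseOk: boolean;
}

// Illustrative allowlist only; the project's real host list is not given.
const ALLOWED_HOSTS = ["wikimedia.org", "dvidshub.net", "nato.int"];
const MIN_WIDTH_PX = 1200;

// True only when the URL parses and its host is an allowlisted domain or a subdomain of one.
function isAllowlisted(url: string): boolean {
  let host: string;
  try {
    host = new URL(url).hostname;
  } catch {
    return false; // unparseable URLs are never trusted
  }
  return ALLOWED_HOSTS.some((allowed) => host === allowed || host.endsWith("." + allowed));
}

// Returns null for rejected candidates, otherwise a weighted score (weights are assumed).
function scoreCandidate(c: ImageCandidate): number | null {
  if (!isAllowlisted(c.url) || !c.licenseOk || c.widthPx < MIN_WIDTH_PX) return null;
  return 0.6 * c.relevance + 0.4 * c.authenticity;
}

// Pick the highest-scoring candidate that passed every gate, or null if none did.
function pickBest(candidates: ImageCandidate[]): ImageCandidate | null {
  let best: ImageCandidate | null = null;
  let bestScore = -Infinity;
  for (const c of candidates) {
    const score = scoreCandidate(c);
    if (score !== null && score > bestScore) {
      best = c;
      bestScore = score;
    }
  }
  return best;
}
```

Matching on the exact host or a dot-prefixed suffix keeps look-alike domains such as `evilwikimedia.org` out of the allowlist.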
SEO article generation pipeline
Multi-step AI flow: cleanup, source-grounded draft, key takeaways/FAQ enrichment, metadata optimization, then final markdown assembly.
Link suggestions via Vectorize
Published articles are indexed as embeddings. New articles query similar content to suggest internal and cross-site links for editorial review in D1.
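To make the idea concrete, the sketch below ranks previously indexed articles by cosine similarity to a new article's embedding and labels each suggestion as internal or cross-site. It runs in memory as a stand-in for the nearest-neighbor query that Vectorize performs in the real system; the `IndexedArticle` and `LinkSuggestion` shapes, the `site` field, and the default `topK` of 5 are assumptions.

```typescript
// Hypothetical record for an article that has already been embedded and indexed.
interface IndexedArticle {
  slug: string;
  site: string;
  embedding: number[];
}

interface LinkSuggestion {
  slug: string;
  site: string;
  score: number;
  kind: "internal" | "cross-site";
}

// Cosine similarity of two equal-length vectors; 0 if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Rank indexed articles by similarity to the new one and keep the top K as suggestions.
function suggestLinks(
  newArticle: IndexedArticle,
  index: IndexedArticle[],
  topK = 5,
): LinkSuggestion[] {
  return index
    // Never suggest linking an article to itself.
    .filter((a) => !(a.slug === newArticle.slug && a.site === newArticle.site))
    .map((a): LinkSuggestion => ({
      slug: a.slug,
      site: a.site,
      score: cosineSimilarity(newArticle.embedding, a.embedding),
      kind: a.site === newArticle.site ? "internal" : "cross-site",
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The returned list is only a set of suggestions; in the flow described above, an editor still reviews them before any link is added.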
Human-in-the-loop publishing
Low-confidence images are sent to Trello for manual approval with a 72-hour polling window. Final articles are committed to GitHub as markdown to trigger CI/CD.
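The 72-hour window can be expressed as a small decision function that a scheduler calls on each poll. This is a sketch under assumptions: the status values, the `decidePoll` name, and the choice to honor an approval or rejection even if it is first observed after the deadline are all hypothetical, and the source does not say what happens on timeout.

```typescript
// Hypothetical review states read back from the Trello card.
type ApprovalStatus = "pending" | "approved" | "rejected";
type PollDecision = "approved" | "rejected" | "keep-polling" | "timed-out";

// 72 hours, expressed in milliseconds.
const APPROVAL_WINDOW_MS = 72 * 60 * 60 * 1000;

// Decide what to do on one poll, given the card status and the elapsed time.
function decidePoll(
  status: ApprovalStatus,
  requestedAtMs: number,
  nowMs: number,
): PollDecision {
  // An explicit human decision always wins (assumed behavior).
  if (status === "approved" || status === "rejected") return status;
  // Still pending: stop once the window has elapsed, otherwise poll again later.
  return nowMs - requestedAtMs >= APPROVAL_WINDOW_MS ? "timed-out" : "keep-polling";
}
```

Keeping the decision pure, with the clock passed in, makes the deadline logic easy to test without waiting or mocking timers.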
Idempotency at every stage
Posts are checked for existence before generation. Each run is recorded in workflow_runs with per-phase timing, decision data, and AI outputs tied to a correlation ID.
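The check-then-generate pattern and the run log could be sketched as follows. An in-memory store stands in for D1 here, and the record shapes, phase names, and `runOnce` helper are hypothetical; only the ideas of an existence check, per-phase timing, and a correlation ID on each run come from the description above.

```typescript
// Hypothetical record shapes, loosely modeled on the workflow_runs description.
interface PhaseRecord {
  phase: string;
  durationMs: number;
}

interface WorkflowRun {
  correlationId: string;
  slug: string;
  outcome: "generated" | "skipped";
  phases: PhaseRecord[];
}

// In-memory stand-in for the database tables.
class InMemoryStore {
  readonly posts = new Map<string, string>();
  readonly workflowRuns: WorkflowRun[] = [];
}

// Generate a post at most once per slug, and record every run with its timings.
function runOnce(
  store: InMemoryStore,
  correlationId: string,
  slug: string,
  generate: () => string,
): WorkflowRun {
  const phases: PhaseRecord[] = [];

  // Run one phase and record how long it took, even if it throws.
  function timed<T>(phase: string, fn: () => T): T {
    const start = Date.now();
    try {
      return fn();
    } finally {
      phases.push({ phase, durationMs: Date.now() - start });
    }
  }

  const exists = timed("existence-check", () => store.posts.has(slug));
  if (!exists) {
    store.posts.set(slug, timed("generate", generate));
  }

  const run: WorkflowRun = {
    correlationId,
    slug,
    outcome: exists ? "skipped" : "generated",
    phases,
  };
  store.workflowRuns.push(run);
  return run;
}
```

Re-running the same slug is then safe: the second run skips generation but still leaves a record that can be traced by its correlation ID.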
Primitives Used
Cloudflare primitives in this project.
Why This Design
Why I built it this way.
"Orchestrator is useful because it combines AI generation with editorial controls and operational durability. Each stage is isolated, retriable, and observable, so failures are easier to debug and recover from."