Grizzle
Live Content Pipeline

Orchestrator

Content pipeline that turns YouTube videos into published articles with AI assistance and editorial controls.

Orchestrator powers publishing for the Channel Sites network. It handles discovery, script matching, image selection, article generation, link suggestions, and the publishing flow, with checkpoints and review steps along the way.

Architecture

How it's wired.

Trigger: Cron Trigger (hourly).
Discovery + matching: orchestrator-sync (discovery coordinator), orchestrator-youtube (video discovery via the YouTube API), orchestrator-trello (script card matching), backed by D1 + R2 (videos, scripts, metadata).
Workflow (durable orchestration): orchestrator-workflow (Workflows: durable, checkpointed execution) with image (discovery + score), seo (article generation), and linker (semantic links) steps, plus AI Gateway to Claude (article + scoring).
Output: GitHub PR (article → Channel Site), Trello (human image approval), Vectorize (article embeddings).

How It's Built

Implementation notes.

YouTube discovery + script matching

Pulls channel videos from the YouTube Data API and stores their metadata in D1. Each video is matched to its script by fuzzy Jaccard similarity (>= 0.9) against Trello cards. Scripts can be .docx, .pdf, or Google Docs.
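A minimal sketch of the matching step, assuming the comparison runs on word sets taken from the video title and the Trello card name. The tokenization rule, the ScriptCard shape, and the helper names are hypothetical; only the 0.9 threshold comes from the description above.

```typescript
// Threshold stated in the text; everything else here is illustrative.
const MATCH_THRESHOLD = 0.9;

// Hypothetical shape for a Trello card that carries a script.
interface ScriptCard {
  id: string;
  name: string;
}

// Assumed tokenization: lowercase, split on anything that is not a letter or digit.
function tokenize(title: string): Set<string> {
  const words = title
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
  return new Set(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((word) => {
    if (b.has(word)) shared++;
  });
  const union = a.size + b.size - shared;
  // Two empty sets have an empty union; treat that as "no match".
  return union === 0 ? 0 : shared / union;
}

// Returns the highest-scoring card at or above the threshold, or null.
function bestMatch(videoTitle: string, cards: ScriptCard[]): ScriptCard | null {
  const videoTokens = tokenize(videoTitle);
  let best: ScriptCard | null = null;
  let bestScore = 0;
  for (const card of cards) {
    const score = jaccard(videoTokens, tokenize(card.name));
    if (score >= MATCH_THRESHOLD && score > bestScore) {
      best = card;
      bestScore = score;
    }
  }
  return best;
}
```

With a threshold this high, only titles that differ in punctuation, casing, or roughly one word in ten will match, which favors missed matches over wrong ones.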

AI image selection with editorial allowlist

Claude extracts image search queries from scripts. Sources are allowlisted (Wikimedia, UN Photos, NATO, DVIDS, government Flickr). Candidates are scored for relevance, authenticity, size (>=1200px), and licensing.
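One way the allowlist and size floor could gate candidates before scoring is sketched below. The hostnames, the ImageCandidate fields, and the 0.6/0.4 weighting are assumptions made for illustration; the source states only the allowlisted organizations, the four scoring criteria, and the 1200px minimum.

```typescript
// Hypothetical hostnames standing in for the allowlisted sources.
const ALLOWED_HOSTS = ["commons.wikimedia.org", "dvidshub.net", "flickr.com"];
const MIN_WIDTH_PX = 1200; // size floor stated in the text

// Hypothetical candidate shape; relevance and authenticity are assumed to be 0..1.
interface ImageCandidate {
  url: string;
  widthPx: number;
  relevance: number;
  authenticity: number;
  licenseOk: boolean;
}

// Accepts an exact host match or a subdomain of an allowlisted host.
function isAllowlisted(url: string): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are rejected
  }
  return ALLOWED_HOSTS.some(
    (allowed) => host === allowed || host.endsWith("." + allowed),
  );
}

// Hard gates first (source, size, license), then an assumed weighted blend.
function scoreCandidate(c: ImageCandidate): number {
  if (!isAllowlisted(c.url) || c.widthPx < MIN_WIDTH_PX || !c.licenseOk) {
    return 0;
  }
  return 0.6 * c.relevance + 0.4 * c.authenticity;
}
```

Treating source, size, and license as pass/fail gates keeps a highly relevant image from a non-allowlisted host from outscoring a permitted one.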

SEO article generation pipeline

Multi-step AI flow: cleanup, source-grounded draft, key takeaways/FAQ enrichment, metadata optimization, then final markdown assembly.
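The stage order above can be sketched as a list of functions that each take and return the article state. The stage bodies below are placeholders rather than the actual prompts or model calls, and the ArticleState fields and stage names are hypothetical.

```typescript
// Hypothetical state threaded through the stages.
interface ArticleState {
  script: string;
  draft: string;
  takeaways: string[];
  title: string;
  markdown: string;
}

interface Stage {
  name: string;
  run: (s: ArticleState) => Promise<ArticleState>;
}

// Placeholder stage bodies; a real stage would call a model here.
const stages: Stage[] = [
  {
    name: "cleanup",
    run: async (s) => ({ ...s, script: s.script.replace(/\s+/g, " ").trim() }),
  },
  { name: "draft", run: async (s) => ({ ...s, draft: s.script }) },
  {
    name: "enrich",
    run: async (s) => ({ ...s, takeaways: [s.draft.split(". ")[0]] }),
  },
  { name: "metadata", run: async (s) => ({ ...s, title: s.draft.slice(0, 60) }) },
  {
    name: "assemble",
    run: async (s) => ({
      ...s,
      markdown: [
        `# ${s.title}`,
        s.draft,
        "## Key takeaways",
        s.takeaways.map((t) => `- ${t}`).join("\n"),
      ].join("\n\n"),
    }),
  },
];

// Runs the stages in order and records which ones completed.
async function runPipeline(
  script: string,
): Promise<{ state: ArticleState; order: string[] }> {
  let state: ArticleState = {
    script,
    draft: "",
    takeaways: [],
    title: "",
    markdown: "",
  };
  const order: string[] = [];
  for (const stage of stages) {
    state = await stage.run(state);
    order.push(stage.name);
  }
  return { state, order };
}
```

Because each stage is a named unit with its own input and output, it can be retried or inspected on its own.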

Link suggestions via Vectorize

Published articles are indexed as embeddings. Each new article queries the index for similar content, and the resulting internal and cross-site link suggestions are stored in D1 for editorial review.
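As an illustration of the similarity query, here is an in-memory stand-in that ranks indexed articles by cosine similarity. It does not use the Vectorize API; the IndexedArticle shape, the topK and minScore parameters, and the choice of cosine similarity are assumptions.

```typescript
// Hypothetical record for an already-published article.
interface IndexedArticle {
  slug: string;
  site: string;
  embedding: number[];
}

interface LinkSuggestion {
  slug: string;
  site: string;
  score: number;
  crossSite: boolean; // true when the target lives on a different site
}

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns up to topK suggestions at or above minScore, best first.
function suggestLinks(
  query: number[],
  querySite: string,
  index: IndexedArticle[],
  topK: number,
  minScore: number,
): LinkSuggestion[] {
  return index
    .map((a) => ({
      slug: a.slug,
      site: a.site,
      score: cosine(query, a.embedding),
      crossSite: a.site !== querySite,
    }))
    .filter((s) => s.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

The crossSite flag lets a reviewer tell internal links from cross-site ones.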

Human-in-the-loop publishing

Low-confidence images are sent to Trello for manual approval with a 72-hour polling window. Final articles are committed to GitHub as markdown to trigger CI/CD.
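The routing and the 72-hour window could be expressed as two small pure functions, sketched below. The 0.7 review threshold and the state names are hypothetical. The source does not say what happens when the window lapses, so the sketch only reports a timed_out state and leaves the follow-up to the caller.

```typescript
const APPROVAL_WINDOW_MS = 72 * 60 * 60 * 1000; // 72 hours, as stated in the text
const REVIEW_THRESHOLD = 0.7; // hypothetical cutoff for "low confidence"

// null means no decision has been recorded on the card yet.
type CardDecision = "approved" | "rejected" | null;
type ApprovalState = "approved" | "rejected" | "pending" | "timed_out";

// Images scoring below the threshold are routed to a human reviewer.
function needsHumanReview(score: number): boolean {
  return score < REVIEW_THRESHOLD;
}

// Evaluated on each poll. Assumption: a recorded decision always wins,
// even if it is first observed after the window has elapsed.
function approvalState(
  sentAtMs: number,
  nowMs: number,
  decision: CardDecision,
): ApprovalState {
  if (decision !== null) return decision;
  return nowMs - sentAtMs >= APPROVAL_WINDOW_MS ? "timed_out" : "pending";
}
```

Passing the current time in as a parameter keeps the check deterministic.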

Idempotency at every stage

Before generating, the pipeline checks whether the post already exists, so a re-run does not produce a duplicate. Each run is recorded in workflow_runs with per-phase timing, decision data, and AI outputs, all tied to a correlation ID.
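A minimal sketch of the existence check and run logging, using an in-memory store in place of D1. The InMemoryStore class, the WorkflowRun fields, and the generateOnce helper are hypothetical; only the workflow_runs name and the correlation ID come from the description above.

```typescript
// Hypothetical row shape, loosely modeled on the workflow_runs description.
interface WorkflowRun {
  correlationId: string;
  videoId: string;
  phase: string;
  durationMs: number;
  skipped: boolean;
}

// In-memory stand-in for the posts and workflow_runs tables.
class InMemoryStore {
  posts = new Map<string, string>(); // videoId -> markdown
  workflowRuns: WorkflowRun[] = [];
}

// Generates a post at most once per video; every call is logged either way.
async function generateOnce(
  store: InMemoryStore,
  videoId: string,
  correlationId: string,
  generate: () => Promise<string>,
): Promise<string> {
  const started = Date.now();
  const existing = store.posts.get(videoId);
  if (existing !== undefined) {
    // Post already exists: record the skip and return the stored result.
    store.workflowRuns.push({
      correlationId,
      videoId,
      phase: "generate",
      durationMs: Date.now() - started,
      skipped: true,
    });
    return existing;
  }
  const markdown = await generate();
  store.posts.set(videoId, markdown);
  store.workflowRuns.push({
    correlationId,
    videoId,
    phase: "generate",
    durationMs: Date.now() - started,
    skipped: false,
  });
  return markdown;
}
```

Logging the skipped run as well as the real one keeps a retry visible in the audit trail.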

Primitives Used

Cloudflare primitives in this project.

Workers: 10 microservice workers with focused responsibilities
Workflows: Durable multi-step orchestration with retry and state persistence
D1: Videos, posts, Trello cards, workflow runs, SEO pages, link placements, channel config
R2: Normalized scripts, AI output artifacts, image metadata, audit traces
AI Gateway: Rate-limited, authenticated access to Claude for content and image scoring
Workers AI: Text embeddings for Vectorize and alternative LLM inference
Vectorize: Article embeddings for semantic link suggestion matching
Service Bindings: Zero-latency inter-worker communication across all 10 services
Cron Triggers: Hourly sync runner

Why This Design

Why I built it this way.

"Orchestrator is useful because it combines AI generation with editorial controls and operational durability. Each stage is isolated, retriable, and observable, so failures are easier to debug and recover from."