AI VIDEO PRODUCTION STUDIO · TRACK 1 · BUILD NEW AGENTS

Twenty-six agents.
Seven guilds.
One slate.

SLATE turns a single brief — a JSON manifest or a free-text prompt — into a finished, published video. One Showrunner (Gemini 2.5 Pro) reasons about the brief and dynamically dispatches 26+ specialized ADK agents across 7 guilds — research, script, voice, music, visuals, color and publish — unattended.

01 · WHY MIGRATE

V2 follows a script. It can't reason.

StudioF V2 was a procedural pipeline — a fixed sequence of function calls with hardcoded providers and no reasoning. Every new content type, provider, or edge case meant editing the orchestration code itself. The cost of iteration was the cost of the whole production.

12+ min
render time
for 90s of video
0
parallelism
sequential pipeline
·
retries
restart from scene 1
17K
lines of code
orchestrator.py · single file
02 · THE SYSTEM

Twenty-six agents, seven guilds, one Showrunner.

A single Showrunner — gemini-2.5-pro — reasons over the brief and dynamically dispatches 26+ specialized ADK agents across seven guilds via the AgentTool pattern, reading and writing a shared session-state contract. No fixed script: the Showrunner chooses research depth, visual sources, voices, and providers per production.

EDITORIAL · 5 ✓

Editorial

  • showrunner
  • deep_researcher
  • visual_scout
  • script_writer
  • dataviz_agent
AUDIO · 3 ✓

Audio

  • tts_agent
  • elevenlabs_agent
  • music_agent
IMAGE · 6 ✓

Image

  • imagen_agent
  • prompt_engineer_agent
  • diagram_agent
  • stock_agent
  • web_gallery_agent
  • kinetic_type_agent
VIDEO · 3 ✓

Video

  • veo_agent
  • video_composer_agent
  • youtube_clip_agent
3D · 1 ✓

3D

  • infographic_agent
  • blender_scene_agent
  • geo_agent
POST-PROD · 2 ✓

Post-Prod

  • assembly_agent
  • colorist_agent
  • captions_agent
PUBLISH · 6 ✓

Publish

  • youtube_agent
  • instagram_agent
  • tiktok_agent
  • twitter_agent
  • linkedin_agent
  • facebook_agent
"A slate is a contract: who, what, when, take, scene.
Our agents are slates."
03 · ARCHITECTURE

TTS first. Then parallel.

Audio drives durations, not the other way around. Narration is synthesized first — every visual agent then targets the exact scene duration, instead of the brittle "generate video, then trim audio" approach. Ten parallel visual-generation branches fan out once durations lock, reconciled by a quality gate before assembly.

PORTE 1 · MANIFEST
PORTE 2 · FREE-TEXT BRIEF
showrunner
deep_researcher
visual_scout
script_writer
dataviz_agent
tts_agent
🔒 EXACT DURATIONS LOCKED
∥ ∥ ∥
AUDIO POST
elevenlabs_agent · music_agent — sfx + score, parallel
VISUAL FAN-OUT
10 branches — Veo, Imagen, diagrams, dataviz, kinetic type, 3D, geo, stock, galleries, YouTube b-roll
STYLE
colorist_agent (LUT) + quality_gate
assembly_agent · FFmpeg (Cloud Run) → final.mp4 → GCS
04 · FINDINGS & LEARNINGS

From one script to 26 reasoning agents.

Five hard-won lessons from rebuilding StudioF V2 — a 17,000-line procedural pipeline — as a 26-agent reasoning system on Vertex AI Agent Engine.

26+
ADK agents
across 7 guilds
10
parallel branches
visual generation, fan-out + back
4
Vertex AI regions
quota distribution + retry/backoff
0
custom cost code
OTel → Cloud Trace → Billing
01

AgentTool scales further than expected.

A single gemini-2.5-pro Showrunner reliably dispatches 26+ heterogeneous agents — different models, tool types, MCP servers, even long-running async Veo jobs — as long as the shared session-state contract between them is explicit and documented.

02

TTS-first sync model.

Generating narration audio before the visual fan-out lets every visual agent target the exact scene duration — instead of the brittle "generate video, then trim audio" approach most pipelines use.

03

Session state is per-tool-call, not incremental.

VertexAiSessionService.append_event() only persists tool_context.state deltas when a tool function returns. A 10-15 minute, 8-phase orchestration tool that's interrupted mid-flight loses state from already-completed phases — even though our Firestore event log shows them done. A checkpoint/resume mechanism is on the roadmap.

04

Parallelism needs quota planning.

Running 10 parallel visual-generation branches against Vertex AI required distributing calls across 4 regions with consistent retry/backoff — without it, 429 ResourceExhausted errors dominated.

05 · HOW IT WORKS tap any step to expand

From a single prompt to a finished cut.

01 Submit a manifest, or a partial request.

Two doors. Porte 1 takes a complete ManifestConfig — topic, language, ratio, platform, duration, tone, research mode. Porte 2 takes an input video and a list of tasks: subtitles, color grade, translate, geo overlay.

{
  "topic": "La chute de Boeing",
  "language": "fr-FR",
  "aspect_ratio": "16:9",
  "platform": "YouTube",
  "target_duration_sec": 90,
  "tone": "Journalistique"
}
02 The Showrunner reasons, then dispatches across 7 guilds.

A single gemini-2.5-pro Showrunner reasons over the brief — research depth, visual sources, voice, providers — then dispatches AgentTool calls to 26+ specialized agents across 7 guilds, each reading its slice of one shared session-state contract.

showrunner →
├─ Editorial → research + script (5 agents)
├─ Audio → TTS + SFX + music (3 agents)
├─ Image/Video → 9 visual agents
├─ 3D → infographics
├─ Post-Prod → color + assembly (2 agents)
└─ Publish → 6 social platforms

03 TTS first. Exact scene durations lock before visuals.

The narration is synthesized first, before any visual is generated. Per-scene durations are extracted from the audio itself. Now every downstream agent knows the exact duration of its plate. No drift.

This is the sync lock that unblocks parallelism. Without it, ten branches couldn't run in parallel without timing collisions.

04 Ten branches fan out. A quality gate reconciles them.

Once durations lock, ten visual-generation branches run in parallel — Veo, Imagen, diagrams, dataviz, kinetic type, 3D infographics, geo-maps, stock, web galleries, YouTube b-roll — each scene routed to the provider that fits it best.

A quality gate checks asset coverage before assembly_agent runs a multi-threaded FFmpeg render on Cloud Run — color graded, assembled, and uploaded to Cloud Storage as a single MP4.

06 · ACCOMPLISHMENTS click a card to expand

From zero to a live agent platform.

01 · AGENTS & GUILDS
Zero to twenty-six, in weeks.
26+
Twenty-six production ADK agents across seven guilds — including MCP-based ElevenLabs and Splice integrations for audio — fully orchestrated by one Showrunner and deployed on Vertex AI Agent Engine.
guilds
7
agents
26+
mcp integrations
2
build time
~3 weeks
02 · A2A ENDPOINT
A live agent, not a demo.
LIVE
A public Cloud Run endpoint exposes Slate as a JSON-RPC 2.0 agent with a published agent card — reachable and composable by other agentic systems via tasks/send and tasks/sendSubscribe.
protocol
JSON-RPC 2.0
methods
2
agent card
host
Cloud Run
03 · VISUAL FAN-OUT
Ten branches out, one cut back.
10×
Veo 3, Imagen 3, hand-drawn diagrams, generative dataviz, kinetic type, 3D infographics, geo-maps, stock footage, web galleries and YouTube b-roll all run in parallel — reconciled by a quality gate before a single color-graded MP4 is assembled.
branches
10
models
Veo3 / Imagen3 / Lyria
quality gate
output
1 MP4
04 · OBSERVABILITY
Full tracing, zero extra code.
0 LOC
Every agent call, token count and pipeline phase emits OTel GenAI semantic-convention spans into Cloud Trace, plus a real-time Firestore event feed for live progress — with zero custom cost-tracking code.
spans
OTel GenAI
trace
Cloud Trace
events
Firestore
cost code
0
07 · WHERE WE STAND switch tab to compare

Picked a side? Pick all of them.

CapabilityRunwaySLATE
End-to-end journalistic production✗ frame-by-frame tool✓ full pipeline
Grounded research + fact check✓ deep_researcher (Google Search grounding)
Multi-agent orchestration✓ 26+ agents · 7 guilds
Self-hosted on your GCPSaaS only✓ Agent Engine deploy
Per-token cost visibilityopaque billing✓ OTel + Cloud Billing
CapabilityCaptionsSLATE
Vertical reels from text✓ great✓ Porte 1 · vertical preset
Long-form 90s+ journalistic✗ short-form only✓ up to 180s+
Sourced b-roll, archive footagestock templates✓ NewsClipper + Archive
Strangler-fig migration pathSaaS lock-in✓ wraps V2 monolith
Open architecture · MCP✓ MCP (ElevenLabs, Splice + 4 planned)
CapabilityDescriptSLATE
Editor-first workflow✓ excellentproducer-first
Autonomous production from prompt✓ Porte 1
Parallel agent executionlinear timeline✓ 3 pools · 15w
Custom voices + Lyria scorelimited library✓ Chirp 3 + Lyria
Pluggable models per agentopaque✓ manifest.models
CapabilityIn-house monolithSLATE
Time to ship a new featureweeks · cross-cutting✓ days · new agent
Parallelism0✓ 10 parallel branches
Retriesrestart from scene 1✓ per-agent retry
Observabilityartisanal cost tracker✓ Cloud Trace OTel
Testabilityend-to-end only✓ adk eval per agent
08 · FAQ click to expand

Questions, before access?

How is this different from Runway or Veo standalone?
Runway and Veo are generation models. SLATE is an end-to-end production pipeline — research, fact-check, script, narrate, source b-roll, generate, grade, render, package. We use Veo (and Imagen, Lyria, Chirp) inside, but the orchestration, the typed contracts, and the parallelism are what take it from a single brief to a published video, unattended.
Can I self-host SLATE on my own GCP project?
Yes. SLATE deploys to your Vertex AI Agent Engine. You bring the GCP project, we bring the agents. Multi-region quota rotation is baked in (four regions by default). Two MCP integrations ship today (ElevenLabs, Splice); four more — FFmpeg render, Remotion overlay, geo-render, web capture — are on the roadmap to run on your Cloud Run.
What is the "Strangler Fig" migration about?
We're replacing a 17,000-line procedural pipeline (StudioF V2) — fixed function calls, hardcoded providers, no reasoning — with 26 agents that reason about the brief, without touching the original code. V2 keeps running in production. A shadow-mode comparator that runs both systems on the same reference manifests is on the roadmap before V2 is retired.
Is SLATE open source?
The contracts and the graph definitions are open (Apache 2.0). The Showrunner and the visual scout heuristics are source-available for early-access partners under a commercial license. We're building the project in public on github.com/uraaura0o/slate.
What languages do you support today?
fr-FR and en-US are production-grade. es-ES, ar-MA, de-DE, pt-BR are in beta — voice quality matches, but our editorial style guide for those locales is still being written. Submit a request and we'll prioritise your locale.
How are costs tracked?
Every LLM call emits OTel spans with gen_ai.usage.input_tokens and gen_ai.usage.output_tokens attributes. They roll up into Cloud Trace and Cloud Billing natively. No custom cost tracker — you get the same dashboards Google ships.
What's next for Slate?
Four directions: Porte 2 à la carte — narrow requests ("redo the music for scene 4", "give me 3 alternate thumbnails") routed to only the relevant agents, no full production required. A Trajectory Optimizer agent that solves provider/agent assignment as a constrained optimization problem instead of per-scene LLM heuristics. Promoting our FFmpeg, Remotion, geo-render and web-capture services to first-class internal MCP servers. And filling the remaining guild agents (voice cloning, dubbing, captions, upscaling) plus a checkpoint/resume mechanism for long-running productions.
09 · PRICING private beta · early access

Pay for what rolls.

Three tiers. All include the full agent stack, multi-region quota rotation, Cloud Trace, and Model Armor. Differences are seat count, support, and per-production allowance.

Starter
$0 · hackathon
For solo creators and contributors. Hosted SLATE on shared infra, fair-use quota.
  • 5 productions / month
  • Porte 1 + Porte 2 access
  • Community Discord
  • Public manifest library
Enterprise
Custom
Broadcasters, agencies, regulated industries. Air-gapped deploy, on-call, custom guilds.
  • Unlimited productions
  • Air-gapped + SSO · SAML
  • Dedicated solutions architect
  • SOC2 + GDPR · DPA on file
  • Custom agent guilds
  • SLA · 99.95% uptime
05 · BUILT ON

Standing on good shoulders.

SLATE is opinionated about almost nothing infrastructure. We use Google's stack end-to-end and Anthropic's MCP as the tool protocol.

Google ADK
GEAP 2026
Vertex AI
Gemini 2.5
Imagen 3.1
Veo 3
Chirp 3 HD
Lyria
MCP
Cloud Run
Cloud Trace · OTel
EARLY ACCESS · APPLY

One slate.

We're inviting a small group of newsrooms + production studios to the private beta. If you make video at scale and want to migrate off a monolith, talk to us.