
Produce short-form video content without a camera or crew

Video content without booking a shoot

Script to short-form video using only AI. Not perfect. But fast, cheap, and actually yours.


Video is still the highest-trust medium. A person explaining something on camera builds more rapport with an audience than the same words in a blog post. The problem is production friction: camera setup, lighting, reshoots, editing, colour grading. Most people with something worth saying never say it on video because the cost-to-output ratio is brutal.

AI video tools change that ratio. Not to zero, because the human writing and directing are still the work, but to something that makes a consistent video practice possible without a production budget.


Step 1 — Script with Claude

A video script is not a blog post with line breaks. It's spoken prose: shorter sentences, one idea per beat, deliberate pauses built in. Ask Claude to convert your outline into a script written for spoken delivery. Give it the duration (60 seconds, 90 seconds) and the register (direct explanation, narrative, argument).

Read it aloud before you go further. If you stumble, revise. The script is the most important part of this workflow — everything downstream is only as good as what it's given.
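The request to Claude can be sketched as a small prompt builder. This is a minimal sketch, not a prescribed prompt: the 150 words-per-minute speaking rate is an assumption (adjust it to your own pace), and the names `word_budget` and `script_prompt` are hypothetical helpers invented for illustration.

```python
# Assumed speaking rate for clear, unhurried delivery; adjust to your own pace.
WORDS_PER_MINUTE = 150


def word_budget(seconds: int, wpm: int = WORDS_PER_MINUTE) -> int:
    """Approximate number of words that fit in `seconds` of speech."""
    return round(seconds * wpm / 60)


def script_prompt(outline: str, seconds: int, register: str) -> str:
    """Build a prompt asking for a spoken-delivery script of a given length."""
    budget = word_budget(seconds)
    return (
        "Convert the outline below into a video script written for spoken delivery.\n"
        f"Target duration: {seconds} seconds (about {budget} words).\n"
        f"Register: {register}.\n"
        "Use short sentences, one idea per beat, and mark deliberate pauses "
        "with a blank line.\n\n"
        f"Outline:\n{outline.strip()}\n"
    )


if __name__ == "__main__":
    outline = "1. Why video builds trust\n2. Why most people never record one"
    print(script_prompt(outline, 60, "direct explanation"))
```

Stating the word budget alongside the duration gives the model a concrete target, and gives you a quick check on the draft before you read it aloud.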

Claude · AI CHAT

The AI that actually thinks before it speaks — brilliant for long-form reasoning, writing, and analysis that holds up under scrutiny.

Step 2 — Voiceover with ElevenLabs

Paste the final script. Choose a voice that matches the tone — not a commercial announcer, not an audiobook narrator. Something that sounds like a person who knows what they're talking about, talking to someone they respect.

Listen to the full output at normal playback speed. The AI reads exactly what you wrote, which means any awkward phrasing you missed on the page becomes obvious in audio. Fix the script if you need to, then regenerate. The second pass is usually right.
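If you would rather script this step than paste into the web app, the request can be sketched with the standard library alone. Treat every specific here as an assumption to check against ElevenLabs' current API documentation: the base URL, the `text-to-speech/{voice_id}` path, the `xi-api-key` header, and the `eleven_multilingual_v2` model id reflect my understanding of the public REST API, not anything stated in this workflow. `build_tts_request` and `save_voiceover` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed base URL; confirm against the current ElevenLabs API reference.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(
    script: str,
    voice_id: str,
    api_key: str,
    model_id: str = "eleven_multilingual_v2",  # assumed model id
) -> urllib.request.Request:
    """Prepare (but do not send) a text-to-speech request for the script."""
    payload = json.dumps({"text": script, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/text-to-speech/{voice_id}",
        data=payload,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def save_voiceover(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Keeping the request builder separate from the network call makes it cheap to regenerate after each script fix: change the text, rebuild, and send again.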

ElevenLabs · AUDIO

The moment you hear it speak, you stop thinking about AI — it just sounds like a person. Best-in-class voice generation for anyone who wants their words to actually land.

Step 3 — Visual generation with Runway or Kling

This is the honest part of the workflow: AI-generated video still has a recognisable look that camera-shot footage doesn't. It works best for abstract concepts, atmospheric b-roll, and scenes that are hard to film, not for anything that needs to look photorealistic or documentary.

Runway is stronger on cinematic motion and colour. Kling handles longer durations and has stronger physical coherence. Use them for the sections of your video where the voiceover is carrying the argument and the visual just needs to not be distracting.
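Before generating anything, it helps to know how many clips each part of the voiceover needs. The sketch below is one way to plan that, under stated assumptions: beats are separated by blank lines in the script, speech runs at roughly 150 words per minute, and `MAX_CLIP_SECONDS` is a hypothetical per-clip limit, not a figure from Runway or Kling. Check your tool's actual cap, and measure the real audio once the voiceover exists.

```python
import math
import re

WORDS_PER_MINUTE = 150    # assumed speaking rate
MAX_CLIP_SECONDS = 10.0   # hypothetical per-clip limit; check your tool's real cap


def beat_seconds(beat: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate how long one beat of the script takes to speak."""
    return len(beat.split()) * 60 / wpm


def shot_plan(script: str, max_clip: float = MAX_CLIP_SECONDS) -> list[dict]:
    """List each beat with its estimated duration and the clips needed to cover it."""
    # A beat is any run of text separated from the next by a blank line.
    beats = [b.strip() for b in re.split(r"\n\s*\n", script) if b.strip()]
    plan = []
    for number, beat in enumerate(beats, start=1):
        seconds = beat_seconds(beat)
        plan.append({
            "beat": number,
            "seconds": round(seconds, 1),
            "clips": max(1, math.ceil(seconds / max_clip)),
        })
    return plan


if __name__ == "__main__":
    demo = "Video builds trust.\n\nMost people never record one because the setup is too much work."
    for row in shot_plan(demo):
        print(row)
```

The estimate is only for planning. Once you have the final audio, cut the visuals to its actual timing.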

Runway · VIDEO GEN

Where serious video work meets generative AI. The tool that made text-to-video feel cinematic instead of weird — still the benchmark everything else gets measured against.

Kling 3.0 · VIDEO GEN

It’s basically the cinematic king right now; everything I make looks like it was shot on an IMAX by a director with a massive budget.


What this workflow is actually for

This isn't a replacement for a produced video series with real footage. It's a way to publish a consistent stream of short-form explanations, breakdowns, and arguments — the kind of content that builds an audience over time — without the logistics of a shoot making every single video a project.

The ceiling on quality is real. But the floor — which used to be "own a camera, have good lighting, know how to edit" — has dropped to "have something to say."
