A reference framework for how 6-, 15-, and 30-second video ads are structured — a shared language for talking about the elements, durations, and decisions that shape short-form work.
Whether it's six seconds or thirty, every short-form ad runs through the same five-step anatomy. The difference between durations isn't which jobs get done — it's how much room each job has to land.
Stops the scroll. Earns the rest of the seconds. A visual surprise, a question, motion that can't be ignored, a face you recognize, a stat that punches.
Tells the viewer who this is for and what it is. Brand cue, product reveal, category line. The "this is X, for Y" beat.
Stacks the reasons-to-care. One per beat. Not a list — a build. Punchy callouts, ingredient stats, social proof, quick-cut variety. Each beat earns the next.
Resolves what the spot has been building toward. Names the offer or the feeling. One bold line that lands clean.
Tells the viewer exactly what to do next. Static. Logo lock-up, CTA, market-specific info. Holds long enough to be screenshotted; readable without sound.
Six seconds is not enough room for an arc. Pick the single most important thing the viewer needs to feel — and lead with it. Everything else is supporting cast.
Pacing. 4–6 cuts maximum. Each cut advances the truth — no decorative footage. The opening hook is visual only: action carries it; text waits for the category line and truth. End-card holds at least one full second.
The default short-form unit. Long enough to build a small case, short enough that every beat must earn its place. Three middle beats is the sweet spot — fewer feels light, more feels rushed.
Pacing. A new visual idea every 1.5–2 seconds. The three middle beats should feel different from each other — different shot scale, different motion, different angle — so the eye stays alert.
Now there's room for a story shape: setup → escalation → climax → resolution. Five middle beats. The risk doubles — more time means more places to lose the viewer. Pacing has to build, not stay flat.
Pacing. Cut frequency accelerates from middle to outro — slower setup beats, faster climax beats, then a deliberate breath into the end-card. Energy curves up; final card holds 1.5–2 seconds so the brand mark is unmissable.
On a phone in a vertical feed, the viewer decides whether to keep watching in roughly the time it takes to read a single sentence. If the first 1.5–3 seconds don't create a reason to stay, the viewer scrolls — and the rest of the ad never plays. The hook isn't an opening; it's the entire ad's permission to exist.
Visual attention rebuilds itself with each new frame. Holding a single shot too long lets the viewer's gaze drift off-screen. A cut, a camera move, or a fresh element resets attention. Too much, and the viewer can't track meaning; too little, and they tune out. The 1.5–2s rhythm is the working compromise.
After the kinetic middle, the brain needs a beat of stillness to encode the brand and the action. A flashy or moving end-card reduces brand recall. The card should be readable without sound, hold long enough to be screenshotted, and answer one question: what next?
Beyond structure, four categories of choice shape how the ad feels. Each carries its own performance signal independent of the writing or the visuals themselves.
9:16 vertical — Reels, TikTok, Shorts, Stories. Mobile-native, full-screen. Where 80% of short-form lives in 2026.
1:1 square — feed, in-line. Lower commitment from the viewer; shorter messages work better here.
4:5 portrait — feed-optimized portrait. More vertical real estate than square; good for in-line stops.
16:9 horizontal — YouTube pre-roll, connected TV. Longer dwell, sound-on, often family co-viewing.
Fast cuts = energy, urgency, modernity. Use for hooks and climax beats.
Held shots = trust, calm, quality cues. Use for ingredient or product hero moments — the viewer needs to see.
Camera movement (push-in, pan, parallax) substitutes for cuts when continuity matters but the picture still needs life.
Pattern interrupts — a sudden style shift every 4–6 seconds keeps a 30 from feeling slow.
On phones, sound-off is the default — roughly 85% on Meta, 50% on TikTok. Text is the script for half the audience.
Six-word rule. Any line of text must be readable in 1.5 seconds or less. If it can't be read aloud in that window, cut it down.
Place text away from the bottom 15% of vertical frames — that's where platform UI sits (captions, profile pic, CTA button).
Sound design has to work without being heard. Match-cut on beat (snap, hit, drop) so even on mute, the rhythm reads visually.
Voiceover should never carry information that's not also on screen. Treat VO as a bonus layer, not a load-bearing one.
Music genre is a persona signal — energy/club for 18–24, indie/folk for 25–34 lifestyle, hip-hop instrumental for fashion-forward, lo-fi for wellness. Choose deliberately.
The same idea — "Real Energy. Real People. Feel-Good. Do Good." — sized across all three formulas. Notice how the hook, brand cue, and end-card stay constant across cuts; only the middle stretches.
| Time | Job | What's on screen |
|---|---|---|
| 0–1s | Hook | Skater explodes through frame, fruit splash freeze. |
| 1–3s | Context + Truth | Product hero shot. Caption: "Real rides." |
| 3–5s | Outro | Headline: "Real Energy. Real People." |
| 5–6s | End-Card | Logo + CTA: "Find a location nearest you." |
| Time | Job | What's on screen |
|---|---|---|
| 0–2s | Hook | Skater + fruit explosion. Caption: "Pick your ride." |
| 2–4s | Context | Brand cue, product reveal. |
| 4–7s | Beat 1 — Peach | Bright color flash + can. |
| 7–10s | Beat 2 — Strawberry Lemonade | Bubble cascade. |
| 10–13s | Beat 3 — Tropical Mango | Golden splash. |
| 13–14s | Outro | "Four Flavors. One Smooth Ride." |
| 14–15s | End-Card | Logo + CTA. |
| Time | Job | What's on screen |
|---|---|---|
| 0–3s | Hook | Skater roll-in, neon-rink reveal. Caption: "Refreshed. Turned up." |
| 3–6s | Context | Skaters laughing, brand cue, can entrance. |
| 6–10s | Beat 1 — Real Energy | Ingredient overlay: 80mg natural caffeine. |
| 10–14s | Beat 2 — Real People | Real-skater montage, no models. |
| 14–18s | Beat 3 — Variety | Pattern-break flavor reel: Peach / Lemonade / Mango / Punch. |
| 18–22s | Beat 4 — Feel-Good | Community moment: cheers, group skate. |
| 22–28s | Outro | Headline + sub: "Real Energy. Real People. / Feel-Good. Do Good." |
| 28–30s | End-Card | Logo + market-specific CTA. |
What stayed constant: the hook image, the brand cue, the headline, and the end-card. What stretched: how many proof beats fit. That consistency is what makes a campaign feel like a campaign.
The words the work uses — handy reference for craft conversations across writing, edit, and post.