Concrete, copy-paste checks for every layer · 2026 · ITDT LLC
The story explains why the studio is split across two machines; the API describes the contract between them. This page is the third thing you need to actually build one: the verification steps — the small, concrete commands that prove each layer works, run bottom-up so a failure is easy to locate. They double as the best worked examples of how to use the system, and as a recipe your own Claude Code can follow to stand up and check an equivalent pipeline on your hardware.
Everything machine-specific here is a placeholder:
<pc-host> is the render machine's address on your network,
<project-root> is wherever you keep the project, and the shared folder mounts at
<relay-mount>. Network credentials live in the operating system's keychain — there
are no usernames, passwords, or tokens to paste anywhere. Substitute your own values as you go.
Verify from the bottom up. Each layer assumes the one below it passed, so the first failure points straight at the culprit. Most layers need no GPU at all — only the cross-machine render does.
Confirm the render machine is up and the shared folder ("the relay") will mount. The mount uses credentials saved in the OS keychain, so it is unattended — no interactive login.
cd <project-root>
# render machine answers, and exposes the render script:
curl -s -m 5 http://<pc-host>:8765/healthz; echo
curl -s -m 5 http://<pc-host>:8765/scripts \
| python3 -c "import sys,json; print('render_base_clip present:', 'render_base_clip' in json.load(sys.stdin).get('scripts',{}))"
# mount the shared folder (unattended, keychain credentials):
python3 Tools/scripts/mount_relay.py
Pass: healthz responds, render_base_clip
present: True, and the mount prints {"status":"ok","mounted":true,…}. If the
render machine is off or unreachable, start it (and its job server) before Layer 1.
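The same Layer 0 pass conditions can be checked from Python instead of by eye. This is a minimal sketch, not the project's own code: the helper names `fetch_json`, `has_script`, and `mount_ok` are hypothetical, and the only response shapes assumed are the ones shown above (a `scripts` key in the `/scripts` reply, and `status` / `mounted` in the mount tool's output).

```python
import json
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET a URL and decode its JSON body (the equivalent of `curl -s -m 5`)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def has_script(scripts_payload, name="render_base_clip"):
    """True if the /scripts reply lists `name` under its "scripts" key."""
    return name in scripts_payload.get("scripts", {})


def mount_ok(mount_line):
    """True if the mount tool's one-line JSON says status ok and mounted true."""
    result = json.loads(mount_line)
    return result.get("status") == "ok" and result.get("mounted") is True
```

In use, `has_script(fetch_json("http://<pc-host>:8765/scripts"))` should return `True`, and `mount_ok(...)` should return `True` for the line printed by `mount_relay.py`.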
The render machine fronts its GPU with a tiny asynchronous job server (full reference on the API page). Verify it the way the rest of the studio uses it: submit one render, get an id back immediately, poll until it completes. Because the call never blocks, the queue is also how you confirm a single GPU is never double-booked.
# submit one short render (the result is staged back to the shared folder for review):
curl -s -X POST http://<pc-host>:8765/run_script -H 'content-type: application/json' -d '{
"script": "render_base_clip",
"args": { "start_frame": "host_start.png", "lora": "<character-model>",
"prompt": "<setting>", "width": 832, "height": 480,
"framing": "waist-up medium shot", "seconds": 3, "output_name": "verify01" },
"run_id": "verify_run", "stage_review": true, "run_total": 1 }'
# -> { "job_id": "a1b2c3d4e5f6", "queue_position": 0 }
# poll it (repeat until state == complete):
curl -s http://<pc-host>:8765/job/a1b2c3d4e5f6 | python3 -m json.tool
# see what is running / queued / recent:
curl -s http://<pc-host>:8765/queue | python3 -m json.tool | head -20
# free the GPU if you want to stop early:
curl -s -X POST http://<pc-host>:8765/cancel_job/a1b2c3d4e5f6
Pass: the submit returns a job_id at once; polling moves
queued → running → complete and the finished clip appears in the shared folder under the
run_id. (You judge the clip itself by eye — that is the one thing no automated check
replaces.)
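The submit-then-poll pattern above can be wrapped in a small loop. The sketch below is an illustration under stated assumptions: `wait_for_job` and its `fetch_status` callback are hypothetical names, and only the `state` field and the `queued`, `running`, and `complete` values come from this page. The `failed` and `cancelled` terminal states are guesses added so the loop cannot spin forever.

```python
import time


def wait_for_job(fetch_status, job_id, interval=5.0, timeout=1800.0,
                 sleep=time.sleep):
    """Poll a job until it reaches a terminal state.

    fetch_status(job_id) must return the decoded /job/<id> reply as a dict.
    Returns the distinct states observed, in order.
    """
    # Only "complete" is documented; the other two are assumed.
    terminal = {"complete", "failed", "cancelled"}
    seen = []
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_status(job_id)["state"]
        if not seen or seen[-1] != state:
            seen.append(state)  # record each transition once
        if state in terminal:
            return seen
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {state!r} after {timeout}s")
        sleep(interval)
```

Passing the fetcher in as a callback keeps the loop testable without a render machine; in real use it would wrap a GET on `http://<pc-host>:8765/job/<job_id>`.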
Every Mac capability is a small command-line tool with one contract: arguments in, a single line of
JSON out — {"status":"ok",…} or {"status":"error",…}. That uniformity is
what makes them scriptable and callable by the local model. Most need no GPU, so they verify
in seconds. Two representative checks:
cd <project-root>
# (a) validate an episode script: structure + a content-rules screen, no render:
python3 Tools/scripts/validate_episode.py --episode episodes/<episode>.json | python3 -m json.tool | head -20
# (b) synthesize a line of narration in the host's locked voice (no GPU):
python3 Tools/scripts/generate_voiceover.py --text "This is a verification line." \
  --voice <voice-id> --output /tmp/vo.wav
Pass: (a) prints "valid": true with the shot count and
any not-yet-ready assets listed as warnings (a missing capture is a warning, not an error, so
you can author and dry-run before recording); off-rules narration trips the content screen. (b) writes
a .wav and reports its duration.
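Because every tool follows the same contract, one wrapper can check any of them. The following is a rough sketch, assuming only what is stated above: arguments in, exactly one line of JSON out, with a `status` of `ok` or `error`. The names `parse_tool_output` and `run_tool` are hypothetical and are not part of the project.

```python
import json
import subprocess


def parse_tool_output(stdout):
    """Parse the single JSON line a tool prints; raise if the contract is broken."""
    lines = [ln for ln in stdout.splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly one line of JSON, got {len(lines)}")
    result = json.loads(lines[0])
    if not isinstance(result, dict) or result.get("status") not in ("ok", "error"):
        raise ValueError("missing or unexpected 'status' field")
    return result


def run_tool(argv):
    """Run a command-line tool and return its parsed one-line JSON result."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return parse_tool_output(proc.stdout)
```

For example, `run_tool(["python3", "Tools/scripts/validate_episode.py", "--episode", "episodes/<episode>.json"])` would return the validator's result as a dict, so a script can branch on `result["status"]` instead of reading terminal output.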
The orchestrators tie the layers into one run. Verify them without spending GPU first: the dry-run is a cost gate, and the pause/resume controls are testable on their own.
cd <project-root>
python3 Tools/scripts/render_episode.py --episode episodes/<episode>.json --dry-run | python3 -m json.tool
Pass: a plan with the number of host shots and an honest estimate — how many GPU segments, and roughly how many minutes on each machine — and nothing touches the GPU. This is the "approve the cost before committing" step.
A long host shot is built by chaining short GPU segments, so the run has a natural pause point at every segment boundary. Pausing lets the in-flight segment finish, then holds before the next one — so you can review progress (for example, the seam where two segments join) without throwing away GPU work. Resuming continues from the next segment; finished segments are reused, never re-rendered. This layer verifies with no GPU, because the control is independent of whether a render is live:
cd <project-root>
python3 Tools/scripts/render_control.py status   # paused? is an orchestrator running?
python3 Tools/scripts/render_control.py pause    # hold before the next segment
python3 Tools/scripts/render_control.py resume   # continue from the next segment
Pass: status flips from not-paused to paused and back;
while paused, a running chain finishes its current segment then waits, and resume lets it
continue. Because completed segments are cached on disk, even a render that was stopped outright
resumes without redoing finished work. (Pausing is distinct from cancelling the whole run.)
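One way an orchestrator could get this behavior is a pause flag checked only at segment boundaries, plus an on-disk cache of finished segments. This page does not describe how `render_control.py` actually signals a pause, so everything in the sketch below is an assumption made for illustration: the flag file, the `.mp4` cache naming, and the `render_chain` function itself.

```python
import time
from pathlib import Path


def render_chain(segments, cache_dir, pause_flag, render_segment,
                 poll=1.0, sleep=time.sleep):
    """Render segments in order; return the names rendered during this call.

    render_segment(name, out_path) does the GPU work and writes out_path.
    pause_flag is a hypothetical marker file: present means "paused".
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    pause_flag = Path(pause_flag)
    rendered = []
    for name in segments:
        out = cache_dir / f"{name}.mp4"
        if out.exists():
            continue  # finished segment: reused, never re-rendered
        while pause_flag.exists():
            sleep(poll)  # hold at the boundary until the flag is removed
        render_segment(name, out)  # an in-flight segment always runs to completion
        rendered.append(name)
    return rendered
```

Because the flag is only consulted between segments, a pause never discards GPU work in progress, and because the cache is consulted first, restarting after an outright stop skips straight to the first unfinished segment.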
The top layer is the always-on local model: every tool above is registered with it, so the whole studio can be driven in plain language. Verify it by saying things and checking it reaches for the right tool — and stays honest about long jobs. You talk to the model in its own chat, not the terminal:
| Say to the model… | Expect… |
|---|---|
| "Validate the episode at episodes/<episode>.json — is it ready to render?" | It runs the validator and reports valid / shot count / warnings; it renders nothing. |
| "Do a dry run of rendering that episode — I want the plan and time first." | It runs the dry-run and reports the plan and cost, then offers to start only on your say-so. |
| "Render that episode for real." | It starts the job in the background and hands you a job id at once; it does not claim the video is done until the job reports complete. |
| "Pause the render." / "Resume it." / "Is the render paused?" | It uses the pause / resume / status controls, explains the in-flight segment finishes first, and treats pause as different from stopping the whole run. |
Pass: each request maps to the right tool, long jobs run in the background with a job id, and the model never reports a render finished before it is. (If it reaches for the wrong tool, its registration/prompt wasn't rebuilt after the tool was added.)
These checks are written to be acted on, not just read. Point your own Claude Code at the three pages of this project — the story, the API, and this verification guide — tell it what hardware you have, and ask it to build and verify an equivalent pipeline for your setup. A workable prompt:
Read these three pages:
/video-pipeline/ · /video-pipeline/api.html · /video-pipeline/verify.html
My hardware is: <describe your machines — GPUs, RAM, OS, what's always-on>.
Design the functional split for my rig, implement the job API + orchestrator + the
Mac-side tools, and reproduce these verification layers (0 through 4) so I can confirm
each one. Keep the same contract shape; change only where each step runs.
It will re-derive the implementation for your machines — choosing models, drawing the functional split, and reproducing the layered checks — rather than copying this one. The verification layers are the acceptance test: when Layers 0–4 pass on your hardware, you have the real product of this project — a video-generation pipeline custom-fit to the machines it runs on, and proven so.
A note on the placeholders: <pc-host>,
<project-root>, <relay-mount>, <character-model>,
and <voice-id> are intentionally generic, and credentials are kept in the OS keychain
rather than written down. Your Claude Code fills these in for your environment; nothing here depends on
ours.