Screenplay Compiler — Mac Verification

Verbatim verification procedure (Mac terminal) · 2026 · ITDT LLC · sanitized placeholders

Audience: the owner, running on the Mac.

Goal: independently confirm that the 6 Mac build steps of the Screenplay Compiler work, then run the cross-machine integration test (Mac + PC together).

What this covers: generate_image start-frame mode, the screenplay schema + validator + library, generate_voiceover --voice, assemble_timeline, compose_video --pip-round/--pip-fade, the render_screenplay orchestrator + Elira tool wiring, and the full end-to-end run.

Each test = a command to copy/paste + what a PASS looks like. If a test fails, stop and tell me which test number + paste the output.

Companion docs: Design/Screenplay_Compiler_Mac_Build.md (what was built + why), Design/Screenplay_Compiler_Design.md (the locked feature spec), Relay/Design/Supervisor_API.md (the PC contract).

Time budget: Tests 1–9 are quick (seconds to ~2 min each, except the two marked SLOW). Test 10 (integration) includes a real ~31 min PC render + an ~11 min Mac lip-sync, so set aside ~50 min and run it when you can leave it.

How to run this document — read this first

You run these in Terminal, not in Elira. Open the macOS Terminal app (Applications → Utilities → Terminal), and keep it open for the whole session.

Each gray box is one command block. Run them one box at a time, top to bottom:

Select everything inside the gray box — from the first character to the last — and copy it (⌘C).
Click into the Terminal window and press ⌘V to paste.
Press Return to run it. If you pasted a multi-line box and the last line didn’t run on its own, press Return once more.
Look at what Terminal prints and compare it to the PASS description right below that box.

Paste each box whole; don’t run it line by line. Several boxes are a single multi-line command (you’ll see <<'EOF', <<'PYEOF', <<'JSON', or a trailing \ at the end of a line). Those only work if pasted as one piece; splitting them mid-command will error.

What is and isn’t a command:

Lines like echo "--- valid ---" are just labels that print a heading so you can tell which part of the output belongs to which step. Safe to paste.
The command boxes don’t contain # comment lines for you to type. (Your shell, zsh, does not treat a # typed at the prompt as a comment — it would try to run it and error — so the boxes use echo labels instead. The only # lines you’ll see are Python comments inside a python3 … <<'PYEOF' block, which are fed to Python, not the shell.)
The PASS: text, the tables, and this prose are for you to read — never paste them.

Run the tests in order. Test 0 (setup) creates the working folder and a sample file that later tests reuse, so do it first. Within a test, run its boxes top-to-bottom.

If a command seems to hang: the slow ones are marked (Test 1 first run loads models; Test 10 is ~50 min). Everything else should finish in seconds. If something else hangs, press Control-C to stop it, and tell me which test.

0. One-time setup

Run this first. It moves into the project folder and creates the working directory that every later test writes into (/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify). Without this, the next command fails with "No such file or directory."

cd /Users/<user>/dev/ITDT/Videos
mkdir -p /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
echo "ready — working dir is /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify"

PASS: prints ready — working dir is /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify and no error.

Now create a reusable valid 1-scene screenplay (used by several tests). Paste this whole box — it's one command that writes a file and then confirms it:

cat > /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json <<'EOF'
{
  "title": "Verify Mini",
  "scenes": [
    {
      "id": "s1",
      "character": "elira",
      "outfit": "default",
      "setting": "neon-lit city rooftop at dusk, wide skyline",
      "presentation": "full",
      "hold": 2,
      "timeline": [
        { "say": "This is the way." },
        { "wait": 3 },
        { "say": "Follow me." }
      ]
    }
  ]
}
EOF
echo "wrote /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json"

PASS: prints wrote /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json and no error.

Test 1 — `generate_image` start-frame mode (PuLID, 9:16)

Generates an identity-anchored Elira start frame from her reference still. Needs Mac ComfyUI (the script auto-starts it; first run is slower while models load).

cd /Users/<user>/dev/ITDT/Videos
REF=$(python3 -c "import json;print(json.load(open('character_library.json'))['elira']['outfits']['default']['reference'])")
python3 Tools/scripts/generate_image.py \
  --prompt "elira_host, medium close-up chest up, against a neon-lit city rooftop at dusk, holding pose, looking directly into camera, gentle closed-mouth expression, photorealistic, cinematic lighting" \
  --reference "$REF" --seed 7 \
  --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/startframe.png

PASS: stdout is one JSON line {"status": "ok", "mode": "start_frame", "size": "480x832", ...}. Confirm the dimensions and look:

sips -g pixelWidth -g pixelHeight /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/startframe.png
open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/startframe.png

PASS: pixelWidth 480, pixelHeight 832 (9:16 portrait), and the image is recognizably Elira — jet-black hair, magenta blazer, high-neck/modest, facing camera. (If the wardrobe is off, that's the start-frame review gate's job — see Test 1b.)
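
If you would rather check the dimensions from Python than from sips, the width and height of a PNG can be read straight from its header. This is a minimal sketch using only the standard library; the helper names png_size and is_start_frame_size are illustrative and are not part of generate_image.py.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(path):
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR".
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} is not a valid PNG")
    # Bytes 16..23 hold width and height as big-endian unsigned 32-bit ints.
    return struct.unpack(">II", header[16:24])

def is_start_frame_size(path, expected=(480, 832)):
    """True when the PNG matches the 9:16 start-frame size used in this test."""
    return png_size(path) == expected
```

Point is_start_frame_size at startframe.png; it only confirms the size, so you still need to open the image and look at it.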

Test 1b — candidate set for the review gate (optional)

cd /Users/<user>/dev/ITDT/Videos
REF=$(python3 -c "import json;print(json.load(open('character_library.json'))['elira']['outfits']['default']['reference'])")
python3 Tools/scripts/generate_image.py \
  --prompt "elira_host, medium close-up chest up, against a quiet modern office at night, holding pose, looking directly into camera, gentle closed-mouth expression, photorealistic" \
  --reference "$REF" --candidates 3 \
  --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/candidates
ls -la /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/candidates

PASS: JSON {"status":"ok","mode":"start_frame","count":3,...} and the directory holds 3 PNGs named candidate_01_seed*.png … candidate_03_*.

Test 1c — fail-loud (bad inputs)

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/generate_image.py --prompt x --reference /nope/none.png
python3 Tools/scripts/generate_image.py --prompt x --pulid
python3 Tools/scripts/generate_image.py --prompt x --size foo

PASS: each prints a single {"status":"error","message":"..."} line (missing reference / --pulid without --reference / bad size) and renders nothing.
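
The "single JSON status line" convention used throughout these tests can also be checked mechanically. The sketch below assumes only what the PASS text states (the tool's last stdout line is a JSON object with a "status" key, plus a "message" on errors); the function names are hypothetical helpers, not part of the tools.

```python
import json

def parse_status_line(stdout_text):
    """Parse the last non-empty stdout line as a JSON status object."""
    lines = [ln for ln in stdout_text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("tool printed nothing on stdout")
    result = json.loads(lines[-1])
    if not isinstance(result, dict) or "status" not in result:
        raise ValueError("last stdout line is not a status object")
    return result

def is_clean_error(stdout_text):
    """True for a fail-loud result: status 'error' with a non-empty message."""
    result = parse_status_line(stdout_text)
    return result["status"] == "error" and bool(result.get("message"))
```

Feeding it the captured stdout of any of the three commands above should return True from is_clean_error if the tool failed loudly as intended.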

Test 1d — library resolution (`--character`) + negative prompt (new this session)

--character resolves Elira's reference + composed look from character_library.json (so --prompt is just the SETTING); --negative steers the render away from unwanted wardrobe/anatomy. Flux ignores a negative prompt at cfg 1.0, so passing --negative automatically enables true CFG > 1. First the fail-loud checks (no GPU), then one real render with both:

cd /Users/<user>/dev/ITDT/Videos
echo "--- unknown character / outfit error cleanly (no ComfyUI) ---"
python3 Tools/scripts/generate_image.py --character nobody --prompt "on a rooftop"
python3 Tools/scripts/generate_image.py --character elira --outfit tuxedo --prompt "on a rooftop"
echo "--- real render: Elira, with a wardrobe negative ---"
python3 Tools/scripts/generate_image.py --character elira \
  --prompt "on a neon-lit city rooftop at dusk, wide skyline" \
  --negative "low-cut top, exposed cleavage, open blazer, deformed hands, extra fingers" \
  --seed 7 --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/char_neg.png
open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/char_neg.png

PASS: the two --character errors print {"status":"error",...} (character 'nobody' not in library / outfit 'tuxedo' not found ...) with no render; the real render prints {"status":"ok","mode":"start_frame","size":"480x832",...} and the image is recognizably Elira (jet-black hair, magenta blazer, high-neck/modest) on the rooftop.

Test 1e — `explore_outfits` — survey DIFFERENT outfits (new this session)

generate_image --candidates N re-rolls the SEED of ONE outfit (the wardrobe text is fixed by the library). To survey different outfits, explore_outfits.py holds the face / appearance / framing / quality constant (PuLID-anchored to the library reference) and varies ONLY the wardrobe text — one image per outfit. Each look composes in the exact start-frame order (trigger, framing, appearance, wardrobe, setting, gaze, quality).

cd /Users/<user>/dev/ITDT/Videos
echo "--- no outfits given: clean error, no GPU ---"
python3 Tools/scripts/explore_outfits.py --character elira
echo "--- real: three different outfits, one image each ---"
python3 Tools/scripts/explore_outfits.py --character elira \
  --outfit "charcoal_suit=wearing a charcoal grey tailored business pantsuit over a crisp white blouse, modest high neckline" \
  --outfit "navy_turtleneck=wearing a fitted navy blazer over a grey turtleneck, modest neckline" \
  --outfit "techwear=wearing a matte-black structured sci-fi utility jacket, high collar" \
  --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/outfits
open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/outfits

PASS: the no-outfit run prints {"status":"error","message":"no outfits given ..."} with no render; the real run prints {"status":"ok","count":3,...,"outfits":[...]} and the folder holds 01_charcoal_suit.png, 02_navy_turtleneck.png, 03_techwear.png — the same Elira face in three visibly different outfits (NOT three magenta-blazer takes). If a stubborn garment/colour from the reference bleeds through, add --negative "magenta blazer, pink jacket" and/or lower --pulid-weight (default 0.9).
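
To make the "vary ONLY the wardrobe text" idea concrete, here is a minimal sketch of composing a look in the start-frame order named above. The function names, the comma joiner, and the sample appearance/quality strings are assumptions for illustration; explore_outfits.py may compose its prompts differently.

```python
# Order taken from the description above; everything else is illustrative.
START_FRAME_ORDER = ("trigger", "framing", "appearance", "wardrobe",
                     "setting", "gaze", "quality")

def compose_look(parts):
    """Join the prompt parts in the fixed start-frame order."""
    missing = [key for key in START_FRAME_ORDER if not parts.get(key)]
    if missing:
        raise ValueError(f"missing prompt parts: {missing}")
    return ", ".join(parts[key].strip() for key in START_FRAME_ORDER)

def outfit_variants(base, outfits):
    """Return {outfit_name: prompt}, changing only the wardrobe text."""
    return {name: compose_look({**base, "wardrobe": wardrobe})
            for name, wardrobe in outfits.items()}

base = {
    "trigger": "elira_host",
    "framing": "medium close-up chest up",
    "appearance": "jet-black hair",            # hypothetical appearance text
    "wardrobe": "wearing a magenta blazer",    # replaced per outfit
    "setting": "against a neon-lit city rooftop at dusk",
    "gaze": "looking directly into camera",
    "quality": "photorealistic",
}
prompts = outfit_variants(base, {
    "navy_turtleneck": "wearing a fitted navy blazer over a grey turtleneck",
    "techwear": "wearing a matte-black structured sci-fi utility jacket",
})
```

Every prompt in prompts shares the same six non-wardrobe parts, which is what keeps the face and framing constant across the outfits.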

Test 1f — new-character pipeline: `build_character_set` + `register_character` (new this session)

The push-button path for a NEW character: build_character_set.py makes a PuLID-anchored training dataset (PNG stills + .txt captions, varied pose/angle/lighting, identity+outfit held constant) → the PC's train_character consumes it → register_character.py adds the trained character/outfit to character_library.json programmatically (preserving every other entry). Settings need no training because they're just text. This test runs a 1-image smoke build (a real run uses the default 8 buckets × 3 seeds = 24 stills and takes minutes), then exercises the library helper against a TEMP copy so the real library is untouched:

cd /Users/<user>/dev/ITDT/Videos
W=/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
printf 'head | tight headshot, head facing forward, neutral expression, looking into the camera | in a soft-lit studio against a grey backdrop\n' > $W/buckets_smoke.txt
python3 Tools/scripts/build_character_set.py --character smoketest \
  --reference Brand/Elira_Outfits/business_suit_480x832.png \
  --appearance "a woman in her early 30s, jet-black hair" --wardrobe "wearing a charcoal business suit" \
  --buckets-file $W/buckets_smoke.txt --seeds-per-bucket 1 --output $W/charset_smoke
echo "--- expect a .png AND a matching .txt ---"
ls $W/charset_smoke
cp character_library.json /tmp/lib_test.json
python3 Tools/scripts/register_character.py --library /tmp/lib_test.json --character nova \
  --voice nova_blend --trigger nova_host --appearance "a woman in her late 20s, short auburn hair" \
  --outfit default --reference Brand/Elira_Outfits/business_suit_480x832.png \
  --wardrobe "wearing a teal flight jacket" --lora nova_lora_v1
echo "--- error path: outfit field without --outfit ---"
python3 Tools/scripts/register_character.py --library /tmp/lib_test.json --character x --wardrobe foo

PASS: the build prints {"status":"ok","count":1,"trigger":"smoketest_host","base_caption":...} and the dir holds a 1024×1024 PNG plus a same-stem .txt whose text is the base caption; the first register prints {"status":"ok",...,"created_character":true,"created_outfit":true} and /tmp/lib_test.json is valid JSON still containing _meta + elira (with business_suit) and a new nova; the last prints {"status":"error","message":"outfit fields ... require --outfit"}.
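
The "preserving every other entry" behaviour can be pictured as a small dictionary update. This is a sketch under assumed names: the register function and the exact library layout (a character entry with an "outfits" mapping) are inferred from the commands in this document, not copied from register_character.py.

```python
import copy

def register(library, character, outfit=None,
             character_fields=None, outfit_fields=None):
    """Return (new_library, flags) without mutating the input library."""
    if outfit_fields and outfit is None:
        raise ValueError("outfit fields require an outfit name")
    lib = copy.deepcopy(library)          # every other entry is carried over as-is
    created_character = character not in lib
    entry = lib.setdefault(character, {"outfits": {}})
    entry.update(character_fields or {})
    created_outfit = False
    if outfit is not None:
        outfits = entry.setdefault("outfits", {})
        created_outfit = outfit not in outfits
        outfits.setdefault(outfit, {"base_clips": {}}).update(outfit_fields or {})
    return lib, {"created_character": created_character,
                 "created_outfit": created_outfit}
```

Working on a deep copy mirrors why the test above runs against /tmp/lib_test.json: the original stays untouched until you decide to keep the result.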

Test 2 — schema + `validate_screenplay` + library

The validator gates malformed input and emits the work plan.

cd /Users/<user>/dev/ITDT/Videos
echo "--- valid ---"
python3 Tools/scripts/validate_screenplay.py /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json
echo "--- pip without background (invalid) ---"
printf '{"scenes":[{"id":"s1","character":"elira","outfit":"default","setting":"office","presentation":"pip","timeline":[{"say":"Hi."}]}]}' > /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/bad_pip.json
python3 Tools/scripts/validate_screenplay.py /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/bad_pip.json
echo "--- unknown outfit (invalid) ---"
printf '{"scenes":[{"id":"s1","character":"elira","outfit":"tuxedo","setting":"office","presentation":"full","timeline":[{"say":"Hi."}]}]}' > /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/bad_outfit.json
python3 Tools/scripts/validate_screenplay.py /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/bad_outfit.json
echo "--- malformed JSON ---"
printf '{ "scenes": [ {"id":"s1", } ] }' > /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/malformed.json
python3 Tools/scripts/validate_screenplay.py /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/malformed.json
echo "--- missing file ---"
python3 Tools/scripts/validate_screenplay.py /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/nope.json

PASS, in order:

valid → "valid": true, "work_plan": [{"character":"elira","outfit":"default","setting":"neon-lit city rooftop at dusk, wide skyline"}], "new_base_clips": 1.
pip-no-background → "valid": false, error mentions pip presentation requires 'background'.
unknown outfit → "valid": false, error mentions outfit 'tuxedo' not found.
malformed → "valid": false, error mentions invalid JSON (a successful run reporting the problem, NOT a crash).
missing file → {"status":"error", ...} (this one is a tool error; exit code non-zero is expected).
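
The two gating rules and the work plan exercised above can be sketched in a few lines. This is an illustration only: the function name, the error wording, and the rule that a base clip counts as "new" when its setting is absent from the outfit's base_clips are assumptions, not the actual validate_screenplay.py logic.

```python
def validate(screenplay, library):
    """Check scenes against the library and build a de-duplicated work plan."""
    errors, plan = [], []
    for scene in screenplay.get("scenes", []):
        sid = scene.get("id", "?")
        character = library.get(scene.get("character"))
        if character is None:
            errors.append(f"scene {sid}: character {scene.get('character')!r} not in library")
            continue
        if scene.get("outfit") not in character.get("outfits", {}):
            errors.append(f"scene {sid}: outfit {scene.get('outfit')!r} not found")
            continue
        if scene.get("presentation") == "pip" and "background" not in scene:
            errors.append(f"scene {sid}: pip presentation requires 'background'")
        item = {"character": scene["character"], "outfit": scene["outfit"],
                "setting": scene.get("setting")}
        if item not in plan:
            plan.append(item)
    if errors:
        return {"valid": False, "errors": errors}
    # A plan entry needs a new base clip when its setting is not cached yet.
    new_clips = sum(
        1 for p in plan
        if p["setting"] not in library[p["character"]]["outfits"][p["outfit"]].get("base_clips", {})
    )
    return {"valid": True, "work_plan": plan, "new_base_clips": new_clips}
```

Note that an invalid screenplay is a normal return value here, not an exception, which matches the "successful run reporting the problem" behaviour described for malformed input.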

Test 3 — `generate_voiceover --voice`

Per-character voice; default is Elira's locked blend.

cd /Users/<user>/dev/ITDT/Videos
echo "--- default voice ---"
python3 Tools/scripts/generate_voiceover.py --text "Your finances, decade by decade." --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/vo.wav
echo "--- explicit elira_blend + pad to 5s ---"
python3 Tools/scripts/generate_voiceover.py --text "This is the way." --voice elira_blend --pad-to 5.0 --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/vo_padded.wav
echo "--- bad voice blend (fail-loud) ---"
python3 Tools/scripts/generate_voiceover.py --text "x" --voice /nope/missing.pt

PASS:

default → {"status":"ok","voice":"elira_blend","duration_s":<~3>, ...}; afplay /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/vo.wav sounds like Elira.
padded → "duration_s": 5.0 exactly (the pad-to invariant lip-sync relies on).
bad blend → {"status":"error",...}, no file produced.
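
The pad-to invariant (the WAV is exactly the requested length) comes down to appending silence until the frame count matches. The sketch below shows that idea with the standard wave module; pad_wav_to is a hypothetical helper, and generate_voiceover.py may implement --pad-to differently.

```python
import wave

def pad_wav_to(src, dst, seconds):
    """Copy src to dst, appending silence so dst lasts exactly `seconds`."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(reader.getnframes())
    if params.sampwidth == 1:
        # 8-bit WAV is unsigned, so zero bytes would not be silence.
        raise ValueError("this sketch assumes signed PCM (16-bit or wider)")
    target_frames = round(seconds * params.framerate)
    missing = target_frames - params.nframes
    if missing < 0:
        raise ValueError("audio is longer than the pad target")
    silence = b"\x00" * (missing * params.nchannels * params.sampwidth)
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)           # frame count is corrected on close
        writer.writeframes(frames + silence)
    return target_frames / params.framerate
```

Rejecting audio that is already longer than the target, rather than trimming it, keeps spoken words from being cut off silently.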

Test 4 — `assemble_timeline` deterministic ops (no PC needed)

This exercises the editing engine (timeline math, continuous-take WAV, ping-pong fill, concat, music, final mix) on synthetic inputs — no PC base clip and no 11-min lip-sync. It's the fast confidence check that the mini-editor math is correct.

cd /Users/<user>/dev/ITDT/Videos
python3 - <<'PYEOF'
import sys; sys.path.insert(0, "Tools/scripts")
import assemble_timeline as A
from pathlib import Path
import subprocess
wd = Path("/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/asm"); wd.mkdir(parents=True, exist_ok=True)

# synthetic say wavs + a 5s base clip
subprocess.run(["ffmpeg","-y","-f","lavfi","-i","sine=frequency=440:duration=2:sample_rate=24000","-ac","1",str(wd/"a.wav")],capture_output=True)
subprocess.run(["ffmpeg","-y","-f","lavfi","-i","sine=frequency=660:duration=1.5:sample_rate=24000","-ac","1",str(wd/"b.wav")],capture_output=True)
subprocess.run(["ffmpeg","-y","-f","lavfi","-i","testsrc=size=320x568:rate=30:duration=5","-pix_fmt","yuv420p",str(wd/"base.mp4")],capture_output=True)

pl, dur = A.compute_timeline([{"say":"a"},{"wait":3},{"say":"b"}], [2.0,1.5], hold=2.0)
assert dur == 8.5, dur                                  # 2 + 3 + 1.5 + 2
sw = A.build_scene_wav([wd/"a.wav",wd/"b.wav"], pl, dur, wd/"scene.wav")
assert abs(A.probe_duration(sw) - 8.5) < 0.05, A.probe_duration(sw)
canvas = A.pingpong_fill(wd/"base.mp4", 8.5, 60, wd/"canvas.mp4", wd/"pp")
assert abs(A.probe_duration(canvas) - 8.5) < 0.1 and A.probe_fps(canvas) == 60.0
print("PASS: timeline=%.1fs scene_wav=%.2fs canvas=%.2fs@%dfps" % (
      dur, A.probe_duration(sw), A.probe_duration(canvas), A.probe_fps(canvas)))
PYEOF

PASS: prints PASS: timeline=8.5s scene_wav=8.50s canvas=8.50s@60fps (no AssertionError).
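
The 8.5 s figure asserted in this test is plain timeline arithmetic: each say starts where the previous item ended, each wait advances the clock, and the hold is added at the end. The sketch below re-derives it independently; sketch_timeline is an illustrative stand-in and not the real assemble_timeline.compute_timeline.

```python
def sketch_timeline(items, say_durations, hold=0.0):
    """Return ([(start_s, length_s) per say], total_s) for a scene timeline."""
    placements, cursor = [], 0.0
    durations = iter(say_durations)
    for item in items:
        if "say" in item:
            length = next(durations)       # one measured duration per say
            placements.append((cursor, length))
            cursor += length
        elif "wait" in item:
            cursor += float(item["wait"])  # silence: only the clock moves
        else:
            raise ValueError(f"unknown timeline item: {item}")
    return placements, cursor + hold

placements, total = sketch_timeline(
    [{"say": "a"}, {"wait": 3}, {"say": "b"}], [2.0, 1.5], hold=2.0)
print(placements, total)   # [(0.0, 2.0), (5.0, 1.5)] 8.5
```

The second say starts at 5.0 s (2 s of speech plus the 3 s wait), and the total is 2 + 3 + 1.5 + 2 = 8.5 s.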

Test 5 — `compose_video --pip-round` + `--pip-fade`

Rounded-corner + alpha-fade PIP overlay.

cd /Users/<user>/dev/ITDT/Videos
W=/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
ffmpeg -y -f lavfi -i "testsrc=size=1280x720:rate=60:duration=4" -pix_fmt yuv420p $W/bg.mp4 >/dev/null 2>&1
ffmpeg -y -f lavfi -i "testsrc=size=480x832:rate=60:duration=4" -pix_fmt yuv420p $W/pip.mp4 >/dev/null 2>&1
python3 Tools/scripts/compose_video.py --background $W/bg.mp4 --pip $W/pip.mp4 \
  --pip-pos top-right --pip-width 360 --pip-start 0.5 --pip-end 3.5 \
  --pip-round 40 --pip-fade 0.5 --duration 4 --output $W/composite.mp4
ffmpeg -y -ss 2.0 -i $W/composite.mp4 -frames:v 1 $W/frame.png >/dev/null 2>&1
open $W/frame.png

PASS: JSON {"status":"ok","duration_s":4.0,"fps":60,...}; the extracted frame shows the PIP inset in the top-right with visibly rounded corners over the background.

Test 5b — `fill_clip` — extend a short clip to a target length (new this session)

Ping-pong fill (forward+reverse, seamless) a short clip to an exact duration — the step that turns a ~5s base clip into a ~10s lipsync canvas (VRT needs a face track as long as the audio). Thin CLI over the proven assemble_timeline.pingpong_fill().

cd /Users/<user>/dev/ITDT/Videos
W=/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
ffmpeg -y -f lavfi -i "testsrc=size=480x832:rate=30:duration=3" -pix_fmt yuv420p $W/short3.mp4 >/dev/null 2>&1
python3 Tools/scripts/fill_clip.py --input $W/short3.mp4 --duration 10 --fps 60 --output $W/filled10.mp4
echo "--- error path: --duration 0 must be rejected ---"
python3 Tools/scripts/fill_clip.py --input $W/short3.mp4 --duration 0 --output $W/x.mp4

PASS: the first prints {"status":"ok","duration":10.0,"fps":60,"source_duration":3.0,...} (output is exactly the target length); the second prints {"status":"error","message":"--duration must be > 0 ..."}.
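
A ping-pong fill alternates forward and reversed passes of the source and trims the last pass so the total lands exactly on the target. The planner below is a sketch of that idea only; pingpong_plan is a hypothetical name, and the real pingpong_fill also handles the frame rate and the ffmpeg work.

```python
def pingpong_plan(source_s, target_s):
    """Return [(direction, length_s)] passes that add up to target_s."""
    if source_s <= 0 or target_s <= 0:
        raise ValueError("durations must be > 0")
    plan, remaining, forward = [], float(target_s), True
    while remaining > 1e-9:
        length = min(float(source_s), remaining)   # last pass may be partial
        plan.append(("forward" if forward else "reverse", length))
        remaining -= length
        forward = not forward                      # flip direction each pass
    return plan

print(pingpong_plan(3, 10))
```

For the 3 s clip and 10 s target used in this test, the plan is forward 3 s, reverse 3 s, forward 3 s, then a 1 s reversed tail. Because each pass starts on the frame where the previous one ended, there is no visible jump at the turnaround.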

Test 5c — `compose_video` general multi-overlay + multi-audio (new this session)

The general path (triggered by any --overlay/--audio) places multiple timed overlays (videos and still images) and mixes multiple offset audio tracks. This is the App Store preview layout (two corner PIPs + a centered icon + two voiceovers at different offsets + a ducked music bed) as a single repeatable API. The legacy single --pip path (Test 5) is unchanged.

cd /Users/<user>/dev/ITDT/Videos
W=/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
ffmpeg -y -f lavfi -i "color=c=navy:s=540x960:r=30:d=27" -pix_fmt yuv420p $W/g_bg.mp4 >/dev/null 2>&1
ffmpeg -y -f lavfi -i "testsrc=s=240x416:r=30:d=10" -pix_fmt yuv420p $W/g_intro.mp4 >/dev/null 2>&1
ffmpeg -y -f lavfi -i "testsrc=s=240x416:r=30:d=10" -pix_fmt yuv420p $W/g_outro.mp4 >/dev/null 2>&1
ffmpeg -y -f lavfi -i "color=c=white:s=200x200:d=1" -frames:v 1 $W/g_icon.png >/dev/null 2>&1
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=10"  $W/g_intro_vo.wav >/dev/null 2>&1
ffmpeg -y -f lavfi -i "sine=frequency=660:duration=8.5" $W/g_outro_vo.wav >/dev/null 2>&1
ffmpeg -y -f lavfi -i "sine=frequency=220:duration=25"  $W/g_music.wav >/dev/null 2>&1
python3 Tools/scripts/compose_video.py --background $W/g_bg.mp4 --fps 30 --duration 27 --output $W/general.mp4 \
  --overlay "src=$W/g_intro.mp4,pos=top-right,start=0,end=10,round=40,fade=0.5,width=240" \
  --overlay "src=$W/g_outro.mp4,pos=top-right,start=17,end=27,round=40,fade=0.5,width=240" \
  --overlay "src=$W/g_icon.png,type=image,pos=center,start=25,end=27,fade=1,width=200" \
  --audio "src=$W/g_intro_vo.wav,delay=0" --audio "src=$W/g_outro_vo.wav,delay=17" --audio "src=$W/g_music.wav,gain=-12"
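ffmpeg -y -ss 26 -i $W/general.mp4 -frames:v 1 $W/g_f26.png >/dev/null 2>&1
ffmpeg -y -ss 12 -i $W/general.mp4 -frames:v 1 $W/g_f12.png >/dev/null 2>&1
echo "--- audio stream check (expect: audio) ---"
ffprobe -v error -select_streams a -show_entries stream=codec_type -of csv=p=0 $W/general.mp4
open $W/g_f26.png $W/g_f12.png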

PASS: JSON {"status":"ok","duration_s":27.0,"overlays":3,"audio_tracks":3,...}; the t=26 frame shows the outro PIP top-right AND the centered icon, a frame at t=12 shows neither (the gap), and the file has an audio stream. The Test 5 (legacy --pip) result is unchanged.
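
To see why t=26 should show two overlays and t=12 none, the --overlay specs can be read as key=value pairs and checked against a timestamp. This is a sketch that mirrors the spec strings in the command above; parse_spec and active_overlays are hypothetical helpers, the start-inclusive/end-exclusive window is an assumption, and the simple split would not cope with a path containing a comma.

```python
def parse_spec(spec):
    """Split 'key=value,key=value' into a dict, converting timing fields to float."""
    fields = dict(part.split("=", 1) for part in spec.split(","))
    for key in ("start", "end", "fade", "delay", "gain"):
        if key in fields:
            fields[key] = float(fields[key])
    return fields

def active_overlays(specs, t):
    """Return the src of every overlay whose [start, end) window contains t."""
    active = []
    for spec in specs:
        overlay = parse_spec(spec)
        if overlay.get("start", 0.0) <= t < overlay.get("end", float("inf")):
            active.append(overlay["src"])
    return active

specs = [
    "src=g_intro.mp4,pos=top-right,start=0,end=10,round=40,fade=0.5,width=240",
    "src=g_outro.mp4,pos=top-right,start=17,end=27,round=40,fade=0.5,width=240",
    "src=g_icon.png,type=image,pos=center,start=25,end=27,fade=1,width=200",
]
print(active_overlays(specs, 26))   # ['g_outro.mp4', 'g_icon.png']
print(active_overlays(specs, 12))   # []
```

The gap between 10 s and 17 s is where neither corner PIP is scheduled, which is why a frame at t=12 should show only the background.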

Test 6 — `render_screenplay --dry-run` (plan + cost, no GPU)

The orchestrator's planning path. Spends nothing.

cd /Users/<user>/dev/ITDT/Videos
echo "--- valid screenplay ---"
python3 Tools/scripts/render_screenplay.py --screenplay /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json --dry-run
echo "--- invalid screenplay (must gate) ---"
python3 Tools/scripts/render_screenplay.py --screenplay /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/bad_pip.json --dry-run

PASS:

valid → {"status":"ok","valid":true,"dry_run":true,"scenes":1,"work_plan":[...1 entry...],"estimate":{"new_base_clips":1,"pc_minutes_est":31.0,"mac_minutes_est":12.0,"total_minutes_est":43.0},...}.
invalid → {"status":"ok","valid":false,"errors":["... pip presentation requires 'background'"]} — no GPU touched.

Test 7 — character library is correct

cd /Users/<user>/dev/ITDT/Videos
python3 -c "import json; d=json.load(open('character_library.json'))['elira']['outfits']['default']; print('lora =', d['lora']); print('reference exists =', __import__('os').path.exists(d['reference']))"

PASS: lora = elira_lora_v2_lownoise and reference exists = True.

Test 8 — Elira sees the new tools (model wiring)

Confirms the rebuilt elira:latest advertises the new tools and the dispatch is wired.

cd /Users/<user>/dev/ITDT/Videos
echo "--- system prompt mentions the new tools ---"
ollama show elira --system | grep -o -E "render_screenplay|stop_pc_job|stop_generation|mount_relay|prepare_delivery" | sort -u
echo "--- tool registry + dispatch all consistent ---"
python3 - <<'PYEOF'
import re
src = open("Elira/elira_tools.py").read()
dispatch = set(re.findall(r'if name == "(\w+)"', src))
handlers = set(re.findall(r'def (t_\w+)\(', src))
# The 4 memory tools dispatch via the elira_memory module, not t_* handlers.
memory_tools = {"remember", "recall_memory", "list_memories", "forget_memory"}
names = []
for n in re.findall(r'"name": "(\w+)"', src):
    if n not in names: names.append(n)
missing = [n for n in names if n not in memory_tools
           and not (n in dispatch and ("t_"+n) in handlers)]
print("tools:", len(names), "| missing wiring:", missing)
PYEOF

PASS: the grep prints all five names (mount_relay, prepare_delivery, render_screenplay, stop_generation, stop_pc_job); the python prints tools: 24 | missing wiring: [].

Test 8b — drive the orchestrator through Elira (optional, talks to the 72B)

Only if Elira's chat shell is running and you want to confirm the model actually calls the tool. In Elira chat, say:

"Do a dry run of the screenplay at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json"

PASS: Elira calls render_screenplay with dry_run and reports back the plan (1 scene, 1 base clip, ~43 min estimate) rather than describing it in prose.

Test 9 — PC reachable + Relay mountable (pre-flight for the integration test)

cd /Users/<user>/dev/ITDT/Videos
curl -s -m 5 http://<pc-host>:8765/healthz; echo
curl -s -m 5 http://<pc-host>:8765/queue | python3 -m json.tool | head -20
curl -s -m 5 http://<pc-host>:8765/scripts | python3 -c "import sys,json; print('render_base_clip present:', 'render_base_clip' in json.load(sys.stdin).get('scripts',{}))"
echo "--- mount the Relay share (unattended, uses saved credentials) ---"
python3 Tools/scripts/mount_relay.py

PASS: healthz responds (supervisor 0.4.0), /queue returns JSON, render_base_clip present: True, and mount_relay.py prints {"status":"ok","mounted":true,...} (either already:true or a fresh mount). If the PC is off or unreachable, power it on / start the Supervisor before Test 10. (You no longer have to log into the share by hand — render_screenplay also auto-mounts it when a PC render is needed.)

Test 10 — CROSS-MACHINE INTEGRATION TEST (the real end-to-end) ⏳ ~50 min

This is the one that proves the whole pipeline. It will: generate an Elira start frame on the Mac → stage it to the PC → render a ~5 s base clip on the PC (~31 min) → voiceover + single lip-sync pass (~11 min) + compose on the Mac → assemble the master.

Prereqs: Test 9 passed (PC up; Relay mountable). render_screenplay auto-mounts the share itself when it needs the PC, so you don't have to mount it by hand. Leave the machine alone while it runs; it's a "kick it off and come back" job.

Recommended: run the dry-run first (Test 6) so you see the plan, then:

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/render_screenplay.py \
  --screenplay /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/mini.json \
  --output /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/master.mp4 \
  2>/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/render.log
echo "EXIT=$?"

While it runs you can watch progress from another Terminal window. This is the Mac-side step log — it now prints a timestamped line for each step (validating, start frame, staging, PC job id, poll progress, assembling, done):

tail -f /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/render.log

And, in yet another window, to check whether the PC render has landed in the review folder (re-run it whenever you want a fresh listing):

ls -la /Volumes/Relay/review/screenplay_*/ 2>/dev/null

(Press Control-C to stop a tail -f when you're done watching — it doesn't affect the render.)

PASS (the full chain):

Final stdout is one JSON line: {"status":"ok","valid":true,"path":".../master.mp4","scenes":1,"base_clips_generated":1,...}.
The master plays and is correct:
```
open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/master.mp4
```
Elira on the rooftop, lips moving on “This is the way.” and “Follow me.” and at rest (mouth closed) during the 3 s gap, audio in her voice, ~8.5 s long, no seams at the ping-pong loop point. Her eyes should be steady, with no flickering/sparkling “twinkle” (the eye-restore composite handles this; if you see eye shimmer, note it). The output is 60 fps: head/background motion comes from the 16 fps Wan source, while the mouth is synced at 60 fps.
The base clip was promoted into the library cache:
```
python3 -c "import json; print(json.load(open('character_library.json'))['elira']['outfits']['default']['base_clips'])"
```
shows the rooftop setting mapped to a .mp4 filename (no longer empty {}).
Resume check (cheap, proves caching): re-run the exact render_screenplay command from the top of this test. The second run should skip the 31-min PC render (the base clip is cached) and just re-assemble in a couple of minutes. PASS = much faster second run, same master.
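
The resume behaviour rests on a simple lookup: if the library already maps the scene's setting to a base clip, the PC render can be skipped. A minimal sketch follows, assuming base_clips is keyed by the exact setting string as the check above suggests; the function names and the sample filename are hypothetical, and render_screenplay.py may key or store its cache differently.

```python
def cached_base_clip(library, character, outfit, setting):
    """Return the cached clip filename for this setting, or None if absent."""
    clips = (library.get(character, {})
                    .get("outfits", {})
                    .get(outfit, {})
                    .get("base_clips", {}))
    return clips.get(setting)

def promote(library, character, outfit, setting, filename):
    """Record a freshly rendered base clip so later runs can reuse it."""
    outfit_entry = library[character]["outfits"][outfit]
    outfit_entry.setdefault("base_clips", {})[setting] = filename

library = {"elira": {"outfits": {"default": {"base_clips": {}}}}}
setting = "neon-lit city rooftop at dusk, wide skyline"
first_run_needs_pc = cached_base_clip(library, "elira", "default", setting) is None
promote(library, "elira", "default", setting, "rooftop_base.mp4")  # illustrative name
second_run_needs_pc = cached_base_clip(library, "elira", "default", setting) is None
```

Under this model the first run needs the PC and the second does not, which is the speed difference the resume check looks for.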

If it fails partway: the step log (/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/render.log) and the final JSON message say which step. Common ones: PC unreachable (Test 9), the start frame failed (Test 1), or the lip-sync timed out. Tell me the failing step + the log tail.

Test 11 — `prepare_delivery` (publish-ready formats, no PC) ⏳ first run ~4 min

Upscales a finished master (Real-ESRGAN, no face-enhance → no eye flicker) and formats it per platform. Uses the master from Test 10 (or any master). The first platform does the upscale (~4 min for an ~8 s clip); the rest reuse the cache (~2 s). App Store is special — a preview is a composite, so it produces a PIP-ready element by default.

cd /Users/<user>/dev/ITDT/Videos
M=/Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify/master.mp4
echo "--- YouTube 1080x1920@60 (does the upscale) ---"
python3 Tools/scripts/prepare_delivery.py --master "$M" --platform youtube
echo "--- X 1080x1920@30 (reuses the cache, fast) ---"
python3 Tools/scripts/prepare_delivery.py --master "$M" --platform x
echo "--- App Store: PIP-ready element (default) ---"
python3 Tools/scripts/prepare_delivery.py --master "$M" --platform appstore
echo "--- App Store: full-frame letterbox fallback ---"
python3 Tools/scripts/prepare_delivery.py --master "$M" --platform appstore --full-frame
open /Users/<user>/dev/ITDT/Videos/Generated/Delivery/

PASS: each prints {"status":"ok",...}. youtube → res:"1080x1920","fps":60; x → fps:30; appstore default → "mode":"pip_ready" + a note that the preview is a composite; appstore --full-frame → res:"1290x2796" (a tall device frame with visible letterbox — that's expected, it's why the composite is the real App Store path). The files open and play; the faces are sharper than the 480×832 master with no eye shimmer.

Optional — the App Store composite (talking head as a corner PIP over an app capture):

python3 Tools/scripts/prepare_delivery.py --master "$M" --platform appstore \
  --background /Users/<user>/dev/ITDT/Videos/Episodes/AppStore_v1/capture_trimmed_27s.mp4 \
  --pip-width 460

PASS: "mode":"composite"; the output shows the app screen full-frame with Elira as a rounded top-right inset.

⚠️ Verify the composite isn't silent or the wrong file. Two real defects shipped here before: the composite came out silent (the app-capture background has no audio and the compositor dropped the talking head's voice), and a near-identically-named intermediate …_composite_pip.mp4 (just the talking head, no app behind it) was left next to the real composite. Run these three checks on the actual output:

F=/Users/<user>/dev/ITDT/Videos/Generated/Delivery/master_appstore_composite.mp4
echo "--- (a) audio present + not silent: expect a real max_volume (around -8 dB), NOT -91 dB and NOT an empty result ---"
ffmpeg -v info -i "$F" -map 0:a -af volumedetect -f null - 2>&1 | grep -iE "mean_volume|max_volume"
echo "--- (b) only the deliverable is left: expect ONLY master_appstore_composite.mp4 ---"
ls /Users/<user>/dev/ITDT/Videos/Generated/Delivery/ | grep -i composite
echo "--- (c) eyeball a frame from the middle: Elira should be a rounded corner PIP over the app ---"
ffmpeg -y -v error -ss 12 -i "$F" -frames:v 1 /tmp/appstore_check.png && open /tmp/appstore_check.png

PASS: (a) prints a real max_volume near −8 dB; (b) lists only master_appstore_composite.mp4; (c) the frame shows the rounded corner PIP over the app. If (a) shows no audio stream or −91 dB, the composite is silent — stop and report it.
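
If you want check (a) as a yes/no answer instead of reading the numbers, the max_volume line that volumedetect prints can be parsed. This sketch treats a missing max_volume line the same as silence; the helper names and the -60 dB floor are assumptions chosen to sit between the healthy (about -8 dB) and silent (-91 dB) values quoted above.

```python
import re

def max_volume_db(ffmpeg_output):
    """Extract the max_volume value (in dB) from volumedetect output, or None."""
    match = re.search(r"max_volume:\s*(-?\d+(?:\.\d+)?) dB", ffmpeg_output)
    return float(match.group(1)) if match else None

def is_silent(ffmpeg_output, floor_db=-60.0):
    """True when there is no audio result at all or the peak is below the floor."""
    level = max_volume_db(ffmpeg_output)
    return level is None or level <= floor_db
```

Pass it the text that the ffmpeg command in (a) prints; a True result means the composite should be treated as silent and reported.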

Episode Compiler — Mac Verification

The newer, broader system: one episode JSON (a list of typed shots — host, host_pip, narration, title, montage) becomes a full multi-minute master with overlays (chyrons / lower-thirds), a music bed + one-shot stings, transitions (cut / dissolve / fade), and camera moves. Driven by render_episode (orchestrator) → assemble_episode (editor), checked by validate_episode. Tests E1–E4 + E6–E7 need no PC; E5 is the long PC chain. (The Screenplay Compiler above is still the short single-scene path; the Episode Compiler reuses its base-clip pipeline underneath.)

Test E1 — `validate_episode` (structure + content lint, no GPU)

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/validate_episode.py --episode Episodes/E01/E01.episode.json | python3 -m json.tool | head -20

PASS: "status":"ok", "valid": true, "summary": {"shots": 13}, errors empty, and warnings lists the not-yet-recorded app captures (e.g. capture_05_…mov … not found yet) — those are expected until you record them (a missing capture is a WARNING, not an error, so you can author and dry-run first). A malformed script → "valid": false with errors; off-brand narration trips the content lint.

Test E2 — assemble a full NO-HOST episode (title + narration + montage + overlays + music; no PC) ⏳ ~30 s

Exercises every non-host shot type, both overlay kinds, a bed + a sting, and a transition — all on the Mac, no GPU/PC.

cd /Users/<user>/dev/ITDT/Videos
mkdir -p verification/temp/ep
cat > verification/temp/ep/verify.episode.json <<'JSON'
{
  "title": "verify — no-host episode", "fps": 60, "canvas": { "w": 1920, "h": 1080 },
  "music": { "bed": { "source": "prompt", "value": "warm synth bed", "duck_db": -12 },
             "cues": [ { "id": "chime", "at_s": 1.5, "source": "prompt", "value": "short bright ui chime" } ] },
  "overlays_global": [ { "text": "Next episode: coming soon.", "style": "lower_third", "pos": "lower_third", "from_s": 5.0, "to_s": 9.0 } ],
  "shots": [
    { "id": "t", "type": "title", "duration_s": 4.0,
      "elements": [ { "text": "Slipstream", "style": "title_magenta", "anim": "materialize" } ],
      "overlays": [ { "text": "E01 — Slipstream", "style": "chyron", "pos": "lower_right" } ],
      "transition_out": "dissolve" },
    { "id": "n", "type": "narration",
      "background": { "source": "file", "value": "/Users/<user>/dev/ITDT/Videos/Episodes/E01/captures/placeholder_app_portrait.png" },
      "voiceover": [ { "say": "Voice over the app capture." } ], "transition_out": "cut" },
    { "id": "m", "type": "montage",
      "voiceover": [ { "say": "Holographic panels, dial by dial." } ],
      "panels": [ { "kind": "income", "capture": "/Users/<user>/dev/ITDT/Videos/Episodes/E01/montage_panels/panel_income.png" },
                  { "kind": "chart",  "capture": "/Users/<user>/dev/ITDT/Videos/Episodes/E01/montage_panels/panel_chart.png" } ] }
  ]
}
JSON
python3 Tools/scripts/render_episode.py --episode verification/temp/ep/verify.episode.json \
  --workdir verification/temp/ep/work --output verification/temp/ep/verify.mp4
# inspect
ffprobe -v error -show_entries stream=codec_type,width,height -show_entries format=duration -of default=noprint_wrappers=1 verification/temp/ep/verify.mp4
ffmpeg -y -v error -ss 2  -i verification/temp/ep/verify.mp4 -frames:v 1 /tmp/ep_title.png    # title card + chyron
ffmpeg -y -v error -ss 9  -i verification/temp/ep/verify.mp4 -frames:v 1 /tmp/ep_montage.png  # hologram panel + outro lower-third
open /tmp/ep_title.png /tmp/ep_montage.png

PASS: {"status":"ok","valid":true,...}; the master is 1920×1080 (no host → assembled at the canvas), has a video + an aac audio stream, and runs ~13 s. The title frame shows the “SLIPSTREAM” card with the lower-right chyron; the later frame shows a glowing hologram panel with the “Next episode” lower-third. (volumedetect should show real audio — bed + VO + chime.)
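The ffprobe line in the box prints plain key=value pairs, so the size, stream, and duration checks above can be scripted. The sketch below is an illustration only: the function names are hypothetical, the 2 s duration tolerance is an assumption, and SAMPLE is hand-written to mimic the expected shape rather than copied from a real run. The ffprobe command as written reports codec_type only, so this cannot confirm that the audio codec is aac.

```python
from __future__ import annotations

def parse_ffprobe_kv(text: str) -> dict:
    """Parse 'key=value' lines from ffprobe -of default=noprint_wrappers=1."""
    streams: list[dict] = []
    info: dict = {"streams": streams}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        if key == "codec_type":          # assumed to open each stream's fields
            streams.append({"codec_type": value})
        elif key in ("width", "height") and streams:
            streams[-1][key] = int(value)
        elif key == "duration":
            info["duration"] = float(value)
    return info

def check_e2_master(info: dict, width: int = 1920, height: int = 1080,
                    expect_s: float = 13.0, tol_s: float = 2.0) -> list[str]:
    """Return problems against the Test E2 PASS description (empty list = PASS)."""
    problems: list[str] = []
    video = [s for s in info["streams"] if s["codec_type"] == "video"]
    audio = [s for s in info["streams"] if s["codec_type"] == "audio"]
    if not video:
        problems.append("no video stream")
    elif (video[0].get("width"), video[0].get("height")) != (width, height):
        problems.append(f"video is {video[0].get('width')}x{video[0].get('height')}")
    if not audio:
        problems.append("no audio stream")
    duration = info.get("duration")
    if duration is None or abs(duration - expect_s) > tol_s:
        problems.append(f"duration {duration!r} is not within {tol_s} s of {expect_s} s")
    return problems

# Hand-written sample in the expected shape (not real ffprobe output).
SAMPLE = """codec_type=video
width=1920
height=1080
codec_type=audio
duration=13.033000
"""
print(check_e2_master(parse_ffprobe_kv(SAMPLE)) or "PASS")
```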

Test E3 — `render_episode --dry-run` on the real E01 (plan + extended-chain cost, no GPU)

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/render_episode.py --episode Episodes/E01/E01.episode.json --dry-run 2>/dev/null | python3 -m json.tool | head -28

PASS: "valid": true, "host_shots": 6, base_clip_seconds ≈ 18.6 / 18.5 s per setting (the host VO is pre-synthesized on the Mac to size the clip — no GPU), and estimate shows ~8 pc_segments and ~248 pc_minutes_est — i.e. the long host shots are built as drift-managed chains of short segments, not one impossible long render. Nothing touches the GPU.
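The figures in this PASS line are consistent with simple arithmetic, which helps when judging whether a different episode's estimate looks sane. The sketch below is one way to reproduce them, not the estimator's actual code. It assumes two settings (18.6 s and 18.5 s of base clip), the 5.06 s segment length quoted as the EXTEND_SEG_S default in Test E5, and the ~31 min PC render time quoted in the time budget at the top of this document.

```python
import math

def estimate_chain(base_clip_seconds, segment_s=5.06, minutes_per_segment=31):
    """Return (segments per setting, total segments, total PC minutes).

    Assumption: each setting's base clip is split into ceil(length / segment_s)
    segments and every segment costs the same PC time.
    """
    per_setting = [math.ceil(s / segment_s) for s in base_clip_seconds]
    total_segments = sum(per_setting)
    return per_setting, total_segments, total_segments * minutes_per_segment

per_setting, pc_segments, pc_minutes = estimate_chain([18.6, 18.5])
print(per_setting, pc_segments, pc_minutes)   # [4, 4] 8 248
```

Under these assumptions each setting needs four segments, giving the 8 segments and 248 minutes shown in the dry-run.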

Test E4 — holographic montage look (Shot 11, no PC) ⏳ ~20 s

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/render_episode.py --episode Episodes/E01/E01_montage_holo.episode.json --workdir Episodes/E01/work_montage_holo
open Episodes/E01/master/E01_montage_holo.episode.mp4

PASS: ~17 s, 1920×1080; the four translucent 3D hologram panels materialize full-frame in sequence under the voiceover (you judge the look — this is style verification).

Test E5 — extended-clip chain (drift-managed, NEEDS THE PC) ⏳ ~30 min per 5 s segment

How a long host shot stays on-model. In production this is automatic: render_episode calls the orchestrator (render_screenplay.render_extended_base_clip) whenever a host shot's VO is longer than one segment. The orchestrator renders short segments on the PC, seeds each from the previous segment's seam frame (raw at d0.0, or identity-re-anchored if denoise > 0), and concatenates them aligned. To verify the mechanism by hand without a full episode, use the test drivers:

cd /Users/<user>/dev/ITDT/Videos
# one segment from a seam frame (d0.00 = raw seed = smoothest seam; --seconds 3 = shorter/cheaper):
python3 Episodes/E01/sample15/chain_step.py \
  --seg1 Episodes/E01/base_clips/elira_rain_street_832x480_c3_5s.mp4 \
  --reference Episodes/E01/start_frames/elira_shot01_rain_street_832x480_c3.png \
  --setting "rain-slicked empty street at night, NY-style neon megalopolis towering behind, volumetric haze, magenta and cyan rim light" \
  --denoise 0.0 --seconds 3 --out-prefix verify_chain
# multi-segment chain (5s seed clip + two 3s segments), then watch the join:
python3 Episodes/E01/sample15/chain_run.py --init-clip Episodes/E01/base_clips/elira_rain_street_832x480_c3_5s.mp4 \
  --reference Episodes/E01/start_frames/elira_shot01_rain_street_832x480_c3.png \
  --setting "rain-slicked empty street at night, NY-style neon megalopolis towering behind, volumetric haze, magenta and cyan rim light" \
  --steps "3:0.0,3:0.0" --out-prefix verify_chain2
open Episodes/E01/sample15/chain_verify_chain2.mp4

PASS: the PC renders each segment (~16 min for a 3 s / 49-frame clip, ~30 min for 5 s); the concatenated clip plays continuously and the character holds together across the seams (you judge the video). The orchestrator's segment length and seam mode are the constants EXTEND_SEG_S / EXTEND_DENOISE in render_episode.py (default 5.06 s, raw seam).
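Before starting a chain it helps to know how long the PC will be busy. The sketch below reads a --steps value and estimates the run. It rests on assumptions: each step is read as seconds:denoise (inferred from the comment "two 3s segments" next to "3:0.0,3:0.0"), the time estimate is a straight line through the two figures quoted above (3 s at ~16 min, 5 s at ~30 min), and the total length ignores any trimming at the seams. The function names are hypothetical and are not part of chain_run.py.

```python
from __future__ import annotations

def parse_steps(spec: str) -> list[tuple[float, float]]:
    """Parse 'seconds:denoise,seconds:denoise' into (seconds, denoise) pairs."""
    steps = []
    for part in spec.split(","):
        seconds, denoise = part.strip().split(":")
        steps.append((float(seconds), float(denoise)))
    return steps

def seam_mode(denoise: float) -> str:
    """Raw seed at denoise 0.0, identity-re-anchored otherwise (per the text above)."""
    return "raw" if denoise == 0.0 else "identity-re-anchored"

def estimate_minutes(seconds: float) -> float:
    """Rough PC time: linear through the quoted points (3 s, 16 min) and (5 s, 30 min)."""
    return 16 + (seconds - 3) * 7

init_clip_s = 5.0                      # the 5 s seed clip passed as --init-clip
steps = parse_steps("3:0.0,3:0.0")
total_s = init_clip_s + sum(s for s, _ in steps)
total_min = sum(estimate_minutes(s) for s, _ in steps)
for n, (s, d) in enumerate(steps, start=1):
    print(f"step {n}: {s} s, {seam_mode(d)} seam, ~{estimate_minutes(s):.0f} min")
print(f"chain length ~{total_s} s, PC time ~{total_min:.0f} min")
```

For the chain_run.py example above this gives two raw-seam steps, roughly 11 s of video, and about 32 minutes of PC time.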

Test E6 — `prepare_delivery` upscale of an episode master (no PC) ⏳ ~3 min

cd /Users/<user>/dev/ITDT/Videos
python3 Tools/scripts/prepare_delivery.py --master Episodes/E01/master/E01_montage_holo.episode.mp4 --platform youtube_landscape

PASS: a …_youtube_landscape_1920x1080.mp4 under Generated/Delivery/, 1920×1080@60, audio intact (Real-ESRGAN upscale is the terminal step — assembly stays native, upscale is dead-last).

Test E7 — pause / resume the extended chain (no PC) ⏳ ~15 s

The long host chain can be paused at a segment boundary and resumed without losing GPU work (render_control.py; the chain checks the .render_paused sentinel before each segment). This verifies the control + the hold/release logic headlessly — no GPU.

cd /Users/<user>/dev/ITDT/Videos
# 1) the control CLI round-trips the sentinel
python3 Tools/scripts/render_control.py status    # paused:false
python3 Tools/scripts/render_control.py pause     # paused:true
python3 Tools/scripts/render_control.py status    # paused:true
python3 Tools/scripts/render_control.py resume    # was_paused:true
ls Tools/scripts/.render_paused 2>/dev/null && echo "STILL THERE (FAIL)" || echo "sentinel gone (ok)"
# 2) the chain actually HOLDS while paused and continues when cleared (function-level, no GPU)
python3 - <<'PY'
import sys, time, threading
sys.path.insert(0, "Tools/scripts")
import render_screenplay as rs
rs.PAUSE_POLL_S = 1                                   # snappier poll for the test
assert rs.pause_requested() is False
rs.PAUSE_SENTINEL.write_text("paused\n")             # arm a pause
try:
    assert rs.pause_requested() is True
    released = []
    def clear():                                      # simulate "resume" after 2 s
        time.sleep(2); released.append(True); rs.PAUSE_SENTINEL.unlink()
    threading.Thread(target=clear, daemon=True).start()
    t0 = time.time(); rs._wait_if_paused("test boundary"); held = round(time.time() - t0, 1)
    assert held >= 1.5 and released, f"did not wait for resume (held {held}s)"
    print(f"PASS — held {held}s until resume, then continued")
finally:
    rs.PAUSE_SENTINEL.unlink(missing_ok=True)         # never leave a pause armed if the test fails
PY

PASS: the CLI reports paused:false, then paused:true after pause, then was_paused:true on resume, and the sentinel file is gone afterward; the Python block prints PASS — held ~2s until resume, then continued (i.e. _wait_if_paused blocked while the sentinel was present and returned once it was removed).

Segment-resume (no separate test): re-run E5's chain, or re-run render_episode after a pause-then-kill, and the chain reuses any seg_NN.mp4 already on disk (the log shows already rendered — reusing) instead of re-rendering finished segments. Elira drives the same controls via pause_render / resume_render / render_pause_status (Elira-chat doc Test E5).
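The pause check and the segment reuse described above follow a common sentinel-file pattern. The sketch below is a self-contained illustration of that pattern, not the code in render_screenplay.py: the function names, the two-digit zero-based seg_NN numbering, and the temporary directory are all assumptions made for the example.

```python
from __future__ import annotations
import tempfile
import threading
import time
from pathlib import Path

def pause_requested(sentinel: Path) -> bool:
    """A pause is 'armed' for as long as the sentinel file exists."""
    return sentinel.exists()

def wait_if_paused(sentinel: Path, poll_s: float = 1.0) -> float:
    """Block at a segment boundary until the sentinel is removed; return seconds held."""
    t0 = time.monotonic()
    while pause_requested(sentinel):
        time.sleep(poll_s)
    return time.monotonic() - t0

def segments_to_render(workdir: Path, total: int) -> list[int]:
    """Skip segments whose seg_NN.mp4 is already on disk (naming is an assumption)."""
    return [n for n in range(total) if not (workdir / f"seg_{n:02d}.mp4").exists()]

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / "seg_00.mp4").touch()          # pretend two segments finished earlier
    (work / "seg_01.mp4").touch()
    todo = segments_to_render(work, total=4)

    sentinel = work / ".render_paused"
    sentinel.write_text("paused\n")        # arm a pause
    threading.Timer(0.2, sentinel.unlink).start()   # "resume" after 0.2 s
    held = wait_if_paused(sentinel, poll_s=0.05)

print(f"segments still to render: {todo}; held {held:.2f}s before continuing")
```

Checking the sentinel only between segments is what lets a pause take effect without discarding a segment that is already rendering.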

Cleanup (optional)

rm -rf /Users/<user>/dev/ITDT/Videos/verification/temp/sc_verify
rm -rf /Users/<user>/dev/ITDT/Videos/verification/temp/ep

The promoted base clip stays cached in character_library.json + the run workdir under Generated/Screenplay/. To force a clean re-render, clear that setting's entry from base_clips in the library.

Quick reference — what each test proves

Test	Build step	Proves
1 / 1b / 1c / 1d	1	PuLID start frame at 480×832; candidate gate; fail-loud; `--character` library resolution + negative prompt
2	2	schema + validator + library; work plan; gating
3	3	per-character voice; pad-to invariant; fail-loud
4	4	timeline math + scene WAV + ping-pong + concat + mix (no PC)
5	5	rounded-corner + alpha-fade PIP
6	6	orchestrator dry-run plan + cost; gating
7	2/6	library lora + reference correct
8 / 8b	6	Elira tool wiring; model sees the tools
9	—	PC pre-flight + unattended Relay mount
10	all	the real cross-machine end-to-end + resume/cache + steady eyes
11	delivery	Real-ESRGAN upscale + platform formats (YouTube/X/App Store) + cache

Screenplay Compiler — Mac Verification

How to run this document — read this first

0. One-time setup

Test 1 — generate_image start-frame mode (PuLID, 9:16)

Test 1b — candidate set for the review gate (optional)

Test 1c — fail-loud (bad inputs)

Test 1d — library resolution (--character) + negative prompt (new this session)

Test 1e — explore_outfits — survey DIFFERENT outfits (new this session)

Test 1f — new-character pipeline: build_character_set + register_character (new this session)

Test 2 — schema + validate_screenplay + library

Test 3 — generate_voiceover --voice

Test 4 — assemble_timeline deterministic ops (no PC needed)

Test 5 — compose_video --pip-round + --pip-fade

Test 5b — fill_clip — extend a short clip to a target length (new this session)

Test 5c — compose_video general multi-overlay + multi-audio (new this session)

Test 6 — render_screenplay --dry-run (plan + cost, no GPU)