Screenplay Compiler — Verify by Talking to Elira

Verbatim verification procedure (talking to Elira) · 2026 · ITDT LLC · sanitized placeholders

Audience: the owner. Goal: Verify the same Screenplay Compiler functionality as Screenplay_Compiler_Mac.md, but entirely through text prompts to the local Elira AI — you don't run any video-generation commands in the terminal. The only terminal command is starting Elira.

How to read this: for each test, say the prompt to Elira (type it into her chat), then check the Expect block. Elira should call the tool and report a concrete result (a saved path, a job id, a plan), not just describe what she would do. If she only describes it, nudge: "Please actually run it." If a test fails, note the test number and paste her reply.

These map 1:1 to the build steps. The fast tests (1–7) need no PC. Test 8 is the full cross-machine render (~50 min, PC must be on).

Start Elira (the only terminal step)

cd /Users/<user>/dev/ITDT/Videos
python3 Elira/elira_text.py

You'll see the startup line "Elira ready (...). Type your message; 'stop' to end." Type your prompts at the cursor, and type stop when finished.

If Elira ever says she can't reach the PC, that only affects Test 8 (the others are Mac-only).

Ground rules — keep ALL files inside the temp sandbox (say this FIRST)

Before any test, paste this to Elira so she doesn't scatter outputs or scratch files into /tmp, /private/..., the Desktop, or anywhere else at the root level:

Elira, for this whole session keep **every** file you create — final outputs **and any intermediate, scratch, cache, or temp files** — inside the directory `/Users/<user>/dev/ITDT/Videos/verification/temp` (the tests use the `sc_chat` subfolder under it). Do **not** write to `/tmp`, `/private`, `~/Desktop`, your home directory, or any system temp location. If a tool defaults to a temp path, override it to a path under `verification/temp`. Each test below gives you an exact path to save to — use it. Before you write a file, confirm the path starts with `/Users/<user>/dev/ITDT/Videos/verification/temp`. You do not need to create any folders now — just save to the paths I give you (folders are created automatically).

Expect: She acknowledges in one short message (she should not need to create any directories yet, and should not start asking which folder to make). If you ever see her report a path outside verification/temp in a later test, stop and remind her: "That path is outside the sandbox — put it under verification/temp."

Every explicit path in the tests below already lives under verification/temp/. If Elira proposes a different location for any working file, correct her before letting her proceed.
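The sandbox rule is easy to check mechanically whenever Elira reports a path. A minimal sketch, assuming plain POSIX path strings; `inside_sandbox` is a hypothetical helper written for this document, not one of Elira's tools, and it does not follow symlinks:

```python
import posixpath

# Sandbox root from the ground rules (with the doc's <user> placeholder).
SANDBOX = "/Users/<user>/dev/ITDT/Videos/verification/temp"


def inside_sandbox(path: str, sandbox: str = SANDBOX) -> bool:
    """Return True if `path` is the sandbox directory or lies beneath it."""
    if not posixpath.isabs(path):
        return False  # relative paths are ambiguous, so reject them
    # normpath collapses "." and ".." so "temp/../x" cannot slip through.
    norm = posixpath.normpath(path)
    root = posixpath.normpath(sandbox)
    # commonpath compares whole components, so "temp_other" does not match "temp".
    return posixpath.commonpath([norm, root]) == root


print(inside_sandbox(SANDBOX + "/sc_chat/startframe.png"))  # True
print(inside_sandbox("/tmp/appstore_check.png"))            # False
```

Comparing whole path components, rather than using a plain string prefix test, is what keeps a sibling folder such as `verification/temp_other` from passing.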

Test 1 — Start frame (identity-anchored image)

Say:

Elira, generate a start frame of yourself on a neon-lit city rooftop at dusk. Use your default outfit reference so it's really you, and save it to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/startframe.png.

Expect: She calls generate_image with character: "elira" (this is the fix from this session: she now resolves her own reference + look from the character library, so "of yourself" really is her), then reports a saved path and that it's a 480x832 (portrait, roughly 9:16) image. Open it:

Say:

Open that image for me.

(She runs open, or just open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/startframe.png yourself.) It should look like Elira — jet-black hair, magenta blazer, modest/high-neck, facing camera. (Previously the chat tool couldn't look up her reference and produced a generic woman; that's fixed.)
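If you would rather confirm the 480x832 size yourself than take the chat reply on trust, the dimensions can be read straight from the PNG header. A minimal sketch using only the standard library; `png_size` is a hypothetical helper, and it assumes the file is a standard PNG whose first chunk is IHDR:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes) -> tuple[int, int]:
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" tag,
    # then width and height as big-endian unsigned 32-bit integers.
    return struct.unpack(">II", header[16:24])


def png_size_of_file(path: str) -> tuple[int, int]:
    """Read just enough of the file at `path` to report its dimensions."""
    with open(path, "rb") as f:
        return png_size(f.read(24))
```

For the start frame from Test 1, `png_size_of_file(".../sc_chat/startframe.png")` would be expected to return `(480, 832)`.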

Also try the review-gate variant:

Make me three start-frame candidates of yourself in a quiet modern office at night, so I can pick the cleanest one. Put them in /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/candidates/.

Expect: generate_image with candidates = 3; she reports a folder with 3 images, all under verification/temp/sc_chat/candidates/.

Negative-prompt variant (new this session):

Make another start frame of yourself on that rooftop, but this time steer away from any low-cut top or open blazer — keep it modest. Save it to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/startframe_neg.png.

Expect: she calls generate_image with character: "elira" and a negative (e.g. "low-cut top, open blazer, exposed cleavage"); the saved image keeps the high-neck/modest look. (A real negative is more reliable than wording it in the positive.)

Different-outfits variant (new this session):

Show me three different outfits for yourself — a charcoal grey business pantsuit, a navy blazer over a grey turtleneck, and a matte-black sci-fi utility jacket — one image each so I can pick. Put them in /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/outfits/.

Expect: she calls explore_outfits (NOT generate_image with candidates) with one entry per outfit, and reports a folder with three images — the same Elira face in three visibly different outfits. (Candidates would have given three takes of the same outfit; explore_outfits is the right tool when the outfits themselves differ.)

Test 2 — Validate a screenplay + show the plan

First have Elira create a small screenplay file (so you don't write JSON):

Say:

Write a one-scene screenplay to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/mini.json: I'm on a neon-lit city rooftop at dusk in my default outfit, I say "This is the way", wait three seconds, then say "Follow me", and hold two seconds at the end. Then validate it and tell me the plan.

Expect: She write_files the JSON to mini.json (NOT opening it in TextEdit) using the correct format: top-level scenes, each with character/outfit/setting/presentation/timeline, where the timeline is a list of single-key {"say": …} / {"wait": …} entries. She then validates it (via render_screenplay dry-run or validate_screenplay), reporting it's valid, 1 scene, and 1 new base clip (the rooftop setting). If she opens a TextEdit window or invents keys like actions/dialogue, the Modelfile grounding didn't take, so tell me.

Now test that bad input is caught:

Now make a broken version at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/bad.json where the scene is picture-in-picture but has no background, and validate that one.

Expect: She reports it's invalid, with the reason that a pip scene requires a background. (She should NOT try to render it.)
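To make the two outcomes of Test 2 concrete, here is a rough sketch of the file shape and the checks described above. Only the top-level `scenes` list, the five scene keys, the single-key `say` / `wait` entries, and the rule that a pip scene needs a background come from this document. Every value, the `"full"` presentation name, the `background` key name, and modelling the final hold as a `wait` are assumptions, and `check_screenplay` is a hypothetical stand-in, not the real validator:

```python
# Illustrative screenplay; values and the "full" presentation name are assumed.
MINI = {
    "scenes": [
        {
            "character": "elira",
            "outfit": "default",
            "setting": "neon-lit city rooftop at dusk",
            "presentation": "full",
            "timeline": [
                {"say": "This is the way"},
                {"wait": 3},
                {"say": "Follow me"},
                {"wait": 2},  # the two-second hold, modelled here as a wait
            ],
        }
    ]
}

REQUIRED = ("character", "outfit", "setting", "presentation", "timeline")


def check_screenplay(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the structure looks right."""
    scenes = doc.get("scenes")
    if not isinstance(scenes, list) or not scenes:
        return ["top-level 'scenes' must be a non-empty list"]
    problems = []
    for i, scene in enumerate(scenes, start=1):
        for key in REQUIRED:
            if key not in scene:
                problems.append(f"scene {i}: missing '{key}'")
        for step in scene.get("timeline", []):
            ok = isinstance(step, dict) and len(step) == 1 and next(iter(step)) in ("say", "wait")
            if not ok:
                problems.append(f"scene {i}: bad timeline step {step!r}")
        # "background" is an assumed key name for the pip rule.
        if scene.get("presentation") == "pip" and not scene.get("background"):
            problems.append(f"scene {i}: a pip scene requires a background")
    return problems


BAD = {"scenes": [dict(MINI["scenes"][0], presentation="pip")]}
print(check_screenplay(MINI))  # []
print(check_screenplay(BAD))   # ['scene 1: a pip scene requires a background']
```

A scene built from invented keys such as `actions` or `dialogue` would fail the required-key loop, which is the same symptom the Expect block asks you to report.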

Test 3 — Voiceover in Elira's voice

Say:

Record a voiceover of you saying "Your finances, decade by decade." and save it to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/vo.wav.

Expect: generate_voiceover runs; she reports the saved WAV and ~3 second duration. Listen:

Play that back for me.

(Or afplay /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/vo.wav yourself.) It should be Elira's voice.

Exact-length variant (matters for lip-sync):

Make another one of "This is the way." but pad it to exactly five seconds, save to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/vo5.wav.

Expect: she reports a 5.0-second clip.
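The reported durations can be double-checked from the files themselves. A minimal sketch using the standard `wave` module, assuming the voiceover is saved as uncompressed PCM WAV (which is all that module reads); `wav_seconds` and `silent_wav` are hypothetical helpers, and the second exists only so the sketch can test itself without a real recording:

```python
import io
import wave


def wav_seconds(source) -> float:
    """Duration of a PCM WAV in seconds; `source` is a path or a binary file object."""
    with wave.open(source, "rb") as w:
        return w.getnframes() / w.getframerate()


def silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build an in-memory mono 16-bit silent WAV of the given length."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


print(wav_seconds(io.BytesIO(silent_wav(5.0))))  # 5.0
```

Pointed at the padded clip, `wav_seconds(".../sc_chat/vo5.wav")` should come back as 5.0 if the padding worked.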

Test 4 — Compose a video with a rounded, fading overlay

This checks the compositor (--pip-round / --pip-fade) through Elira. Have her make the two inputs first, then composite:

Say:

Make me two quick test clips with a shell command, as standard H.264 yuv420p MP4 (use -pix_fmt yuv420p and -y to overwrite): a 1280x720 four-second test pattern at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/bg.mp4 and a 480x832 four-second test pattern at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/pip.mp4.

(She'll use run_shell with ffmpeg. If you'd rather, this is the one spot you can paste the two ffmpeg lines from Screenplay_Compiler_Mac.md Test 5 — but Elira can do it.)

Then say:

Now compose a video: background bg.mp4, with pip.mp4 in the top-right corner sized about 360 pixels wide (so the whole inset fits inside the frame), from half a second to three and a half seconds, with rounded corners and a fade, four seconds total. Save to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/composite.mp4 and show me a frame from the middle so I can see the rounded corners.

Expect: compose_video runs with rounded/fade options; she reports the saved MP4. The mid-frame shows the inset with visibly rounded corners.
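The "about 360 pixels wide" in the prompt is there because the 480x832 clip is taller than the 720-pixel background at its native size. The arithmetic can be sketched as follows, assuming the compositor keeps the inset's aspect ratio; the 24-pixel margin and both function names are assumptions made for illustration:

```python
def scaled_height(src_w: int, src_h: int, target_w: int) -> int:
    """Height after scaling to `target_w` with the aspect ratio kept, rounded up to even."""
    h = round(src_h * target_w / src_w)
    return h + (h % 2)  # yuv420p video needs even dimensions


def top_right(frame_w: int, inset_w: int, margin: int) -> tuple[int, int]:
    """Top-left (x, y) of an inset placed in the top-right corner with a margin."""
    return frame_w - inset_w - margin, margin


FRAME_W, FRAME_H = 1280, 720   # bg.mp4
PIP_W, PIP_H = 480, 832        # pip.mp4
MARGIN = 24                    # assumed margin, not taken from the tool

inset_h = scaled_height(PIP_W, PIP_H, 360)
x, y = top_right(FRAME_W, 360, MARGIN)
print(inset_h, (x, y), y + inset_h <= FRAME_H)  # 624 (896, 24) True
```

At 360 pixels wide the inset is 624 pixels tall, which leaves room inside the 720-pixel frame; at its native 832 pixels it would be cropped.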

Test 5 — Dry-run the whole screenplay (plan + cost, no rendering)

Say:

Do a dry run of the screenplay at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/mini.json — I want to see the plan and the time estimate before we render anything.

Expect: render_screenplay with dry_run; she reports 1 scene, 1 new base clip, and an estimate of about 31 minutes on the PC plus ~12 minutes on the Mac (~43 total) — and that nothing was rendered yet. This is the cost gate; she should offer to start the full render only on your say-so.

Test 6 — Elira knows the new capabilities

Say:

What can you do with a finished screenplay file, and how do I stop a render that's already going?

Expect: She describes, in her own words, that she can render a whole screenplay end-to-end (start frames on the Mac, base clips on the PC, then voice/lip-sync/cut), that she'll dry-run first for the cost, and that she can stop one clip (stop_pc_job) or halt the whole run (stop_generation). This confirms the model actually has the new tools in view.

Also check she knows about mounting the share (new this session):

If the shared drive to the PC isn't mounted, can you mount it yourself without me logging in?

Expect: she says yes — she can mount it with mount_relay (it uses saved credentials, no login needed), and that render_screenplay mounts it automatically when a render needs the PC.

Test 7 — PC reachability + share mount (pre-flight for Test 8)

Say:

Is the PC render machine reachable right now, is the base-clip renderer available, and please make sure the shared drive is mounted.

Expect: She checks the PC (script catalog / status) and reports it's reachable with render_base_clip available, and she calls mount_relay and confirms the share is mounted at /Volumes/Relay (already-mounted is fine). If she says the PC is off or unreachable, power it on / start the Supervisor before Test 8.

Test 8 — FULL END-TO-END (the real one) ⏳ ~50 min, PC must be on

This is the integration test, driven entirely by talking to Elira.

Say:

Render the screenplay at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/mini.json into a finished video at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/master.mp4. Go ahead and start the full render — tell me the run id, and I'll check back.

Expect immediately (this is the key fix): She confirms she's starting the full render and gives you a job id right away — she does not freeze waiting for the render. The chat stays responsive (the render runs in the background). She should NOT claim it's finished yet. (If the chat hangs and won't take your next message, you're on the old build — stop / restart python3 Elira/elira_text.py so the render runs as a background job.)

While it runs (~31 min PC render, then ~12 min Mac lip-sync, matching the Test 5 estimate), check in every so often:

How's the render coming along?

Expect: She checks the job and reports whether it's still running or complete (the background job reports running/done rather than fine-grained stages). She only says it's done once the job actually reports complete with the saved master path. The chat keeps responding between checks.

When she says it's done — verify the result:

Open the finished video.

PASS: the master plays — Elira on the rooftop, lips moving on "This is the way" and "Follow me", mouth at rest during the 3-second gap, her voice, ~8–9 seconds, no visible seam where the background loops, and her eyes are steady (no flickering "twinkle") — the eye-restore step handles that.
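The ~8–9 second figure follows from the timeline: two short spoken lines, the 3-second wait, and the 2-second hold. A rough sketch of that sum, assuming each spoken line takes about 1.5 to 2 seconds (an estimate, not a measured value) and modelling the final hold as a `wait` entry; `timeline_seconds` is a hypothetical helper:

```python
def timeline_seconds(timeline: list[dict], say_seconds: float) -> float:
    """Sum a timeline, charging every spoken line the same assumed duration."""
    total = 0.0
    for step in timeline:
        if "wait" in step:
            total += step["wait"]
        elif "say" in step:
            total += say_seconds
    return total


TIMELINE = [
    {"say": "This is the way"},
    {"wait": 3},
    {"say": "Follow me"},
    {"wait": 2},  # the end hold, modelled as a wait for this estimate
]

print(timeline_seconds(TIMELINE, 1.5), timeline_seconds(TIMELINE, 2.0))  # 8.0 9.0
```

A master that is much shorter or longer than this range suggests a dropped line or a mis-timed wait, which is worth noting alongside the test number.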

Resume / cache check (proves it won't redo the slow render):

Render that same screenplay again to /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/master2.mp4.

PASS: this run is much faster (a couple of minutes) because the rooftop base clip is already cached — she shouldn't trigger another 31-minute PC render. The second master should match the first.

Optional — stop controls (only if you want to test cancelling): start the render, then while the PC clip is going, say:

Actually, stop the whole generation.

Expect: she halts the run (stop_generation) and confirms it's cancelled.

Test 9 — Export a finished video for a platform (delivery) ⏳ first run ~4 min

This checks prepare_delivery through Elira (upscale + format). It needs a finished master: use the one from Test 8, saved at verification/temp/sc_chat/master.mp4.

Say:

Take the finished video at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/master.mp4 and make a version ready to post on YouTube.

Expect immediately: she calls prepare_delivery (platform youtube), tells you it's a background job with a job id, and that the upscale takes a few minutes. She should NOT claim it's done yet.

Check in:

Is that YouTube version ready yet?

Expect: she checks the job and, once done, reports a saved 1080×1920 file. Open it — Elira is sharper than the original with no eye shimmer.

Now a second platform (proves the cache makes it quick):

Now also make an X version of that same video.

Expect: much faster (the upscale is cached); she returns an X-formatted file.

App Store (she should know it's a composite):

What about an App Store version of it?

Expect: she explains an App Store preview is a composite (the talking head goes in a corner over the app screen), and either produces a PIP-ready element or asks for the app screen capture to composite it onto — rather than just exporting a full-frame clip.

Then have her actually composite over a real app capture:

Composite the talking head at /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/master.mp4 over the app capture at /Users/<user>/dev/ITDT/Videos/Episodes/AppStore_v1/capture_trimmed_27s.mp4.

Expect: prepare_delivery (platform appstore, with a background) runs as a background job; she gives a job id, you ask her to check it, and it completes with Generated/Delivery/master_appstore_composite.mp4 (mode composite, e.g. 1320×2868).

⚠️ Don't trust the "done" message — verify the file yourself. Two real defects have shipped here before that the chat reply happily masked: (1) the composite came out silent (Elira's mouth moves, no sound — the app-capture background has no audio and the compositor dropped the talking head's voice), and (2) a near-identically-named intermediate …_composite_pip.mp4 (just the talking head, no app behind it) sat next to the real composite and got opened by mistake. So run these three checks on the actual output:

F=/Users/<user>/dev/ITDT/Videos/Generated/Delivery/master_appstore_composite.mp4
# (a) AUDIO PRESENT + not silent — the key regression guard. Expect an aac stream and a
#     real max_volume (around -8 dB), NOT "mean/max_volume: -91 dB" and NOT an empty result.
ffmpeg -v info -i "$F" -map 0:a -af volumedetect -f null - 2>&1 | grep -iE "mean_volume|max_volume"
# (b) only the deliverable is left — no leftover _pip / _voice files to open by mistake.
ls /Users/<user>/dev/ITDT/Videos/Generated/Delivery/ | grep -i composite   # expect ONLY master_appstore_composite.mp4
# (c) eyeball a frame from the middle — Elira should be a rounded corner PIP over the app.
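ffmpeg -y -v error -ss 12 -i "$F" -frames:v 1 /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/appstore_check.png && open /Users/<user>/dev/ITDT/Videos/verification/temp/sc_chat/appstore_check.png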

PASS: (a) prints a real max_volume near −8 dB (audio is present and audible), (b) lists only master_appstore_composite.mp4, (c) the frame shows the talking head as a rounded corner inset over the app. If (a) shows no audio stream or −91 dB, the composite is silent — stop and report it.
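Check (a) can also be turned into a pass/fail verdict by parsing the volumedetect lines instead of reading them by eye. A minimal sketch: the regular expression follows the `mean_volume: ... dB` / `max_volume: ... dB` lines that the grep above selects, the sample strings are illustrative rather than captured logs, the -80 dB cut-off is an assumed threshold sitting between the two cases this document describes, and both function names are hypothetical:

```python
import re

# Matches lines such as "[Parsed_volumedetect_0 @ 0x...] max_volume: -8.0 dB".
_VOLUME = re.compile(r"(mean_volume|max_volume):\s*(-?\d+(?:\.\d+)?)\s*dB")

SILENCE_DB = -80.0  # assumed cut-off; the silent regression reads about -91 dB


def parse_volumedetect(text: str) -> dict[str, float]:
    """Extract the mean/max levels (in dB) from ffmpeg's volumedetect output."""
    return {name: float(value) for name, value in _VOLUME.findall(text)}


def audio_verdict(text: str) -> str:
    """Classify the output as 'ok', 'silent', or 'no audio stream'."""
    levels = parse_volumedetect(text)
    if "max_volume" not in levels:
        return "no audio stream"   # empty result: nothing was measured
    if levels["max_volume"] <= SILENCE_DB:
        return "silent"
    return "ok"


GOOD = "[Parsed_volumedetect_0 @ 0x0] mean_volume: -27.3 dB\n[Parsed_volumedetect_0 @ 0x0] max_volume: -8.1 dB\n"
SILENT = "[Parsed_volumedetect_0 @ 0x0] mean_volume: -91.0 dB\n[Parsed_volumedetect_0 @ 0x0] max_volume: -91.0 dB\n"

print(audio_verdict(GOOD), audio_verdict(SILENT), audio_verdict(""))
```

Treating an empty result as its own failure matters because a file with no audio stream produces no volume lines at all, which a simple threshold test would miss.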

Episode Compiler — verify by talking to Elira

These check the newer episode tools registered with Elira (render_episode, validate_episode, and the render controls pause_render / resume_render / render_pause_status) — the full multi-shot pipeline, not just a single short scene. They use the real E01 episode so you don't have to author one; E1–E3 and E5 use no GPU. (If she reaches for render_screenplay here instead of render_episode, the Modelfile prose wasn't rebuilt — see the command doc's Test E-tools.)

Test E1 — Elira validates an episode

Say:

Validate the episode at /Users/<user>/dev/ITDT/Videos/Episodes/E01/E01.episode.json — is it ready to render?

Expect: she calls validate_episode and reports it is valid, 13 shots, content check passed, and lists the app captures not recorded yet as warnings. She does NOT render anything or claim a video was made.

Test E2 — Elira dry-runs the episode render (plan + cost, no GPU)

Say:

Do a dry run of rendering that episode — I want the plan and time estimate before any GPU.

Expect: she calls render_episode with dry_run and reports about 6 host shots, ~8 PC segments, ~248 min on the PC + ~72 min on the Mac (~320 total) — and offers to start the real render only on your say-so. She should treat the long host shots as buildable (chained), not impossible.

Test E3 — Elira knows screenplay vs episode

Say:

What's the difference between rendering a screenplay and rendering an episode?

Expect: she distinguishes render_screenplay (one short scene) from render_episode (a full multi-shot episode — host shots, app-capture narration, title cards, a hologram montage, overlays, music, transitions), and says she'd dry-run first either way. Confirms the rebuilt model advertises both workflows.

Test E4 — full episode render (optional) ⏳ hours, PC must be on

Say:

Go ahead and render that episode for real.

Expect: she starts render_episode in the background and gives you a job id right away (she must NOT freeze the chat for hours). Ask "how's the episode render going?" — she uses check_mac_job; she must NOT claim it's finished until the job reports complete. (This one is really for when E01 is asset-complete — without the recorded app captures the narration shots will error.)

Test E5 — Elira pauses and resumes a render (no GPU needed)

Checks the pause/resume controls. The control works whether or not a render is actually running, so you can run this without spending GPU.

Say:

Is the video render paused right now?

Expect: she calls render_pause_status and reports it is not paused (and that no render orchestrator is running, if none is).

Say:

Pause the render.

Expect: she calls pause_render and explains it'll finish the current segment then hold before the next one (and that finished segments aren't redone). If nothing is running she says so honestly rather than pretending she stopped a render.

Say:

Okay, resume it.

Expect: she calls resume_render and says it continues from the next segment, reusing finished segments. (Afterward, Tools/scripts/.render_paused should be gone — she should not be left "paused.") She should treat pause as different from stop the whole generation (stop_generation).
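The behaviour described in this test (finish the current segment, hold before the next, reuse finished segments, remove the flag on resume) is what a flag-file pause looks like. The sketch below illustrates that idea only: the `.render_paused` file name comes from this document, while `pause`, `resume`, `run_segments`, and the choice to return instead of waiting are assumptions, not the orchestrator's actual code:

```python
import tempfile
from pathlib import Path


def pause(flag: Path) -> None:
    """Request a pause by creating the flag file."""
    flag.touch()


def resume(flag: Path) -> None:
    """Clear the pause request; harmless if nothing was paused."""
    flag.unlink(missing_ok=True)


def run_segments(segments, finished: set, flag: Path, render) -> str:
    """Render unfinished segments in order, checking the flag between segments."""
    for seg in segments:
        if seg in finished:
            continue              # finished segments are reused, not redone
        if flag.exists():
            return "paused"       # hold before starting the next segment
        render(seg)               # a segment in progress always runs to the end
        finished.add(seg)
    return "complete"


with tempfile.TemporaryDirectory() as tmp:
    flag = Path(tmp) / ".render_paused"
    finished, rendered = set(), []

    def render(seg):
        rendered.append(seg)
        if seg == "s1":
            pause(flag)           # a pause request arrives during segment 1

    first = run_segments(["s1", "s2", "s3"], finished, flag, render)
    resume(flag)
    second = run_segments(["s1", "s2", "s3"], finished, flag, render)
    flag_left = flag.exists()

print(first, second, rendered, flag_left)
# paused complete ['s1', 's2', 's3'] False
```

Because pausing only sets a flag, it works whether or not a render is running, which matches the note that this test needs no GPU. Stopping the whole generation is a different operation that cancels the run.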

When you're done

Type stop to end the Elira chat. Tell me which tests passed; for any that failed, paste Elira's reply for that test.

Everything this run produced lives under /Users/<user>/dev/ITDT/Videos/verification/temp/. To reset for a clean run, wipe just that folder:

rm -rf /Users/<user>/dev/ITDT/Videos/verification/temp/*

If you find any session artifacts outside that directory (in /tmp, /private, the Desktop, etc.), that's a ground-rules miss — note it and tell me which test produced it.

Map to the command-based doc

Every test here exercises the same thing as Screenplay_Compiler_Mac.md, just through Elira. This doc has Tests 1–9. The middle column below is the equivalent test number in the OTHER file (Screenplay_Compiler_Mac.md), not a test in this doc — e.g. this doc's Test 8 is the same thing as the Mac doc's Test 10.

Test in THIS doc (Elira chat)	Same as in `Screenplay_Compiler_Mac.md`	Build step
Test 1	Tests 1 / 1b / 1d / 1e	start-frame mode (incl. `--character`, negative, `explore_outfits`)
Test 2	Test 2	schema + validator
Test 3	Test 3	voiceover --voice
Test 4	Test 5	compose --pip-round/--pip-fade
Test 5	Test 6	render_screenplay dry-run
Test 6	Test 8	Elira tool wiring
Test 7	Test 9	PC pre-flight + Relay mount
Test 8	Test 10	full cross-machine end-to-end + cache + steady eyes
Test 9	Test 11	delivery formats (Real-ESRGAN upscale → YouTube/X/App Store)

(The only build-step piece with no natural single-prompt here is assemble_timeline's internal editing math — it isn't a standalone Elira tool; it runs inside Test 8's render. The command-doc Test 4 covers it directly if you want the isolated check.)