Audio Generator - PR0TA Docs

Workspace Map

Top Header: editor title, panel toggle, generation status, and close button.
Left Sidebar (collapsible + resizable): modality panels, metadata panel, and performance text panel.
Main Workspace: audio selection/clear controls plus full player and trim controls.
Bottom Generations Panel (collapsible + resizable): previously generated outputs for quick A/B comparison and selection.

Panel-by-Panel Reference

Dialogue Panel (Text-to-Speech)

Core dialogue generation panel with character selector, prompt text, model picker, dynamic model options, and output format. Character-specific voice settings can be loaded from and saved to cast entries.

SFX Panel

Focused sound-effect generation with prompt, model picker, dynamic settings, duration slider, and output format. Best for effects/foley layers.

Voice Design Panel

Builds reusable voice profiles using character name, voice characteristics, and sample line fields. Includes model picker and advanced options for voice identity shaping.

Voice-to-Voice Panel

Transforms source audio into a target voice. Supports file upload, drag-and-drop, asset browser selection, source preview player, target voice picker, and model-specific options.

Audio Metadata Panel

Sidebar panel showing metadata for the selected audio and its timeline points: duration, current time, and in/out points.

Performance Text Panel

Displays performance text context tied to the current asset so generation choices stay aligned to script intent.

Workspace Header

Controls at the top of the main workspace for loading project audio into the workspace and clearing the currently loaded audio.

Audio Player Panel

Playback and edit controls: play/pause, seek, volume, speed, reverse, in/out points, apply trim, and reset trim.

Previous Generations Panel

Bottom history strip of generated outputs with quick selection. Collapsible and vertically resizable for compare-heavy review passes.

Panel Persistence Behavior

Each modality keeps its own configuration snapshot, so switching panels preserves previous settings instead of resetting them.
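
The per-modality snapshot behavior can be pictured with a small sketch. This is only an illustration of the idea, not PR0TA's implementation: the `PanelSettingsStore` class, the modality keys, and the setting names (`prompt`, `output_format`, `duration_seconds`) are all hypothetical.

```python
from copy import deepcopy

# Hypothetical modality keys, based on the four panels described in this page.
MODALITIES = ("dialogue", "sfx", "voice_design", "voice_to_voice")


class PanelSettingsStore:
    """Keeps one independent settings snapshot per modality (illustrative only)."""

    def __init__(self):
        self._snapshots = {name: {} for name in MODALITIES}
        self.active = MODALITIES[0]

    def update(self, **settings):
        """Record changes made in the currently active panel."""
        self._snapshots[self.active].update(settings)

    def switch(self, modality):
        """Activate another panel and return a copy of its last-known settings."""
        if modality not in self._snapshots:
            raise ValueError(f"unknown modality: {modality}")
        self.active = modality
        return deepcopy(self._snapshots[modality])


store = PanelSettingsStore()
store.update(prompt="Welcome aboard.", output_format="wav")
store.switch("sfx")
store.update(prompt="door creak", duration_seconds=3)
restored = store.switch("dialogue")  # dialogue settings are still intact
```

Because each modality writes only to its own snapshot, editing the SFX prompt never overwrites the Dialogue prompt, which matches the behavior described above.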

Main Workspace Capabilities

Load existing project audio into the workspace from the workspace header.
Control play/pause, seek, volume, playback speed, and reverse playback.
Set in/out points, then apply or reset the trim.
Review metadata and attached performance text context during iteration.
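
As a rough sketch of the arithmetic behind the in/out and trim controls listed above, the example below models a clip with in/out points in seconds. The `TrimState` class and `playback_seconds` helper are hypothetical names used for illustration, and the validation rule (in point before out point, both inside the clip) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrimState:
    """In/out points in seconds for one clip (hypothetical model)."""
    duration: float
    in_point: float = 0.0
    out_point: Optional[float] = None  # None means "end of clip"

    def set_points(self, in_point, out_point):
        # Assumed rule: 0 <= in < out <= duration.
        if not 0.0 <= in_point < out_point <= self.duration:
            raise ValueError("need 0 <= in < out <= duration")
        self.in_point, self.out_point = in_point, out_point

    def apply_trim(self):
        """Return the (start, end) range that would be kept."""
        end = self.duration if self.out_point is None else self.out_point
        return (self.in_point, end)

    def reset(self):
        """Restore the full clip range."""
        self.in_point, self.out_point = 0.0, None


def playback_seconds(start, end, speed=1.0):
    """Wall-clock time needed to play a range at a given speed multiplier."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return (end - start) / speed


trim = TrimState(duration=12.0)
trim.set_points(2.0, 8.0)
kept = trim.apply_trim()                               # (2.0, 8.0)
kept_length = kept[1] - kept[0]                        # 6.0
at_double_speed = playback_seconds(*kept, speed=2.0)   # 3.0
trim.reset()
full = trim.apply_trim()                               # (0.0, 12.0)
```

For a 12-second clip trimmed to the 2 to 8 second range, the kept portion is 6 seconds long and takes 3 seconds to audition at double speed.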

Step-by-Step Workflow

Open Audio Generator in the target project.
Choose a modality panel (Dialogue, SFX, Voice Design, or Voice-to-Voice).
Set the model, prompt, and options in that panel.
Generate, then monitor status in the header and task indicators.
Preview in the central player and trim if needed.
Compare alternates from the bottom previous-generations panel.
Select the approved take and continue to timeline/assembly workflows.
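
The generate, compare, and approve steps above can be thought of as a small state model. The sketch below is one possible way to picture it. The `Take` and `Session` classes, the `pending`/`done` status values, and the newest-first ordering are assumptions made for illustration, not documented PR0TA behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Take:
    take_id: str
    modality: str
    status: str = "pending"  # hypothetical states: pending -> done


@dataclass
class Session:
    takes: List[Take] = field(default_factory=list)
    approved: Optional[Take] = None

    def generate(self, take_id, modality):
        """Register a new take; it starts out as pending."""
        take = Take(take_id, modality)
        self.takes.append(take)
        return take

    def finish(self, take):
        """Mark a take as finished so it can be previewed and compared."""
        take.status = "done"

    def alternates(self):
        """Finished takes available for comparison, newest first (assumed order)."""
        return [t for t in reversed(self.takes) if t.status == "done"]

    def approve(self, take):
        if take.status != "done":
            raise ValueError("only finished takes can be approved")
        self.approved = take


session = Session()
first = session.generate("take-1", "dialogue")
second = session.generate("take-2", "dialogue")
session.finish(first)
session.finish(second)
session.approve(session.alternates()[0])  # approve the newest finished take
```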

Usage Tips

Name outputs by scene + intent + version so downstream review is faster.
Keep each modality focused on one task type to avoid prompt confusion.
Use the bottom generations panel as your source-of-truth shortlist before exporting takes or placing them in edits.
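
One way to apply the scene + intent + version naming tip consistently is a small helper like the one below. The exact pattern (underscore separators, a `v` prefix with two-digit versions, a `wav` default extension) is a hypothetical convention chosen for this example. Adapt it to whatever your project already uses.

```python
import re


def take_name(scene, intent, version, extension="wav"):
    """Build a name like 'sc012_angry-whisper_v03.wav' (hypothetical convention)."""

    def slug(text):
        # Lowercase the text and collapse non-alphanumeric runs into single hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    if version < 1:
        raise ValueError("version must be 1 or greater")
    return f"{slug(scene)}_{slug(intent)}_v{version:02d}.{extension}"


name = take_name("SC012", "Angry whisper", 3)
```

Zero-padding the version keeps takes in order when a file list is sorted alphabetically.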

Controls and Functions

Control	Function	Output or use
Dialogue panel	Generates text-to-speech from character, prompt, voice, model, and format settings.	Dialogue audio assets and takes.
SFX panel	Generates sound effects or foley from prompt, duration, model, and output settings.	SFX assets ready for Timeline or Mixer.
Voice design panel	Creates reusable voice profiles from descriptive characteristics and sample lines.	Voice options for character and dialogue workflows.
Voice-to-voice panel	Transforms uploaded or selected source audio into a target voice.	Alternate voice takes.
Audio source controls	Select, upload, drag-and-drop, preview, and clear source audio.	The active source drives trim and transform actions.
Player transport	Play, pause, seek, scrub, reverse, and set volume and speed.	Use to evaluate generated takes precisely.
Trim controls	Set in/out points, apply trim, or reset trim.	Creates cleaner clips for timeline use.
Metadata/performance panels	Show duration, current time, selected asset data, and script/performance text.	Use for alignment with character or scene intent.
Previous generations panel	Displays prior audio takes for comparison and selection.	Select the approved take before handoff.