Study Design Guide
How to compose effective study scripts for UserTold.ai. Covers mode selection, segment sequencing, field-by-field guidance, anti-patterns, annotated examples, and the full SegmentV2 field reference.
Quick Schema Reference
Before writing a script, make sure you have the required fields for each segment mode.
| Field | Required | Type | Notes |
|---|---|---|---|
| version | yes | 2 | Must be exactly 2 |
| goals | yes | array | [{ id, description }] objects |
| segments | yes | array | Segment objects |
| segments[].id | yes | string | Unique within script |
| segments[].mode | yes | string | talk \| speak \| observe |
| segments[].title | yes | string | Display label |
| segments[].speak_text | yes for speak | string | Spoken text delivered by AI |
| segments[].talk | recommended for talk | object | { system_prompt?, goals? } |
| segments[].instruction | required for observe | string | Task instruction shown to participant |
| segments[].conductor_context | required for observe | string | AI-only context for interpretation and later debrief |
Missing speak_text on a speak segment or missing both instruction and conductor_context on an observe segment will fail validation.
Production scripts only support deterministic advancement: max_duration_s, user Done / step_done, url:<substring>, action:<selector-or-pattern>, complete_segment for talk segments, and scripted speak completion.
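As a rough sketch, a script that covers the required fields for each mode might look like the following. The IDs, titles, goal, and text are placeholders invented for illustration; only the field names come from the table above. max_duration_s is optional and is included here as a time limit for the observe segment.

```json
{
  "version": 2,
  "goals": [
    { "id": "g_example", "description": "Identify where participants hesitate during the task" }
  ],
  "segments": [
    {
      "id": "seg_intro",
      "mode": "speak",
      "title": "Intro",
      "speak_text": "Thanks for joining. I'll give you one task to complete."
    },
    {
      "id": "seg_task",
      "mode": "observe",
      "title": "Task",
      "instruction": "Complete the task from start to finish.",
      "conductor_context": "Describe the expected flow and known friction points here.",
      "max_duration_s": 300
    },
    {
      "id": "seg_debrief",
      "mode": "talk",
      "title": "Debrief",
      "talk": { "goals": ["g_example"] }
    }
  ]
}
```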
Modes
Every segment runs in one of three modes. Picking the right mode is the single most important decision per segment.
talk — Conversational Interview
The AI conducts a natural voice conversation: asks questions, listens, follows up.
Use when you need:
- Open-ended discovery (triggers, motivations, decision criteria)
- Follow-up probing on specific answers
- Rapport building at interview start/end
- Debrief after observation
Key fields: talk.system_prompt, talk.goals
The system_prompt shapes the interviewer's personality, question style, and focus. Goals tell the AI which research objectives this segment should pursue.
speak — Scripted Transition
The AI delivers a scripted one-way transition message, then advances when the spoken line completes. No back-and-forth.
Use when you need:
- Task instructions before an observe segment
- Welcome/intro messages
- Consent language or disclaimers
- Transitions between study phases
Key fields: speak_text, speak_submode
Keep speak_text concise. Participants tune out after ~30 seconds of monologue. Speak mode is output-only: microphone VAD is not used to interrupt the assistant because it cannot reliably distinguish participant speech from assistant playback echo. If interruption is needed later, use an explicit participant control rather than automatic voice barge-in.
observe — Silent Observation
The AI watches the participant use your product. It stays quiet and preserves what the participant says and does.
Use when you need:
- Usability testing (watch a task end-to-end)
- Workflow observation (see how they actually work)
- Any scenario where interruption would bias behavior
Key fields: instruction, conductor_context, max_duration_s, deterministic advance_when
The instruction is shown to the participant. The conductor_context is AI-only background knowledge that helps later interpretation and planned talk debriefs understand expected behavior and friction.
Segment Sequencing
Order matters. The right sequence produces richer data than any individual segment.
The Core Pattern: speak → observe → talk → speak/end
Most usability studies follow this arc:
- speak — Set up the task ("I'd like you to complete a purchase...")
- observe — Watch them do it (silent, no leading)
- talk — Debrief on what happened ("What were you thinking when you paused on the payment page?")
- speak/end — Thank the participant and close the interview
Why this order works:
- Speak gives clear instructions without a conversation that might bias behavior
- Observe captures natural behavior before you ask about it
- Talk references concrete moments the participant just experienced
- The closing speak segment ends cleanly without starting a new conversation
Talk-Only Pattern: talk → talk → talk → talk
Pure interview. Each segment narrows focus:
- Rapport & context — Establish the recent event
- Trigger & need — What happened, what they needed
- Friction & workarounds — Where things broke, what they did instead
- Wrap-up — Confirm understanding, capture the key moment
Exploration Pattern: talk → observe → talk
Start with context, then watch, then probe:
- Context — Understand their routine and tools
- Demo — Watch them do the thing
- Probe — Dig into what you observed
Principles
- Never start with observe. Participants need context first — at minimum a speak segment with instructions.
- Never end with observe. Always debrief. The richest insights come from asking "why did you do X?" after watching X happen.
- Use speak for transitions, not conversations. If you need back-and-forth, use talk.
- Limit observe segments to 5-7 minutes. Beyond that, participants lose focus and data quality drops. Use max_duration_s.
Field-by-Field Guidance
system_prompt (talk mode)
The system_prompt defines the interviewer's behavior for a talk segment. It's the most impactful field in the entire script.
Good system_prompt traits:
- Specifies question style (one question at a time, concrete, no leading)
- Names the evidence to pursue (behaviors, not opinions)
- Sets boundaries (no solution selling, no roadmap talk)
- Includes 2-3 example follow-up questions
Example:
Ask only about concrete recent behavior, not hypothetical futures.
For each friction point mentioned, ask:
- "What did you click or type next?"
- "What did you expect to happen?"
- "What happened instead?"
Do not move on until you capture behavior + consequence.
Avoid:
- Generic prompts ("Ask good questions about the user experience")
- Long preambles that dilute the core instruction
- Contradictory rules ("Be concise" + "Always ask 3 follow-ups")
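For illustration, the example prompt above could be embedded in a talk segment roughly as follows. JSON strings cannot contain raw line breaks or unescaped double quotes, so the line breaks are written as \n and the quotes as \". The segment ID, title, and the g_friction goal ID are assumptions; the goal ID must match a study-level goal in your own script.

```json
{
  "id": "seg_friction",
  "title": "Friction",
  "mode": "talk",
  "talk": {
    "system_prompt": "Ask only about concrete recent behavior, not hypothetical futures.\nFor each friction point mentioned, ask:\n- \"What did you click or type next?\"\n- \"What did you expect to happen?\"\n- \"What happened instead?\"\nDo not move on until you capture behavior + consequence.",
    "goals": ["g_friction"]
  }
}
```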
conductor_context (observe mode)
Background knowledge for interpretation and planned talk debriefs — NOT shown to the participant.
Good conductor_context traits:
- Describes expected vs. stuck behavior
- Mentions UI elements that are commonly missed
- Provides domain-specific context
Example:
The "Submit" button is below the fold on mobile. Users frequently scroll past it.
Expected flow: fill form → scroll down → tap Submit → see confirmation.
If the user scrolls up and down repeatedly, they are likely stuck.
Avoid:
- Empty string (wastes an opportunity to preserve useful interpretation context)
- Participant-facing language (this is AI-only)
instruction (observe mode)
Shown to the participant. Tells them what to do.
Good instruction traits:
- One clear task
- Concrete start and end points
- No hints about how to complete it
Example: "Complete a purchase of any item, from product page through to the confirmation screen."
Avoid:
- Multiple tasks in one instruction
- Hints: "Click the blue button to check out" (biases behavior)
- Vague: "Use the product" (no clear success criteria)
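The two observe fields fit together roughly as sketched below, reusing the example instruction and conductor_context text from this guide. The segment ID, title, and duration are placeholders; the two example texts come from separate sections and are combined here only for illustration.

```json
{
  "id": "seg_checkout",
  "title": "Complete a Purchase",
  "mode": "observe",
  "instruction": "Complete a purchase of any item, from product page through to the confirmation screen.",
  "conductor_context": "The \"Submit\" button is below the fold on mobile. Users frequently scroll past it. Expected flow: fill form → scroll down → tap Submit → see confirmation. If the user scrolls up and down repeatedly, they are likely stuck.",
  "max_duration_s": 360
}
```

Note that the instruction names the start and end points without hinting at the path, while the conductor_context carries the detail about the Submit button that the participant should not see.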
advance_when
Tells the conductor when to auto-advance to the next segment. Production scripts only accept deterministic rules.
Supported rules:
url:https://example.com/confirmation
action:#submit-order
URL rules advance when the participant navigates to a URL containing the value. Action rules advance when a matching participant action is observed. Goals guide analysis and planned debriefs, not live advancement.
Tips:
- Prefer URL-based or action-based rules when possible.
- For talk segments, prefer the built-in complete_segment tool over advance_when.
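A minimal sketch of an action-based rule on an observe segment, paired with max_duration_s as a fallback in case the action never occurs. The selector is the one from the supported-rules example; the segment ID, title, instruction, and context text are invented for illustration.

```json
{
  "id": "seg_order",
  "title": "Place an Order",
  "mode": "observe",
  "instruction": "Place an order for any item.",
  "conductor_context": "The task is complete once the participant submits the order form.",
  "advance_when": "action:#submit-order",
  "max_duration_s": 420
}
```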
goals (study-level and talk-level)
Study-level goals define what the entire study should learn. Talk-level goals (talk.goals) tell a specific segment which study goals to pursue.
Good goals:
- Observable and specific: "Capture the exact trigger event and context"
- Outcome-oriented: "Identify workarounds used when the primary flow fails"
Avoid:
- Vague: "Understand the user" (understand what, specifically?)
- Too many: 3-5 goals per study is ideal. More than 7 dilutes focus.
- Duplicated: Don't repeat the same goal in every segment. Assign goals to the segments where they're most relevant.
max_duration_s
Safety valve for observe segments. Auto-advances after this many seconds.
Recommendations:
- Observe segments: 300-420s (5-7 minutes)
- Speak segments: rarely needed (they're short by nature)
- Talk segments: rarely needed (conversation has natural endings)
skip_if
Natural language condition for skipping this segment entirely.
Example: "The participant already described their trigger event in the previous segment."
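Placed on a segment, the condition might look like the sketch below. The segment ID, title, prompt, and goal ID are placeholders; only the skip_if text comes from the example above.

```json
{
  "id": "seg_trigger",
  "title": "Trigger & Need",
  "mode": "talk",
  "skip_if": "The participant already described their trigger event in the previous segment.",
  "talk": {
    "system_prompt": "Ask what happened right before they started and what they needed done.",
    "goals": ["g_trigger"]
  }
}
```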
Common Anti-Patterns
1. All-talk studies with no observation
Problem: Five talk segments in a row. You're collecting opinions, not behavior.
Fix: Add at least one observe segment. Watch them do the thing, then talk about it.
2. Missing conductor_context
Problem: Observe segment with empty conductor_context. The AI has no idea what "stuck" looks like for your specific task.
Fix: Always describe expected behavior, common failure points, and what stuck looks like.
3. Vague goals
Problem: Goals like "Understand the user experience" or "Learn about needs."
Fix: Make goals concrete and observable: "Capture the exact workaround used when export fails."
4. Starting with observe
Problem: First segment is observe. Participant has no idea what they're supposed to do.
Fix: Always precede observe with speak (task instructions) or talk (context gathering).
5. No debrief after observe
Problem: Observe segment followed by interview end. You captured behavior but never asked why.
Fix: Always follow observe with a talk debrief that references what you just watched.
6. Giant system_prompt
Problem: 500-word system_prompt that tries to cover every scenario. The AI loses focus.
Fix: Keep prompts to 3-5 concrete rules. Use talk.goals to direct focus rather than long prompts.
7. Too many goals
Problem: 10 goals across 3 segments. None get adequate coverage.
Fix: 3-5 goals per study. Each goal assigned to 1-2 segments where it's most relevant.
8. No max_duration_s on observe
Problem: Observe segment runs indefinitely. Participant wanders for 15 minutes.
Fix: Set max_duration_s to 300-420 for observe segments. Use a deterministic advance_when rule for early completion.
Annotated Examples
Talk Interview
Why this works: Pure talk study is appropriate when you are asking about past behavior instead of observing current behavior. Each segment narrows the aperture from context to trigger to friction to alternatives.
{
  "version": 2,
  "defaults": {
    "system_prompt": "You are a product interviewer. Ask one concrete question at a time. Prefer behavior evidence over opinions."
  },
  "goals": [
    { "id": "g_trigger", "description": "Capture the exact trigger event and context" },
    { "id": "g_outcome", "description": "Capture desired outcome and success criteria" },
    { "id": "g_friction", "description": "Capture specific friction points during execution" },
    { "id": "g_workaround", "description": "Capture workarounds or alternate paths used" },
    { "id": "g_decision", "description": "Capture tradeoffs and stop/go decisions" }
  ],
  "segments": [
    {
      "id": "seg_rapport",
      "title": "Rapport & Context",
      "mode": "talk",
      "talk": {
        "system_prompt": "Set a 1-sentence boundary and ask for the last real time the participant did the target task end-to-end.",
        "goals": ["g_trigger"]
      }
    },
    {
      "id": "seg_trigger",
      "title": "Trigger & Job",
      "mode": "talk",
      "talk": {
        "system_prompt": "Ask: what happened right before they started, what they needed done, what success meant. Follow with one strict probe per answer.",
        "goals": ["g_trigger", "g_outcome"]
      }
    },
    {
      "id": "seg_friction",
      "title": "Friction & Workarounds",
      "mode": "talk",
      "talk": {
        "system_prompt": "For every friction mention, ask: what did you click next, what obstacle appeared, what did you do to keep going, what did it cost? Do not move on until you capture behavior + consequence.",
        "goals": ["g_friction", "g_workaround", "g_decision"]
      }
    },
    {
      "id": "seg_compare",
      "title": "Alternatives",
      "mode": "talk",
      "talk": {
        "system_prompt": "Ask only about alternatives already used, not hypothetical futures. Push for one concrete example per alternative.",
        "goals": ["g_outcome", "g_decision"]
      }
    },
    {
      "id": "seg_wrap",
      "title": "Wrap Up",
      "mode": "talk",
      "talk": {
        "system_prompt": "Summarize what you heard. Ask: is there one concrete moment that shows what matters most?",
        "goals": ["g_decision"]
      }
    }
  ]
}
Design notes:
- defaults.system_prompt sets the baseline interviewer persona; individual segments override it with specific focus areas.
- Goals are distributed across segments. g_trigger is covered in rapport and trigger segments; g_decision spans friction, compare, and wrap-up.
- Each segment's talk.system_prompt includes concrete example questions, not just topic descriptions.
Usability Test
Why this works: The speak → observe → talk → speak arc captures natural behavior (observe) with clear setup, focused reflection, and a scripted close. The opening speak segment prevents bias by delivering instructions without conversation.
{
  "version": 2,
  "goals": [
    { "id": "g_completion", "description": "Evaluate task completion and ease of use" },
    { "id": "g_friction", "description": "Identify points of confusion or friction" }
  ],
  "segments": [
    {
      "id": "seg_intro",
      "title": "Task Instructions",
      "mode": "speak",
      "speak_text": "Hi there — thanks for joining. I'll give you a task to complete. While you work, speak out loud about what you're doing and anything that feels confusing. There are no right or wrong answers.",
      "speak_submode": "speak_balanced"
    },
    {
      "id": "seg_task",
      "title": "Complete the Task",
      "mode": "observe",
      "instruction": "Complete a purchase from product page through to confirmation.",
      "conductor_context": "Expected flow: browse → add to cart → checkout → payment → confirmation. The payment form requires scrolling on mobile. Users who tap 'Back' from payment often cannot find their cart again.",
      "max_duration_s": 420,
      "advance_when": "url:https://example.com/confirmation"
    },
    {
      "id": "seg_debrief",
      "title": "Debrief",
      "mode": "talk",
      "talk": {
        "system_prompt": "Ask what was easy, what was confusing, and what outcome they expected. Reference specific moments you observed. Follow with one concrete improvement question.",
        "goals": ["g_completion", "g_friction"]
      }
    },
    {
      "id": "seg_thanks",
      "title": "Thanks",
      "mode": "speak",
      "speak_text": "Thanks for walking through that task and sharing your thoughts."
    }
  ]
}
Design notes:
- conductor_context tells interpretation and debrief what friction can mean in this specific task.
- advance_when uses URL matching for deterministic advancement when the task is done.
- max_duration_s: 420 (7 minutes) prevents indefinite observation.
- The debrief prompt says "reference specific moments"; the AI has the observation transcript and can ask about concrete behavior.
Exploration Study
Why this works: Starts with talk to understand context, moves to observe to see reality (not just what they say), then probes the gap between what they described and what you observed.
{
  "version": 2,
  "goals": [
    { "id": "g_workflow", "description": "Understand daily routines and workflows" },
    { "id": "g_needs", "description": "Discover unmet needs and workarounds" }
  ],
  "segments": [
    {
      "id": "seg_context",
      "title": "Context & Routine",
      "mode": "talk",
      "talk": {
        "system_prompt": "Explore the participant's daily routine, tools, habits. Ask about the last time they did the target task. Get concrete, recent examples — not general descriptions.",
        "goals": ["g_workflow"]
      }
    },
    {
      "id": "seg_demo",
      "title": "Show Current Workflow",
      "mode": "observe",
      "instruction": "Show how you currently do this task from start to finish.",
      "conductor_context": "We are watching their current workflow to identify friction and workarounds. Note any copy-paste between tools, manual steps that could be automated, or moments of hesitation.",
      "max_duration_s": 360
    },
    {
      "id": "seg_probe",
      "title": "Pain Points & Needs",
      "mode": "talk",
      "talk": {
        "system_prompt": "Dig into frustrations and workarounds observed in the demo. Ask: why do you do it that way? What breaks? What would you change? Look for unmet needs behind stated preferences.",
        "goals": ["g_needs"]
      }
    }
  ]
}
Design notes:
- The talk → observe → talk pattern lets you compare what people say (context) with what they do (demo), then probe the differences.
- conductor_context in the demo segment primes the AI to watch for specific signals (copy-paste, manual steps, hesitation).
- The probe segment explicitly references the demo: "Dig into frustrations and workarounds observed."
SegmentV2 Field Reference
Required Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique segment identifier (e.g. "seg_intro"). Used in deterministic advancement, skip conditions, and logging. |
| title | string | Human-readable segment name. Shown in the dashboard and logs. |
| mode | "talk" \| "speak" \| "observe" | Interaction mode for this segment. |
Talk Mode Fields
| Field | Type | Default | Description |
|---|---|---|---|
| talk.system_prompt | string | Inherits from defaults.system_prompt | Interviewer persona and question strategy for this segment. |
| talk.goals | string[] | [] | IDs of study goals this segment should pursue. |
| talk.tools | LLMTool[] | [] | Custom tools available to the interviewer LLM. |
Speak Mode Fields
| Field | Type | Default | Description |
|---|---|---|---|
| speak_text | string | — | Text for the AI to speak. Required for speak segments. |
| speak_submode | "speak_fast" \| "speak_balanced" \| "speak_rich" | "speak_balanced" | Voice quality/speed tradeoff. |
| speak_interruptible | boolean | false | Deprecated legacy field. Ignored by the runtime; speak segments are output-only. |
Observe Mode Fields
| Field | Type | Default | Description |
|---|---|---|---|
| instruction | string | — | Task instruction shown to the participant. |
| conductor_context | string | "" | AI-only background knowledge for interpretation and planned debriefs. |
Segment Flow Control
| Field | Type | Default | Description |
|---|---|---|---|
| advance_when | string | — | Deterministic auto-advance rule: url:<substring> or action:<selector-or-pattern>. |
| skip_if | string | — | Condition to skip this segment entirely. |
| max_duration_s | number | — | Auto-advance after this many seconds. |
Host Actions
| Field | Type | Default | Description |
|---|---|---|---|
| host_actions_on_enter | HostAction[] | [] | Scripted host-page actions executed when entering this segment: highlight, navigate, or scroll_to. |
| host_actions_on_exit | HostAction[] | [] | Scripted host-page actions executed when leaving this segment. |
Host actions are deterministic segment-boundary steps.
- Scripted highlight and scroll_to actions use the selectors in their action payloads; do not rely on allowed_selectors as widget-runtime enforcement for those actions.
- Study-level allowed_origins allows conductor runtime requests from those origins; it does not authorize cross-origin host navigation. When allowed_origins is non-empty, include the embed origin that runs the widget.
- Scripted navigate actions should target the current host origin.
- Observation still stays silent; advancement uses max_duration_s, user Done / step_done, url:<substring>, action:<selector-or-pattern>, complete_segment for talk, or scripted speak completion.
StudyScriptV2 Top-Level Fields
| Field | Type | Description |
|---|---|---|
| version | 2 | Always 2. |
| segments | SegmentV2[] | Ordered list of segments. |
| goals | StudyGoal[] | Study-level goals: { id: string, description: string }. |
| defaults.system_prompt | string | Fallback system_prompt for talk segments that don't specify one. |
| defaults.voice | string | Default TTS voice. |
| defaults.speak_submode | SpeakSubmode | Default speak quality/speed. |
| defaults.language | string | Interview language hint. |
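A hedged sketch of how the defaults block might sit alongside the other top-level fields. The "en" value for defaults.language is an assumption about the expected format, and defaults.voice is omitted because this guide does not list valid voice names. The segment IDs, titles, goal, and text are placeholders.

```json
{
  "version": 2,
  "defaults": {
    "system_prompt": "You are a product interviewer. Ask one concrete question at a time.",
    "speak_submode": "speak_fast",
    "language": "en"
  },
  "goals": [
    { "id": "g_trigger", "description": "Capture the exact trigger event and context" }
  ],
  "segments": [
    {
      "id": "seg_welcome",
      "title": "Welcome",
      "mode": "speak",
      "speak_text": "Thanks for joining. I have a few questions about the last time you did this task."
    },
    {
      "id": "seg_interview",
      "title": "Interview",
      "mode": "talk",
      "talk": { "goals": ["g_trigger"] }
    }
  ]
}
```

Here the talk segment sets no talk.system_prompt, so it would fall back to defaults.system_prompt as described in the table above.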
See also
- Studies — create and manage studies
- Quickstart — end-to-end setup including study creation