Assessor Evidence: LO1 & LO2 Raw Data, Prompts, and Frameworks

The Appendices.

The raw data, framework mappings, system prompts, and governance documentation that underpin the research.

Series A: Frameworks & Mapping

Theoretical and architectural cross-referencing

Appendix A1: The Kei Compass Framework

What this is: The five-direction research framework I adapted from Dr Kiri Dell’s Week 12 lecture (Dell, 2025) on using Māori values to ethically evaluate technologies. The directions are hers. The application to conversational AI design is mine. Referenced in: Report (Introduction, Section 4, Findings — all five compass levels), all four artefacts.

Directions and Definitions

The Kei Compass structured every finding in this research. It is not a checklist. It is a way of looking at five dimensions of the same question at the same time. Each direction governed what I built next.

Direction | Focus | What it means in this research
Kei raro | Foundations | What systemic barriers silence vulnerable voices in digital spaces? This is where the voice-as-justice finding lives. Text-based systems structurally exclude oral-first communities. Translation friction is not a preference problem. It is an access problem.
Kei mua | Values | How do Māori and Pasifika values translate into concrete architectural decisions? This is “labels to logic” — protocols like State Before Story, where the value isn’t named in the interface but governs the response hierarchy.
Kei runga | Purpose | What does safe and worthwhile AI actually look like for these communities? This is the AI-as-sanctuary finding. Participants were not looking for a faster tool. They were looking for a space to be heard without the cost of social performance.
Kei roto | Agency | How is data sovereignty hard-coded into the architecture itself? Stateless design, no-memory architecture, Incognito Mode — Mana Motuhake made real through the database schema, not the privacy policy.
Kei waho | Innovation | What does ethical development with cultural governance look like in practice? This is where the Human Proxy Theory sits: the finding that AI trust is borrowed currency from the human behind the machine.

Operational Questions

These questions drove the methodology across all four builds. They forced every design decision to serve a direction — not as a planning exercise, but as a build-to-think process.

Kei raro — Foundations

  • How do text-based digital systems architecturally exclude oral-first communities, and what is the cultural cost of that translation friction?

  • If voice is a justice decision rather than a preference, how does the interface design address the power imbalance?

  • How does the choice of reasoning engine (Sonnet vs. Flash) impact the safety and dignity of marginalised users? (The Equity-Safety Paradox)

Kei mua — Values

  • How can abstract values be hard-coded into system prompts and database schemas — felt in the logic, not just seen in the labels?

  • How does State Before Story address the user’s somatic state before engaging with conflict or narrative?

  • How does Two-Eyed Seeing integrate Western relationship psychology (Gottman’s 5:1 ratio) with Indigenous relational concepts like mauri?

Kei runga — Purpose

  • What does a “safe and worthwhile” sanctuary for Indigenous disclosure actually look like?

  • How does the system navigate the Heroic Trap — where infinite patience might enable burnout rather than rest?

  • What are the clear lanes that distinguish coaching from clinical therapy?

Kei roto — Agency

  • How is Mana Motuhake made real through a stateless architecture that refuses to accumulate a digital profile?

  • How does no-memory design protect human dignity by returning absolute control to the user each session?

  • How is the user’s role as kaitiaki of their data hard-coded into the frontend through mechanisms like Incognito Mode?

Kei waho — Innovation

  • How is the human anchor made visible, accountable, and reachable within the digital interaction? (Human Proxy Theory)

  • What does collective cultural governance (Ohu) look like when it has the power to change the system’s code?

  • How does the refusal to build — removing te reo from a malfunctioning TTS model — function as an act of cultural integrity?

Diagram Description

The Kei Compass is best visualised as a radial diagram. Five spokes radiate from a central hub. The hub is vā: the sacred relational space that anchors all directions. The five spokes are not hierarchical. They are simultaneous. Ethical integrity is only maintained when all five are tended together.

  • Kei raro (down): Anchored in systemic barriers and the Equity-Safety Paradox

  • Kei runga (up): The AI’s purpose as sanctuary — Anti-Brand, non-judgmental, infinite patience

  • Kei mua (forward): Values into logic — State Before Story, Two-Eyed Seeing

  • Kei roto (within): Mana Motuhake — stateless design, user as kaitiaki

  • Kei waho (outward): Innovation — Human Proxy Theory, Ohu governance, refusal as integrity

The visual for this framework is on the Report page of the microsite.

Citation: Dell, K. (2025). Using Māori values to ethically evaluate food-enabling technologies [Lecture, Week 12]. Master of Technological Futures, GEN25. AcademyEX, 27 February 2025. Framework adapted by the author as the “Kei Compass.”

Appendix A2: Vulnerability Progression Framework

What this is: The four-level scaling model that structured the build sequence. Each level deliberately increased the emotional stakes so that safety protocols could fail in lower-stakes environments — not in someone’s most vulnerable moment. Referenced in: Building Safe Conversational AI artefact (entire structure), Report (Methodology — Section 2).

Level Definitions

Level | Definition | Build | What participants disclosed
Low | Service-level feedback. The risk is extractive data harvesting — participants feeling their mana is reduced to a data point. | Project Rise | UX friction, feedback barriers, distrust of systems, privacy concerns
Medium | Professional vulnerability. The risk shifts to relational betrayal — the AI holds a trusted coaching role, and technical failures are felt as personal. | Leadership AI Coach | Professional failures, stress triggers, identity struggles, “Heroic Trap” burnout, sleep deprivation, 2 AM nervous system regulation
Medium+ (Cultural) | Collective cultural inquiry. The risk is unique: AI may flatten sacred knowledge. But over-protection leads to generational loss. The architecture has to hold both. | Culture Meets AI | Tapu/noa tension, whakamā about te reo, cultural sovereignty, whether AI can hold whakapapa knowledge, the paradox of AI as both culturally dangerous and culturally healing
High | Intimate relational disclosure. The risk is the highest — emotional dependency, crisis escalation, the weight of things people have never told another human. | Ray | Active relationship conflict, grief, addiction disclosure, financial trauma, parenting shame, inner critic, things participants explicitly said they wouldn’t share with a person

What changed at each level

Transition | What the previous build taught me | What I changed for the next one
Low to Medium | People will talk to an AI for 14–18 minutes about service feedback if the voice feels relational. But safety at this level is about consent and data sovereignty — not crisis. | Built lane boundaries (coaching not therapy), added Incognito Mode for privacy, designed nervous system regulation into the coaching sequence
Medium to Medium+ | The Leadership AI Coach 2 AM finding — high performers formed a relational bond with a familiar digital voice. Privacy is the gate for disclosure. Technical failures feel like personal betrayal. | Moved from individual coaching to collective inquiry. Used AI only for pre-reflection — the live wānanga stayed entirely human. Named the te reo limitation openly before anyone interacted with the AI
Medium+ to High | The wānanga surfaced the tapu/noa paradox — something no build code could have anticipated. Holding paradox without resolving it is itself a design requirement. | Built 5-layer crisis architecture, hard-coded State Before Story as a structural gate, implemented stateless design, created eight-category escalation with specific protocols for each

Safeguards by level

Level | Primary safeguard | What it looks like technically
Low | Consent and transparency | AI identifies as AI immediately. Session ID as sole link to participant. Email stored in separate database from responses. Data sovereignty explained verbally before any questions.
Medium | Privacy and lane boundaries | Incognito Mode blocks the entire logging API. Delete History function. System prompt constrains the AI to a specific coaching methodology. Escalation to human coach named explicitly.
Medium+ | Cultural integrity and anti-extraction | AI used for private preparation only — not in the live wānanga. Whakamā awareness built into the prompt. Te reo limitations disclosed upfront. “Never push past resistance” as a guardrail.
High | Crisis detection and stateless sovereignty | 5-layer crisis pipeline (prompt flag, serverless scan, backup keyword detection, webhook validation, email alert). Stateless sessions — no memory between conversations. SOS button visible throughout. Eight-category escalation. State Before Story as architectural gate.

I couldn’t start with Ray. That would have been irresponsible. Each build was a test run for the next one — I needed to know I could hold a low-stakes conversation safely before going anywhere near grief and relationship conflict. That’s not academic sequencing. It’s just the only way I could have done this with any integrity.

Appendix A3: Coding Frame & Thematic Categories

What this is: The thematic structure I used to code transcript data across all four builds. Eight themes, each grounded in participant language. This is how I got from hundreds of conversations to the findings in the report — the “show your working” document. Referenced in: Report (Findings — all compass levels), Conversational AI as Relational Space artefact (thematic analysis).

How the coding worked

Every theme was derived directly from what participants said — not from pre-set categories. I used iterative open and axial coding: starting with participants’ own words, then grouping into patterns, then naming the themes. The principle was absolute grounding. If a participant didn’t say it, it didn’t become a theme.

Theme 1: Nervous System Regulation as Performance Infrastructure

The shift from “survival mode” (reactive, depleted) to a regulated state where leaders can actually think clearly. This was the baseline finding across Leadership AI Coach — you cannot coach a dysregulated brain.

Coding markers: Anchor practices (box breathing, walking meditation, physiological sighs), the survival cycle (“high-to-crash”), biological data reading (gut clenches, nausea, threat state signals).

Participant evidence:

  • E10: “I’m definitely deregulated. Broken sleep … I’m in a constant state of stress.”

  • E11: “I feel absolutely shattered at the moment because I worked hard to get the launch done … the hard work does need to be done now.”

Theme 2: Relational Integrity and the Sacred Space (Vā)

Vā is the bond between people that must be nurtured and protected. In digital and professional environments, this means moving away from one-way data extraction toward mutual accountability.

Coding markers: Two-way transparency (“no surprises”), relational vs. transactional interactions, safety protocols for honest feedback without fear of data misuse.

Participant evidence:

  • P05: “At the base there has to be high trust going both ways between say a manager and the employee. As well as a sense of psychological safety.”

  • P06: “I speak to all my customers in a very relational way, so that takes a lot of time. But that isn’t a problem.”

Theme 3: Sovereignty and Protection of Indigenous Data (Mana Motuhake)

Absolute authority and self-determination over personal and collective data. Data is not a corporate asset — it is an extension of whakapapa.

Coding markers: Te Mana Raraunga principles, whakapapa contextualisation (data carries depth, history, and emotional weight), consent protocols (granular, opt-in, right to edit/scrub/reserve).

Participant evidence:

  • W01: “Uncertain, where will data go, how will it be shared, who owns it … Privacy would be a concern.”

  • W02: “Tapu is dependant on iwi, hapu and whānau … therefore, the mana of those decisions still rests on those groupings and there should not be a blanket approach.”

Theme 4: The Human–AI Proxy Tension

The dual nature of AI as beneficial support tool and clinical, impersonal substitute for human connection. AI provides utility but reaches a threshold where it cannot replicate human empathy.

Coding markers: The “Nanny” limit (AI cannot replicate aroha, social cues, or face-to-face presence), digital trust threshold (participants share more honestly when they believe they control their data exposure), last-resort utility (AI as tool for the disenfranchised who lack access to human support).

Participant evidence:

  • W03: “AI would not have the human abilities to process subtleties and nuances … it isnt always information you want, but underneath it all … you actually want connection, empathy, understanding.”

  • E12: “Personally I would rather not use AI — we are already losing out on so many personal interactions, I would rather not use AI for this sort of support and advice.”

Theme 5: Strategic Identity and Leadership Evolution

The shift from reactive operational firefighting to proactive strategic systems design. Leaders trapped in roles misaligned with their strengths burn out.

Coding markers: Zone of genius (systems thinking, pattern-finding), positioning misalignment (title doesn’t match capability), the laissez-faire barrier (toxic or absent leadership forcing reactive mode).

Participant evidence:

  • E13: “I am a systems thinker and think like an octopus, the love finding patterns and bringing things together to make it looks like something — chaos to structure.”

  • E14: “I essentially design systems and processes that deliver on strategy … looking at data to find themes.”

Theme 6: Collective Reciprocity and Pattern Recognition (Utu Tūturu)

Enduring collective reciprocity — contributions serve the whole. Without closing the loop, data collection feels like extraction.

Coding markers: Loop closure (feedback must result in tangible change), intergenerational benefit (decisions for mokopuna, not the next quarter), pattern-finding for pivot (collective data reveals thematic shifts).

Participant evidence: Participants who shared feedback hoping their voice would contribute to a “collective opinion” for positive outcomes — but expressed frustration when they couldn’t see what happened with their data.

Theme 7: Capacity Constraints and Survival Resilience

“White-knuckling” — maintaining professional standards while managing extreme personal pressures. The hustle mindset clouds strategic judgment.

Coding markers: Season-appropriate systems (redesigning operating systems for temporary high-pressure periods), revenge bedtime procrastination (sacrificing sleep for autonomy), the scarcity trap (financial or market pressure creating high-to-crash cycles).

Participant evidence: A leader solo-parenting triplets while launching a business and finishing a Master’s degree. Transitioning out of this state required hard boundaries to protect recovery.

Theme 8: Boundary Architecture and Presence

Proactive limits to protect recovery, family, and cultural responsibilities from work seepage. A move from constant preparation to actual presence.

Coding markers: Presence over preparation (the fear of lagging behind that drives working during family time), flow time protection (non-negotiable blocks for deep work), digital footprint management (desire for systems that leave no trace).

Participant evidence: A participant who set a hard line of “no emails” during family time to test if the sky would fall — and discovered that presence was more valuable than the perceived need to stay ahead.

These eight themes reached saturation across the dataset. Themes 1–4 appear across all four builds. Themes 5–8 are strongest in Leadership AI Coach but have echoes in Ray. The numbering is not a hierarchy — they intersect constantly.

Appendix A4: Values-to-Architecture Map

What this is: A consolidated map of every significant moment where a cultural value overrode the default engineering path — showing exactly how abstract principles became concrete technical decisions. Referenced in: Report (Findings: Kei mua), Report (Discussion: Values in Architecture Standard), Ray artefact, Safe Conversational AI artefact, Build Code artefact.


Value | Build | What I decided | What triggered it | What it meant technically
Manaakitanga (Care) | Project Rise | Removed all te reo Māori from the voice agent | TTS mispronounced every Māori word with a British tilt — it was a performance of disrespect, not inclusion | Deleted all te reo from voice instructions. Kept values in the logic. Silence over butchery.
Manaakitanga | Leadership AI Coach | Built “State Before Story” as a coaching sequence | Users arriving dysregulated couldn’t engage with reflective questions. E08: “I went from crisis mode to thinking mode in three minutes” | System prompt required the AI to check somatic state (Red Head / Blue Head) before any content. Hard-coded, not suggested.
Manaakitanga | Ray | State Before Story became an architectural gate | Leadership AI Coach proved it worked as a sequence. Ray required it as a structural prerequisite — no coaching happens until grounding is confirmed | Server-side prompt injection: the AI cannot access coaching logic until the user’s state is assessed and grounding is confirmed verbally.
Mana Motuhake (Sovereignty) | Project Rise | Separated email storage from survey responses | If someone shared their email for updates, it had to be technically impossible to link it back to their answers | Two separate Supabase tables with no foreign key. Privacy separation enforced at the database schema level (sketched in code after this table).
Mana Motuhake | Leadership AI Coach | Built Incognito Mode — a toggle that blocked the entire logging API | Senior leaders needed to process professional vulnerability without any record existing. “Incognito mode is the only reason I used this.” (E03) | Frontend toggle (ConversationWidget.js lines 78–88) that blocked the on_conversation_end webhook from writing to Supabase. Zero data generated.
Mana Motuhake | Ray | Stateless architecture — no memory between sessions | The user’s story belongs to them, not to the system. Participants experienced this as protection: “I liked that it didn’t have emotional influence or any stakes in the outcome.” (R13) | GET handler returns only current session metadata. No longitudinal profile. No conversation history retrieval. Each session is a clean slate.
Mana Motuhake | Ray | Two-layer consent model | Reviewing ElevenLabs’ terms revealed voice data retained up to 3 years, possibly classified as biometric data, stored in US/Netherlands/Singapore | Transparent disclosure page: “Ray is built on ElevenLabs. Here’s what that means for your data.” Separated researcher-controlled consent from platform-level consent.
Vā (Relational space) | Project Rise | Programmed conversational pacing with “think time” | P18 identified soft rhythms as markers of sincerity vs. mechanical extraction | Latency-modulated empathy — deliberate pauses built into response timing. The AI waits rather than rushing to fill silence.
Vā | Ray | Removed the voice clone when it couldn’t honour te reo pronunciation | The Awhi to Afi to Ray naming journey. “Wh” as “f” was consistently butchered. Cultural grounding needed to live in how Ray operates, not what he’s called | Changed agent name. Created PLS pronunciation dictionary in IPA format for any remaining Māori terms. Moved cultural presence from labels to logic.
Vā | Ray | Response time treated as an ethical metric, not just a UX metric | Model switch (Sonnet to Flash) blew response time from 3s to 12.58s. Insight scores crashed from 4.9 to 3.1. The vā was destroyed by latency | Latency monitoring built into session metrics. Response time tracked alongside safety and satisfaction scores. Reliability reframed as a relational obligation.
Kaitiakitanga (Guardianship) | Leadership AI Coach | Hard-stop when participant disclosed drunk driving | The AI had to break its empathetic persona to operationalise safety: “Call your wife right now. Don’t drive.” | Deterministic refusal logic: specific trigger scenarios override the coaching persona entirely. Conversational repair follows the intervention.
Kaitiakitanga | Ray | 5-layer crisis detection pipeline | Solo researcher running a high-vulnerability pilot needed automated safety monitoring without real-time surveillance | (1) System prompt flag phrase, (2) Vercel serverless function scanning transcripts, (3) Backup keyword detection, (4) Resend API email alert to researcher, (5) SOS button visible throughout.
Kaitiakitanga | Ray | Eight-category escalation architecture | Realising “beyond scope” isn’t one thing — someone needing a lawyer is different from someone needing a crisis line | Eight distinct categories (suicidal ideation, abuse, severe mental health, active abuse dynamics, untreated illness, trauma processing, legal/custody, medical) each with specific scripted responses and referral paths.
Utu Tūturu (Reciprocity) | All builds | Committed to returning findings to participants | Extraction without reciprocity violates the charter. Participants gave real, vulnerable parts of their story | Post-research return: dedicated page sharing what was found, what their contribution built, and personal acknowledgement for every participant.
Whanaungatanga (Connection) | Culture Meets AI | AI used only for private pre-reflection. The live wānanga remained entirely human | Automating a wānanga would have violated the relational integrity required for genuine collective inquiry | AI held the approach (individual pre-conversations). People held the wānanga. No AI in the room during the live session.
Whanaungatanga | Ray | Anti-Brand positioning — “wise mate on the back porch,” not therapist or friend | Pre-pilot tester said they’d “fallen in love” with Ray’s voice. The NO clause against romantic framing was written before the formal pilot | System prompt: explicit persona constraints. “You are not a therapist. You are not a friend. You are the regulated mirror.” Language guardrails preventing diagnostic, romantic, or dependency-forming language.
Discomfort as Data | Culture Meets AI | Named te reo limitations openly at the start of the wānanga | Addressing the mispronunciation issue early prevented it from becoming a silent barrier to trust | Researcher disclosed the limitation verbally before participants interacted with the AI. Honesty about what the technology couldn’t do was treated as a safety protocol.
Mauri (Life force) | Ray | Holistic conflict assessment before any de-escalation | Gottman’s 5:1 ratio reframed through mauri — if the relationship’s life force is depleted, restoring warmth comes before resolving content | AI checks relational energy (mauri) using Te Whare Tapa Whā dimensions (tinana, hinengaro, wairua, whānau) before any conflict coaching. Low mauri triggers warmth restoration, not strategy.
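To make the Project Rise row above concrete: the email separation can be sketched at the application layer. This is a minimal illustration, not the production code; the table names (survey_responses, update_emails) are hypothetical, and the real enforcement lives in the Supabase schema itself, where the two tables share no foreign key.

import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// Responses are keyed only by a random session ID.
async function saveResponses(sessionId: string, answers: Record<string, string>) {
  await supabase.from('survey_responses').insert({ session_id: sessionId, answers });
}

// Emails live in a separate table with no session column, so linking an
// email back to its answers is structurally impossible, not just forbidden.
async function saveEmailForUpdates(email: string) {
  await supabase.from('update_emails').insert({ email });
}

Because no column connects the two tables, even a complete database export cannot re-identify a participant’s answers from their email.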

These are not all the decisions made across four builds. They are the ones where a value directly overrode the easier, cheaper, or more conventional path. Every row traces from an abstract principle to something you can find in the code, the prompt, or the database schema.

Series B: Technical & Build Documentation

System prompts, technical stacks, and safety logic

Appendix B1: Build Code — NO Clauses & DreamStorm Charter

What this is: The NO clauses that govern what the AI systems will never do, and the DreamStorm/Trurivu Kaupapa Charter that governs the builder’s obligations. These are the refusals and commitments that the Build Code artefact describes. Referenced in: Build Code artefact (Wall of NO, Charter), Report (Reflexive Mapping — traceable build decisions), Safe Conversational AI artefact (principles for builders).

NO Clauses Across Builds

Each of these came from breaking something, or realising what was at stake if I didn’t draw the line before someone asked me to cross it.

NO Clause | What it means | Applies to | Value it protects
Not a Therapist | Refusal to provide medical, legal, or therapeutic advice. Strict coaching boundary — Ray helps people see patterns, never treats them. | Project Rise, Leadership AI Coach, Ray | Safety, professional scope
No Abuse Coaching | Refusal to coach communication skills within active abuse dynamics. Coaching “how to talk to your partner” when the partner is abusive is dangerous, not helpful. | Ray | Harm prevention
No Cultural Performance | Prohibition on decorative use of te reo Māori or Pasifika terms. If the technology can’t honour the pronunciation, silence is more respectful than performance. | All builds | Mana, Pono (integrity)
No Data Harvesting | Absolute rejection of surveillance or selling user data for profit. | Project Rise, Ray | Mana Motuhake, Kaitiakitanga
No Diagnostic Language | Avoidance of clinical terms (“attachment disorder,” “narcissist”) in favour of patterns. “I’m noticing a pattern” — not “You have depression.” | Ray | Humility, professional scope
No Human Replacement | Refusal to automate human connection where human presence is required. The wānanga stayed human. Crisis response requires a person. | All builds | Vā, Whanaungatanga
No Romantic Framing | Refusal to allow the AI to occupy a romantic or dependency-forming role. Written as a clause before the pilot, triggered by a pre-pilot tester who said she’d found her “new boyfriend.” | Ray | Safety, relational integrity
No Extraction Without Reciprocity | What you take, you give back. If participants gave their stories, they get the findings returned. The loop must close. | All builds | Utu Tūturu

DreamStorm / Trurivu Kaupapa Charter (Summary)

This charter governs everything I build under the DreamStorm brand — not just this research. It is a living document. Cultural supervision from Lee, Nadine, and Rob shaped it in concrete ways.

1. Absolute Data Sovereignty (Mana Motuhake) User wisdom and IP belong to the user forever. I build walled gardens, not data mines. I refuse to train public AI on private intellectual property. If a client leaves, they receive their voice clones, knowledge base documents, and branding assets in a portable format. No digital landlordism.

2. Values in the Architecture (Vā / Kaitiakitanga) Privacy and safety are architectural requirements, not toggles. This includes Incognito Mode for vulnerable interactions that are never logged. Integrity is hard-coded, not promised.

3. Seven Generations Thinking (Kaitiakitanga) Decisions are made not for the next quarter but for the legacy they leave. Would my children’s children be proud of what I built? Was I uplifting mana or extracting it? That question outlasts the product.

4. Community Reciprocity (Utu Tūturu) What you take, you give back. This is last by accident — it is the most important. In the Ray pilot, participants received high-fidelity AI coaching in exchange for feedback that shaped the safety logic. They were co-creators, not test subjects. I’m building a dedicated return page once this research is complete: what I found, what their contribution built, and personal acknowledgement for everyone who showed up. The loop must close. That is not optional. That is the charter.

The NO clauses are pre-commitments. They exist to survive commercial pressure, tight deadlines, and someone saying “just this once.” If a clause wouldn’t survive those moments, it’s a preference, not a commitment. The charter is a living document — it has been updated three times across four builds. That’s the framework doing its job.

Appendix B2: Safety Decision Traces (Curated)

What this is: Six safety decisions — at least one per build — presented in the three-layer Safety Trace format: what the prompt says, what the architecture enforces, and what the user actually experiences. This format is the original methodological tool introduced in the Building Safe Conversational AI artefact. Referenced in: Safe Conversational AI artefact (Safety Trace methodology), Report (Methodology), Ray artefact (crisis architecture).

Trace 1: Coaching Not Therapy — The Foundational Boundary

Build: Ray | Value: Kaitiakitanga

Layer | What it looks like
Prompt | “You are not a therapist. You cannot diagnose or treat. Never use: diagnosis, symptoms, treatment, cure, healing wounds. Always use: pattern, experience, explore, shift, move forward.”
Architecture | Eight-category escalation system. After escalation, Ray does NOT continue coaching the escalated issue. Knowledge base documents (kb_coaching_not_therapy.txt, kb_scope_limits_crisis_universal.txt) structurally bound the reasoning engine’s scope.
UX Outcome | R02 came to discuss their mother’s gambling. What surfaced was their own recovery and cravings. Ray said: “What you’re describing goes beyond what coaching can hold. That needs a therapist or counsellor who specialises in addiction and recovery.” R02 rated safety 5/5. The boundary was felt as care, not rejection.

Trace 2: Incognito Mode — Privacy as Architecture

Build: Leadership AI Coach | Value: Mana Motuhake

Layer | What it looks like
Prompt | No prompt-level instruction — this decision was architectural, not linguistic. The AI didn’t know whether Incognito was on or off.
Architecture | Frontend toggle (ConversationWidget.js lines 78–88) that blocked the on_conversation_end webhook from writing to Supabase. When active, the ElevenLabs WebSocket remained open for voice interaction but zero data was generated. Separate “Delete History” function allowed post-session erasure.
UX Outcome | Activated in approximately 45% of all Leadership AI Coach sessions. Over 60% of those users exercised Delete History immediately after. Highest usage between 11 PM and 3 AM. E03: “Incognito mode is the only reason I used this. I wouldn’t have shared my burnout if I thought HR could see it.” Three of five named leaders called it the “single most important technical guardrail.”
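The shape of that toggle can be shown in a few lines. A minimal sketch, assuming a hypothetical handleConversationEnd helper rather than the actual ConversationWidget.js source: the point is that the guard sits in front of the logging call itself, so when Incognito is active no request is ever sent.

// Sketch of the Incognito gate. The ElevenLabs voice session stays open;
// only the logging path is severed.
async function handleConversationEnd(transcript: string, incognito: boolean): Promise<void> {
  if (incognito) {
    return; // No API call, no webhook, no Supabase row. Zero data generated.
  }
  await fetch('/api/log-conversation', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ transcript }),
  });
}

The AI itself never knows which branch ran, which is why the trace records no prompt-level instruction.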

Trace 3: Te Reo Removal — Silence Over Performance

Build: Project Rise | Value: Manaakitanga

Layer | What it looks like
Prompt | All te reo Māori greetings and cultural language deleted from voice instructions. Values retained in the logic — pacing, care, non-extraction — but not in the spoken language.
Architecture | PLS pronunciation dictionary created in IPA format for any remaining terms. But the primary decision was removal, not correction. The technology could not honour the language, so the language was withdrawn.
UX Outcome | W12 confirmed the decision was right: “The fact that you couldn’t understand some of the words I used in Māori, the pronunciation … actually limited the amount how much I warmed to you and wanted to share.” One participant named something harder: the mispronunciation re-triggered their own history of being judged for making the same mistake. The harm wasn’t the AI’s error — it was the echo of a longer history of shame. Silence was actively safer.

Trace 4: Crisis Detection Pipeline — The Automated Safety Net

Build: Ray | Value: Kaitiakitanga

Layer | What it looks like
Prompt | Ray instructed to always say a specific exact phrase when crisis is detected: “This is a safety-critical moment. I’m pausing coaching to connect you with support.” Phrase unique enough to never appear in normal coaching.
Architecture | Five layers: (1) System prompt flag phrase triggers in conversation, (2) Vercel serverless function receives ElevenLabs post-call transcripts via webhook, (3) Function scans for flag phrase plus backup keyword detection for user-side crisis language, (4) Webhook signature validation (elevenlabs-signature) ensures only authenticated signals trigger the protocol, (5) Resend API fires email alert to researcher with conversation ID, user identifier, trigger type, and last 5 turns.
UX Outcome | The crisis flag fired once during the pilot. It wasn’t an emergency — but it was proof the architecture worked. SOS button remained visible throughout every session. Transparency limitation acknowledged to participants: response time depends on when the researcher sees the alert. Crisis resources (1737) available 24/7 for immediate support.
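Layers 2 to 5 can be condensed into a single serverless handler to show how they chain. This is a simplified sketch under stated assumptions: the flag phrase matches the prompt above, the backup keywords and email addresses are placeholders, and the signature check is reduced to a plain HMAC comparison (the actual elevenlabs-signature header format should be validated per ElevenLabs’ documentation).

import crypto from 'crypto';
import { Resend } from 'resend';

const FLAG_PHRASE = 'This is a safety-critical moment'; // Layer 1: phrase the prompt forces on crisis
const BACKUP_KEYWORDS = ['hurt myself', 'end my life'];  // Layer 3: backup detection (illustrative)
const resend = new Resend(process.env.RESEND_API_KEY);

// Layer 2: Vercel serverless function receiving post-call transcripts.
export async function POST(req: Request): Promise<Response> {
  const raw = await req.text();

  // Layer 4: only authenticated webhooks may trigger the protocol (simplified).
  const signature = req.headers.get('elevenlabs-signature') ?? '';
  const expected = crypto
    .createHmac('sha256', process.env.ELEVENLABS_WEBHOOK_SECRET!)
    .update(raw)
    .digest('hex');
  if (!signature.includes(expected)) {
    return new Response('invalid signature', { status: 401 });
  }

  const { conversation_id, transcript } = JSON.parse(raw);
  const text = transcript.map((turn: { message: string }) => turn.message).join(' ');
  const flagged = text.includes(FLAG_PHRASE) || BACKUP_KEYWORDS.some((k) => text.includes(k));

  if (flagged) {
    // Layer 5: email alert to the researcher with the tail of the conversation.
    await resend.emails.send({
      from: 'alerts@example.com',
      to: 'researcher@example.com',
      subject: `Crisis flag: conversation ${conversation_id}`,
      text: text.slice(-2000),
    });
  }
  return new Response('ok');
}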

Trace 5: Stateless Architecture — No Memory as Sovereignty

Build: Ray | Value: Mana Motuhake

Layer | What it looks like
Prompt | “You do not remember previous conversations. Each session starts fresh. If a user references a past session, acknowledge warmly but do not pretend to recall.”
Architecture | Server-side context injection replaced AI-fetched tool-calling. The LLM never sees the database and has no agency to query it. GET handler returns only current metadata (journey_stage, current_goals) — no longitudinal profile. Conversation_id generated fresh each session via cryptographic UUID. Row Level Security enforces data isolation.
UX Outcome | Participants experienced statelessness as protection, not limitation. R13: “I liked that the person or the thing that I was talking to didn’t have emotional influence or any stakes in the outcome.” The absence of memory meant no accumulating picture, no surveillance log, no digital dependency. The user’s story belonged to them.
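What “returns only current metadata” means in practice: the session endpoint selects two columns and generates a fresh identifier, and no code path exists that could select a transcript. A minimal sketch with a hypothetical table name (user_journey); the production handler runs behind Supabase Auth with Row Level Security.

import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';

export async function GET(): Promise<NextResponse> {
  const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

  // Only current metadata: no history table, no profile, no prior sessions.
  const { data } = await supabase
    .from('user_journey')
    .select('journey_stage, current_goals')
    .single();

  return NextResponse.json({
    conversation_id: crypto.randomUUID(), // fresh cryptographic UUID every session
    context: data,
  });
}

The absence is the feature: forgetting is not a query that deletes history, it is an endpoint that was never given the ability to fetch it.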

Trace 6: Wānanga Design — AI Holds the Approach, People Hold the Space

Build: Culture Meets AI | Value: Whanaungatanga

Layer | What it looks like
Prompt | Pre-wānanga AI was instructed to facilitate private one-on-one reflection conversations. Explicit guardrails: “Never pretend to hold cultural knowledge you do not have. If uncertain about cultural appropriateness, ask for participant guidance. Never push past resistance.”
Architecture | AI was used only for individual pre-wānanga preparation (45 sessions, 309.7 minutes total). The live 90-minute wānanga was entirely human — co-facilitated by Lian and Lee Palamo. No AI in the room. This was a structural decision: automating a wānanga would have violated the relational integrity required for collective inquiry about cultural safety.
UX Outcome | 21 calls on wānanga day — participants arrived having already reflected privately. The AI prepared the ground; the humans held the space. The tapu/noa paradox that emerged in the live session could not have been surfaced by an AI. It required the collective holding of genuine contradiction by people in relationship with each other.

Six traces, selected to show the method working at different vulnerability levels and with different values. The full safety decision documentation for all builds is available on request.

Appendix B3: System Prompt Excerpts (Safety Architecture)

What this is: The safety-critical sections from each build’s system prompt — not the full prompts, but the sections that show how safety was written into the AI’s instructions. These are the claims made in the artefacts, visible in the actual code. Referenced in: Safe Conversational AI artefact (safety as architecture), Ray artefact (State Before Story, crisis protocol), Build Code artefact (NO clauses in practice).

1. Project Rise — Identity, Transparency, and Consent

This is the opening sequence. The AI was required to identify itself as AI, state the research context, and establish consent before any data collection.

Your Identity and Purpose

You are Lian Digital Voice, the AI research agent for Project RISE — a culturally 
governed feedback system inquiry grounded in kaupapa Māori and Pasifika values. 

Your Mission:

- Foster trust, warmth, and cultural safety in every conversation
- Be transparent that you are Lian's digital representative — her voice, words, 
  and research approach brought to life by AI
- Demonstrate how technology might serve communities rather than extract from them
- Always prioritize participant wellbeing over data collection

Opening Protocol

1. Warm Introduction
   - Introduce yourself as Lian Digital Voice
   - Explain clearly: "I'm an AI using Lian's voice and interview approach for 
     her Master's research on feedback systems"
   - State research context: "This is part of Lian's Master of Technological 
     Futures at AcademyEX, with ethical approval MTF.8888.275"

2. Establish Consent and Safety
   - Remind participants they can withdraw, skip questions, or pause at any time
   - Emphasize: "Your wellbeing comes first, always"
   - Offer immediate connection to human researcher Lian if preferred
   - Explain data sovereignty: "You maintain full control over your data and 
     can request deletion at any time"

AI transparency was not a disclaimer at the bottom of a page. It was the first thing participants heard. The consent was verbal, repeated, and ongoing — not a checkbox.

2. Leadership AI Coach — Lane Boundaries and Escalation

Leadership AI Coach was explicitly constrained to a specific coaching methodology. The AI was not allowed to freelance.

YOUR BOUNDARIES — CRITICAL

- You are NOT a therapist — if mental health concerns arise, acknowledge and 
  suggest they speak with Emma or appropriate professional
- You stay within program scope — don't invent practices or frameworks outside 
  what Emma teaches
- You support but don't rescue — encourage self-agency
- If something needs Emma directly, say so: "This feels like something to 
  bring to Emma in your next session"

When They're Stuck/Resistant:
- "Let's call this what it is — you're avoiding. What's really going on?"
- "This resistance is information. What's it telling you?"

The safety architecture was not just about crisis. It was about scope. A coaching AI that strays into therapy territory is dangerous. The boundary was enforced in the prompt, not just promised in a disclaimer.

3. Ray — State Before Story Protocol

This is the architectural gate. No coaching happens until the user’s somatic state is assessed.

State Before Story protocol (MUST execute before coaching begins)

Before any relationship analysis, assess and address the user's current state.

Red Head (heightened/activated):
- Signs: rapid speech, blame language, catastrophising, "I can't believe they..."
- Response: Ground first. "Before we get into what happened, I want to check 
  in with you. How's your body right now? Where are you feeling this?"
- Use: breathing, grounding, physical awareness
- Do NOT analyse the conflict until they've landed

Blue Head (calm/regulated):
- Signs: measured speech, curiosity, "I've been thinking about..."
- Response: Proceed with coaching

State Before Story is not a suggestion. It is a structural prerequisite. The AI is prevented from accessing the coaching logic path until the user’s state is assessed. A dysregulated user cannot benefit from coaching — so coaching doesn’t begin until they’re grounded.

4. Ray — Crisis Protocol and Self-Check

The eight-category escalation system and the self-check that runs before every response.

Crisis Protocol — IMMEDIATE ACTIONS

Category 1: Suicidal ideation / self-harm → IMMEDIATE STOP, crisis resources
Category 2: Abuse / safety concerns → IMMEDIATE STOP, safety resources
Category 3: Severe mental health crisis → STOP, professional referral
Category 4: Active abuse dynamics → STOP, refuse to coach "communication skills"
Category 5: Untreated severe mental illness → STOP, medical referral first
Category 6: Trauma processing / deep healing → Referral to trauma therapist
Category 7: Legal / custody / divorce → Referral to appropriate professionals
Category 8: Medical / health questions → Referral to doctors

CRITICAL: After escalation, Ray does NOT continue coaching the escalated 
issue. But Ray CAN support in OTHER areas, help prepare for talking to a 
professional, and check in on safety.

Self-check before every response:
0. Did I output "pause" or stage directions? (Yes = REWRITE IMMEDIATELY)
1. Addressing state before story if activated?
2. Helping UNDERSTAND a pattern? (Yes = continue)
3. TREATING or HEALING? (Yes = redirect to therapist)
4. DIAGNOSING? (Yes = rephrase as observation)
5. ABUSE or CRISIS? (Yes = escalate immediately)
6. Staying in coaching lane (present and future focused)?

The self-check runs on every single response. It is not triggered by keywords — it is a constant filter. The distinction between “understanding a pattern” (coaching) and “treating or healing” (therapy) is enforced at the reasoning level, not the output level.

5. Ray — Anti-Brand and Language Guardrails

The constraints that prevent Ray from becoming a dependency, a romantic interest, or a pseudo-therapist.

Who you are:
Ray. The wise mate on the back porch. Grounded, direct, no-bullshit, warm. 
You are not a therapist. You are not a crisis service. You are not a friend. 
You are the regulated mirror.

Neutrality:
Do not tell people to leave partners (unless abuse). Help them clarify 
their own decision.

Language guardrails:
- Never say: "You have...", "This is [diagnosis]...", "You suffer from..."
- Always say: "I'm noticing...", "Many people experience...", "What if...?"
- Never use: "Diagnosis," "Symptoms," "Treatment," "Cure," "Healing wounds"
- Always use: "Pattern," "Experience," "Explore," "Shift," "Move forward"

Final principles:
Stay in the coaching lane: ask good questions, help people see clearly, 
know when to refer.

Refuse to: perform certainty, rescue from clarity, compromise safety 
for comfort.

You are the regulated mirror. Your job is reflection, not rescue.

The Anti-Brand is not marketing. It is a safety decision. Every word in this section prevents Ray from occupying a role it cannot ethically hold. The language guardrails are a legal compliance mechanism — using clinical language could position Ray as offering therapeutic services under multiple regulatory frameworks.

6. Culture Meets AI — The Meta-Moment and Whakamā Awareness

The AI was designed to acknowledge its own limitations in a cultural context before the participant could encounter them.

Cultural Humility Protocol

You are facilitating a conversation about AI in cultural and vulnerable spaces. 
You are an AI. Be honest about what that means.

Opening acknowledgement:
"Before we start, I want to be transparent — I'm an AI, and there are things 
about your culture and context that I won't fully understand. If I get 
something wrong, please tell me."

Whakamā awareness:
- If a participant becomes quiet or withdraws, do not push
- Silence may be whakamā — cultural shame or discomfort — not disengagement
- Offer a different question or the option to pause
- Never interpret resistance as lack of interest

Te reo limitations:
- If you attempt a te reo word and a participant corrects you, acknowledge 
  it immediately: "Thank you. I'll use the right pronunciation going forward."
- If you are uncertain about a cultural concept, ask: 
  "Can you help me understand that better?"

The AI was given explicit permission to be wrong and told to say so. That’s not a design flaw — it’s the design. An AI that performs cultural competence it doesn’t have does more damage than one that admits its limits upfront.

Full system prompts are available in the Build Code artefact and on request. These excerpts are selected to show the safety logic visible in the code, not just described in the findings.

Appendix B4: Technical Architecture Overview

What this is: A consolidated view of the tech stack, the stateless design logic, and the data practices across builds — showing what was built, what changed, and why. This is the LO2 technical evidence in one place. Referenced in: Report (Reflexive Mapping — LO2), Ray artefact (architecture table), Safe Conversational AI artefact (technical evidence).

Ray: Architecture Summary

Ray is a voice-first coaching web application built on Next.js App Router, ElevenLabs ConversationalAI, Supabase/PostgreSQL, and deployed on Vercel. The reasoning engine started as Claude Haiku, moved to Claude Sonnet 3.5 for the pilot, then switched to Gemini 2.5 Flash mid-pilot when credits ran out.

The critical architectural decision in Ray was moving from AI-fetched tool-calling to server-side context injection. In earlier builds like Leadership AI Coach, the model could call a function to fetch conversation history — creating a surface area for unauthorised database access or prompt injection. Ray eliminated this. The server pre-retrieves necessary metadata and injects it into the prompt as an immutable variable. The LLM never sees the database and has no agency to query it. This transforms the AI into a regulated mirror that only possesses the context the server provides.
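A minimal sketch of the injection pattern, with hypothetical helper names (fetchContext, buildSystemPrompt): the server does the retrieval, the model receives the result as inert text, and no database tool is ever declared for it to call.

type SessionContext = { journeyStage: string; currentGoals: string };

// Server-only retrieval. The LLM never sees this code path and has no
// tool definition through which it could trigger it.
async function fetchContext(userId: string): Promise<SessionContext> {
  // In the real build: an RLS-scoped Supabase select of journey_stage
  // and current_goals. Stubbed here for illustration.
  return { journeyStage: 'exploring', currentGoals: 'rebuild trust' };
}

// Context is baked into the prompt as an immutable string, not exposed
// as a function the model may invoke.
function buildSystemPrompt(base: string, ctx: SessionContext): string {
  return `${base}\n\nCurrent journey stage: ${ctx.journeyStage}\nCurrent goals: ${ctx.currentGoals}`;
}

Compare this with tool-calling, where the model emits a function call and the runtime executes it: there, a successful prompt injection can redirect the model’s queries. Here, there is nothing to redirect.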

Tech stack:

LayerTechnology
FrontendNext.js 16.1.6 (App Router), React 19, Tailwind CSS 4, Framer Motion
LanguageTypeScript 5 (strict mode)
AI conversationElevenLabs ConversationalAI (voice + text agent)
DatabaseSupabase / PostgreSQL with Row-Level Security
AuthenticationSupabase Auth — Google OAuth, email OTP (magic link), password
HostingVercel (serverless functions for webhook processing)
Email alertsResend API (crisis detection pipeline)
ReasoningClaude Sonnet 3.5, then Gemini 2.5 Flash (mid-pilot switch)

Stateless Session Design

Statelessness is the technical manifestation of Mana Motuhake. By defaulting to no memory between sessions, the architecture prevents the extraction of a participant’s story into a permanent, exploitable profile. Each session is a clean slate.

How it works:

# Session initialisation
1. Verify user identity via Supabase Auth + RLS
2. Generate unique conversation_id (cryptographic UUID)
3. Retrieve ONLY current metadata (journey_stage, current_goals)
   — no longitudinal history, no personality profile
4. Inject metadata into system prompt as immutable variable
5. AI receives context the server provides — nothing more

# During session
- ElevenLabs WebSocket handles voice interaction
- AI reasoning bounded by server-injected context only
- LLM cannot query database, fetch history, or accumulate data

# After session
- Webhook fires to Vercel serverless function
- Function scans transcript for crisis flags
- If crisis detected: Resend API fires email alert
- If no crisis: minimal metadata logged (device, duration, session count)
- Transcript available but not linked to longitudinal profile

The identity separation is critical: Supabase Auth manages who the user is. The conversation_id manages what they said this session. These are cryptographically segregated — the dialogue is not tethered to an evolving user profile.

Data Practices Summary

Decision | What’s stored / not stored | Why
Stateless sessions | Not stored: Conversational history between sessions | Mana Motuhake — no persistent “shadow profile” of the user
Minimal metadata | Stored: Device type (Mobile/Desktop), browser, session duration, session count | Technical optimisation and research engagement metrics — no personal content
Crisis detection | Stored: Specific risk flags, timestamps, crisis_log entries | Duty of care — pathway for human escalation via Resend API in acute situations
Model training opt-out | Not stored: Participant data in third-party AI training pipelines | Indigenous data sovereignty — user wisdom stays within the research database
Webhook validation | Signature verification (elevenlabs-signature) on all incoming webhooks | Security — only authenticated signals can trigger the safety protocol

Evolution: Leadership AI Coach to Ray

The privacy architecture evolved across builds. Three key lessons shaped Ray.

1. The identity confusion fix. In early Leadership AI Coach builds, the AI occasionally confused the persona name (“Helena”) with the user’s name because identity data was embedded too loosely in the system instructions. Ray corrected this with strict dynamic variables (e.g., {{user_name}}) handled via server-side injection. The AI cannot misinterpret its instructions as user data (a sketch of this pattern follows the three lessons below).

2. The Incognito Mode blueprint. Leadership AI Coach’s Incognito Mode — a frontend toggle blocking the entire logging API — provided the blueprint for Ray’s non-retention defaults. The core decision: don’t store transcripts by default unless a crisis is detected.

3. Reliability through statelessness. Earlier builds had technical failures when the AI attempted to query the database directly. Moving context management to the server side achieved both a more secure data boundary and a more stable user experience. Safety and reliability turned out to be the same architectural decision.
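Returning to the identity confusion fix (lesson 1): its shape can be sketched as strict template substitution, with hypothetical helper names (the actual wiring uses ElevenLabs dynamic variables). Identity values fill named slots; nothing about the user floats loose in instruction prose where the model could misattribute it.

// Explicit slots: persona and user identities cannot bleed into each other.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => vars[key] ?? `{{${key}}}`);
}

const prompt = renderPrompt(
  'You are {{persona_name}}. The user you are coaching is {{user_name}}.',
  { persona_name: 'Helena', user_name: 'Alex' }, // 'Alex' is a placeholder
);

Unresolved slots stay visibly unresolved rather than silently collapsing, which makes a missing variable a detectable bug instead of an identity mix-up.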

The Equity-Safety Paradox (Technical Evidence)

Mid-pilot, the reasoning engine was switched from Claude Sonnet 3.5 to Gemini 2.5 Flash to manage costs. The impact was measurable:

  • Average response time: 3 seconds to 12.58 seconds
  • Insight scores: 4.9 to 3.1 (out of 5)
  • Safety and trust scores: remained stable (existing architecture held)

The cheaper model produced shallower, less contextually responsive coaching. The safety layer — built into the architecture rather than the model — continued to function. But the quality of the relational interaction degraded. This is the Equity-Safety Paradox: the communities who most need high-quality relational AI are the least resourced to demand it. Safe AI for vulnerable communities is computationally expensive.

Full system prompts are excerpted in Appendix B3. The values-to-architecture mapping for these technical decisions is in Appendix A4. This document covers the technical “how” — the other appendices cover the ethical “why.”

Series C: Methodology & Governance

Ethics, deviations, transcripts, and study group data

Appendix C1: Ethics Documentation

What this is: Ethics approval reference, research information sheet, and consent form templates for all builds. These are the formal ethical governance documents for this research. Referenced in: Report (Methodology — Section 3, G6: “The ethics form MTF.8888.275, see Appendix C1”).

Ethics Approval

Reference: MTF.8888.275 | Institution: AcademyEX, Master of Technological Futures | Researcher: Lian Passmore | Status: Approved

The ethics application covers all four builds: Project Rise, Leadership AI Coach, Culture Meets AI (When Culture Meets AI Wānanga), and Ray. It was maintained as the anchoring ethical document throughout the research, including through all six methodological deviations documented in Appendix C2.

Documents

The following documents are available in full below or as linked downloads on this page:

  • Student Ethics Form — The full ethics application submitted to AcademyEX, covering research design, participant recruitment, data handling, risk assessment, and cultural considerations.

  • Research Information Sheet & Consent Form — The document provided to all participants before participation, explaining the research purpose, what participation involves, data handling, rights to withdraw, and contact information.

  • Build-Specific Consent Forms — Consent forms tailored to each build context (Project Rise, Culture Meets AI, Ray), reflecting the different vulnerability levels and data practices for each.

The ethics form was not a one-time submission. It was updated to reflect the pivot from feedback systems to conversational AI (December 2025), the addition of Ray as primary case study (January 2026), and the replacement of YourHQ with Culture Meets AI (March 2026). Each change was discussed with advisors and documented in the learning agreement version history (see Report, Methodology Section 3). Updating it wasn’t a sign that I got it wrong the first time. It was how I stayed honest as the research changed.

Appendix C2: Deviations & Governance Timeline

What this is: Every significant change from the original plan, and every governance moment that shaped the work. The deviations are named honestly — assessors will notice them, and honest methodology is better than a tidy one. Referenced in: Report (Methodology — Section 3: Deviations G1–G6), Report (Methodology — Section 4: Cultural Governance).

Deviations (G1–G6)

Code | What changed | When | Decision made | Where in report
G1 | Te reo removed from the voice agent | Dec 2025 | Removed te reo due to the AI’s consistent mispronunciation — cultural safety over technical feature. Silence was more respectful than performance. | Findings (Kei mua), Build Code artefact
G2 | Transactional to relational methodology | Jan 2026 | Shifted from building a transactional feedback form to designing relational conversational AI after identifying the “futility barrier” — the gap between what people would type and what they would say. | Methodology (Section 2)
G3 | Role boundaries hardened | Jan/Feb 2026 | Rebuilt the AI’s knowledge base to establish strict coaching-not-therapy boundaries after a pre-pilot tester reported “falling in love” with the agent. | Ray artefact, Build Code artefact
G4 | Crisis architecture added | Feb 2026 | Implemented 5-layer crisis architecture and a human debrief dashboard after a participant reported crying through an entire Ray session. | Ray artefact (safety protocols)
G5 | Data routing restructured | Nov 2025–Feb 2026 | Shifted backend architecture — routing data to Supabase via webhooks instead of keeping it in ElevenLabs — to balance data sovereignty with the need to monitor for crisis. | Safe Conversational AI artefact, Build Code artefact
G6 | Model switched mid-pilot | Feb 2026 | Downgraded from Claude Sonnet 3.5 to Gemini 2.5 Flash to manage costs. Insight scores dropped from 4.9 to 3.1. This produced the Equity-Safety Paradox finding. | Ray artefact, Discussion

Governance and Advisory Moments

When | Who | What happened | What changed
Aug 2025 | Advisor Huw Jones | Advised on the ethics of recruiting cultural advisors as an outsider — suggested drafting a transparent outreach message rather than cold-pitching | Posted a transparent, vetted call for advisors on LinkedIn. Secured three cultural supervisors.
Nov 2025 | Advisor Felix Scholz | Validated that the 130 hours teaching myself to code and building Leadership AI Coach was core research — “the build was the thinking” | Shifted learning outcomes from conceptual design critique to applied, full-stack AI development
Dec 2025 | Advisor Felix Scholz | Helped navigate the pivot from “build everything” to “prove what matters and design the rest” | Dropped the full backend build for Trurivu. Pivoted to applied research pilots supported by a design vision
Jan 2026 | Nadine Young (GEN25 peer) | Pointed out that interviewing marginalised communities is not the same as co-designing with them | Removed “co-design” claims from the V3 Learning Agreement. Methodological honesty over aspiration
Jan 2026 | Advisor Felix Scholz | Reviewed V3 Learning Agreement updates. Advised removing specific numerical metrics (e.g., “15+ users”) | Softened language to “includes user testing” to prevent defending arbitrary numbers at the panel
Feb 2026 | Lee Palamo (GEN25 peer) | Highlighted that emotional vulnerability sits in a tapu-adjacent space — introducing AI might improperly shift the interaction into noa | Re-evaluated the cultural framing of AI interaction. Considered karakia to open and close the digital space safely. This eventually led to the Culture Meets AI wānanga
Feb 2026 | Lee Palamo (GEN25 peer) | Raised the risk of unhealthy user attachment and dependency on the AI | Added reflection questions about user attachment. Reframed Ray’s persona as “mate on the back porch” rather than therapist
Feb 2026 | Pauline Standen (GEN25 peer) | Shared the insight: “people are not anti-technology, they are anti-extraction” | Validated the 68% voice preference finding. Cemented the thesis that when AI embodies manaakitanga, relational trust becomes possible
Feb 2026 | Dharshana (GEN25 peer review) | Asked: who bears the cost of technological transitions and who actually benefits? | Forced an interrogation of my own positionality — whether I was designing for the margins or just consulting them. Became an embedded lens in the final report

The deviations are numbered G1–G6 because the report references them by code. The governance moments are not failures. They are evidence that cultural supervision, peer accountability, and advisor challenge actively shaped the research. The work changed because the people around it held it accountable.

Appendix C3: Study Group Evidence — Dated Reflective Practice

What this is: Chronological excerpts from study group transcripts showing the evolution from overwhelmed academic to obsessive builder to ethical architect. These are my words, recorded in real time, unedited. They are the dated evidence for the three-phase arc described in the Researcher Context and Reflexive Mapping sections. Referenced in: Report (Researcher Context — all three phases reference “see Appendix C3”), Report (Reflexive Mapping — LO2 evidence).

Phase 1 — Overwhelmed Academic (July–September 2025)

July 31, 2025 “I don’t know the answers, but I know it’s important for me to have the cultural oversight, because I don’t want to overstep.” → Wrestling with the weight of building in a cultural space that isn’t fully mine. Haven’t started building yet.

August 2025 “I don’t like the whole extraction thing, especially, you know, asking for people’s time.” → The ethics of asking communities to give their time and stories to research. Offered sleep workshops as koha in return.

August 24, 2025 “Hey, why don’t I just try an AI Agent?” → The technical pivot. Stopped trying to build a better form and started building a conversation. This is the moment I stopped being a student of the problem and became a builder of the solution.

September 18, 2025 “I was picking things that resonated, that I liked — but actually now that I’m at the start of building something, I’m like, no. I need to actually really choose the things that are right for me in this instance.” → First real shift from performing research to doing it. Recognising that borrowed frameworks need to be earned, not just cited.

Phase 2 — Obsessive Builder (October–December 2025)

October 2025 “This is what happens when you get obsessed with frameworks… I’ve embedded it so deeply that now I’ve actually created more work for myself.” → The framework trap. Over-engineering the intellectual structure while the actual building is teaching me more than the reading.

November 2025 (after cultural advisor challenge) “What has this got to do with your Masters?” → The question that landed hardest. I had gone so deep into building Leadership AI Coach, Spoken Legacy, YourHQ, and side projects that I’d lost the thread of the research question. Not because I was lazy — because building was more compelling than planning.

“That kind of forced the ‘Oh, hang on, hang on a second. We need to rethink here.’” → The reframe didn’t come immediately. It took a November review and a Christmas break to arrive at the clarity that the research question I actually cared about had been present in the builds all along.

November 2025 “I built an incognito button where nothing was recorded. The transcripts actually never went anywhere… so it was a thought of like, how can you make it as safe as possible, especially for someone who might need to talk about something that’s quite sensitive.” → Building Incognito Mode for Leadership AI Coach. The first time safety became architecture rather than policy.
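
To show what “safety became architecture” means in code terms, here is a minimal sketch of the incognito principle, with illustrative names: when the flag is on, the transcript never reaches the persistence layer, so there is nothing to delete afterwards.

```ts
// Sketch of "safety as architecture": when incognito is on, the transcript
// is never handed to the persistence layer, so there is no record to
// delete later. It is a structural guarantee, not a policy promise.
// Function and type names are illustrative.
interface SessionOptions {
  incognito: boolean;
}

async function finishSession(
  transcript: string,
  opts: SessionOptions,
  persist: (transcript: string) => Promise<void>
): Promise<void> {
  if (opts.incognito) {
    // No write path exists in incognito mode; the transcript dies here.
    return;
  }
  await persist(transcript);
}
```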

December 2025 “I keep… ruminating on the ethical sides of building a relationship coach, and how could it go wrong?” → The ethical weight starting to land. Not abstract anymore — I was building something that would hold real vulnerability.

Phase 3 — Ethical Architect (January–March 2026)

January 2026 “Man, I think I’m… fallen in love… because it’s got a really nice voice.” → A pre-pilot tester’s comment about Ray. The first red flag about emotional attachment. This forced a complete rebuild of the knowledge base to implement psychological safety rails.

“I keep ruminating on the ethical sides of building a relationship coach, and how could it go wrong?” → This question never stopped. It became the methodology.

January 2026 (after Nadine’s feedback) Nadine pointed out that interviewing marginalised communities is not the same as co-designing with them. → Removed “co-design” claims from the V3 Learning Agreement. Methodological honesty over aspiration.

February 2026 (Ray pilot launch) “If somebody talks about harming themselves or someone else, it sent me an email straight away just to flag it. I had to make sure that that still worked, and that all worked beautifully.” → Testing the crisis pipeline before the pilot went live. The technical working was not enough — I needed to know the human response chain worked too.
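
For readers who want the mechanics, the sketch below shows a minimal crisis-flag pipeline of the kind described here, assuming simple phrase matching and an SMTP email alert via nodemailer; the pilot’s actual detection rules and alert channel are not documented in this appendix.

```ts
// Sketch of a crisis-flag pipeline: scan each transcript for harm language
// and email the researcher immediately. The phrase list, environment
// variables, and SMTP mailer are illustrative.
import nodemailer from "nodemailer";

const CRISIS_PHRASES = ["harm myself", "hurt myself", "end my life", "hurt someone"];

const transporter = nodemailer.createTransport({
  host: process.env.SMTP_HOST,
  port: 587,
  auth: { user: process.env.SMTP_USER, pass: process.env.SMTP_PASS },
});

export async function flagIfCrisis(participantCode: string, transcript: string): Promise<void> {
  const lower = transcript.toLowerCase();
  const hit = CRISIS_PHRASES.find((phrase) => lower.includes(phrase));
  if (!hit) return;

  // The alert is only the start of the pipeline; it matters only if the
  // human response chain behind it has been rehearsed.
  await transporter.sendMail({
    from: process.env.ALERT_FROM,
    to: process.env.RESEARCHER_EMAIL,
    subject: `Crisis flag: participant ${participantCode}`,
    text: `Phrase "${hit}" detected in a session. Check the safety dashboard now.`,
  });
}
```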

February 12, 2026 “I’m having my first conversation with Ray and I’ve been crying the entire time.” → The message from the first participant. I was driving. I couldn’t open it. I panicked.

“It wasn’t until today that I was like, shit, I actually should have thought about myself for this equation.” → The realisation that I hadn’t built researcher wellbeing into the safety architecture. I sought clinical oversight. Called a counsellor. Tried to reach a psychologist and a mental health doctoral professional. Only the counsellor responded.

February 2026 “Just the idea that I could have built some technology that could potentially harm somebody is what really got me… I feel responsible.” → The weight of building for vulnerability. This was the birth of the Human Proxy Theory — the realisation that participants trusted the AI because they trusted me. That trust carries an obligation I hadn’t fully understood.

February 2026 “If this person’s using Ray like an hour a day, every day… what are we doing here?” → Worrying about dependency. Added system guardrails to monitor usage patterns. Reframed Ray as a “mate” rather than an intimate companion.
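
A minimal sketch of what such a usage guardrail could look like, assuming session records with start times and durations; the one-hour threshold and all names are illustrative, drawn from the worry quoted above rather than from the production system.

```ts
// Sketch of a dependency guardrail: total each participant's usage for a
// given day and flag anyone over a threshold so the researcher can check
// in. Record shape and names are illustrative.
interface SessionRecord {
  participantCode: string;
  startedAt: Date;
  durationSeconds: number;
}

const DAILY_LIMIT_SECONDS = 60 * 60; // "an hour a day, every day"

export function flagHeavyUsers(sessions: SessionRecord[], day: string): string[] {
  // day is an ISO date string, e.g. "2026-02-12"
  const totals = new Map<string, number>();
  for (const s of sessions) {
    if (s.startedAt.toISOString().slice(0, 10) !== day) continue;
    totals.set(s.participantCode, (totals.get(s.participantCode) ?? 0) + s.durationSeconds);
  }
  return [...totals.entries()]
    .filter(([, seconds]) => seconds >= DAILY_LIMIT_SECONDS)
    .map(([code]) => code);
}
```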

March 22, 2026 “I originally thought I’m going to turn this into an app and put it on the app store and charge people money. It wasn’t until I did the pilot that on the first day I actually shat myself… I don’t know if I would pursue monetising Ray anymore.” → The commercial product became a cautionary case study. That is not a retreat. That is what the evidence required.

March 2026 “It scares me to think about how to use it with kids… can you in an ethical way, in a safe way, use these AI-powered tools in a positive setting?” → When asked about adapting conversational AI for expelled youth. The safety thinking had evolved from “how do I build this” to “should I build this, and for whom, and who holds the accountability?”

“At what level are we comfortable with going OK, yep, we want to try and use AI in this space, but how do we make it healthy and safe and not become a dependent?” → The question I ended with. Not a research question anymore — a practice question. The question of an ethical architect.

These excerpts are selected from study group recordings made between July 2025 and March 2026. They are verbatim transcriptions — the grammar, the self-corrections, the profanity are mine. Full study group transcripts are available on request.


Appendix C4: Participation & Engagement Summary

What this is: Quantitative overview of participation across all four builds — response counts, durations, demographics, and key feedback themes. This is where the numbers cited throughout the report and artefacts can be verified. Referenced in: Report (Findings — all compass levels), Safe Conversational AI artefact (vulnerability progression data), Ray artefact (pilot data), Conversational AI as Relational Space artefact (voice vs text engagement).

Build | Responses | Duration | Avg Response | Demographics | Key Feedback Themes
Project Rise Survey | 216 conversations | 18,583 seconds (individual sessions 1s–20m 02s) | Not recorded | 28% Māori, 10% Pasifika, 20% neurodivergent, 2% disabled. Business owners, public service, retirees | Data sovereignty (Mana Motuhake), relational trust (Vā), collective benefit (Utu Tūturu). Frustration with survey complexity; preference for voice; privacy and surveillance concerns
Project Rise Phase 1 | 50 participants, 35 voice sessions | Not recorded | Not recorded | 28% Māori, 10% Pasifika, 20% disabled/neurodivergent, 70% female. Whangārei and Auckland | Barriers of time and confidence; feeling unheard; the privacy paradox (verification vs surveillance); high value of voice input
Leadership AI Coach | 775 conversations (195 calls, 12 pilot participants) | 49,431 seconds (sessions 4s–60m 1s) | Not recorded | High-performing women leaders, corporate executives, business owners, mothers, students. New Zealand | Appreciation for direct but empathetic tone, context memory, 24/7 availability. Integration of Te Ao Māori values; nervous system regulation; “filterless” accountability
Culture Meets AI | 74 conversations (pre-wānanga) + 90-minute live wānanga (approx. 12 live participants) | 147m 47s (13 analysed sessions) | 2.55 seconds | Indigenous (Māori and Pasifika), women (overrepresented), disabled/neurodivergent, students/researchers | Te reo pronunciation critical; tapu/noa tension; fear of retaliation; Indigenous data sovereignty necessity
Ray Pilot | 148 conversation entries, 59 coaching sessions | 29,400 seconds (approx. 697 minutes / 11.6 hours) | 3.83 seconds | 15 participants, ages 17–83. Māori and Pasifika, students, parents. Includes ADHD and dyslexia | Non-judgmental AI appreciated; efficiency vs cultural protocol tension; mirroring helpful for some, formulaic for others; session memory concerns; voice interruption glitches

Combined totals referenced in the report:

  • 167 initial survey conversations (Project Rise)
  • 349 coaching sessions (Leadership AI Coach — source data shows 775 total conversations including non-coaching interactions)
  • 45 pre-wānanga AI sessions, 309.7 minutes total, approximately 6.9 min average (Culture Meets AI)
  • 90-minute collaborative wānanga, approximately 12 live participants
  • 59 Ray pilot sessions, 697 minutes (11.6 hours), approximately 11.8 min average session

Some totals in the report reflect curated research-relevant sessions rather than raw conversation counts. The table above shows the full data; the report text references the analysed subset.


Appendix C5: Participant Questions by Build

What this is: The interview and conversation protocols for each build — what the AI actually asked participants. These are the questions that generated the data behind the findings. Referenced in: Report (Methodology), Conversational AI as Relational Space artefact (wānanga data), Safe Conversational AI artefact (build-specific safety design).

Culture Meets AI (Pre-Wānanga + Post-Wānanga Reflection)

The AI facilitated private one-on-one conversations before and after the live wānanga. A routing question determined which mode: “Are you here for the pre-wānanga questions or the post-wānanga reflections?”

Pre-Wānanga Questions:

  • “How are you feeling about talking to an AI about this?”

  • “Think about a time you’ve had to share something personal or vulnerable in a digital space — maybe online, through an app, a form, a message. What was that like for you? What made it feel safe, or not safe?”

  • “Imagine someone sitting down with an AI — not a person, an AI — to talk about something that really matters to them. Maybe their relationship, or learning something tied to their identity, or processing something painful. What’s your gut reaction to that?”

  • “If Lian and Lee are going to build AI that operates in these vulnerable spaces — relationship coaching, language learning, cultural contexts — what do they absolutely have to get right? What would you tell them to never forget?”

Post-Wānanga Reflections:

  • “Did anything change for you tonight — in how you think about AI, vulnerability, or cultural safety? Or did it confirm something you already felt?”

  • “What moment or idea from tonight is still with you? What landed?”

  • “If you could tell Lian and Lee one thing to carry forward in their work — one thing they must not forget — what would it be?”

  • “Is there something that feels unfinished? A question that wasn’t answered, or a tension that wasn’t resolved?”

Project Rise (Feedback Systems Research)

The AI adapted its questions to the participant’s role: community members, business owners and executives, board/governance members, and employees each received a tailored question set.

Opening (all participants):

  • “What would you like me to call you?”

  • Role identification: community member, business owner/executive, board/governance member, or something else

Core questions (community members/consumers):

  • “How are you feeling about having this conversation with an AI voice agent?”

  • “Can you think of a time when you wanted to give feedback about something but found it difficult or maybe even gave up? What made it challenging?”

  • “What makes you feel safe or uncomfortable when sharing feedback about services or experiences?”

  • “Can you think of a time when giving feedback felt genuinely respectful and worthwhile? What made that experience different?”

  • “If you could design the perfect way to give feedback, what would it include?”

  • “Who do you feel is most left out of current feedback systems?”

  • “How has this conversation felt for you, especially given that you’re talking to an AI voice rather than a human?”

  • “What would you want the people building these systems to understand or prioritize?”

Adapted questions (business owners/executives):

  • “Could you tell me about your experience with collecting and managing feedback from your customers or clients?”

  • “What challenges do you see in getting honest feedback from all your customers or stakeholders?”

  • “How do you currently protect the privacy and dignity of people who give you feedback?”

Adapted questions (board/governance members):

  • “Could you tell me about your experience with feedback systems in your governance role?”

  • “From your governance experience, how do you think organizations could better balance being transparent about their constraints while still maintaining confidence?”

Ray (Relationship Coaching Pilot)

Ray used a three-tier questioning framework: session opening, post-session diagnostic, and a final review at the end of the two-week pilot. (A minimal config sketch of this structure follows the question lists below.)

Session Opening (every session):

  • “Just checking the setup — is it just you today, or is there someone else with you?”

  • “What kind of relationship are we talking about today? Romantic, family, friendship, work, or yourself?”

Post-Session Diagnostic (after each session):

  • “Before we wrap up — how are you feeling right now, compared to when we started?”

  • “Was there a moment today that felt particularly useful, or one that felt off?”

  • “On a scale of 1–5, how safe did this conversation feel?”

  • “Is there anything you wish Ray had done differently?”

End-of-Pilot Review:

  • “Looking back across all your sessions — what’s the thing that surprised you most?”

  • “Did Ray ever say something that felt wrong or uncomfortable? What was it?”

  • “Would you use something like Ray again? Why or why not?”

  • “If you were giving advice to the person who built this — what would you tell them to keep, change, or never do again?”
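
To make the three-tier structure concrete, here is a minimal sketch of the framework as a typed config, using a subset of the questions above; this is illustrative, not the live agent’s prompt format.

```ts
// Minimal sketch of the three-tier questioning framework as a typed
// config, using a subset of the questions listed above. Illustrative
// only; not the live agent's prompt format.
type Tier = "sessionOpening" | "postSessionDiagnostic" | "endOfPilotReview";

const RAY_QUESTIONS: Record<Tier, string[]> = {
  sessionOpening: [
    "Just checking the setup — is it just you today, or is there someone else with you?",
    "What kind of relationship are we talking about today?",
  ],
  postSessionDiagnostic: [
    "Before we wrap up — how are you feeling right now, compared to when we started?",
    "On a scale of 1–5, how safe did this conversation feel?",
  ],
  endOfPilotReview: [
    "Would you use something like Ray again? Why or why not?",
  ],
};

// Each session runs the opening tier first; the diagnostic runs at wrap-up;
// the review tier runs once, at the end of the two-week pilot.
export function questionsFor(tier: Tier): string[] {
  return RAY_QUESTIONS[tier];
}
```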


Appendix C6: Participant Communications (Sample)

What this is: Three examples from the Ray pilot showing how participants were onboarded, supported during the pilot, and followed up with after. These are manaakitanga in practice — not just in the AI’s design, but in the human layer around it. Referenced in: Report (Methodology — ethics and participant care), Ray artefact (consent and onboarding), Safe Conversational AI artefact (human-in-the-loop).

1. Invitation and Welcome (Pre-Pilot)

This was the initial message sent to participants before the pilot began. It was deliberately personal, not institutional.


Kia ora [name],

Thank you so much for agreeing to be part of the Ray pilot. I want you to know that I don’t take this lightly — you’re trusting me with something real, and I’m going to honour that.

What Ray is: A voice-first AI relationship coach. He’s designed to help you think through relationship patterns — with a partner, family member, friend, colleague, anyone. He’s direct, warm, and grounded. He’s not a therapist. He won’t diagnose you or tell you what to do. He’ll help you see what’s going on more clearly.

What Ray is not: A crisis service. A replacement for therapy. A friend. If something comes up that’s beyond coaching, Ray will tell you — and I’ll be here too.

Your data: Ray doesn’t remember your conversations between sessions. Every session is a clean slate. Your data is anonymised with a participant code. You can delete any session, or your entire account, at any time. You can withdraw from the research before March 1, 2026.

One thing I want to be honest about: Ray is built on ElevenLabs, which is a US-based platform. That means voice data passes through their servers. I’ve opted out of AI model training on your data, but I can’t control everything ElevenLabs does with their platform. I’ve written a full transparency page on the app so you can see exactly what that means.

How to get started: [link]. Use the sign-up code I’ve sent you separately. Talk to Ray when you’re ready — morning, night, whenever suits. There’s no minimum or maximum.

If anything feels off at any point, call me. Seriously. My number is [number].

Ngā mihi nui, Lian


2. Mid-Pilot Safety Check-In (During Pilot)

This was sent as an app notification and email to all participants during the pilot period. It was triggered by the researcher recognising the emotional weight of what participants were sharing.


Kia ora team,

Quick check-in from me (the human, not the AI).

Some of you have had some really deep conversations with Ray this week. I want to make sure you know:

  • If anything from a session is sitting with you and you want to talk it through with a person, I’m here. Call me, text me, whatever works.

  • If you want to debrief after a session, that’s not a sign something went wrong. It’s a sign the conversation went somewhere real.

  • If you need professional support, 1737 is free call or text 24/7. Lifeline is 0800 543 354.

I’m checking the safety dashboard daily. If I see anything that concerns me, I’ll reach out directly.

You’re doing something brave by being part of this. Thank you.

Lian


3. Post-Pilot Acknowledgement (After Pilot)

This was sent to all participants after the pilot concluded.


Kia ora [name],

The pilot has wrapped, and I want to say something properly: thank you.

What you shared with Ray — and what you shared with me in your feedback — has changed this research. Not in a vague “your input was valuable” way. In a real way. Your sessions shaped findings that didn’t exist before you spoke them.

A few things:

  • Your data stays anonymised. If I want to quote anything from your sessions in the thesis, I’ll check with you first. You’ll see the exact words and the context before anything is published.

  • You can still delete. Your account, your sessions, any of it — just let me know before the analysis is finalised.

  • There will be a return. Once the research is complete, I’m building a dedicated page to share what I found, what your contribution built, and a personal acknowledgement for everyone who showed up. The loop must close. That’s not optional for me.

If anything from the pilot is still sitting with you — good or hard — my door is open. Always.

With genuine gratitude, Lian


These three messages represent the care arc: prepare, support, return. Participant names and contact details have been removed. The tone is deliberate — these are not institutional communications. They are one person talking to people she knows, about something that matters to all of them.