As the Founding Product Designer at an engineer-led startup, I championed user-centered design, transforming passive live captioning into an interactive AI second-screen ecosystem that reduced cognitive overload for attendees of trilingual government summits.

AI

B2B + SAAS + STARTUP

FOUNDING PRODUCT DESIGNER

UX + STRATEGY

2023

CPII AI Meeting Assistant

Designing Real-Time AI Transcription for High-Stakes Multilingual Events

AI model for navigating the chaos of trilingual “code-mixing”

Hong Kong’s linguistic landscape is one of the most complex in the world. Professionals weave Cantonese, English, and Mandarin into single sentences, a “code-mixing” challenge that breaks standard AI models. The CPII Meeting Assistant is engineered to bridge this gap. Designed specifically for this multilingual reality, it ensures accuracy where it matters most: at critical government summits and high-level ceremonial events.

CONTEXT

Journalists & Broadcasters

Professionals covering high-stakes government policy addresses who need to capture precise quotes in real time, without errors.

The Public

Viewers of major ceremonial events (e.g., the Government Policy Address) who struggle to follow complex, trilingual speeches in real time.

Workflow Fragmentation in Live Journalism

PROBLEM + REQUIREMENTS

The core problem we set out to solve was workflow fragmentation: the fractured experience of journalists juggling recorders, notepads, and translation apps while trying to stay present in a fast-moving event. Stop to transcribe a quote, and you miss the next sentence. Switch to translate, and you lose your place entirely.


We wanted to create a space where capturing, understanding, and synthesizing could happen simultaneously, without breaking flow.

Input

Live code-mixed audio (Cantonese/English/Mandarin)

Processing

Live Stream: real-time transcription + translation

Context: rolling AI summaries for instant catch-up

Chat: bounded chatbot for fact-checking

Output

Synthesis: automated post-event report and notes
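For illustration, the input-processing-output contract above could be modeled roughly as follows. This is a minimal sketch; every type and field name is hypothetical and not drawn from the actual CPII codebase.

```typescript
// Hypothetical data model for the stream-to-synthesis pipeline.
// All names are illustrative; they are not the actual CPII types.

type Language = "yue" | "en" | "cmn"; // Cantonese, English, Mandarin

interface TranscriptSegment {
  id: string;
  startMs: number;            // offset from event start
  endMs: number;
  sourceText: string;         // code-mixed text as spoken
  detectedLanguages: Language[];
  translations: Partial<Record<Language, string>>;
}

interface RollingSummary {
  asOfMs: number;             // summary covers the event up to this point
  bulletPoints: string[];     // for instant catch-up
}

interface ChatExchange {
  question: string;
  answer: string;
  citedSegmentIds: string[];  // bounded: answers must cite the transcript
}

interface EventReport {
  title: string;
  keyQuotes: TranscriptSegment[];
  summary: string;
  notes: string[];
}
```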

When I first joined, the product focused on providing real-time transcription and translation as live subtitles during events. However, AI latency and hardware constraints limited scalability. Through contextual inquiries and user interviews, I helped pivot our strategy from generic subtitles to solving a critical workflow problem for journalists covering multilingual events.

Product Evolution

Early product: event subtitles

When I first joined, the product focused on live subtitles shown during events.

Post-event summary

Pivoted the product direction to target journalists, introducing post-event summaries.

The event workspace

Integrated capturing and synthesizing into a scalable, adaptive workspace.

The “Stream-to-Synthesis” Mental Model

The mental model of the CPII Meeting Assistant was built around the journalist’s cognitive workflow during and after an event, transforming a chaotic live stream into structured intelligence:

Augmented Input: Transforming fleeting audio into navigable, real-time text.

Active Processing: Empowering users to interrogate the data and capture context via Chat without losing the live thread.

Structured Output: Crystallizing fragmented insights into verified reports and audio clips.


By grounding the design in this Input → Processing → Output flow, we gave users a stable anchor within the high-velocity environment, keeping complex AI interactions digestible and intuitive.
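To make the flow concrete, here is a minimal sketch of how a live segment might move through that loop, reusing the hypothetical types from the earlier sketch. The function names and shapes are illustrative assumptions, not the shipped system.

```typescript
// Hypothetical end-to-end flow, echoing Input → Processing → Output.
// Reuses the illustrative types defined in the earlier sketch.

interface Processors {
  summarize: (segments: TranscriptSegment[]) => RollingSummary;
  report: (segments: TranscriptSegment[], notes: string[]) => EventReport;
}

// Input → Processing: each incoming segment updates the rolling summary
// so a user who looks away can catch up instantly.
function onNewSegment(
  buffer: TranscriptSegment[],
  segment: TranscriptSegment,
  p: Processors
): RollingSummary {
  buffer.push(segment);
  return p.summarize(buffer);
}

// Processing → Output: the full buffer plus the user's notes crystallize
// into the structured post-event report.
function onEventEnd(
  buffer: TranscriptSegment[],
  notes: string[],
  p: Processors
): EventReport {
  return p.report(buffer, notes);
}
```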

SOLUTION

To eliminate the “tool-switching fatigue” of live reporting, I designed a Dynamic Panel System where consumption and verification coexist. The interface anchors the Live Feed as the persistent core, while the Intelligence Layer (Chat & Summary) scales fluidly alongside it. This allows journalists to interrogate data and fact-check in real time without ever obscuring the context of the stage.

Standard

The default layout, balancing input capture and synthesis.

Transcription & Translation

Optimized for concentrating on event content through real-time, navigable text.

Input + Note

Ideal for synthesizing the real-time summary and documenting it as notes.

Dynamic Panel Structure • During Event

Dynamic Panel Structure • After Event

Post-event, the platform transitions from Capture to Analysis mode. In this state, the Bounded AI Chat becomes a verification engine for journalists to interrogate specific details, while the automated Meeting Report handles the heavy lifting of structural synthesis.
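One way to read “bounded” here is that the chat answers only from the event’s own transcript, citing segments instead of free-associating. Below is a minimal sketch of that constraint, reusing the hypothetical types from earlier; the retrieval and prompting details are illustrative assumptions, not the production implementation.

```typescript
// Hypothetical sketch of a "bounded" chat query: retrieve transcript
// segments first, then instruct the model to answer only from them.

async function askBoundedChat(
  question: string,
  transcript: TranscriptSegment[],
  llm: (prompt: string) => Promise<string>
): Promise<ChatExchange> {
  // Naive keyword retrieval; a real system would likely use embeddings.
  const terms = question.toLowerCase().split(/\s+/).filter(Boolean);
  const relevant = transcript.filter((seg) =>
    terms.some((t) => seg.sourceText.toLowerCase().includes(t))
  );

  const context = relevant
    .map((seg) => `[${seg.id}] ${seg.sourceText}`)
    .join("\n");

  const prompt =
    "Answer ONLY from the transcript excerpts below. " +
    "Cite segment ids in brackets. If the answer is not there, say so.\n\n" +
    context + "\n\nQuestion: " + question;

  const answer = await llm(prompt);
  return { question, answer, citedSegmentIds: relevant.map((s) => s.id) };
}
```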

Standard

The default layout, balancing inputs and synthesis.

Processing through Chat

Optimized for referencing the transcription while synthesizing event content.

Notes & Meeting Report

Automated post-event synthesis via AI Meeting Reports, consolidating insights from the live transcription and chat.

Output Panel

The transformation space where raw event inputs become outputs. Users can take notes, organize insights, and package their synthesis into shareable work.

Chat Panel

The chat panel remains central to the post-meeting experience, while shrinking during live events to serve as a lightweight tool for quickly checking event content.

Input Panel

The real-time transcription and translation of the event. It serves as the foundation of the user experience, providing the information source for both the chat panel and the post-meeting report.
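The panel behavior described above amounts to a simple layout policy: the event phase drives how much space each panel receives. The sketch below illustrates that policy; the phase names and width values are hypothetical, chosen only to mirror the descriptions above.

```typescript
// Hypothetical layout resolver: the event phase drives each panel's
// emphasis, echoing the dynamic panel system described above.

type Phase = "live" | "post";
type Panel = "input" | "chat" | "output";

// Relative widths per phase; the numbers are illustrative only.
const LAYOUTS: Record<Phase, Record<Panel, number>> = {
  live: { input: 0.55, chat: 0.15, output: 0.3 }, // chat shrinks during events
  post: { input: 0.3, chat: 0.4, output: 0.3 },   // chat becomes central
};

function panelWidth(phase: Phase, panel: Panel): string {
  return `${Math.round(LAYOUTS[phase][panel] * 100)}%`;
}

// e.g. panelWidth("live", "chat") === "15%"
```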