<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>W3C Workshop on Smart Voice Agents</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="style.css">
<meta name="twitter:site" content="@w3c">
<meta name="twitter:card" content="summary_large_image">
</head>
<body>
<header class="header">
<div id="banner">
<div>
<p>
<a href="https://www.w3.org/">
<img alt="W3C" src="https://www.w3.org/Icons/w3c_home" height="48" width="72">
</a>
</p>
<div id="banner-image">
<h1>W3C Workshop on<br>Smart Voice Agents</h1>
<h2>February 2026, Virtual on Zoom</h2>
</div>
<p>A virtual event with talks and interactive sessions, February 2026</p>
</div>
</div>
<nav class="menu" id="menu">
<ul>
<li><a href="index.html">Call for Participation</a></li>
<li><a href="speakers.html">Apply as a speaker</a></li>
<li><a class="active-tab">Agenda</a></li>
<!--li><a href="sponsor.html">Sponsorship</a></li-->
<!--li><a href="talks.html">Talks</a></li-->
<!--li><a href="attendees.html">Attendees</a></li-->
<!--li><a href="https://www.w3.org/2021/06/25-smart-cities-minutes.html">Minutes</a></li-->
<!--li><a href="report.html">Report</a></li-->
</ul>
</nav>
</header>
<aside class="box" id="important">
<h2 class="footnote">Virtual event!</h2>
<p>This workshop will be a fully virtual event with talks, online discussions, and interactive sessions.</p>
<p>With the consent of the speakers, recordings and slides of the talks will be made available after the workshop.</p>
</aside>
<aside class="box" id="dates">
<h2 class="footnote">Important Dates</h2>
<dl>
<dt>20 October 2025</dt>
<dd><a href="index.html">Call-for-Participation page</a> published and participant registration opens.</dd>
<dt>11 December 2025</dt>
<dd>Deadline for submitting a proposal for a talk during the workshop
</dd>
<dt>16 December 2025</dt>
<dd>Speakers acceptance notification</dd>
<dt>23 January 2026</dt>
<dd>Publish the basic agenda for the workshop</dd>
<dt>
5 February 2026 (<span class="updates">deadline extended!</span>)
</dt>
<dd>Deadline for selected speakers to provide their presentation slides and recorded talks</dd>
<dt>19 February 2026</dt>
<dd>Public release of all accepted talks</dd>
<dd>
Deadline to register as workshop
participant
</dd>
<dt>25-27 February 2026</dt>
<dd>Workshop (virtual)</dd>
<dd>We will try to accommodate different time zones as much as
possible.</dd>
</dl>
</aside>
<main id="main" class="main">
<section id="home">
<p>This is the agenda of the W3C Workshop on Smart Voice Agents.</p>
<p>
The workshop is free. To help us prepare and run the event
effectively, please see the <a href="index.html">Call for
Participation</a>.<br /> There will be three half-day sessions on
February 25, 26, and 27, 2026.
</p>
<h2>Agenda Overview</h2>
<ol>
<li><a href="#session1">Session 1: February 25, 2026, 12:00
GMT-5</a></li>
<li><a href="#session2">Session 2: February 26, 2026, 10:00
GMT+1</a></li>
<li><a href="#session3">Session 3: February 27, 2026, 12:00
GMT-5</a></li>
</ol>
</section>
<section id="session1">
<h2>Session 1: February 25, 2026, 12:00 GMT-5</h2>
<h3 id="session1-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 09:00 AM</li>
<li>MST (Mountain Standard Time): 10:00 AM</li>
<li>EST (Eastern Standard Time): 12:00 PM</li>
<li>CET (Central European Time): 6:00 PM (18:00)</li>
<li>ISR (Israel Standard Time): 7:00 PM (19:00)</li>
<li>JST (Japan Standard Time): 2:00 AM (Next Day)</li>
</ul>
<h3 id="session1-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Governance and Greenlights: Leveraging the '3 Ps' to
Standardize Trust, Scale, and Usability in Voice Agent Web
Integration (20min)</dt>
<dd><strong>Talk:</strong> Patricia Lee (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=kT_Z5NWyi5Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd>
<strong>Bio:</strong> Growing AI Engineer with IT Executive
Management experience. See also <a
href="https://www.linkedin.com/in/patriciaminglee/">https://www.linkedin.com/in/patriciaminglee/</a>.
</dd>
<dd><strong>Abstract:</strong> The expansion of Voice Agent
technologies across diverse platforms—from smart speakers to mobile
browsers—is currently blocked by issues in usability, security, and
system interoperability. This presentation will leverage an
extensive background in Product, Program, and Project Management
(the '3 Ps') combined with Lean Six Sigma expertise to present a
framework for driving effective Web standardization. The talk will
focus on identifying the critical stakeholder needs that must
anchor any standardization effort, moving beyond technical
specifications to address real-world GRC challenges. The talk will
explore two major barriers: (1) Establishing a cyber-resilient and
compliant trust model that satisfies both user privacy concerns and
evolving regulatory requirements, (2) Applying process optimization
principles to define measurable requirements for supporting various
dependencies and capabilities. This data-driven, methodological
approach provides the necessary structure to clarify reasonable
applications for Voice Agents and deliver the technical clarity
needed for developers, standards bodies, and regulators to align
and move the Web Voice Agent ecosystem toward trusted, scalable
adoption.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Solving Lead vs. Lead: Consistent Pronunciation for Web
Content (20min)</dt>
<dd><strong>Talk:</strong> Sarah Wood (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=Bc04fXrR1U4&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> read-aloud/TTS and dyslexia expertise and
standards development</dd>
<dd><strong>Abstract:</strong> Pronunciation in web content suffers
from the lack of a standardized way to specify Speech Synthesis
Markup Language (SSML) in HTML. A standard would benefit assistive
technologies, voice agents, and AI systems learning from web
content. We present a recent solution from the EdTech standards
community and seek discussion of approaches to more broadly ensure
reliable, interoperable pronunciation for inclusive, accessible
voice experiences across the web.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Hallucination in Automatic Speech Recognition Systems
(20min)</dt>
<dd><strong>Talk:</strong> Bhiksha Raj (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=BT1kd0Agv5Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Speech recognition and related areas
</dd>
<dd><strong>Abstract:</strong> We discuss the problem of hallucination
in ASR systems, including their definition, description,
quantification and mitigation.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Multi-Agent Conversational Methodology (25min)</dt>
<dd><strong>Talk:</strong> Emmett Coin (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=NKPnx73RF6I&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a
href="https://www.linkedin.com/in/emmettcoin/">https://www.linkedin.com/in/emmettcoin/</a>
</dd>
<dd><strong>Abstract:</strong> As part of the Open Floor Protocol (OFP)
standards group with the Linux Foundation, I am working on a common
protocol for multi-agent conversational systems. We have
demonstrated multi-agent coordination with a human participant and
are working toward multi-agent and human conversational support
that is fully collaborative across all participants. I will show
how OFP works and why it is important.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>6. Reimagining Standards for Voice AI: Interoperability
Without Sacrificing Innovation (20min)</dt>
<dd><strong>Talk:</strong> RJ Burnham (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=wSbdFGzZZhs&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> RJ Burnham has spent more than two decades
shaping the evolution of voice and conversational technology. He
was a member of the W3C Voice Browser Working Group and later
chaired the Call Control Working Group, contributing directly to
the creation of VoiceXML and CCXML. As CTO at Voxeo, he helped
drive the adoption of open standards and large-scale voice
platforms. His later work in computer vision and neural networks
gave him an early view into the foundations that eventually became
modern large language models, which pulled him back into voice as
President and CTO of Plum, where he helped build one of the first
generative AI voice platforms. RJ is now the founder of Consig AI,
where he’s applying the latest generation of voice AI and decades
of experience to solve real problems in healthcare communication.
</dd>
<dd><strong>Abstract:</strong> Proprietary voice AI platforms emerged
largely because the industry moved far beyond the directed-dialog
world where VoiceXML excelled. As we shifted from structured
dialogs to intent-based systems and now to LLM-driven agents,
vendors built closed stacks to move quickly, but the result is
fragmentation and lock-in. With voice systems becoming more capable
and more central to real workflows, it’s worth asking whether the
pendulum should swing back. What standards, if any, make sense in
this new landscape? Can we design a shared foundation that restores
the interoperability and portability we once had, without slowing
innovation? This talk explores whether the time is right for a new
generation of voice AI standards and what principles should guide
them.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>7. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>8. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>9. Breakout Group Results (50min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>10. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/25-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
<section id="session2">
<h2>Session 2: February 26, 2026, 10:00 GMT+1</h2>
<h3 id="session2-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 01:00 AM</li>
<li>MST (Mountain Standard Time): 02:00 AM</li>
<li>EST (Eastern Standard Time): 04:00 AM</li>
<li>CET (Central European Time): 10:00 AM</li>
<li>ISR (Israel Standard Time): 11:00 AM</li>
<li>JST (Japan Standard Time): 6:00 PM (18:00)</li>
</ul>
<h3 id="session2-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Towards Smarter Voice Interfaces: Using Grounding and
Knowledge (20min)</dt>
<dd><strong>Talk:</strong> Kristiina Jokinen (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=8Pmxmn7gCsA&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a href="https://blogs.helsinki.fi/kjokinen/">https://blogs.helsinki.fi/kjokinen/</a>
</dd>
<dd><strong>Abstract:</strong> In this talk I will discuss the design
and development of trustworthy GenAI-based applications, and in
particular, focus on grounding (i.e. anchoring conversations,
perceptions, and knowledge in a shared context) as a key principle
of spoken interaction management. I will review challenges,
opportunities, and lessons learnt in creating accountable reasoning
and fluent natural dialogue between Smart AI agents and users, to
support long-term interactions and trust for responsible and safe
AI agents.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Accessibility of 3D and Immersive Content via Voice
Interaction (25min)</dt>
<dd><strong>Talk:</strong> Zohar Gan (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=Cq2Is-uIuNU&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd>
<strong>Bio:</strong> Solutions architect and software developer
researching accessibility of 3D content for people with
disabilities (as a social impact project). Presented at the W3C
Workshop on Inclusive Design for Immersive Web Standards in 2019.
As a follow-up, later published the MIT-licensed semantic-xr GitHub
repository. Recently built a web-based, voice-powered proof of
concept demonstrating multiple 3D content accessibility solutions.
</dd>
<dd><strong>Abstract:</strong> Focused on people with disabilities,
assistive-tech developers and content creators, this talk envisions
more voice-accessible 3D web content (videos, XR, etc.) using
semantic 3D metadata. Beyond AI-only descriptions, rich metadata
like names, hierarchy, hidden elements and instructions can greatly
improve access. The talk features a voice-powered demo based on <a
href="https://github.com/techscouter/semantic-xr">https://github.com/techscouter/semantic-xr</a>
and proposes standardizing: a semantic 3D metadata schema, its
embedding in media and web spatial voice.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Gaze-Aware Dialog Systems (20min)</dt>
<dd><strong>Talk:</strong> Fares Abawi (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=cHUXbVS7RJI&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Multimodal social cue integration for robot
gaze control</dd>
<dd><strong>Abstract:</strong> Gaze provides informative cues for
barge-in detection, turn-taking coordination, and reference
resolution in dialog systems, yet remains underutilized in
current implementations. This talk examines how
webcam-based gaze tracking can ground deictic expressions and
anticipate conversational intent in web interfaces, and discusses
practical integration methods using neural architectures that fuse
gaze features with acoustic and language representations for
multimodal dialog systems.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Transition of Use Cases for Voice to LLM-based RAG or
Agent setups in difficult scenarios (20min)</dt>
<dd><strong>Talk:</strong> Ulrike Stiefelhagen (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=nGf2mvRV2QI&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> project manager voice projects</dd>
<dd><strong>Abstract:</strong> Handling hallucinations in voice agents
might be even trickier than in textual chatbots. Use cases from
industry (Workers Daily Summary) and health (Patient Chat) show
that changed requirements in LLM-based systems may be a chance for
voice to be included even in settings that are usually difficult
for voice (privacy and noise, respectively).</dd>
<dd>Discussion & questions (10min)</dd>
<dt>6. Towards Web Standards for Configurable Naturally
Responsive Voice Interaction for AI Agents (25min)</dt>
<dd><strong>Talk:</strong> Paola Di Maio (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=gB7IHey8o_Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Paola Di Maio, PhD is a systems engineer with
research experience spanning neurosymbolic AI, knowledge
representation, and human-AI co-evolution. Chair of the W3C AI
Knowledge Representation Community Group, she participates in the
development of web standards for AI systems. Her research focuses
on the intersection of cognitive science and intelligent systems
design.</dd>
<dd><strong>Abstract:</strong> Voice agents are in place, but current
designs fundamentally misunderstand how humans think and
communicate. We have the technology, but we lack usability
standards that would make voice interfaces truly work for users.
Current voice AI systems suffer from critical UX failures that
break natural conversation flow and violate fundamental principles
of cognitive respect. Through systematic documentation of real-time
voice interactions with state-of-the-art conversational AI, I have
identified critical gaps that prevent voice agents from supporting
how humans actually think and communicate. This talk presents
empirical findings based on real use cases, will attempt to include
a recorded demo, and will gather input toward the development of a
web standard for user-friendly voice AI.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>7. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>8. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>9. Breakout Group Results (50min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>10. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/26-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
<section id="session3">
<h2>Session 3: February 27, 2026, 12:00 GMT-5</h2>
<h3 id="session3-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 09:00 AM</li>
<li>MST (Mountain Standard Time): 10:00 AM</li>
<li>EST (Eastern Standard Time): 12:00 PM</li>
<li>CET (Central European Time): 6:00 PM (18:00)</li>
<li>ISR (Israel Standard Time): 7:00 PM (19:00)</li>
<li>JST (Japan Standard Time): 2:00 AM (Next Day)</li>
</ul>
<h3 id="session3-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Do we need real-time processing capabilities on voice
agents? (20min)</dt>
<dd><strong>Talk:</strong> Casey Kennington (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=I26KBfY3Vx8&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Dialogue systems research since 2011.</dd>
<dd><strong>Abstract:</strong> Alexa, Siri, and other voice agents play
a game of verbal ping-pong with their users: the human utters a wake
word and speaks, then the agent processes. Humans don't
play verbal ping-pong in that they are constantly listening and
updating their understanding in real-time. "Incremental" agents
process inputs and outputs in real-time, at the word level instead
of waiting for an utterance to finish before responding. This
introduces technical challenges, but also makes the agents more
natural. In my talk, I will explore the need and requirements for
building real-time processing into agents.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Voice Agents for In-Vehicle Interaction (20min)</dt>
<dd><strong>Talk:</strong> Frankie James (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=rxLhGn6BUp8&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Automotive industry consultant/HCI Researcher
</dd>
<dd><strong>Abstract:</strong> The in-cabin experience for passenger
vehicles continues to increase in complexity as auto manufacturers
add more features. However, most people only use a small fraction
of their vehicle’s functionality, either because they do not know
how to access certain features or are not aware of what features
are available. This talk will discuss the use of voice agents to
unlock hidden vehicle features, along with potential pitfalls and
issues related to the technology.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Trust & Empathy with Multimodal Assistants (20min)</dt>
<dd><strong>Talk:</strong> Raj Tumuluri (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=AJMoRjPXn-g&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a href="https://dblp.org/pid/154/2852.html">https://dblp.org/pid/154/2852.html</a>
</dd>
<dd><strong>Abstract:</strong> This talk will explore the gap between
functional performance and emotional resonance in modern AI. We
will delve into key design principles for engineering cognitive
empathy (the perceived understanding of user state/intent) and
trustworthiness in systems handling complex inputs. Key topics
include: tone consistency across modalities, handling ambiguity and
error states gracefully, and visual/aural design choices that
signal reliability and care. The presentation will conclude with a
practical framework.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Beyond Screen Readers: Standardizing Embeddable Voice
Agents for Universal Web Accessibility (20min)</dt>
<dd><strong>Talk:</strong> Bryan Vuong (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=qnnYe6D9Cgc&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Bryan Vuong is the CTO of InnoSearch AI, a
company dedicated to developing technologies that make the Web more
accessible and inclusive. He leads the technical strategy for
CoBrowse AI, focusing on bridging the gap between visual web
interfaces and non-visual access. Bryan holds a Ph.D. in Computer
Science from the University of Wisconsin-Madison. He brings
extensive industry experience to the discussion on standardization,
having previously held engineering and product leadership roles at
Google, Meta, and Walmart e-commerce.</dd>
<dd><strong>Abstract:</strong> While screen readers have long been the
standard for non-visual web access, they often present a steep
learning curve and lack conversational context. This talk presents
a case study of CoBrowse AI, an embeddable voice agent designed to
layer voice interaction over existing websites. We will demonstrate
how LLM-driven voice agents can bridge the gap between visual
interfaces and blind/low-vision users, and critically, we will
identify the specific Web Standard gaps that currently hinder the
seamless integration of such third-party agents into the DOM.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>6. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>7. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>8. Breakout Group Results (40min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>9. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/27-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
</main>
<footer class="footer" id="footer">
<p>W3C is proud to be an open and inclusive organization, focused on productive discussions and actions. Our <a href="https://www.w3.org/policies/code-of-conduct/">Code of Conduct</a> ensures that all voices can be heard. Questions? Contact Philippe Le Hégaret &lt;<a href="mailto:plh@w3.org">plh@w3.org</a>&gt;.</p>
<p>Suggestions for improving this workshop page, such as fixing typos or adding specific topics, can be made by opening a <a href="https://github.com/w3c/smartcities-workshop">pull request on GitHub</a>.</p>
</footer>
</body>
</html>