<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>W3C Workshop on Smart Voice Agents</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="style.css">
<meta name="twitter:site" content="@w3c">
<meta name="twitter:card" content="summary_large_image">
</head>
<body>
<header class="header">
<div id="banner">
<div>
<p>
<a href="https://www.w3.org/">
<img alt="W3C" src="https://www.w3.org/Icons/w3c_home" height="48" width="72">
</a>
</p>
<div id="banner-image">
<h1>W3C Workshop on<br>Smart Voice Agents</h1>
<h2>February 2026, Virtual on Zoom</h2>
</div>
<p>A virtual event with talks and interactive sessions, February 2026</p>
</div>
</div>
<nav class="menu" id="menu">
<ul>
<li><a href="index.html">Call for Participation</a></li>
<li><a href="speakers.html">Apply as a speaker</a></li>
<li><a class="active-tab">Agenda</a></li>
<!--li><a href="sponsor.html">Sponsorship</a></li-->
<!--li><a href="talks.html">Talks</a></li-->
<!--li><a href="attendees.html">Attendees</a></li-->
<!--li><a href="https://www.w3.org/2021/06/25-smart-cities-minutes.html">Minutes</a></li-->
<!--li><a href="report.html">Report</a></li-->
</ul>
</nav>
</header>
<aside class="box" id="important">
<h2 class="footnote">Virtual event!</h2>
<p>This workshop will be a fully virtual event with talks, online discussions, and interactive sessions.</p>
<p>With the consent of the speakers, recordings and slides of the talks will be made available after the workshop.</p>
</aside>
<aside class="box" id="dates">
<h2 class="footnote">Important Dates</h2>
<dl>
<dt>20 October 2025</dt>
<dd><a href="index.html">Call-for-Participation page</a> published and participant registration opens.</dd>
<dt>11 December 2025</dt>
<dd>Deadline for submitting a proposal for a talk during the workshop
</dd>
<dt>16 December 2025</dt>
<dd>Speakers acceptance notification</dd>
<dt>23 January 2026</dt>
<dd>Publish the basic agenda for the workshop</dd>
<dt>
5 February 2026 (<span class="updates">deadline extended!</span>)
</dt>
<dd>Deadline for selected speakers to provide their presentation slides and recorded talks</dd>
<dt>19 February 2026</dt>
<dd>Public release of all accepted talks</dd>
<dd>
Deadline to register as workshop
participant
</dd>
<dt>25-27 February 2026</dt>
<dd>Workshop (virtual)</dd>
<dd>We will try to accommodate different time zones as much as
possible.</dd>
</dl>
</aside>
<main id="main" class="main">
<section id="home">
<p>This is the agenda of the W3C Workshop on Smart Voice Agents.</p>
<p>
The workshop is free. To help us prepare and run the event
effectively, please see the <a href="index.html">Call for
Participation</a>.<br /> There will be three half-day sessions on
February 25, 26, and 27, 2026.
</p>
<h2>Agenda Overview</h2>
<ol>
<li><a href="#session1">Session 1: February 25, 2026, 12:00
GMT-5</a></li>
<li><a href="#session2">Session 2: February 26, 2026, 10:00
GMT+1</a></li>
<li><a href="#session3">Session 3: February 27, 2026, 12:00
GMT-5</a></li>
</ol>
</section>
<section id="session1">
<h2>Session 1: February 25, 2026, 12:00 GMT-5</h2>
<h3 id="session1-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 09:00 AM</li>
<li>MST (Mountain Standard Time): 10:00 AM</li>
<li>EST (Eastern Standard Time): 12:00 PM</li>
<li>CET (Central European Time): 6:00 PM (18:00)</li>
<li>ISR (Israel Standard Time): 7:00 PM (19:00)</li>
<li>JST (Japan Standard Time): 2:00 AM (Next Day)</li>
</ul>
<h3 id="session1-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Governance and Greenlights: Leveraging the '3 Ps' to
Standardize Trust, Scale, and Usability in Voice Agent Web
Integration (20min)</dt>
<dd><strong>Talk:</strong> Patricia Lee (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=kT_Z5NWyi5Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd>
<strong>Bio:</strong> Growing AI Engineer with IT Executive
Management experience. See also <a
href="https://www.linkedin.com/in/patriciaminglee/">https://www.linkedin.com/in/patriciaminglee/</a>.
</dd>
<dd><strong>Abstract:</strong> The expansion of Voice Agent
technologies across diverse platforms—from smart speakers to mobile
browsers—is currently blocked by issues in usability, security, and
system interoperability. This presentation will leverage an
extensive background in Product, Program, and Project Management
(the '3 Ps') combined with Lean Six Sigma expertise to present a
framework for driving effective Web standardization. The talk will
focus on identifying the critical stakeholder needs that must
anchor any standardization effort, moving beyond technical
specifications to address real-world GRC challenges. The talk will
explore two major barriers: (1) Establishing a cyber-resilient and
compliant trust model that satisfies both user privacy concerns and
evolving regulatory requirements, (2) Applying process optimization
principles to define measurable requirements for supporting various
dependencies and capabilities. This data-driven, methodological
approach provides the necessary structure to clarify reasonable
applications for Voice Agents and deliver the technical clarity
needed for developers, standards bodies, and regulators to align
and move the Web Voice Agent ecosystem toward trusted, scalable
adoption.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Solving Lead vs. Lead: Consistent Pronunciation for Web
Content (20min)</dt>
<dd><strong>Talk:</strong> Sarah Wood (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=Bc04fXrR1U4&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> read-aloud/TTS and dyslexia expertise and
standards development</dd>
<dd><strong>Abstract:</strong> Pronunciation in web content suffers
from the lack of a standardized way to specify Speech Synthesis
Markup Language (SSML) in HTML. A standard would benefit assistive
technologies, voice agents, and AI systems learning from web
content. We present a recent solution from the EdTech standards
community and seek discussion of approaches to more broadly ensure
reliable, interoperable pronunciation for inclusive, accessible
voice experiences across the web.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Hallucination in Automatic Speech Recognition Systems
(20min)</dt>
<dd><strong>Talk:</strong> Bhiksha Raj (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=BT1kd0Agv5Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Speech recognition and related areas
</dd>
<dd><strong>Abstract:</strong> We discuss the problem of hallucination
in ASR systems, including their definition, description,
quantification and mitigation.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Multi-Agent Conversational Methodology (25min)</dt>
<dd><strong>Talk:</strong> Emmett Coin (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=NKPnx73RF6I&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a
href="https://www.linkedin.com/in/emmettcoin/">https://www.linkedin.com/in/emmettcoin/</a>
</dd>
<dd><strong>Abstract:</strong> As part of the Open Floor Protocol (OFP)
standards group with the Linux Foundation, I am working on a common
protocol for multi-agent conversational systems. We have
demonstrated multi-agent coordination with a human participant and
are working toward multi-agent and human conversational support
that is fully collaborative across all participants. I will show
how OFP works and why it is important.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>6. Reimagining Standards for Voice AI: Interoperability
Without Sacrificing Innovation (20min)</dt>
<dd><strong>Talk:</strong> RJ Burnham (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=wSbdFGzZZhs&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> RJ Burnham has spent more than two decades
shaping the evolution of voice and conversational technology. He
was a member of the W3C Voice Browser Working Group and later
chaired the Call Control Working Group, contributing directly to
the creation of VoiceXML and CCXML. As CTO at Voxeo, he helped
drive the adoption of open standards and large-scale voice
platforms. His later work in computer vision and neural networks
gave him an early view into the foundations that eventually became
modern large language models, which pulled him back into voice as
President and CTO of Plum, where he helped build one of the first
generative AI voice platforms. RJ is now the founder of Consig AI,
where he’s applying the latest generation of voice AI and decades
of experience to solve real problems in healthcare communication.
</dd>
<dd><strong>Abstract:</strong> Proprietary voice AI platforms emerged
largely because the industry moved far beyond the directed-dialog
world where VoiceXML excelled. As we shifted from structured
dialogs to intent-based systems and now to LLM-driven agents,
vendors built closed stacks to move quickly, but the result is
fragmentation and lock-in. With voice systems becoming more capable
and more central to real workflows, it’s worth asking whether the
pendulum should swing back. What standards, if any, make sense in
this new landscape? Can we design a shared foundation that restores
the interoperability and portability we once had, without slowing
innovation? This talk explores whether the time is right for a new
generation of voice AI standards and what principles should guide
them.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>7. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>8. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>9. Breakout Group Results (50min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>10. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/25-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
<section id="session2">
<h2>Session 2: February 26, 2026, 10:00 GMT+1</h2>
<h3 id="session2-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 01:00 AM</li>
<li>MST (Mountain Standard Time): 02:00 AM</li>
<li>EST (Eastern Standard Time): 04:00 AM</li>
<li>CET (Central European Time): 10:00 AM</li>
<li>ISR (Israel Standard Time): 11:00 AM</li>
<li>JST (Japan Standard Time): 6:00 PM (18:00)</li>
</ul>
<h3 id="session2-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Towards Smarter Voice Interfaces: Using Grounding and
Knowledge (20min)</dt>
<dd><strong>Talk:</strong> Kristiina Jokinen (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=8Pmxmn7gCsA&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a href="https://blogs.helsinki.fi/kjokinen/">https://blogs.helsinki.fi/kjokinen/</a>
</dd>
<dd><strong>Abstract:</strong> In this talk I will discuss the design
and development of trustworthy GenAI-based applications, and in
particular, focus on grounding (i.e. anchoring conversations,
perceptions, and knowledge in a shared context) as a key principle
of spoken interaction management. I will review challenges,
opportunities, and lessons learnt in creating accountable reasoning
and fluent natural dialogue between Smart AI agents and users, to
support long-term interactions and trust for responsible and safe
AI agents.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Accessibility of 3D and Immersive Content via Voice
Interaction (25min)</dt>
<dd><strong>Talk:</strong> Zohar Gan (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=Cq2Is-uIuNU&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd>
<strong>Bio:</strong> Solutions architect and software developer
researching accessibility of 3D content for people with
disabilities (as a social impact project). Presented at the W3C
Workshop on Inclusive Design for Immersive Web Standards in 2019.
As a follow-up, later published the MIT-licensed semantic-xr GitHub
repository. Recently built a web-based, voice-powered proof of
concept demonstrating multiple 3D content accessibility solutions.
</dd>
<dd><strong>Abstract:</strong> Focused on people with disabilities,
assistive-tech developers and content creators, this talk envisions
more voice-accessible 3D web content (videos, XR, etc.) using
semantic 3D metadata. Beyond AI-only descriptions, rich metadata
like names, hierarchy, hidden elements and instructions can greatly
improve access. The talk features a voice-powered demo based on <a
href="https://github.com/techscouter/semantic-xr">https://github.com/techscouter/semantic-xr</a>
and proposes standardizing: a semantic 3D metadata schema, its
embedding in media and web spatial voice.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Gaze-Aware Dialog Systems (20min)</dt>
<dd><strong>Talk:</strong> Fares Abawi (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=cHUXbVS7RJI&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Multimodal social cue integration for robot
gaze control</dd>
<dd><strong>Abstract:</strong> Gaze provides informative cues for
barge-in detection, turn-taking coordination, and reference
resolution in dialog systems, yet remains underutilized in
current implementations. This talk examines how
webcam-based gaze tracking can ground deictic expressions and
anticipate conversational intent in web interfaces, and discusses
practical integration methods using neural architectures that fuse
gaze features with acoustic and language representations for
multimodal dialog systems.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Transition of Use Cases for Voice to LLM-based RAG or
Agent setups in difficult scenarios (20min)</dt>
<dd><strong>Talk:</strong> Ulrike Stiefelhagen (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=nGf2mvRV2QI&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> project manager voice projects</dd>
<dd><strong>Abstract:</strong> Handling hallucinations in voice agents
might be even trickier than in textual chatbots. Use cases from
industry (Workers Daily Summary) and health (Patient Chat) show
that changed requirements in LLM-based systems may be a chance for
voice to be included even in settings that are usually difficult
for voice (privacy and noise, respectively).</dd>
<dd>Discussion & questions (10min)</dd>
<dt>6. Towards Web Standards for Configurable Naturally
Responsive Voice Interaction for AI Agents (25min)</dt>
<dd><strong>Talk:</strong> Paola Di Maio (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=gB7IHey8o_Q&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Paola Di Maio, PhD is a systems engineer with
research experience spanning neurosymbolic AI, knowledge
representation, and human-AI co-evolution. Chair of the W3C AI
Knowledge Representation Community Group, she participates in the
development of web standards for AI systems. Her research focuses
on the intersection of cognitive science and intelligent systems
design.</dd>
<dd><strong>Abstract:</strong> Voice agents are in place, but current
designs fundamentally misunderstand how humans think and
communicate. We have the technology, but we lack usability
standards that would make voice interfaces truly work for users.
Current voice AI systems suffer from critical UX failures that
break natural conversation flow and violate fundamental principles
of cognitive respect. Through systematic documentation of real-time
voice interactions with state-of-the-art conversational AI, I have
identified critical gaps that prevent voice agents from supporting
how humans actually think and communicate. This talk presents
empirical findings based on real use cases, will attempt to include
a recorded demo, and will gather input toward the development of a
web standard for user-friendly voice AI.</dd>
<dd>Demo (5min)</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>7. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>8. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>9. Breakout Group Results (50min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>10. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/26-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
<section id="session3">
<h2>Session 3: February 27, 2026, 12:00 GMT-5</h2>
<h3 id="session3-times">Start times in various timezones:</h3>
<ul>
<li>PST (Pacific Standard Time): 09:00 AM</li>
<li>MST (Mountain Standard Time): 10:00 AM</li>
<li>EST (Eastern Standard Time): 12:00 PM</li>
<li>CET (Central European Time): 6:00 PM (18:00)</li>
<li>ISR (Israel Standard Time): 7:00 PM (19:00)</li>
<li>JST (Japan Standard Time): 2:00 AM (Next Day)</li>
</ul>
<h3 id="session3-agenda">Agenda</h3>
<dl>
<dt>1. Scene setting (10min)</dt>
<dd>Goals and expectations for the workshop</dd>
<dt>2. Do we need real-time processing capabilities on voice
agents? (20min)</dt>
<dd><strong>Talk:</strong> Casey Kennington (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=I26KBfY3Vx8&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Dialogue systems research since 2011.</dd>
<dd><strong>Abstract:</strong> Alexa, Siri, and other voice agents play
a game of verbal ping-pong with their users: the human utters a wake
word and speaks, then the agent processes. Humans don't
play verbal ping-pong in that they are constantly listening and
updating their understanding in real-time. "Incremental" agents
process inputs and outputs in real-time, at the word level instead
of waiting for an utterance to finish before responding. This
introduces technical challenges, but also makes the agents more
natural. In my talk, I will explore the need and requirements for
building real-time processing into agents.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>3. Voice Agents for In-Vehicle Interaction (20min)</dt>
<dd><strong>Talk:</strong> Frankie James (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=rxLhGn6BUp8&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Automotive industry consultant/HCI Researcher
</dd>
<dd><strong>Abstract:</strong> The in-cabin experience for passenger
vehicles continues to increase in complexity as auto manufacturers
add more features. However, most people only use a small fraction
of their vehicle’s functionality, either because they do not know
how to access certain features or are not aware of what features
are available. This talk will discuss the use of voice agents to
unlock hidden vehicle features, along with potential pitfalls and
issues related to the technology.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>4. Trust & Empathy with Multimodal Assistants (20min)</dt>
<dd><strong>Talk:</strong> Raj Tumuluri (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=AJMoRjPXn-g&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> <a href="https://dblp.org/pid/154/2852.html">https://dblp.org/pid/154/2852.html</a>
</dd>
<dd><strong>Abstract:</strong> This talk will explore the gap between
functional performance and emotional resonance in modern AI. We
will delve into key design principles for engineering cognitive
empathy (the perceived understanding of user state/intent) and
trustworthiness in systems handling complex inputs. Key topics
include: tone consistency across modalities, handling ambiguity and
error states gracefully, and visual/aural design choices that
signal reliability and care. The presentation will conclude with a
practical framework.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>5. Beyond Screen Readers: Standardizing Embeddable Voice
Agents for Universal Web Accessibility (20min)</dt>
<dd><strong>Talk:</strong> Bryan Vuong (10min)</dd>
<dd class="video"><a href="https://www.youtube.com/watch?v=qnnYe6D9Cgc&list=PLNhYw8KaLq2U_6YPLm6UitxQyXzGcy6jQ">Video Recording</a></dd>
<dd><strong>Bio:</strong> Bryan Vuong is the CTO of InnoSearch AI, a
company dedicated to developing technologies that make the Web more
accessible and inclusive. He leads the technical strategy for
CoBrowse AI, focusing on bridging the gap between visual web
interfaces and non-visual access. Bryan holds a Ph.D. in Computer
Science from the University of Wisconsin-Madison. He brings
extensive industry experience to the discussion on standardization,
having previously held engineering and product leadership roles at
Google, Meta, and Walmart e-commerce.</dd>
<dd><strong>Abstract:</strong> While screen readers have long been the
standard for non-visual web access, they often present a steep
learning curve and lack conversational context. This talk presents
a case study of CoBrowse AI, an embeddable voice agent designed to
layer voice interaction over existing websites. We will demonstrate
how LLM-driven voice agents can bridge the gap between visual
interfaces and blind/low-vision users, and critically, we will
identify the specific Web Standard gaps that currently hinder the
seamless integration of such third-party agents into the DOM.</dd>
<dd>Discussion & questions (10min)</dd>
<dt>
<strong>Break</strong> (10min)
</dt>
<dt>6. Introduction to Breakout Groups (10min)</dt>
<dd>Work mode, goals, and expectations for the breakout groups</dd>
<dt>7. Breakout Groups (45min)</dt>
<dd>Working in breakout groups</dd>
<dt>8. Breakout Group Results (40min)</dt>
<dd>Presentation of breakout group results (10min per group)</dd>
<dt>9. Wrap-up (10min)</dt>
<dd>Summary and conclusion</dd>
</dl>
<p>
<a href="https://www.w3.org/2026/02/27-smartagents-main-minutes.html">Main session minutes</a> are available.
</p>
</section>
</main>
<footer class="footer" id="footer">
<p>W3C is proud to be an open and inclusive organization, focused on productive discussions and actions. Our <a href="https://www.w3.org/policies/code-of-conduct/">Code of Conduct</a> ensures that all voices can be heard. Questions? Contact Philippe Le Hégaret &lt;<a href="mailto:plh@w3.org">plh@w3.org</a>&gt;.</p>
<p>Suggestions for improving this workshop page, such as fixing typos or adding specific topics, can be made by opening a <a href="https://github.com/w3c/smartcities-workshop">pull request on GitHub</a>.</p>
</footer>
</body>
</html>