rpc_test_data.txt
Here are the main issues / foot-guns I see in that snippet, grouped by area.
1) Input validation & type safety problems
Blindly trusting $body[...] shapes.
participant_attributes, participant_metadata, room_config are assumed to be the correct types. If a client sends "participant_attributes": "lol" you’ll pass a string into setAttributes() and may get a runtime error or (worse) unexpected serialization.
Fix: explicitly validate types:
participant_identity, participant_name, room_name → strings, non-empty, length capped
participant_metadata → string (or JSON string, depending on SDK expectation)
participant_attributes → associative array of strings
room_config → array / specific schema expected by SDK
!empty() is the wrong check for some fields.
empty() treats "0", 0, false, [] as empty. If someone intentionally sets metadata to "0" you’ll skip it.
Fix: use array_key_exists() / isset() + type checks instead.
No bounds on identity/name/metadata sizes.
A client can send megabytes of metadata/attributes and you’ll happily embed it into a JWT → big CPU + big response + possible gateway/proxy issues.
Fix: enforce max lengths (identity/name/metadata) and max attribute count/size.
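The validation rules in this section can be sketched as a single function. This is a minimal illustration in TypeScript (to match the client snippet in this document) rather than PHP; the limits, the `TokenRequest` shape, and the `validateTokenRequest` name are assumptions, and the same checks port to PHP with is_string(), is_array(), and strlen(). It only covers shape and size; where identity should come from is a separate question covered under security.

```typescript
// Hypothetical limits; tune them for your application.
const MAX_NAME_LEN = 128;
const MAX_METADATA_LEN = 4096; // counted in UTF-16 code units, not bytes
const MAX_ATTRIBUTES = 32;

interface TokenRequest {
  room_name: string;
  participant_identity: string;
  participant_metadata?: string;
  participant_attributes?: Record<string, string>;
}

type ValidationResult =
  | { ok: true; value: TokenRequest }
  | { ok: false; error: string };

function invalid(error: string): ValidationResult {
  return { ok: false, error };
}

function isBoundedString(v: unknown, max: number): v is string {
  return typeof v === "string" && v.length > 0 && v.length <= max;
}

function validateTokenRequest(body: unknown): ValidationResult {
  // Reject scalars, null, and JSON lists: only a JSON object is acceptable.
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    return invalid("body must be a JSON object");
  }
  const b = body as Record<string, unknown>;

  const room = b["room_name"];
  const identity = b["participant_identity"];
  if (!isBoundedString(room, MAX_NAME_LEN)) return invalid("invalid room_name");
  if (!isBoundedString(identity, MAX_NAME_LEN)) {
    return invalid("invalid participant_identity");
  }
  const value: TokenRequest = { room_name: room, participant_identity: identity };

  // Optional fields: test for presence, not truthiness, so "0" is kept.
  const metadata = b["participant_metadata"];
  if (metadata !== undefined) {
    if (typeof metadata !== "string" || metadata.length > MAX_METADATA_LEN) {
      return invalid("invalid participant_metadata");
    }
    value.participant_metadata = metadata;
  }

  const attrs = b["participant_attributes"];
  if (attrs !== undefined) {
    if (typeof attrs !== "object" || attrs === null || Array.isArray(attrs)) {
      return invalid("participant_attributes must be an object of strings");
    }
    const source = attrs as Record<string, unknown>;
    const keys = Object.keys(source);
    if (keys.length > MAX_ATTRIBUTES) return invalid("too many attributes");
    const clean: Record<string, string> = {};
    for (const key of keys) {
      const v = source[key];
      if (typeof v !== "string" || v.length > MAX_METADATA_LEN) {
        return invalid("attribute values must be bounded strings");
      }
      clean[key] = v;
    }
    value.participant_attributes = clean;
  }
  return { ok: true, value };
}
```

Returning a result object instead of throwing keeps the handler's error path explicit: the caller can map `ok: false` straight to a 400 response with a JSON body.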
2) Security / abuse concerns
Unauthenticated token minting endpoint (likely).
If this is exposed publicly without auth/rate limiting, anyone can mint tokens and join any room name they choose (including “admin-ish” room names), and they can set arbitrary identity/name/metadata/attributes.
Fix: require auth (session cookie, API key, JWT from your app, etc.) + rate limit + allowlist/validate room names and identities.
Identity spoofing.
Because identity comes from the request body, a malicious client can claim to be another user (participant_identity: "alice").
Fix: identity/name should come from your authenticated user context, not from client input.
Room name injection / namespace collisions.
Letting clients pick arbitrary room_name can cause collisions or unauthorized access patterns.
Fix: server decides the room or validates it against what the authenticated user is allowed to join.
3) Error handling & operational problems
Missing checks for env vars.
If LIVEKIT_API_KEY, LIVEKIT_API_SECRET, or LIVEKIT_URL are missing, you’ll mint invalid tokens or return bad data without a clear error.
Fix: validate envs and return 500 with a clear message (don’t leak secrets).
No try/catch around SDK calls.
->toJwt() and some setters can throw. As-is, you may return HTML/500 with no JSON body.
Fix: wrap token generation in try { ... } catch (\Throwable $e) { ... }.
No response headers.
You’re returning JSON but not setting Content-Type: application/json.
Fix: header('Content-Type: application/json'); (and ideally charset).
json_decode without checking for non-object JSON.
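If the request body is valid JSON but not an object (e.g. "hi" or 42), $body['room_name'] will emit warnings or errors, depending on the PHP version, because $body isn’t an array. Note that a JSON list such as [] still decodes to a PHP array, so it passes is_array() and needs its own check if you require an object.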
Fix: after decode, ensure is_array($body).
4) Grants / permissions clarity
Grant is “join room” only, but still potentially too broad.
setRoomJoin() allows joining the room; depending on defaults, users can publish/subscribe. That may be fine, but many apps want finer control.
Fix: set explicit publish/subscribe/record/admin permissions if your SDK supports it and your use case needs it.
RoomConfig trust boundary.
If room_config can influence server-side room behavior, accepting it from clients is risky unless you strictly validate/allowlist.
Fix: server-owned config only, or validate against a safe subset.
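To make the explicit-permissions point concrete, here is a small sketch that maps an application role to a grant object. It is written in TypeScript rather than PHP so it matches the other code in this document. The field names (roomJoin, room, roomAdmin, canPublish, canSubscribe, canPublishData) follow LiveKit's video grant claims as I understand them; the `Role` type and `grantForRole` helper are hypothetical, so check your SDK's grant class for the exact setters.

```typescript
// Subset of LiveKit's video grant fields (assumed; verify against your SDK version).
interface VideoGrantSketch {
  roomJoin: boolean;
  room: string;
  roomAdmin: boolean;
  canPublish: boolean;
  canSubscribe: boolean;
  canPublishData: boolean;
}

// Hypothetical application roles; these are not part of LiveKit.
type Role = "host" | "speaker" | "listener";

function grantForRole(role: Role, room: string): VideoGrantSketch {
  // Start from the least-privileged grant and open up only what the role needs.
  const grant: VideoGrantSketch = {
    roomJoin: true,
    room,
    roomAdmin: false,
    canPublish: false,
    canSubscribe: true,
    canPublishData: false,
  };
  if (role === "speaker" || role === "host") {
    grant.canPublish = true;
    grant.canPublishData = true;
  }
  if (role === "host") {
    grant.roomAdmin = true;
  }
  return grant;
}
```

The role itself should come from your authenticated user context, not from the request body, for the same reason identity should.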
5) Minor correctness / style nits
Port 3000 comment is meaningless in PHP.
PHP typically runs behind a web server or built-in server; the code can’t “make sure it’s on port 3000.” This is likely copy/paste from a Node quickstart.
Potential JWT bloat.
Attributes/metadata/config all contribute to token size. Large JWTs can exceed header limits if later used in headers, or simply become unwieldy.
A “safer skeleton” checklist (no big rewrite)
If you want quick improvements without changing the structure:
Set headers: Content-Type: application/json
Ensure $body is an array: if (!is_array($body)) ...
Validate env vars are present
Validate/normalize inputs (string lengths, allowed characters, attribute schema)
Don’t take identity/name from the client; derive from auth
Wrap minting in try/catch and return JSON errors
Add auth + rate limiting
If you tell me what LiveKit PHP SDK version you’re using and what types it expects for setMetadata, setAttributes, and setRoomConfig, I can point out the exact type mismatches and suggest concrete validation rules.
Yes. An intentional reconnect, especially one that wasn’t fully handled (for example, the “reconnected” event wasn’t acted on cleanly), can explain exactly that pattern: B is publishing audio, C hears it, A doesn’t; A still sees B’s video; everyone else is fine.
Here are the most common mechanisms that produce that “A can’t hear B, but everything else works” symptom, and how they relate to reconnect / missing handling.
1) A’s receiver-side subscription for B’s audio got dropped or stuck
After reconnect/resume, the SDK often has to re-sync:
which tracks A is subscribed to,
which track IDs/SIDs are current,
and the receiver pipeline for each track.
If the reconnect path misses “re-apply subscriptions” (or misses the audio subset), you can get:
B’s video subscribed correctly (so A sees B),
B’s audio not subscribed / not attached / not resumed (so A hears nothing),
while C successfully re-subscribed (so C hears B).
What you’d see in logs (often on A’s side, not B’s):
track subscribed/unsubscribed events for B audio missing
“muted”/“enabled=false”/“track not attached” for audio only
receiver stats: video inbound bytes increasing; audio inbound bytes ~0
2) A is receiving B’s audio RTP, but decrypt/MLS state is wrong for that one stream
If you’re using end-to-end encryption / MLS, a reconnect/desync can produce a selective decrypt failure:
video might decrypt (different key usage / timing / SSRC mapping / separate sender keys)
audio might fail decrypt (or fail key lookup) → silence
other participants still fine (they have correct epoch/keys)
This matches “C hears B, A doesn’t” because only A is out of sync.
What you’d see:
on A: “cannot decrypt frame”, “unknown key”, “epoch mismatch”, “discarding packet” for audio SSRC
on B: usually nothing (B is just sending)
on C: normal decrypt / no errors
3) Track identity changed across reconnect and A is still bound to the old audio track
An intentional reconnect can result in:
B’s audio track being republished (new track SID / new transceiver / new SSRC),
but A’s app logic or state machine still pointing at the old one.
Result:
UI shows B present + video (new video track handled)
audio element for B is still bound to the old track (or never attached)
C happened to bind to the new track
Clues:
two different audio track SIDs for B around the reconnect
“unpublished old audio track” followed by “published new audio track”
A never logs “subscribed to new audio track”
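One way to catch this bucket from app code is to compare the track SID each audio element was attached to against the SID currently published. The sketch below assumes the app keeps its own `bound` map (identity to attached track SID); both maps and the `staleAudioBindings` name are hypothetical, and how you populate them depends on your SDK.

```typescript
// bound:     identity -> track SID the local audio element is attached to (app-owned state)
// published: identity -> track SID the participant is currently publishing
// Returns identities whose audio element is missing or points at an old track.
function staleAudioBindings(
  bound: Map<string, string>,
  published: Map<string, string>
): string[] {
  const stale: string[] = [];
  published.forEach((currentSid, identity) => {
    // A missing entry (never attached) and a mismatched SID both count as stale.
    if (bound.get(identity) !== currentSid) {
      stale.push(identity);
    }
  });
  return stale;
}
```

Running this after a reconnect gives a list of participants whose audio needs to be re-attached, which is cheaper than rebuilding every audio element.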
4) Audio receiver exists but is muted/disabled only on A due to state not re-applied
During reconnect, some SDKs re-create track objects; app code that manages mute/unmute can accidentally:
keep B’s audio track muted on A,
while video remains enabled.
Clues:
on A: audio track enabled=false, muted=true, or volume set to 0
no inbound audio level, but subscription exists
Why “B-side logs only” often can’t prove this
Because the failure is usually on A’s receiving path:
B is publishing and C hears it → B’s publisher is probably fine.
The divergence is: A’s subscription/decrypt/attach path is broken.
So B’s logs might show reconnect happened, but to explain “A can’t hear B” you almost always need:
A-side track events + inbound-rtp stats + (if E2EE) decrypt logs.
What to check quickly (high signal)
If you can grab these, it will pinpoint which bucket you’re in:
On participant A (receiver)
Did A ever log Subscribed to B’s audio track (track SID)?
Are inbound RTP audio bytes/packets increasing for B’s audio SSRC?
Any decrypt/MLS errors tied to audio SSRC or key epoch?
Is the audio track muted/disabled or not attached to an audio element?
On participant B (publisher)
Did B’s audio track get republished (new track SID) around reconnect?
Any “audio track ended/restarted” events?
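The inbound-RTP check for participant A can be automated by comparing two stats snapshots taken a few seconds apart. The sketch below works on plain objects carrying the standard WebRTC inbound-rtp fields (type, kind, ssrc, bytesReceived), which you would collect from RTCPeerConnection.getStats(); how you reach the peer connection through your SDK is not shown, and the `stalledAudioSsrcs` name is hypothetical.

```typescript
// Minimal subset of the standard WebRTC "inbound-rtp" stats fields.
interface InboundRtpSample {
  type: string; // "inbound-rtp" for receive-side RTP streams
  kind: string; // "audio" or "video"
  ssrc: number;
  bytesReceived: number;
}

// Returns the SSRCs of audio streams whose byte counter did not advance
// between the two snapshots. Streams that appear in only one snapshot
// are not reported, since there is nothing to compare them against.
function stalledAudioSsrcs(
  before: InboundRtpSample[],
  after: InboundRtpSample[]
): number[] {
  const previous = new Map<number, number>();
  for (const s of before) {
    if (s.type === "inbound-rtp" && s.kind === "audio") {
      previous.set(s.ssrc, s.bytesReceived);
    }
  }
  const stalled: number[] = [];
  for (const s of after) {
    if (s.type !== "inbound-rtp" || s.kind !== "audio") continue;
    const prev = previous.get(s.ssrc);
    if (prev !== undefined && s.bytesReceived <= prev) {
      stalled.push(s.ssrc);
    }
  }
  return stalled;
}
```

As a rough guide, a stalled audio SSRC while video bytes keep growing suggests the subscription or track-identity buckets (1 or 3), whereas growing audio bytes with no sound suggests the decrypt or mute/attach buckets (2 or 4). A muted sender can also produce little or no audio traffic, so check B's mute state before drawing conclusions.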
Direct answer to your question
Intentional reconnect + missing reconnected handling absolutely can explain this behavior, most commonly by leaving A’s audio subscription or receiver pipeline out of sync while video stays fine.
If MLS/E2EE is in play, it can also explain it via a selective decrypt/key mismatch on A for B’s audio stream.
If you paste:
the reconnect section from B’s logs, and
the same time window from A’s logs (even shorter),
I can tell you which of the above is most likely and what exact log lines to look for next.
Here’s a small, practical pattern that works well for 100+ rooms with autoSubscribe:false and survives reconnect/resume without “A can’t hear B” type drift.
Core idea
Keep a single source of truth for “who should I be subscribed to right now?”
On any event that can desync state (reconnect/resumed, participant joined/left, tracks published/unpublished, active speaker change), re-apply that desired subscription set.
You don’t need to persist “current subscription state” perfectly — you can recompute desired state and call subscribe/unsubscribe idempotently.
Minimal JS/TS snippet
import { Room, RoomEvent, Track } from "livekit-client";

// autoSubscribe is a connect-time option, not a Room constructor option:
//   await room.connect(url, token, { autoSubscribe: false });
// (url and token are placeholders for your own values)
const room = new Room({
  // ...other options
});

// Your policy knobs
const MAX_AUDIO_SUBS = 12; // keep small for 100+ rooms
const pinnedIdentities = new Set<string>(); // optional: user pins

function desiredAudioPublishers(): string[] {
  // 1) Pins always included
  const pins = [...pinnedIdentities];

  // 2) Active speakers next (Room keeps this list updated).
  //    Skip the local participant so it doesn't use up a slot.
  const speakers = room.activeSpeakers
    .map(p => p.identity)
    .filter(id => Boolean(id) && id !== room.localParticipant.identity);

  // Merge, preserve order, cap
  const ordered = [...new Set([...pins, ...speakers])];
  return ordered.slice(0, MAX_AUDIO_SUBS);
}

function applyAudioSubscriptions() {
  const wanted = new Set(desiredAudioPublishers());

  for (const [, p] of room.remoteParticipants) {
    // Find this participant’s mic publication (if any)
    const micPub = [...p.audioTrackPublications.values()].find(
      pub => pub.source === Track.Source.Microphone
    );
    if (!micPub) continue;

    const shouldSub = wanted.has(p.identity);

    // Idempotent: calling setSubscribed repeatedly is fine;
    // the check only avoids redundant calls
    if (micPub.isSubscribed !== shouldSub) {
      micPub.setSubscribed(shouldSub);
    }
  }
}

// Re-apply policy on anything that can change reality/state
room
  .on(RoomEvent.Connected, applyAudioSubscriptions)
  // Reconnected fires after both a quick resume and a full reconnect;
  // RoomEvent has no separate Resumed event
  .on(RoomEvent.Reconnected, applyAudioSubscriptions)
  .on(RoomEvent.ParticipantConnected, applyAudioSubscriptions)
  .on(RoomEvent.ParticipantDisconnected, applyAudioSubscriptions)
  .on(RoomEvent.TrackPublished, applyAudioSubscriptions)
  .on(RoomEvent.TrackUnpublished, applyAudioSubscriptions)
  .on(RoomEvent.ActiveSpeakersChanged, applyAudioSubscriptions);
Do you need to track participant list / subscription state?
You need less than you think:
Participant list
No — the room.remoteParticipants map is your participant list. You can iterate it whenever you re-apply.
Subscription state
Not really.
You can compute desired state and call setSubscribed(true/false).
Checking pub.isSubscribed is only to avoid spamming calls; it’s optional.
What you do want to track
Policy inputs you own:
pinnedIdentities (if you support pins)
maybe a “stage” list / visible tiles list
MAX_AUDIO_SUBS
Everything else can be derived from the room.
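If you add a stage or visible-tiles list, the policy is easier to test when it is a pure function of its inputs rather than something that reads the room directly. The sketch below is one way to do that; the `AudioPolicyInputs` shape, the priority order (pins, then stage, then speakers), and the `desiredAudioSet` name are assumptions, not part of the LiveKit API.

```typescript
interface AudioPolicyInputs {
  pinned: string[];   // identities the user pinned
  stage: string[];    // identities on the (hypothetical) stage / visible tiles list
  speakers: string[]; // identities of current active speakers, in caller-supplied order
  maxSubs: number;    // cap on simultaneous audio subscriptions
}

// Merges the inputs in priority order, drops duplicates and empty
// identities, and caps the result at maxSubs.
function desiredAudioSet(inputs: AudioPolicyInputs): string[] {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const id of inputs.pinned.concat(inputs.stage, inputs.speakers)) {
    if (id === "" || seen.has(id)) continue;
    seen.add(id);
    ordered.push(id);
  }
  return ordered.slice(0, Math.max(0, inputs.maxSubs));
}
```

Because the function never touches the room, it can be unit-tested with plain arrays, and the event handlers only need to gather the current inputs and apply the result.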
Why this helps your “reconnect caused selective audio loss” case
If a reconnect or resume accidentally leaves some audio subscriptions “off” (or bound to old track publications), calling applyAudioSubscriptions() after Reconnected or TrackPublished forces the client back to the correct state.
This is the key: treat reconnect as “my local state might be wrong; re-sync everything.”
Two small extras that prevent common gotchas
Handle track SID changes
On reconnect, you can see new publications. Hooking TrackPublished and reapplying covers this.
Make sure your “identity” is stable
Use participant.identity (string) rather than SIDs that might change between sessions.