Hotel Receptionist example #5983
Framework (livekit-agents/livekit/agents/beta/workflows/):

- Add a `require_explicit_ask: bool = False` parameter to every field-collection task (GetNameTask, GetEmailTask, GetPhoneNumberTask, GetAddressTask, GetDOBTask, GetCardNumberTask, GetSecurityCodeTask, GetExpirationDateTask). When set, the task's `update_*` tool gets `IGNORE_ON_ENTER`, so the model is structurally forced to ask the user before recording (no silent fill from chat_ctx).
- Refactor each task's update tool into a `_build_update_*_tool()` builder so the flag can be applied per instance.
- Rename `record_card_number` -> `update_card_number` for consistency with the other `update_*` field tools.
- `GetCardNumberTask`/`GetSecurityCodeTask`/`GetExpirationDateTask` now accept `chat_ctx` and `extra_instructions`, plumbed through from `GetCreditCardTask`.
- `GetCreditCardTask` reorders the TaskGroup so card_number runs first and cardholder name runs last, passes chat_ctx into the TaskGroup AND every sub-task, and sets `require_explicit_ask=True` on the cardholder name task with a role-anchoring extra_instructions hint.
- `GetNameTask` on_enter is rewritten with a role-anchoring reasoning template, plus alpha-character validation in update_name (rejects digit-only values so a card number or phone number can't get crammed into the name field).
- Standardize the "credit card number" wording in `_CARD_NUMBER_BASE_INSTRUCTIONS` so the model isn't pulled toward saying just "card".

Example (examples/hotel_receptionist/):

- New hotel-receptionist example agent demonstrating the workflow patterns: room booking, restaurant reservation, cancellation, invoice lookup, and dispute handling, all backed by SQLite via hotel_db, with ui_view streaming changes.
- BookRoomTask / BookRestaurantTask use `open_*_dialog` naming for the tools that spawn sub-tasks (less of an "asky" prior than `ask_*`), and an action-directive `_status()` helper so missing-field names don't leak into spoken questions.
- The HotelReceptionistAgent system prompt includes a "Browse vs. book" section (names the gate between browse and act tools) and a "Never invent a confirmation" rule (kills hallucinated "you're booked" replies made without a tool call).
- `start_room_booking` / `start_restaurant_booking` naming for the booking-flow initiators (makes it clear they start a multi-step process, not a one-shot finalize).
- `room_details` is split from `check_room_availability`, so listing rooms shows only types, with prices and views fetched lazily after the caller picks.
- COMMON_INSTRUCTIONS covers: one sentence per reply, one question per turn, spelled-out numbers/codes, no vague qualifiers, varied phrasing across consecutive asks, no input vocabulary in voice, never invent values, a date-interpretation rule (interpret specific weekdays, ask about vague timeframes), tool interactions are invisible, and an acknowledgment-is-not-a-turn rule.
- Persona inheritance: `capture_name`/`capture_email`/`capture_phone` pass `extra_instructions=COMMON_INSTRUCTIONS` to every spawned sub-task so they speak in the same voice.
- VerifyBookingTask instructions strip tool names (replaced with action verbs like "look up the booking").
- Past-stay guard on cancel_room_booking; "we" -> "I" persona cleanup across ToolErrors, return strings, and the DISPUTE_POLICIES.minibar explanation.
- lookup_invoice returns prose, not JSON-shaped data.
- dispute_charge docstring simplified (drops the redundant inline enum list; the Literal type carries it).
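The alpha-character validation described for `update_name` can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: the helper name `validate_name` and the use of `ValueError` are assumptions made for illustration, and only the behavior "reject values that contain no letters" comes from the description above.

```python
def validate_name(value: str) -> str:
    """Hypothetical helper: return the trimmed name, rejecting digit-only input.

    The real task presumably reports the problem through the framework's own
    tool-error mechanism; ValueError is used here only to keep the sketch
    self-contained.
    """
    cleaned = value.strip()
    # A card number or phone number contains digits and separators but no
    # letters, so requiring at least one alphabetic character filters them out.
    if not any(ch.isalpha() for ch in cleaned):
        raise ValueError("name must contain at least one letter")
    return cleaned
```

With this check, `validate_name("  Ada Lovelace ")` returns the trimmed name, while a value such as `"4111 1111 1111 1111"` is rejected instead of being stored in the name field.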
    def _speak_code(code: str) -> str:
        return ", ".join(code.replace("-", " dash ").upper())
`_speak_code` joins individual characters instead of words, spelling out "DASH" letter by letter

`", ".join(string)` iterates over each character of the string, not over words. For the input `"HTL-AB12"`, the function produces `"H, T, L,  , D, A, S, H,  , A, B, 1, 2"`: the word "DASH" is spelled out as the individual letters D, A, S, H, and each space from `" dash "` becomes a separate entry that looks empty. The persona instructions at examples/hotel_receptionist/persona.py:22 explicitly specify the expected format as "H, T, L, dash, A, B, one, two". This function is called in 10+ places for confirmation codes, case numbers, and followup references, so every spoken code will be garbled.
Actual vs expected output
Input: "HTL-AB12"
Actual: `"H, T, L,  , D, A, S, H,  , A, B, 1, 2"`
Expected: "H, T, L, dash, A, B, 1, 2"
    -    return ", ".join(code.replace("-", " dash ").upper())
    +    return ", ".join("dash" if ch == "-" else ch for ch in code.upper())
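To make the difference concrete, the original and the suggested one-liner can be run side by side. This is a small standalone sketch: the names `speak_code_buggy`, `speak_code_fixed`, `speak_code_spelled`, and `DIGIT_WORDS` are made up for the comparison and are not part of the PR. The third variant is an assumption that goes beyond the suggestion: it also spells out digits, which is what the quoted persona format ("one, two") appears to ask for.

```python
def speak_code_buggy(code: str) -> str:
    # Original: "-" is expanded to " dash " first, then join() walks the
    # resulting string one character at a time, splitting "DASH" apart.
    return ", ".join(code.replace("-", " dash ").upper())


def speak_code_fixed(code: str) -> str:
    # Suggested change: turn each character into a token, then join tokens.
    return ", ".join("dash" if ch == "-" else ch for ch in code.upper())


# Hypothetical extension (not in the suggestion): spell digits as words.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def speak_code_spelled(code: str) -> str:
    tokens = ("dash" if ch == "-" else DIGIT_WORDS.get(ch, ch) for ch in code.upper())
    return ", ".join(tokens)


print(repr(speak_code_buggy("HTL-AB12")))    # 'H, T, L,  , D, A, S, H,  , A, B, 1, 2'
print(repr(speak_code_fixed("HTL-AB12")))    # 'H, T, L, dash, A, B, 1, 2'
print(repr(speak_code_spelled("HTL-AB12")))  # 'H, T, L, dash, A, B, one, two'
```

Whether digits should be spoken as words inside `_speak_code` or left to the TTS layer is a design choice the review comment does not settle.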