Pr/multilingual#121
katstankiewicz
left a comment
Can you also add ensure_ascii=False to AuditLog save()?
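For context on this request: Python's json module escapes every non-ASCII character as a \uXXXX sequence by default, so native-script values become unreadable in saved files. The following is a minimal sketch of the difference. The function save_audit_log is a hypothetical stand-in for illustration only and is not the actual AuditLog save() implementation.

```python
import json
from pathlib import Path


def save_audit_log(entries: list, path: Path) -> None:
    """Hypothetical stand-in for AuditLog save(); writes entries as UTF-8 JSON."""
    with path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps native script readable instead of \uXXXX escapes.
        json.dump(entries, f, ensure_ascii=False, indent=2)


entry = {"first_name": "राघव"}
escaped = json.dumps(entry)                       # default: ASCII-only output
readable = json.dumps(entry, ensure_ascii=False)  # native script preserved
```

Both strings decode back to the same dictionary, so the flag only changes how the file looks on disk, not what it contains. The file must be opened with a UTF-8 encoding for the unescaped form to be written safely.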
…behavioral_fidelity judge prompt
@gabegma new design to fix the accent issue for the user while keeping the real-world possibility of missing accents on the database side: romanized placeholders are stored in the DB rather than produced at generation time. This is more deterministic, readable, and cleaner, and the user always sees native script.
@@ -0,0 +1,22 @@
{
  "name": "A Garage",
Thanks for doing this - I'm still thinking we should pass the translation to the user too, because otherwise they might not match, but we can tackle it in a future PR
  "first_name": "<FIRST_NAME_ROMANIZED>",
  "last_name": "<LAST_NAME_ROMANIZED>",
Why do we have romanized here now?
Real-world data may not always keep native script. I can say my name is राघव and the agent should know I might mean "Raghav". So I made some of it romanized in the DB and some not.
Oh I didn't see your comment about the new design - so that's on purpose? It doesn't seem like it was applied to all files.
Won't this be an issue when computing the hash for the expected DB state or for authentication success if we can have either?
No, I tested task completion in French with a romanized expectation. Even with the accent on the letter, task completion passed. The agent is told to try the romanized spelling of the name if needed. I didn't apply it to all files because both forms could be valid depending on the database behind the scenes.
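To illustrate why an accented spoken name can still line up with a romanized database value, here is a minimal sketch of accent folding plus an alias lookup. In the PR this fallback is handled by instructing the agent to try the romanized spelling, so the code below is an assumption-laden illustration rather than the repository's mechanism. The names fold_accents, names_match, and the aliases mapping are all hypothetical.

```python
import unicodedata
from typing import Dict, Optional


def fold_accents(text: str) -> str:
    """Strip combining accents so 'Hélène' compares equal to 'Helene'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def names_match(spoken: str, stored: str,
                aliases: Optional[Dict[str, str]] = None) -> bool:
    """Compare the spoken name and its romanized alias (if any) to the stored name."""
    aliases = aliases or {}
    # Hypothetical alias map: native-script name -> romanized spelling.
    candidates = {spoken, aliases.get(spoken, spoken)}
    target = fold_accents(stored).casefold()
    return any(fold_accents(c).casefold() == target for c in candidates)
```

Accent folding covers Latin-script cases such as French. A name in another script, such as राघव, cannot be folded to "Raghav" this way and needs an explicit alias, which is what the name aliases in the dataset provide.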
Is it possible to add a test that checks for a disconnect between the individual JSON files and the actual dataset? I think it would be hard to spot right now if one of them is romanized but the other one is not.
Placeholders will be static and unchanging, but I can add a general test that takes the initial DB, applies the expected trace, and should pass per language, or something along those lines.
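One possible shape for the consistency check requested above is a field-by-field comparison between the per-file records and the main dataset. This is only a sketch under stated assumptions: the record layout (a mapping from record id to a flat dict), the function name find_mismatches, and the non-romanized placeholder "<FIRST_NAME>" are hypothetical and do not come from the PR.

```python
from typing import Dict, List


def find_mismatches(individual: Dict[str, dict],
                    dataset: Dict[str, dict]) -> List[str]:
    """Return 'record_id.field' for every value that differs between the two sources."""
    problems = []
    for record_id in sorted(individual.keys() | dataset.keys()):
        a = individual.get(record_id)
        b = dataset.get(record_id)
        if a is None or b is None:
            problems.append(f"{record_id}: missing from one source")
            continue
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                problems.append(f"{record_id}.{field}")
    return problems


# Hypothetical data: one source is romanized, the other is not.
individual = {"u1": {"first_name": "<FIRST_NAME_ROMANIZED>",
                     "last_name": "<LAST_NAME_ROMANIZED>"}}
dataset = {"u1": {"first_name": "<FIRST_NAME>",
                  "last_name": "<LAST_NAME_ROMANIZED>"}}
mismatches = find_mismatches(individual, dataset)
```

A test could assert that this list is empty for every committed language, which would surface a romanized placeholder on one side and a native-script placeholder on the other.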
gabegma
left a comment
Excellent work!! I love that you have pushed your script so others can add new languages.
I'm still nervous about the aliases, and I think we should feed the translations to the user so they match. Or we change the tools themselves so we don't need to pass unstructured text.
I would do that before adding new languages, but in a separate PR.
I also think we are missing a few tests - especially for looking at the disconnect between the dataset's individual files and the main one, but I need to drop, so I'm pre-approving!
initial multilingual version
Easily extendable to many languages using the add_culture_data script. This handles translation, gender-consistent naming, name suggestions, data extension, etc. So if anyone wants to run a language not committed in EVA data, it is trivially easy to do so.
Readme section showing the basics of adding a language.
This adds:
Multilingual data schema and content (initial utterances, system prompt, name aliases)
Multilingual support in code
Prompt updates to support multiple languages
Script to "add a language" with high degree of automation
WER metric normalization rules, dynamically set per language and creatable via LLM through adding script
Automatic .env.example adjustments (maintains config app accuracy)
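As an illustration of the per-language WER normalization item in the list above, here is a minimal sketch of a word error rate computed after language-specific text normalization. The rule table NORMALIZERS, its contents, and the function names are assumptions made for this example. The actual rule format produced by the adding script is not shown in this PR description.

```python
import re
import unicodedata

# Hypothetical per-language rules: lowercase and replace punctuation with spaces.
NORMALIZERS = {
    "en": lambda s: re.sub(r"[^\w\s']", " ", s.lower()),
    "fr": lambda s: re.sub(r"[^\w\s']", " ",
                           unicodedata.normalize("NFC", s).lower()),
}


def normalize(text: str, lang: str) -> list:
    """Apply the language's rule (plain lowercasing if none) and split into words."""
    rule = NORMALIZERS.get(lang, str.lower)
    return rule(text).split()


def wer(reference: str, hypothesis: str, lang: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref, hyp = normalize(reference, lang), normalize(hypothesis, lang)
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    if not ref:
        # Empty reference: 0.0 if the hypothesis is also empty, else 1.0.
        return float(bool(hyp))
    return prev[-1] / len(ref)
```

Selecting the rule by language code keeps the metric from penalizing differences that are only orthographic, such as the spacing before "!" in French or letter casing.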
Still TODO: