Commit e60bbd2
Upgrade planner to Llama 3.1 70B and add anti-pattern examples
Critical fixes for planner reliability:
1. Model upgrade: Mixtral-8x7B → Llama 3.1 70B Instruct Turbo
- Superior instruction following capability
- Better at structured output generation
- Proven reliability with similar spec generation tasks
2. Added explicit anti-pattern examples showing:
- ❌ WRONG: Landmarks outside FUNCTION blocks
- ✅ CORRECT: Landmarks inside FUNCTION blocks
- ❌ WRONG: Undefined DATA types in signatures
- ✅ CORRECT: DATA defined before use
3. Concrete examples of both correct and incorrect structures
show Llama exactly what to avoid
These changes address the critical issue where even default planner
examples were producing invalid specs with W001 (landmarks outside
FUNCTION) and W006 (undefined DATA type) errors.
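
To make the two warning codes concrete, here is a minimal sketch of a checker for them. The spec syntax (`DATA`, `FUNCTION ... END`, `LANDMARK` lines), the `check_spec` helper, and the list of built-in types are all assumptions made for illustration; the real spec grammar and validator are not shown in this commit.

```python
import re

# Assumed primitive types that need no DATA definition (hypothetical).
BUILTIN_TYPES = {"int", "float", "str", "bool"}


def check_spec(text):
    """Return (code, line_number) warnings for a toy, hypothetical spec format."""
    warnings = []
    defined = set()      # DATA names seen so far
    in_function = False  # True between a FUNCTION line and its END line
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("DATA "):
            defined.add(line.split()[1])
        elif line.startswith("FUNCTION "):
            in_function = True
            # Type names follow ':' (parameters) or '->' (return type).
            used = set(re.findall(r"(?::|->)\s*(\w+)", line))
            for type_name in sorted(used):
                if type_name not in defined and type_name not in BUILTIN_TYPES:
                    warnings.append(("W006", lineno))  # undefined DATA type
        elif line == "END":
            in_function = False
        elif line.startswith("LANDMARK ") and not in_function:
            warnings.append(("W001", lineno))  # landmark outside FUNCTION
    return warnings


wrong = "LANDMARK start\nFUNCTION move(p: Point) -> Point\nEND\nDATA Point"
correct = "DATA Point\nFUNCTION move(p: Point) -> Point\n  LANDMARK start\nEND"
wrong_warnings = check_spec(wrong)
correct_warnings = check_spec(correct)
```

In this sketch the `wrong` input trips both warnings because the landmark precedes any FUNCTION block and `Point` is used before its DATA definition; reordering the same lines, as in `correct`, clears both.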
Cost impact: Minimal in absolute terms (~$0.88/M tokens vs $0.60/M for Mixtral, about 47% more per token)
Quality impact: Significant improvement expected in spec validity
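
The price difference can be checked with a short calculation. The two per-million-token prices come from this commit message; the helper names and the 10M-token workload are hypothetical, chosen only to illustrate the arithmetic.

```python
MIXTRAL_USD_PER_M = 0.60  # price quoted in the commit message
LLAMA_USD_PER_M = 0.88    # price quoted in the commit message


def relative_increase(old, new):
    """Fractional price increase when moving from old to new."""
    return (new - old) / old


def extra_cost_usd(tokens):
    """Additional spend for a given token count after the upgrade."""
    return tokens / 1_000_000 * (LLAMA_USD_PER_M - MIXTRAL_USD_PER_M)


increase = relative_increase(MIXTRAL_USD_PER_M, LLAMA_USD_PER_M)
extra_for_10m = extra_cost_usd(10_000_000)  # hypothetical 10M-token workload
print(f"relative increase: {increase:.0%}")
print(f"extra cost for 10M tokens: ${extra_for_10m:.2f}")
```

The per-token price rises by roughly 47%, while the absolute difference stays at $0.28 per million tokens, which is the sense in which the impact is small.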
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent d6ebb4b commit e60bbd2
2 files changed
Lines changed: 61 additions & 2 deletions
Diff summary (file names and line contents were not preserved):
File 1: line 4 replaced (1 addition, 1 deletion).
File 2: 59 lines added as new lines 306-364; original line 314 replaced by new line 373 (60 additions, 1 deletion).