Draft
…ries and improve orchestrator configurations
…gress tracking; add comprehensive tests for termination logic
- Updated various files to enhance code formatting by aligning dictionary entries and improving whitespace usage.
- Refactored the `CosmosWorkflowCheckpoint` and `CosmosWorkflowCheckpointRepository` classes for better clarity.
- Improved the readability of list comprehensions and dictionary appends across multiple modules.
- Ensured consistent use of inline comments and docstrings where applicable.
- Enhanced the structure of error handling and logging in the `TelemetryManager` and `LoggingUtils` classes.
- Made minor adjustments to test cases for better clarity and maintainability.
…gs and Pydantic models; improve orchestration logic
… clarity and maintainability
…ception handling and telemetry updates; add tests for conversion report quality gates
… the processor steps
- Changed output folder naming from `/output` to `/converted` in various orchestration and prompt files.
- Updated timestamp handling in reports to use a consistent UTC format with a helper function. (mcp tool to local function)
- Enhanced routing instruction format for telemetry in prompt files to include phase labels.
- Added utility functions for generating current timestamps in UTC.
- Adjusted test cases to reflect changes in output folder structure and timestamp handling.
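A UTC timestamp helper of the kind this commit describes might look like the sketch below. The function name `utc_now_iso` and the exact output format are assumptions for illustration; the commit only states that reports use a consistent UTC format produced by a local helper.

```python
from datetime import datetime, timezone


def utc_now_iso() -> str:
    """Return the current time in UTC as an ISO 8601 string with a 'Z' suffix.

    Hypothetical helper: the name and the second-level precision are
    assumptions, not taken from the repository.
    """
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Using a timezone-aware `datetime.now(timezone.utc)` avoids the naive-datetime ambiguity that a plain `datetime.now()` would introduce in reports generated on hosts with different local time zones.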
…nhance prompt coordinator with detailed RAI content policy
…put folder rename, and test reorganization
- Add missing copyright headers (6 files)
- Add missing module docstrings (16 files)
- Remove banner/section-divider comments (main.py, main_service.py, queue_service.py)
- Remove redundant inline comments (main.py, prompt_util.py, console_util.py)
- Remove commented-out code (agent_framework_helper.py, groupchat_orchestrator.py, orchestrator_base.py, credential_util.py)
- Remove placeholder comments (main.py, main_service.py, prompt_util.py)
…lint fixes
- Update AZURE_OPENAI_API_VERSION from 2025-01-01-preview to 2025-03-01-preview in both main.bicep and main_custom.bicep
- Fix ruff lint errors in backend-api (None comparison, unused variables)
- Apply ruff formatting across processor and backend-api source files
- Fix missing 'import re' in orchestrator_base.py causing silent NameError
- Update regex to match new instruction format 'Phase X : Phase Title - <what to do>'
- Update all 4 coordinator prompts with consistent phase format in instruction field
- Fix progress bar using apiData.step instead of apiData.phase (sub-phase overwrites broke indexOf)
- Fix redundant phase labels: show step name as category, phase as sub-detail
- Redesign Current Activity section with multi-line agent cards
- Show detailed action labels (Thinking, Speaking, Invoking Tool, Analyzing)
- Add step-level elapsed timer from step_timings
- Add update_phase() to TelemetryManager for sub-phase UI updates
- Fix step timing seed to always initialize on step start
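The instruction format 'Phase X : Phase Title - <what to do>' can be parsed with a single regular expression. The following is a minimal sketch, not the pattern used in orchestrator_base.py: it assumes that X is a decimal number and that the title and the action are separated by a hyphen surrounded by whitespace. The names `PHASE_PATTERN` and `parse_phase_instruction` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed shape: "Phase <number> : <title> - <what to do>".
# The lazy title group stops at the first " - " separator, so hyphenated
# words inside the title (for example "Sign-off") are kept intact.
PHASE_PATTERN = re.compile(r"^\s*Phase\s+(\d+)\s*:\s*(.+?)\s+-\s+(.+)$")


def parse_phase_instruction(instruction: str) -> Optional[Tuple[int, str, str]]:
    """Split an instruction into (phase number, phase title, action).

    Returns None when the text does not follow the expected format.
    """
    match = PHASE_PATTERN.match(instruction)
    if match is None:
        return None
    number, title, action = match.groups()
    return int(number), title.strip(), action.strip()
```

Returning `None` on a mismatch, rather than raising, lets the caller fall back to a default phase label instead of failing the whole telemetry update.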
fix: progress modal phase tracking and UI improvements
Frontend:
- Add Migration Overview, Step Timeline, Agent Participation to summary page
- Show Coordinator routing target and instruction in progress modal
- Fix YAML step name capitalization
- Show phase name in modal title instead of step name
- Reverse Recent Activity order (latest first)
- Move Recent Activity title outside scroll box
- Hide Coordinator from Agent Participation (system agent)
Agent Prompts:
- Add ANALYSIS SIGN-OFF SCOPE to all 8 analysis agent prompts
- Add DESIGN SIGN-OFF SCOPE to all 8 design agent prompts
- Prevent analysis agents from FAIL on design-time concerns
- Prevent design agents from FAIL on stakeholder-dependent actions
- Scope sign-offs to step-appropriate criteria only
- Add last_full_message to agent_activities render output
- Use full message (not truncated preview) for Coordinator JSON parsing
- Fallback to regex extraction if JSON parse fails
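The parse-then-fallback behaviour described above could be sketched as follows. This is an illustration under assumptions: the function name `parse_coordinator_message` is hypothetical, and the fallback here extracts the outermost `{...}` span with a regular expression, which may differ from the extraction the frontend or backend actually performs.

```python
import json
import re
from typing import Any, Dict, Optional


def parse_coordinator_message(message: str) -> Optional[Dict[str, Any]]:
    """Parse a Coordinator message as JSON, falling back to regex extraction."""
    # First attempt: the whole message is a JSON object.
    try:
        parsed = json.loads(message)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass

    # Fallback: the JSON object is embedded in surrounding text.
    # Greedy match from the first '{' to the last '}' in the message.
    match = re.search(r"\{.*\}", message, re.DOTALL)
    if match is None:
        return None
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

This only works on the full message. A truncated preview would usually cut the closing brace, which is why the commit switches parsing to `last_full_message`.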
… args
- Add Phase 0 to _trim_messages: compress consumed tool outputs before generic trimming
- Keep 4 most recent tool results in full, truncate older ones to 500 chars
- Compress save_content_to_blob arguments to short summary
- Preserves agent responses (high-value) while cutting tool outputs (low-value)
- Saves ~87K chars in typical documentation step with 10+ tool calls
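The "keep the 4 most recent tool results, truncate older ones to 500 chars" rule can be illustrated with the sketch below. It assumes messages are plain dictionaries with `role` and `content` keys, which is a simplification; the real `_trim_messages` operates on the agent framework's own message types, and the function name `compress_tool_outputs` is hypothetical.

```python
from typing import Dict, List


def compress_tool_outputs(
    messages: List[Dict[str, str]],
    keep_recent: int = 4,
    max_chars: int = 500,
) -> List[Dict[str, str]]:
    """Truncate all but the most recent tool results; leave other messages alone."""
    tool_positions = [i for i, m in enumerate(messages) if m.get("role") == "tool"]
    # Everything before the last `keep_recent` tool results counts as "older".
    cutoff = max(len(tool_positions) - keep_recent, 0)
    older = set(tool_positions[:cutoff])

    compressed = []
    for i, message in enumerate(messages):
        content = message.get("content", "")
        if i in older and len(content) > max_chars:
            dropped = len(content) - max_chars
            # Copy the message so the caller's list is not mutated.
            message = {
                **message,
                "content": content[:max_chars] + f"... [truncated {dropped} chars]",
            }
        compressed.append(message)
    return compressed
```

Agent responses pass through untouched, which matches the stated intent of preserving high-value content while cutting low-value tool output.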
…human decisions
- Add CONVERSION SIGN-OFF SCOPE to all 4 convert reviewer prompts
- Clarify that environment-specific values (GPU labels, storage accounts, identity) should use best-practice defaults
- Update coordinator: environment unknowns are NOT grounds for hard_blocked
- Agents use placeholders and document in Assumptions instead of FAILing
… and RAI violations
Analysis: add scope reminders to Phase 2/3/4, fix Phase 5 to re-scope design concerns
Convert: add conversion sign-off scope rules, prevent hard_blocked on assumed values
- Change SIGN-OFF: PASS format from optional notes to mandatory notes
- Coordinator will reject bare PASS without verification details
- Added examples showing expected note format per step
- Applied to all 4 step coordinators (analysis, design, convert, documentation)
…grams
- Add remark-gfm for table/strikethrough/task list support
- Add rehype-raw for inline HTML rendering
- Add mermaid for rendering flowchart/sequence diagrams from code blocks
- Add GitHub-style CSS for tables, code blocks, blockquotes, headings
- Code blocks with language 'mermaid' render as SVG diagrams
- Other code blocks use syntax highlighting via react-syntax-highlighter
…-off
- All executors raise with actual error instead of yielding None
- Migration processor shows descriptive error with troubleshooting hints
- Smart tool-result compression in context trimming (Phase 0)
- Progressive trimming on each retry attempt
- YAML Expert signs off proactively after conversion
- Add actual mermaid.js rendering check via Node.js subprocess
- Validate syntax using real mermaid parser, not just heuristics
- Return detailed error messages from mermaid renderer for agent self-correction
- Install mermaid npm package in Dockerfile
- Update design architect prompt to use validate_mermaid_in_markdown and fix_mermaid tools
…ns in edge labels
- max_total_chars: 240K -> 600K (model supports ~950K chars)
- max_message_chars: 20K -> 40K
- keep_last_messages: 40 -> 50
- Progressive retry starts from 600K, reduces by 100K per attempt
- Final retry at 30K chars / 2K per msg / 4 messages
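The progressive retry schedule above can be expressed as a small function from attempt number to trimming budget. This is a sketch under stated assumptions: `TrimBudget` and `budget_for_attempt` are hypothetical names, the per-message and message-count limits are assumed to stay at 40K and 50 on intermediate attempts, and the 100K floor is an assumption added to keep the arithmetic from going negative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrimBudget:
    max_total_chars: int
    max_message_chars: int
    keep_last_messages: int


def budget_for_attempt(attempt: int, max_attempts: int) -> TrimBudget:
    """Return the context budget for a zero-based retry attempt."""
    # Last attempt: the aggressive budget from the commit message.
    if attempt >= max_attempts - 1:
        return TrimBudget(max_total_chars=30_000, max_message_chars=2_000, keep_last_messages=4)
    # Earlier attempts: start at 600K and shrink by 100K each time.
    total = max(600_000 - 100_000 * attempt, 100_000)  # floor is an assumption
    return TrimBudget(max_total_chars=total, max_message_chars=40_000, keep_last_messages=50)
```

For example, with `max_attempts=5` the total budget goes 600K, 500K, 400K, 300K, and then drops to the 30K last-resort budget on the final attempt.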
…e 5 re-scope routing
- Add explicit state-aware routing rule: if FAILs present, route to failing expert with re-scope instruction
- Strengthen NON-NEGOTIABLE rule: NEVER set finish=true while any reviewer has SIGN-OFF: FAIL
- Fix DOMPurify handling in mermaid render check
- Frontend batchView JSON support and markdown header cleanup
- Design architect: add DO NOT RE-EVALUATE rule, lower circuit breaker to 2
- Design coordinator: add Phase 6 state-aware routing, PENDING detection rule
- Design coordinator: add NON-NEGOTIABLE no-finish-with-FAILs rule
- Tested with process ff54015d - all 4 steps produce all-PASS sign-offs
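These routing rules live in the coordinator prompts, not in code. Written as a predicate, the intent is roughly the sketch below. The dictionary of reviewer name to sign-off status, the function names, and the choice to route to FAIL before PENDING are all assumptions made for illustration.

```python
from typing import Dict, Optional


def next_route(sign_offs: Dict[str, str]) -> Optional[str]:
    """Return the reviewer to route to next, or None if nobody is blocking.

    Assumed priority: reviewers with FAIL are routed to before reviewers
    whose sign-off is still PENDING.
    """
    for blocking_status in ("FAIL", "PENDING"):
        for reviewer, status in sign_offs.items():
            if status == blocking_status:
                return reviewer
    return None


def can_finish(sign_offs: Dict[str, str]) -> bool:
    """finish=true is allowed only when every reviewer has signed off PASS."""
    return bool(sign_offs) and all(status == "PASS" for status in sign_offs.values())
```

For instance, `next_route({"Chief Architect": "PENDING", "YAML Expert": "FAIL"})` returns `"YAML Expert"`, and `can_finish` stays `False` until both entries read `PASS`.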
- QdrantMemoryStore: in-process Qdrant embedded vector store, per-process isolation
- SharedMemoryContextProvider: ContextProvider that reads/writes shared memories
- invoking(): queries Qdrant for relevant memories before each LLM call
- invoked(): stores agent responses into shared memory after each LLM call
- OrchestratorBase: auto-initializes memory store + attaches provider to expert agents
- Enabled by default, controlled via SHARED_MEMORY_ENABLED env var
- Requires AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME for embedding generation
- mem0_async_memory: reduced max_tokens from 100K to 4K for extraction calls
- All 77 existing tests pass
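The read-before, write-after pattern of `invoking()` and `invoked()` can be shown with a self-contained toy. This sketch deliberately does not use Qdrant, embeddings, or the agent framework's `ContextProvider` base class: `KeywordMemoryStore` and `SharedMemoryHook` are hypothetical stand-ins, and word overlap replaces vector similarity so the example runs with the standard library only.

```python
from typing import List


class KeywordMemoryStore:
    """Toy stand-in for a vector store: ranks memories by shared words."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def search(self, query: str, limit: int = 3) -> List[str]:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t) for t in self._items]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


class SharedMemoryHook:
    """Mirrors the two hook points named in the commit message."""

    def __init__(self, store: KeywordMemoryStore) -> None:
        self.store = store

    def invoking(self, prompt: str) -> str:
        # Before the LLM call: prepend any relevant shared memories.
        memories = self.store.search(prompt)
        if not memories:
            return prompt
        header = "Relevant shared memories:\n" + "\n".join(f"- {m}" for m in memories)
        return header + "\n\n" + prompt

    def invoked(self, agent_name: str, response: str) -> None:
        # After the LLM call: store the agent's response for other agents.
        self.store.add(f"{agent_name}: {response}")
```

Because every expert agent shares one store, a finding recorded by one agent during analysis can surface in another agent's context during a later step.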
… tests
- MigrationProcessor: creates QdrantMemoryStore at workflow start, disposes in finally
- Memory persists across all 4 steps (analysis→design→convert→documentation)
- OrchestratorBase: resolves memory from AppContext instead of creating its own
- SharedMemoryContextProvider: fix duck typing for isinstance checks
- 18 tests for QdrantMemoryStore (init, add, search, workflow lifecycle)
- 20 tests for SharedMemoryContextProvider (invoking, invoked, edge cases)
- All 115 tests pass (77 existing + 38 new)
- SharedMemoryContextProvider: log inject count + stored content per agent turn
- MigrationProcessor: log total memory count after each step completes
- Enables real-time monitoring of memory flow across workflow steps
- Fix: use get_bearer_token_provider() instead of async variant (AzureCliCredential await error)
- Add print() statements for memory init diagnostics (embedding deployment found/missing/failed)
- Tested locally: 20 memories across 4 steps, workflow completed successfully in 19m 25s
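The "embedding deployment found/missing" diagnostic corresponds to a configuration check along these lines. The environment variable names come from the commit messages; the function name `shared_memory_enabled` and the set of strings treated as "off" are assumptions.

```python
import os
from typing import Mapping, Optional


def shared_memory_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether the shared memory store should be initialized.

    Enabled by default, switched off via SHARED_MEMORY_ENABLED, and
    unusable without an embedding deployment name.
    """
    env = os.environ if env is None else env
    flag = env.get("SHARED_MEMORY_ENABLED", "true").strip().lower()
    if flag in {"0", "false", "no", "off"}:  # accepted values are an assumption
        return False
    return bool(env.get("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME", "").strip())
```

Passing the environment as a parameter keeps the check easy to test without mutating `os.environ`.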
- Workspace context injected into agent system instructions (never trimmed)
- keep_last_messages reduced 50→20, max_total_chars 600K→400K
- ResultGenerator prompts moved to prompt_resultgenerator.txt (4 steps)
- Step transition phase shows 'Initializing {Step}' instead of step name
- flush_agent_memories() fixed: use agent.context_provider.providers
- Guard against uninitialized store in _flush_memory()
- Same-step memory skip (only search cross-step memories)
- Buffered storage (only last response per agent stored)
- Debug log for memory store resolution per step
- Tested: 17m 23s with keep_last_messages=20, all 4 steps PASS
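Two of the memory changes in this list, buffered storage and the same-step skip, can be sketched with plain data structures. The class and function names below are hypothetical, and memories are modelled as dictionaries with `agent`, `step`, and `text` keys, which is an assumption about shape rather than the repository's actual types.

```python
from typing import Dict, List, Tuple


class AgentResponseBuffer:
    """Keeps only the most recent response per agent until flushed."""

    def __init__(self) -> None:
        self._latest: Dict[str, Tuple[str, str]] = {}

    def record(self, agent: str, step: str, response: str) -> None:
        # A newer response from the same agent overwrites the older one.
        self._latest[agent] = (step, response)

    def flush(self) -> List[Dict[str, str]]:
        entries = [
            {"agent": agent, "step": step, "text": text}
            for agent, (step, text) in self._latest.items()
        ]
        self._latest.clear()
        return entries


def cross_step_only(memories: List[Dict[str, str]], current_step: str) -> List[Dict[str, str]]:
    """Drop memories written during the current step; keep earlier steps."""
    return [m for m in memories if m["step"] != current_step]
```

Skipping same-step memories avoids re-injecting content that is already present in the current conversation history, and buffering limits the number of embedding calls to one per agent per flush.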
…WARE ROUTING
- Added rule 6: route to YAML Expert if their sign-off is PENDING before Chief Architect finalizes
- Same pattern as Chief Architect PENDING fix in design coordinator
… UI fixes
- Bicep: add text-embedding-3-large model deployment (capacity 500) alongside GPT5.1
- Bicep: add AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME to App Config keys
- mem0_async_memory: replace hardcoded endpoints with env vars (AZURE_OPENAI_*)
- keep_last_messages adjusted 20→30 for analysis step stability
- Analysis executor: phase shows 'Initializing Analysis' instead of 'Analysis'
- Test assertions updated for new phase name
…fix, logging
- Fix list_blobs_in_container trailing-slash bug causing intermittent 'files not found'
- Remove tool-result truncation; only summarize save_content_to_blob writes
- Protect last message from per-message truncation
- Increase retry config: 8 retries, 5s base, 120s max with exponential backoff
- Add cooldown delay on context-trim retries to avoid triggering 429s
- Retry transient errors: empty messages, 5xx server errors
- Add embedding retry logic (3 retries) in QdrantMemoryStore
- Reduce keep_last_messages 30->15; disable per-message truncation
- Fix duplicate yaml_conversion/yaml telemetry key
- Clear OrchestratorBase._client_cache between processes
- Convert all runtime print() to logger.info/error/warning
- Remove text2art dependency
- Add debug logging to SharedMemoryContextProvider invoked/flush
- Prohibit Markdown footnotes in documentation reports
- Add diagnostic logging for _embed and _flush_memory failures
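The retry configuration above (8 retries, 5s base, 120s max, exponential backoff) implies a delay schedule that can be computed directly. The sketch assumes a doubling factor and no jitter, neither of which is stated in the commit message, and `backoff_delays` is a hypothetical name.

```python
from typing import List


def backoff_delays(retries: int = 8, base: float = 5.0, cap: float = 120.0) -> List[float]:
    """Return the wait in seconds before each retry, doubling up to the cap."""
    return [min(base * (2 ** attempt), cap) for attempt in range(retries)]
```

With the defaults, the schedule is 5, 10, 20, 40, 80, 120, 120, 120 seconds, so a request that exhausts all eight retries waits a little over eight and a half minutes in total before giving up.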
fix: resilience improvements for retry logic, context trimming, blob listing, and logging
Purpose
Does this introduce a breaking change?
Golden Path Validation
Deployment Validation
What to Check
Verify that the following are valid
Other Information