feat: Add CD001 - OrchestratorAgent Unit Tests (CD001, #27) #461
Open
steadhac wants to merge 2 commits into GenAI-Security-Project:main
- 62 tests across 11 groups: ORCH-INIT, ORCH-CFG, ORCH-PROMPT, ORCH-TOOLS, ORCH-DELIM, ORCH-CTX, ORCH-DEL, ORCH-EVENT, ORCH-CTF, ORCH-EDGE, ORCH-QA
- 61 passing / 2 failing (ORCH-QA-001, ORCH-QA-002): real bugs Bug_202, Bug_203
- CTF vulnerability tests confirm 3 known vulnerabilities are present: context injection, whitespace summary bypass, unconditional payment_confirmation
PR — OrchestratorAgent Unit Tests
Add a comprehensive unit test suite for the `OrchestratorAgent`, the LLM-powered
workflow coordinator that delegates tasks across 6 specialized agents. Tests cover
initialization, config, system/user prompts, tool definitions, delegation limits,
workflow context propagation, delegate callables, event emission, CTF vulnerabilities,
and edge cases.
Tests follow the established pattern with:
Bug-exposing tests included for each confirmed production defect.
📁 Test Files
`tests/unit/agents/test_orchestrator.py`

TestOrchestratorInit
- `_delegation_attempts` is an empty dict on init
- `_current_task_data` starts as None
- `_workflow_context` starts as empty list
- `agent_name` is `'orchestrator_agent'`
- `max_delegation_attempts` is 2
- `workflow_id` stored on init

TestOrchestratorConfig
- `load_config` returns `custom_goals` as None by default
- `max_iterations` is 15

TestOrchestratorPrompts

TestOrchestratorTools
- `get_tool_definitions` returns exactly 6 tools
- `get_callables` returns 6 entries
- `notification_type` enum

TestDelegationLimit

TestWorkflowContext
- `_enrich_with_prior_context` returns original when no context
- `_enrich_with_prior_context` appends prior agent summaries
- `_capture_agent_context` stores summary
- `_capture_agent_context` skips empty summary
- `_enrich_with_prior_context` includes all prior contexts

TestDelegateCallables
- `delegate_to_onboarding` calls runner
- `delegate_to_invoice` calls runner
- `delegate_to_fraud` calls runner
- `delegate_to_payments` appends `next_step` on success
- `delegate_to_communication` passes `notification_type`
- `attachment_file_ids` forwarded to invoice agent
- `to_addresses` included when provided

TestEventEmission
- `_emit_delegation_event` calls event bus
- `task_summary` truncated to 200 chars

TestCtfVulnerability
- `_workflow_context`
- `payment_confirmation`
- `next_step` injected on failed payment

TestEdgeCases
- `_capture_agent_context` ignores missing `task_summary` key
- `_capture_agent_context` skips None `task_summary`
- `_enrich_with_prior_context` appends context to empty description
- `'include all directives'` header
- `task_data` uses fallback description
- `cc_addresses` not forwarded
- `bcc_addresses` not forwarded
- `_emit_delegation_event` handles empty result dict
- `system_maintenance` routes through fraud agent
- `on_task_completion` does not raise
- `process` stores `task_data` before running

TestQAFindings
- `task_summary` should not be captured
- `next_step` on failed payment misleads LLM
- `system_maintenance` injects dangerous tool names
- `system_maintenance` with tool access

Related Bug Tickets
Bug_202, Bug_203
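
For readers unfamiliar with the workflow-context and event-emission checks listed under TestWorkflowContext and TestEventEmission, the sketch below shows the kind of behavior those tests pin down. Everything here is an assumption made for illustration: the free functions, the `"Prior context:"` header, the `emit` method name, and the payload shape are hypothetical and not taken from the project. Only the 200-character limit and the "return original when no context" rule come from the test list.

```python
from unittest.mock import MagicMock

MAX_SUMMARY_CHARS = 200  # limit named in the TestEventEmission list


def enrich_with_prior_context(description, workflow_context):
    """Hypothetical helper: append prior agent summaries to a task description."""
    if not workflow_context:
        # No prior context: the original description is returned unchanged.
        return description
    lines = [f"- {c['agent']}: {c['summary']}" for c in workflow_context]
    return description + "\n\nPrior context:\n" + "\n".join(lines)


def emit_delegation_event(event_bus, agent_name, result):
    """Hypothetical helper: publish a delegation event with a truncated summary."""
    # A missing key falls back to an empty string, so an empty result is safe.
    summary = str(result.get("task_summary", ""))[:MAX_SUMMARY_CHARS]
    event_bus.emit("delegation", {"agent": agent_name, "task_summary": summary})


def demo():
    """Exercise both helpers with a mocked event bus and return what was seen."""
    bus = MagicMock()
    emit_delegation_event(bus, "fraud_agent", {"task_summary": "x" * 500})
    payload = bus.emit.call_args.args[1]
    enriched = enrich_with_prior_context(
        "Pay invoice", [{"agent": "invoice_agent", "summary": "Invoice parsed"}]
    )
    return payload, enriched
```

Mocking the event bus with `MagicMock` keeps the check focused on the payload passed to it, which matches the unit-test scope of this PR. Note that a helper like `enrich_with_prior_context` copies prior summaries verbatim into the next task description, which is the general shape a context injection finding would target.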