feat: Add CHAT-ASSIST — VendorChatAssistant & CoPilotAssistant Unit Tests for for Chat Streaming Layer Summary#456
Open
steadhac wants to merge 1 commit intoGenAI-Security-Project:mainfrom
Conversation
…ests (CD001,GenAI-Security-Project#27, Bug_186-Bug_201) - 177 unit tests covering init, prompts, tool definitions, masking, workflow dispatch, injection resistance, and boundary values - 5 intentionally failing tests document open bugs: Bug_186 GenAI-Security-Project#407, Bug_187 GenAI-Security-Project#408, Bug_188 GenAI-Security-Project#409, Bug_189 GenAI-Security-Project#410, Bug_190 GenAI-Security-Project#411 - Boundary tests document open bugs: Bug_194 GenAI-Security-Project#415, Bug_195 GenAI-Security-Project#416, Bug_196 GenAI-Security-Project#417, Bug_197 GenAI-Security-Project#418, Bug_198 GenAI-Security-Project#419 - Prompt consistency findings: Bug_199 GenAI-Security-Project#442, Bug_200 GenAI-Security-Project#443, Bug_201 GenAI-Security-Project#452
35b23c0 to
d58b4da
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add Unit Test Suite for Chat Streaming Layer Summary
Add a comprehensive unit test suite for the Chat Streaming Layer, covering both:
VendorChatAssistant (vendor-facing) CoPilotAssistant (admin-facing)
The suite validates core functionality, security, and edge cases, and includes bug-exposing tests for known production defects.
Scope
Tests cover:
Agent initialization MCP server configuration System prompt structure, correctness, and security rules Tool definitions and callable integrity Workflow dispatch behavior Sensitive field masking (PII protection) Injection/adversarial input handling Internationalization support Boundary and type validation Test Structure
All tests follow the standard format:
Title Basically (question being validated) Steps Expected Results
Bug-exposing tests are included and explicitly marked (⚠️ ).
Test File
tests/unit/agents/test_chat_assistant.py
Test Suites Overview
Initialization — TestChatAssistantInit
Validates correct setup of agent properties:
Agent naming (vendor vs copilot) Session context persistence Default values (history limit, MCP state, workflow ID, etc.) Tool callable structure
MCP Configuration — TestChatMCPServerTypes
Ensures correct MCP server types per agent:
Vendor vs CoPilot differences Base class defaults (findrive, finmail)
System Prompts
Includes:
TestVendorSystemPrompt TestCoPilotSystemPrompt Extended + isolation + negative tests
Validates:
Required sections (CAPABILITIES, RULES, workflow guidance) Tool usage instructions Security constraints Prompt isolation and encoding Date and identity correctness
Tool Definitions
Vendor Tools — TestVendorToolDefinitions
Correct tool count (6) Name matching Callable validation Strict mode enforcement
CoPilot Tools — TestCoPilotToolDefinitions
Expanded tool set (12) Enum validation (e.g., save_report) Parameter requirements
Tool Execution — TestExecuteTool
Covers:
Unknown tool handling (error JSON) Successful execution paths Exception handling Return-type normalization
Tool Labels & Definitions
Includes:
TestToolDisplayLabel TestGetToolDefinitions TestToolLabelAudit
Validates:
Label mapping correctness Fallback behavior MCP tool inclusion/exclusion Detection of stale or missing label mappings
Missing labels for active tools Stale label dictionary entries
Workflow Dispatch
TestCallStartWorkflow + TestWorkflowEdgeCases
Validates:
Background task handling Workflow ID propagation Parent/child relationships Attachment handling Event summary truncation
Sensitive Field Masking
TestSensitiveFieldMasking + TestMaskingEdgeCases
Ensures:
Proper masking of TIN, bank account, routing numbers Last-4-digit preservation Robust handling of edge cases (nulls, formats, types)
QA Findings (Bug-Exposing Tests) — TestQAFindings
Documents confirmed defects:
Internationalization — TestInternationalInputs
Validates handling of:
Chinese, Arabic (RTL), Japanese text Emojis and mixed Unicode Currency symbols Whitespace formatting preservation
Injection & Adversarial Inputs — TestInjectionAndAdversarialInputs
Covers resilience against:
Prompt injection SQL injection XSS payloads JSON injection Null bytes and shell characters Extremely long malicious inputs
Boundary & Type Handling — TestBoundaryAndTypeValues
Validates:
Extreme numeric values Large payloads (50k chars) Serialization edge cases Optional field handling
Missing validation for vendor_id=None Crashes on description=None Unsafe type forwarding to DB Lack of coercion/validation Related Bug Tickets Bug_186 (#407) Bug_187 (#408) Bug_188 (#409) Bug_189 (#410) Bug_190 (#411) Bug_194 (#415) Bug_195 (#416) Bug_196 (#417) Bug_197 (#418) Bug_198 (#419) Bug_199 (#442) Bug_200 (#443) Bug_201 (#452)