Graphium Test Restructure Plan

Goals

  1. Separate unit vs. integration behaviour clearly. Tiny unit tests should run without external services; integration suites should own database/driver dependencies and be opt‑in through markers or environment flags.
  2. Align test layout with module ownership. Group tests by Graphium subsystem (orchestration, search, drivers, MCP, UI) so the intent is discoverable.
  3. Enable focused CI pipelines. Keep a fast unit tier in the default workflow; provide reusable make/uv targets for the heavier integration checks.
  4. Eliminate ad‑hoc fixtures and environment leaks. Centralise service fixtures (Neo4j, FalkorDB, Kùzu, Neptune) and make their lifecycle predictable.

Current Pain Points

  • tests/test_graphium_mock.py mixes mock‑heavy assertions with DB access and now fails in CI without manual DISABLE_* overrides.
  • Service fixtures live in tests/helpers_test.py, which also registers drivers at import time; this makes selective execution fragile.
  • Module grouping is inconsistent: some suites live under tests/orchestration/, others at the root (e.g., test_edge_int.py, test_node_int.py), and tests/evals/ blends end‑to‑end runs with unit helpers.
  • CI cannot distinguish low‑cost regression checks from integration jobs that need real GraphDB instances.

Proposed Directory Layout

tests/
├─ unit/
│  ├─ orchestration/
│  │  ├─ test_bulk_serialization.py
│  │  ├─ test_ingestion_service.py
│  │  └─ …
│  ├─ search/
│  ├─ mcp/
│  ├─ utils/
│  └─ …
├─ integration/
│  ├─ neo4j/
│  │  ├─ test_graphium_neo4j.py
│  │  └─ fixtures_neo4j.py
│  ├─ falkordb/
│  ├─ kuzu/
│  └─ shared/
│     └─ fixtures_services.py
├─ e2e/
│  └─ (long‑running graph build, eval harnesses)
└─ helpers/
   ├─ embeddings.py
   ├─ factories.py
   └─ markers.py
  • Unit tier contains tests that can run with pure mocks/in-memory fixtures.
  • Integration tier retains the existing service coverage but moves per‑provider suites behind pytest markers (@pytest.mark.neo4j, etc.).
  • E2E tier is optional and covers only the eval harness and the smoke/regression scenarios that exercise the CLI or the MCP server end-to-end.
  • Shared helper modules move under tests/helpers/ to avoid import side effects when the unit tier is collected.
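One way to keep helper imports free of side effects is to record how a driver is built and defer the actual construction until a test asks for it. The following is a minimal sketch of that idea; `register_driver`, `get_driver`, and `make_fake_driver` are hypothetical names for illustration, not existing Graphium helpers.

```python
from typing import Callable, Dict

# Factories are cheap callables; importing this module opens no connections.
_FACTORIES: Dict[str, Callable[[], object]] = {}
_INSTANCES: Dict[str, object] = {}


def register_driver(name: str, factory: Callable[[], object]) -> None:
    """Record how to build a driver without building it."""
    _FACTORIES[name] = factory


def get_driver(name: str) -> object:
    """Build the driver on first use and reuse the cached instance afterwards."""
    if name not in _INSTANCES:
        _INSTANCES[name] = _FACTORIES[name]()
    return _INSTANCES[name]


# Illustration only: a stand-in factory that records when it is invoked.
build_calls = []


def make_fake_driver() -> dict:
    build_calls.append("built")
    return {"provider": "fake"}


register_driver("fake", make_fake_driver)
```

With this shape, collecting the unit tier imports only the registry; a service is touched the first time an integration fixture calls `get_driver`.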

Execution Strategy

| Tier | Location | Marker / Env Flag | CI default | Command suggestion |
| --- | --- | --- | --- | --- |
| Unit | tests/unit | pytest -m "not integration and not e2e" | yes | uv run pytest tests/unit |
| Integration | tests/integration | @pytest.mark.integration + provider markers | opt‑in | uv run pytest -m "integration and neo4j" |
| E2E | tests/e2e | @pytest.mark.e2e | manual | uv run pytest -m e2e |

Markers to add in pytest.ini:

[pytest]
markers =
    integration: tests that require external services
    neo4j: requires a running Neo4j instance
    falkordb: requires a running FalkorDB instance
    kuzu: requires a running Kùzu instance
    e2e: long-running end-to-end scenarios
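
Rather than decorating every test by hand, the tier and provider markers could be derived from the directory a test lives in. The sketch below assumes the proposed layout (tests/integration/<provider>/…, tests/e2e/…) and pytest 7 or newer for `item.path`; `markers_for_path` is a hypothetical helper, and the hook would live in tests/conftest.py.

```python
from pathlib import PurePosixPath
from typing import List

TIER_MARKERS = ("integration", "e2e")
PROVIDER_MARKERS = ("neo4j", "falkordb", "kuzu")


def markers_for_path(path: str) -> List[str]:
    """Map a test file path to the marker names implied by its directories."""
    parts = PurePosixPath(path).parts
    if "tests" not in parts:
        return []
    # Directories between the first "tests" component and the file name.
    below_tests = parts[parts.index("tests") + 1 : -1]
    if not below_tests or below_tests[0] not in TIER_MARKERS:
        return []  # unit tests (and anything else) stay unmarked
    markers = [below_tests[0]]
    if below_tests[0] == "integration":
        markers += [p for p in PROVIDER_MARKERS if p in below_tests[1:]]
    return markers


def pytest_collection_modifyitems(items):
    """conftest.py hook: attach the derived markers to each collected test."""
    for item in items:
        for name in markers_for_path(item.path.as_posix()):
            item.add_marker(name)
```

Under these assumptions, a file such as tests/integration/neo4j/test_graphium_neo4j.py would be selected by `-m "integration and neo4j"` without any decorator in the file itself.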

Detailed File Mapping

Integration Tests

  • tests/integration/core/shared/test_community_operations.py
    • tests.test_graphium_mock::test_determine_entity_community
    • tests.test_graphium_mock::test_get_community_clusters
  • tests/integration/core/shared/test_entity_exclusion.py
    • tests.test_entity_exclusion_int::test_exclude_all_types
    • tests.test_entity_exclusion_int::test_exclude_default_entity_type
    • tests.test_entity_exclusion_int::test_exclude_no_types
    • tests.test_entity_exclusion_int::test_exclude_specific_custom_types
    • tests.test_entity_exclusion_int::test_excluded_types_parameter_validation_in_add_episode
    • tests.test_entity_exclusion_int::test_validation_invalid_excluded_types
    • tests.test_entity_exclusion_int::test_validation_valid_excluded_types
  • tests/integration/core/shared/test_graphium_bootstrap.py
    • tests.test_graphium_int::test_graphium_init
  • tests/integration/core/shared/test_ingestion_pipeline.py
    • tests.test_graphium_mock::test_add_bulk
    • tests.test_graphium_mock::test_add_episode_persists_nodes_and_edges
    • tests.test_graphium_mock::test_filter_existing_duplicate_of_edges
    • tests.test_graphium_mock::test_get_embeddings_for_communities
    • tests.test_graphium_mock::test_get_embeddings_for_edges
    • tests.test_graphium_mock::test_get_embeddings_for_nodes
    • tests.test_graphium_mock::test_graphium_retrieve_episodes
    • tests.test_graphium_mock::test_remove_episode
  • tests/integration/core/shared/test_repository_edges.py
    • tests.test_edge_int::test_community_edge
    • tests.test_edge_int::test_entity_edge
    • tests.test_edge_int::test_episodic_edge
  • tests/integration/core/shared/test_repository_nodes.py
    • tests.test_node_int::test_community_node
    • tests.test_node_int::test_entity_node
    • tests.test_node_int::test_episodic_node
  • tests/integration/core/shared/test_search_edges.py
    • tests.test_graphium_mock::test_edge_bfs_search
    • tests.test_graphium_mock::test_edge_fulltext_search
    • tests.test_graphium_mock::test_edge_similarity_search
    • tests.test_graphium_mock::test_episode_mentions_reranker
    • tests.test_graphium_mock::test_get_relevant_edges_and_invalidation_candidates
  • tests/integration/core/shared/test_search_nodes.py
    • tests.test_graphium_mock::test_community_fulltext_search
    • tests.test_graphium_mock::test_community_similarity_search
    • tests.test_graphium_mock::test_episode_fulltext_search
    • tests.test_graphium_mock::test_get_communities_by_nodes
    • tests.test_graphium_mock::test_get_mentioned_nodes
    • tests.test_graphium_mock::test_get_relevant_nodes
    • tests.test_graphium_mock::test_node_bfs_search
    • tests.test_graphium_mock::test_node_distance_reranker
    • tests.test_graphium_mock::test_node_fulltext_search
    • tests.test_graphium_mock::test_node_similarity_search
  • tests/integration/cross_encoder/test_bge_reranker.py
    • tests.cross_encoder.test_bge_reranker_client::test_rank_basic_functionality
    • tests.cross_encoder.test_bge_reranker_client::test_rank_empty_input
    • tests.cross_encoder.test_bge_reranker_client::test_rank_single_passage
  • tests/integration/drivers/test_falkordb_driver.py
    • tests.driver.test_falkordb_driver::TestDatetimeConversion.test_convert_datetime_dict
    • tests.driver.test_falkordb_driver::TestDatetimeConversion.test_convert_datetime_list_and_tuple
    • tests.driver.test_falkordb_driver::TestDatetimeConversion.test_convert_other_types_unchanged
    • tests.driver.test_falkordb_driver::TestDatetimeConversion.test_convert_single_datetime
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_close_calls_connection_close
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_delete_all_indexes
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_execute_query_converts_datetime_parameters
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_execute_query_handles_index_already_exists_error
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_execute_query_propagates_other_exceptions
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_execute_query_success
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_get_graph_with_name
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_get_graph_with_none_defaults_to_default_database
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_init_with_connection_params
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_init_with_falkor_db_instance
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_provider
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_session_creation
    • tests.driver.test_falkordb_driver::TestFalkorDriver.test_session_creation_with_none_uses_default_database
    • tests.driver.test_falkordb_driver::TestFalkorDriverIntegration.test_basic_integration_with_real_falkordb
    • tests.driver.test_falkordb_driver::TestFalkorDriverSession.test_close_method
    • tests.driver.test_falkordb_driver::TestFalkorDriverSession.test_execute_write_passes_session_and_args
    • tests.driver.test_falkordb_driver::TestFalkorDriverSession.test_run_converts_datetime_objects_to_iso_strings
    • tests.driver.test_falkordb_driver::TestFalkorDriverSession.test_run_propagates_exceptions
  • tests/integration/llm_client/test_anthropic_client.py
    • tests.llm_client.test_anthropic_client_int::test_extract_json_from_text
    • tests.llm_client.test_anthropic_client_int::test_generate_simple_response

Unit Tests

  • tests/unit/core/maintenance/test_bulk_utils.py
    • tests.utils.maintenance.test_bulk_utils::test_build_directed_uuid_map_chain
    • tests.utils.maintenance.test_bulk_utils::test_build_directed_uuid_map_empty
    • tests.utils.maintenance.test_bulk_utils::test_build_directed_uuid_map_preserves_direction
    • tests.utils.maintenance.test_bulk_utils::test_candidate_edges_for_uses_semantic_similarity
    • tests.utils.maintenance.test_bulk_utils::test_collect_edge_candidates_filters_by_endpoints
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_edges_bulk_deduplicates_within_episode
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_nodes_bulk_handles_empty_batch
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_nodes_bulk_missing_canonical_falls_back
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_nodes_bulk_reuses_canonical_nodes
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_nodes_bulk_single_episode
    • tests.utils.maintenance.test_bulk_utils::test_dedupe_nodes_bulk_uuid_map_respects_direction
    • tests.utils.maintenance.test_bulk_utils::test_find_exact_name_match_handles_case
    • tests.utils.maintenance.test_bulk_utils::test_merge_canonical_nodes_detects_exact_match
    • tests.utils.maintenance.test_bulk_utils::test_resolve_edge_pointers_updates_sources
  • tests/unit/core/maintenance/test_edge_operations.py
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_invalidates_older_edges
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_inserts_new_fact_when_no_duplicates
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_invalidates_matching_edges
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_invalidates_partial_duplicates
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_removes_inconsistent_facts
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_updates_expired_edges
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_updates_fact
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_updates_validity_window
    • tests.utils.maintenance.test_edge_operations::test_apply_invalidation_policy_updates_with_new_fact
    • tests.utils.maintenance.test_edge_operations::test_convert_extracted_edges_to_entities_filters_blank_facts
    • tests.utils.maintenance.test_edge_operations::test_convert_extracted_edges_to_entities_logs_invalid_indices
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edge_accepts_unknown_fact_type
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edge_exact_fact_short_circuit
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edge_rejects_unmapped_fact_type
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edge_uses_integer_indices_for_duplicates
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edges_fast_path_deduplication
    • tests.utils.maintenance.test_edge_operations::test_resolve_extracted_edges_keeps_unknown_names
  • tests/unit/core/maintenance/test_node_resolution.py
    • tests.utils.maintenance.test_node_operations::test_collect_candidate_nodes_dedupes_and_merges_override
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_from_nodes_with_callback
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_with_callback_generate_summary
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_with_callback_skip_summary
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_with_selective_callback
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_with_selective_callback_override_summary
    • tests.utils.maintenance.test_node_operations::test_extract_attributes_without_callback_generates_summary
    • tests.utils.maintenance.test_node_operations::test_has_high_entropy_rules
    • tests.utils.maintenance.test_node_operations::test_hash_minhash_and_lsh
    • tests.utils.maintenance.test_node_operations::test_jaccard_similarity_edges
    • tests.utils.maintenance.test_node_operations::test_materialize_extracted_entities_respects_exclusions
    • tests.utils.maintenance.test_node_operations::test_materialize_extracted_entities_sets_attribute_model
    • tests.utils.maintenance.test_node_operations::test_name_entropy_variants
    • tests.utils.maintenance.test_node_operations::test_normalize_helpers
    • tests.utils.maintenance.test_node_operations::test_resolve_nodes_exact_match_skips_llm
    • tests.utils.maintenance.test_node_operations::test_resolve_nodes_fuzzy_match
    • tests.utils.maintenance.test_node_operations::test_resolve_nodes_low_entropy_uses_llm
    • tests.utils.maintenance.test_node_operations::test_resolve_with_llm_ignores_duplicate_relative_ids
    • tests.utils.maintenance.test_node_operations::test_resolve_with_llm_ignores_out_of_range_relative_ids
    • tests.utils.maintenance.test_node_operations::test_resolve_with_llm_invalid_duplicate_idx_defaults_to_extracted
    • tests.utils.maintenance.test_node_operations::test_resolve_with_llm_updates_unresolved
    • tests.utils.maintenance.test_node_operations::test_resolve_with_similarity_exact_match_updates_state
    • tests.utils.maintenance.test_node_operations::test_resolve_with_similarity_low_entropy_defers_resolution
    • tests.utils.maintenance.test_node_operations::test_resolve_with_similarity_multiple_exact_matches_defers_to_llm
    • tests.utils.maintenance.test_node_operations::test_shingles_and_cache
    • tests.utils.maintenance.test_node_operations::test_signature_dtype_guard
  • tests/unit/core/maintenance/test_temporal_operations.py
    • tests.utils.maintenance.test_temporal_operations_int::test_get_edge_contradictions
    • tests.utils.maintenance.test_temporal_operations_int::test_get_edge_contradictions_multiple_existing
    • tests.utils.maintenance.test_temporal_operations_int::test_get_edge_contradictions_no_contradictions
    • tests.utils.maintenance.test_temporal_operations_int::test_get_edge_contradictions_no_effect
    • tests.utils.maintenance.test_temporal_operations_int::test_get_edge_contradictions_temporal_update
    • tests.utils.maintenance.test_temporal_operations_int::test_invalidate_edges_complex
    • tests.utils.maintenance.test_temporal_operations_int::test_invalidate_edges_partial_update
  • tests/unit/core/orchestration/test_bulk_persistence.py
    • tests.orchestration.test_bulk::test_persist_bulk_payloads_wrapps_sequences_for_graph_operations
    • tests.orchestration.test_bulk::test_serialize_entity_edges
    • tests.orchestration.test_bulk::test_serialize_entity_nodes
    • tests.orchestration.test_bulk::test_serialize_episodes_converts_source_enum
  • tests/unit/core/orchestration/test_bulk_serialization.py
    • tests.orchestration.test_bulk_serialization::test_serialize_episodic_edge_payload
    • tests.orchestration.test_bulk_serialization::test_serialize_episodic_edge_payload_handles_missing_embedding
    • tests.orchestration.test_bulk_serialization::test_serialize_entity_edge_payload
    • tests.orchestration.test_bulk_serialization::test_serialize_entity_node_payload
    • tests.orchestration.test_bulk_serialization::test_serialize_episode_payload
  • tests/unit/core/orchestration/test_episode_orchestrator.py
    • tests.orchestration.test_episode_orchestrator::test_merge_edge_type_map_accepts_sequence_signature
    • tests.orchestration.test_episode_orchestrator::test_merge_edge_type_map_rejects_invalid_signatures
  • tests/unit/core/orchestration/test_initializer_factory.py
    • tests.test_graphium_factory_usage::test_graphium_invokes_reranker_factory
  • tests/unit/core/orchestration/test_node_operations_sequence.py
    • tests.orchestration.test_node_operations_sequence::test_collect_candidate_nodes_invocations
    • tests.orchestration.test_node_operations_sequence::test_resolve_extracted_nodes_accepts_any_sequence
  • tests/unit/core/providers/test_factory.py
    • tests.providers.test_factory::test_create_embedder_from_settings
    • tests.providers.test_factory::test_create_llm_client_from_settings
    • tests.providers.test_factory::test_create_reranker_from_llm_settings
  • tests/unit/core/search/test_edge_search_orchestration.py
    • tests.search.test_edge_search_orchestration::test_edge_search_bfs_seeded_from_results
    • tests.search.test_edge_search_orchestration::test_edge_search_cross_encoder
    • tests.search.test_edge_search_orchestration::test_edge_search_rrf_only
  • tests/unit/core/search/test_hybrid_search.py
    • tests.utils.search.search_utils_test::test_hybrid_node_search_delegates_to_similarity_and_fulltext
    • tests.utils.search.search_utils_test::test_hybrid_node_search_handles_missing_results
    • tests.utils.search.search_utils_test::test_hybrid_node_search_merges_scores
    • tests.utils.search.search_utils_test::test_hybrid_node_search_returns_nodes
  • tests/unit/core/search/test_lucene_utils.py
    • tests.helpers_test::test_lucene_sanitize
  • tests/unit/core/search/test_search_filters.py
    • tests.search.test_search_filters::test_build_date_filter_clause
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_builds_filters
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_dates
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_labels
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_uuid_filters
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_returns_empty_lists
    • tests.search.test_search_filters::test_node_search_filter_query_constructor_builds_filters
  • tests/unit/core/search/test_search_helpers.py
    • tests.search.test_search_helpers::test_build_search_config_handles_cross_encoder_weight
    • tests.search.test_search_helpers::test_build_search_config_sets_defaults
    • tests.search.test_search_helpers::test_build_search_config_validates_weights
    • tests.search.test_search_helpers::test_rescore_with_cross_encoder_handles_empty
    • tests.search.test_search_helpers::test_rescore_with_cross_encoder_sorts_results
  • tests/unit/core/search/test_search_utils_edges.py
    • tests.search.test_search_utils_edges::test_get_edge_invalidation_candidates_default_provider
    • tests.search.test_search_utils_edges::test_get_relevant_edges_default_provider
    • tests.search.test_search_utils_edges::test_node_distance_reranker
  • tests/unit/core/search/test_search_utils_filters.py
    • tests.search.test_search_utils_filters::test_build_edge_filter_clause_with_group_and_endpoints
    • tests.search.test_search_utils_filters::test_build_edge_filter_clause_without_filters
    • tests.search.test_search_utils_filters::test_collect_edge_matches_ignores_missing_uuid
    • tests.search.test_search_utils_filters::test_fulltext_query_default_provider_includes_group_filter
    • tests.search.test_search_utils_filters::test_fulltext_query_falkordb_delegates
    • tests.search.test_search_utils_filters::test_fulltext_query_kuzu_respects_max_length
  • tests/unit/embedder/test_embeddinggemma.py
    • tests.embedder.test_embeddinggemma::test_embeddinggemma_create
  • tests/unit/embedder/test_gemini.py
    • tests.embedder.test_gemini::test_gemini_embedding_client_handles_rate_limits
    • tests.embedder.test_gemini::test_gemini_embedding_client_initialization_defaults
    • tests.embedder.test_gemini::test_gemini_embedding_client_parses_response
  • tests/unit/embedder/test_openai.py
    • tests.embedder.test_openai::test_openai_embedder_creates_embeddings
    • tests.embedder.test_openai::test_openai_embedder_handles_rate_limit
  • tests/unit/embedder/test_voyage.py
    • tests.embedder.test_voyage::test_voyage_embedder_batches_inputs
    • tests.embedder.test_voyage::test_voyage_embedder_handles_http_error
  • tests/unit/llm_client/test_anthropic_client.py
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_create_tool
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_extract_json_from_text
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_generate_response_with_text_response
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_generate_response_with_tool_use
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_rate_limit_error
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_refusal_error
    • tests.llm_client.test_anthropic_client::TestAnthropicClientGenerateResponse.test_validation_error_retry
    • tests.llm_client.test_anthropic_client::TestAnthropicClientInitialization.test_init_with_config
    • tests.llm_client.test_anthropic_client::TestAnthropicClientInitialization.test_init_with_custom_client
    • tests.llm_client.test_anthropic_client::TestAnthropicClientInitialization.test_init_with_default_model
    • tests.llm_client.test_anthropic_client::TestAnthropicClientInitialization.test_init_without_config
  • tests/unit/llm_client/test_client.py
    • tests.llm_client.test_client::test_client_calls_generate_response
    • tests.llm_client.test_client::test_client_handles_structured_output
    • tests.llm_client.test_client::test_client_raises_empty_response_error
  • tests/unit/llm_client/test_errors.py
    • tests.llm_client.test_errors::TestEmptyResponseError.test_message_assignment
    • tests.llm_client.test_errors::TestEmptyResponseError.test_message_required
    • tests.llm_client.test_errors::TestRateLimitError.test_custom_message
    • tests.llm_client.test_errors::TestRateLimitError.test_default_message
    • tests.llm_client.test_errors::TestRefusalError.test_message_assignment
    • tests.llm_client.test_errors::TestRefusalError.test_message_required
  • tests/unit/llm_client/test_gemini_client.py
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_custom_max_tokens
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_empty_response_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_gemini_model_max_tokens_mapping
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_generate_response_simple_text
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_generate_response_with_structured_output
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_generate_response_with_system_message
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_get_model_for_size
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_max_retries_exceeded
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_max_tokens_precedence_fallback
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_model_size_selection
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_prompt_block_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_quota_error_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_rate_limit_error_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_resource_exhausted_error_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_retry_logic_with_safety_block
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_retry_logic_with_validation_error
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_safety_block_handling
    • tests.llm_client.test_gemini_client::TestGeminiClientGenerateResponse.test_structured_output_parsing_error
    • tests.llm_client.test_gemini_client::TestGeminiClientInitialization.test_init_with_config
    • tests.llm_client.test_gemini_client::TestGeminiClientInitialization.test_init_with_default_model
    • tests.llm_client.test_gemini_client::TestGeminiClientInitialization.test_init_with_thinking_config
    • tests.llm_client.test_gemini_client::TestGeminiClientInitialization.test_init_without_config
  • tests/unit/llm_client/test_groq_client.py
    • tests.llm_client.test_groq_client::test_generate_response_returns_json
    • tests.llm_client.test_groq_client::test_generate_response_salvages_json
    • tests.llm_client.test_groq_client::test_generate_response_validates_model
    • tests.llm_client.test_groq_client::test_rate_limit_error
  • tests/unit/llm_client/test_litellm_client.py
    • tests.llm_client.test_litellm_client::test_litellm_client_prefers_pydantic_ai
    • tests.llm_client.test_litellm_client::test_litellm_client_raises_rate_limit_error
    • tests.llm_client.test_litellm_client::test_litellm_client_reports_json_repair
    • tests.llm_client.test_litellm_client::test_litellm_client_retries_on_json_error
    • tests.llm_client.test_litellm_client::test_litellm_client_returns_json
    • tests.llm_client.test_litellm_client::test_litellm_client_validates_response_model
  • tests/unit/llm_client/test_pydantic_ai_adapter.py
    • tests.llm_client.test_pydantic_ai_adapter::test_pydantic_ai_adapter_multi_turn
    • tests.llm_client.test_pydantic_ai_adapter::test_pydantic_ai_adapter_requires_user_prompt
  • tests/unit/llm_client/test_structured_output.py
    • tests.llm_client.test_structured_output::test_format_structured_retry_message_json_error
    • tests.llm_client.test_structured_output::test_format_structured_retry_message_validation_error
    • tests.llm_client.test_structured_output::test_salvage_json_response_returns_none
    • tests.llm_client.test_structured_output::test_salvage_json_response_truncated_object
  • tests/unit/mcp/test_episode_queue.py
    • tests.mcp.test_episode_queue::test_enqueue_episode_retries_and_records_failures
  • tests/unit/search/test_edge_search_orchestration.py
    • tests.search.test_edge_search_orchestration::test_edge_search_bfs_seeded_from_results
    • tests.search.test_edge_search_orchestration::test_edge_search_cross_encoder
    • tests.search.test_edge_search_orchestration::test_edge_search_rrf_only
  • tests/unit/search/test_hybrid_search.py
    • tests.utils.search.search_utils_test::test_hybrid_node_search_delegates_to_similarity_and_fulltext
    • tests.utils.search.search_utils_test::test_hybrid_node_search_handles_missing_results
    • tests.utils.search.search_utils_test::test_hybrid_node_search_merges_scores
    • tests.utils.search.search_utils_test::test_hybrid_node_search_returns_nodes
  • tests/unit/search/test_lucene_utils.py
    • tests.helpers_test::test_lucene_sanitize
  • tests/unit/search/test_search_filters.py
    • tests.search.test_search_filters::test_build_date_filter_clause
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_builds_filters
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_dates
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_labels
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_handles_uuid_filters
    • tests.search.test_search_filters::test_edge_search_filter_query_constructor_returns_empty_lists
    • tests.search.test_search_filters::test_node_search_filter_query_constructor_builds_filters
  • tests/unit/search/test_search_helpers.py
    • tests.search.test_search_helpers::test_build_search_config_handles_cross_encoder_weight
    • tests.search.test_search_helpers::test_build_search_config_sets_defaults
    • tests.search.test_search_helpers::test_build_search_config_validates_weights
    • tests.search.test_search_helpers::test_rescore_with_cross_encoder_handles_empty
    • tests.search.test_search_helpers::test_rescore_with_cross_encoder_sorts_results
  • tests/unit/search/test_search_utils_edges.py
    • tests.search.test_search_utils_edges::test_get_edge_invalidation_candidates_default_provider
    • tests.search.test_search_utils_edges::test_get_relevant_edges_default_provider
    • tests.search.test_search_utils_edges::test_node_distance_reranker
  • tests/unit/search/test_search_utils_filters.py
    • tests.search.test_search_utils_filters::test_build_edge_filter_clause_with_group_and_endpoints
    • tests.search.test_search_utils_filters::test_build_edge_filter_clause_without_filters
    • tests.search.test_search_utils_filters::test_collect_edge_matches_ignores_missing_uuid
    • tests.search.test_search_utils_filters::test_fulltext_query_default_provider_includes_group_filter
    • tests.search.test_search_utils_filters::test_fulltext_query_falkordb_delegates
    • tests.search.test_search_utils_filters::test_fulltext_query_kuzu_respects_max_length
  • tests/unit/utils/test_text_utils.py
    • tests.test_text_utils::test_max_summary_chars_constant
    • tests.test_text_utils::test_truncate_at_sentence_empty
    • tests.test_text_utils::test_truncate_at_sentence_exact_length
    • tests.test_text_utils::test_truncate_at_sentence_multiple_periods
    • tests.test_text_utils::test_truncate_at_sentence_no_boundary
    • tests.test_text_utils::test_truncate_at_sentence_realistic_summary
    • tests.test_text_utils::test_truncate_at_sentence_short_text
    • tests.test_text_utils::test_truncate_at_sentence_strips_trailing_whitespace
    • tests.test_text_utils::test_truncate_at_sentence_with_exclamation
    • tests.test_text_utils::test_truncate_at_sentence_with_period
    • tests.test_text_utils::test_truncate_at_sentence_with_question
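
Because the mapping is long, a small script can flag any source test that is assigned to more than one destination file before anything is moved. This is a sketch that assumes the mapping has been transcribed into a plain dict of destination path to source test IDs; `find_duplicate_sources` is a hypothetical helper, and the sample entries are copied from the lists above.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_sources(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return each source test ID that appears under more than one destination."""
    destinations = defaultdict(list)
    for destination, sources in mapping.items():
        for source in sources:
            destinations[source].append(destination)
    return {src: dests for src, dests in destinations.items() if len(dests) > 1}


# A three-entry excerpt of the mapping, used only to exercise the check.
sample_mapping = {
    "tests/unit/core/search/test_lucene_utils.py": [
        "tests.helpers_test::test_lucene_sanitize",
    ],
    "tests/unit/search/test_lucene_utils.py": [
        "tests.helpers_test::test_lucene_sanitize",
    ],
    "tests/unit/mcp/test_episode_queue.py": [
        "tests.mcp.test_episode_queue::test_enqueue_episode_retries_and_records_failures",
    ],
}

duplicates = find_duplicate_sources(sample_mapping)
for source, dests in duplicates.items():
    print(f"{source} is mapped to: {', '.join(dests)}")
```

Running a check like this over the full mapping would surface the search suites that currently appear under both tests/unit/core/search/ and tests/unit/search/, so one destination can be chosen during sign-off.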

End-to-End Tests

  • The current tests/evals/ modules contain helper functions and CLI entry points but no pytest-collected tests. During the restructure, either convert these scripts into explicit tests/e2e/graph/test_eval_cli.py and tests/e2e/graph/test_eval_graph_building.py modules, with @pytest.mark.e2e wrappers around the existing logic, or leave them as manual harnesses documented outside pytest.

Migration Roadmap

  1. Scaffold directories (tests/unit, tests/integration, tests/e2e, tests/helpers).
  2. Extract helper utilities from tests/helpers_test.py into modular fixtures:
    • helpers/factories.py for fake embeddings/nodes.
    • helpers/services.py for service setup/teardown, parameterised by provider.
  3. Split test_graphium_mock.py:
    • Unit cases into tests/unit/orchestration/test_graphium_episode.py.
    • Service-backed cases (Neo4j/FalkorDB/Kùzu) into corresponding integration files.
  4. Move eval scripts (tests/evals/) under tests/e2e/ with explicit markers.
  5. Update the CI workflow to execute only the unit suite; provide a reusable GitHub Actions workflow (or a manual-dispatch workflow) for the integration tiers, with secrets for Neo4j credentials.
  6. Document local integration commands in docs/development.md (e.g., uv run pytest -m "integration and neo4j" --maxfail=1).
  7. Introduce a pre-commit hook or nox session that mirrors the CI unit tier for contributors.
  8. Enable per-provider integration pipelines once fixtures stabilise (start with Neo4j, then FalkorDB/Kùzu).
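
For step 2, helpers/factories.py could expose deterministic fakes so that unit tests never call a real embedder. A minimal sketch follows; the name `fake_embedding` and the default dimension of 8 are assumptions for illustration, not existing Graphium code.

```python
import hashlib
import math
from typing import List


def fake_embedding(text: str, dim: int = 8) -> List[float]:
    """Return a deterministic unit-length vector derived from the text.

    The values carry no semantic meaning; they only give tests stable,
    distinct vectors without touching an embedding service.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Cycle through the 32 digest bytes and centre each value around zero.
    # A byte is an integer, so byte / 255 - 0.5 is never exactly zero and
    # the norm below is always positive.
    raw = [digest[i % len(digest)] / 255.0 - 0.5 for i in range(dim)]
    norm = math.sqrt(sum(value * value for value in raw))
    return [value / norm for value in raw]
```

Because the vector depends only on the input string, assertions on similarity ordering stay reproducible across runs and machines.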

Service Provisioning Notes

  • Neo4j: Provide a docker-compose override in docker-compose.test.yml and add a make integration-neo4j target.
  • FalkorDB/Kùzu: Document optional usage; default to skipped unless environment variables (FALKORDB_URI, KUZU_DB_PATH) are supplied.
  • Redis (for FalkorDB cluster detection): Stub the FalkorDB client in unit tests; the integration suite can spin up an ephemeral Redis instance via docker compose.
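
The "skipped unless environment variables are supplied" rule for FalkorDB and Kùzu could be enforced centrally in a conftest.py hook rather than in each test. The sketch below uses the two variable names given above; `skip_reason` and `PROVIDER_ENV` are hypothetical names, and the hook assumes the provider markers are already applied to the tests.

```python
import os
from typing import Mapping, Optional

# Variables named in the plan; providers without an entry are never skipped here.
PROVIDER_ENV = {"falkordb": "FALKORDB_URI", "kuzu": "KUZU_DB_PATH"}


def skip_reason(provider: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a skip message when the provider's variable is unset or empty, else None."""
    env = os.environ if env is None else env
    variable = PROVIDER_ENV.get(provider)
    if variable is None or env.get(variable):
        return None
    return f"{provider} tests skipped: set {variable} to enable them"


def pytest_runtest_setup(item):
    """conftest.py hook: skip provider-marked tests whose service is not configured."""
    import pytest  # imported here so skip_reason stays usable without pytest installed

    for provider in PROVIDER_ENV:
        if item.get_closest_marker(provider) is not None:
            reason = skip_reason(provider)
            if reason:
                pytest.skip(reason)
```

Keeping the decision in one pure function makes it easy to unit-test the gating logic itself, and the skip message tells contributors which variable to set.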

Next Steps

  1. Review and sign off on the file-by-file mapping above (especially the large test_graphium_mock.py split).
  2. Implement Phase 1 (directory scaffold, helper extraction, unit CI tweaks).
  3. Convert ingestion/search suites per provider (Phase 2) and add markers.
  4. Move eval harness into tests/e2e and document opt-in execution (Phase 3).
  5. Update contributor documentation once restructuring lands.