
[WIP] Implement deep research logic from Python to C# #34

Closed
Copilot wants to merge 1 commit into master from copilot/master

Conversation

Contributor

Copilot AI commented Jun 2, 2026

  • Inspect existing research/search architecture and locate integration points
  • Build and test baseline to confirm current repository state
  • Map Python deep_research stages to Lethe-native C# flow and data models
  • Implement iterative Think→Search→Extract→Synthesize engine with progress phases, dedupe, stop logic, and stats
  • Integrate with existing abstractions (LLMEngine, IPromptBuilder, web search) and cancellation/timeout behavior
  • Add or update focused tests where existing test infrastructure allows
  • Run targeted validation build/tests and fix any issues
  • Run CodeQL check and address any relevant alerts
  • Finalize and report completed work
Original prompt

Implement the provided Python deep_research.py logic as faithfully as possible in the SerialKicked/Lethe-AI-Sharp codebase on branch master, preserving the original algorithmic structure and behavior rather than replacing it with a generic research scaffold.

Requirements:

  • Use the user's provided Python file as the source of truth for behavior.
  • Port the iterative Think→Search→Extract→Synthesize loop closely, including:
    • research planning step
    • optional category classification using the same categories/concept (product, comparison, howto, factcheck)
    • per-round query generation with first-round broad queries and later-round targeted follow-ups
    • search execution with provider/fallback awareness as appropriate for this repo
    • URL deduplication and query deduplication
    • extraction of useful evidence from search results/pages
    • synthesis of an evolving report using a bounded synthesis window
    • stop-decision logic after minimum rounds
    • final report generation with expansion retry if too short
    • tracking stats such as rounds, queries, URLs, category, and providers used where feasible
    • progress callback/event emission equivalent to the Python _emit(...) phases (planning, searching, reading, analyzing, writing, warning, error)
    • cancellation and timeout behavior as close as practical in C#
  • Integrate with existing Lethe-AI-Sharp abstractions instead of inventing unrelated patterns:
    • LLMEngine
    • IPromptBuilder (including structured output support where helpful)
    • existing web search support / EnrichedSearchResult
    • existing logging and agent/task architecture where appropriate
  • Keep the implementation repo-native and maintainable, but do not change the core logic/flow unless required by C# or the existing architecture.
  • If webpage content extraction capabilities are dependent on the configured search provider/settings, preserve the Python intent by using enriched content when available and clearly structuring the extraction stage around the available result data.
  • Add comments where useful to show correspondence to the original Python stages.
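
The loop described in these requirements can be sketched roughly as follows. This is a minimal illustration in Python, not the contents of deep_research.py, which is not shown in this thread. Every name here (`run_research`, `normalize_url`, the injected callables) and every default value (`min_rounds`, `max_rounds`, `window`) is a hypothetical stand-in chosen for the example.

```python
from typing import Callable, Dict, List, Set


def normalize_url(url: str) -> str:
    """Hypothetical dedupe key: drop the fragment and trailing slash, ignore case."""
    return url.split("#", 1)[0].rstrip("/").lower()


def run_research(
    question: str,
    make_queries: Callable[[str, int, str], List[str]],  # Think
    search: Callable[[str], List[str]],                  # Search: query -> URLs
    extract: Callable[[str], str],                       # Extract: URL -> evidence
    synthesize: Callable[[str, List[str]], str],         # Synthesize: update report
    should_stop: Callable[[str], bool],
    min_rounds: int = 2,   # assumed default
    max_rounds: int = 5,   # assumed default
    window: int = 8,       # assumed size of the synthesis window
) -> Dict[str, object]:
    seen_queries: Set[str] = set()
    seen_urls: Set[str] = set()
    findings: List[str] = []
    report = ""
    rounds = 0

    for round_no in range(1, max_rounds + 1):
        rounds = round_no
        # Think: the generator sees the round number and the report so far,
        # so round 1 can be broad and later rounds can be targeted follow-ups.
        queries = []
        for query in make_queries(question, round_no, report):
            key = query.strip().lower()
            if key and key not in seen_queries:  # query deduplication
                seen_queries.add(key)
                queries.append(query)
        if not queries:
            break

        # Search and Extract, skipping URLs already read in earlier rounds.
        for query in queries:
            for url in search(query):
                key = normalize_url(url)
                if key in seen_urls:  # URL deduplication
                    continue
                seen_urls.add(key)
                evidence = extract(url)
                if evidence:
                    findings.append(evidence)

        # Synthesize: only the most recent findings are passed in.
        report = synthesize(report, findings[-window:])

        # Stop decision is only consulted once the minimum rounds are done.
        if round_no >= min_rounds and should_stop(report):
            break

    return {
        "report": report,
        "rounds": rounds,
        "queries": len(seen_queries),
        "urls": len(seen_urls),
    }
```

Passing the stages in as callables keeps the control flow visible on its own; a port could swap in calls to LLMEngine and the existing web search while leaving the loop structure unchanged.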

Deliverable:

  • Make the necessary code changes in the repository and open a pull request with the faithful implementation.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: I was wondering how easy it would be to turn this (joined) python code into C# and integrate the whole concept into Lethe AI. Any idea?

Assistant: Yes — very doable, and this file is actually a good candidate for a C# port.

Short version: this is not “hard to translate” algorithmically. It’s mostly orchestration code: async loops, prompt construction, JSON parsing, cancellation, concurrency limiting, and calling existing services. That kind of logic maps cleanly from Python to modern C#/.NET. The real work is less about syntax conversion and more about wiring it into Lethe AI’s existing abstractions for LLM calls, search providers, webpage fetching, progress events, and settings.

Difficulty estimate

I’d rate it roughly:

  • Porting the core class only: moderate
  • Integrating it cleanly into Lethe AI’s architecture/UI/settings: moderate to somewhat involved
  • Reaching feature parity with the Python behavior: moderate
  • Making it feel native in the app: the main effort

If Lethe AI already has:

  • a unified LLM client,
  • search provider abstraction,
  • webpage fetch/content extraction utilities,
  • cancellation token flow,
  • and UI progress reporting,

then this becomes much easier.

Why this code ports well

This file is mostly composed of patterns C# handles very well:

  • async / await
  • bounded parallelism via semaphore
  • DTO-style JSON parsing
  • stateful service object
  • string prompt templates
  • cancellation
  • timeout handling
  • event/progress callbacks
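
As a rough illustration of the bounded-parallelism and timeout patterns in this list, here is a small Python sketch. It is not taken from the original file: `fetch_all`, its parameters, and the default limits are assumptions made for the example, and `fetch` stands in for whatever page-fetching coroutine the real code uses.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def fetch_all(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = 3,        # assumed limit
    per_fetch_timeout: float = 5.0,  # assumed timeout in seconds
) -> List[Optional[str]]:
    """Fetch every URL, with at most max_concurrency fetches in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url: str) -> Optional[str]:
        async with semaphore:  # blocks while the limit is reached
            try:
                # wait_for cancels the inner fetch if it exceeds the timeout.
                return await asyncio.wait_for(fetch(url), timeout=per_fetch_timeout)
            except asyncio.TimeoutError:
                return None  # a slow page is skipped rather than failing the batch

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(fetch_one(u) for u in urls))
```

A C# version would likely follow the same shape using the `SemaphoreSlim` and `Task.WhenAll` equivalents mentioned further down, with a `CancellationToken` carrying the timeout.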

Nothing here depends on very Python-specific features like:

  • dynamic metaprogramming,
  • decorators,
  • monkeypatching,
  • generators used in tricky ways,
  • pandas/numpy ecosystems,
  • or Python-only libraries central to the design.

So the concept transfers cleanly.

What the class actually is in C# terms

In C#, I’d think of this as something like:

  • a DeepResearchService or IterativeResearchEngine
  • depending on:
    • ILLMClient
    • ISearchProvider or search-provider factory
    • IWebContentFetcher
    • IResearchProgressSink or IProgress<ResearchProgressEvent>
    • config/options object

The Python class already has a nice service-oriented shape:

  • constructor-injected config
  • mostly private helper methods
  • one public ResearchAsync(...)
  • accumulated state for stats/report/findings

That maps very naturally to a C# class.
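
To make that shape concrete, here is a hedged Python sketch of such a service object. It is not the original class: `ResearchConfig`, `ResearchStats`, `DeepResearchService`, and their fields are hypothetical, the round loop is a placeholder, and only the phase names are taken from the prompt above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Phase names as listed in the prompt for the Python _emit(...) calls.
PHASES = {"planning", "searching", "reading", "analyzing", "writing", "warning", "error"}


@dataclass
class ResearchConfig:
    """Hypothetical options object; field names and defaults are illustrative."""
    max_rounds: int = 5


@dataclass
class ResearchStats:
    """Hypothetical accumulated state exposed after a run."""
    rounds: int = 0
    queries: int = 0
    urls: int = 0
    category: Optional[str] = None
    providers: List[str] = field(default_factory=list)


class DeepResearchService:
    def __init__(
        self,
        config: ResearchConfig,
        on_progress: Optional[Callable[[str, str], None]] = None,
    ) -> None:
        self._config = config            # constructor-injected config
        self._on_progress = on_progress  # optional progress sink
        self.stats = ResearchStats()

    def _emit(self, phase: str, message: str) -> None:
        """Private helper: forward a progress event if a sink is attached."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        if self._on_progress is not None:
            self._on_progress(phase, message)

    async def research(self, question: str) -> str:
        """Single public entry point; the real stages are elided here."""
        self._emit("planning", f"Planning research for: {question}")
        for round_no in range(1, self._config.max_rounds + 1):
            self._emit("searching", f"Round {round_no}")
            self.stats.rounds = round_no  # placeholder for the real round work
        self._emit("writing", "Writing final report")
        return f"Report on {question}"
```

In C#, the `on_progress` callable would correspond to something like the `IProgress<ResearchProgressEvent>` dependency listed above, and the two dataclasses to plain options and stats classes.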

Main translation points

1. Async methods

Python:

  • async def research(...)
  • asyncio.gather
  • asyncio.Semaphore

C# equivalent:

  • Task<string> ResearchAsync(...)
  • Task.WhenAll(...)
  • `SemaphoreSlim...

This pull request was created from Copilot chat.

SerialKicked deleted the copilot/master branch June 2, 2026 19:29
Copilot stopped work on behalf of SerialKicked due to an error June 2, 2026 19:29
Copilot AI requested a review from SerialKicked June 2, 2026 19:29