Skip to content

Establish shared Feishu group transcript as the first step toward multi-agent group chat#399

Open
Y1fe1Zh0u wants to merge 2 commits intodataelement:mainfrom
Y1fe1Zh0u:feishu-shared-transcript-upstream
Open

Establish shared Feishu group transcript as the first step toward multi-agent group chat#399
Y1fe1Zh0u wants to merge 2 commits intodataelement:mainfrom
Y1fe1Zh0u:feishu-shared-transcript-upstream

Conversation

@Y1fe1Zh0u
Copy link
Copy Markdown
Collaborator

Summary

  • aggregate Feishu group prompt history across all agent sessions sharing the same external conversation id
  • add external_message_id on chat_messages and use it for deterministic transcript dedupe
  • persist Feishu outbound bot message_id values where available so transcript replay is more stable
  • add focused shared-transcript regression tests

Why This Change

The long-term goal is Feishu-as-channel multi-agent collaboration, where several agents can participate in the same group conversation and react to each other's public replies instead of behaving like isolated one-on-one bots.

This PR is intentionally only the first slice of that design: shared public transcript reading for Feishu group chats. It does not yet add mention routing, loop filtering, or full channel/member/message modeling.

Testing

  • PYTHONPATH=backend python3 -m unittest backend.tests.test_channel_session_shared_history
  • PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/app/models/audit.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py
  • simulated shared-group transcript read across two agent sessions in local Docker-backed test DB

@wisdomqin
Copy link
Copy Markdown
Contributor

wisdomqin commented Apr 14, 2026

Thanks for this PR. The direction makes sense to me: shared public transcript reading is a good first slice toward Feishu multi-agent group chat, and keeping P2P behavior unchanged is the right scope for this change.

Before we can consider merging it, there are two issues that should be fixed:

  1. The Alembic migration is attached to an older revision: down_revision = "add_microsoft_teams_support". The current migration chain has moved past that revision, so this is likely to create multiple heads. Please rebase the migration onto the current head / correct migration chain.

  2. load_shared_channel_history() currently loads sessions by source_channel + external_conv_id only. It should also be explicitly scoped to the current agent tenant, otherwise histories from another tenant with the same external conversation id could be mixed into the prompt. Even if Feishu chat IDs are usually globally stable, tenant isolation should be enforced in the query.

A smaller follow-up consideration: it may also be worth restricting the shared-history query to group sessions (is_group = true) so future channel/session types cannot accidentally share transcript context.

Once those isolation and migration-chain issues are addressed, this will be much safer to review further. Thanks again for the thoughtful implementation and the regression tests.

Copy link
Copy Markdown
Collaborator Author

Thanks, this feedback is aligned with the intended architecture.

The goal of this change is to treat a Feishu group as a shared public collaboration channel, not as a set of isolated bot-to-user conversations. Concretely, agents in the same Feishu group should read from one shared public transcript, while still keeping their private state isolated (workspace, memory, task state, internal reasoning, intermediate artifacts). This PR intentionally implements only the first slice of that design: shared transcript construction plus deterministic message-level deduplication. It does not yet attempt to solve higher-level coordination concerns such as scheduling, mention-based routing, or bot-loop suppression.

From that perspective, I agree with both blocker points:

  • the migration should be rebased onto the current Alembic head so we do not introduce multiple heads in the migration graph;
  • load_shared_channel_history() should be explicitly tenant-scoped, because transcript aggregation is tenant-bound in Clawith even if external conversation ids are usually globally stable.

I also agree with the follow-up suggestion to restrict the shared-history query to group sessions (is_group = true). That is consistent with the intended model: only public group channels should participate in transcript sharing, while P2P sessions should remain isolated.

At a system-design level, the staged plan is:

  1. shared public transcript (this PR),
  2. group scheduling / turn control,
  3. routing and loop control.

So the issues you pointed out are not just implementation details to tighten up — they are directly part of making this first transcript layer safe enough to serve as the base for later multi-agent group behavior.

I’ll address the migration chain, tenant isolation, and group-session scoping in the next revision.

@Y1fe1Zh0u
Copy link
Copy Markdown
Collaborator Author

@codex review this pr

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a43f74d220

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +464 to 470
history = await load_shared_channel_history(
db,
current_agent_id=agent_id,
external_conv_id=conv_id,
source_channel="feishu",
limit=ctx_size,
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exclude current event from shared group history

This shared-history fetch runs before the current inbound message is persisted for this agent, so in multi-agent groups a slower agent can read the same Feishu event that another agent already wrote and then send that same text again as llm_user_text. In that timing window the model receives the current turn twice (once from history, once as the live user input), which can cause duplicated or over-weighted responses. Filter shared history by the current message_id (or append history after removing the in-flight event) before invoking the LLM.

Useful? React with 👍 / 👎.

Comment on lines +119 to +123
dedupe_key = (dedupe_speaker, entry["content"], bucket)
if dedupe_key in seen:
continue
seen.add(dedupe_key)
normalized.append({**entry, "_priority": entry_priority})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict bucket dedupe to human transcript entries

The fallback bucket dedupe is applied unconditionally to messages without external_message_id, including assistant rows, even though the function comment says this bucketing is for duplicate human messages. Since many assistant outputs are saved with no external ID, two legitimate assistant turns from the same agent with identical text in the same 10-second bucket are collapsed into one, dropping real conversation turns from shared context.

Useful? React with 👍 / 👎.

…ti-agent group chat

The long-term goal is Feishu-as-channel multi-agent collaboration, where several
agents can participate in the same group conversation and react to each other's
public replies instead of behaving like isolated one-on-one bots.

This commit intentionally implements only the first slice of that design:
shared public transcript reading for Feishu group chats. It keeps persistence
per-agent, but changes prompt construction so agents in the same Feishu group
read a unified public history keyed by external conversation id. It also adds
channel-native message ids for deterministic dedupe and persists outbound Feishu
message ids where available so transcript replay is more stable.

Constraint: Current chat/session model is still pairwise by agent_id and external_conv_id
Constraint: Need a reversible first slice before adding routing, throttling, or loop control
Rejected: Full channel/member/message schema in the first change | too broad for initial rollout
Rejected: Prompt-only coordination without shared transcript | agents would still miss each other's public replies
Rejected: Text-and-time-bucket-only dedupe | insufficiently deterministic for repeated webhook fanout
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Treat this as step 1 only; follow-up work still needs mention routing, bot loop filtering, and stronger group scheduling
Tested: PYTHONPATH=backend python3 -m unittest backend.tests.test_channel_session_shared_history
Tested: PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/app/models/audit.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py
Tested: Simulated shared-group transcript read across two agent sessions in local Docker-backed test DB
Not-tested: Real multi-bot Feishu group end-to-end conversation in a live group
Not-tested: Mention routing, spam control, and bot loop behavior
Scope shared Feishu shared-transcript reads to the current tenant and to group
sessions only, so public transcript aggregation cannot mix unrelated tenant or
session contexts. Rebase the external_message_id migration onto the current
migration chain head to avoid creating a parallel Alembic head.

Constraint: Shared transcript is tenant-bound even when external conversation ids look globally stable
Rejected: Rely on source_channel + external_conv_id alone | risks cross-tenant prompt contamination
Rejected: Keep migration on add_microsoft_teams_support | would branch the migration graph on current main
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Future shared-channel readers should always carry tenant scope explicitly and stay limited to true group sessions
Tested: PYTHONPATH=backend python3.13 -m unittest backend.tests.test_channel_session_shared_history
Tested: PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py
Not-tested: Real multi-tenant Feishu deployment with overlapping external conversation ids
@Y1fe1Zh0u Y1fe1Zh0u force-pushed the feishu-shared-transcript-upstream branch from a43f74d to 5e19721 Compare April 14, 2026 11:58
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e197215c0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".



revision = "add_chat_msg_ext_id"
down_revision = "20260313_column_modify"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Rebase migration onto the current Alembic head

Setting down_revision to 20260313_column_modify creates a second migration head (add_chat_msg_ext_id) instead of extending the existing head chain (currently ending at f1a2b3c4d5e6), so alembic upgrade head becomes ambiguous. This repository’s startup path calls alembic upgrade head in backend/entrypoint.sh, so upgrades can fail before this column is applied while new Feishu writes already reference external_message_id.

Useful? React with 👍 / 👎.

Comment on lines +46 to +53
if "sqlalchemy" not in sys.modules:
fake_sqlalchemy = types.ModuleType("sqlalchemy")

def _fake_select(*args, **kwargs):
return _FakeQuery(*args)

fake_sqlalchemy.select = _fake_select
sys.modules["sqlalchemy"] = fake_sqlalchemy
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop overriding sqlalchemy in global test module state

This test injects a fake sqlalchemy module into sys.modules whenever SQLAlchemy was not already imported, which mutates interpreter-wide import state and can break later test modules that expect the real package (for example imports of sqlalchemy.exc). That makes suite behavior order-dependent when this file is imported before DB-backed tests.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants