Establish shared Feishu group transcript as the first step toward multi-agent group chat by Y1fe1Zh0u · Pull Request #399 · dataelement/Clawith

Y1fe1Zh0u · 2026-04-13T17:15:35Z

Summary

aggregate Feishu group prompt history across all agent sessions sharing the same external conversation id
add external_message_id on chat_messages and use it for deterministic transcript dedupe
persist Feishu outbound bot message_id values where available so transcript replay is more stable
add focused shared-transcript regression tests

Why This Change

The long-term goal is Feishu-as-channel multi-agent collaboration, where several agents can participate in the same group conversation and react to each other's public replies instead of behaving like isolated one-on-one bots.

This PR is intentionally only the first slice of that design: shared public transcript reading for Feishu group chats. It does not yet add mention routing, loop filtering, or full channel/member/message modeling.

Testing

PYTHONPATH=backend python3 -m unittest backend.tests.test_channel_session_shared_history
PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/app/models/audit.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py
simulated shared-group transcript read across two agent sessions in local Docker-backed test DB

wisdomqin · 2026-04-14T06:00:22Z

Thanks for this PR. The direction makes sense to me: shared public transcript reading is a good first slice toward Feishu multi-agent group chat, and keeping P2P behavior unchanged is the right scope for this change.

Before we can consider merging it, there are two issues that should be fixed:

The Alembic migration is attached to an older revision: down_revision = "add_microsoft_teams_support". The current migration chain has moved past that revision, so this is likely to create multiple heads. Please rebase the migration onto the current head / correct migration chain.
load_shared_channel_history() currently loads sessions by source_channel + external_conv_id only. It should also be explicitly scoped to the current agent tenant, otherwise histories from another tenant with the same external conversation id could be mixed into the prompt. Even if Feishu chat IDs are usually globally stable, tenant isolation should be enforced in the query.

A smaller follow-up consideration: it may also be worth restricting the shared-history query to group sessions (is_group = true) so future channel/session types cannot accidentally share transcript context.

Once those isolation and migration-chain issues are addressed, this will be much safer to review further. Thanks again for the thoughtful implementation and the regression tests.

Y1fe1Zh0u · 2026-04-14T06:11:02Z

Thanks, this feedback is aligned with the intended architecture.

The goal of this change is to treat a Feishu group as a shared public collaboration channel, not as a set of isolated bot-to-user conversations. Concretely, agents in the same Feishu group should read from one shared public transcript, while still keeping their private state isolated (workspace, memory, task state, internal reasoning, intermediate artifacts). This PR intentionally implements only the first slice of that design: shared transcript construction plus deterministic message-level deduplication. It does not yet attempt to solve higher-level coordination concerns such as scheduling, mention-based routing, or bot-loop suppression.

From that perspective, I agree with both blocker points:

the migration should be rebased onto the current Alembic head so we do not introduce multiple heads in the migration graph;
load_shared_channel_history() should be explicitly tenant-scoped, because transcript aggregation is tenant-bound in Clawith even if external conversation ids are usually globally stable.

I also agree with the follow-up suggestion to restrict the shared-history query to group sessions (is_group = true). That is consistent with the intended model: only public group channels should participate in transcript sharing, while P2P sessions should remain isolated.

At a system-design level, the staged plan is:

shared public transcript (this PR),
group scheduling / turn control,
routing and loop control.

So the issues you pointed out are not just implementation details to tighten up — they are directly part of making this first transcript layer safe enough to serve as the base for later multi-agent group behavior.

I’ll address the migration chain, tenant isolation, and group-session scoping in the next revision.

Y1fe1Zh0u · 2026-04-14T06:25:19Z

@codex review this pr

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a43f74d220

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-14T06:34:03Z

backend/app/api/feishu.py

+                history = await load_shared_channel_history(
+                    db,
+                    current_agent_id=agent_id,
+                    external_conv_id=conv_id,
+                    source_channel="feishu",
+                    limit=ctx_size,
                )


Exclude current event from shared group history

This shared-history fetch runs before the current inbound message is persisted for this agent, so in multi-agent groups a slower agent can read the same Feishu event that another agent already wrote and then send that same text again as llm_user_text. In that timing window the model receives the current turn twice (once from history, once as the live user input), which can cause duplicated or over-weighted responses. Filter shared history by the current message_id (or append history after removing the in-flight event) before invoking the LLM.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-14T06:34:03Z

backend/app/services/channel_session.py

+            dedupe_key = (dedupe_speaker, entry["content"], bucket)
+        if dedupe_key in seen:
+            continue
+        seen.add(dedupe_key)
+        normalized.append({**entry, "_priority": entry_priority})


Restrict bucket dedupe to human transcript entries

The fallback bucket dedupe is applied unconditionally to messages without external_message_id, including assistant rows, even though the function comment says this bucketing is for duplicate human messages. Since many assistant outputs are saved with no external ID, two legitimate assistant turns from the same agent with identical text in the same 10-second bucket are collapsed into one, dropping real conversation turns from shared context.

Useful? React with 👍 / 👎.

…ti-agent group chat The long-term goal is Feishu-as-channel multi-agent collaboration, where several agents can participate in the same group conversation and react to each other's public replies instead of behaving like isolated one-on-one bots. This commit intentionally implements only the first slice of that design: shared public transcript reading for Feishu group chats. It keeps persistence per-agent, but changes prompt construction so agents in the same Feishu group read a unified public history keyed by external conversation id. It also adds channel-native message ids for deterministic dedupe and persists outbound Feishu message ids where available so transcript replay is more stable. Constraint: Current chat/session model is still pairwise by agent_id and external_conv_id Constraint: Need a reversible first slice before adding routing, throttling, or loop control Rejected: Full channel/member/message schema in the first change | too broad for initial rollout Rejected: Prompt-only coordination without shared transcript | agents would still miss each other's public replies Rejected: Text-and-time-bucket-only dedupe | insufficiently deterministic for repeated webhook fanout Confidence: medium Scope-risk: moderate Reversibility: clean Directive: Treat this as step 1 only; follow-up work still needs mention routing, bot loop filtering, and stronger group scheduling Tested: PYTHONPATH=backend python3 -m unittest backend.tests.test_channel_session_shared_history Tested: PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/app/models/audit.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py Tested: Simulated shared-group transcript read across two agent sessions in local Docker-backed test DB Not-tested: Real multi-bot Feishu group end-to-end conversation in a live group Not-tested: Mention routing, spam control, and bot loop behavior

Scope shared Feishu shared-transcript reads to the current tenant and to group sessions only, so public transcript aggregation cannot mix unrelated tenant or session contexts. Rebase the external_message_id migration onto the current migration chain head to avoid creating a parallel Alembic head. Constraint: Shared transcript is tenant-bound even when external conversation ids look globally stable Rejected: Rely on source_channel + external_conv_id alone | risks cross-tenant prompt contamination Rejected: Keep migration on add_microsoft_teams_support | would branch the migration graph on current main Confidence: high Scope-risk: narrow Reversibility: clean Directive: Future shared-channel readers should always carry tenant scope explicitly and stay limited to true group sessions Tested: PYTHONPATH=backend python3.13 -m unittest backend.tests.test_channel_session_shared_history Tested: PYTHONPYCACHEPREFIX=/tmp/pycache python3.13 -m py_compile backend/app/api/feishu.py backend/app/services/channel_session.py backend/alembic/versions/add_chat_message_external_message_id.py backend/tests/test_channel_session_shared_history.py Not-tested: Real multi-tenant Feishu deployment with overlapping external conversation ids

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e197215c0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-14T12:06:03Z

backend/alembic/versions/add_chat_message_external_message_id.py

+
+
+revision = "add_chat_msg_ext_id"
+down_revision = "20260313_column_modify"


Rebase migration onto the current Alembic head

Setting down_revision to 20260313_column_modify creates a second migration head (add_chat_msg_ext_id) instead of extending the existing head chain (currently ending at f1a2b3c4d5e6), so alembic upgrade head becomes ambiguous. This repository’s startup path calls alembic upgrade head in backend/entrypoint.sh, so upgrades can fail before this column is applied while new Feishu writes already reference external_message_id.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-14T12:06:03Z

backend/tests/test_channel_session_shared_history.py

+if "sqlalchemy" not in sys.modules:
+    fake_sqlalchemy = types.ModuleType("sqlalchemy")
+
+    def _fake_select(*args, **kwargs):
+        return _FakeQuery(*args)
+
+    fake_sqlalchemy.select = _fake_select
+    sys.modules["sqlalchemy"] = fake_sqlalchemy


Stop overriding sqlalchemy in global test module state

This test injects a fake sqlalchemy module into sys.modules whenever SQLAlchemy was not already imported, which mutates interpreter-wide import state and can break later test modules that expect the real package (for example imports of sqlalchemy.exc). That makes suite behavior order-dependent when this file is imported before DB-backed tests.

Useful? React with 👍 / 👎.

chatgpt-codex-connector bot reviewed Apr 14, 2026

View reviewed changes

Y1fe1Zh0u added 2 commits April 14, 2026 19:55

Y1fe1Zh0u force-pushed the feishu-shared-transcript-upstream branch from a43f74d to 5e19721 Compare April 14, 2026 11:58

chatgpt-codex-connector bot reviewed Apr 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Establish shared Feishu group transcript as the first step toward multi-agent group chat#399

Establish shared Feishu group transcript as the first step toward multi-agent group chat#399
Y1fe1Zh0u wants to merge 2 commits intodataelement:mainfrom
Y1fe1Zh0u:feishu-shared-transcript-upstream

Y1fe1Zh0u commented Apr 13, 2026

Uh oh!

wisdomqin commented Apr 14, 2026 •

edited

Loading

Uh oh!

Y1fe1Zh0u commented Apr 14, 2026

Uh oh!

Y1fe1Zh0u commented Apr 14, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants



		revision = "add_chat_msg_ext_id"
		down_revision = "20260313_column_modify"

Conversation

Y1fe1Zh0u commented Apr 13, 2026

Summary

Why This Change

Testing

Uh oh!

wisdomqin commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Y1fe1Zh0u commented Apr 14, 2026

Uh oh!

Y1fe1Zh0u commented Apr 14, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Apr 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wisdomqin commented Apr 14, 2026 •

edited

Loading