Merged
54 commits
6ef5fae
fix: correct stale send_agent_message references and unify msg_type e…
39499740 Apr 8, 2026
9473cce
feat(ui): polish chat sidebar — segment control, session items, selec…
KinyooZ Apr 9, 2026
7c40208
fix: add rate limiting to DingTalk org sync API calls
39499740 Apr 10, 2026
d2585f8
feat(ui): implement drag-and-drop file upload across application
cinderzhan Apr 10, 2026
5193737
Merge pull request #378 from dataelement/yutong03
cinderzhan Apr 10, 2026
e873678
fix Docker access port to 3008
lu-ang Apr 11, 2026
41666c2
feat: add Exa AI-powered search tool
tgonzalezc5 Apr 12, 2026
f673c61
feat: add Exa AI-powered search tool (#390)
wisdomqin Apr 12, 2026
284ddb8
feat(search): add standalone search engine tools for each provider
wisdomqin Apr 12, 2026
41e343e
fix: Docker access port to 3008 (#388)
wisdomqin Apr 12, 2026
1928805
fix Docker access port to 3008
lu-ang Apr 11, 2026
63c2452
Revert "fix: Docker access port to 3008 (#388)"
wisdomqin Apr 12, 2026
c933484
Revert "Revert "fix: Docker access port to 3008 (#388)""
wisdomqin Apr 12, 2026
8ee565f
fix(seeder): include parameters_schema in new tool INSERT
wisdomqin Apr 12, 2026
878517f
fix(seeder): include parameters_schema in new tool INSERT
wisdomqin Apr 12, 2026
c5870d3
Merge branch 'bugfix'
wisdomqin Apr 12, 2026
558dd8c
fix: add rate limiting to DingTalk org sync API calls
39499740 Apr 10, 2026
2dd3d02
fix: add rate limiting to DingTalk org sync API calls (#374)
wisdomqin Apr 12, 2026
99a3338
fix(image-context): persist base64 marker to DB at write time
wisdomqin Apr 12, 2026
66880a1
fix(chat): strip [image_data:] markers from history display
wisdomqin Apr 12, 2026
e0d9691
fix(chat): restore scroll by fixing main-content height for chat page
wisdomqin Apr 12, 2026
caa73ff
fix(chat): add minHeight:0 to chat flex wrapper to enable scroll
wisdomqin Apr 12, 2026
5639810
fix(chat): parse [image_data:] markers in ChatMessageItem + remove de…
wisdomqin Apr 12, 2026
2559313
fix(chat): deduplicate image display - strip markers always but only …
wisdomqin Apr 12, 2026
74325a8
feat: differentiate send_message_to_agent by msg_type (notify/consult…
39499740 Apr 8, 2026
ac26e6d
test: add unit tests for async A2A msg_type differentiation
39499740 Apr 8, 2026
eedcfef
chore: remove uv.lock generated by test dependency install
39499740 Apr 8, 2026
542b838
fix: prevent a2a_wake storm and notify silent failure
39499740 Apr 8, 2026
b540558
fix: suppress a2a_wake results from user chat
39499740 Apr 8, 2026
0d00744
fix: set max_fires=1 on A2A on_message triggers to prevent loops
39499740 Apr 8, 2026
1d5f8d2
fix: improve a2a_wake prompt to prevent unwanted consult-back
39499740 Apr 9, 2026
6c9e2f8
fix: skip DEDUP for send_message_to_agent wake calls
39499740 Apr 9, 2026
b7cc09c
feat: make msg_type required, improve tool description for auto-selec…
39499740 Apr 9, 2026
a76fd9d
feat: improve msg_type decision guide for ambiguous user intent
39499740 Apr 9, 2026
13cca98
fix: use trigger reason instead of internal name in user notification
39499740 Apr 9, 2026
7416dc6
fix: user-friendly notification headline for task_delegate triggers
39499740 Apr 9, 2026
8d2c536
fix: tell agent its a2a_wait reply is user-visible
39499740 Apr 9, 2026
aececc5
fix: dual protection against internal terms in user notifications
39499740 Apr 9, 2026
761fe2d
fix: strengthen regex filter for internal terms in user notifications
39499740 Apr 9, 2026
fb1f14f
perf: limit a2a_wake Reflection Sessions to 2 tool rounds
39499740 Apr 9, 2026
d3b4246
feat: add a2a_async_enabled feature flag (per-agent toggle)
39499740 Apr 9, 2026
2ce921e
refactor: move a2a_async_enabled from Agent to Tenant (company-level)
39499740 Apr 9, 2026
b929265
feat: add A2A async toggle to company settings page
39499740 Apr 9, 2026
1c9b78e
feat: add i18n translations for A2A async toggle (en + zh)
39499740 Apr 9, 2026
3af43eb
fix: add proper revision/down_revision to alembic migration
39499740 Apr 9, 2026
37f3efb
fix: security hardening and conflict prevention
39499740 Apr 9, 2026
5a7c6fe
ui: move A2A async toggle to bottom of Company Info tab, before Dange…
wisdomqin Apr 12, 2026
b2fac91
fix: hide msg_type param from LLM when a2a_async is disabled
wisdomqin Apr 12, 2026
1459a3c
fix: show agent-to-agent sessions in Other Users tab
wisdomqin Apr 12, 2026
3a53c6e
Merge bugfix into main: A2A async communication + bug fixes
wisdomqin Apr 12, 2026
278f503
release: v1.8.3-beta
wisdomqin Apr 12, 2026
1214576
chore: update Helm appVersion to 1.8.3-beta
wisdomqin Apr 12, 2026
36c4464
docs: update release notes title with feature keywords
wisdomqin Apr 12, 2026
1c12ee3
feat: add DingTalk media message support (image, file, voice, video)
Apr 13, 2026
3 changes: 3 additions & 0 deletions .env.example
@@ -25,6 +25,9 @@ FEISHU_REDIRECT_URI=http://localhost:3000/auth/feishu/callback
# Without a key, the tools still work but with lower rate limits
JINA_API_KEY=

# Exa API key (for exa_search tool and web_search Exa engine — get one at https://exa.ai)
EXA_API_KEY=

# Public app URL used in user-facing links, such as password reset emails.
# Leave empty for auto-discovery from the browser request.
# Set explicitly for production (e.g. https://your-domain.com) — required for
2 changes: 1 addition & 1 deletion README.md
@@ -113,7 +113,7 @@ bash restart.sh
git clone https://github.com/dataelement/Clawith.git
cd Clawith && cp .env.example .env
docker compose up -d
-# → http://localhost:3000
+# → http://localhost:3008
```

**To update an existing deployment:**
2 changes: 1 addition & 1 deletion README_zh-CN.md
@@ -108,7 +108,7 @@ bash restart.sh
git clone https://github.com/dataelement/Clawith.git
cd Clawith && cp .env.example .env
docker compose up -d
-# → http://localhost:3000
+# → http://localhost:3008
```

**更新已有部署:**
91 changes: 43 additions & 48 deletions RELEASE_NOTES.md
@@ -1,48 +1,53 @@
# v1.8.2 Release Notes
# v1.8.3-beta — A2A Async Communication, Image Context & Search Tools

## What's New

### Security
- **Fix account takeover via username collision** (#300): Prevents an attacker from creating an account with a username matching an existing SSO user's email, which could lead to unauthorized account access.
- **Fix duplicate user creation on repeated SSO logins**: Feishu and DingTalk SSO now correctly reuse existing accounts instead of creating duplicate users.

### AgentBay — Cloud Computer & Browser Automation
- **New: `agentbay_file_transfer` tool**: Transfer files between any two environments — agent workspace, browser sandbox, cloud desktop (computer), or code sandbox — in any direction.
- **Fix: Computer Take Control (TC) white screen**: TC now connects to the correct environment session (computer vs. browser) based on `env_type`. Previously, an existing browser session could hijack the computer TC connection.
- **Fix: OS-aware desktop paths**: The `agentbay_file_transfer` tool description now automatically reflects the correct paths for the agent's configured OS type:
- Windows: `C:\Users\Administrator\Desktop\`
- Linux: `/home/wuying/Desktop/`
- **Fix: Desktop file refresh**: After uploading to the Linux desktop directory, GNOME is notified to refresh icon display.
- Multiple Take Control stability fixes: CDP polling replaced with sleep, multi-tab cleanup, 40s navigate timeout, unhashable type errors.

### Feishu (Lark) — CardKit Streaming Cards
- Feishu bot responses now stream as animated typing-effect cards using the CardKit API (#287).
- Fixed SSE stream hang issues and websocket proxy bypass for system proxy conflicts.

### DingTalk & Organization Sync
- Fixed DingTalk org sync permissions guide (`Contact.User.Read` scope).
- Fixed `open_id` vs `employee_id` user type handling in Feishu org sync.

### Other Bug Fixes
- **Fix: SSE stream protection** — `finish_reason` break guard added for OpenAI and Gemini streams to prevent runaway streams.
- **Fix: Duplicate tool `send_feishu_message`** — Removed duplicate DB entry; added dedup guard in tool loading to prevent `Tool names must be unique` LLM errors.
- **Fix: JWT token not consumed** on reset-password and verify-email routes.
- **Fix: NULL username/email** for SSO-created users in `list_users`.
- **Fix: Company name slug generation** — Added `anyascii` + `pypinyin` for universal CJK/Latin transliteration.
- **Fix: `publish_page` URL** — Correctly generates `try.clawith.ai` links on source deployments.
- **Fix: Agent template directory** — Dynamic default for source deployments.
- Various i18n fixes (TakeControlPanel, DingTalk guide).
### Agent-to-Agent (A2A) Async Communication — Beta
- **Three communication modes** for `send_message_to_agent`:
- `notify` — fire-and-forget, one-way announcement
- `task_delegate` — delegate work and get results back asynchronously via `on_message` trigger
- `consult` — synchronous question-reply (original behaviour)
- **Feature flag**: controlled at the tenant level via Company Settings → Company Info → A2A Async toggle (default: **OFF**)
- When disabled, the `msg_type` parameter is **hidden from the LLM** so agents only see synchronous consult mode
- Security: chain depth protection (max 3 hops), regex filtering of internal terms, SQL injection prevention
- Performance: async wake sessions limited to 2 tool rounds

### Multimodal Image Context
- Base64 image markers are now persisted to the database at write time
- Chat UI correctly strips `[image_data:]` markers and renders thumbnails
- Fixed chat page vertical scrolling (flexbox `min-height: 0` constraint)
- Removed deprecated `/agents/:id/chat` route

### Search Engine Tools
- New `Exa Search` tool — AI-powered semantic search with category filtering
- New standalone search engine tools: DuckDuckGo, Tavily, Google, Bing (each as own tool)

### UI Improvements
- Drag-and-drop file upload across the application
- Chat sidebar polish: segment control, session items styling
- Agent-to-agent sessions now visible in the admin "Other Users" tab

### Bug Fixes
- DingTalk org sync rate limiting to prevent API throttling
- Tool seeder: `parameters_schema` now correctly included in new tool INSERT
- Unified `msg_type` enum references across codebase
- Docker access port corrected to 3008

---

## Upgrade Guide

> **No database migrations required.** No new environment variables.
> **Database migration required.** Run `alembic upgrade heads` to add the `a2a_async_enabled` column.

### Docker Deployment (Recommended)

```bash
git pull origin main

# Run database migration
docker exec clawith-backend-1 alembic upgrade heads

# Rebuild and restart
docker compose down && docker compose up -d --build
```

@@ -51,8 +56,8 @@ docker compose down && docker compose up -d --build
```bash
git pull origin main

# Install new Python dependency
pip install anyascii>=0.3.2
# Run database migration
alembic upgrade heads

# Rebuild frontend
cd frontend && npm install && npm run build
@@ -61,23 +66,13 @@ cd ..
# Restart services
```

### nginx Update Required

A new routing rule has been added to `nginx.conf`. If you manage nginx separately (not via Docker), add this block inside your `server {}` before the WebSocket proxy section:

```nginx
location ~ ^/WW_verify_[A-Za-z0-9]+\.txt$ {
    proxy_pass http://backend:8000/api/wecom-verify$request_uri;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
}
```

### Kubernetes (Helm)

```bash
helm upgrade clawith helm/clawith/ -f values.yaml
# Run migration job for a2a_async_enabled column
```

No migration job needed.

### Notes
- The A2A Async feature is **disabled by default**. No behaviour changes until explicitly enabled.
- The `a2a_async_enabled` column defaults to `FALSE`, so existing tenants are unaffected.
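
The notes above say that `msg_type` is hidden from the LLM when the tenant-level flag is off, so agents only see synchronous consult mode. A minimal sketch of how such gating could look, assuming a JSON-Schema-style tool definition. `FULL_SCHEMA`, `schema_for_tenant`, `effective_msg_type`, and the `agent_name`/`message` parameters are hypothetical names for illustration and are not taken from the Clawith codebase:

```python
from copy import deepcopy
from typing import Optional

# The three modes listed in the release notes.
MSG_TYPES = ("notify", "task_delegate", "consult")

# Hypothetical tool definition. Only the tool name, `msg_type`, and its three
# values come from the release notes; the other fields are illustrative.
FULL_SCHEMA = {
    "name": "send_message_to_agent",
    "parameters": {
        "type": "object",
        "properties": {
            "agent_name": {"type": "string"},
            "message": {"type": "string"},
            "msg_type": {"type": "string", "enum": list(MSG_TYPES)},
        },
        "required": ["agent_name", "message", "msg_type"],
    },
}


def schema_for_tenant(a2a_async_enabled: bool) -> dict:
    """Return the tool schema the LLM should see for this tenant."""
    schema = deepcopy(FULL_SCHEMA)  # never mutate the shared definition
    if not a2a_async_enabled:
        params = schema["parameters"]
        params["properties"].pop("msg_type", None)
        params["required"] = [p for p in params["required"] if p != "msg_type"]
    return schema


def effective_msg_type(a2a_async_enabled: bool, requested: Optional[str] = None) -> str:
    """Resolve the mode to run; with the flag off everything is a consult."""
    if not a2a_async_enabled:
        return "consult"
    if requested not in MSG_TYPES:
        raise ValueError(f"msg_type must be one of {MSG_TYPES}, got {requested!r}")
    return requested
```

Filtering the schema rather than only ignoring the argument keeps the model from reasoning about a parameter it cannot use, which appears to be the intent of commit b2fac91.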
2 changes: 1 addition & 1 deletion backend/VERSION
@@ -1 +1 @@
-1.8.2
+1.8.3-beta
24 changes: 24 additions & 0 deletions backend/alembic/versions/add_a2a_async_enabled.py
@@ -0,0 +1,24 @@
"""Add a2a_async_enabled column to tenants table.

Revision ID: f1a2b3c4d5e6
Revises: d9cbd43b62e5
Create Date: 2026-04-10 02:50:00.000000
"""
from alembic import op


revision = "f1a2b3c4d5e6"
down_revision = "d9cbd43b62e5"


def upgrade() -> None:
    op.execute(
        "ALTER TABLE agents DROP COLUMN IF EXISTS a2a_async_enabled"
    )
    op.execute(
        "ALTER TABLE tenants ADD COLUMN IF NOT EXISTS a2a_async_enabled BOOLEAN DEFAULT FALSE"
    )


def downgrade() -> None:
    op.execute("ALTER TABLE tenants DROP COLUMN IF EXISTS a2a_async_enabled")
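
The migration relies on two properties: it can be re-run safely, and existing tenant rows pick up `FALSE` without a backfill. The sketch below illustrates both with the standard-library `sqlite3` module. This is an assumption-laden stand-in: the `IF NOT EXISTS` form in the migration looks like PostgreSQL syntax, SQLite lacks it for `ADD COLUMN`, so the hypothetical helper `add_flag_column` checks `PRAGMA table_info` instead, and `DEFAULT 0` stands in for `DEFAULT FALSE`.

```python
import sqlite3


def add_flag_column(conn: sqlite3.Connection) -> bool:
    """Add tenants.a2a_async_enabled if missing; return True when it was added."""
    # Row layout of PRAGMA table_info: (cid, name, type, notnull, dflt_value, pk)
    columns = {row[1] for row in conn.execute("PRAGMA table_info(tenants)")}
    if "a2a_async_enabled" in columns:
        return False
    conn.execute("ALTER TABLE tenants ADD COLUMN a2a_async_enabled BOOLEAN DEFAULT 0")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tenants (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO tenants (name) VALUES ('acme')")  # pre-existing tenant

first_run = add_flag_column(conn)   # column is created
second_run = add_flag_column(conn)  # no-op, mirroring IF NOT EXISTS

# The row inserted before the migration reads back the column default.
existing_flag = conn.execute(
    "SELECT a2a_async_enabled FROM tenants WHERE name = 'acme'"
).fetchone()[0]
```

Here `existing_flag` is falsy for the pre-existing row, which matches the note that existing tenants are unaffected until the toggle is switched on.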
152 changes: 144 additions & 8 deletions backend/app/api/dingtalk.py
@@ -57,7 +57,7 @@ async def configure_dingtalk_channel(
existing.is_configured = True
existing.extra_config = {**existing.extra_config, "connection_mode": conn_mode, "agent_id": dingtalk_agent_id}
await db.flush()

# Restart Stream client if in websocket mode
if conn_mode == "websocket":
from app.services.dingtalk_stream import dingtalk_stream_manager
@@ -68,7 +68,7 @@ async def configure_dingtalk_channel(
from app.services.dingtalk_stream import dingtalk_stream_manager
import asyncio
asyncio.create_task(dingtalk_stream_manager.stop_client(agent_id))

return ChannelConfigOut.model_validate(existing)

config = ChannelConfig(
@@ -145,8 +145,19 @@ async def process_dingtalk_message(
conversation_id: str,
conversation_type: str,
session_webhook: str,
image_base64_list: list[str] | None = None,
saved_file_paths: list[str] | None = None,
sender_nick: str = "",
message_id: str = "",
):
"""Process an incoming DingTalk bot message and reply via session webhook."""
"""Process an incoming DingTalk bot message and reply via session webhook.

Args:
image_base64_list: List of base64-encoded image data URIs for vision LLM.
saved_file_paths: List of local file paths where media files were saved.
sender_nick: Display name of the sender from DingTalk.
message_id: DingTalk message ID (used for reactions).
"""
import json
import httpx
from datetime import datetime, timezone
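
The new `image_base64_list` argument carries data URIs that this PR wraps in `[image_data:...]` markers for the vision model and strips again before the message is stored. A small round-trip sketch of that convention, using the marker format and regex that appear in this diff. The helper names `to_data_uri`, `build_llm_text`, and `strip_markers` are hypothetical and only mirror the inline logic:

```python
import base64
import re
from typing import List

# Same pattern the PR uses to remove inline image markers from stored text.
MARKER_RE = re.compile(r"\[image_data:data:image/[^;]+;base64,[A-Za-z0-9+/=]+\]")


def to_data_uri(raw: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    return f"data:{mime};base64,{base64.b64encode(raw).decode('ascii')}"


def build_llm_text(user_text: str, image_uris: List[str]) -> str:
    """Append one marker per image so a vision-capable model can see it."""
    if not image_uris:
        return user_text
    markers = "\n".join(f"[image_data:{uri}]" for uri in image_uris)
    return f"{user_text}\n{markers}" if user_text else markers


def strip_markers(text: str) -> str:
    """Drop the base64 markers, leaving display-friendly text."""
    return MARKER_RE.sub("", text).strip()


uri = to_data_uri(b"\x89PNG\r\n")  # a few header bytes stand in for a real image
llm_text = build_llm_text("What is in this picture?", [uri])
display_text = strip_markers(llm_text)
```

Keeping the base64 payload only in the LLM input and not in the stored message avoids bloating chat history rows, which is what the "no base64 blobs" comment in the next hunk points at.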
@@ -207,21 +218,146 @@ async def process_dingtalk_message(
)
history = [{"role": m.role, "content": m.content} for m in reversed(history_r.scalars().all())]

# Build saved_content for DB (no base64 blobs, keep it display-friendly)
import re as _re_dt
_clean_text = _re_dt.sub(
    r'\[image_data:data:image/[^;]+;base64,[A-Za-z0-9+/=]+\]',
    "", user_text,
).strip()
if saved_file_paths:
    from pathlib import Path as _PathDT
    _file_prefixes = "\n".join(
        f"[file:{_PathDT(p).name}]" for p in saved_file_paths
    )
    saved_content = f"{_file_prefixes}\n{_clean_text}".strip() if _clean_text else _file_prefixes
else:
    saved_content = _clean_text or user_text

# Save user message
db.add(ChatMessage(
agent_id=agent_id, user_id=platform_user_id,
role="user", content=user_text,
role="user", content=saved_content,
conversation_id=session_conv_id,
))
sess.last_message_at = datetime.now(timezone.utc)
await db.commit()

# Build LLM input text: for images, inject base64 markers so vision models can see them
llm_user_text = user_text
if image_base64_list:
    image_markers = "\n".join(
        f"[image_data:{uri}]" for uri in image_base64_list
    )
    llm_user_text = f"{user_text}\n{image_markers}" if user_text else image_markers

# ── Set up channel_file_sender so the agent can send files via DingTalk ──
from app.services.agent_tools import channel_file_sender as _cfs
from app.services.dingtalk_stream import (
_upload_dingtalk_media,
_send_dingtalk_media_message,
)

# Load DingTalk credentials from ChannelConfig
_dt_cfg_r = await db.execute(
_select(ChannelConfig).where(
ChannelConfig.agent_id == agent_id,
ChannelConfig.channel_type == "dingtalk",
)
)
_dt_cfg = _dt_cfg_r.scalar_one_or_none()
_dt_app_key = _dt_cfg.app_id if _dt_cfg else None
_dt_app_secret = _dt_cfg.app_secret if _dt_cfg else None

_cfs_token = None
if _dt_app_key and _dt_app_secret:
    # Determine send target: group -> conversation_id, P2P -> sender_staff_id
    _dt_target_id = conversation_id if conversation_type == "2" else sender_staff_id
    _dt_conv_type = conversation_type

    async def _dingtalk_file_sender(file_path: str, msg: str = ""):
        """Send a file/image/video via DingTalk proactive message API."""
        from pathlib import Path as _P

        _fp = _P(file_path)
        _ext = _fp.suffix.lower()

        # Determine media type from extension
        if _ext in (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"):
            _media_type = "image"
        elif _ext in (".mp4", ".mov", ".avi", ".mkv"):
            _media_type = "video"
        elif _ext in (".mp3", ".wav", ".ogg", ".amr", ".m4a"):
            _media_type = "voice"
        else:
            _media_type = "file"

        # Upload media to DingTalk
        _mid = await _upload_dingtalk_media(
            _dt_app_key, _dt_app_secret, file_path, _media_type
        )

        if _mid:
            # Send via proactive message API
            _ok = await _send_dingtalk_media_message(
                _dt_app_key, _dt_app_secret,
                _dt_target_id, _mid, _media_type,
                _dt_conv_type, filename=_fp.name,
            )
            if _ok:
                # Also send accompany text if provided
                if msg:
                    try:
                        async with httpx.AsyncClient(timeout=10) as _cl:
                            await _cl.post(session_webhook, json={
                                "msgtype": "text",
                                "text": {"content": msg},
                            })
                    except Exception:
                        pass
                return

        # Fallback: send a text message with file info
        _fallback_parts = []
        if msg:
            _fallback_parts.append(msg)
        _fallback_parts.append(f"[File: {_fp.name}]")
        try:
            async with httpx.AsyncClient(timeout=10) as _cl:
                await _cl.post(session_webhook, json={
                    "msgtype": "text",
                    "text": {"content": "\n\n".join(_fallback_parts)},
                })
        except Exception as _fb_err:
            logger.error(f"[DingTalk] Fallback file text also failed: {_fb_err}")

    _cfs_token = _cfs.set(_dingtalk_file_sender)

# Call LLM
reply_text = await _call_agent_llm(
db, agent_id, user_text,
history=history, user_id=platform_user_id,
try:
reply_text = await _call_agent_llm(
db, agent_id, llm_user_text,
history=history, user_id=platform_user_id,
)
finally:
# Reset ContextVar
if _cfs_token is not None:
_cfs.reset(_cfs_token)
# Recall thinking reaction (before sending reply)
if message_id and _dt_app_key:
try:
from app.services.dingtalk_reaction import recall_thinking_reaction
await recall_thinking_reaction(
_dt_app_key, _dt_app_secret,
message_id, conversation_id,
)
except Exception as _recall_err:
logger.warning(f"[DingTalk] Failed to recall thinking reaction: {_recall_err}")

has_media = bool(image_base64_list or saved_file_paths)
logger.info(
f"[DingTalk] LLM reply ({'media' if has_media else 'text'} input): "
f"{reply_text[:100]}"
)
logger.info(f"[DingTalk] LLM reply: {reply_text[:100]}")

# Reply via session webhook (markdown)
try:
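
The hunk above installs a per-request file sender in a `ContextVar` before calling the LLM and resets it in `finally`. A self-contained sketch of that set/reset pattern follows, assuming `channel_file_sender` is a plain `contextvars.ContextVar`. `fake_llm_call`, `handle_message`, and the `sent` list are hypothetical stand-ins, not code from this PR:

```python
import asyncio
from contextvars import ContextVar
from typing import Awaitable, Callable, List, Optional, Tuple

FileSender = Callable[[str, str], Awaitable[None]]

# Assumed shape of the shared variable: None means "no channel can send files".
channel_file_sender: ContextVar[Optional[FileSender]] = ContextVar(
    "channel_file_sender", default=None
)

sent: List[Tuple[str, str]] = []  # records what the stand-in sender was asked to deliver


async def fake_llm_call() -> str:
    """Stand-in for the agent call; a tool inside it looks up the current sender."""
    sender = channel_file_sender.get()
    if sender is not None:
        await sender("/tmp/report.pdf", "Here is the report")
    return "done"


async def handle_message() -> str:
    async def sender(file_path: str, msg: str = "") -> None:
        sent.append((file_path, msg))

    token = channel_file_sender.set(sender)
    try:
        return await fake_llm_call()
    finally:
        # Always restore the previous value, even if the call raises.
        channel_file_sender.reset(token)


async def main() -> Tuple[str, Optional[FileSender]]:
    reply = await handle_message()
    # Same task and context: without the reset, the sender would still be set here.
    return reply, channel_file_sender.get()


reply, sender_after = asyncio.run(main())
```

Resetting with the token, instead of setting the variable back to `None`, restores whatever value was active before, so nested handlers do not clobber each other.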
2 changes: 1 addition & 1 deletion backend/app/api/gateway.py
@@ -413,7 +413,7 @@ async def _send_to_agent_background(
"--- Agent-to-Agent Communication Alert ---\n"
f"You are receiving a direct message from another digital employee ({source_agent_name}). "
"CRITICAL INSTRUCTION: Your direct text reply will automatically be delivered back to them. "
-"DO NOT use the `send_agent_message` tool to reply to this conversation. Just reply naturally in text.\n"
+"DO NOT use the `send_message_to_agent` tool to reply to this conversation. Just reply naturally in text.\n"
"If they are asking you to create or analyze a file, deliver the file using `send_file_to_agent` after writing it."
)

2 changes: 1 addition & 1 deletion backend/app/api/relationships.py
@@ -303,7 +303,7 @@ async def _regenerate_relationships_file(db: AsyncSession, agent_id: uuid.UUID):
label = AGENT_RELATION_LABELS.get(r.relation, r.relation)
lines.append(f"### {a.name}{a.role_description or '数字员工'}")
lines.append(f"- 关系:{label}")
-lines.append(f"- 可以用 send_agent_message 工具给 {a.name} 发消息协作")
+lines.append(f"- 可以用 send_message_to_agent 工具给 {a.name} 发消息协作")
if r.description:
lines.append(f"- {r.description}")
lines.append("")