dataelement · wisdomqin · Apr 13, 2026 · Apr 8, 2026 · Apr 9, 2026 · Apr 10, 2026
diff --git a/.env.example b/.env.example
@@ -25,6 +25,9 @@ FEISHU_REDIRECT_URI=http://localhost:3000/auth/feishu/callback
 # Without a key, the tools still work but with lower rate limits
 JINA_API_KEY=
 
+# Exa API key (for exa_search tool and web_search Exa engine — get one at https://exa.ai)
+EXA_API_KEY=
+
 # Public app URL used in user-facing links, such as password reset emails.
 # Leave empty for auto-discovery from the browser request.
 # Set explicitly for production (e.g. https://your-domain.com) — required for

diff --git a/README.md b/README.md
@@ -113,7 +113,7 @@ bash restart.sh
 git clone https://github.com/dataelement/Clawith.git
 cd Clawith && cp .env.example .env
 docker compose up -d
-# → http://localhost:3000
+# → http://localhost:3008
 ```
 
 **To update an existing deployment:**

diff --git a/README_zh-CN.md b/README_zh-CN.md
@@ -108,7 +108,7 @@ bash restart.sh
 git clone https://github.com/dataelement/Clawith.git
 cd Clawith && cp .env.example .env
 docker compose up -d
-# → http://localhost:3000
+# → http://localhost:3008
 ```
 
 **更新已有部署：**

diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md
@@ -1,48 +1,53 @@
-# v1.8.2 Release Notes
+# v1.8.3-beta — A2A Async Communication, Image Context & Search Tools
 
 ## What's New
 
-### Security
-- **Fix account takeover via username collision** (#300): Prevents an attacker from creating an account with a username matching an existing SSO user's email, which could lead to unauthorized account access.
-- **Fix duplicate user creation on repeated SSO logins**: Feishu and DingTalk SSO now correctly reuse existing accounts instead of creating duplicate users.
-
-### AgentBay — Cloud Computer & Browser Automation
-- **New: `agentbay_file_transfer` tool**: Transfer files between any two environments — agent workspace, browser sandbox, cloud desktop (computer), or code sandbox — in any direction.
-- **Fix: Computer Take Control (TC) white screen**: TC now connects to the correct environment session (computer vs. browser) based on `env_type`. Previously, an existing browser session could hijack the computer TC connection.
-- **Fix: OS-aware desktop paths**: The `agentbay_file_transfer` tool description now automatically reflects the correct paths for the agent's configured OS type:
-  - Windows: `C:\Users\Administrator\Desktop\`
-  - Linux: `/home/wuying/Desktop/`
-- **Fix: Desktop file refresh**: After uploading to the Linux desktop directory, GNOME is notified to refresh icon display.
-- Multiple Take Control stability fixes: CDP polling replaced with sleep, multi-tab cleanup, 40s navigate timeout, unhashable type errors.
-
-### Feishu (Lark) — CardKit Streaming Cards
-- Feishu bot responses now stream as animated typing-effect cards using the CardKit API (#287).
-- Fixed SSE stream hang issues and websocket proxy bypass for system proxy conflicts.
-
-### DingTalk & Organization Sync
-- Fixed DingTalk org sync permissions guide (`Contact.User.Read` scope).
-- Fixed `open_id` vs `employee_id` user type handling in Feishu org sync.
-
-### Other Bug Fixes
-- **Fix: SSE stream protection** — `finish_reason` break guard added for OpenAI and Gemini streams to prevent runaway streams.
-- **Fix: Duplicate tool `send_feishu_message`** — Removed duplicate DB entry; added dedup guard in tool loading to prevent `Tool names must be unique` LLM errors.
-- **Fix: JWT token not consumed** on reset-password and verify-email routes.
-- **Fix: NULL username/email** for SSO-created users in `list_users`.
-- **Fix: Company name slug generation** — Added `anyascii` + `pypinyin` for universal CJK/Latin transliteration.
-- **Fix: `publish_page` URL** — Correctly generates `try.clawith.ai` links on source deployments.
-- **Fix: Agent template directory** — Dynamic default for source deployments.
-- Various i18n fixes (TakeControlPanel, DingTalk guide).
+### Agent-to-Agent (A2A) Async Communication — Beta
+- **Three communication modes** for `send_message_to_agent`:
+  - `notify` — fire-and-forget, one-way announcement
+  - `task_delegate` — delegate work and get results back asynchronously via `on_message` trigger
+  - `consult` — synchronous question-reply (original behaviour)
+- **Feature flag**: controlled at the tenant level via Company Settings → Company Info → A2A Async toggle (default: **OFF**)
+- When disabled, the `msg_type` parameter is **hidden from the LLM** so agents only see synchronous consult mode
+- Security: chain depth protection (max 3 hops), regex filtering of internal terms, SQL injection prevention
+- Performance: async wake sessions limited to 2 tool rounds
+
+### Multimodal Image Context
+- Base64 image markers are now persisted to the database at write time
+- Chat UI correctly strips `[image_data:]` markers and renders thumbnails
+- Fixed chat page vertical scrolling (flexbox `min-height: 0` constraint)
+- Removed deprecated `/agents/:id/chat` route
+
+### Search Engine Tools
+- New `Exa Search` tool — AI-powered semantic search with category filtering
+- New standalone search engine tools: DuckDuckGo, Tavily, Google, Bing (each as its own tool)
+
+### UI Improvements
+- Drag-and-drop file upload across the application
+- Chat sidebar polish: segment control, session items styling
+- Agent-to-agent sessions now visible in the admin "Other Users" tab
+
+### Bug Fixes
+- DingTalk org sync rate limiting to prevent API throttling
+- Tool seeder: `parameters_schema` now correctly included in new tool INSERT
+- Unified `msg_type` enum references across codebase
+- Docker access port corrected to 3008
 
 ---
 
 ## Upgrade Guide
 
-> **No database migrations required.** No new environment variables.
+> **Database migration required.** Run `alembic upgrade heads` to add the `a2a_async_enabled` column.
 
 ### Docker Deployment (Recommended)
 
 ```bash
 git pull origin main
+
+# Run database migration
+docker exec clawith-backend-1 alembic upgrade heads
+
+# Rebuild and restart
 docker compose down && docker compose up -d --build
 ```
 
@@ -51,8 +56,8 @@ docker compose down && docker compose up -d --build
 ```bash
 git pull origin main
 
-# Install new Python dependency
-pip install anyascii>=0.3.2
+# Run database migration
+alembic upgrade heads
 
 # Rebuild frontend
 cd frontend && npm install && npm run build
@@ -61,23 +66,13 @@ cd ..
 # Restart services
 ```
 
-### nginx Update Required
-
-A new routing rule has been added to `nginx.conf`. If you manage nginx separately (not via Docker), add this block inside your `server {}` before the WebSocket proxy section:
-
-```nginx
-location ~ ^/WW_verify_[A-Za-z0-9]+\.txt$ {
-    proxy_pass http://backend:8000/api/wecom-verify$request_uri;
-    proxy_set_header Host $http_host;
-    proxy_set_header X-Real-IP $remote_addr;
-}
-```
-
 ### Kubernetes (Helm)
 
 ```bash
 helm upgrade clawith helm/clawith/ -f values.yaml
+# Then run the database migration (`alembic upgrade heads`) to add the a2a_async_enabled column
 ```
 
-No migration job needed.
-
+### Notes
+- The A2A Async feature is **disabled by default**. No behaviour changes until explicitly enabled.
+- The `a2a_async_enabled` column defaults to `FALSE`, so existing tenants are unaffected.
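
Reviewer sketch (not part of the patch): the release notes say the `msg_type` parameter is hidden from the LLM when the tenant flag is off. One way that gating could look is shown below. The function name `build_send_message_schema`, the `agent_name` and `message` parameters, and the `"consult"` default are assumptions for illustration; only the three mode names and the flag name come from the notes above.

```python
from typing import Any

# Mode names taken from the release notes.
MSG_TYPES = ["notify", "task_delegate", "consult"]


def build_send_message_schema(a2a_async_enabled: bool) -> dict[str, Any]:
    """Return a JSON-schema-style parameter spec for send_message_to_agent.

    Hypothetical helper: the real tool schema in the codebase may differ.
    """
    properties: dict[str, Any] = {
        # Assumed parameter names, for illustration only.
        "agent_name": {"type": "string", "description": "Target agent"},
        "message": {"type": "string", "description": "Message body"},
    }
    if a2a_async_enabled:
        # Only expose the mode selector when the tenant flag is on, so that
        # with the flag off the LLM sees a plain synchronous tool.
        properties["msg_type"] = {
            "type": "string",
            "enum": list(MSG_TYPES),
            "default": "consult",  # assumed default (the original behaviour)
        }
    return {
        "type": "object",
        "properties": properties,
        "required": ["agent_name", "message"],
    }
```

With the flag off, the returned schema has no `msg_type` key at all, so the model cannot select `notify` or `task_delegate`.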
diff --git a/backend/VERSION b/backend/VERSION
@@ -1 +1 @@
-1.8.2
+1.8.3-beta
diff --git a/backend/alembic/versions/add_a2a_async_enabled.py b/backend/alembic/versions/add_a2a_async_enabled.py
@@ -0,0 +1,24 @@
+"""Add a2a_async_enabled column to tenants table (and drop it from agents if present).
+
+Revision ID: f1a2b3c4d5e6
+Revises: d9cbd43b62e5
+Create Date: 2026-04-10 02:50:00.000000
+"""
+from alembic import op
+
+
+revision = "f1a2b3c4d5e6"
+down_revision = "d9cbd43b62e5"
+
+
+def upgrade() -> None:
+    op.execute(
+        "ALTER TABLE agents DROP COLUMN IF EXISTS a2a_async_enabled"
+    )
+    op.execute(
+        "ALTER TABLE tenants ADD COLUMN IF NOT EXISTS a2a_async_enabled BOOLEAN DEFAULT FALSE"
+    )
+
+
+def downgrade() -> None:
+    op.execute("ALTER TABLE tenants DROP COLUMN IF EXISTS a2a_async_enabled")
diff --git a/backend/app/api/dingtalk.py b/backend/app/api/dingtalk.py
@@ -57,7 +57,7 @@ async def configure_dingtalk_channel(
         existing.is_configured = True
         existing.extra_config = {**existing.extra_config, "connection_mode": conn_mode, "agent_id": dingtalk_agent_id}
         await db.flush()
-        
+
         # Restart Stream client if in websocket mode
         if conn_mode == "websocket":
             from app.services.dingtalk_stream import dingtalk_stream_manager
@@ -68,7 +68,7 @@ async def configure_dingtalk_channel(
             from app.services.dingtalk_stream import dingtalk_stream_manager
             import asyncio
             asyncio.create_task(dingtalk_stream_manager.stop_client(agent_id))
-            
+
         return ChannelConfigOut.model_validate(existing)
 
     config = ChannelConfig(
@@ -145,8 +145,19 @@ async def process_dingtalk_message(
     conversation_id: str,
     conversation_type: str,
     session_webhook: str,
+    image_base64_list: list[str] | None = None,
+    saved_file_paths: list[str] | None = None,
+    sender_nick: str = "",
+    message_id: str = "",
 ):
-    """Process an incoming DingTalk bot message and reply via session webhook."""
+    """Process an incoming DingTalk bot message and reply via session webhook.
+
+    Args:
+        image_base64_list: List of base64-encoded image data URIs for vision LLM.
+        saved_file_paths: List of local file paths where media files were saved.
+        sender_nick: Display name of the sender from DingTalk.
+        message_id: DingTalk message ID (used for reactions).
+    """
     import json
     import httpx
     from datetime import datetime, timezone
@@ -207,21 +218,146 @@ async def process_dingtalk_message(
         )
         history = [{"role": m.role, "content": m.content} for m in reversed(history_r.scalars().all())]
 
+        # Build saved_content for DB (no base64 blobs, keep it display-friendly)
+        import re as _re_dt
+        _clean_text = _re_dt.sub(
+            r'\[image_data:data:image/[^;]+;base64,[A-Za-z0-9+/=]+\]',
+            "", user_text,
+        ).strip()
+        if saved_file_paths:
+            from pathlib import Path as _PathDT
+            _file_prefixes = "\n".join(
+                f"[file:{_PathDT(p).name}]" for p in saved_file_paths
+            )
+            saved_content = f"{_file_prefixes}\n{_clean_text}".strip() if _clean_text else _file_prefixes
+        else:
+            saved_content = _clean_text or user_text
+
         # Save user message
         db.add(ChatMessage(
             agent_id=agent_id, user_id=platform_user_id,
-            role="user", content=user_text,
+            role="user", content=saved_content,
             conversation_id=session_conv_id,
         ))
         sess.last_message_at = datetime.now(timezone.utc)
         await db.commit()
 
+        # Build LLM input text: for images, inject base64 markers so vision models can see them
+        llm_user_text = user_text
+        if image_base64_list:
+            image_markers = "\n".join(
+                f"[image_data:{uri}]" for uri in image_base64_list
+            )
+            llm_user_text = f"{user_text}\n{image_markers}" if user_text else image_markers
+
+        # ── Set up channel_file_sender so the agent can send files via DingTalk ──
+        from app.services.agent_tools import channel_file_sender as _cfs
+        from app.services.dingtalk_stream import (
+            _upload_dingtalk_media,
+            _send_dingtalk_media_message,
+        )
+
+        # Load DingTalk credentials from ChannelConfig
+        _dt_cfg_r = await db.execute(
+            _select(ChannelConfig).where(
+                ChannelConfig.agent_id == agent_id,
+                ChannelConfig.channel_type == "dingtalk",
+            )
+        )
+        _dt_cfg = _dt_cfg_r.scalar_one_or_none()
+        _dt_app_key = _dt_cfg.app_id if _dt_cfg else None
+        _dt_app_secret = _dt_cfg.app_secret if _dt_cfg else None
+
+        _cfs_token = None
+        if _dt_app_key and _dt_app_secret:
+            # Determine send target: group -> conversation_id, P2P -> sender_staff_id
+            _dt_target_id = conversation_id if conversation_type == "2" else sender_staff_id
+            _dt_conv_type = conversation_type
+
+            async def _dingtalk_file_sender(file_path: str, msg: str = ""):
+                """Send a file/image/video via DingTalk proactive message API."""
+                from pathlib import Path as _P
+
+                _fp = _P(file_path)
+                _ext = _fp.suffix.lower()
+
+                # Determine media type from extension
+                if _ext in (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"):
+                    _media_type = "image"
+                elif _ext in (".mp4", ".mov", ".avi", ".mkv"):
+                    _media_type = "video"
+                elif _ext in (".mp3", ".wav", ".ogg", ".amr", ".m4a"):
+                    _media_type = "voice"
+                else:
+                    _media_type = "file"
+
+                # Upload media to DingTalk
+                _mid = await _upload_dingtalk_media(
+                    _dt_app_key, _dt_app_secret, file_path, _media_type
+                )
+
+                if _mid:
+                    # Send via proactive message API
+                    _ok = await _send_dingtalk_media_message(
+                        _dt_app_key, _dt_app_secret,
+                        _dt_target_id, _mid, _media_type,
+                        _dt_conv_type, filename=_fp.name,
+                    )
+                    if _ok:
+                        # Also send accompanying text if provided
+                        if msg:
+                            try:
+                                async with httpx.AsyncClient(timeout=10) as _cl:
+                                    await _cl.post(session_webhook, json={
+                                        "msgtype": "text",
+                                        "text": {"content": msg},
+                                    })
+                            except Exception as _txt_err:
+                                logger.warning(f"[DingTalk] Failed to send accompanying text: {_txt_err}")
+                        return
+
+                # Fallback: send a text message with file info
+                _fallback_parts = []
+                if msg:
+                    _fallback_parts.append(msg)
+                _fallback_parts.append(f"[File: {_fp.name}]")
+                try:
+                    async with httpx.AsyncClient(timeout=10) as _cl:
+                        await _cl.post(session_webhook, json={
+                            "msgtype": "text",
+                            "text": {"content": "\n\n".join(_fallback_parts)},
+                        })
+                except Exception as _fb_err:
+                    logger.error(f"[DingTalk] Fallback file text also failed: {_fb_err}")
+
+            _cfs_token = _cfs.set(_dingtalk_file_sender)
+
         # Call LLM
-        reply_text = await _call_agent_llm(
-            db, agent_id, user_text,
-            history=history, user_id=platform_user_id,
+        try:
+            reply_text = await _call_agent_llm(
+                db, agent_id, llm_user_text,
+                history=history, user_id=platform_user_id,
+            )
+        finally:
+            # Reset ContextVar
+            if _cfs_token is not None:
+                _cfs.reset(_cfs_token)
+            # Recall thinking reaction (before sending reply)
+            if message_id and _dt_app_key:
+                try:
+                    from app.services.dingtalk_reaction import recall_thinking_reaction
+                    await recall_thinking_reaction(
+                        _dt_app_key, _dt_app_secret,
+                        message_id, conversation_id,
+                    )
+                except Exception as _recall_err:
+                    logger.warning(f"[DingTalk] Failed to recall thinking reaction: {_recall_err}")
+
+        has_media = bool(image_base64_list or saved_file_paths)
+        logger.info(
+            f"[DingTalk] LLM reply ({'media' if has_media else 'text'} input): "
+            f"{reply_text[:100]}"
         )
-        logger.info(f"[DingTalk] LLM reply: {reply_text[:100]}")
 
         # Reply via session webhook (markdown)
         try:

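Reviewer sketch (not part of the patch): the `process_dingtalk_message` hunk above builds two different strings from one incoming message, a display-friendly one for the database and a marker-carrying one for the LLM. The standalone version below mirrors that logic so it can be exercised in isolation. The function names `build_saved_content` and `build_llm_text` are hypothetical; the regex and the branching are copied from the diff.

```python
import re
from pathlib import Path
from typing import Optional

# Same pattern as in the diff: one inline base64 image marker.
IMAGE_MARKER_RE = re.compile(
    r"\[image_data:data:image/[^;]+;base64,[A-Za-z0-9+/=]+\]"
)


def build_saved_content(
    user_text: str, saved_file_paths: Optional[list[str]] = None
) -> str:
    """Text to store in the DB: markers stripped, file names prefixed."""
    clean_text = IMAGE_MARKER_RE.sub("", user_text).strip()
    if saved_file_paths:
        prefixes = "\n".join(
            f"[file:{Path(p).name}]" for p in saved_file_paths
        )
        return f"{prefixes}\n{clean_text}".strip() if clean_text else prefixes
    # Falls back to the raw text when stripping leaves nothing.
    return clean_text or user_text


def build_llm_text(
    user_text: str, image_base64_list: Optional[list[str]] = None
) -> str:
    """Text to send to the LLM: append one marker per image data URI."""
    if image_base64_list:
        markers = "\n".join(f"[image_data:{uri}]" for uri in image_base64_list)
        return f"{user_text}\n{markers}" if user_text else markers
    return user_text
```

One consequence of the `clean_text or user_text` fallback: a message that consists only of a marker and has no saved files is stored unchanged, base64 payload included. If that case can occur in practice, it may be worth handling explicitly.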
diff --git a/backend/app/api/gateway.py b/backend/app/api/gateway.py
@@ -413,7 +413,7 @@ async def _send_to_agent_background(
                 "--- Agent-to-Agent Communication Alert ---\n"
                 f"You are receiving a direct message from another digital employee ({source_agent_name}). "
                 "CRITICAL INSTRUCTION: Your direct text reply will automatically be delivered back to them. "
-                "DO NOT use the `send_agent_message` tool to reply to this conversation. Just reply naturally in text.\n"
+                "DO NOT use the `send_message_to_agent` tool to reply to this conversation. Just reply naturally in text.\n"
                 "If they are asking you to create or analyze a file, deliver the file using `send_file_to_agent` after writing it."
             )
 

diff --git a/backend/app/api/relationships.py b/backend/app/api/relationships.py
@@ -303,7 +303,7 @@ async def _regenerate_relationships_file(db: AsyncSession, agent_id: uuid.UUID):
             label = AGENT_RELATION_LABELS.get(r.relation, r.relation)
             lines.append(f"### {a.name} — {a.role_description or '数字员工'}")
             lines.append(f"- 关系：{label}")
-            lines.append(f"- 可以用 send_agent_message 工具给 {a.name} 发消息协作")
+            lines.append(f"- 可以用 send_message_to_agent 工具给 {a.name} 发消息协作")
             if r.description:
                 lines.append(f"- {r.description}")
             lines.append("")