docs: sync README files with latest repo state #139

Open
yyiilluu wants to merge 1 commit into main from docs/sync-readmes-20260609
@yyiilluu yyiilluu commented Jun 9, 2026

Contributor

Summary

  • Updates code-map README navigation for the latest server embedding routing architecture.
  • Documents long-interaction prompt slicing via REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS in the service-layer maps.

Repositories/submodules reviewed

  • Parent: yyiilluu/reflexio-enterprise
  • Submodule: ReflexioAI/reflexio

README files updated

  • reflexio/server/README.md
  • reflexio/server/services/README.md

Validation

  • git diff --check
  • Local README relative link check for changed files
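
The PR does not say how the relative link check was run. A minimal sketch of one way to do it, assuming standard markdown `[text](target)` links; the function name `broken_relative_links` and the regular expression are illustrative and not taken from the repository:

```python
import re
from pathlib import Path

# Matches inline markdown links and captures the target, e.g. [text](path/to/file.md)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Targets that are not relative file paths and are skipped by this check
EXTERNAL_PREFIXES = ("http://", "https://", "mailto:", "#")


def broken_relative_links(readme: Path) -> list[str]:
    """Return relative link targets in `readme` that do not exist on disk."""
    broken = []
    for target in LINK_RE.findall(readme.read_text(encoding="utf-8")):
        if target.startswith(EXTERNAL_PREFIXES):
            continue
        # Drop any "#section" fragment before resolving the path
        path_part = target.split("#", 1)[0]
        if path_part and not (readme.parent / path_part).exists():
            broken.append(target)
    return broken
```

Running this over `reflexio/server/README.md` and `reflexio/server/services/README.md` would list any targets that fail to resolve relative to each file's directory. `git diff --check` covers a different concern: whitespace errors and leftover conflict markers.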

Notes/Risks

  • Updates follow /root/reflexio-enterprise/how_to_write_readme.md and are limited to model-facing code-map README files.
  • No parent submodule pointer commit is included; this PR is scoped to the submodule repository.

Summary by CodeRabbit

Documentation

  • Added documentation on embedding provider configuration modes: cloud, local_service, internal_service, inprocess, and off.
  • Clarified local embedding model routing with automatic fallback support when services are unavailable.
  • Updated documentation on interaction content token limits for embedding processing.

coderabbitai Bot commented Jun 9, 2026

📝 Walkthrough

Documentation updates across server README files clarify embedding provider routing through shared daemon services with fallback, per-interaction content slicing with token limits, and tool-usage prefixing behavior in interaction history formatting.

Changes

Server and service documentation updates

| Layer / File(s) | Summary |
|---|---|
| Embedding routing and content construction documentation<br>`reflexio/server/README.md`, `reflexio/server/services/README.md` | Documents embedding provider routing for `local/*` models through shared OpenAI-compatible daemon with fallback to in-process embedders and `REFLEXIO_EMBEDDING_PROVIDER` modes. Clarifies per-interaction embedding content slicing bounded by `REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS` and the embedding call path through `llm/litellm_client.py` and `embedding_service_provider.py`. |
| Tool-usage behavior documentation<br>`reflexio/server/README.md` | Updates `service_utils.py` documentation to clarify that `format_interactions_to_history_string()` always prepends tool-usage info when present, without conditional logic. |

🎯 1 (Trivial) | ⏱️ ~3 minutes

A rabbit hops through docs so bright,
Embedding daemons route just right,
With fallbacks and tokens all in place,
Tool wisdom flows without a trace. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: sync README files with latest repo state' directly and accurately describes the changeset, which updates two README.md files to reflect current embedding routing and interaction slicing behavior. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reflexio/server/README.md`:
- Line 226: The README description is inaccurate: update the bullet for
service_utils.py to state that format_interactions_to_history_string() slices
interaction.content using REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS via
_resolve_max_interaction_content_tokens() and inserts a truncation marker when
content is sliced, and clarify that the “[used tool: ...]” prefix is conditional
(added only when interaction.content is truthy and interaction.tools_used
exists) rather than always prepended; mention that tool-only turns in tests
include non-empty placeholder content so the marker line still appears.

In `@reflexio/server/services/README.md`:
- Around line 45-46: Update the README paragraph to reflect the actual routing
logic: state that LiteLLMClient.get_embedding() chooses between calling
embedding_service_provider.get_service_embeddings() (HTTP/shared daemon) and the
in-process LocalEmbedder based on
embedding_provider_mode()/should_use_embedding_service(); clarify that daemon
selection requires the daemon /health probe to report a matching active_model
and healthy status, and that setting CLAUDE_SMART_USE_LOCAL_EMBEDDING=1 or
REFLEXIO_EMBEDDING_PROVIDER=local_service|internal_service bypasses the
probe—when bypassed the code may raise EmbeddingUnavailableError instead of
falling back to the in-process embedder.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2e6207c4-fdb2-404b-aff4-f25c342bb48c

📥 Commits

Reviewing files that changed from the base of the PR and between 671e822 and 61074b9.

📒 Files selected for processing (2)
  • reflexio/server/README.md
  • reflexio/server/services/README.md

Comment thread reflexio/server/README.md
- `operation_state_utils.py`: Centralized `OperationStateManager` for all `_operation_state` table interactions (progress tracking, concurrency locks, extractor/aggregator bookmarks, simple locks)
- `deduplication_utils.py`: Shared utilities for LLM-based deduplication (used by ProfileDeduplicator and PlaybookConsolidator)
-- `service_utils.py`: Utilities (`construct_messages_from_interactions()`, `format_interactions_to_history_string()` (prepends tool usage info when `tools_used` is present), `extract_json_from_string()`, `log_model_response()` for colored LLM response logging)
+- `service_utils.py`: Utilities (`construct_messages_from_interactions()`, `format_interactions_to_history_string()` (prepends tool usage info and slices long interaction content via `REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS`), `extract_json_from_string()`, `log_model_response()` for colored LLM response logging)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify format_interactions_to_history_string implementation

# Find the function definition
ast-grep --pattern $'def format_interactions_to_history_string($$$) {
  $$$
}'

# Search for tool usage prepending logic
rg -nP 'format_interactions_to_history_string|tool.?usage.*prepend|prepend.*tool' --type=py -C5

# Verify REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS usage
rg -nP 'REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS' --type=py -C3

# Check service_utils.py specifically
fd -t f 'service_utils.py' reflexio/server/services --exec cat {}

Repository: ReflexioAI/reflexio

Length of output: 50377


Fix the README behavioral wording for tool prefixes and confirm slicing.

format_interactions_to_history_string() slices interaction.content using REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS via _resolve_max_interaction_content_tokens() (and inserts the truncation marker when over budget); this matches the slicing claim.

Tool-usage “prepending” is conditional: the [used tool: ...] prefix is added only when interaction.content is truthy and interaction.tools_used is present. The tests expect tool-only turns to carry non-empty placeholder content (e.g., "(tool call)") so that the marker line is still emitted. The “always prepends” phrasing in the description should therefore be qualified.
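
The two behaviors described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `service_utils.py`: the `Interaction` dataclass, the `role: content` line format, the marker text, the default budget, and the whitespace "tokenizer" are all hypothetical stand-ins. Only the environment variable name and the `[used tool: ...]` prefix come from the review.

```python
import os
from dataclasses import dataclass, field

TRUNCATION_MARKER = "...[truncated]"  # hypothetical marker text
DEFAULT_MAX_TOKENS = 2000  # hypothetical default budget


@dataclass
class Interaction:  # hypothetical stand-in for the real interaction model
    role: str
    content: str
    tools_used: list[str] = field(default_factory=list)


def resolve_max_tokens() -> int:
    """Read the budget from the environment, falling back to the default."""
    raw = os.environ.get("REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_TOKENS
    return value if value > 0 else DEFAULT_MAX_TOKENS


def slice_content(content: str, max_tokens: int) -> str:
    """Keep at most max_tokens tokens and append a marker when cut."""
    tokens = content.split()  # whitespace split stands in for a real tokenizer
    if len(tokens) <= max_tokens:
        return content
    return " ".join(tokens[:max_tokens]) + " " + TRUNCATION_MARKER


def format_history(interactions, max_tokens=None) -> str:
    budget = resolve_max_tokens() if max_tokens is None else max_tokens
    lines = []
    for item in interactions:
        if not item.content:
            continue  # assumption: empty-content turns emit nothing, so no prefix
        if item.tools_used:  # prefix needs both truthy content and tools_used
            lines.append(f"[used tool: {', '.join(item.tools_used)}]")
        lines.append(f"{item.role}: {slice_content(item.content, budget)}")
    return "\n".join(lines)
```

In this sketch a tool-only turn with placeholder content such as `"(tool call)"` still produces the prefix line, while a turn with empty content produces no line at all, which is why "always prepends" overstates the behavior.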

Comment on lines +45 to +46
Embedding requests leave the service layer through `llm/litellm_client.py`; local models (`local/minilm-l6-v2`, `local/nomic-embed-v1.5`) are routed by `llm/providers/embedding_service_provider.py` to the shared embedding daemon when reachable, otherwise to the in-process embedder.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify embedding routing from service layer through LiteLLM client

# Check litellm_client.py for embedding delegation
fd -t f 'litellm_client.py' reflexio/server/llm --exec rg -nP 'get_embedding|embed' {} -C5

# Verify embedding_service_provider.py routing logic
fd -t f 'embedding_service_provider.py' --exec cat {}

# Confirm model name examples are documented or used
rg -nP 'minilm-l6-v2|nomic-embed-v1\.5|local/minilm|local/nomic' --type=py -C2

# Check daemon fallback logic
rg -nP 'daemon.*reachable|embedding.*daemon.*fall|in-process.*embedder' --type=py -C3

Repository: ReflexioAI/reflexio

Length of output: 50375


Align README embedding routing paragraph with actual provider mode logic

  • The routing described is consistent with the implementation: LiteLLMClient.get_embedding() decides between the HTTP embedding service (embedding_service_provider.get_service_embeddings()) and the in-process local embedder (LocalEmbedder) based on embedding_provider_mode() / should_use_embedding_service().
  • The model examples local/minilm-l6-v2 and local/nomic-embed-v1.5 are valid (they’re referenced by the provider-mode logic and covered by tests).
  • The daemon/fallback behavior in the README is accurate for the default “model-driven” routing. The docs should also mention that daemon selection depends on the daemon's /health probe response (including an active_model match), and that setting CLAUDE_SMART_USE_LOCAL_EMBEDDING=1 or REFLEXIO_EMBEDDING_PROVIDER=local_service/internal_service bypasses the probe and can raise EmbeddingUnavailableError instead of falling back to the in-process embedder.
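
The routing decision described in these bullets can be sketched roughly as below. This is a simplified model, not the logic in `litellm_client.py` or `embedding_service_provider.py`: the function names, the injected callables, the `status == "ok"` health field, and the fallback on `ConnectionError` are assumptions, and the `cloud`, `inprocess`, and `off` modes are left out. The environment variable names, `active_model`, and `EmbeddingUnavailableError` come from the review.

```python
import os
from typing import Optional

# Modes that, per the review, skip the health probe
SERVICE_MODES = {"local_service", "internal_service"}


class EmbeddingUnavailableError(RuntimeError):
    """Stand-in for the error class named in the review."""


def probe_matches(health: Optional[dict], model: str) -> bool:
    # `health` is a parsed /health payload, or None when the daemon is unreachable.
    # The "status" field name and its "ok" value are assumptions.
    return (
        bool(health)
        and health.get("status") == "ok"
        and health.get("active_model") == model
    )


def choose_route(model: str, mode: str, forced: bool, health: Optional[dict]) -> str:
    if forced or mode in SERVICE_MODES:
        return "service"  # probe bypassed
    if model.startswith("local/") and probe_matches(health, model):
        return "service"  # model-driven routing with a healthy, matching daemon
    return "inprocess"


def get_embedding(text, model, call_service, call_inprocess, health=None, env=os.environ):
    mode = env.get("REFLEXIO_EMBEDDING_PROVIDER", "")
    forced = env.get("CLAUDE_SMART_USE_LOCAL_EMBEDDING") == "1"
    if choose_route(model, mode, forced, health) == "inprocess":
        return call_inprocess(text)
    try:
        return call_service(text)
    except ConnectionError as exc:
        if forced or mode in SERVICE_MODES:
            # Explicit service selection: surface the failure, no fallback
            raise EmbeddingUnavailableError("embedding service unreachable") from exc
        return call_inprocess(text)  # assumption: probed route may still fall back
```

Passing the service call, the in-process call, and the health payload as arguments keeps the sketch free of network access; the asymmetry to notice is that an explicit service mode raises where the default routing falls back.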