docs: sync README files with latest repo state by yyiilluu · Pull Request #139 · ReflexioAI/reflexio

yyiilluu · 2026-06-09T18:54:25Z

Summary

Updates code-map README navigation for the latest server embedding routing architecture.
Documents long-interaction prompt slicing via REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS in the service-layer maps.

Repositories/submodules reviewed

Parent: yyiilluu/reflexio-enterprise
Submodule: ReflexioAI/reflexio

README files updated

reflexio/server/README.md
reflexio/server/services/README.md

Validation

git diff --check
Local README relative link check for changed files

Notes/Risks

Updates follow /root/reflexio-enterprise/how_to_write_readme.md and are limited to model-facing code-map README files.
No parent submodule pointer commit is included; this PR is scoped to the submodule repository.

Summary by CodeRabbit

Documentation

Added documentation on embedding provider configuration modes: cloud, local_service, internal_service, inprocess, and off.
Clarified local embedding model routing with automatic fallback support when services are unavailable.
Updated documentation on interaction content token limits for embedding processing.
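
As a rough illustration of the mode list above, the sketch below parses `REFLEXIO_EMBEDDING_PROVIDER` into one of the five documented modes. It is not taken from the repository: the `EmbeddingProviderMode` enum, the `parse_mode()` helper, the case-insensitive matching, and the choice to raise on unknown values are all assumptions made for the example.

```python
import os
from enum import Enum


class EmbeddingProviderMode(Enum):
    """The five modes listed in the PR summary (hypothetical enum, not the repo's type)."""

    CLOUD = "cloud"
    LOCAL_SERVICE = "local_service"
    INTERNAL_SERVICE = "internal_service"
    INPROCESS = "inprocess"
    OFF = "off"


def parse_mode(env=None):
    """Return the configured mode, or None when the variable is unset or empty."""
    env = os.environ if env is None else env
    raw = env.get("REFLEXIO_EMBEDDING_PROVIDER", "").strip().lower()
    if not raw:
        # Assumption: an unset variable leaves routing to the model-driven default.
        return None
    try:
        return EmbeddingProviderMode(raw)
    except ValueError:
        allowed = ", ".join(mode.value for mode in EmbeddingProviderMode)
        # Assumption: unknown values are rejected rather than silently ignored.
        raise ValueError(f"unknown embedding provider mode {raw!r}; expected one of: {allowed}") from None
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise in tests, for example `parse_mode({"REFLEXIO_EMBEDDING_PROVIDER": "inprocess"})`.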

coderabbitai · 2026-06-09T18:54:38Z

Walkthrough

Documentation updates across server README files clarify embedding provider routing through shared daemon services with fallback, per-interaction content slicing with token limits, and tool-usage prefixing behavior in interaction history formatting.

Changes

Server and service documentation updates

Layer / File(s)	Summary
Embedding routing and content construction documentation `reflexio/server/README.md`, `reflexio/server/services/README.md`	Documents embedding provider routing for `local/*` models through shared OpenAI-compatible daemon with fallback to in-process embedders and `REFLEXIO_EMBEDDING_PROVIDER` modes. Clarifies per-interaction embedding content slicing bounded by `REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS` and the embedding call path through `llm/litellm_client.py` and `embedding_service_provider.py`.
Tool-usage behavior documentation `reflexio/server/README.md`	Updates `service_utils.py` documentation to clarify that `format_interactions_to_history_string()` always prepends tool-usage info when present, without conditional logic.

Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes

A rabbit hops through docs so bright,
Embedding daemons route just right,
With fallbacks and tokens all in place,
Tool wisdom flows without a trace. 🐰✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'docs: sync README files with latest repo state' directly and accurately describes the changeset, which updates two README.md files to reflect current embedding routing and interaction slicing behavior.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@reflexio/server/README.md`:
- Line 226: The README description is inaccurate: update the bullet for
service_utils.py to state that format_interactions_to_history_string() slices
interaction.content using REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS via
_resolve_max_interaction_content_tokens() and inserts a truncation marker when
content is sliced, and clarify that the “[used tool: ...]” prefix is conditional
(added only when interaction.content is truthy and interaction.tools_used
exists) rather than always prepended; mention that tool-only turns in tests
include non-empty placeholder content so the marker line still appears.

In `@reflexio/server/services/README.md`:
- Around line 45-46: Update the README paragraph to reflect the actual routing
logic: state that LiteLLMClient.get_embedding() chooses between calling
embedding_service_provider.get_service_embeddings() (HTTP/shared daemon) and the
in-process LocalEmbedder based on
embedding_provider_mode()/should_use_embedding_service(); clarify that daemon
selection requires the daemon /health probe to report a matching active_model
and healthy status, and that setting CLAUDE_SMART_USE_LOCAL_EMBEDDING=1 or
REFLEXIO_EMBEDDING_PROVIDER=local_service|internal_service bypasses the
probe—when bypassed the code may raise EmbeddingUnavailableError instead of
falling back to the in-process embedder.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2e6207c4-fdb2-404b-aff4-f25c342bb48c

📥 Commits

Reviewing files that changed from the base of the PR and between 671e822 and 61074b9.

📒 Files selected for processing (2)

reflexio/server/README.md
reflexio/server/services/README.md

coderabbitai · 2026-06-09T18:59:24Z

 - `operation_state_utils.py`: Centralized `OperationStateManager` for all `_operation_state` table interactions (progress tracking, concurrency locks, extractor/aggregator bookmarks, simple locks)
 - `deduplication_utils.py`: Shared utilities for LLM-based deduplication (used by ProfileDeduplicator and PlaybookConsolidator)
-- `service_utils.py`: Utilities (`construct_messages_from_interactions()`, `format_interactions_to_history_string()` (prepends tool usage info when `tools_used` is present), `extract_json_from_string()`, `log_model_response()` for colored LLM response logging)
+- `service_utils.py`: Utilities (`construct_messages_from_interactions()`, `format_interactions_to_history_string()` (prepends tool usage info and slices long interaction content via `REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS`), `extract_json_from_string()`, `log_model_response()` for colored LLM response logging)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify format_interactions_to_history_string implementation

# Find the function definition
ast-grep --pattern $'def format_interactions_to_history_string($$$) { $$$ }'

# Search for tool usage prepending logic
rg -nP 'format_interactions_to_history_string|tool.?usage.*prepend|prepend.*tool' --type=py -C5

# Verify REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS usage
rg -nP 'REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS' --type=py -C3

# Check service_utils.py specifically
fd -t f 'service_utils.py' reflexio/server/services --exec cat {}

Repository: ReflexioAI/reflexio

Length of output: 50377

Fix the README behavioral wording for tool prefixes and confirm slicing.

format_interactions_to_history_string() slices interaction.content using REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS via _resolve_max_interaction_content_tokens() (and inserts the truncation marker when over budget); this matches the slicing claim.

Tool-usage “prepending” is conditional: the [used tool: ...] prefix is added only when interaction.content is truthy and interaction.tools_used is present. The tests show tool-only turns are expected to carry non-empty placeholder content (e.g., "(tool call)") so the marker line is still emitted—so the “always prepends” phrasing in the description should be qualified.
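
For readers skimming this thread, the two behaviors described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code in `service_utils.py`: the helper names, the default budget of 2000, the marker text, the dict-based interaction shape, the decision to skip empty-content turns entirely, and the use of whitespace word counts in place of real tokenization are all assumptions.

```python
import os

DEFAULT_MAX_TOKENS = 2000  # hypothetical default; the real default is not stated in this PR
TRUNCATION_MARKER = "... [truncated]"  # hypothetical marker text


def resolve_max_tokens(env=None):
    """Read the budget from REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS, else use the default."""
    env = os.environ if env is None else env
    raw = env.get("REFLEXIO_MAX_INTERACTION_CONTENT_TOKENS", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_MAX_TOKENS
    return value if value > 0 else DEFAULT_MAX_TOKENS


def slice_content(content, max_tokens):
    """Keep at most max_tokens whitespace-separated words and flag any cut with a marker."""
    words = content.split()
    if len(words) <= max_tokens:
        return content
    return " ".join(words[:max_tokens]) + " " + TRUNCATION_MARKER


def format_history(interactions, max_tokens):
    """Build a history string; the tool line appears only for turns with non-empty content."""
    lines = []
    for item in interactions:
        content = item.get("content")
        if not content:
            # Assumption: an empty turn emits nothing, so its tools_used are never shown.
            continue
        tools = item.get("tools_used") or []
        if tools:
            lines.append("[used tool: " + ", ".join(tools) + "]")
        lines.append(f"{item.get('role', 'user')}: {slice_content(content, max_tokens)}")
    return "\n".join(lines)
```

In this sketch a tool-only turn with placeholder content such as `"(tool call)"` still produces its `[used tool: ...]` line, while the same turn with empty content produces nothing, which is the distinction the comment asks the README to make explicit.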

coderabbitai · 2026-06-09T18:59:24Z

+Embedding requests leave the service layer through `llm/litellm_client.py`; local models (`local/minilm-l6-v2`, `local/nomic-embed-v1.5`) are routed by `llm/providers/embedding_service_provider.py` to the shared embedding daemon when reachable, otherwise to the in-process embedder.
+


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify embedding routing from service layer through LiteLLM client

# Check litellm_client.py for embedding delegation
fd -t f 'litellm_client.py' reflexio/server/llm --exec rg -nP 'get_embedding|embed' {} -C5

# Verify embedding_service_provider.py routing logic
fd -t f 'embedding_service_provider.py' --exec cat {}

# Confirm model name examples are documented or used
rg -nP 'minilm-l6-v2|nomic-embed-v1\.5|local/minilm|local/nomic' --type=py -C2

# Check daemon fallback logic
rg -nP 'daemon.*reachable|embedding.*daemon.*fall|in-process.*embedder' --type=py -C3

Repository: ReflexioAI/reflexio

Length of output: 50375

Align README embedding routing paragraph with actual provider mode logic

The routing described is consistent with the implementation: LiteLLMClient.get_embedding() decides between the HTTP embedding service (embedding_service_provider.get_service_embeddings()) and the in-process local embedder (LocalEmbedder) based on embedding_provider_mode() / should_use_embedding_service().

The model examples local/minilm-l6-v2 and local/nomic-embed-v1.5 are valid (they’re referenced by the provider-mode logic and covered by tests).

The daemon/fallback behavior in the README is accurate for the default “model-driven” routing, but the docs should also mention that daemon selection depends on the daemon health probe’s /health response (including active_model match), and that setting CLAUDE_SMART_USE_LOCAL_EMBEDDING=1 or REFLEXIO_EMBEDDING_PROVIDER=local_service/internal_service bypasses the probe and can raise EmbeddingUnavailableError instead of falling back to in-process.
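
The difference between the default routing and the forced service modes may be easier to see in code. The sketch below is a simplified model of the behavior described above, not the logic in `litellm_client.py` or `embedding_service_provider.py`: the function signatures, the injected callables, the `"status": "ok"` health field, and the use of `ConnectionError` to signal an unreachable daemon are assumptions. Only the `active_model` field, the two environment variables, and the `EmbeddingUnavailableError` name come from this review. The `cloud`, `inprocess`, and `off` modes are left out for brevity.

```python
class EmbeddingUnavailableError(RuntimeError):
    """Stand-in for the error named in the review; the real class lives in the repository."""


FORCED_SERVICE_MODES = {"local_service", "internal_service"}


def daemon_is_usable(health, model):
    """True when a parsed /health payload reports a healthy daemon serving `model`."""
    if not health:
        return False
    # Assumption: health is a dict with "status" == "ok"; "active_model" is named in the review.
    return health.get("status") == "ok" and health.get("active_model") == model


def get_embedding(text, model, env, probe_health, call_service, call_inprocess):
    """Route one embedding request for a local/* model."""
    forced = (
        env.get("REFLEXIO_EMBEDDING_PROVIDER") in FORCED_SERVICE_MODES
        or env.get("CLAUDE_SMART_USE_LOCAL_EMBEDDING") == "1"
    )
    if forced:
        # Probe bypassed: a failing service surfaces as an error, with no in-process fallback.
        try:
            return call_service(text, model)
        except ConnectionError as exc:
            raise EmbeddingUnavailableError(f"embedding service unavailable for {model}") from exc
    # Default model-driven routing: use the daemon only when the probe matches, else fall back.
    if daemon_is_usable(probe_health(), model):
        return call_service(text, model)
    return call_inprocess(text, model)
```

Injecting `probe_health`, `call_service`, and `call_inprocess` as callables keeps the sketch free of network code, so each branch can be exercised with simple stubs.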

docs: sync README files with latest repo state

61074b9

coderabbitai Bot reviewed Jun 9, 2026

yyiilluu wants to merge 1 commit into main from docs/sync-readmes-20260609
