Deterministic tools for exact tasks that small models still miss.
FluxEM is now tools-first.
- Supported product surface: `fluxem-tools-pkg/` -> `fluxem-tools`
- Supported release scope: a validated core of 35 tools across 9 domains
- Experimental or archived: `fluxem/`, old RL/research harnesses, research docs, and older benchmark artifacts
Archive details live in docs/ARCHIVE_STATUS.md.
fluxem-tools is a deterministic tool catalog for LLM tool calling. The package registry currently contains 210 tools across 42 domains, but the supported release surface is intentionally narrower:
`arithmetic`, `number_theory`, `physics`, `chemistry`, `biology`, `statistics`, `finance`, `temporal`, `information_theory`
That validated core is the current product story. The broader catalog stays in-repo as experimental surface until it is benchmarked and documented to the same standard.
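As an illustration of the split between the validated core and the experimental catalog, a consumer could filter a tool listing down to the nine validated domains. The sketch below is illustrative only: the domain names come from the list above, but the `(name, domain)` catalog shape and the `core_tools` helper are assumptions, not the package's registry API.

```python
# Validated-core domains, as listed in the release scope above.
VALIDATED_DOMAINS = {
    "arithmetic", "number_theory", "physics", "chemistry", "biology",
    "statistics", "finance", "temporal", "information_theory",
}

def core_tools(catalog):
    """Keep only tools whose domain is in the validated core.

    `catalog` is a hypothetical list of (tool_name, domain) pairs; the
    real package's registry interface may differ.
    """
    return [name for name, domain in catalog if domain in VALIDATED_DOMAINS]

# Toy catalog: two core tools, one experimental tool.
catalog = [
    ("arithmetic", "arithmetic"),
    ("finance_npv", "finance"),
    ("graph_shortest_path", "graphs"),  # experimental domain, filtered out
]
print(core_tools(catalog))  # -> ['arithmetic', 'finance_npv']
```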
- `fluxem-tools-pkg/` - supported package, release gate, catalog manifest, validation scripts
- `examples/` - small-model and OpenAI-compatible integration examples
- `benchmarks/` - current Qwen3.5-2B benchmark harness
- `fluxem/` - archived research track
- `experiments/` - archived or experimental evaluation/training work
- `docs/` - package docs plus archived research pages
```bash
pip install fluxem-tools
```

```python
from fluxem_tools import call_tool

print(call_tool("arithmetic", expr="789123 * 456789"))
print(call_tool("physics_convert", value=88, from_unit="ft/s", to_unit="m/s"))
print(call_tool("finance_npv", rate=0.1, cashflows=[-100, 30, 40, 50, 60]))
```

Benchmarks and local model downloads should use the SSD-backed Hugging Face root at `/Volumes/VIXinSSD/huggingface`:
```bash
export FLUXEM_HF_ROOT=/Volumes/VIXinSSD/huggingface
export HF_HOME=/Volumes/VIXinSSD/huggingface
export HF_HUB_CACHE=/Volumes/VIXinSSD/huggingface/hub
export HF_DATASETS_CACHE=/Volumes/VIXinSSD/huggingface/datasets
export TRANSFORMERS_CACHE=/Volumes/VIXinSSD/huggingface/hub
```

Serve the validated model locally and run with retrieved tools:
```bash
# Start the MLX server
pip install mlx-lm
mlx_lm.server --model mlx-community/Qwen3.5-2B-OptiQ-4bit --port 8000

# Run a query
uv run python examples/qwen35_2b_openai_tools.py \
  "What is 789123 * 456789?" \
  --base-url http://127.0.0.1:8000/v1
```

Qwen3.5-2B (4-bit, MLX) on 18 cases across 9 validated domains:
| Condition | Accuracy | Avg Latency |
|---|---|---|
| Model only | 38.9% | 908 ms |
| Retrieved tools (max 8) | 88.9% | 911 ms |
| Full registry (210 tools) | 72.2% | 1302 ms |
The retrieved-tools path is the recommended integration for models under 7B. Smaller tool surfaces produce better results on small models.
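The benchmark numbers above suggest why retrieval matters: the full 210-tool registry costs both accuracy and latency on a 2B model. A retrieval step can be approximated with a simple lexical scorer. The sketch below is a toy illustration under stated assumptions: the tool descriptions and the word-overlap scoring rule are invented here and are not the package's actual retriever.

```python
def retrieve_tools(query, tools, k=8):
    """Rank tools by word overlap with the query and keep the top k.

    `tools` maps tool name -> short description. This is a toy lexical
    retriever; the real retrieval path in fluxem-tools may differ.
    """
    query_words = set(query.lower().split())

    def score(item):
        name, desc = item
        tool_words = set(name.replace("_", " ").split()) | set(desc.lower().split())
        return len(query_words & tool_words)

    ranked = sorted(tools.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:k]]

# Illustrative descriptions (not the package's real metadata).
tools = {
    "arithmetic": "evaluate an arithmetic expression exactly",
    "finance_npv": "net present value of a cashflow series",
    "physics_convert": "convert a value between physical units",
}
print(retrieve_tools("what is the net present value of these cashflows", tools, k=1))
# -> ['finance_npv']
```

Capping the surface at a handful of tools keeps the prompt short, which is consistent with the table above: the retrieved-tools condition matches the model-only latency while more than doubling accuracy.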
Entry points:
- `examples/qwen35_2b_openai_tools.py` - end-to-end example
- `benchmarks/qwen35_2b_tools_benchmark.py` - benchmark harness
The package now owns the release-facing catalog artifacts (the catalog manifest and validation scripts under `fluxem-tools-pkg/`).
To regenerate them:
```bash
cd fluxem-tools-pkg
python scripts/build_catalog_artifacts.py --strict-core
```

For the MCP server, install the optional extra:

```bash
pip install "fluxem-tools[mcp]"
```
```bash
fluxem-tools-mcp
```

The server exposes:
- `fluxem_search`
- `fluxem_call`
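The two tools compose naturally: search first to narrow the catalog, then call the selected tool. The payload shapes below are an assumption about the MCP tool-call arguments (the field names `query`, `limit`, `tool`, and `kwargs` are illustrative, not the server's published schema); consult the package docs for the real schemas.

```python
# Hypothetical MCP tool-call payloads; argument field names are
# assumptions, not the server's published schema.
search_request = {
    "name": "fluxem_search",
    "arguments": {"query": "net present value", "limit": 8},
}
call_request = {
    "name": "fluxem_call",
    "arguments": {
        "tool": "finance_npv",
        "kwargs": {"rate": 0.1, "cashflows": [-100, 30, 40, 50, 60]},
    },
}
print(search_request["name"], call_request["arguments"]["tool"])
```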
The algebraic embedding work remains in the repository as archive material only. It is not the supported product story and it is not the release path for current FluxEM work.