The open-source SDK for AI evaluation, observability, and optimization
- What is Future AGI?
- Installation
- Authentication
- 30-Second Examples
- Quick Start
- How It Works
- Core Use Cases
- Real-World Use Cases
- Why Choose Future AGI?
- Supported Integrations
- Documentation
- Language Support
- Support & Community
- Contributing
- Testimonials
- Roadmap
- Troubleshooting & FAQ
Your agent passed every eval. Then it hallucinated a refund policy that doesn't exist. Future AGI gives you the tools to catch that — datasets, prompt versioning, knowledge bases, evaluations, and guardrails. One SDK, one feedback loop.
```bash
# Get started in 30 seconds
pip install futureagi
export FI_API_KEY="your_key"
export FI_SECRET_KEY="your_secret"
```

👉 Get Free API Keys • View Live Demo • Read Quick Start Guide
- 🎯 Evaluations — 50+ metrics, LLM-as-judge, and custom rubrics powered by the Critique AI agent
- ⚡ Guardrails — Real-time safety checks with sub-100ms latency
- 📊 Datasets — Programmatically create, version, and manage training and test datasets
- 🎨 Prompt Workbench — Version control, A/B testing, and deployment labels for prompts
- 📚 Knowledge Base — Document management and retrieval for RAG applications
- 📈 Analytics — Model performance, token costs, and behavior insights
- 🤖 Simulate — Test your AI system against realistic scenarios before users hit it
- 🔍 Observability — OpenTelemetry-native tracing across 50+ frameworks
Python:

```bash
pip install futureagi
```

TypeScript/JavaScript:

```bash
npm install @future-agi/sdk
# or
pnpm add @future-agi/sdk
```

Requirements: Python >= 3.10 | Node.js >= 14
Get your API credentials from the Future AGI Dashboard:
```bash
export FI_API_KEY="your_api_key"
export FI_SECRET_KEY="your_secret_key"
```

Or set them programmatically:

```python
import os

os.environ["FI_API_KEY"] = "your_api_key"
os.environ["FI_SECRET_KEY"] = "your_secret_key"
```

Create and manage datasets with built-in evaluations:
```python
from fi.datasets import Dataset
from fi.datasets.types import (
    Cell, Column, DatasetConfig, DataTypeChoices,
    ModelTypes, Row, SourceChoices
)

# Create a new dataset
config = DatasetConfig(name="qa_dataset", model_type=ModelTypes.GENERATIVE_LLM)
dataset = Dataset(dataset_config=config)
dataset = dataset.create()

# Define columns
columns = [
    Column(name="user_query", data_type=DataTypeChoices.TEXT, source=SourceChoices.OTHERS),
    Column(name="ai_response", data_type=DataTypeChoices.TEXT, source=SourceChoices.OTHERS),
    Column(name="quality_score", data_type=DataTypeChoices.INTEGER, source=SourceChoices.OTHERS),
]

# Add data
rows = [
    Row(order=1, cells=[
        Cell(column_name="user_query", value="What is machine learning?"),
        Cell(column_name="ai_response", value="Machine learning is a subset of AI..."),
        Cell(column_name="quality_score", value=9),
    ]),
    Row(order=2, cells=[
        Cell(column_name="user_query", value="Explain quantum computing"),
        Cell(column_name="ai_response", value="Quantum computing uses quantum bits..."),
        Cell(column_name="quality_score", value=8),
    ]),
]

# Push data and run evaluations
dataset = dataset.add_columns(columns=columns)
dataset = dataset.add_rows(rows=rows)

# Add automated evaluation
dataset.add_evaluation(
    name="factual_accuracy",
    eval_template="is_factually_consistent",
    model="gpt-4o-mini",
    required_keys_to_column_names={
        "input": "user_query",
        "output": "ai_response",
        "context": "user_query",
    },
    run=True
)

print("✓ Dataset created with automated evaluations")
```
Version control and A/B test your prompts:

```python
from fi.prompt import Prompt, PromptTemplate, ModelConfig

# Create a versioned prompt template
template = PromptTemplate(
    name="customer_support",
    messages=[
        {"role": "system", "content": "You are a helpful customer support agent."},
        {"role": "user", "content": "Help {{customer_name}} with {{issue_type}}."},
    ],
    variable_names={"customer_name": ["Alice"], "issue_type": ["billing"]},
    model_configuration=ModelConfig(model_name="gpt-4o-mini", temperature=0.7)
)

# Create and version the template
client = Prompt(template)
client.create()  # Create v1
client.commit_current_version("Initial version", set_default=True)

# Assign deployment labels
client.assign_label("Production", version="v1")

# Compile with variables
compiled = client.compile(customer_name="Bob", issue_type="refund")
print(compiled)
# Output: [
#   {"role": "system", "content": "You are a helpful customer support agent."},
#   {"role": "user", "content": "Help Bob with refund."}
# ]
```

A/B Testing Example:
```python
import random

from openai import OpenAI

from fi.prompt import Prompt

# Fetch different variants (returns Prompt instances)
variant_a = Prompt.get_template_by_name("customer_support", label="variant-a")
variant_b = Prompt.get_template_by_name("customer_support", label="variant-b")

# Randomly select and use
selected = random.choice([variant_a, variant_b])
compiled = selected.compile(customer_name="Alice", issue_type="refund")

# Send to your LLM provider
openai = OpenAI(api_key="your_openai_key")
response = openai.chat.completions.create(model="gpt-4o", messages=compiled)

print(f"Using variant: {selected.template.name}")
print(f"Response: {response.choices[0].message.content}")
```
Manage documents for retrieval-augmented generation:

```python
from fi.kb import KnowledgeBase

# Initialize client
kb_client = KnowledgeBase(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key"
)

# Create a knowledge base with documents
kb = kb_client.create_kb(
    name="product_docs",
    file_paths=["manual.pdf", "faq.txt", "guide.docx"]
)
print(f"✓ Knowledge base created: {kb.kb.name}")
print(f"  Files uploaded: {len(kb.kb.files)}")

# Update with more files
updated_kb = kb_client.update_kb(
    kb_name=kb.kb.name,
    file_paths=["updates.pdf"]
)

# Delete specific files
kb_client.delete_files_from_kb(file_names=["updates.pdf"])

# Clean up
kb_client.delete_kb(kb_ids=[kb.kb.id])
```

| Feature | Use Case | Benefit |
|---|---|---|
| Datasets | Store and version training/test data | Reproducible experiments, automated evaluations |
| Prompt Workbench | Version control for prompts | A/B testing, deployment management, rollback |
| Knowledge Base | Evaluations and synthetic data | Intelligent retrieval, document versioning |
| Evaluations | Automated quality checks | No human-in-the-loop, 100% configurable |
| Protect | Real-time safety filters | Sub-100ms latency, production-ready |
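Protect is meant to run inline, screening traffic before and after the model call. The sketch below only shows the shape of that flow with a stand-in check function; the actual Protect client and rule configuration are covered in the guardrails docs, so treat every name here as a placeholder:

```python
from typing import Callable

def passes_guardrails(text: str) -> bool:
    """Stand-in for a Protect check; swap in the real guardrail call from the docs."""
    blocked_terms = ["credit card number", "ssn"]  # toy rule, for illustration only
    return not any(term in text.lower() for term in blocked_terms)

def guarded_reply(llm_call: Callable[[str], str], user_input: str) -> str:
    # Screen the incoming request before spending tokens on it
    if not passes_guardrails(user_input):
        return "Sorry, I can't help with that."
    draft = llm_call(user_input)
    # Screen the generated answer before the user ever sees it
    if not passes_guardrails(draft):
        return "Sorry, I can't share that response."
    return draft
```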
| Feature | Future AGI | Traditional Tools | Other Platforms |
|---|---|---|---|
| Evaluation Speed | ⚡ Sub-100ms | 🐌 Seconds-Minutes | 🐢 Minutes-Hours |
| Human in Loop | ❌ Fully Automated | ✅ Required | ✅ Often Required |
| Multimodal Support | ✅ Text, Image, Audio, Video | | |
| Setup Time | ⏱️ 2 minutes | ⏳ Days-Weeks | ⏳ Hours-Days |
| Configurability | 🎯 100% Customizable | 🔒 Fixed Metrics | ⚙️ Some Flexibility |
| Privacy Options | 🔐 Cloud + Self-hosted | ☁️ Cloud Only | ☁️ Cloud Only |
| A/B Testing | ✅ Built-in | ❌ Manual | |
| Prompt Versioning | ✅ Git-like Control | ❌ Not Available | |
| Real-time Guardrails | ✅ Production-ready | ❌ Not Available | |
Future AGI works seamlessly with your existing AI stack:
LLM Providers
OpenAI • Anthropic • Google Gemini • Azure OpenAI • AWS Bedrock • Cohere • Mistral • Ollama • vLLM
Frameworks
LangChain • LlamaIndex • CrewAI • AutoGen • Haystack • Semantic Kernel
Vector Databases
Pinecone • Weaviate • Qdrant • Milvus • Chroma • FAISS
Observability
OpenTelemetry • Custom Logging • Trace Context Propagation
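Because tracing is OpenTelemetry-native, spans you already emit with the standard `opentelemetry-sdk` can be routed to Future AGI by swapping in the platform's exporter. Here is a vendor-neutral sketch of that pattern; the `ConsoleSpanExporter` is just a stand-in, and the Future AGI exporter setup lives in the observability docs:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Standard OpenTelemetry setup; replace ConsoleSpanExporter with the exporter
# from the Future AGI observability docs to ship spans to the platform.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-llm-app")

with tracer.start_as_current_span("llm.chat") as span:
    span.set_attribute("llm.model", "gpt-4o-mini")
    # ... call your model here and record outputs/token counts as span attributes ...
```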
| Language | Package | Status |
|---|---|---|
| Python | `futureagi` | ✅ Full Support |
| TypeScript/JavaScript | `@future-agi/sdk` | ✅ Full Support |
| REST API | cURL/HTTP | ✅ Available |
- 📧 Email: support@futureagi.com
- 💼 LinkedIn: Future AGI Company
- 🐦 X (Twitter): @FutureAGI_
- 📰 Substack: Future AGI Blog
We welcome contributions! Here's how to get involved:
- 🐛 Report bugs: Open an issue
- 💡 Request features: Start a discussion
- 🔧 Submit PRs: Fork, create a feature branch, and submit a pull request
- 📖 Improve docs: Help us make our documentation better
See CONTRIBUTING.md for detailed guidelines.
"Future AGI cut our evaluation time from days to minutes. The automated critiques are spot-on!"
— AI Engineering Team, Fortune 500 Company
"The prompt versioning alone saved us countless headaches. A/B testing is now trivial."
— ML Lead, Healthcare Startup
"Sub-100ms guardrails in production. Game changer for our customer-facing AI."
— CTO, E-commerce Platform
- Datasets with automated evaluations
- Prompt workbench with versioning
- Knowledge base for RAG
- Real-time guardrails (sub-100ms)
- Multi-language SDK (Python + TypeScript)
- Bulk Annotations for Human in the Loop
- On-premise deployment toolkit
Import Error: `ModuleNotFoundError: No module named 'fi'`
Make sure Future AGI is installed:
```bash
pip install futureagi --upgrade
```

Authentication Error: Invalid API credentials
- Check your API keys at Dashboard
- Ensure environment variables are set correctly:
  ```bash
  echo $FI_API_KEY
  echo $FI_SECRET_KEY
  ```
- Try setting them programmatically in your code
How do I switch between environments (dev/staging/prod)?
Use prompt labels to manage different deployment environments:
client.assign_label("Development", version="v1")
client.assign_label("Staging", version="v2")
client.assign_label("Production", version="v3")Can I use Future AGI without sending data to the cloud?
Can I use Future AGI without sending data to the cloud?
Yes! Future AGI supports self-hosted deployments. Contact us at support@futureagi.com for enterprise on-premise options.
What LLM providers are supported?
All major providers: OpenAI, Anthropic, Google, Azure, AWS Bedrock, Cohere, Mistral, and open-source models via vLLM/Ollama.
Need more help? Check our complete FAQ or join our community.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Built with ❤️ by the Future AGI team and contributors.
If Future AGI helps you ship better AI, a ⭐ helps more teams find us.
🌐 futureagi.com · 📖 docs.futureagi.com · ☁️ app.futureagi.com
