Conversation

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
- Configuration used: Organization UI
- Review profile: CHILL
- Plan: Pro
- Run ID:
- 📒 Files selected for processing (1)
- ✅ Files skipped from review due to trivial changes (1)

Walkthrough: Added two new RAG evaluation docs.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🧹 Nitpick comments (2)
docs/public/ragas-rag-eval.ipynb (2)
41-45: Add commented version-pinned installation example. The documentation (`ragas.mdx` line 107) emphasizes version pinning for reproducibility, but the notebook installs packages without version constraints. Consider adding a commented alternative showing pinned versions.
📌 Suggested addition for version pinning
```diff
 # Use current kernel's Python so PATH does not point to another env
 # If download is slow, add: -i https://pypi.tuna.tsinghua.edu.cn/simple
+# For reproducible benchmarks, pin versions (example):
+# !{sys.executable} -m pip install "ragas==0.1.9" "datasets==2.18.0" "openai==1.12.0"
 import sys
 !{sys.executable} -m pip install "ragas" "datasets" "openai"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/public/ragas-rag-eval.ipynb` around lines 41 - 45, The notebook currently installs packages with a generic pip command using sys.executable ("!{sys.executable} -m pip install \"ragas\" \"datasets\" \"openai\"") which conflicts with the repo guidance to pin versions; add a commented alternative right after that line showing a version-pinned install (e.g., a commented pip command that pins specific versions for ragas, datasets, and openai) and include a short comment explaining this is for reproducibility and when to use it; keep the original unpinned command intact but make the pinned example clearly visible and updatable.
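The pinned versions the comment asks for can also be generated from the environment itself, so the commented example never goes stale. A minimal sketch using only the stdlib `importlib.metadata`; the `pinned_install_line` helper is hypothetical illustration, not part of the notebook, and the package names are the ones the notebook installs:

```python
# Sketch: capture the versions actually installed in the current kernel,
# so a pinned install line can be pasted into the commented alternative.
from importlib import metadata

def pinned_install_line(packages):
    """Build a 'pip install pkg==ver ...' line from installed versions."""
    pins = []
    for pkg in packages:
        try:
            pins.append(f'"{pkg}=={metadata.version(pkg)}"')
        except metadata.PackageNotFoundError:
            pins.append(f'"{pkg}"')  # not installed: leave unpinned
    return "pip install " + " ".join(pins)

print(pinned_install_line(["ragas", "datasets", "openai"]))
```

Running this once in the working environment yields the exact pins to paste into the notebook comment, rather than hand-picking version numbers.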
199-247: LGTM: Correct baseline metrics evaluation. The evaluation flow properly instantiates metrics with required dependencies (llm, embeddings), uses async `ascore()` calls with correct arguments, and computes both aggregate and per-row results.

Optional: Consider adding error handling for API failures. For production use, wrapping the scoring calls in try-except blocks could provide better resilience against transient API failures.
🛡️ Optional error handling pattern
```python
async def score_baseline_rows(ds):
    rows = ds.to_list()
    scored = []
    for idx, row in enumerate(rows):
        try:
            faithfulness_result = await faithfulness_metric.ascore(
                user_input=row["user_input"],
                response=row["response"],
                retrieved_contexts=row["retrieved_contexts"],
            )
            answer_relevancy_result = await answer_relevancy_metric.ascore(
                user_input=row["user_input"],
                response=row["response"],
            )
            scored.append({
                "user_input": row["user_input"],
                "faithfulness": faithfulness_result.value,
                "answer_relevancy": answer_relevancy_result.value,
            })
        except Exception as e:
            print(f"Error scoring row {idx}: {e}")
            # Optionally append None values or skip
    return scored
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/public/ragas-rag-eval.ipynb` around lines 199 - 247, Wrap the asynchronous scoring loop in score_baseline_rows with error handling so transient API/LLM failures don't crash the whole evaluation: inside score_baseline_rows (where faithfulness_metric.ascore and answer_relevancy_metric.ascore are called) add a try/except around the two await calls, log or print the exception with the row index and user_input, and decide on a stable fallback (e.g., append a result with None or sentinel values for "faithfulness" and "answer_relevancy", or skip the row) before continuing; ensure exceptions from either faithfulness_metric.ascore or answer_relevancy_metric.ascore are caught so the loop continues for remaining rows.
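Beyond catch-and-skip, transient API errors are often worth retrying before giving up on a row. A generic retry-with-backoff sketch, independent of the ragas API; `ascore_with_retry`, its parameters, and the fake `flaky_metric` below are illustrative assumptions, not code from the notebook:

```python
import asyncio

async def ascore_with_retry(ascore_fn, *, max_tries=3, base_delay=0.1, **kwargs):
    """Await an async metric call, retrying failures with exponential backoff."""
    for attempt in range(1, max_tries + 1):
        try:
            return await ascore_fn(**kwargs)
        except Exception:
            if attempt == max_tries:
                raise  # out of retries: surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))

# Illustrative fake metric that fails once, then succeeds.
calls = {"n": 0}
async def flaky_metric(**kwargs):
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient API error")
    return {"value": 0.9}

result = asyncio.run(ascore_with_retry(flaky_metric, user_input="q"))
print(result)  # {'value': 0.9} after one retry
```

Inside `score_baseline_rows`, each `await ...ascore(...)` call could be routed through such a wrapper so that a single flaky request does not cost the whole row.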
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/en/ragas.mdx`:
- Around line 71-88: The docs show embedding_factory(..., interface="modern")
but the notebook uses OpenAIEmbeddings(), causing inconsistent examples; pick
one pattern and make both examples match. Either update the notebook to
instantiate embeddings via embedding_factory(provider="openai",
model="your-embedding-model", client=client, interface="modern") to mirror the
MDX example, or update the MDX snippet to use OpenAIEmbeddings(...) (the same
constructor and client as the notebook); ensure references to embedding_factory
and OpenAIEmbeddings in the examples are consistent and keep llm_factory usage
unchanged.
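One way to keep the two examples from drifting again, whichever constructor the docs standardize on, is to route both through a single helper. Everything below is a hypothetical sketch: `build_embeddings` and `StubEmbeddings` are illustrative stand-ins, not ragas or OpenAI APIs:

```python
# Hypothetical sketch: a single choke point for constructing embeddings, so
# the MDX example and the notebook cannot diverge into different APIs.

def build_embeddings(factory, **kwargs):
    """Swap OpenAIEmbeddings <-> embedding_factory in exactly one place."""
    return factory(**kwargs)

class StubEmbeddings:
    """Stand-in for whichever embeddings class the docs standardize on."""
    def __init__(self, model="your-embedding-model"):
        self.model = model

    def embed_text(self, text):
        # Deterministic toy embedding for demonstration only.
        return [float(len(text)), float(len(text.split()))]

emb = build_embeddings(StubEmbeddings, model="your-embedding-model")
vec = emb.embed_text("hello world")
```

With this shape, switching the docs from `OpenAIEmbeddings` to `embedding_factory` (or back) means editing one call site instead of auditing every example.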
---
Nitpick comments:
In `@docs/public/ragas-rag-eval.ipynb`:
- Around line 41-45: The notebook currently installs packages with a generic pip
command using sys.executable ("!{sys.executable} -m pip install \"ragas\"
\"datasets\" \"openai\"") which conflicts with the repo guidance to pin
versions; add a commented alternative right after that line showing a
version-pinned install (e.g., a commented pip command that pins specific
versions for ragas, datasets, and openai) and include a short comment explaining
this is for reproducibility and when to use it; keep the original unpinned
command intact but make the pinned example clearly visible and updatable.
- Around line 199-247: Wrap the asynchronous scoring loop in score_baseline_rows
with error handling so transient API/LLM failures don't crash the whole
evaluation: inside score_baseline_rows (where faithfulness_metric.ascore and
answer_relevancy_metric.ascore are called) add a try/except around the two await
calls, log or print the exception with the row index and user_input, and decide
on a stable fallback (e.g., append a result with None or sentinel values for
"faithfulness" and "answer_relevancy", or skip the row) before continuing;
ensure exceptions from either faithfulness_metric.ascore or
answer_relevancy_metric.ascore are caught so the loop continues for remaining
rows.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 16c039f1-c2e4-4515-b3b6-b3a98288383c
📒 Files selected for processing (2)
docs/en/ragas.mdx
docs/public/ragas-rag-eval.ipynb
Deploying alauda-ai with

| Latest commit: | add5485 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://f91cec6e.alauda-ai.pages.dev |
| Branch Preview URL: | https://feat-ragas-docs.alauda-ai.pages.dev |