docs: add llms.txt ecosystem hub at site root #22003
Open
timsaucer wants to merge 1 commit into apache:main
Adds docs/source/llms.txt following the llmstxt.org schema as a directory hub for the DataFusion ecosystem: links to the core Rust user, library, and contributor guides, the Rust API docs, and the Python, Ballista, and Comet subproject docs roots.

Configures Sphinx html_extra_path so the file is served verbatim at https://datafusion.apache.org/llms.txt, and excludes it from the RAT license-header check (the markdown body cannot carry the standard "..." comment header).

Per the convention noted in the file, agents can probe each subproject docs root for its own llms.txt, which keeps the hub future-proof without hardcoding pending URLs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
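The probe convention mentioned in the commit message can be sketched in a few lines of Python. This is only an illustration: the helper name `llms_txt_url` and the subproject docs roots listed below are assumptions made for the example, not values taken from the PR.

```python
def llms_txt_url(docs_root: str) -> str:
    """Return the conventional llms.txt location under a docs root."""
    # Normalize the trailing slash so roots with and without one both work.
    return docs_root.rstrip("/") + "/llms.txt"


# Hypothetical docs roots, used only to illustrate the convention.
SUBPROJECT_DOCS_ROOTS = [
    "https://datafusion.apache.org/python/",
    "https://datafusion.apache.org/ballista",
    "https://datafusion.apache.org/comet/",
]

# URLs an agent would try; any of them may not exist yet.
candidates = [llms_txt_url(root) for root in SUBPROJECT_DOCS_ROOTS]
```

Because the hub lists docs roots rather than `llms.txt` URLs, an agent derives each candidate itself and treats a missing file as "no project-specific guidance yet".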
Pull request overview
Adds an agent-facing llms.txt “ecosystem hub” at the documentation site root (https://datafusion.apache.org/llms.txt) to improve automated discovery of DataFusion and subproject documentation.
Changes:
- Adds `docs/source/llms.txt` describing core DataFusion docs, subproject docs roots, and optional links.
- Configures Sphinx to copy `llms.txt` verbatim to the built site root via `html_extra_path`.
- Excludes `docs/source/llms.txt` from the Apache RAT license-header check.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/source/llms.txt | New hub file linking to core DataFusion docs and subproject docs roots. |
| docs/source/conf.py | Copies llms.txt into the built HTML output root. |
| dev/release/rat_exclude_files.txt | Excludes the new llms.txt from RAT header enforcement. |
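For reference, the `docs/source/conf.py` change summarized in the table comes down to a single Sphinx setting. A minimal sketch, assuming `llms.txt` sits next to `conf.py` in `docs/source/` (Sphinx resolves relative `html_extra_path` entries against the configuration directory); the rest of the real `conf.py` is omitted here:

```python
# docs/source/conf.py (excerpt; all other settings omitted)

# Files listed here are copied as-is into the root of the HTML output,
# so docs/source/llms.txt ends up at <site root>/llms.txt.
html_extra_path = ["llms.txt"]
```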
Which issue does this PR close?
apache/datafusion `llms.txt` hub.

Rationale for this change
`llms.txt` is an emerging convention for exposing a machine-readable, agent-facing entry point at a site's docs root. Subprojects in the DataFusion ecosystem are starting to publish their own (PR apache/datafusion-python#1505 added one for `apache/datafusion-python`). The main `datafusion.apache.org` site is the natural top-level discovery point for the whole ecosystem, so it should expose a hub `llms.txt` that points agents at the core DataFusion docs and at each subproject's docs root, where they can probe `<docs root>/llms.txt` for project-specific guidance.

Net effect: an agent fetching `https://datafusion.apache.org/llms.txt` lands in a categorized directory of the entire ecosystem's agent guidance.

What changes are included in this PR?
- `docs/source/llms.txt`: new file, llmstxt.org schema. Sections: Core DataFusion (Rust), Subprojects, Optional. The Subprojects section links to docs roots (not pending `llms.txt` URLs) and includes a one-line note describing the probe convention so the hub stays correct as subprojects ship their own files.
- `docs/source/conf.py`: `html_extra_path = ["llms.txt"]` so Sphinx copies the file verbatim to the build output root, served at `https://datafusion.apache.org/llms.txt`.
- `dev/release/rat_exclude_files.txt`: exclude `docs/source/llms.txt` from the RAT license-header check (the file body is rendered markdown and cannot carry the standard `..` comment header without breaking the format).

Are these changes tested?
No automated tests. The change is a single static file plus a Sphinx config line that mirrors a pattern already used in `apache/datafusion-python` (`html_extra_path = ["llms.txt"]`, PR apache/datafusion-python#1505). Verification will be done at deploy time: confirm `https://datafusion.apache.org/llms.txt` resolves and renders.

Are there any user-facing changes?
Yes: adds a new public URL, `https://datafusion.apache.org/llms.txt`. No existing pages are modified. No API changes.