[improve][misc] Add AGENTS.md and split contributor/coding/architecture/security docs, with .agents skills by lhotari · Pull Request #25871 · apache/pulsar

lhotari · 2026-05-26T08:44:59Z

Motivation

The repository had GitHub Copilot review instructions (.github/copilot-instructions.md) but no
top-level guide for general AI coding agents (Claude Code, Cursor, Gemini, Codex, Aider, …), and no
single, concise contributor reference for the build / test / contribution workflow after the
Maven→Gradle migration (PIP-463).

This PR adds that guidance and — following review feedback on the first iteration — structures it the
way apache/groovy and apache/grails organize theirs:
human-facing split docs at the repository root, a concise AGENTS.md index, and per-task skills under
.agents/skills/ that are loaded on demand so agents don't pull every instruction into context (a
token-economy concern for contributors on metered plans). The coding/contribution conventions were
distilled from recurring guidance in past apache/pulsar PR reviews.

Important

New logging convention — please confirm the direction. The coding guidelines document a
preference for the slog structured-logging library via Lombok
@CustomLog, with SLF4J treated as deprecated for new code, structured attributes + lazy
evaluation instead of isDebugEnabled() guards, and defaulting new logs to TRACE/DEBUG. slog is
already wired into the build; this PR is the first time the preference is written down.
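The "lazy evaluation instead of isDebugEnabled() guards" point can be illustrated without depending on slog itself. The sketch below is only an assumption-laden stand-in: `LazyLogger`, `Level`, and `expensiveSummary` are hypothetical names invented for this example and are not the slog or Lombok API. It shows that a supplier-based message is built only when the level is enabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in logger for illustration only; this is NOT the slog API.
public class LazyLoggingSketch {
    enum Level { TRACE, DEBUG, INFO }

    static final class LazyLogger {
        private final Level threshold;
        final List<String> lines = new ArrayList<>();

        LazyLogger(Level threshold) {
            this.threshold = threshold;
        }

        // The supplier runs only when DEBUG is enabled, so call sites need
        // no explicit isDebugEnabled() guard around expensive messages.
        void debug(Supplier<String> message) {
            if (Level.DEBUG.compareTo(threshold) >= 0) {
                lines.add("DEBUG " + message.get());
            }
        }
    }

    // Counts how often the "expensive" message is actually built.
    static int buildCount = 0;

    static String expensiveSummary(int entries) {
        buildCount++;
        return "entries=" + entries;
    }

    public static void main(String[] args) {
        LazyLogger infoLogger = new LazyLogger(Level.INFO);
        infoLogger.debug(() -> expensiveSummary(42)); // skipped: DEBUG is below INFO

        LazyLogger debugLogger = new LazyLogger(Level.DEBUG);
        debugLogger.debug(() -> expensiveSummary(42)); // evaluated once

        System.out.println(buildCount + " " + debugLogger.lines);
    }
}
```

Whether slog exposes suppliers, structured attributes, or another mechanism for this is documented in CODING.md and the slog repository, not here.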

Modifications

Root docs (human-facing, the source of truth) + an AGENTS.md index + .agents/skills/:

- AGENTS.md: concise router/index with a "Licensing and provenance (read first)" section (ASF
  Generative Tooling guidance: human-in-the-loop accountability, provenance/licensing,
  attribution), a canonical-docs table, a skills table, critical rules, and where to ask.
- ARCHITECTURE.md (new): module map, the Gradle build infrastructure, the
  module-name-vs-directory gotcha, the pip/ proposals, the (undocumented) concurrency model and
  backpressure.
- CODING.md (new): style, data types, async/CompletableFuture, concurrency + Java Memory
  Model, logging (slog), resource/memory management, performance, dependencies, backward
  compatibility (incl. plugin/SPI extension points), testing conventions, and the review checklist.
- CONTRIBUTING.md: expanded from a website-pointer stub into the local dev workflow: build,
  lint, --tests-scoped runs, test groups, integration tests, Personal CI, PR conventions, scope &
  branches/backports, security reporting.
- SECURITY.md: reporting, disclosure hygiene, the (informal) security model & threat scope, and
  checking exposure to an already-public CVE.
- .agents/skills/ (new): lean, on-demand guardrail skills (pulsar-build, pulsar-tests,
  pulsar-pr-workflow, pulsar-security) that cite the canonical docs rather than restating them,
  plus a README.md index.
- CLAUDE.md and .github/copilot-instructions.md are now symlinks to AGENTS.md.
- .github/PULL_REQUEST_TEMPLATE.md: Closes #xyz accepted alongside Fixes #xyz; notes the
  CI-enforced title prefixes and that Motivation/Modifications are required.
- README.md: a short note on checking exposure to an already-public CVE.

Note

.github/copilot-instructions.md becoming a symlink to AGENTS.md means Copilot's detailed review
guidance now lives in CODING.md (which AGENTS.md links to). If Copilot doesn't follow the symlink
or traverse to CODING.md, its in-review guidance is thinner than before — worth confirming. (Done
per the review suggestion to symlink the per-tool files to AGENTS.md.)
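One way to sanity-check the symlink layout described above is a small script that recreates it in a scratch directory and confirms both per-tool files resolve to AGENTS.md. This is a minimal sketch, not part of the PR: the class and method names are hypothetical, it assumes a filesystem that permits symbolic links (on Windows this may require extra privileges), and it says nothing about whether Copilot itself follows the link.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not a tool shipped with the repository.
public class SymlinkCheckSketch {

    // Builds a throwaway copy of the layout: AGENTS.md plus two relative symlinks.
    static Path createLayout() {
        try {
            Path root = Files.createTempDirectory("agents-docs");
            Files.write(root.resolve("AGENTS.md"), "# index\n".getBytes(StandardCharsets.UTF_8));
            Path github = Files.createDirectories(root.resolve(".github"));
            // Relative targets keep the links valid wherever the tree is checked out.
            Files.createSymbolicLink(root.resolve("CLAUDE.md"), Path.of("AGENTS.md"));
            Files.createSymbolicLink(github.resolve("copilot-instructions.md"),
                    Path.of("..", "AGENTS.md"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when `link` is a symbolic link that resolves to the same file as `target`.
    static boolean resolvesTo(Path link, Path target) {
        try {
            return Files.isSymbolicLink(link) && Files.isSameFile(link, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = createLayout();
        Path agents = root.resolve("AGENTS.md");
        System.out.println(resolvesTo(root.resolve("CLAUDE.md"), agents));
        System.out.println(resolvesTo(root.resolve(".github").resolve("copilot-instructions.md"), agents));
    }
}
```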

Verifying this change

Make sure that the change passes the CI checks.

This change is a trivial rework / code cleanup without any test coverage. It is documentation-only
(Markdown, plus two symlinks); all changed paths are excluded from RAT / Checkstyle / Spotless.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes


asafm · 2026-05-26T09:38:32Z

@lhotari A couple of high-level notes first:

1. From what I know, it's best to have a dedicated folder for agent documentation (e.g. .ai) and place any LLM agent instructions there.
2. Once you have (1), you can split the context by what is needed: everything related to contributing (setting up a local dev env, rules for PR creation, etc.) goes to CONTRIBUTING.md; architecture and how Pulsar works can go to ARCHITECTURE.md; and coding guidelines and best practices can go to CODING.md.
3. Once you have (2), AGENTS.md is effectively an index, describing the different LLM doc files you created in (2) and where they are located.
4. Once you have (3), all that is left is to have the files for other LLMs be symlinks to AGENTS.md, so github-instructions.md and CLAUDE.md are just symlinks.


lhotari · 2026-05-26T10:03:33Z

> @lhotari A couple of high-level notes first:
>
> 1. From what I know, it's best to have a dedicated folder for agent documentation (e.g. .ai) and place any LLM agent instructions there.
> 2. Once you have (1), you can split the context by what is needed: everything related to contributing (setting up a local dev env, rules for PR creation, etc.) goes to CONTRIBUTING.md; architecture and how Pulsar works can go to ARCHITECTURE.md; and coding guidelines and best practices can go to CODING.md.
> 3. Once you have (2), AGENTS.md is effectively an index, describing the different LLM doc files you created in (2) and where they are located.
> 4. Once you have (3), all that is left is to have the files for other LLMs be symlinks to AGENTS.md, so github-instructions.md and CLAUDE.md are just symlinks.

@asafm thanks for the suggestions. I guess we can iterate on this in further PRs. I won't have time to make such improvements at this time. Would you like to take over the restructuring after this PR has been merged?

By the way, some repositories provide skills that AI agents can use to perform specific tasks in the repository. Example: https://github.com/apache/grails-core/blob/7.0.x/AGENTS.md#available-skills. I guess such a solution could save tokens, since the agent doesn't always pull all of the information in AGENTS.md and the referenced files into the context.

lhotari · 2026-05-26T10:07:34Z

This seems to be a good example where CONTRIBUTING.md and ARCHITECTURE.md are referenced:
https://github.com/apache/groovy/blob/master/AGENTS.md
There's also a good amount of skills to be used by agents (or humans driving an agent):
https://github.com/apache/groovy/blob/master/AGENTS.md#skills

lhotari · 2026-05-26T10:10:49Z

I guess I could use an agent (Claude Code) to restructure according to a similar approach that is used in apache/groovy or apache/grails.

…RE/CODING/SECURITY, add .agents/skills

Follow apache/groovy's layout per PR review feedback: AGENTS.md becomes a concise router/index; the detail moves to human-facing CONTRIBUTING.md (dev/build/test/PR/CI), ARCHITECTURE.md (modules + build), CODING.md (conventions + review checklist), and SECURITY.md (reporting, disclosure hygiene, public-CVE checks). Task-specific guardrails live under .agents/skills/ (pulsar-build, pulsar-tests, pulsar-pr-workflow, pulsar-security) and are loaded on demand to keep agent context small. CLAUDE.md and .github/copilot-instructions.md are now symlinks to AGENTS.md.


…formance/GC guidance

Expand CODING.md's Concurrency section with the Java Memory Model rules that have historically tripped up Pulsar code: synchronization needs the same lock for reads and writes, fields shared across threads need volatile, immutable vs. effectively-immutable objects and safe publication/initialization, preferring DefaultThreadFactory/FastThreadLocalThread, and how to reproduce timing/platform-dependent bugs. Add a ZGC + Netty Recycler note (PIP-443) and a JMH-benchmark guideline (microbench/). Add a Concurrency-model gap and Backpressure (PIP-442) section to ARCHITECTURE.md, and point the pulsar-tests skill at the reproduction guidance.
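The Java Memory Model rules listed in this commit message can be sketched in a few lines. The example below is an illustration under assumptions, not code from Pulsar or CODING.md: `Counter`, `Stats`, and `runWriters` are hypothetical names, and it uses plain `Thread` rather than Netty's DefaultThreadFactory/FastThreadLocalThread to stay self-contained. It shows reads and writes guarded by the same lock, a volatile flag for a field shared across threads, and an immutable record as the published snapshot.

```java
public class MemoryModelSketch {
    // Immutable value: record components are final, so a snapshot can be
    // shared across threads safely once it has been constructed.
    record Stats(long messages, long bytes) {}

    static final class Counter {
        private final Object lock = new Object();
        private long messages;           // guarded by lock
        private long bytes;              // guarded by lock
        private volatile boolean closed; // written by one thread, read by others

        void add(long size) {
            synchronized (lock) {
                messages++;
                bytes += size;
            }
        }

        // Reads take the SAME lock as writes; reading without it could
        // observe stale values or a mismatched messages/bytes pair.
        Stats snapshot() {
            synchronized (lock) {
                return new Stats(messages, bytes);
            }
        }

        void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Runs several writer threads and returns the final snapshot.
    static Stats runWriters(int threads, int perThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.add(10);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        counter.close();
        return counter.snapshot();
    }

    public static void main(String[] args) {
        Stats stats = runWriters(4, 1000);
        System.out.println(stats.messages() + " messages, " + stats.bytes() + " bytes");
    }
}
```

Dropping the `synchronized` block from `add` or `snapshot` typically makes the totals wrong only intermittently, which is the kind of timing-dependent bug the reproduction guidance is aimed at.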

lhotari · 2026-05-26T11:43:44Z

@asafm I have addressed your feedback about splitting into multiple files. PTAL


…from PR review feedback

Add recurring, generalizable guidance distilled from past apache/pulsar PR reviews:
- CODING.md: data-type conventions (records, narrowest interface type, factory methods, minimize method/constructor params, builders incl. records, naming); async per-call-site evaluation + checkArgumentAsync; concurrency lock-scope; backward-compat for plugin/SPI interfaces (default methods, no third-party types in public APIs, opt-in behavior changes); Performance section (hot-path costs, no overhead under load, bounded caches/StringInterner); testing terminology (unit vs container integration tests), SharedPulsarBaseTest usage, integration-style vs unit-test design, JMH benchmarks, Awaitility.
- CONTRIBUTING.md: focused PRs / no drive-by refactor or reformatting, large-refactor discussion on dev@, branches & backports, /pulsarbot rerun-failure-checks and flaky-test handling.
- ARCHITECTURE.md: PIP-number reservation via dev@ thread.
- AGENTS.md: "stay in scope" critical rule.
- Skills (pulsar-build/tests/pr-workflow): matching guardrails.
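The data-type guidance on records (see also the commit titled "Prefer named Java records over Pair returns and conca…") can be illustrated with a short sketch. Everything named here (`NamespaceKey`, `MinMax`, `concatenatedKey`, `minMax`) is hypothetical and chosen for the example; it is not taken from CODING.md or the Pulsar code base. It shows why a concatenated string key can collide where a record key cannot, and how a named record reads compared with a generic pair return.

```java
import java.util.HashMap;
import java.util.Map;

public class RecordKeySketch {
    // Named record used as a map key: equals/hashCode compare each component.
    record NamespaceKey(String tenant, String namespace) {}

    // Named record instead of a generic Pair<Long, Long> return value.
    record MinMax(long min, long max) {}

    // The pattern the guidance discourages: distinct inputs can produce the same key.
    static String concatenatedKey(String tenant, String namespace) {
        return tenant + "-" + namespace;
    }

    static MinMax minMax(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        long min = values[0];
        long max = values[0];
        for (long value : values) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        return new MinMax(min, max);
    }

    public static void main(String[] args) {
        Map<String, Integer> byString = new HashMap<>();
        byString.put(concatenatedKey("a-b", "c"), 1);
        byString.put(concatenatedKey("a", "b-c"), 2); // same string "a-b-c": overwrites

        Map<NamespaceKey, Integer> byRecord = new HashMap<>();
        byRecord.put(new NamespaceKey("a-b", "c"), 1);
        byRecord.put(new NamespaceKey("a", "b-c"), 2); // distinct keys

        System.out.println(byString.size() + " vs " + byRecord.size());

        MinMax range = minMax(new long[] {5, 2, 9});
        System.out.println(range.min() + ".." + range.max());
    }
}
```

Call sites then read `range.min()` and `range.max()` rather than a positional left/right accessor, which is the readability benefit of a named record.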

Rewrite the four SKILL.md files as a lean, on-demand guardrail layer that cites the canonical docs (CODING/CONTRIBUTING/ARCHITECTURE/SECURITY) instead of duplicating their prose, to keep agent context small; trim frontmatter to name + description. Expand CONTRIBUTING.md backport guidance: maintainers handle backports, cherry-pick in merge order, dependent changes first, and drop branch-4.1 from the example.


…ks to the canonical policy

Slim README's Build section to a short quick-start that refers to CONTRIBUTING/ARCHITECTURE/CODING/AGENTS for detail instead of repeating it, and make the security section consistent with SECURITY.md. Point README's security links to https://github.com/apache/pulsar/security/policy, and note in SECURITY.md, AGENTS.md, CONTRIBUTING.md, and the pulsar-security skill that the latest SECURITY.md is maintained there (so forks reference the canonical copy).


…ated pulsarbot command

- Reorder "Disclosure hygiene" so the project-team-commits-the-fix paragraph comes first; clarify the commit-message/PR neutrality rules are for whoever commits the fix (the project team), and gate them on the vulnerability being announced.
- Note that already-public dependency CVEs are an exception: name the CVE id directly in the PR title/description.
- Document only `/pulsarbot rerun` (rerun-failure-checks is deprecated): it re-runs the failed jobs of a completed workflow run.


[improve][misc] Add AGENTS.md and update copilot-instructions.md

5dfde69

lhotari requested review from Technoboy-, asafm, dao-jun, merlimat and nodece May 26, 2026 08:45

[improve][misc] Clarify thread leak detector false positives in test …

d6e0726

…resource cleanup guidance

[improve][misc] Require PR description to cover motivation and modifi…

84d12bd

…cations

lhotari added 4 commits May 26, 2026 13:29

[improve][misc] Add Closes alternative for Fixes and align PR templat…

149014c

…e with the agent docs

[improve][misc] Document Pulsar security model scope (trusted functio…

d8294ad

…ns, perimeter security, no malicious-DoS protection)

[improve][misc] Reword DoS phrasing in security model scope

c78df01

lhotari changed the title ~~[improve][misc] Add AGENTS.md and refresh contributor/agent convention docs~~ [improve][misc] Add AGENTS.md and restructure contributor/agent docs with .agents/skills May 26, 2026

lhotari added 6 commits May 26, 2026 13:47

[improve][misc] Link slog to its repository on mentions in agent docs

d410011

[improve][misc] Require PMC go-ahead before pushing non-public securi…

728c326

…ty fixes

[improve][misc] Clarify project team commits security fixes; reporter…

0d514a0

… shares patch privately

[improve][misc] Clarify how the project team commits security fixes (…

1426a19

…public pre-release or private repo)

[improve][misc] Add GitHub Discussions to Where to ask

d6c1316

lhotari added 4 commits May 26, 2026 14:49

[improve][misc] Clarify no explicit protection against malicious DoS

a33c57c

[improve][misc] Prefer named Java records over Pair returns and conca…

af13258

…tenated Map keys

[improve][misc] Document /pulsarbot rerun for flaky CI and the fork-P…

8182e03

…R approval gate

[improve][misc] Note /pulsarbot rerun requires the previous run's job…

9934370

…s to finish first

dao-jun approved these changes May 26, 2026


lhotari added 2 commits May 26, 2026 17:10

[improve][misc] Add licensing & provenance (human-in-the-loop) guidan…

923448d

…ce to AGENTS.md

lhotari changed the title ~~[improve][misc] Add AGENTS.md and restructure contributor/agent docs with .agents/skills~~ [improve][misc] Add AGENTS.md and split contributor/coding/architecture/security docs, with .agents skills May 26, 2026

lhotari added 2 commits May 26, 2026 17:23

[improve][misc] Require human verification for security reports in SE…

8ab032a

…CURITY.md

asafm reviewed May 26, 2026


Comment thread AGENTS.md Outdated

Comment thread CLAUDE.md

Comment thread .github/PULL_REQUEST_TEMPLATE.md

Comment thread .github/PULL_REQUEST_TEMPLATE.md

Comment thread .github/PULL_REQUEST_TEMPLATE.md

lhotari added 3 commits May 26, 2026 17:44

[improve][misc] Note pulsar-site as the source of the project documen…

012afc3

…tation in AGENTS.md

[improve][misc] ARCHITECTURE.md: Oxia/ZooKeeper metadata store, link …

9f58c04

…architecture overview and DeepWiki

lhotari requested a review from asafm May 26, 2026 15:22

lhotari added 2 commits May 26, 2026 18:28

[improve][misc] Move SECURITY.md security model section near the end …

663e12f

…and fix cross-references

[improve][misc] Clarify security PR rules: undisclosed-vuln fixes (in…

5db4d01

…cl. public forks) vs public CVE dependency bumps

lhotari wants to merge 27 commits into apache:master from lhotari:lh-AGENTS.md


3 participants
