CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a data repository containing nanopublications created by three specialized bots. Nanopublications are minimalist semantic publications in RDF TriG format, cryptographically signed with RSA keys.

The project creator is Tobias Kuhn (ORCID: 0000-0002-1267-0234).

Repository Structure

Each bot has its own directory with a consistent layout:

examples/ — reference nanopublication showing expected structure
output/ — generated unsigned nanopublications
signed/ — cryptographically signed versions (ready for publishing)
README.md — bot IRI and key file location

Bots

Bot	Domain	Key File
`doibot`	Academic papers (via DOI/Crossref)	`~/.nanopub/doibot_id_rsa`
`biodivbot`	Organism-environment associations (BioLink, ENVO, UBERON ontologies)	`~/.nanopub/biodivbot_id_rsa`
`ai-in-edu-bot`	AI approaches in education research	`~/.nanopub/ai-in-edu-bot_id_rsa`

Nanopublication Structure

Every nanopub (.trig file) contains four named graphs:

Head — links to the other three graphs
Assertion — the semantic claims (domain-specific RDF triples)
Provenance — attribution and source references
PublicationInfo — metadata, bot identity, and RSA signature (in signed versions)

The plain.introtemplate.trig at the repo root is a template for introducing new bots to the nanopub network.

nanopub-java CLI

The ./np wrapper script runs the nanopub-java CLI from the sibling ../nanopub-java repo. If the JAR isn't built yet, build it first:

mvn -f ../nanopub-java clean package -DskipTests

Raw CLI reference (use the scripts below for common operations):

./np sign -k <key> <file.trig> -o <out.trig>  # sign with specific key
./np publish <signed.trig>                     # publish to nanopub network
./np check <file.trig>                         # validate a nanopub
# Retract a nanopub (must specify -s <signer-IRI> or you'll get a NullPointerException):
./np retract -i <nanopub-uri-or-file> -k <key> -s <signer-IRI> -p
# For doibot: -s https://w3id.org/np/RAkkUz7qBJ-BIOCHV_4WCTgHCdTyI25_bnRuw166SXjwM/DOI-bot
# (The docs imply -s must be an ORCID, but any IRI including the bot IRI works.)

Superseding nanopublications

To publish an updated version of a nanopub, add an npx:supersedes triple in the pubinfo graph pointing to the old nanopub's URI:

this:pubinfo {
  this: ...
    npx:supersedes <https://w3id.org/np/RAold...> ;
    ...
}

Then sign and publish the new nanopub. The old one remains immutable on the network but is marked as superseded.

For index nanopubs, mkindex -x <old-index-uri> adds the supersedes link automatically:

./np mkindex -x <old-index-uri> -o new-index.trig -t "Title" file1.trig file2.trig

Scripts

The scripts/ directory contains helpers for common tasks. All scripts take <name> as the file basename without the .trig extension.

Signing and publishing

scripts/sign.sh <bot> <name>          # sign output/<name>.trig → signed/signed.<name>.trig
scripts/publish.sh <bot> <name>       # publish signed/signed.<name>.trig
scripts/sign-publish.sh <bot> <name>  # sign + publish in one step
scripts/check.sh <bot> <name>         # validate output/<name>.trig
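
Because every script takes <name> without the .trig extension, batch runs need to derive it from the file path. The following is a minimal sketch, assuming it is run from the repo root; trig_name and check_all are hypothetical helper names, not scripts in this repository.

```shell
# Hypothetical helpers; only scripts/check.sh comes from this repository.

# Strip the directory and the .trig extension to get the <name> argument.
trig_name() {
  basename "$1" .trig
}

# Validate every draft of one bot, one file at a time (never in parallel).
check_all() {
  bot=$1
  for f in "$bot"/output/*.trig; do
    [ -e "$f" ] || continue   # the glob matched nothing
    scripts/check.sh "$bot" "$(trig_name "$f")" || echo "check failed: $f" >&2
  done
}

trig_name doibot/output/example.trig   # prints: example
```

Calling check_all doibot would then validate each file under doibot/output/ in turn.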

Timestamps

scripts/timestamp.sh   # prints e.g. 2026-02-23T14:05:31.000+01:00

Always use local time (not UTC). Do not use date -u.
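
For reference, a timestamp of that shape can be produced with plain date and sed. This is only a sketch of the format, not the contents of scripts/timestamp.sh; the function name local_timestamp is hypothetical, and it assumes a date implementation that supports %z (GNU and BSD both do).

```shell
# Hypothetical helper; not the repository's scripts/timestamp.sh.
local_timestamp() {
  # date uses the local time zone by default (no -u).
  # %z yields an offset like +0100; sed inserts the colon to get +01:00.
  # Milliseconds are fixed at .000, matching the example above.
  date +%Y-%m-%dT%H:%M:%S.000%z | sed 's/\([0-9][0-9]\)$/:\1/'
}

local_timestamp
```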

Doibot: generate a draft nanopub from a DOI

scripts/doi-to-trig.sh 10.1145/3460210.3493567

Fetches CrossRef metadata, auto-searches ORCID for each author, and writes a ready-to-edit draft to doibot/output/<name>.trig. Authors whose ORCID was auto-found are marked # VERIFY; unresolved ones are left as blank nodes (:<firstname>-<lastname>) with # TODO comments listing any candidates. After reviewing and fixing the draft, sign and publish with scripts/sign-publish.sh doibot <name>.

Always verify author order against the publisher page — CrossRef order is unreliable.

Do not run doi-to-trig.sh calls in parallel. The CrossRef data now goes to a per-process temp file, but the ORCID search calls inside the script still write to shared temp paths, so concurrent runs can overwrite each other's data. Run them sequentially.

Doibot: inspect CrossRef metadata

scripts/crossref-meta.sh 10.1145/3460210.3493567   # title, type, author order, ISSN, abstract
scripts/orcid-search-all.sh 10.1145/3460210.3493567 # ORCID lookup for all authors at once

Use these to manually inspect what doi-to-trig.sh will use, or to troubleshoot.

Doibot: find papers without nanopubs

scripts/find-missing-nanopubs.sh 0000-0002-1267-0234

Compares an author's ORCID works list against doibot/output/ and lists papers that don't have a nanopub yet.

Items to skip in the output:

Preprints: 10.48550/arXiv.*, 10.1101/* (bioRxiv/medRxiv), 10.31219/osf.io/*, 10.7287/peerj.preprints.*
Author corrections and addenda (title starts with "Author Correction:", "Addendum:", "Authors' Response to Peer Reviews")
Items typed [other] (usually preprints, workshop abstracts, or non-standard documents)
Repository DOIs like 10.5167/* (UZH institutional repository) — not in CrossRef
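
The DOI-prefix rules in this list can be expressed as a shell case pattern. The sketch below covers only the prefix-based rules (preprints and the UZH repository), not the title-based or [other] rules; is_skippable_doi is a hypothetical name, and the function expects a bare DOI rather than the actual output format of find-missing-nanopubs.sh.

```shell
# Hypothetical helper: succeeds if the DOI matches one of the skip prefixes.
is_skippable_doi() {
  # Lowercase first, because DOIs are case-insensitive.
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $lower in
    10.48550/arxiv.*|10.1101/*|10.31219/osf.io/*|10.7287/peerj.preprints.*|10.5167/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# The first DOI is a made-up placeholder in arXiv format; the second is the
# example DOI used elsewhere in this file.
for d in 10.48550/arXiv.0000.00000 10.1145/3460210.3493567; do
  if is_skippable_doi "$d"; then echo "skip $d"; else echo "keep $d"; fi
done
```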

Malformed DOIs: Some old ORCID entries use hyphens before chapter numbers instead of underscores (e.g. 10.1007/978-3-642-38288-8-33). Try replacing the final - with _ — the correctly formatted version may already have a nanopub.
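
The replacement of the final hyphen can be done with shell parameter expansion. A minimal sketch; fix_chapter_doi is a hypothetical helper name:

```shell
# Hypothetical helper: replace the last "-" in a DOI with "_".
fix_chapter_doi() {
  case $1 in
    # ${1%-*} drops the final "-" and what follows; ${1##*-} keeps only what follows it.
    *-*) printf '%s_%s\n' "${1%-*}" "${1##*-}" ;;
    # No hyphen at all: return the DOI unchanged.
    *)   printf '%s\n' "$1" ;;
  esac
}

fix_chapter_doi 10.1007/978-3-642-38288-8-33   # prints: 10.1007/978-3-642-38288-8_33
```

The corrected DOI can then be compared against existing nanopubs before a new one is created.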

DOI metadata (raw Turtle)

scripts/doi-meta.sh 10.1007/11799511_7

Returns raw Turtle RDF from DOI content negotiation. Useful for checking what the DOI resolver returns directly. Note: doi-to-trig.sh uses CrossRef instead, which has better-structured data.

ORCID lookup

scripts/orcid-search.sh Kuhn Tobias              # search by name (handles diacritics)
scripts/orcid-verify.sh 0000-0002-1267-0234      # show name + employment history
scripts/orcid-works.sh 0000-0002-1267-0234       # list works (for disambiguation)

Common names may return multiple results — verify by checking works or employment history. Use ORCID URIs (e.g. orcid:0000-0002-1267-0234) in nanopubs.

Do not run orcid-verify.sh / orcid-works.sh in parallel — stdout gets interleaved and results become unreadable. Run sequentially.

Disambiguation heuristic: If ORCID search returns exactly 1 match for someone in a specialized field, it's generally safe to accept. When multiple matches exist, orcid-works.sh is usually faster than orcid-verify.sh for disambiguation.

Frequently appearing co-authors (Tobias Kuhn's papers)

These ORCIDs come up repeatedly and don't need re-lookup:

Name	ORCID
Michel Dumontier	0000-0003-4727-9435
Michael Krauthammer	0000-0002-4808-1845
Egon Willighagen	0000-0001-7542-0286
Albert Meroño-Peñuela	0000-0003-4646-5842
Victor de Boer	0000-0001-9079-039X
Guus Schreiber	0000-0002-2400-1185
Martin Volk	0000-0002-2063-4516
Kurt Winkler	0000-0003-0197-9743

ROR lookup

scripts/ror-search.sh "Vrije Universiteit Amsterdam"
scripts/ror-verify.sh 008xxew50

Checking existing doibot nanopubs on the network

Before creating a nanopub for a paper, check whether one already exists on the network. The find-missing-nanopubs.sh script only compares against local doibot/output/ files — it will miss papers that were published by a previous session but whose local files were deleted.

Query the nanopub network (returns SPARQL JSON):

# All papers by an author on the network:
curl -s "https://query.knowledgepixels.com/api/RA7X8hbsozQjZCv4RfWGIgzEA6qr9Ds6RL5kQnB7GHThc/get-papers-for-author?author=https://orcid.org/0000-0002-1267-0234" | python3 -c "import sys,json; [print(b['label']['value'], '|', b['np']['value']) for b in json.load(sys.stdin)['results']['bindings']]"

Or use the nanodash UI: https://nanodash.knowledgepixels.com/query?runquery=RA7X8hbsozQjZCv4RfWGIgzEA6qr9Ds6RL5kQnB7GHThc/get-papers-for-author&queryparam_author=https://orcid.org/0000-0002-1267-0234

DOI case sensitivity: DOIs are case-insensitive, but the network treats differently cased forms as separate URIs. If a paper already has a nanopub with https://doi.org/10.1162/COLI_a_00168 (uppercase), creating one with the lowercase form results in a duplicate. Always check the network before creating.

scripts/check-author-nanopubs.sh 0000-0002-1267-0234   # prints query URL for nanodash
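
To catch the DOI casing problem described above, DOI URIs can be compared after lowercasing both sides. A minimal sketch; doi_key and same_doi are hypothetical helper names, and how the DOI URIs are obtained from the network query is left open.

```shell
# Hypothetical helpers for case-insensitive DOI comparison.

# Lowercase a DOI or DOI URI so it can serve as a comparison key.
doi_key() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Succeeds if both arguments are the same DOI apart from letter case.
same_doi() {
  [ "$(doi_key "$1")" = "$(doi_key "$2")" ]
}

if same_doi https://doi.org/10.1162/COLI_a_00168 https://doi.org/10.1162/coli_a_00168; then
  echo "same DOI, different casing"
fi
```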

Workflow: creating/updating nanopubs

Creating a new doibot nanopub:

scripts/doi-to-trig.sh <doi> — generates the draft, auto-resolves ORCIDs
Review the draft in doibot/output/<name>.trig: fix # TODO items, verify # VERIFY ORCIDs, check author order against publisher page, add affiliations if known
scripts/sign-publish.sh doibot <name>

Other bots (biodivbot, ai-in-edu-bot): edit the output file manually, then sign and publish.

When updating an existing nanopub:

Update the dct:created timestamp with the value printed by scripts/timestamp.sh
Add/update npx:supersedes in the pubinfo graph pointing to the old signed URI
Then re-sign and publish
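
Before re-signing an update, it can help to confirm that the draft actually carries the supersedes link. A rough sketch, assuming the triple is written on a single line with the npx: prefix, as in the snippet under "Superseding nanopublications"; has_supersedes is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the file contains an npx:supersedes
# triple pointing at a published nanopub URI (https://w3id.org/np/RA...).
has_supersedes() {
  grep -Eq 'npx:supersedes[[:space:]]+<https://w3id\.org/np/RA[A-Za-z0-9_-]+>' "$1"
}

# Example use (the path is a placeholder):
#   has_supersedes doibot/output/<name>.trig || echo "missing npx:supersedes" >&2
```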

Personal information policy

Only include personal information (names, email addresses, affiliations, ORCIDs, etc.) in a nanopublication if it is already permanently and openly published — e.g. expressed in a published scientific paper or made available online by the person themselves under a permanent open license (such as CC0 or CC-BY).

License

All bot nanopublications must be published under CC0 (https://creativecommons.org/publicdomain/zero/1.0/), not CC-BY.

FaBiO types for doibot

fabio:Article — journal articles (with dct:isPartOf linking to ISSN)
fabio:BookChapter — book/proceedings chapters (e.g. LNCS/Springer, with dct:isPartOf linking to ISSN)
fabio:ConferencePaper — standalone conference papers without a journal ISSN (e.g. ACM, IEEE proceedings); dct:isPartOf can be omitted

All doibot nanopubs use npx:hasNanopubType fabio:ScholarlyWork in pubinfo regardless of the specific FaBiO type.

Provenance patterns per bot

doibot: prov:wasAttributedTo (paper authors) + prov:wasDerivedFrom (paper DOI)
biodivbot: prov:wasAttributedTo (researchers who made the observation)
ai-in-edu-bot: prov:wasDerivedFrom (paper DOI)

Temp URI prefix

Output files must use @prefix : <http://purl.org/nanopub/temp/np1/> . as the base prefix. This is the standard nanopub temp URI that gets replaced with a proper trusty URI (https://w3id.org/np/RA...) during signing. Using <https://w3id.org/np/temp> instead causes the signed URI to incorrectly contain /temp/.
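
A quick way to spot drafts with the wrong base prefix is a fixed-string grep. This is a sketch that assumes the prefix line is written exactly as shown above (same spacing); has_temp_prefix is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the file declares the standard temp prefix.
has_temp_prefix() {
  # -F matches the string literally, so the dots and slashes need no escaping.
  grep -Fq '@prefix : <http://purl.org/nanopub/temp/np1/> .' "$1"
}

# Report every doibot draft that lacks the expected prefix line.
for f in doibot/output/*.trig; do
  [ -e "$f" ] || continue   # the glob matched nothing
  has_temp_prefix "$f" || echo "unexpected base prefix: $f"
done
```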

External Dependencies

Nanopub network: Published via nanodash.knowledgepixels.com / nanodash.petapico.org
Identifiers: ORCID, ROR, CrossRef DOIs, ChecklistBank, PubMed
Ontologies: ENVO (environments), UBERON (anatomy/life stages), BioLink (associations), Schema.org, Dublin Core, PROV
