Reusable React components on JSF: file uploader (DVWebloader v2) and lazy file tree view (#6691, #12179) #12382
Draft
When S3 tagging is enabled (DISABLE_S3_TAGGING is false or unset), generateTemporaryS3UploadUrls now includes "tagging": "dv-state=temp" in the JSON response. The client reads this field and sets x-amz-tagging accordingly — making the server authoritative instead of duplicating the JVM setting on the client. Also adds doc/Architecture/reusable_frontend_components.md covering the cross-repo uploader and tree view design decisions.
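The client-side decision described above can be sketched as follows. The real client is the JavaScript SDK; this Java version only illustrates the logic, and the class and method names (`UploadHeaders`, `forUpload`) are hypothetical. It assumes the client forwards the server's `tagging` value verbatim and sends no tagging header when the field is absent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Dataverse or its SDK. */
final class UploadHeaders {

    /**
     * Builds the extra headers for the S3 PUT from the "tagging" value the
     * server returned (null when the field was absent from the JSON response).
     */
    static Map<String, String> forUpload(String taggingFromServer) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (taggingFromServer != null && !taggingFromServer.isBlank()) {
            // The server is authoritative: forward its value unchanged.
            headers.put("x-amz-tagging", taggingFromServer);
        }
        // No "tagging" field means tagging is disabled, so send no header.
        return headers;
    }
}
```

With this shape the client never needs to know the JVM setting; it only reacts to what the response contains.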
- FeatureFlags.REACT_UPLOADER: replace @todo with @since 6.11; document the runtime requirement (api-session-auth) and the expected bundle URL.
- editFilesFragment.xhtml: short comment explaining why dropBoxUploadFinished is now hoisted out of the legacy upload block (the Dropbox panel renders independently of the React/JSF upload switch and still needs the callback).
- reusable_frontend_components.md: document the CSS isolation strategy and the remaining Bootstrap-globals limitation, with PostCSS scoping / Shadow DOM as the planned follow-ups.
The JSF page that mounts the React uploader currently hardcodes the
bundle path as `/dvwebloader/...` (legacy from DVWebloader v1). This
works only when the dataverse-frontend dev environment serves the
build output at that same-origin path.
To support institutions that don't run the SPA — and that may host
the bundle from a sidecar container, an existing nginx alias, or a
CDN — make the base URL configurable.
- JvmSettings: new entry REUSABLE_COMPONENTS_BASE_URL bound to
`dataverse.reusable-components.base-url`.
- SystemConfig.getReusableComponentsBaseUrl(): returns the configured
URL with any trailing slash trimmed, defaulting to `/dvwebloader`
to preserve backward compatibility with the existing dev nginx
alias and any same-origin operator setup.
- editFilesFragment.xhtml: the React-uploader script tag now reads
`#{systemConfig.reusableComponentsBaseUrl}/reusable-components/
dv-uploader.js` instead of the literal `/dvwebloader/...`. JSF
fallback path is unchanged.
Non-breaking: default behaviour matches the previous hardcoded path.
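A minimal sketch of the defaulting and trimming behaviour described for `getReusableComponentsBaseUrl()`. This is not the actual `SystemConfig` code: the class `ReusableComponentsUrl` and its methods are hypothetical, and stripping every trailing slash (rather than just one) is an assumption.

```java
/** Hypothetical stand-in for the SystemConfig logic; illustration only. */
final class ReusableComponentsUrl {

    static final String DEFAULT_BASE = "/dvwebloader";

    /** Returns the configured base URL without trailing slashes, or the default. */
    static String baseUrl(String configured) {
        if (configured == null || configured.isBlank()) {
            return DEFAULT_BASE;
        }
        String trimmed = configured.trim();
        // Assumption: all trailing slashes are removed, not only the last one.
        while (trimmed.endsWith("/")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return trimmed;
    }

    /** Script URL in the shape the JSF fragment is described as using. */
    static String uploaderScript(String configured) {
        return baseUrl(configured) + "/reusable-components/dv-uploader.js";
    }
}
```

Trimming before concatenation keeps the script URL free of a double slash whether or not the operator wrote the setting with a trailing one.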
Operator-facing documentation for the reusable React components track:
how to host the bundle, how to point the JSF page at it, and how
versioning flows through npm → Docker image → JVM setting.
- doc/sphinx-guides/source/container/running/reusable-components.rst
is a new guide page modelled on previewers-provider in the demo
guide. It explains the npm + sidecar-image distribution model,
walks through three valid hosting choices (gdcc/dataverse-reusable-
components container, operator-managed nginx, CDN), gives a sample
Docker Compose service block, and cross-references the relevant
feature flags + the frontend-side contract document.
- frontend-dev.rst now links to the new page so readers landing on
the SPA-frontend guide find the JSF integration story.
- container/running/index.rst toctree includes the new page between
frontend-dev and backend-dev.
- installation/config.rst adds:
  - dataverse.feature.react-uploader (the existing flag, finally
    documented) with prerequisite notes.
  - dataverse.reusable-components.base-url next to dataverse.siteUrl,
    with examples for sidecar / nginx / CDN setups.
- doc/release-notes/6691-reusable-frontend-components.md describes
the React uploader feature flag, the new JVM setting, the S3
tagging server-authoritative change, the prerequisites for
enabling the feature, and the cross-repo coordination.
The original document mixed cross-repo decision-log content with backend-side integration mechanics. Split that responsibility:

- This document (in dataverse) is now strictly the BACKEND HALF of the dual-mode contract: how JSF pages mount React components built in dataverse-frontend, how feature flags gate the swap, how nginx hosts the bundle, and how to add a new JSF page that mounts an SPA component.
- The matching FRONTEND HALF (config interfaces, build pipeline, CSS isolation, how to make a component reusable) lives in dataverse-frontend/docs/reusable-components.md (added in that repo).
- Cross-repo decisions, branch tracking, and active-track notes move out of this file entirely; they belong in the working plan rather than in committed Dataverse documentation.

The new content covers:

- Why dual-mode + the integration pattern diagram.
- Feature flag conventions and naming.
- Authentication prerequisites (session-cookie + hardening).
- Hosting options for the bundle (image / nginx / CDN).
- A worked example of replacing a JSF widget with an SPA component (the uploader).
- Adding a brand-new reusable component to a JSF page (the upcoming tree-view case).
- Currently shipped components (uploader, tree-view planned).
- Risks and trade-offs (Bootstrap collision, session-cookie, etc.).
New API endpoint that lazy-lists the immediate children (folders +
files) inside a folder of a dataset version, enabling tree-view UIs
to fetch on demand and paginate stably across very large datasets:
GET /api/datasets/{id}/versions/{versionId}/tree
Query parameters: path, limit (default 100, clamped 1-1000), cursor
(opaque keyset token), include (all|folders|files), order
(NameAZ|NameZA), includeDeaccessioned, originals.
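The `limit` handling stated above (default 100, clamped to 1-1000) can be sketched as below. The class and method names are hypothetical, and the sketch assumes out-of-range values are clamped silently rather than rejected.

```java
/** Hypothetical helper illustrating the documented limit rule. */
final class TreeQuery {

    static final int DEFAULT_LIMIT = 100;
    static final int MIN_LIMIT = 1;
    static final int MAX_LIMIT = 1000;

    /** Returns the page size to use for a possibly absent requested limit. */
    static int effectiveLimit(Integer requested) {
        if (requested == null) {
            return DEFAULT_LIMIT; // parameter omitted
        }
        // Clamp into [MIN_LIMIT, MAX_LIMIT] instead of failing the request.
        return Math.max(MIN_LIMIT, Math.min(MAX_LIMIT, requested));
    }
}
```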
Response: {path, items[], nextCursor, limit, order, include,
approximateCount}. Folders come first, then files; both name-sorted
case-insensitively, files break ties on data file id for stability.
Folder items carry counts of distinct subfolders + descendant files.
File items carry id, size, contentType, access (public/restricted/
embargoed), optional checksum, and downloadUrl. Permissions and
embargoes are honoured exactly as on .../files.
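The ordering rule (folders first, case-insensitive name sort, file ties broken on data file id) could be expressed as a comparator along these lines. The `Item` type is a hypothetical stand-in for the real response items, and keeping the id tie-break ascending in descending mode is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the documented ordering; not the service code. */
final class TreeOrdering {

    /** Hypothetical item shape, for illustration only. */
    static final class Item {
        final boolean folder;
        final String name;
        final long id; // data file id; ignored for folders

        Item(boolean folder, String name, long id) {
            this.folder = folder;
            this.name = name;
            this.id = id;
        }
    }

    static Comparator<Item> comparator(boolean descending) {
        Comparator<Item> byName =
                Comparator.comparing((Item i) -> i.name, String.CASE_INSENSITIVE_ORDER);
        if (descending) {
            byName = byName.reversed(); // reverses only the within-type sort
        }
        // false sorts before true, so folders (folder == true) come first.
        return Comparator.comparing((Item i) -> !i.folder)
                .thenComparing(byName)
                .thenComparingLong(i -> i.id); // stable tie-break on id
    }

    static List<String> sortedNames(List<Item> items, boolean descending) {
        List<Item> copy = new ArrayList<>(items);
        copy.sort(comparator(descending));
        List<String> names = new ArrayList<>();
        for (Item item : copy) {
            names.add(item.name);
        }
        return names;
    }
}
```

The id tie-break matters for names that differ only in case, such as `B.txt` and `b.txt`: without it their relative order could change between pages.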
Implementation:
- DatasetVersionTreeService (new package edu.harvard.iq.dataverse.
datasetversiontree): walks DatasetVersion.fileMetadatas once,
groups files by their first segment relative to the requested
path, applies include/order, paginates in memory with an opaque
Base64 "offset=N" cursor. Wire format and cursor behaviour are
stable; promotion to native keyset SQL is tracked as a follow-up
and won't change the contract.
- Datasets.getVersionTree handler + jsonTreePage serialiser.
- Bundle.properties keys for invalid-query / not-found errors.
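As an illustration of the opaque Base64 "offset=N" cursor mentioned above, here is a hedged sketch of an encoder and decoder. The class name `OffsetCursor`, the URL-safe unpadded Base64 variant, and treating a missing cursor as offset 0 are all assumptions; the source only states that the token is Base64 over "offset=N" and that invalid cursors are rejected.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical cursor codec; the real service may differ in details. */
final class OffsetCursor {

    private static final String PREFIX = "offset=";

    static String encode(int offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must be >= 0");
        }
        // Assumption: URL-safe alphabet without padding, so the token
        // can be passed as a query parameter without further escaping.
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString((PREFIX + offset).getBytes(StandardCharsets.UTF_8));
    }

    /** Returns the offset, or throws IllegalArgumentException for a bad token. */
    static int decode(String cursor) {
        if (cursor == null || cursor.isEmpty()) {
            return 0; // no cursor: start from the first item
        }
        final String decoded;
        try {
            decoded = new String(Base64.getUrlDecoder().decode(cursor), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("invalid cursor", e);
        }
        if (!decoded.startsWith(PREFIX)) {
            throw new IllegalArgumentException("invalid cursor");
        }
        try {
            int offset = Integer.parseInt(decoded.substring(PREFIX.length()));
            if (offset < 0) {
                throw new IllegalArgumentException("invalid cursor");
            }
            return offset;
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("invalid cursor", e);
        }
    }
}
```

A handler would map the `IllegalArgumentException` to the 400 response the tests expect. Because clients treat the token as opaque, the payload can later change to a keyset position without breaking them.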
Tests:
- DatasetVersionTreeServiceTest covers root grouping, folder-only
immediate-children listing, path normalisation
(/data//sub/// → data/sub), include filter, cursor-paginated
retrieval, invalid-cursor / invalid-order rejection, originals
toggle on the downloadUrl, descending order, restricted /
embargoed access strings, and folder-counts semantics.
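The path normalisation case listed above (`/data//sub///` becoming `data/sub`) can be reproduced with a small sketch. `TreePaths.normalize` is a hypothetical name, and the sketch only collapses slashes; whether the real service also handles `.`/`..` segments or whitespace is not stated in the source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper illustrating the slash-collapsing rule only. */
final class TreePaths {

    /** Drops leading, trailing, and repeated slashes; "" denotes the root. */
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        List<String> segments = new ArrayList<>();
        for (String segment : raw.split("/")) {
            if (!segment.isEmpty()) {
                segments.add(segment); // skip the empty pieces between slashes
            }
        }
        return String.join("/", segments);
    }
}
```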
Sphinx native-api.rst gains a "List a Folder of a Dataset Version
(Tree View)" section. Release-notes snippet at
doc/release-notes/6691-dataset-version-tree-listing-api.md.
End-to-end coverage of the new dataset-version tree endpoint, run
against a live container in CI. Complements the unit-level
DatasetVersionTreeServiceTest which only exercises the service bean.
Tests:
- root listing returns immediate children, folders first, with the
expected counts {files, folders} on each folder item.
- folder listing returns only immediate children.
- path normalisation (/data//sub///) → "data/sub".
- cursor pagination is stable and exhausts cleanly.
- invalid cursor → 400.
- invalid order → 400.
- include filter restricts items to folders or files.
- descending order keeps folders-first but reverses the within-type
sort.
- originals=true switches the file downloadUrl to ?format=original.
- unauthenticated access to a draft → 401/403.
- another authenticated user without permission → 404 (Dataverse's
standard "draft not visible" behaviour, not 403).
- empty dataset → empty items list with approximateCount=0.
- a published dataset is readable via :latest.
UtilIT gains a getVersionTree helper that mirrors the existing
getVersionFiles helper.
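The "immediate children only" behaviour these tests check comes from grouping files by their first path segment relative to the requested folder. A simplified sketch follows. The `ImmediateChildren` class and the `{directory, fileName}` input shape are hypothetical, and it tracks only a descendant-file count per folder, not the subfolder count the real response also carries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified grouping; not the DatasetVersionTreeService code. */
final class ImmediateChildren {

    /** Folder name -> number of descendant files (TreeMap for deterministic output). */
    final Map<String, Integer> folders = new TreeMap<>();
    /** Names of files located directly in the requested folder. */
    final List<String> files = new ArrayList<>();

    /**
     * @param entries pairs of {directory, fileName}; directory is "" for the root
     * @param parent  normalised folder path being listed; "" for the root
     */
    static ImmediateChildren of(List<String[]> entries, String parent) {
        ImmediateChildren out = new ImmediateChildren();
        String prefix = parent.isEmpty() ? "" : parent + "/";
        for (String[] entry : entries) {
            String dir = entry[0];
            String name = entry[1];
            if (dir.equals(parent)) {
                out.files.add(name); // file sits directly in the folder
                continue;
            }
            if (!dir.startsWith(prefix)) {
                continue; // outside the requested folder
            }
            String rest = dir.substring(prefix.length());
            int slash = rest.indexOf('/');
            String first = slash < 0 ? rest : rest.substring(0, slash);
            out.folders.merge(first, 1, Integer::sum); // one more descendant file
        }
        return out;
    }
}
```

Matching on `parent + "/"` rather than on `parent` alone keeps a sibling such as `database` from being listed under `data`.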
For published, non-deaccessioned versions, the response now carries:

ETag: "<sha256-prefix>"
Cache-Control: public, immutable

The ETag is derived from a stable hash of (version id, version state, path, limit, cursor, include, order, originals, includeDeaccessioned). Subsequent requests including a matching If-None-Match header receive 304 Not Modified with no body.

Drafts and deaccessioned versions do not emit an ETag because their content can change in place. The published-version assumption holds because Dataverse versions are immutable once released; deaccession is the only state change, and we exclude it explicitly.

Doc + release-notes updates describe the caching contract.

DatasetsTreeIT gains two tests:
- draft response must NOT carry an ETag
- published response carries ETag + Cache-Control, honours If-None-Match (returning 304), and changes the ETag on different query params.
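A hedged sketch of how such an ETag could be computed and checked. The `TreeEtag` class is hypothetical. The newline-joined input, the 16-hex-character prefix length, and the exact-match comparison are assumptions; a full `If-None-Match` implementation also has to handle lists of validators and `*`, which this sketch ignores.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical ETag helper; illustrates the idea, not the actual handler. */
final class TreeEtag {

    /** Hashes the request-defining parts and returns a quoted hex prefix. */
    static String compute(String... parts) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(
                    String.join("\n", parts).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) { // assumption: keep the first 8 bytes
                hex.append(String.format("%02x", digest[i]));
            }
            return "\"" + hex + "\""; // entity tags are quoted strings
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when the client's validator matches and a 304 can be returned. */
    static boolean notModified(String ifNoneMatch, String etag) {
        return ifNoneMatch != null && ifNoneMatch.trim().equals(etag);
    }
}
```

Including every query parameter in the hash is what makes the tag change when the same version is requested with a different `limit`, `cursor`, or `order`.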
Sphinx guide and the per-issue release-notes snippet now mention the ETag / Cache-Control / If-None-Match contract added in the previous commit. The behaviour itself is unchanged.
…#12179)

Mirrors the existing react-uploader pattern: a JVM feature flag controls whether the JSF page renders the React reusable component or the classic PrimeFaces widget.

- New feature flag dataverse.feature.react-tree-view in FeatureFlags.java + SystemConfig.isReactTreeViewEnabled().
- filesFragment.xhtml: when the flag is on AND the user selects the Tree mode of the existing Table/Tree toggle, the page renders <div id="dv-tree-view"> + a window.dvTreeViewConfig snippet + a module script tag pointing at #{systemConfig.reusableComponentsBaseUrl}/reusable-components/dv-tree-view.js. Otherwise the existing p:tree continues to render unchanged.
- Sphinx config.rst documents the new flag next to react-uploader and links to the operator guide.
- container/running/reusable-components.rst notes both shipped components share the same build/distribution.
- 6691-reusable-frontend-components.md release-notes file gains a bullet for the tree-view flag.

The React bundle is built by the dataverse-frontend build-uploader script (vite.config.uploader.ts) and ships alongside dv-uploader.js with shared chunks. This satisfies #12179 (direct JS mount in JSF for tree view).
Replaces the 'in development' tree-view note with the shipped surface (JSF mount path, config interface, backend endpoint, ETag, streaming zip) and updates the greenfield-pattern paragraph to reflect that the tree view has landed.
Drops the last two FQN references in the new tree handler's ETag helper. Cosmetic; matches prevailing style in the file.
Title 'List a Folder of a Dataset Version (Tree View)' is 46
characters; underline was 45. Sphinx 7.x treats this as a build error
('Warning, treated as error: Title underline too short.') under the
docs / readthedocs CI. One extra tilde fixes it.
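The length mismatch is easy to check mechanically. As a small sketch (the `RstHeading` helper is hypothetical and not part of the docs build), an underline can be generated from the title so that the two lengths always agree:

```java
/** Hypothetical helper: builds an RST underline matching the title length. */
final class RstHeading {

    static String underline(String title, char marker) {
        // One marker per character; adequate for ASCII titles like this one.
        return String.valueOf(marker).repeat(title.length());
    }
}
```

For the title in question, `underline("List a Folder of a Dataset Version (Tree View)", '~')` yields 46 tildes, one more than the 45 that triggered the error.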
What this PR does / why we need it:
Lands the backend half of the reusable React components pattern that lets a single React component built in `dataverse-frontend` mount on either the SPA or a JSF page, behind a feature flag, with the legacy widget as the off-state. Two concrete components ride on that pattern in this PR:

- The React uploader replaces the `p:fileUpload` widget on the dataset edit page when `dataverse.feature.react-uploader` is enabled.
- The React lazy tree renders the Tree view on the dataset Files tab when `dataverse.feature.react-tree-view` is enabled. The Table view is unchanged.

Both flags default to off. JSF behaves exactly as before until an operator opts in.
Net-new in this PR:
- `GET /api/datasets/{id}/versions/{versionId}/tree`: a paginated, lazy listing of the immediate folders + files inside a folder of a dataset version. Opaque keyset cursor, name ordering, `include` / `order` / `originals` filters, `ETag` + `If-None-Match` for published versions. Used by the tree component but generally useful to any client that wants to walk a dataset's directory structure without materialising all files at once.
- `DatasetVersionTreeService` + unit + integration tests (`DatasetsTreeIT`).
- `FeatureFlags` enum entries (`REACT_UPLOADER`, `REACT_TREE_VIEW`).
- `dataverse.reusable-components.base-url` (default `/dvwebloader`) so operators can host the bundle in a sidecar container, on their own nginx, or behind a CDN; see the new operator-facing Sphinx page.
- `<ui:fragment>` swaps in `editFilesFragment.xhtml` and `filesFragment.xhtml`, gated on the flags.
- `S3AccessIO.generateTemporaryS3UploadUrls` now includes a `tagging` field in its JSON response when `dataverse.files.<driverId>.disable-tagging` is unset. The dataverse-client-javascript SDK reads this and decides whether to send `x-amz-tagging`. Non-breaking, additive.
- `doc/Architecture/reusable_frontend_components.md` (refactored to be a backend integration guide; cross-links the frontend half).
- `doc/sphinx-guides/source/container/running/reusable-components.rst` (new operator guide).
- `doc/release-notes/`.

Which issue(s) this PR closes:
Special notes for your reviewer:
- … `develop`, decoupled from the `12178_*` hardening track, so review and merge are independent.
- … `dataverse.feature.react-tree-view` on a JSF-only install with large datasets. The `DatasetVersionTreeService.listChildren` first cut walks `version.getFileMetadatas()` once per request and partitions in memory. The wire format is correct and the cursor behaviour is stable, but for a dataset with ~100k files, opening 10 folders is roughly 10× the backend work the table view does in 1 request. This is acceptable for a few-thousand-file install, an SPA-only opt-in, or a power user driving the URL bookmark; it is not yet right for advertising the JSF mount to a large-dataset operator. Promotion to a native folder query + JPA Criteria for files (with Flyway indices and a side-by-side fixture-comparison IT) is tracked as the next focused PR. Until then the JSF feature flag should stay off on big installs.
- … `client-zip`); there is no server-side ZIP endpoint touched in this PR.
- … `getDatasetVersionOrDie` plumbing as `/versions/{versionId}/files`, so permissions / embargoes / restrictions / deaccession honour the same rules.
- … `dataverse.reusable-components.base-url`. Default `/dvwebloader` preserves backwards compatibility with the dev-environment nginx alias and with operators who already self-host. Three reasonable hosting patterns are documented (sidecar image, own nginx, CDN).
- … `IQSS/dataverse-client-javascript#403`; the two are independent because dataverse has no `npm/js-dataverse` dependency. The matching frontend PR (`IQSS/dataverse-frontend#898`) merges last, with a final commit that bumps the SDK pin from the GitHub Packages prerelease (`2.2.0-pr403.<sha>`) to the released semver the SDK PR cuts.
- … `dv-uploader.js` and `dv-tree-view.js` bundles for production operators (npm package + Docker sidecar image) is a separate piece of work tracked in the cross-repo plan and deferred to a team discussion. Until it lands, operators wanting the JSF mount in production have to build `dataverse-frontend` and serve `dist-uploader/` themselves. Documented in the operator guide as the current limitation. The dev-compose setup automates this for reviewers, so this PR is independently reviewable as-is.

Suggestions on how to test this:
Backend-only checks (no frontend repo needed):
Then exercise the new endpoint directly:
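The original command is not preserved at this point. As a hedged stand-in, the sketch below builds a request for the endpoint with Java's built-in HTTP client. The base URL `http://localhost:8080`, the dataset id, and the token value are placeholders, and `TreeRequest` is a hypothetical helper. `X-Dataverse-key` is the usual Dataverse API token header.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds (but does not send) a tree-listing request. */
final class TreeRequest {

    static HttpRequest build(String baseUrl, long datasetId, String versionId,
                             String path, int limit, String apiToken) {
        // Encode the folder path so slashes and spaces survive as one query value.
        String query = "path=" + URLEncoder.encode(path, StandardCharsets.UTF_8)
                + "&limit=" + limit;
        URI uri = URI.create(baseUrl + "/api/datasets/" + datasetId
                + "/versions/" + versionId + "/tree?" + query);
        return HttpRequest.newBuilder(uri)
                .header("X-Dataverse-key", apiToken)
                .GET()
                .build();
    }
}
```

Sending it is then a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance. To fetch the next page, add the returned `nextCursor` as the `cursor` parameter.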
End-to-end with the JSF mount (requires the matching frontend bundle to be reachable; the dev-compose setup in `dataverse-frontend/dev-env/` serves it for you):

- `cd dataverse-frontend/dev-env && docker compose up`. It pulls a `gdcc/dataverse:<DATAVERSE_IMAGE_TAG>` image and mounts the locally-built bundle at `/dvwebloader/`. Both flags are on by default in the dev compose env.
- `http://localhost:8000/editdatafiles.xhtml?datasetId=<id>` → the React uploader replaces the PrimeFaces upload widget.
- `http://localhost:8000/dataset.xhtml?persistentId=<pid>` → flip the existing Tree toggle in the Files tab → the React lazy tree mounts.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Yes, but only when one of the two new feature flags is enabled:
- `dataverse.feature.react-uploader=on` → the file upload widget on the dataset edit page becomes the React uploader. Same dataset-edit flow, no DB-level differences.
- `dataverse.feature.react-tree-view=on` → the existing Table / Tree toggle on the dataset Files tab keeps working; the Tree view is rendered by the React lazy tree (with selectable rows, keyboard navigation, URL bookmarkability, and a client-side streaming-zip download for the user's selection) instead of the PrimeFaces tree. The Table view is unchanged.

The frontend half (with screenshots / Storybook stories / Chromatic baselines) is in `IQSS/dataverse-frontend#898`.

Is there a release notes update needed for this change?:
Yes, included in this PR under `doc/release-notes/`:

- `6691-reusable-frontend-components.md`: covers the JSF mount pattern, both feature flags, the new JVM setting, the S3-tagging change, hosting options, and prerequisites.
- `6691-dataset-version-tree-listing-api.md`: covers the new `GET .../tree` endpoint, query parameters, response shape, ETag semantics, and the matching SDK helpers in `dataverse-client-javascript`.

Additional documentation:
- `doc/sphinx-guides/source/container/running/reusable-components.rst` (Reusable Frontend Components): how to host the bundle, sidecar vs CDN vs same-origin nginx, prerequisites, versioning.
- `doc/sphinx-guides/source/api/native-api.rst` § List a Folder of a Dataset Version (Tree View): endpoint contract, params, response, ETag, error codes.
- `doc/Architecture/reusable_frontend_components.md`: the backend half of the reusable-components contract (matches `dataverse-frontend/docs/reusable-components.md` on the frontend side).
- … `dataverse.feature.react-uploader`, `dataverse.feature.react-tree-view`, and `dataverse.reusable-components.base-url`.

Cross-repo PRs that pair with this one:
- `IQSS/dataverse-client-javascript#403`: adds `listDatasetTreeNode` and `iterateDatasetTreeNode` SDK helpers, plus the server-authoritative S3 tagging change.
- `IQSS/dataverse-frontend#898`: ships the `dv-uploader.js` and `dv-tree-view.js` bundles plus the SPA tree view.

A canonical, living plan for the cross-repo work is at `tree_view_plan.md` in the workspace `dataverse-context` repo.

AI assistance disclosure: This PR was developed with significant assistance from an AI coding assistant (Claude). All Java, JSF, Sphinx, and release-notes content was generated with AI involvement; the human author reviewed and curated each commit before pushing. Reviewers should treat the diff as if any human had written it: flag anything that looks off, especially around the in-memory paginator's behaviour on large datasets.