docs: add JEP-0014 Virtual Scalable Exporters proposal by mangelajo · Pull Request #744 · jumpstarter-dev/jumpstarter

mangelajo · 2026-06-03T17:02:01Z

Summary

Adds JEP-0014: Virtual Scalable Exporters, proposing a controller-managed pool of virtual target instances with configurable autoscaling via per-provider CRDs (QEMUExporterPool, AndroidExporterPool, etc.)
Updates the JEP index in README.md to include JEP-0014

Test plan

Verify the document renders correctly in Sphinx docs
Review JEP content for completeness and accuracy
Confirm all template-required sections are present (Abstract, Motivation, Proposal, Design Decisions, Design Details, Test Plan, Acceptance Criteria, Backward Compatibility, Consequences, Rejected Alternatives)

Made with Cursor

coderabbitai · 2026-06-03T17:02:14Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f0806445-50e1-4c09-843d-f956d62a9aa2

📥 Commits

Reviewing files that changed from the base of the PR and between b397149 and 0c8b743.

📒 Files selected for processing (1)

python/docs/source/contributing/jeps/index.md

🚧 Files skipped from review as they are similar to previous changes (1)

python/docs/source/contributing/jeps/index.md

📝 Walkthrough

Walkthrough

Adds JEP-0014 documentation proposing controller-managed, provider-specific autoscaling exporter pools with warm-pool leasing, Exporter.spec.enabled for graceful scale-down, reconciliation pseudocode, tests/acceptance criteria, phased implementation plan, and registers the JEP in the docs index.

Changes

JEP-0014 Virtual Scalable Exporters

Layer / File(s)	Summary
Problem Statement and Core Proposal `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	JEP header, abstract, motivation, user stories; introduces managed `*ExporterPool` CRDs, warm-pool leasing semantics, and example provider manifests.
Architecture, Deployment, and API `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	Controller and pool-controller architecture, per-provider Deployment model (shared binary + `--provider`), scaling inputs (watching Leases + Exporters), instance lifecycle, hardware/compatibility notes, and `Exporter.spec.enabled` for coordinated scale-down.
Design Decisions and Reconciliation Logic `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	Design rationale (pool-based scaling, rejection of per-lease parameters), reconciliation pseudocode, invariants, instance state model, component interactions, and failure modes.
Testing, Acceptance, and Backward Compatibility `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	Test plan, acceptance and graduation criteria, and backward compatibility expectations (including `Exporter.enabled` defaulting).
Consequences, Risks, and Future Work `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	Consequences, identified risks, rejected alternatives, prior art, unresolved/resolved questions, and future provider extensions outside the JEP scope.
Implementation Plan and References `python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`	Phased implementation roadmap (Phase 1: `Exporter.enabled`; Phase 2: pool controller + `QEMUExporterPool`; Phase 3: more providers), implementation history, references, and license.
JEP Documentation Index Update `python/docs/source/contributing/jeps/index.md`	Registers JEP-0014 in Standards Track table (Draft) and adds the JEP to the JEP toctree.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

documentation

Suggested reviewers

kirkbrauer
bennyz
bkhizgiy
maboras-rh
raballew

Poem

🐰 I hopped through a JEP at break of day,
Pools of exporters lined the way,
Controllers hum and leases sing,
Warm instances ready for testing,
Docs tucked neat — the rabbit hops away.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding JEP-0014 documentation for Virtual Scalable Exporters, which aligns with the +920 lines added to the JEP document and index updates.
Description check	✅ Passed	The description is directly related to the changeset, detailing the addition of JEP-0014 Virtual Scalable Exporters and index updates with a clear test plan.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


Propose a Virtual Scalable Exporter subsystem for Jumpstarter that manages pools of virtual targets with configurable autoscaling via per-provider CRDs (QEMUExporterPool, AndroidExporterPool, etc.). Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai

🧹 Nitpick comments (1)

python/packages/jumpstarter-driver-opendal/jumpstarter_driver_opendal/client.py (1)

82-84: 💤 Low value

Optional: consolidate the duplicated HTTP-URL detection. The same 3-line original_url block is now copy-pasted in write_from_path, _flash_single, and StorageMuxFlasherClient.flash. A tiny helper keeps the three sites from drifting.

♻️ Proposed helper

+def _http_original_url(path: PathBuf) -> str | None:
+    """Return the path as an HTTP(S) original_url, else None."""
+    if isinstance(path, str) and path.startswith(("http://", "https://")):
+        return path
+    return None

Then at each call site:

-        original_url = None
-        if isinstance(path, str) and path.startswith(("http://", "https://")):
-            original_url = path
+        original_url = _http_original_url(path)
         if operator is None:
             path, operator, _ = operator_for_path(path)

Also applies to: 636-638, 774-776

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@python/packages/jumpstarter-driver-opendal/jumpstarter_driver_opendal/client.py`
around lines 82 - 84, The HTTP-URL detection logic (setting original_url when
path is a string starting with "http://" or "https://") is duplicated in
write_from_path, _flash_single, and StorageMuxFlasherClient.flash; extract this
into a small helper function (e.g., is_http_url or extract_original_url) and
replace the three copy-pasted blocks with calls to that helper. Ensure the
helper accepts the same path argument(s) and returns the original_url (or None)
so callers in write_from_path, _flash_single, and StorageMuxFlasherClient.flash
keep the existing behavior without duplicated code.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1246f919-4b05-47a5-bbbd-68370cf50adc

📥 Commits

Reviewing files that changed from the base of the PR and between e654084 and 8928e29.

📒 Files selected for processing (4)

python/docs/source/internal/jeps/JEP-0014-virtual-scalable-exporters.md
python/docs/source/internal/jeps/README.md
python/packages/jumpstarter-driver-opendal/jumpstarter_driver_opendal/client.py
python/packages/jumpstarter-driver-opendal/jumpstarter_driver_opendal/driver_test.py

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md`:
- Around line 647-653: Replace the placeholder "<!-- TODO: Detail specific test
cases -->" in the "Test Plan" section with concrete, verifiable test cases:
enumerate unit tests (functionality scenarios, inputs, expected outputs) for
exporter creation and scaling code paths (e.g., single-exporter, multi-exporter,
failure/retry), integration tests covering end-to-end export flows and
compatibility boundaries (formats, destinations, auth), performance/load tests
with target metrics (throughput, latency, resource usage) and pass/fail
thresholds, and regression/upgrade tests to assert behavior across version
changes; reference the "Test Plan" header and ensure each case includes scope,
steps, expected outcome, and acceptance criteria so reviewers can reproduce and
validate.
- Line 1: This JEP file is not included in any Sphinx toctree; open
python/docs/source/contributing/jeps/index.md and add an entry for
JEP-0014-virtual-scalable-exporters.md (exact filename) to the toctree so the
page is discovered by Sphinx; ensure the relative path and filename match the
JEP file and rebuild docs to confirm warnings are cleared.
- Around line 243-279: The fenced code blocks containing ASCII diagrams and
sequences are untyped and trigger MD040; update each triple-backtick fence
around the diagrams/blocks (e.g., the large Kubernetes ASCII diagram block and
the smaller sequence blocks like "for each *ExporterPool CR:" and "Provisioning
→ Ready...") by adding an explicit language identifier such as text (```text) to
the opening fence so the markdown linter stops flagging them; ensure you replace
the untyped ``` with ```text for all occurrences noted in the review (including
the blocks around lines referenced in the comment).


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 35aa8d0c-1ccf-4eb6-9673-1944fc3dbb0f

📥 Commits

Reviewing files that changed from the base of the PR and between 8928e29 and 08d970b.

📒 Files selected for processing (2)

python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md
python/docs/source/contributing/jeps/index.md

✅ Files skipped from review due to trivial changes (1)

python/docs/source/contributing/jeps/index.md

raballew · 2026-06-03T19:25:40Z

+  iterate quickly without waiting for scarce hardware.
+
+- **As a** platform engineer, **I want to** declare a virtual target pool with
+  `minInstances: 2, maxInstances: 20`, **so that** there are always warm


shouldn't it be minWarmInstances and maxTotalInstances then?

Done — renamed throughout the document: minInstances → minWarmInstances and maxInstances → maxTotalInstances. CRD spec comments updated to reflect the semantics.

This comment was generated from a Cursor session.

raballew · 2026-06-03T19:27:47Z

+The guiding principle is: **"Get me a target that matches my requirements."** The
+distinction between physical and virtual is an implementation detail, not a
+primary concern for the user. Virtual exporters simply appear in the same pool
+as physical ones, differentiated only by labels.


yeah, that's a reasonable design decision - this would allow you to easily switch between virtual and physical by merely adding a label

raballew · 2026-06-03T19:29:41Z

+Each pool controller watches two key resources to make scaling decisions:
+
+1. **Leases** — The controller watches for pending Leases whose label selectors
+   match the pool's labels. Pending leases with no available exporter signal


what about scheduled leases?

+1 we need to ignore them until it's time to make them effective.

Addressed — moved "Scheduled leases" from Unresolved to Resolved Questions. The controller already supports Spec.BeginTime on Lease CRs; the pool controller simply ignores leases whose BeginTime is in the future when counting demand, and only pre-provisions instances as the scheduled time approaches.

This comment was generated from a Cursor session.
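The demand-counting rule described above could be sketched roughly as follows. This is an illustration only: `PendingLease`, `count_demand`, and the 60-second `lead_time` are hypothetical names and values, not part of the JEP or the controller; only the idea of skipping leases whose `BeginTime` is still in the future comes from the discussion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass
class PendingLease:
    """Hypothetical, simplified view of a pending Lease CR."""
    name: str
    begin_time: Optional[datetime] = None  # mirrors Spec.BeginTime; None means "start now"


def count_demand(
    leases: Iterable[PendingLease],
    now: datetime,
    lead_time: timedelta = timedelta(seconds=60),  # assumed pre-provisioning window
) -> int:
    """Count pending leases that should drive scale-up right now.

    A lease counts if it has no begin time, or if its begin time falls
    within `lead_time` of `now`, so an instance can be warmed in advance.
    """
    horizon = now + lead_time
    return sum(
        1 for lease in leases
        if lease.begin_time is None or lease.begin_time <= horizon
    )


now = datetime(2026, 6, 3, 12, 0, tzinfo=timezone.utc)
leases = [
    PendingLease("immediate"),
    PendingLease("soon", begin_time=now + timedelta(seconds=30)),
    PendingLease("later", begin_time=now + timedelta(hours=1)),
]
demand = count_demand(leases, now)
```

With these inputs the lease scheduled an hour ahead is ignored, while the immediate lease and the one starting in 30 seconds both count toward demand.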

raballew · 2026-06-03T19:30:33Z

+**Per-Provider Deployments (single image by default):** All provider
+controllers are compiled into a single binary. Each Deployment in the cluster
+passes a `--provider=<type>` flag to activate the corresponding reconciler.
+This gives each provider isolated logs and independent restarts while
+maintaining a single image to build and release. The per-provider `image`
+override in the operator CR allows administrators to substitute a custom image
+for a specific provider (e.g., a third-party provider distributed as its own
+image) without affecting other providers.


raballew · 2026-06-03T19:34:04Z

+for scalable testing. However:
+
+- Virtual targets must faithfully emulate the interfaces exposed by physical
+  hardware (serial, network, storage, power) through the existing driver model.


hehe the LLM was a little bit creative in this, I think it needs some work

ok, so I am proposing "0" for no limits... (at your own risk) :D

Done — maxTotalInstances is now optional; omitting it or setting it to 0 means no upper limit.

This comment was generated from a Cursor session.
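One way to read the "0 or omitted means unlimited" semantics is sketched below. The formula (leased + pending + warm buffer, then capped) is an assumption made for illustration; the JEP's reconciliation pseudocode is the authority on how the target count is actually derived, and `desired_total` is a hypothetical name.

```python
from typing import Optional


def desired_total(
    leased: int,
    pending: int,
    min_warm: int,
    max_total: Optional[int] = None,
) -> int:
    """Return the number of instances a pool would aim for.

    Assumed rule: keep every leased instance, add one per pending lease,
    and keep `min_warm` spare instances on top. `max_total` of None or 0
    means no upper limit.
    """
    wanted = leased + pending + min_warm
    if max_total:  # None and 0 both fall through as "unlimited"
        wanted = min(wanted, max_total)
    return wanted


capped = desired_total(leased=18, pending=5, min_warm=2, max_total=20)
unlimited = desired_total(leased=18, pending=5, min_warm=2, max_total=0)
```

Treating `None` and `0` identically keeps the optional field and the explicit "no limit" value on a single code path.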

I am not sure what you are referring to here. The comment does not seem to be related to my support for proper interfaces.

@raballew I think if we can use virtio here, we might be able to get very high fidelity for guests that run natively on virtio targets and we can avoid privileged Pods in this case :)

raballew · 2026-06-03T19:34:57Z

+**Decision:** Pool-based with configurable min/max.
+
+**Rationale:** Purely on-demand provisioning introduces unacceptable latency for
+CI pipelines (VM boot + exporter registration can take 30-120s). A warm pool


why does it take 2min?

create exporter

create the pod, node is assigned...

the image is downloaded

the exporter boots

connects back and becomes ready.

Maybe it's closer to 10-15 seconds.

But it could be driver dependent, e.g. renode takes some time to initialize.

But it really doesn't make much difference: with the current design you can choose to set minInstances to 0 and ... get 0 warm instances, or more if you want any :) It becomes an admin decision and it doesn't really change the underlying code design much.

Updated DD-1 — the cold-start estimate is now 10–60 seconds (not 2 minutes). The breakdown: image pull + VM boot + exporter registration. The previous number was too generous.

This comment was generated from a Cursor session.

raballew · 2026-06-03T19:59:47Z

+
+```
+Provisioning → Ready (warm pool) → Leased → Ready
+                                              └→ Terminating → (deleted if available instances>min)


does that mean we are reusing virtual instances (even just the exporter)?

Addressed — clarified instance reuse in the Component Interaction section. Added a recycleStrategy field to the common pool spec with two options:

ExitAndReplace (default): the exporter process exits after lease release; the Deployment/ReplicaSet respawns a fresh Pod.

InPlaceReuse: the exporter performs internal cleanup and re-registers as ready for the next lease without restarting.

This comment was generated from a Cursor session.
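The two strategies could be modeled along these lines. Only the field values `ExitAndReplace` and `InPlaceReuse` and their described behavior come from the comment above; `RecycleStrategy`, `parse_recycle_strategy`, and `on_lease_released` are hypothetical names, and the returned action strings are placeholders rather than real controller calls.

```python
from enum import Enum
from typing import List, Optional


class RecycleStrategy(Enum):
    EXIT_AND_REPLACE = "ExitAndReplace"  # default per the discussion
    IN_PLACE_REUSE = "InPlaceReuse"


def parse_recycle_strategy(value: Optional[str]) -> RecycleStrategy:
    """Map the pool spec field to a strategy, defaulting when it is unset."""
    if not value:
        return RecycleStrategy.EXIT_AND_REPLACE
    return RecycleStrategy(value)  # raises ValueError on unknown values


def on_lease_released(strategy: RecycleStrategy) -> List[str]:
    """Return the ordered actions taken once a lease is released."""
    if strategy is RecycleStrategy.EXIT_AND_REPLACE:
        # Throwaway instance: the process exits and a fresh Pod is spawned.
        return ["exit exporter process", "respawn fresh Pod"]
    # Reuse: the same exporter cleans up and becomes leasable again.
    return ["run internal cleanup", "re-register as ready"]


actions = on_lease_released(parse_recycle_strategy(None))
```

Defaulting to `ExitAndReplace` matches the throwaway behavior, where no state from a previous lease can leak into the next one.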

virtual instances should be throwaway imo

Add DD-4 explaining why per-lease parameters are not included in this JEP. The same use case is served by creating separate pools with different resource profiles, avoiding complexity across the Lease CRD, controller, pool controllers, and driver templates. Co-authored-by: Cursor <cursoragent@cursor.com>

- Clarify warm pool rationale and cold-start latency range (10-60s)
- Rename minInstances/maxInstances to minWarmInstances/maxTotalInstances
- Make maxTotalInstances optional (0 or omitted means unlimited)
- Add Crossplane to Prior Art with rationale
- Resolve scheduled leases question via existing BeginTime mechanism
- Add DD-5: built-in scaling vs HPA/KEDA
- Add DD-4: per-lease parameters rejected in favor of pool flavors
- Add composite exporters and Corellium to Future Possibilities
- Clarify instance reuse with recycleStrategy field (ExitAndReplace default)
- Add language identifiers to untyped fenced code blocks
- Add Apache 2.0 license footer

Co-authored-by: Cursor <cursoragent@cursor.com>

raballew · 2026-06-04T12:37:42Z

+CI pipelines (Pod scheduling + image pull + VM boot + exporter registration
+typically takes 10-15s, and up to 60s with cold image pulls or heavy
+providers). A warm pool
+provides instant lease fulfillment for the common case. Setting `minWarmInstances: 0`


are you referring to the exporter image or the "disk image" that the exporter will run?

The pull of the container image that runs the exporter in a pod. I guess that in most cases it will already be present on the node.

then this is a no-op, with time only spent on scheduling, boot and registration

kirkbrauer · 2026-06-04T13:58:07Z

@mangelajo I would also propose a sidecar pattern for the exporters by default. This would prevent the loss of the main pod from also bringing down the exporter, and add more flexibility for multi-device virtual benches in the future.

mangelajo · 2026-06-04T14:41:49Z

+
+- [ ] `AndroidExporterPool` CRD and reconciler
+- [ ] Provider authoring guide documenting how to add a new `*ExporterPool`
+


Future ideas:

Priority selectors : I want "this", if not available "this other thing" ... otherwise ....

Should this be admin configured or user configured?

This is probably related to Device Classes

@mangelajo Humm, yeah we'll need to think about this if this is related to a DeviceClass or ExporterClass since the Exporter is the primary scheduling unit in this proposal.

New JEP files not listed in any toctree cause Sphinx build warnings, which fail the check-warnings CI job. Co-authored-by: Cursor <cursoragent@cursor.com>

kirkbrauer

Thanks for putting this together, @mangelajo — the warm-pool + autoscaling direction is great and the design decisions are well argued. I've left a set of inline suggestions exploring whether we can lean harder on native Kubernetes primitives (so the proposal reads like standard k8s to cluster admins) and make the provider/device model more extensible.

The throughline of the inline comments:

Orchestration → ExporterSet (a ReplicaSet+HPA analog): selector + an embedded template, HPA/PDB scaling vocab (minReplicas/maxReplicas/minAvailableReplicas), the scale subresource, and Deployment-style status. Reuses the existing lease-selector→Exporter-label matching unchanged.
Separate the device from the exporter: keep Exporter as the minimum leased unit (Pod analog); move the provider-typed device into a first-class VirtualTarget.
CSI-style class + claim: a VirtualTargetClass (StorageClass analog) with a pluggable provisioner (k8s container / EC2 / Corellium / a vendor cloud-device API), an inline credentialsSecretRef, bindingMode (warm vs provision-on-lease), reclaimPolicy, and node scheduling; the typed *VirtualTarget is the claim.
Endorse per-provisioner, backend-aware autoscaling — the only ask is a consistent scaling API across provisioners.
Packaging: exporter as a native sidecar + an independent runtime container + the OS image as an OCI artifact (image volume); drivers attach over standard interfaces (serial/SPI/CAN/GPIO via virtio), reused across physical/virtual.
Node scheduling on the class (arch/KVM/GPU via tolerations + device resources).
A fidelity/cost ladder of multiple classes (software-emulated → cloud virtual device → real hardware), all selectable through one jmp lease.
A few Future Possibilities to keep the model open (cross-node accelerators, a *ProviderConfig for multi-account creds, a realized-instance CRD, an ExporterDeployment rollout tier, multi-target-per-exporter, a universal Target).

To make this concrete rather than abstract, I pushed a worked rewrite of the JEP on a separate branch so we can diff and discuss specifics:

Diff vs this PR: jep-0014-virtual-scalable-exporters...jep-0014-k8s-remodel
Rendered file: https://github.com/jumpstarter-dev/jumpstarter/blob/jep-0014-k8s-remodel/python/docs/source/contributing/jeps/JEP-0014-virtual-scalable-exporters.md

These are suggestions for discussion, not blockers — happy to scope any of them down to Future Possibilities.

🤖 This review summary and the inline suggestions were drafted with AI assistance (Claude) and reviewed by me (the PR reviewer) before posting.

kirkbrauer · 2026-06-12T13:03:27Z

+
+## Abstract
+
+This JEP proposes a Virtual Scalable Exporter subsystem for Jumpstarter that


We might want to specify that the thing that is really scaling here is not the exporter per se, but actually the targets themselves. For example, one exporter may have multiple targets, but I do understand this from the perspective of the "exporter" being the basic unit of scheduling, such as a Pod in k8s.

kirkbrauer · 2026-06-12T16:08:45Z

+target definition declares scaling parameters that let administrators tune the
+trade-off between instant availability and resource consumption.
+
+### Core Concept: Managed Pools with Scaling


Suggestion: consider re-modeling the pool on native Kubernetes workload primitives so it reads like standard k8s to cluster admins. Replace the provider-typed *ExporterPool with a generic ExporterSet (a ReplicaSet+HPA analog): spec.selector + an inline spec.template, HPA/PDB scaling vocab (minReplicas/maxReplicas/minAvailableReplicas), Deployment-style status (replicas/readyReplicas/availableReplicas/leasedReplicas), and the scale subresource. This reuses the existing lease-selector→Exporter-label matching unchanged and lets kubectl scale/HPA/KEDA interoperate.

kind: ExporterSet   # ≈ ReplicaSet + HPA
spec:
  minReplicas: 0
  maxReplicas: 20
  minAvailableReplicas: 2   # PDB-style warm buffer (ready & unleased)
  selector:
    matchLabels:
      board: rpi4
  template:   # embedded template (Deployment idiom)
    metadata:
      labels:
        board: rpi4
        virtual: "true"
    spec:
      drivers:
        - type: jumpstarter_driver_power.driver.QemuPower
        # ...
status:   # Deployment-style counters
  replicas: 5
  readyReplicas: 3
  availableReplicas: 3   # warm
  leasedReplicas: 2
# scale subresource: specReplicasPath=.spec.maxReplicas

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:46Z

+    storage: 16Gi
+
+  # Exporter template (drivers exposed by each instance)
+  exporterTemplate:


Suggestion: rather than the pool being the provider-typed device, separate the device into a first-class VirtualTarget, keeping Exporter as the minimum leased unit (Pod-equivalent). The drivers/exporterTemplate here would move under a typed *VirtualTarget (e.g. QEMUVirtualTarget) the Exporter owns. Keeps the lease flow unchanged + unified for physical/virtual, localizes provider typing, and leaves room for one Exporter to expose multiple VirtualTargets later (multi-device benches).

ExporterSet (generic)              ~ ReplicaSet + HPA
 └ Exporter (leasable; a Pod)      ~ Pod   ← minimum leased unit
    └ QEMUVirtualTarget            ~ the device (provider-typed)

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:46Z

+
+  # Corellium-specific configuration
+  apiHost: app.corellium.com
+  apiCredentialsSecret: corellium-api-credentials  # Secret with keys: token


Suggestion: instead of inlining credentials/provisioning per pool, adopt the CSI StorageClass/PVC pattern. A cluster-scoped VirtualTargetClass holds the provisioner, an inline credentialsSecretRef, opaque parameters (apiHost/projectId/region), bindingMode (Immediate=warm vs WaitForFirstConsumer=provision-on-lease), and reclaimPolicy; the typed *VirtualTarget is the claim naming the class. Admins own classes + secrets; claim authors never touch credentials — like a PVC naming a StorageClass.

# cluster-scoped (StorageClass analog); admins own it
kind: VirtualTargetClass
metadata:
  name: corellium-kronos
spec:
  provisioner: corellium.jumpstarter.dev
  credentialsSecretRef:
    name: corellium-creds
    namespace: jumpstarter
  parameters:
    apiHost: app.corellium.com
    projectId: "778f..."
  bindingMode: WaitForFirstConsumer   # Immediate = pre-warmed pool
  reclaimPolicy: Delete
---
# the typed claim just names the class (PVC analog); no creds here
kind: CorelliumVirtualTarget
spec:
  virtualTargetClassName: corellium-kronos
  deviceFlavor: kronos

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:47Z

+Deployment manifest pointing to the same image with a different flag — no new
+image build required.
+
+### DD-3: CRD per provider vs. generic CRD


Suggestion: a cleaner framing than "CRD per provider vs generic" (borrowed from CSI/CRI/Cluster-API): keep orchestration generic and make the device backend pluggable via a provisioner named on the class. The typed *VirtualTarget stays strongly-typed (your DD-3 win), while one provisioner string selects the backend — k8s container, EC2 instance, Corellium/REST, a vendor cloud-device API — all behind one interface, so the Exporter/lease experience is identical regardless of where the device runs. New backends add a claim kind + a provisioner, no pool-tier changes.

VirtualTargetClass.provisioner →
  qemu.jumpstarter.dev        → k8s container (+ OS OCI image volume)
  ec2.jumpstarter.dev         → AWS API
  corellium.jumpstarter.dev   → Corellium REST API
# one typed *VirtualTarget claim interface; backend is pluggable

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:50Z

+  maxTotalInstances: 20       # Scale up to 20 under load
+
+  # Node scheduling (shared across all pool CRDs, optional)
+  nodeSelector:


Suggestion: node scheduling (this nodeSelector) is really a property of the backend, so consider folding it into the VirtualTargetClass as a scheduling block — nodeSelector/nodeAffinity plus tolerations (tainted KVM/GPU/baremetal/ARM nodes) and device resource requests (kubernetes.io/arch, devices.kubevirt.io/kvm, nvidia.com/gpu). CSI precedent: StorageClass.allowedTopologies. The rendered Pod inherits it, with optional per-ExporterSet override.

kind: VirtualTargetClass
spec:
  scheduling:   # inherited by the rendered exporter Pod
    nodeSelector:
      kubernetes.io/arch: arm64
    tolerations:
      - key: jumpstarter.dev/kvm
        operator: Exists
        effect: NoSchedule
    resources:
      limits:
        devices.kubevirt.io/kvm: "1"   # or nvidia.com/gpu

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:51Z

+   `status.leaseRef` to remain empty).
+3. Pool controller deletes the Pod and Exporter CR.
+
+### Hardware Considerations


Suggestion: add a concrete fidelity/cost ladder showing one logical target served by multiple classes — a container-backed software emulator/simulator (cheap, CI-scale), an API/cloud-backed virtual device (higher fidelity, metered), and real hardware (full fidelity, scarce) — all selected via labels through one jmp lease. For example, a target that needs a GPU or specialized I/O device: a software-emulated class runs functional checks cheaply in CI, a cloud-backed virtual device adds higher fidelity, and real hardware is the ground truth. This illustrates why keeping the VirtualTarget/class abstraction generic pays off across fidelity tiers.

One logical target, selected via labels through jmp lease:

class (provisioner)	fidelity	scale/cost	role
container sim (qemu)	low	cheap / CI	functional checks
cloud virtual device (api)	high	metered	higher-fidelity behavior
real hardware (exporter)	full	scarce	ground truth

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:08:52Z

+  input, so they naturally do not scale up for future-dated leases until the
+  controller makes them effective.
+
+## Future Possibilities


Suggestion: list a few forward-compat items so the model stays open to them: disaggregated/cross-node accelerators (ARM64 runtime bridged to a remote GPU via virtio-gpu/RDMA), a separate reusable *ProviderConfig CRD (multi-account credential reuse/rotation), a realized-instance CRD (PV analog) for static/pre-provisioned devices, an ExporterDeployment rollout tier (Deployment analog), multiple/spawned-on-lease VirtualTargets per Exporter, and a universal physical+virtual Target abstraction.

🤖 Drafted with AI assistance (Claude) and reviewed by the PR reviewer before posting.

kirkbrauer · 2026-06-12T16:10:42Z

+for scalable testing. However:
+
+- Virtual targets must faithfully emulate the interfaces exposed by physical
+  hardware (serial, network, storage, power) through the existing driver model.


@raballew I think if we can use virtio here, we might be able to get very high fidelity for guests that run natively on virtio targets and we can avoid privileged Pods in this case :)

kirkbrauer · 2026-06-12T16:11:42Z

+
+- [ ] `AndroidExporterPool` CRD and reconciler
+- [ ] Provider authoring guide documenting how to add a new `*ExporterPool`
+


@mangelajo Humm, yeah we'll need to think about this if this is related to a DeviceClass or ExporterClass since the Exporter is the primary scheduling unit in this proposal.

mangelajo force-pushed the jep-0014-virtual-scalable-exporters branch from 8928e29 to 08d970b Compare June 3, 2026 17:03

coderabbitai Bot reviewed Jun 3, 2026


raballew reviewed Jun 3, 2026


mangelajo and others added 2 commits June 4, 2026 10:37

raballew reviewed Jun 4, 2026


mangelajo commented Jun 4, 2026


docs(jeps): re-add toctree to fix Sphinx warning for new JEP files

0c8b743

New JEP files not listed in any toctree cause Sphinx build warnings, which fail the check-warnings CI job. Co-authored-by: Cursor <cursoragent@cursor.com>

kirkbrauer reviewed Jun 12, 2026
