Conversation

@merlimat
Contributor

Motivation

Modifications

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick one of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed label on Jan 29, 2026

**Broker/Bookie actions (automatic, triggered by watching the flag):**
1. Detect migration flag via watch on `/pulsar/migration-coordinator/migration`
2. Defer non-critical metadata writes (e.g., ledger rollovers, bundle ownership changes)
Member


Would this use the existing solution? A SessionEvent.ConnectionLost/SessionEvent.SessionLost event sets the metadataServiceAvailable flag, which is used for this purpose in many locations.

```java
/**
 * Keep a flag to indicate whether we're currently connected to the metadata service.
 */
@Getter
private boolean metadataServiceAvailable;

private synchronized void handleMetadataStoreNotification(SessionEvent e) {
    log.info("Received MetadataStore session event: {}", e);
    metadataServiceAvailable = e.isConnected();
}
```

It seems that currently ledger trimming, ledger rollover and load balancer load shedding are using the metadataServiceAvailable flag in ManagedLedgerFactoryImpl.

There's also a dependency on the event directly:

```java
// Do not attempt to write if not connected
if (lastMetadataSessionEvent != null
        && lastMetadataSessionEvent.isConnected()
        && (needBrokerDataUpdate() || force)) {
    localData.setLastUpdate(System.currentTimeMillis());
    brokerDataLock.updateValue(localData).join();
    // Clear deltas.
    localData.cleanDeltas();
    // Update previous data.
    lastData.update(localData);
}
```

Would the coordinator send a SessionEvent.ConnectionLost event when the migration starts, so that it remains compatible with the existing solution?

AbstractMetadataStore has an isConnected flag which could also be useful? It's not currently used within Pulsar, just for metadata store caching decisions. I guess it would be necessary to skip cache refreshes while the migration is ongoing.
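
As a rough illustration of that idea, here is a minimal sketch of a bridge that translates migration-flag changes into the existing SessionEvent plumbing. The class and the onMigrationFlagChanged callback are hypothetical names, not existing Pulsar APIs; SessionEvent and the flag handling are taken from the snippets above:

```java
import org.apache.pulsar.metadata.api.extended.SessionEvent;

// Hypothetical bridge: treat an active migration like a lost connection, so
// existing consumers of the flag (ledger trimming, ledger rollover, load
// shedding) defer their metadata writes without any changes on their side.
class MigrationAwareSessionHandler {

    private volatile boolean metadataServiceAvailable = true;

    // Invoked by a (hypothetical) watch on /pulsar/migration-coordinator/migration
    void onMigrationFlagChanged(boolean migrationInProgress) {
        handleMetadataStoreNotification(
                migrationInProgress ? SessionEvent.ConnectionLost : SessionEvent.Reconnected);
    }

    private synchronized void handleMetadataStoreNotification(SessionEvent e) {
        metadataServiceAvailable = e.isConnected();
    }
}
```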

Comment on lines +91 to +93
Each broker and bookie registers itself as a migration participant by creating a sequential ephemeral node:
- Path: `/pulsar/migration-coordinator/participants/id-NNNN` (sequential)
- This allows the coordinator to know how many participants exist before migration starts
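
For illustration, a minimal sketch of that registration step, assuming direct ZooKeeper access; the path comes from the PIP text above, while the method name and payload are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Each broker/bookie calls this once at startup; the returned path includes
// the sequence number appended by ZooKeeper (e.g. .../id-0000000001).
String registerParticipant(ZooKeeper zk, String participantInfo) throws Exception {
    return zk.create("/pulsar/migration-coordinator/participants/id-",
            participantInfo.getBytes(StandardCharsets.UTF_8),
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            // Ephemeral: the node vanishes if the participant's session dies.
            // Sequential: ZooKeeper appends a monotonically increasing suffix.
            CreateMode.EPHEMERAL_SEQUENTIAL);
}
```

The coordinator can then count participants with a plain getChildren() on the parent path before starting the migration.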
Member


Which node is selected as the migration coordinator?
If it's a broker, what happens if the migration takes a lot more memory than the broker usually uses and causes an OOME?
Would it be possible to deploy a dedicated coordinator, or to run the coordinator in-process, say in a pod with sufficient resources, running in a Pulsar cluster?

Member


Some additional questions:
Does the same single coordinator instance migrate both Pulsar and BookKeeper metadata? Why does the coordinator need to know how many participants exist?

Member

@lhotari lhotari left a comment


LGTM, just a few comments / questions.

@Technoboy- Technoboy- added this to the 4.2.0 milestone Feb 2, 2026
Contributor

@eolivelli eolivelli left a comment


A couple of questions:

  1. Is it possible to also migrate back from Oxia to ZooKeeper?
  2. This PIP is not about the metadata store used to coordinate geo-replication; should we state that explicitly?

@merlimat
Contributor Author

merlimat commented Feb 3, 2026

Good questions.

  1. Is it possible to also migrate back from Oxia to ZooKeeper?

While the mechanism can be quite generic, the tricky part in making it work seamlessly is carrying over the same version ids on each metadata record. That is what allows the conditional writes from brokers and bookies to not fail during the migration.

In Oxia we were able to add an explicit override for this specific purpose. With ZK it's unfortunately not as easy.
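
For context, a sketch of the kind of conditional write that would break if version ids were not carried over. The path, values and method name are illustrative; put() with an expectedVersion is the compare-and-set from Pulsar's MetadataStore API:

```java
import java.util.Optional;
import org.apache.pulsar.metadata.api.MetadataStore;

// A broker that read version N before the migration expects its CAS to
// succeed afterwards. If the target store re-assigned versions during the
// copy, this call would fail with a BadVersionException even though no
// concurrent writer exists.
void conditionalUpdate(MetadataStore store, byte[] newValue, long versionReadBeforeMigration) {
    store.put("/managed-ledgers/my-tenant/my-ns/persistent/my-topic",
            newValue, Optional.of(versionReadBeforeMigration)).join();
}
```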

  2. This PIP is not about the metadata store used to coordinate geo-replication; should we state that explicitly?

Good point. I'll highlight this more.


Labels

doc-not-needed · PIP · ready-to-test
