[feat][pip] PIP-454: Metadata Store Migration Framework #25196
Conversation
> **Broker/Bookie actions (automatic, triggered by watching the flag):**
> 1. Detect migration flag via watch on `/pulsar/migration-coordinator/migration`
> 2. Defer non-critical metadata writes (e.g., ledger rollovers, bundle ownership changes)
Would this use the existing solution? A `SessionEvent.ConnectionLost`/`SessionEvent.SessionLost` event sets a flag `metadataServiceAvailable` that is used for this purpose in many locations.
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java
Lines 148 to 152 in d630394
```java
/**
 * Keep a flag to indicate whether we're currently connected to the metadata service.
 */
@Getter
private boolean metadataServiceAvailable;
```
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java
Lines 288 to 291 in 1617bb2
```java
private synchronized void handleMetadataStoreNotification(SessionEvent e) {
    log.info("Received MetadataStore session event: {}", e);
    metadataServiceAvailable = e.isConnected();
}
```
It seems that currently ledger trimming, ledger rollover, and load balancer load shedding use the `metadataServiceAvailable` flag in `ManagedLedgerFactoryImpl`.
There's also a direct dependency on the event:
Lines 1137 to 1150 in 38807b1
```java
// Do not attempt to write if not connected
if (lastMetadataSessionEvent != null
        && lastMetadataSessionEvent.isConnected()
        && (needBrokerDataUpdate() || force)) {
    localData.setLastUpdate(System.currentTimeMillis());
    brokerDataLock.updateValue(localData).join();
    // Clear deltas.
    localData.cleanDeltas();
    // Update previous data.
    lastData.update(localData);
}
```
Would the coordinator send a `SessionEvent.ConnectionLost` event when migration starts, so that it remains compatible with the existing solution?
`AbstractMetadataStore` also has an `isConnected` flag which could be useful; it's not currently used within Pulsar except for metadata store caching decisions. I guess it would be necessary to skip cache refreshes while the migration is ongoing.
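For illustration, here is a minimal sketch of the idea raised above: funneling the migration flag into the same `SessionEvent` code paths. The `MigrationFlagWatcher` class is hypothetical; only the `/pulsar/migration-coordinator/migration` path comes from the PIP text, while the listener and `SessionEvent` types are the existing Pulsar metadata APIs.

```java
import java.util.function.Consumer;
import org.apache.pulsar.metadata.api.Notification;
import org.apache.pulsar.metadata.api.extended.MetadataStoreExtended;
import org.apache.pulsar.metadata.api.extended.SessionEvent;

// Hypothetical sketch: translate the coordinator's migration flag into the
// same handler that ManagedLedgerFactoryImpl already uses for session events.
class MigrationFlagWatcher {
    private static final String MIGRATION_FLAG = "/pulsar/migration-coordinator/migration";

    MigrationFlagWatcher(MetadataStoreExtended store, Consumer<SessionEvent> sessionEventHandler) {
        // MetadataStore listeners receive a Notification for each change.
        store.registerListener((Notification n) -> {
            if (!MIGRATION_FLAG.equals(n.getPath())) {
                return;
            }
            switch (n.getType()) {
                case Created:
                case Modified:
                    // Migration started: report the store as unavailable so that
                    // non-critical writes (rollover, shedding, ...) are deferred.
                    sessionEventHandler.accept(SessionEvent.ConnectionLost);
                    break;
                case Deleted:
                    // Migration finished: resume normal operation.
                    sessionEventHandler.accept(SessionEvent.Reconnected);
                    break;
                default:
                    break;
            }
        });
    }
}
```

Whether reusing `SessionEvent` or introducing a dedicated flag, routing through one existing code path would avoid duplicating the defer-writes logic across components.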
> Each broker and bookie registers itself as a migration participant by creating a sequential ephemeral node:
> - Path: `/pulsar/migration-coordinator/participants/id-NNNN` (sequential)
> - This allows the coordinator to know how many participants exist before migration starts
Which node is selected as the migration coordinator?
If it's a broker, what happens if the migration takes a lot more memory than the broker usually does and causes an OOME?
Would it be possible to deploy a dedicated coordinator, or run the coordinator in its own process, say in a pod with sufficient resources, within a Pulsar cluster?
Some additional questions:
Does the same single coordinator instance migrate both Pulsar and BookKeeper metadata? Why does the coordinator need to know how many participants exist?
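As a side note, the registration quoted above maps directly onto the existing `MetadataStoreExtended` API. A minimal sketch follows; the `MigrationParticipant` helper is hypothetical, and only the path prefix comes from the quoted design:

```java
import java.nio.charset.StandardCharsets;
import java.util.EnumSet;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.metadata.api.Stat;
import org.apache.pulsar.metadata.api.extended.CreateOption;
import org.apache.pulsar.metadata.api.extended.MetadataStoreExtended;

// Hypothetical sketch: register a broker/bookie as a migration participant.
class MigrationParticipant {
    private static final String PREFIX = "/pulsar/migration-coordinator/participants/id-";

    static CompletableFuture<Stat> register(MetadataStoreExtended store, String participantInfo) {
        // Sequential: the store appends a monotonically increasing suffix (id-NNNN).
        // Ephemeral: the node is removed automatically if the participant's session
        // is lost, so the coordinator's count of participants stays accurate.
        return store.put(PREFIX,
                participantInfo.getBytes(StandardCharsets.UTF_8),
                Optional.empty(),
                EnumSet.of(CreateOption.Ephemeral, CreateOption.Sequential));
    }
}
```

The returned `Stat` carries the actual path with the generated suffix; the coordinator can list the children of the parent node to count participants.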
lhotari left a comment:
LGTM, just a few comments / questions.
eolivelli left a comment:
A couple of questions:
- Is it possible to also migrate back from Oxia to ZooKeeper?
- This PIP is not about the metadata store used to coordinate geo-replication; should we state that explicitly?
Good questions.
While the mechanism can be quite generic, the tricky part in making it work seamlessly is carrying over the same version ids on each metadata record. That allows brokers' and bookies' conditional writes to not fail during the migration. In Oxia we were able to add an explicit override for this specific purpose. With ZooKeeper it's unfortunately not as easy.
Good point. I'll highlight this more.
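To make the version-id point concrete, here is a minimal sketch of the conditional-write pattern the reply refers to (the helper class is illustrative; `get` and `put` with an expected version are the existing `MetadataStore` API):

```java
import java.util.Optional;
import org.apache.pulsar.metadata.api.MetadataStore;
import org.apache.pulsar.metadata.api.Stat;

// Illustrative sketch of why migrated records must keep their version ids:
// brokers and bookies do compare-and-set writes against the version they read.
class ConditionalWriteExample {
    static void update(MetadataStore store, String path, byte[] newValue) {
        store.get(path).thenCompose(optResult -> {
            Stat stat = optResult.orElseThrow().getStat();
            // If the migrated store re-numbered versions, this expected-version
            // put would fail (BadVersionException) after cutover.
            return store.put(path, newValue, Optional.of(stat.getVersion()));
        }).join();
    }
}
```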