| # PIP-335: Support Oxia metadata store plugin |
| |
| # Motivation |
| |
| Oxia is a scalable metadata store and coordination system that can be used as the core infrastructure |
| to build large-scale distributed systems. |
| |
| Oxia was created with the primary goal of giving Pulsar an alternative to ZooKeeper as its long-term |
| preferred metadata store, one that overcomes the current limitations on metadata access throughput |
| and data set size. |
| |
| # Goals |
| |
| Add a Pulsar MetadataStore plugin that uses the Oxia client SDK. |
| |
| Users will be able to start a Pulsar cluster using just Oxia, without any ZooKeeper involved. |
| |
| ## Not in Scope |
| |
| It's not in the scope of this proposal to change any default behavior or configuration of Pulsar. |
| |
| # Detailed Design |
| |
| ## Design & Implementation Details |
| |
| The Oxia semantics and client SDK were designed with Pulsar and the MetadataStore plugin API in mind, so |
| little integration work is needed here. |
| |
| A few notes: |
| 1. The Oxia client already provides transparent batching of read and write operations, |
| so the plugin will not use the batching logic in `AbstractBatchedMetadataStore`. |
| 2. Oxia does not expose keys through a walkable, file-system-like interface of directories and files. |
| Instead, all keys are independent. Oxia nevertheless sorts keys in a '/'-aware order and provides |
| efficient key range scans to identify the first-level children of a given key. |
| 3. Oxia, unlike ZooKeeper, doesn't require the parent path of a key to exist, e.g. we can create the |
| `/a/b/c` key without `/a/b` and `/a` existing. |
| In the Pulsar integration for Oxia, we force the creation of all parent keys that are missing. This |
| is needed because several places in the BookKeeper access code do not create the parent keys but |
| later call `getChildren()` on the parents. |
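
Notes 2 and 3 can be illustrated with a small in-memory sketch. It is not the Oxia client API or the actual plugin code: `FlatKeyStoreSketch`, `parentsOf`, `put`, `exists` and `getChildren` are hypothetical names, a `TreeMap` with plain lexicographic ordering stands in for Oxia's '/'-aware sorting, and storing an empty value for auto-created parents is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** In-memory stand-in for a flat, sorted key space. Hypothetical; not the Oxia client API. */
class FlatKeyStoreSketch {
    // All keys live side by side: no key is a "directory" that contains another key.
    private final TreeMap<String, byte[]> keys = new TreeMap<>();

    /** Ancestors of a key, outermost first: "/a/b/c" -> ["/a", "/a/b"]. */
    static List<String> parentsOf(String key) {
        List<String> parents = new ArrayList<>();
        int idx = key.indexOf('/', 1);
        while (idx > 0) {
            parents.add(key.substring(0, idx));
            idx = key.indexOf('/', idx + 1);
        }
        return parents;
    }

    /** Writes the key and creates every missing ancestor (assumed here to hold an empty value). */
    void put(String key, byte[] value) {
        for (String parent : parentsOf(key)) {
            keys.putIfAbsent(parent, new byte[0]);
        }
        keys.put(key, value);
    }

    boolean exists(String key) {
        return keys.containsKey(key);
    }

    /** First-level children of a key, found by scanning the sorted range that starts at "parent/". */
    List<String> getChildren(String parent) {
        String prefix = parent.endsWith("/") ? parent : parent + "/";
        List<String> children = new ArrayList<>();
        for (String key : keys.tailMap(prefix, false).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // left the contiguous range of descendants
            }
            String name = key.substring(prefix.length());
            if (name.indexOf('/') < 0) {
                children.add(name); // keep direct children, skip deeper descendants
            }
        }
        return children;
    }
}
```

With this sketch, a single `put("/a/b/c", value)` also makes `/a` and `/a/b` exist, so a later `getChildren("/a")` returns `b` even though the caller never created the parents explicitly.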
| |
| ## Other notes |
| |
| Unlike in the ZooKeeper implementation, the notification of events is guaranteed in Oxia, because the Oxia |
| client SDK will use the transaction offset after server reconnections and session restart events. This |
| will ensure that broker caches are always properly invalidated. We will then be able to remove the |
| current 5-minute automatic cache refresh, which is in place to guard against the ZooKeeper missed-watch issue. |
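
The idea of resuming notifications from an offset can be sketched with a toy model. This is only an illustration of the general technique under stated assumptions, not how the Oxia client SDK is implemented: `NotificationReplaySketch`, `ChangeLog`, `CachingClient` and `catchUp` are hypothetical names, and the "transaction offset" is modelled as a plain index into an in-memory list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of offset-based notification replay; not the Oxia client API. */
class NotificationReplaySketch {

    /** Server side: an append-only log of changed keys; the list index plays the role of the offset. */
    static final class ChangeLog {
        private final List<String> changedKeys = new ArrayList<>();

        void append(String key) {
            changedKeys.add(key);
        }

        /** Returns every change recorded at or after the given offset. */
        List<String> readFrom(int offset) {
            return new ArrayList<>(changedKeys.subList(offset, changedKeys.size()));
        }
    }

    /** Client side: a cache that remembers the next offset it still has to consume. */
    static final class CachingClient {
        final Map<String, String> cache = new HashMap<>();
        private int nextOffset = 0;

        /** Called while connected and again after every reconnect or session restart. */
        void catchUp(ChangeLog log) {
            for (String key : log.readFrom(nextOffset)) {
                cache.remove(key); // invalidate the stale entry
                nextOffset++;      // remember progress so nothing is skipped or replayed
            }
        }
    }
}
```

Because the client resumes from its own offset instead of re-arming a one-shot watch, a change that happens while it is disconnected is still delivered on the next `catchUp` call, which is why a periodic full cache refresh is no longer needed in this model.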
| |
| # Links |
| |
| Oxia: https://github.com/streamnative/oxia |
| Oxia Java Client SDK: https://github.com/streamnative/oxia-java |