| # PIP-414: Enforce topic consistency check |
| |
| ## Background Knowledge |
| |
| In Apache Pulsar, topics can be either non-partitioned or partitioned. A non-partitioned topic handles all messages on a |
| single topic instance, while a partitioned topic is logically split across multiple partitions for parallelism and |
| scalability. Partitioned topic is backed by multiple internal non-partitioned topics, one per partition (e.g., |
| `topic-partition-0`, `topic-partition-1`, etc.). |
| |
| ## Motivation |
| |
| Apache Pulsar allows clients to produce to or consume from a fully qualified partitioned topic name (e.g., |
| `public/default/test-partitioned-topic-partition-N`), even if the partitioned topic metadata does not yet exist. If topic |
| auto-creation is enabled and `allowAutoTopicCreationType=non-partitioned` is set, the broker will automatically create a |
| non-partitioned topic for that partition name. |
| |
| This behavior can lead to inconsistencies: users expecting to interact with a partitioned topic may unknowingly create |
| non-partitioned topics for individual partitions. As a result, listing topics may yield unexpected structures, with |
| partitioned and non-partitioned topics mixed under similar names. |
| |
| While this issue can also occur in GEO-replicated environments due to asynchronous metadata propagation, the core |
| problem is the lack of topic validation. Disabling this behavior and enforcing |
| strict topic-type consistency is essential to ensure clarity, prevent accidental misconfigurations, and maintain the |
| integrity of topic structures in Pulsar. |
| |
| ## Goals |
| |
| ### In Scope |
| |
| - Introduce validation to enforce consistency between topic metadata and topic type. |
| - Prevent producing or consuming from a topic when metadata indicates a mismatch. |
| |
| ## High Level Design |
| |
| The proposed solution adds server-side validation to ensure that the topic type matches the metadata. |
| |
| - When accessing a non-partitioned topic, the broker checks that no partition metadata exists(which shouldn't be there). |
| - When accessing a partitioned topic (via a `-partition-N` name), the broker ensures that valid partition metadata |
| exists and that the topic is indeed part of a partitioned group. |
| |
| - If these checks fail, the broker rejects the client request with an appropriate error. |
| |
| ## Detailed Design |
| |
| ### Design & Implementation Details |
| |
| - Validation is added when loading the topic. |
| - If a mismatch is detected: |
| - Non-partitioned topic with partition metadata → reject request. |
| - Partitioned topic (e.g., `-partition-0`) with missing metadata → reject request. |
| |
| ### Public-facing Changes |
| |
| ## Monitoring |
| |
| Operators should monitor for client/broker errors indicating rejected produce/consume requests due to topic-type mismatches. |
| These errors help identify misconfigured clients. |
| |
| ## Backward & Forward Compatibility |
| |
| ### Upgrade |
| |
| No special upgrade procedure is required. Once upgraded, clients/brokers attempting invalid topic-type interactions will |
| receive an error. |
| |
| If you encounter the error `Partition metadata not found for the partitioned topic`, you can recreate the metadata using |
| the following command: |
| |
| ``` |
| bin/pulsar-admin topics create-partitioned-topic $TOPIC -p $PARITIONS |
| ``` |
| |
| This behavior is supported by [#24225](https://github.com/apache/pulsar/pull/24225). |
| |
| ### Downgrade / Rollback |
| |
| On downgrade, previous behavior resumes: mismatched topic types may be silently created or accessed. Operators should |
| monitor carefully if a downgrade is necessary. |
| |
| ### Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations |
| |
| The old version may allow inconsistent topic-type usage while newer ones enforce validation. |
| |
| # Alternatives |
| |
| @BewareMyPower proposed adding a configuration option that `validatePartitionMetadataConsistency` to allow enabling or |
| disabling this validation behavior. While this would provide flexibility, we believe it is not the right direction. |
| |
| Pulsar already has an extensive set of configuration parameters. Introducing more options for narrowly scoped behaviors |
| increases maintenance overhead and adds to user confusion. Since topic-type consistency is fundamental to correct Pulsar |
| usage, this validation should be enforced by default, not made optional. |
| |
| ## General Notes |
| |
| This change increases Pulsar robustness and predictability by enforcing stricter topic-type expectations. |
| |
| ## Links |
| |
| * Implement PR: https://github.com/apache/pulsar/pull/24118 |
| * Mailing List discussion thread: https://lists.apache.org/thread/v7v889qzfjpqznm713wvogmx2bs4pkkb |
| * Mailing List voting thread: https://lists.apache.org/thread/92zpyqwbrfbjrsxbh1dsqksn65scsc5m |