Pulsar Improvement Proposal (PIP)

What is a PIP?

The PIP is a “Pulsar Improvement Proposal” and it's the mechanism used to propose changes to the Apache Pulsar codebases.

The changes might be new features, large code refactorings, or changes to APIs.

In practical terms, the PIP defines a process in which developers can submit a design doc, receive feedback and get the “go ahead” to execute.

What is the goal of a PIP?

There are several goals for the PIP process:

  1. Ensure community technical discussion of major changes to the Apache Pulsar codebase.

  2. Provide clear and thorough design documentation of the proposed changes. Make sure every Pulsar developer will have enough context to effectively perform a code review of the Pull Requests.

  3. Use the PIP document to serve as the baseline on which to create the documentation for the new feature.

  4. Apply greater scrutiny to changes that affect the public APIs (as defined below), to reduce the chance of introducing breaking changes or APIs that do not express ideal semantics.

It is not a goal of the PIP process to add undue process or slow down development.

When is a PIP required?

  • Any new feature for Pulsar brokers or client
  • Any change to the public APIs (Client APIs, REST APIs, Plugin APIs)
  • Any change to the wire protocol APIs
  • Any change to the API of Pulsar CLI tools (e.g., new options)
  • Any change to the semantics of existing functionality, even when the current behavior is incorrect
  • Any large code change that will touch multiple components
  • Any changes to the metrics (metrics endpoint, topic stats, topics internal stats, broker stats, etc.)
  • Any change to the configuration

When is a PIP not required?

  • Bug-fixes
  • Simple enhancements that won't affect the APIs or the semantics
  • Small documentation changes
  • Small website changes
  • Build scripts changes (except: a complete rewrite)

Who can create a PIP?

Any person willing to contribute to the Apache Pulsar project is welcome to create a PIP.

How does the PIP process work?

A PIP proposal can be in these states:

  1. DRAFT: (Optional) This might be used for contributors to collaborate and to seek feedback on an incomplete version of the proposal.

  2. DISCUSSION: The proposal has been submitted to the community for feedback and approval.

  3. ACCEPTED: The proposal has been accepted by the Pulsar project.

  4. REJECTED: The proposal has not been accepted by the Pulsar project.

  5. IMPLEMENTED: The implementation of the proposed changes has been completed and everything has been merged.

  6. RELEASED: The proposed changes have been included in an official Apache Pulsar release.
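
The states above can be sketched as a small state machine. This is only an illustration: the `PipState` enum and `can_move` helper are hypothetical names, and the allowed transitions are an assumption inferred from the order of the list, not something the PIP process defines explicitly.

```python
from enum import Enum


class PipState(Enum):
    DRAFT = "DRAFT"
    DISCUSSION = "DISCUSSION"
    ACCEPTED = "ACCEPTED"
    REJECTED = "REJECTED"
    IMPLEMENTED = "IMPLEMENTED"
    RELEASED = "RELEASED"


# Assumed transitions, inferred from the order of the states listed above.
# The PIP process itself does not spell these out.
ALLOWED_TRANSITIONS = {
    PipState.DRAFT: {PipState.DISCUSSION},
    PipState.DISCUSSION: {PipState.ACCEPTED, PipState.REJECTED},
    PipState.ACCEPTED: {PipState.IMPLEMENTED},
    PipState.REJECTED: set(),
    PipState.IMPLEMENTED: {PipState.RELEASED},
    PipState.RELEASED: set(),
}


def can_move(current: PipState, target: PipState) -> bool:
    """Return True if the assumed lifecycle allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Since DRAFT is optional, a proposal may also start directly in DISCUSSION; the sketch only models moves between states, not the starting point.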

The process works in the following way:

  1. Fork the https://github.com/apache/pulsar repository (using the Fork button on GitHub).
  2. Clone the repository and, in it, copy the file pip/TEMPLATE.md and name the copy pip-xxx.md. The number xxx should be the next sequential number after the last contributed PIP. You can view the list of contributed PIPs (in any status) as a list of Pull Requests having a “PIP” label. Use the link here as a shortcut.
  3. Write the proposal following the sections outlined by the template and the explanation for each section in the comment it contains (you can delete the comment once done).
    • If you need diagrams, avoid attaching large files. You can use MermaidJS as a simple language to describe many types of diagrams.
  4. Create a GitHub Pull Request (PR). The PR title should be [improve][pip] PIP-xxx: {title}, where xxx matches the number chosen in the previous step (the file name). Replace {title} with a short title for your proposal. Check again that your number does not collide with another PIP, using the numbering check from step (2).
  5. The author(s) will email the dev@pulsar.apache.org mailing list to kick off a discussion, using the subject prefix [DISCUSS] PIP-xxx: {PIP TITLE}. Discussion of the broader context will happen either on the mailing list or as general comments on the PR. Many of the discussion items will concern a particular aspect of the proposal, so they should be posted as PR comments on the specific lines of the proposal file.
  6. Update the file with a link to the discussion on the mailing list. You can obtain it from Apache Pony Mail.
  7. Based on the discussion and feedback, some changes might be applied by authors to the text of the proposal. They will be applied as extra commits, making it easier to track the changes.
  8. Once some consensus is reached, there will be a vote to formally approve the proposal. The vote will be held on the dev@pulsar.apache.org mailing list, by sending a message using subject [VOTE] PIP-xxx: {PIP TITLE}. Make sure to include a link to the PIP PR in the body of the message. Make sure to update the PIP with a link to the vote. You can obtain it from Apache Pony Mail. Everyone is welcome to vote on the proposal, though only the vote of the PMC members will be considered binding. The requirement is to have at least one binding +1 vote from a lazy majority if no binding -1 votes have been cast on the PIP. The vote should stay open for at least 48 hours.
  9. When the vote is closed, if the outcome is positive, ask a PMC member (using the voting thread on the mailing list) to merge the PR.
  10. If the outcome is negative, please close the PR (with a small comment that the close is a result of a vote).
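
The naming rules in the steps above can be sketched as a few string helpers. This is a minimal illustration, not project tooling: all function names are hypothetical, and the list of numbers already taken is assumed to have been collected by hand from the PRs carrying the “PIP” label (the sketch does not query GitHub).

```python
def next_pip_number(taken_numbers):
    """Next sequential number after the highest PIP number already taken.

    Returning 1 for an empty list is an assumption made for this sketch.
    """
    return max(taken_numbers, default=0) + 1


def pip_file_name(number):
    """File name for the proposal, following the pip-xxx.md convention."""
    return f"pip-{number}.md"


def pr_title(number, title):
    """Title of the pull request that contains the proposal."""
    return f"[improve][pip] PIP-{number}: {title}"


def discuss_subject(number, title):
    """Subject of the discussion email sent to dev@pulsar.apache.org."""
    return f"[DISCUSS] PIP-{number}: {title}"


def vote_subject(number, title):
    """Subject of the vote email sent to dev@pulsar.apache.org."""
    return f"[VOTE] PIP-{number}: {title}"


# Example with made-up input: numbers 434-436 are assumed to be taken.
number = next_pip_number([434, 435, 436])
print(pip_file_name(number))              # pip-437.md
print(pr_title(number, "Example title"))  # [improve][pip] PIP-437: Example title
```

Because another contributor may claim the same number in the meantime, the number still needs to be rechecked when the PR is opened, as step 4 describes.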

All future implementation Pull Requests should reference the PIP-XXX in the commit log message and the PR title. It is advised to create a master GitHub issue to formulate the execution plan and track its progress.

Example

  • Eve ran into some issues with the client metrics - she needed a metric which was missing.
  • She read the code a bit, and had an idea of which metrics she wished to add.
  • She summarized her idea and direction in an email to the DEV mailing list (she located it in the Discussions section on the website).
  • She didn't get any response from the community, so she joined the next community meeting. There Matteo Merli and Asaf helped set up a channel in Slack to brainstorm the idea and meet on Zoom with a few Pulsar contributors (e.g. Lari and Tison).
  • Once Eve had enough context and a good design outline, she opened a new branch in her Pulsar repository, duplicated TEMPLATE.md and created pip-xxx.md (she would pick the number later).
  • She followed the template and submitted the PIP as a new PR to the pulsar repository.
  • Once the PR was created, she updated the PIP number to match the rules described in step 2, in both the PR title and the file name.
  • She sent an email to the DEV mailing list, titled “[DISCUSS] PIP-123: Adding metrics for ...”, briefly described in the email what the PIP was about and gave a link.
  • She got no response from anyone for 2 weeks, so she nudged the people who had helped her brainstorm (e.g. Lari and Tison) and pinged in #dev that she needed more reviewers.
  • Once she got 3 reviews from PMC members and the community had had at least a few days from the moment the PR was announced on DEV, she sent a vote email to the DEV mailing list titled “[VOTE] PIP-123: Adding metrics for ...”.
  • She nudged the reviewers to reply with a binding vote, waited for 2-3 days, and then concluded the vote by sending a reply tallying up the binding and non-binding votes.
  • She updated the PIP with links to the discussion and vote emails, and then asked a PMC member who voted +1 to merge (using a GitHub mention on the PR).
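
The tally Eve sends at the end of the vote can be sketched as follows. The `Vote`, `tally` and `passes` names are hypothetical, and `passes` encodes only one reading of the voting rule described in the process (vote open for at least 48 hours, at least one binding +1, no binding -1); it is not an official implementation of the rule.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # True when the vote was cast by a PMC member


def tally(votes):
    """Count binding and non-binding votes separately, keyed by vote value."""
    result = {
        "binding": {1: 0, 0: 0, -1: 0},
        "non_binding": {1: 0, 0: 0, -1: 0},
    }
    for vote in votes:
        key = "binding" if vote.binding else "non_binding"
        result[key][vote.value] += 1
    return result


def passes(votes, hours_open):
    """Assumed reading of the rule: open >= 48h, >= 1 binding +1, no binding -1."""
    binding = tally(votes)["binding"]
    return hours_open >= 48 and binding[1] >= 1 and binding[-1] == 0
```

Non-binding votes are still counted and reported in the tally, even though under this reading they do not decide the outcome.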

List of PIPs

Current PIPs Table of Contents

The following table lists all current PIPs in this directory, sorted by PIP number:

PIP Number  Title
PIP-1       Pulsar Proxy
PIP-2       Non Persistent topic
PIP-3       Message dispatch throttling
PIP-4       Pulsar End to End Encryption
PIP-5       Event time
PIP-6       Guaranteed Message Deduplication
PIP-7       Pulsar Failure domain and Anti affinity namespaces
PIP-8       Pulsar beyond 1M topics
PIP-9       Adding more Security checks to Pulsar Proxy
PIP-10      Remove cluster for namespace and topic names
PIP-11      Short topic names
PIP-12      Introduce builder for creating Producer Consumer Reader
PIP-13      Subscribe to topics represented by regular expressions
PIP-14      Topic compaction
PIP-15      Pulsar Functions
PIP-16      Pulsar “instance” terminology change
PIP-17      Tiered storage for Pulsar topics
PIP-18      Pulsar Replicator
PIP-19      Pulsar SQL
PIP-20      Mechanism to revoke TLS authentication
PIP-21      Pulsar Edge Component
PIP-22      Pulsar Dead Letter Topic
PIP-23      Message Tracing By Interceptors
PIP-24      Simplify memory settings
PIP-25      Token based authentication
PIP-26      Delayed Message Delivery
PIP-27      Add checklist in github pull request template
PIP-28      Pulsar Proxy Gateway Improvement
PIP-29      One package for both pulsar-client and pulsar-admin
PIP-30      change authentication provider API to support mutual authentication
PIP-31      Transaction Support
PIP-32      Go Function API, Instance and LocalRun
PIP-33      Replicated subscriptions
PIP-34      Add new subscribe type Key_shared
PIP-35      Improve topic lookup for topics that have high number of partitions
PIP-36      Max Message Size
PIP-37      Large message size handling in Pulsar
PIP-38      Batch Receiving Messages
PIP-39      Namespace Change Events
PIP-40      Pulsar Manager
PIP-41      Pluggable Protocol Handler
PIP-42      KoP - Kafka on Pulsar
PIP-43      producer send message with different schema
PIP-44      Separate schema compatibility checker for producer and consumer
PIP-45      Pluggable metadata interface
PIP-46      Next-gen Proxy
PIP-47      Time Based Release Plan
PIP-48      hierarchical admin api
PIP-49      Permission levels and inheritance
PIP-50      Package Management
PIP-51      Tenant policy support
PIP-52      Message dispatch throttling relative to publish rate
PIP-53      Contribute DotPulsar to Apache Pulsar
PIP-54      Support acknowledgement at batch index level
PIP-55      Refresh Authentication Credentials
PIP-56      Python3 Migration
PIP-57      Improve Broker's Zookeeper Session Timeout Handling
PIP-58      Support Consumers Set Custom Retry Delay
PIP-59      gPRC Protocol Handler
PIP-60      Support Proxy server with SNI routing
PIP-61      Advertised multiple addresses
PIP-62      Move connectors, adapters and Pulsar Presto to separate repositories
PIP-63      Readonly Topic Ownership Support
PIP-64      Introduce REST endpoints for producing, consuming and reading messages
PIP-65      Adapting Pulsar IO Sources to support Batch Sources
PIP-66      Pulsar Function Mesh
PIP-67      Pulsarctl - An alternative tools of pulsar-admin
PIP-68      Exclusive Producer
PIP-69      Schema design for Go client
PIP-70      Introduce lightweight broker entry metadata
PIP-71      Pulsar SQL migrate SchemaHandle to presto decoder
PIP-72      Introduce Pulsar Interface Taxonomy: Audience and Stability Classification
PIP-73      Configurable data source priority for message reading
PIP-74      Pulsar client memory limits
PIP-75      Replace protobuf code generator
PIP-76      Streaming Offload
PIP-77      Contribute Supernova to Apache Pulsar
PIP-78      Generate Docs from Code Automatically
PIP-79      Reduce redundant producers from partitioned producer
PIP-80      Unified namespace-level admin API
PIP-81      Split the individual acknowledgments into multiple entries
PIP-82      Tenant and namespace level rate limiting
PIP-83      Pulsar client: Message consumption with pooled buffer
PIP-84      Pulsar client: Redeliver command add epoch
PIP-85      Expose Pulsar-Client via Function/Connector BaseContext
PIP-86      Pulsar Functions: Preload and release external resources
PIP-87      Upgrade Pulsar Website Framework (Docusaurus)
PIP-88      Replicate schemas across multiple
PIP-89      Structured document logging
PIP-90      Expose broker entry metadata to the client
PIP-91      Separate lookup timeout from operation timeout
PIP-92      Topic policy across multiple clusters
PIP-93      Transaction performance tools
PIP-94      Message converter at broker level
PIP-95      Smart Listener Selection with Multiple Bind Addresses
PIP-96      Message payload processor for Pulsar client
PIP-97      Asynchronous Authentication Provider
PIP-98      Redesign Pulsar Information Architecture
PIP-99      Pulsar Proxy Extensions
PIP-100     Pulsar pluggable topic factory
PIP-101     Add seek by index feature for consumer
PIP-104     Add new consumer type: TableView
PIP-105     Support pluggable entry filter in Dispatcher
PIP-106     Negative acknowledgment backoff
PIP-107     Introduce the chunk message ID PIP
PIP-108     Pulsar Feature Matrix (Client and Function)
PIP-109     Introduce Bot to Improve Efficiency of Developing Docs
PIP-110     Topic metadata
PIP-111     Add messages produced by Protocol Handler When checking maxMessagePublishBufferSizeInMB
PIP-112     Generate Release Notes Automatically
PIP-116     Create Pulsar Writing Style Guide
PIP-117     Change Pulsar Standalone defaults
PIP-118     Do not restart brokers when ZooKeeper session expires
PIP-119     Enable consistent hashing by default on KeyShared dispatcher
PIP-120     Enable client memory limit by default
PIP-121     Pulsar cluster level auto failover
PIP-122     Change loadBalancer default loadSheddingStrategy to ThresholdShedder
PIP-123     Introduce Pulsar metadata CLI tool
PIP-124     Create init subscription before sending message to DLQ
PIP-129     Introduce intermediate state for ledger deletion
PIP-130     Apply redelivery backoff policy for ack timeout
PIP-131     Resolve produce chunk messages failed when topic level maxMessageSize is set
PIP-132     Include message header size when check maxMessageSize for non-batch message on the client side
PIP-135     Include MetadataStore backend for Etcd
PIP-136     Sync Pulsar policies across multiple clouds
PIP-137     Pulsar Client Shared State API
PIP-143     Support split bundle by specified boundaries
PIP-144     Making SchemaRegistry implementation configurable
PIP-146     ManagedCursorInfo compression
PIP-148     Create Pulsar client release notes
PIP-149     Making the REST Admin API fully async
PIP-152     Support subscription level dispatch rate limiter setting
PIP-154     Max active transaction limitation for transaction coordinator
PIP-155     Drop support for Python2
PIP-156     Build and Run Pulsar Server on Java 17
PIP-157     Bucketing topic metadata to allow more topics per namespace
PIP-160     Make transactions work more efficiently by aggregation operation for transaction log and pending ack store
PIP-161     Exclusive Producer: new mode ExclusiveWithFencing
PIP-162     LTS Releases
PIP-165     Auto release client useless connections
PIP-173     Create a built-in Function implementing the most common basic transformations
PIP-174     Provide new implementation for broker dispatch cache
PIP-175     Extend time based release process
PIP-176     Refactor Doc Bot
PIP-177     Add the classLoader field for SchemaDefinition
PIP-178     Multiple snapshots for transaction buffer
PIP-179     Support the admin API to check unknown request parameters
PIP-180     Shadow Topic, an alternative way to support readonly topic ownership
PIP-181     Pulsar Shell
PIP-182     Provide new load balance placement strategy implementation for ModularLoadManagerStrategy
PIP-183     Reduce unnecessary REST call in broker
PIP-184     Topic specific consumer priorityLevel
PIP-186     Introduce two phase deletion protocol based on system topic
PIP-187     Add API to analyse a subscription backlog and provide a accurate value
PIP-188     Cluster migration or Blue-Green cluster deployment support in Pulsar
PIP-189     No batching if only one message in batch
PIP-190     Simplify documentation release and maintenance strategy
PIP-191     Support batched message using entry filter
PIP-192     New Pulsar Broker Load Balancer
PIP-193     Sink preprocessing Function
PIP-194     Pulsar client: seek command add epoch
PIP-195     New bucket based delayed message tracker
PIP-198     Standardize PR Naming Convention using GitHub Actions
PIP-201     Extensions mechanism for Pulsar Admin CLI tools
PIP-204     Extensions for broker interceptor
PIP-205     Reactive Java client for Apache Pulsar
PIP-209     Separate C++/Python clients to own repositories
PIP-243     Register Jackson Java 8 support modules by default
PIP-249     Pulsar website redesign
PIP-259     Make the config httpMaxRequestHeaderSize of the pulsar web server to configurable
PIP-261     Restructure Getting Started section
PIP-264     Support OpenTelemetry metrics in Pulsar
PIP-265     PR-based system for managing and reviewing PIPs
PIP-275     Rename numWorkerThreadsForNonPersistentTopic to topicOrderedExecutorThreadNum
PIP-276     Add pulsar prefix to topic_load_times metric
PIP-277     Add current cluster marking to clusters list command
PIP-278     Pluggable topic compaction service
PIP-279     Support topic-level policies using TableView API
PIP-280     Refactor CLI Argument Parsing Logic for Measurement Units using JCommander's custom converter
PIP-281     Add notifyError method on PushSource
PIP-282     Add Key_Shared subscription initial position support
PIP-284     Migrate topic policies implementation to use TableView
PIP-286     Support get position based on timestamp with topic compaction
PIP-289     Secure Pulsar Connector Configuration
PIP-290     Support message encryption in WebSocket proxy
PIP-292     Enforce token expiration time in the Websockets plugin
PIP-293     Support reader to read compacted data
PIP-296     Add getLastMessageIds API for Reader
PIP-297     Support terminating Function & Connector with the fatal exception
PIP-298     Support read transaction buffer snapshot segments from earliest
PIP-299     Support setting max unacked messages at subscription level
PIP-300     Add RedeliverCount field to CommandAck
PIP-301     Separate load data storage from configuration metadata store
PIP-302     Support for TableView with strong read consistency
PIP-303     Support PartitionedTopicStats exclude publishers and subscriptions
PIP-305     Add OpAddEntry and pendingData statistics info in JVM metrics
PIP-306     Support subscribing multi topics for WebSocket
PIP-307     Optimize Bundle Unload(Transfer) Protocol for ExtensibleLoadManager
PIP-312     Use StateStoreProvider to manage state in Pulsar Functions endpoints
PIP-313     Support force unsubscribe using consumer api
PIP-315     Configurable max delay limit for delayed delivery
PIP-318     Don't retain null-key messages during topic compaction
PIP-320     OpenTelemetry Scaffolding
PIP-321     Split the responsibilities of namespace replication-clusters
PIP-322     Pulsar Rate Limiting Refactoring
PIP-323     Complete Backlog Quota Telemetry
PIP-324     Switch to Alpine Linux base Docker images
PIP-325     Support reading from transaction buffer for pending transaction
PIP-326     Create a BOM to ease dependency management
PIP-327     Support force topic loading for unrecoverable errors
PIP-329     Strategy for maintaining the latest tag to Pulsar docker images
PIP-330     getMessagesById gets all messages
PIP-335     Support Oxia metadata store plugin
PIP-337     SSL Factory Plugin to customize SSLContext/SSLEngine generation
PIP-339     Introducing the --log-topic Option for Pulsar Sinks and Sources
PIP-342     Support OpenTelemetry metrics in Pulsar client
PIP-343     Use picocli instead of jcommander
PIP-344     Correct the behavior of the public API pulsarClient.getPartitionsForTopic(topicName)
PIP-347     add role field in consumer's stat
PIP-348     Trigger offload on topic load stage
PIP-349     Add additionalSystemCursorNames ignore list for ttl check
PIP-350     Allow to disable the managedLedgerOffloadDeletionLagInMillis
PIP-351     Additional options for Pulsar-Test client to support KeyStore based TLS
PIP-352     Event time based topic compactor
PIP-353     Improve transaction message visibility for peek-messages cli
PIP-354     apply topK mechanism to ModularLoadManagerImpl
PIP-355     Enhancing Broker-Level Metrics for Pulsar
PIP-356     Support Geo-Replication starts at earliest position
PIP-357     Correct the conf name in load balance module.
PIP-358     let resource weight work for OverloadShedder, LeastLongTermMessageRate, ModularLoadManagerImpl.
PIP-359     Support custom message listener executor for specific subscription
PIP-360     Admin API to display Schema metadata
PIP-363     Add callback parameters to the method: org.apache.pulsar.client.impl.SendCallback.sendComplete.
PIP-364     Introduce a new load balance algorithm AvgShedder
PIP-366     Support to specify different config for Configuration and Local Metadata Store
PIP-367     Propose a Contributor Repository for Pulsar
PIP-368     Support lookup based on the lookup properties
PIP-369     Flag based selective unload on changing ns-isolation-policy
PIP-370     configurable remote topic creation in geo-replication
PIP-373     Add a topic's system prop that indicates whether users have published TXN messages in before.
PIP-374     Visibility of messages in receiverQueue for the consumers
PIP-376     Make Topic Policies Service Pluggable
PIP-378     Add ServiceUnitStateTableView abstraction (ExtensibleLoadMangerImpl only)
PIP-379     Key_Shared Draining Hashes for Improved Message Ordering
PIP-380     Support setting up specific namespaces to skipping the load-shedding
PIP-381     Handle large PositionInfo state
PIP-383     Support granting/revoking permissions for multiple topics
PIP-384     ManagedLedger interface decoupling
PIP-389     Add Producer config compressMinMsgBodySize to improve compression performance
PIP-391     Improve Batch Messages Acknowledgment
PIP-392     Add configuration to enable consistent hashing to select active consumer for partitioned topic
PIP-393     Improve performance of Negative Acknowledgement
PIP-395     Add Proxy configuration to support configurable response headers for http reverse-proxy
PIP-396     Align WindowFunction's WindowContext with BaseContext
PIP-399     Fix Metric Name for Delayed Queue
PIP-401     Support set batching configurations for Pulsar Functions&Sources
PIP-402     Role Anonymizer for Pulsar Logging
PIP-404     Introduce per-ledger properties
PIP-406     Introduce metrics related to dispatch throttled events
PIP-407     Add a newMessage API to create a message with a schema and transaction
PIP-409     support producer configuration for retry/dead letter topic producer
PIP-412     Support setting messagePayloadProcessor in Pulsar Functions and Sinks
PIP-414     Enforce topic consistency check
PIP-415     Support getting message ID by index
PIP-416     Add a new topic method to implement trigger offload by size threshold
PIP-420     Provides an ability for Pulsar clients to integrate with third-party schema registry service
PIP-421     Require Java 17 as the minimum for Pulsar Java client SDK
PIP-422     Support global topic-level policy: replicated clusters and new API to delete topic-level policies
PIP-425     Support connecting with next available endpoint for multi-endpoint serviceUrls
PIP-427     Align pulsar-admin Default for Mark-Delete Rate with Broker Configuration
PIP-428     Change TopicPoliciesService interface to fix consistency issues
PIP-429     Optimize Handling of Compacted Last Entry by Skipping Payload Buffer Parsing
PIP-430     Pulsar Broker cache improvements: refactoring eviction and adding a new cache strategy based on expected read count
PIP-431     Add Creation and Last Publish Timestamps to Topic Stats
PIP-432     Add isEncrypted field to EncryptionContext
PIP-433     Optimize the conflicts of the replication and automatic creation mechanisms, including the automatic creation of topics and schemas
PIP-435     Add startTimestamp and endTimestamp for consuming messages in client cli
PIP-436     Add decryptFailListener to Consumer

Additional Information

  1. You can view all PIPs (besides the historical ones) as the list of Pull Requests whose title starts with [improve][pip] PIP-. Here is the link for it.
    • Merged PR means the PIP was accepted.
    • Closed PR means the PIP was rejected.
    • Open PR means the PIP was submitted and is in the process of discussion.
  2. All PIP files in the pip folder follow the naming convention pip-xxx.md where xxx is the PIP number.
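
The two conventions above can be sketched in code. This is a hedged illustration only: the function names and the `state`/`merged` inputs are hypothetical, the sketch does not call the GitHub API, and it assumes the lowercase pip-xxx.md spelling.

```python
import re

# File naming convention: pip-xxx.md, where xxx is the PIP number.
PIP_FILE_RE = re.compile(r"pip-(\d+)\.md")
PIP_PR_TITLE_PREFIX = "[improve][pip] PIP-"


def pip_number_from_file(name):
    """Return the PIP number encoded in a file name, or None if it does not match."""
    match = PIP_FILE_RE.fullmatch(name)
    return int(match.group(1)) if match else None


def status_from_pr(title, state, merged):
    """Map a pull request to the PIP outcome listed above.

    title:  the PR title
    state:  "open" or "closed" (hypothetical input, supplied by the caller)
    merged: True if the PR was merged (hypothetical input)
    Returns None for PRs whose title does not carry the PIP prefix.
    """
    if not title.startswith(PIP_PR_TITLE_PREFIX):
        return None
    if merged:
        return "accepted"
    if state == "closed":
        return "rejected"
    return "in discussion"
```

For example, under these assumptions a merged PR titled `[improve][pip] PIP-123: ...` maps to "accepted", while a file named `TEMPLATE.md` yields no PIP number.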