This Pulsar Improvement Proposal (PIP) addresses an observability gap in Apache Pulsar by proposing the addition of two new fields to the topic statistics API: topicCreationTimeStamp and lastPublishTimeStamp. Currently, operators and developers lack a direct and efficient method to determine a topic’s age or its most recent message activity, hindering effective lifecycle management, troubleshooting, and compliance auditing. This proposal outlines a plan to introduce these timestamps into the TopicStats object, accessible via the Admin REST API and pulsar-admin CLI. The topicCreationTimeStamp will be retrieved directly from the topic’s underlying metadata node statistics, providing an immutable value for all topics. The lastPublishTimeStamp will be maintained as a high-performance, in-memory atomic value on the owning broker, with a recovery mechanism from the managed ledger to ensure accuracy across broker restarts and topic reloads. These additions will be implemented in a backward-compatible manner, providing essential lifecycle visibility with negligible impact on system performance.
The primary motivation for this proposal is to enhance the operational observability of Apache Pulsar topics by providing fundamental lifecycle metadata. Administrators, developers, and automated systems currently face significant challenges in managing topics at scale due to the absence of two key data points: when a topic was created and when it last received a message. This lack of information complicates several critical operational tasks.
This proposal aims to fill this observability gap by introducing dedicated, reliable timestamps for topic creation and last publish activity, thereby empowering users with the necessary tools for more sophisticated and automated management of their Pulsar deployments.
This proposal introduces two new fields to the data structures that represent topic statistics. These fields will be exposed in the JSON response of the Admin API endpoints for both non-partitioned and partitioned topics.
The new fields are defined as follows:
topicCreationTimeStamp: A long value representing the UTC timestamp in epoch milliseconds when the topic was first durably created in the metadata store. This value is immutable for the lifetime of the topic.
lastPublishTimeStamp: A long value representing the UTC timestamp in epoch milliseconds corresponding to the publish_time field of the last message successfully persisted by the broker for this topic. This value is updated upon every successful message publication. If no message has ever been published to the topic, this field will return 0. Upon topic load (e.g., after a broker restart), this value will be recovered from the last entry in the managed ledger; until then, it may temporarily be 0.
The implementation strategy for the two new fields is designed to balance durability, accuracy, and performance, leveraging existing Pulsar components and patterns to minimize complexity.
The topicCreationTimeStamp is an intrinsic property of the metadata node (e.g., a znode in ZooKeeper) that represents the topic’s managed ledger. This timestamp is automatically recorded by the metadata store upon the node’s creation and is available for all topics, regardless of when they were created.
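As a rough illustration of reading a node's creation time once and caching it for later stats requests, consider the sketch below. The `NodeStatReader` interface, the `CreationTimeCache` class, and the example path are hypothetical stand-ins introduced only for this example; they are not Pulsar's actual MetadataStore API, and the real integration point may look different.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Hypothetical holder for the cached creation time; not an actual Pulsar class.
final class CreationTimeCache {

    /** Hypothetical stand-in for an asynchronous metadata store lookup. */
    interface NodeStatReader {
        // Completes with the node's creation time in epoch millis,
        // or with an empty Optional if the node does not exist.
        CompletableFuture<Optional<Long>> creationTimeMillis(String path);
    }

    // 0 means "not loaded yet"; volatile so stats threads see the loaded value.
    private volatile long topicCreationTimeStamp = 0L;

    /** One-time asynchronous load, intended to run during topic initialization. */
    public CompletableFuture<Void> load(NodeStatReader reader, String path) {
        return reader.creationTimeMillis(path)
                .thenAccept(stat -> stat.ifPresent(ts -> topicCreationTimeStamp = ts));
    }

    /** Stats requests read the cached field; no I/O happens here. */
    public long getTopicCreationTimeStamp() {
        return topicCreationTimeStamp;
    }
}
```

In this sketch a missing node simply leaves the cached value at 0, so callers can distinguish "unknown" from a real timestamp.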
The topicCreationTimeStamp will be fetched asynchronously when the PersistentTopic object is initialized. The value will be cached in a long field for immediate access during stats requests. This is a one-time operation per topic load that leverages the existing MetadataStore interface.
The lastPublishTimeStamp must be updated on every message publish, a high-frequency operation. Persisting this value to the metadata store on every publish would introduce unacceptable latency and contention, creating a severe performance bottleneck. Therefore, a hybrid in-memory and durable-recovery approach is proposed.
This hybrid design ensures that updating the lastPublishTimeStamp is extremely fast (an in-memory atomic operation), while the recovery mechanism provides strong durability and accuracy guarantees across broker failures and topic lifecycle events.
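A minimal sketch of how such a hybrid could be arranged is shown here. The `LastPublishTracker` class and its method names are hypothetical, and the choice to let recovery fill the value only while it is still 0 (so that a late recovery result never overwrites a fresher publish) is an assumption of this sketch, not something the proposal specifies.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper; in practice the field would likely live on the topic object.
final class LastPublishTracker {

    // 0 means no publish has been observed and recovery has not completed.
    private final AtomicLong lastPublishTimeStamp = new AtomicLong(0L);

    /** Hot path: a single non-blocking write after a message is persisted. */
    public void onPublished(long publishTimeMillis) {
        lastPublishTimeStamp.set(publishTimeMillis);
    }

    /**
     * Recovery path: the future is assumed to complete with the publish time
     * read from the last entry in the managed ledger. The value is applied
     * only if no publish has updated the field in the meantime.
     */
    public CompletableFuture<Void> recoverFrom(CompletableFuture<Long> lastEntryPublishTime) {
        return lastEntryPublishTime
                .thenAccept(ts -> lastPublishTimeStamp.compareAndSet(0L, ts));
    }

    /** Stats path: a plain in-memory read. */
    public long get() {
        return lastPublishTimeStamp.get();
    }
}
```

Because recovery runs on a future, the topic can start serving publishes before the ledger read finishes; any publish that arrives first wins over the recovered value.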
The proposed changes are designed to be minimally invasive while providing significant value.
topicCreationTimeStamp: The performance impact is negligible. It involves a single, one-time asynchronous read from the metadata store during topic initialization, an infrequent operation. Reading the value for stats requests is an in-memory operation from a cached field, resulting in zero I/O overhead.
lastPublishTimeStamp: The impact on the message publish path is extremely low. The update consists of a single, non-blocking long value update operation. The recovery process does not block the topic from becoming available to clients.
The proposal is fully backward and forward compatible.
API Compatibility: The changes are purely additive to the TopicStats JSON response. Older admin clients or tools that are unaware of the new fields will simply ignore them, continuing to function without error.
Broker Compatibility: A cluster can be upgraded in a rolling fashion.
Client Compatibility: The client-broker binary protocol is not being changed. This proposal only affects the administrative API.
No changes are required for now, but metrics for the new topic stats fields could be added later if there is strong demand.
A thorough testing plan is essential to validate the correctness, performance, and resilience of the implementation. It will include unit tests that unload and reload a topic to verify lastPublishTimeStamp recovery.