The config managedLedgerMaxUnackedRangesToPersist
Indicates the number of acknowledgment holes
that are going to be persistently stored. When acknowledging out of order, a consumer will leave holes that are supposed to be quickly filled by acking all the messages. The information of which messages are acknowledged is persisted by compressing in ranges
of messages that were acknowledged. After the maximum number of ranges is reached, the information will only be tracked in memory, and messages will be redelivered in case of crashes.
The cursor metadata contains the following three data:
pulsar.dedup
. If a topic has many, many producers, this part of the data will be large. See PIP-6:-Guaranteed-Message-Deduplication for more details.Differ with Kafka: Pulsar supports individual acknowledgment (just like ack {pos-1, pos-3, pos-5}
), so instead of a pointer(acknowledged on the left and un-acknowledged on the right), Pulsar needs to persist the acknowledgment state of each message, we call these records Individual Deleted Messages.
The current persistence mechanism of the cursor metadata(including Individual Deleted Messages
) works like this:
Individual Deleted Messages
) to BK in one Entry; by default, the maximum size of the Entry is 5MB.Individual Deleted Messages
) to the Metadata Store(such as ZK) if BK-Write fails; data of a Metadata Store Node that is less than 10MB is recommended. Since writing large chunks of data to the Metadata Store frequently makes the Metadata Store work unstable, this is only a backstop measure.Is 5MB enough? Individual Deleted Messages
consists of Position_Rang(each Position_Rang occupies 32 bytes; the implementation will not be explained in this proposal). This means that the Broker can persist 5m / 32bytes
number of Position_Rang for each Subscription, and there is an additional compression mechanism at work, so it is sufficient for almost all scenarios except the following three scenarios:
pulsar.dedup
cursor.The config managedLedgerMaxUnackedRangesToPersist
If the cursor metadata is too large to persist, the Broker will persist only part of the data according to the following priorities.
Since the frequent persistence of Individual Deleted Messages
will magnify the amount of BK Written and increase the latency of ack-response, the Broker does not immediately persist it when receiving a consumer's acknowledgment but persists it regularly.
The data of cursor metadata is recommended to be less than 5MB; if a subscription's Individual Deleted Messages
data is too large to persist, as the program grows for a long time, there will be more and more non-persistent data. Eventually, there will be an unacceptable amount of repeated consumption of messages when the Broker restarts.
To avoid repeated consumption due to the cursor metadata being too large to persist.
This proposal will not care about this scenario: if so many producers make the metadata of cursor pulsar.dedup
cannot persist, the task Take Deduplication Snapshot
will be in vain due to the inability to persist.
Provide a new config named dispatcherPauseOnAckStatePersistentEnabled
(default value is false
) for a new feature: stop dispatch messages to clients when reaching the limitation managedLedgerMaxUnackedRangesToPersist
.
false
.true
.broker.conf
/** * After enabling this feature, Pulsar will stop delivery messages to clients if the cursor metadata is too large to persist, it will help to reduce the duplicates caused by the ack state that can not be fully persistent. Default "false". */ boolean dispatcherPauseOnAckStatePersistentEnabled;
SubscriptionStats
/** * After enabling the feature "dispatcherPauseOnAckStatePersistentEnabled", return "true" if the cursor metadata is too large to persist, else return "false". * Always return "false" if disabled the feature "dispatcherPauseOnAckStatePersistentEnabled". */ boolean isBlockedOnAckStatePersistent();
Cache the range count of the Individual Deleted Messages in the memory when doing persist cursor metadata to BK. Stuck delivery messages to clients if reaching the limitation managedLedgerMaxUnackedRangesToPersist
.
Since the cache will not be updated in time, the actual count will decrease when clients acknowledge messages(but it does not persist immediately, so the cached value does not update immediately), the cached value is an estimated value.
Nothing.