Pulsar provides a way to compress messages before sending them to the broker [0]. This is done by setting compressionType
in the producer configuration. The compressionType can be set to one of the following values: NONE, LZ4, ZLIB, ZSTD, SNAPPY.
However, the compressionType applies to every message sent by the producer, so even small messages are compressed.
In our tests, compressing small messages was not worthwhile: the compression ratio is low and it costs extra CPU. The relevant description in the zstd documentation:
The smaller the amount of data to compress, the more difficult it is to compress. This problem is common to all compression algorithms. [1]
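The effect described above can be reproduced with a minimal sketch that uses only the JDK's built-in DEFLATE implementation (java.util.zip.Deflater) rather than Pulsar's own codecs. The class name and the sample payloads are illustrative assumptions, not part of this PIP.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical demo class: compares compressed sizes of a tiny and a large payload.
class SmallMessageCompressionDemo {

    /** Compresses the input with the JDK's DEFLATE implementation (zlib format). */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            // Drain the compressor until all input has been consumed and flushed.
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    public static void main(String[] args) {
        // A tiny body: the fixed header and checksum overhead outweighs any saving.
        byte[] small = "{\"id\":1}".getBytes(StandardCharsets.UTF_8);
        // A large, repetitive body: compresses very well.
        byte[] large = "{\"id\":1,\"status\":\"ok\"}".repeat(500).getBytes(StandardCharsets.UTF_8);

        System.out.printf("small: %d -> %d bytes%n", small.length, deflate(small).length);
        System.out.printf("large: %d -> %d bytes%n", large.length, deflate(large).length);
    }
}
```

With payloads like these, the small body comes out larger than it went in, while the large body shrinks substantially. Exact numbers depend on the algorithm and the data.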
The similar configuration in RocketMQ is compressMsgBodyOverHowmuch [2]:
/**
 * Compress message body threshold, namely, message body larger than 4k will be compressed on default.
 */
private int compressMsgBodyOverHowmuch = 1024 * 4;
[0] https://pulsar.apache.org/docs/4.0.x/concepts-messaging/#compression
[1] https://github.com/facebook/zstd?tab=readme-ov-file#the-case-for-small-data-compression
[2] https://github.com/apache/rocketmq/blob/dd62ed0f3b16919adec5d5eece21a1050dc9c5a0/client/src/main/java/org/apache/rocketmq/client/producer/DefaultMQProducer.java#L117
The motivation of this PIP is to improve compression performance by skipping the compression of small messages. We want to add a new configuration, compressMinMsgBodySize, to the producer configuration. This configuration lets the user set the minimum message body size that will be compressed. If the message body is smaller than compressMinMsgBodySize, the message will not be compressed.
Add a new configuration compressMinMsgBodySize to the producer configuration.
Solve the compression problem of small data
Add a new configuration compressMinMsgBodySize to the producer configuration. This configuration will allow the user to set the minimum size of the message body that will be compressed. If the message body size is less than compressMinMsgBodySize, the message will not be compressed.
Add a new configuration compressMinMsgBodySize to the producer configuration.
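The threshold check could look roughly like the following sketch. It is not the Pulsar client implementation: the ThresholdCompressor class and its methods are hypothetical, and the JDK's Deflater stands in for whichever compressionType the producer is configured with. Only the rule itself comes from this proposal: bodies smaller than compressMinMsgBodySize are sent uncompressed.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper illustrating the compressMinMsgBodySize rule.
class ThresholdCompressor {
    private final int compressMinMsgBodySize;

    ThresholdCompressor(int compressMinMsgBodySize) {
        if (compressMinMsgBodySize < 0) {
            throw new IllegalArgumentException("compressMinMsgBodySize must be >= 0");
        }
        this.compressMinMsgBodySize = compressMinMsgBodySize;
    }

    /** A body is compressed only when it is at least compressMinMsgBodySize bytes long. */
    boolean shouldCompress(byte[] body) {
        return body.length >= compressMinMsgBodySize;
    }

    /** Returns the body unchanged when it is below the threshold, otherwise a compressed copy. */
    byte[] encode(byte[] body) {
        if (!shouldCompress(body)) {
            return body; // skip compression for small messages
        }
        Deflater deflater = new Deflater(); // stand-in for the configured codec
        try {
            deflater.setInput(body);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    public static void main(String[] args) {
        // Threshold mirrors RocketMQ's 4 KiB default, chosen here only for illustration.
        ThresholdCompressor compressor = new ThresholdCompressor(4 * 1024);
        System.out.println("100-byte body compressed?  " + compressor.shouldCompress(new byte[100]));
        System.out.println("8192-byte body compressed? " + compressor.shouldCompress(new byte[8192]));
    }
}
```

A real producer would also need to record, per message, whether the payload was compressed, so that the consumer knows whether to decompress it. The sketch leaves that out.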
NA
NA
NA
NA
This is a new feature, and it does not affect the existing configuration.
After a downgrade or rollback, the new configuration compressMinMsgBodySize will be removed from the producer configuration. If you used it, you need to remove it manually.