DISTRIBUTED BY ... BUCKETS auto
Users often set inappropriate buckets, leading to various problems. For now, it only works for olap tables
In the past, when creating buckets, you had to set the number of buckets manually, but the automatic bucket projection feature is a way for Apache Doris to dynamically project the number of buckets, so that the number of buckets always stays within a suitable range and users don't have to worry about the minutiae of the number of buckets. First, for the sake of clarity, this section splits the bucket into two periods, the initial bucket and the subsequent bucket; the initial and subsequent are just terms used in this article to describe the feature clearly, there is no initial or subsequent Apache Doris bucket. As we know from the section above on creating buckets, BUCKET_DESC is very simple, but you need to specify the number of buckets; for the automatic bucket projection feature, the syntax of BUCKET_DESC directly changes the number of buckets to “Auto” and adds a new Properties configuration.
-- old version of the creation syntax for specifying the number of buckets DISTRIBUTED BY HASH(site) BUCKETS 20 -- Newer versions use the creation syntax for automatic bucket imputation DISTRIBUTED BY HASH(site) BUCKETS AUTO properties("estimate_partition_size" = "100G")
The new configuration parameter estimate_partition_size indicates the amount of data for a single partition. This parameter is optional and if not given, Doris will take the default value of estimate_partition_size to 10GB. As you know from the above, a partitioned bucket is a Tablet at the physical level, and for best performance, it is recommended that the Tablet size be in the range of 1GB - 10GB. So how does the automatic bucketing projection ensure that the Tablet size is within this range? To summarize, there are a few principles.
:::tip propertie estimate_partition_size not support alter :::
Initial bucketing projection Starting from the principle, it becomes easy to understand the detailed logic of the automatic bucket imputation function. First look at the initial bucketing
First, use the value of estimate_partition_size divided by 5 (calculated as a 5-to-1 data compression ratio in text format into Doris) to obtain the following result.
The pseudo-code representation of the above process is as follows
int N = Compute the N value;
int M = compute M value;
int y = number of BE nodes;
int x = min(M, N, 128);
if (x < N && x < y) {
return y;
}
return x;
With the above algorithm in mind, let's introduce some examples to better understand this part of the logic.
case1: Amount of data 100 MB, 10 BE machines, 2TB * 3 disks Amount of data N = 1 BE disks M = 10* (2TB/50GB) * 3 = 1230 x = min(M, N, 128) = 1 Final: 1 case2: Data volume 1GB, 3 BE machines, 500GB * 2 disks Amount of data N = 2 BE disks M = 3* (500GB/50GB) * 2 = 60 x = min(M, N, 128) = 2 Final: 2 case3: Data volume 100GB, 3 BE machines, 500GB * 2 disks Amount of data N = 20 BE disks M = 3* (500GB/50GB) * 2 = 60 x = min(M, N, 128) = 20 Final: 20 case4: Data volume 500GB, 3 BE machines, 1TB * 1 disk Data volume N = 100 BE disks M = 3* (1TB /50GB) * 1 = 60 x = min(M, N, 128) = 63 Final: 63 case5: Data volume 500GB, 10 BE machines, 2TB * 3 disks Amount of data N = 100 BE disks M = 10* (2TB / 50GB) * 3 = 1230 x = min(M, N, 128) = 100 Final: 100 case 6: Data volume 1TB, 10 BE machines, 2TB * 3 disks Amount of data N = 205 BE disks M = 10* (2TB / 50GB) * 3 = 1230 x = min(M, N, 128) = 128 Final: 128 case 7: Data volume 500GB, 1 BE machine, 100TB * 1 disk Amount of data N = 100 BE disk M = 1* (100TB / 50GB) * 1 = 2048 x = min(M, N, 128) = 100 Final: 100 case 8: Data volume 1TB, 200 BE machines, 4TB * 7 disks Amount of data N = 205 BE disks M = 200* (4TB / 50GB) * 7 = 114800 x = min(M, N, 128) = 128 Final: 200
As you can see, the detailed logic matches the principle. Subsequent bucketing projection The above is the calculation logic for the initial bucketing. The subsequent bucketing can be evaluated based on the amount of partition data available since there is already a certain amount of partition data. The subsequent bucket size is evaluated based on the EMA[1] (short term exponential moving average) value of up to the first 7 partitions, which is used as the estimate_partition_size. At this point there are two ways to calculate the partition buckets, assuming partitioning by days, counting forward to the first day partition size of S7, counting forward to the second day partition size of S6, and so on to S1.
There are 6 delta values, which are
S7 - S6 = delta1, S6 - S5 = delta2, ... S2 - S1 = delta6
This yields the ema(delta) value. Then, today's estimate_partition_size = S7 + ema(delta)
today's estimate_partition_size = EMA(S1, ... , S7) , S7)
According to the above algorithm, the initial number of buckets and the number of subsequent buckets can be calculated. Unlike before when only a fixed number of buckets could be specified, due to changes in business data, it is possible that the number of buckets in the previous partition is different from the number of buckets in the next partition, which is transparent to the user, and the user does not need to care about the exact number of buckets in each partition, and this automatic extrapolation will make the number of buckets more reasonable.
When autobucket is enabled, the schema you see in show create table is also BUCKETS AUTO. If you want to see the exact number of buckets, you can do so by show partitions from ${table};.