| // Licensed to the Apache Software Foundation (ASF) under one |
| // or more contributor license agreements. See the NOTICE file |
| // distributed with this work for additional information |
| // regarding copyright ownership. The ASF licenses this file |
| // to you under the Apache License, Version 2.0 (the |
| // "License"); you may not use this file except in compliance |
| // with the License. You may obtain a copy of the License at |
| // |
| // http://www.apache.org/licenses/LICENSE-2.0 |
| // |
| // Unless required by applicable law or agreed to in writing, |
| // software distributed under the License is distributed on an |
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| // KIND, either express or implied. See the License for the |
| // specific language governing permissions and limitations |
| // under the License. |
| [[known_issues_and_limitations]] |
| = Known Issues and Limitations |
| |
| :author: Kudu Team |
| :imagesdir: ./images |
| :icons: font |
| :toc: left |
| :toclevels: 3 |
| :doctype: book |
| :backend: html5 |
| :sectlinks: |
| :experimental: |
| |
| include::top.adoc[tags=version] |
| |
| == Schema |
| |
| === Primary keys |
| |
| * The primary key may not be changed after the table is created. |
| You must drop and recreate a table to select a new primary key. |
| |
| * The columns which make up the primary key must be listed first in the schema. |
| |
| * The primary key of a row may not be modified using the `UPDATE` functionality. |
| To modify a row's primary key, the row must be deleted and re-inserted with |
| the modified key. Such a modification is non-atomic. |
| |
| * Columns with `DOUBLE`, `FLOAT`, or `BOOL` types are not allowed as part of a |
| primary key definition. Additionally, all columns that are part of a primary |
| key definition must be `NOT NULL`. |
| |
| * Auto-generated primary keys are not supported. |
| |
| * Cells making up a composite primary key are limited to a total of 16KiB after the internal |
| composite-key encoding done by Kudu. |
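
Several of the primary key rules above can be checked before a table is created. The following is a minimal sketch, not Kudu client code: the `Column` class and the `primary_key_problems` helper are hypothetical names invented for this illustration, and only the rules on column order, forbidden types, and nullability are covered.

```python
from dataclasses import dataclass
from typing import List

# Types the list above says cannot be part of a primary key.
FORBIDDEN_KEY_TYPES = {"DOUBLE", "FLOAT", "BOOL"}


@dataclass
class Column:
    """Hypothetical column description; not a Kudu client class."""
    name: str
    type: str
    nullable: bool = False
    is_key: bool = False


def primary_key_problems(columns: List[Column]) -> List[str]:
    """Return a list of rule violations for the key columns of a schema."""
    problems = []
    seen_non_key = False
    for col in columns:
        if not col.is_key:
            seen_non_key = True
            continue
        if seen_non_key:
            problems.append(f"{col.name}: key columns must be listed first")
        if col.type.upper() in FORBIDDEN_KEY_TYPES:
            problems.append(f"{col.name}: {col.type} is not allowed in a primary key")
        if col.nullable:
            problems.append(f"{col.name}: key columns must be NOT NULL")
    return problems


if __name__ == "__main__":
    schema = [
        Column("value", "STRING", nullable=True),
        Column("reading", "FLOAT", nullable=True, is_key=True),
    ]
    for problem in primary_key_problems(schema):
        print(problem)
```

The sketch does not attempt to check the 16KiB limit, because that limit applies to the key after Kudu's internal composite-key encoding.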
| |
| === Columns |
| |
| * `CHAR` and complex types like `ARRAY`, `MAP`, and `STRUCT` are not yet |
| supported. |
| |
| * Type, nullability and type attributes (i.e. precision and scale of `DECIMAL`, |
| length of `VARCHAR`) of existing columns cannot be changed by altering the |
| table. |
| |
| * Tables can have a maximum of 300 columns by default. |
| |
| === Tables |
| |
| * Tables must have an odd number of replicas, with a maximum of 7. |
| |
| * The replication factor is set at table creation time and cannot be changed |
| later. As a workaround, create a new table with the same schema and |
| the desired replication factor and insert the contents of the old table |
| into the new one. |
| |
| === Cells (individual values) |
| |
| * By default, cells cannot be larger than 64KiB before encoding or compression. |
| |
| === Other usage limitations |
| |
| * Secondary indexes are not supported. |
| |
| * Multi-row transactions are not yet supported. |
| |
| * Relational features, like foreign keys, are not supported. |
| |
| * Identifiers such as column and table names are restricted to be valid UTF-8 strings. |
| Additionally, a maximum length of 256 characters is enforced. |
| |
| * Dropping a column does not immediately reclaim space. Compaction must run first. |
| |
| * There is no way to run compaction manually, but dropping the table will reclaim the |
| space immediately. |
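
The identifier rule above (valid UTF-8, at most 256 characters) can be sketched as a small pre-flight check. The helper name `identifier_problems` is hypothetical, and the sketch assumes that "characters" means Unicode code points rather than encoded bytes, which the text above does not specify.

```python
from typing import List

MAX_IDENTIFIER_LENGTH = 256  # limit stated in the list above


def identifier_problems(name: str) -> List[str]:
    """Return the reasons a table or column name would violate the stated rules."""
    problems = []
    try:
        # A Python str can hold lone surrogates, which cannot be encoded as UTF-8.
        name.encode("utf-8")
    except UnicodeEncodeError:
        problems.append("not a valid UTF-8 string")
    # Assumption: length is counted in code points, not bytes.
    if len(name) > MAX_IDENTIFIER_LENGTH:
        problems.append(f"longer than {MAX_IDENTIFIER_LENGTH} characters")
    return problems


if __name__ == "__main__":
    print(identifier_problems("metrics_2024"))
    print(identifier_problems("x" * 300))
```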
| |
| == Partitioning Limitations |
| |
| * Tables must be manually pre-split into tablets using simple or compound primary |
| keys. Automatic splitting is not yet possible. Range partitions may be added |
| or dropped after a table has been created. See |
| <<schema_design.adoc#schema_design,Schema Design>> for more information. |
| |
| * Data in existing tables cannot currently be automatically repartitioned. As a workaround, |
| create a new table with the new partitioning and insert the contents of the old |
| table. |
| |
| * Tablets that lose a majority of replicas (such as 1 left out of 3) require manual |
| intervention to be repaired. |
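
The majority condition in the last item is simple arithmetic. The sketch below uses hypothetical helper names and only illustrates the counting; it does not query a cluster.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def needs_manual_repair(total_replicas: int, live_replicas: int) -> bool:
    """True when fewer than a majority of a tablet's replicas remain."""
    return live_replicas < majority(total_replicas)


if __name__ == "__main__":
    # The example from the text: 1 replica left out of 3.
    print(needs_manual_repair(3, 1))
    # Losing a single replica out of 3 still leaves a majority.
    print(needs_manual_repair(3, 2))
```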
| |
| == Cluster management |
| |
| * Recommended maximum point-to-point latency within a Kudu cluster is 20 milliseconds. |
| |
| * Recommended minimum point-to-point bandwidth within a Kudu cluster is 10 Gbps. |
| |
| * If you intend to use the location awareness feature to place tablet servers in |
| different locations, it is recommended to measure the bandwidth and latency between servers |
| to ensure they fit within the above guidelines. |
| |
| * All masters must be started at the same time when the cluster is started for the very first time. |
| |
| == Server management |
| |
| * Production deployments should configure at least 4 GiB of memory for tablet servers, |
| and ideally more than 16 GiB when approaching the data and tablet <<Scale>> limits. |
| |
| * Write-ahead logs (WALs) can only be stored on one disk. |
| |
| * Tablet servers cannot change their address or port. |
| |
| * Kudu requires the server-side clock to be synchronized using NTP. |
| Kudu masters and tablet servers crash if they detect that the clock is out of |
| sync. To keep the clock synchronized, either synchronize the local clock |
| using an NTP server such as `ntpd` or `chronyd`, or use the built-in NTP |
| client for Kudu masters and tablet servers. |
| |
| == Scale |
| |
| Kudu is known to run seamlessly across a wide array of environments and workloads |
| with minimal expertise and configuration at the following scale: |
| |
| * 3 master servers |
| |
| * 100 tablet servers |
| |
| * 8 TiB of stored data per tablet server, post-replication and post-compression. |
| |
| * 1000 tablets per tablet server, post-replication. |
| |
| * 60 tablets per table, per tablet server, at table-creation time. |
| |
| * 10 GiB of stored data per tablet. |
| |
| Staying within these limits will provide the most predictable and straightforward |
| Kudu experience. |
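
As a rough illustration, the baseline figures above can be written as a lookup table and compared against a planned deployment. The dictionary keys and the `exceeded_guidelines` helper are hypothetical names chosen for this sketch; they are not part of any Kudu API or tool.

```python
from typing import Dict, Tuple

# Baseline figures copied from the list above.
GUIDELINES: Dict[str, float] = {
    "tablet_servers": 100,
    "tib_per_tablet_server": 8,
    "tablets_per_tablet_server": 1000,
    "tablets_per_table_per_tablet_server": 60,
    "gib_per_tablet": 10,
}


def exceeded_guidelines(plan: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Map each exceeded guideline to a (planned value, guideline) pair."""
    return {
        key: (plan[key], limit)
        for key, limit in GUIDELINES.items()
        if key in plan and plan[key] > limit
    }


if __name__ == "__main__":
    plan = {
        "tablet_servers": 40,
        "tablets_per_tablet_server": 1500,
        "gib_per_tablet": 8,
    }
    for key, (planned, limit) in exceeded_guidelines(plan).items():
        print(f"{key}: planned {planned}, guideline {limit}")
```

The figures interact: a tablet server holding 1000 tablets of 10 GiB each would store roughly 9.8 TiB, which is already above the 8 TiB per-server figure, so each value is best checked on its own.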
| |
| However, experienced users who run on modern hardware, use the latest |
| versions of Kudu, test and tune Kudu for their use case, and work closely with |
| the community can comfortably achieve much higher scales. Below are some |
| anecdotal values that have been seen in real-world production clusters: |
| |
| * 3 master servers |
| |
| * 300+ tablet servers |
| |
| * 10+ TiB of stored data per tablet server, post-replication and post-compression. |
| |
| * 4000+ tablets per tablet server, post-replication. |
| |
| * 50 GiB of stored data per tablet. Going beyond this can cause issues such as |
| reduced performance, compaction issues, and slow tablet startup times. |
| |
| == Security Limitations |
| |
| * Row-level authorization is not available. |
| |
| * Data encryption at rest is not directly built into Kudu. Encryption of |
| Kudu data at rest can be achieved through the use of local block device |
| encryption software such as `dmcrypt`. |
| |
| * Server certificates generated by Kudu IPKI are incompatible with |
| link:https://www.bouncycastle.org/[bouncycastle] version 1.52 and earlier. See |
| link:https://issues.apache.org/jira/browse/KUDU-2145[KUDU-2145] for details. |
| |
| == Other Known Issues |
| |
| The following are known bugs and issues with the current release of Kudu. They will |
| be addressed in later releases. Note that this list is not exhaustive, and is meant |
| to communicate only the most important known issues. |
| |
| * If a tablet server has a very large number of tablets, it may take several minutes |
| to start up. It is recommended to limit the number of tablets per server to 1000 |
| or fewer. Consider this limitation when pre-splitting your tables. If you notice slow |
| start-up times, you can monitor the number of tablets per server in the web UI. |
| |
| * The NVM-based cache doesn't work reliably on RHEL6/CentOS6 |
| (see link:https://issues.apache.org/jira/browse/KUDU-2978[KUDU-2978]). |
| |
| * When upgrading a Kudu cluster with existing pre-1.11.0 tables to version |
| 1.11.0, the `live_row_count` and `on_disk_size` metrics might produce |
| inconsistent readings in some scenarios |
| (see link:https://issues.apache.org/jira/browse/KUDU-2986[KUDU-2986]). |
| |
| * In Kudu 1.10.0 and Kudu 1.11.0, the kudu-binary JAR (targeted for |
| containerized Kudu deployments using mini-cluster) contains the libnuma dynamic |
| library. Also, when building Kudu binaries in release mode with default cmake |
| settings, the libnuma library is linked statically with the Kudu binaries |
| (add `-DKUDU_LINK=dynamic` when running cmake to avoid that). The library is |
| licensed under LGPL v2.1; however, the ASF third-party license policy |
| explicitly prohibits including such content in releases: see |
| link:https://www.apache.org/legal/resolved.html#category-x[Category X]. This |
| issue has been addressed in the 1.10.1 and 1.11.1 patch releases, respectively |
| (see link:https://issues.apache.org/jira/browse/KUDU-2990[KUDU-2990]). |
| |
| * Due to a bug in versions of the SSSD PAC plugin prior to 1.16, a KRPC connection |
| negotiation may get stuck, leaving the whole process unable to negotiate |
| any new connections for about 5 minutes in a secure Kudu cluster using SSSD. |
| If using SSSD in secure Kudu cluster deployments, make sure the SSSD packages are |
| version 1.16 or newer |
| (see link:https://issues.apache.org/jira/browse/KUDU-3217[KUDU-3217]). |