| commit | 4af1cf5a09bcf5f199e4135db14bb6123987c102 | |
|---|---|---|
| author | Leonid <63977577+leborchuk@users.noreply.github.com> | Wed Feb 04 16:54:52 2026 +0300 |
| committer | GitHub <noreply@github.com> | Wed Feb 04 16:54:52 2026 +0300 |
| tree | 2d408b353d6630d53669bdc133c30f28fed41035 | |
| parent | 2cc5674ea6e3a5cc85ff9101daf08b8dee5d533e | |
Movable DataBase Locales for Cloudberry (#1363)

* Movable DataBase Locales for Cloudberry

We inherited this issue from PostgreSQL. PostgreSQL uses glibc to sort strings. In glibc 2.28, collations changed drastically (in general, glibc gives no guarantees about collation order across updates). Changing collations breaks indexes. Similarly, a cluster whose nodes use different collations behaves unpredictably.

What changed in glibc, and when, is documented at
https://github.com/ardentperf/glibc-unicode-sorting

There is also a dedicated PostgreSQL wiki page:
https://wiki.postgresql.org/wiki/Locale_data_changes

And a YouTube video:
https://www.youtube.com/watch?v=0E6O-V8Jato

In short, the issue can be seen from bash:

    ( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.UTF-8 sort

gives different results on Ubuntu 18.04 and 22.04.

The only way to solve the problem is to keep the symbol order unchanged. We freeze the symbol order and use it instead of the one from glibc. The solution is here:
https://github.com/postgredients/mdb-locales

In this PR I have added a PostgreSQL patch that replaces all glibc locale-related calls with calls to an external library. It is activated by a new configure parameter, --with-mdblocales, which is off by default. Using custom locales requires the libmdblocales1 package and the mdb-locales package with the symbol table. Building requires the libmdblocales-dev package with the headers.

Fixing the symbol order is necessary for OS upgrades. For example, Ubuntu 22.04 EOL is April 2027, Rocky 8 Active Support ended in May 2024, and its Security support ends in 2029.

We use Movable DataBase Locales in Greenplum 6 and in all our PostgreSQL installations (starting with PostgreSQL 12). This patch is adapted from the patch in our internal PostgreSQL 14 fork.

* mdb_admin role

This patch introduces a new pseudo-pre-defined role "mdb_admin".

It introduces two new functions:

    extern bool mdb_admin_allow_bypass_owner_checks(Oid userId, Oid ownerId);
    extern void check_mdb_admin_is_member_of_role(Oid member, Oid role);

They check mdb_admin membership and the correctness of role-to-role ownership transfers.

Our mdb_admin ACL model is the following:

  * Any user and/or role can be granted mdb_admin.
  * A member of mdb_admin can transfer ownership of relations, namespaces, and functions to other roles, provided the target role is none of: superuser, pg_read_server_files, pg_write_server_files, pg_execute_server_program.

* mdb_superuser role

This patch introduces a new pseudo-pre-defined role "mdb_superuser".

The role is capable of:

  * GRANT/REVOKE of any set of privileges to/from any object in the database.
  * Using the power of pg_database_owner in any database, including DROP of any object in the database (except the system catalog and the like).

The role is NOT capable of:

  * Creating a database, role, or extension, or altering other roles that have such privileges.
  * Transferring ownership to, or passing has_priv of, the roles: PG_READ_ALL_DATA, PG_WRITE_ALL_DATA, PG_EXECUTE_SERVER_PROGRAM, PG_READ_SERVER_FILES, PG_WRITE_SERVER_FILES, PG_DATABASE_OWNER.

Allow mdb_superuser to alter objects owned by pg_database_owner and to grant ACLs on them. Also, during ACL checks, allow mdb_superuser to use the power of the pg_database_owner role to pass the check.

* Extend multixact SLRU

The issue here is the same as in PostgreSQL. A good, detailed description is in Nikolay's blog post:
https://v2.postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful

See also the history of the original PostgreSQL patches at
https://commitfest.postgresql.org/patch/2627/

We could get all of those fixes by rebasing onto PG18, but for now we need to adjust the SLRU structure sizes.

---------

Co-authored-by: usernamedt <usernamedt@yandex-team.com>
Co-authored-by: reshke <reshkekirill@gmail.com>
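The commit message's claim that changing collations breaks indexes can be illustrated with a small Python sketch. Both ordering rules below are hypothetical stand-ins chosen only because they disagree on strings such as "1-1" and "11"; neither claims to reproduce the behavior of any particular glibc version. A sorted list plays the role of an index, and a binary search plays the role of an index lookup.

```python
import bisect

def key_ignore_punct(s):
    # Hypothetical rule A: ignore punctuation on the first pass,
    # then fall back to the raw string to break ties.
    return ("".join(ch for ch in s if ch.isalnum()), s)

values = ["1-1", "11", "1-2", "12"]

# "Index" built while rule A was in effect.
index_a = sorted(values, key=key_ignore_punct)

# "Index" built under hypothetical rule B: plain code-point order.
index_b = sorted(values)

def lookup(index, value):
    # Binary search that assumes the index is in code-point order (rule B).
    pos = bisect.bisect_left(index, value)
    return pos < len(index) and index[pos] == value

if __name__ == "__main__":
    print("built under A:", index_a)
    print("built under B:", index_b)
    # The row exists in both indexes, but only the index whose order
    # matches the lookup rule is searched correctly.
    print("found in B-ordered index:", lookup(index_b, "1-2"))
    print("found in A-ordered index:", lookup(index_a, "1-2"))
```

The value "1-2" is present in both lists, yet the lookup misses it in the list built under the other rule. That is the reason for freezing the symbol order instead of following whatever the installed C library provides.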
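The mdb_admin ownership-transfer rule described in the commit message can be modeled as a short Python sketch. It is not the patch's C implementation: the `Role` type and `can_transfer_ownership` function are hypothetical names, and treating a target that is merely a member of one of the restricted roles as forbidden is an assumption. The sketch covers only the mdb_admin path, not the ordinary owner or superuser checks.

```python
from dataclasses import dataclass

# Roles that, per the commit message, must not receive ownership
# from an mdb_admin member.
RESTRICTED_TARGETS = frozenset({
    "pg_read_server_files",
    "pg_write_server_files",
    "pg_execute_server_program",
})

@dataclass(frozen=True)
class Role:
    # Hypothetical, simplified view of a database role.
    name: str
    is_superuser: bool = False
    member_of: frozenset = frozenset()

def can_transfer_ownership(actor: Role, target: Role) -> bool:
    """Return True if the mdb_admin rule lets `actor` hand ownership to `target`."""
    if "mdb_admin" not in actor.member_of:
        return False  # the rule applies only to mdb_admin members
    if target.is_superuser:
        return False
    if target.name in RESTRICTED_TARGETS:
        return False
    # Assumption: membership in a restricted role is also disqualifying.
    if target.member_of & RESTRICTED_TARGETS:
        return False
    return True

if __name__ == "__main__":
    admin = Role("alice", member_of=frozenset({"mdb_admin"}))
    print(can_transfer_ownership(admin, Role("bob")))
    print(can_transfer_ownership(admin, Role("root", is_superuser=True)))
```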
Apache Cloudberry (Incubating), created by the original developers of Greenplum Database, is an advanced and mature open-source Massively Parallel Processing (MPP) database. It evolved from the open-source version of the Pivotal Greenplum Database®️ but features a newer PostgreSQL kernel and more advanced enterprise capabilities. It can serve as a data warehouse and can also be used for large-scale analytics and AI/ML workloads.
You can follow these guides to build Cloudberry on Linux (including RHEL/Rocky Linux and Ubuntu) and macOS.
You are welcome to try out Cloudberry by building a Docker-based Sandbox, which is tailored to help you gain a basic understanding of Cloudberry's capabilities and features.
This is the main repository for Apache Cloudberry (Incubating). Alongside this, there are several ecosystem repositories for Cloudberry, including the website, extensions, connectors, adapters, and other utilities.
We have many channels where community members can discuss, ask for help, give feedback, and chat:
| Type | Description |
|---|---|
| Slack | Click to join the real-time chat on Slack for QA, Dev, Events, and more. Don't miss out! Check out the Slack guide to learn more. |
| Q&A | For help with running or developing Cloudberry, visit GitHub Discussions - QA. |
| New ideas / Feature Requests | Share ideas for new features, visit GitHub Discussions - Ideas. |
| Report bugs | Problems and issues in Apache Cloudberry core. If you find a bug, please submit it here. |
| Report a security vulnerability | View our security policy to learn how to report and contact us. |
| Community events | Meetups, webinars, conferences, and more. Visit the Events page and subscribe to the events calendar. |
| Documentation | Official documentation for Cloudberry. You can explore it to discover more details about us. |
Contributions can be diverse: code enhancements, bug fixes, feature proposals, documents, marketing, and so on. No contribution is too small, and we encourage all types of contributions. The Cloudberry community welcomes contributions from anyone, new or experienced! Our contribution guide will help you get started.
| Type | Description |
|---|---|
| Code contribution | Learn how to contribute code to Cloudberry, including coding preparation, conventions, workflow, review, and checklist, by following the code contribution guide. |
| Submit the proposal | Propose major changes to Cloudberry through the proposal guide. |
| Doc contribution | Join us in improving the documentation; see the doc contribution guide. |
Check out our Cloudberry Roadmap to see the product plans and the goals we want to achieve. You are welcome to share your thoughts and ideas and join us in shaping the future of Apache Cloudberry (Incubating). (We will update the Roadmap after entering the Incubator.)
Thanks to PostgreSQL, Greenplum Database, and other great open source projects for giving Apache Cloudberry a sound foundation.
Cloudberry is licensed under the Apache License, Version 2.0. For details, see the LICENSE.
Apache Cloudberry is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required for all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.