Movable DataBase Locales for Cloudberry (#1363)

* Movable DataBase Locales for Cloudberry

We inherited this issue from PostgreSQL.

PostgreSQL uses glibc to sort strings. In glibc 2.28, collation order
changed substantially (in general, glibc gives no stability guarantees
across updates). Changing collations breaks indexes. Similarly, a cluster
whose nodes use different collations behaves unpredictably.

What changed in glibc, and when, is documented at
https://github.com/ardentperf/glibc-unicode-sorting
There is also a dedicated PostgreSQL wiki page: https://wiki.postgresql.org/wiki/Locale_data_changes
And a YouTube video: https://www.youtube.com/watch?v=0E6O-V8Jato

In short, the issue can be reproduced in bash:

( echo "1-1"; echo "11" ) | LC_COLLATE=en_US.UTF-8 sort

gives different results on Ubuntu 18.04 and 22.04.
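
The same check can be made from C, which is closer to what the server does. This is a minimal sketch, not code from the patch: the function names are hypothetical, and the result for en_US.UTF-8 depends on the glibc installed on the host, so no expected output is given for it.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Compare two strings under the named LC_COLLATE locale.
 * Returns -1, 0, or 1 (a normalized strcoll() result), or 2 if the
 * locale cannot be selected on this host.
 */
int
compare_in_locale(const char *locale_name, const char *a, const char *b)
{
    int         cmp;

    assert(locale_name != NULL && a != NULL && b != NULL);

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return 2;               /* locale unavailable */

    cmp = strcoll(a, b);        /* the libc call whose answer may change */
    setlocale(LC_COLLATE, "C"); /* reset to the default C locale */

    return (cmp > 0) - (cmp < 0);
}

/* Print which of "1-1" and "11" sorts first under the given locale. */
void
report_order(const char *locale_name)
{
    int         r = compare_in_locale(locale_name, "1-1", "11");

    if (r == 2)
        printf("%s: locale not available\n", locale_name);
    else if (r == 0)
        printf("%s: both strings compare equal\n", locale_name);
    else
        printf("%s: \"%s\" sorts first\n", locale_name, r < 0 ? "1-1" : "11");
}
```

Calling report_order("en_US.UTF-8") on hosts with different glibc versions shows whether the order differs between them. Under the "C" locale the comparison is plain byte order, so "1-1" always sorts first there.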

There is no way to solve the problem other than to keep the symbol order unchanged.
We freeze the symbol order and use it instead of the one from glibc.

The solution is here: https://github.com/postgredients/mdb-locales.
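
To illustrate what "freezing" means, here is a minimal sketch under stated assumptions. It is not the mdb-locales implementation: the weights and the names frozen_weight and frozen_compare are hypothetical. The point is only that the ordering rules are compiled into the program, so a libc upgrade cannot change the result.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical frozen weights: digits first, then letters without regard
 * to case; every other byte has weight 0 and is skipped on the first pass.
 */
int
frozen_weight(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return 1 + (c - '0');
    if (c >= 'a' && c <= 'z')
        return 11 + (c - 'a');
    if (c >= 'A' && c <= 'Z')
        return 11 + (c - 'A');
    return 0;
}

/* Compare by frozen weights; break ties by byte order. Returns -1, 0, 1. */
int
frozen_compare(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *) a;
    const unsigned char *pb = (const unsigned char *) b;
    int         tie;

    assert(a != NULL && b != NULL);

    for (;;)
    {
        int         wa;
        int         wb;

        /* skip bytes that carry no weight */
        while (*pa != '\0' && frozen_weight(*pa) == 0)
            pa++;
        while (*pb != '\0' && frozen_weight(*pb) == 0)
            pb++;
        if (*pa == '\0' || *pb == '\0')
            break;

        wa = frozen_weight(*pa);
        wb = frozen_weight(*pb);
        if (wa != wb)
            return (wa < wb) ? -1 : 1;
        pa++;
        pb++;
    }

    if (*pa != '\0')
        return 1;               /* a still has weighted bytes: sorts later */
    if (*pb != '\0')
        return -1;

    /* equal weights: fall back to byte order so distinct strings differ */
    tie = strcmp(a, b);
    return (tie > 0) - (tie < 0);
}
```

With these made-up weights, frozen_compare("1-1", "11") returns -1 on every host, whatever glibc is installed. The byte-order tie-break keeps distinct strings from comparing equal, which matters for index lookups.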

In this PR I have added a PostgreSQL patch that replaces all glibc
locale-related calls with calls to an external library. It is activated
by a new configure parameter, --with-mdblocales, which is off by
default.
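
The general shape of such a compile-time switch can be sketched as follows. The macro SKETCH_WITH_FROZEN_LOCALES and the function sketch_frozen_strcoll are hypothetical stand-ins, since the real symbol names are not given here. When the macro is undefined, as in a default build, the wrapper falls through to glibc.

```c
#include <assert.h>
#include <string.h>

/*
 * SKETCH_WITH_FROZEN_LOCALES is a hypothetical macro standing in for
 * whatever symbol the configure switch defines in the real build.
 */
#ifdef SKETCH_WITH_FROZEN_LOCALES
/* Hypothetical entry point supplied by an external collation library. */
extern int  sketch_frozen_strcoll(const char *a, const char *b);
#define SKETCH_STRCOLL(a, b) sketch_frozen_strcoll((a), (b))
#else
/* Default build: use the C library directly. */
#define SKETCH_STRCOLL(a, b) strcoll((a), (b))
#endif

/* Call sites use the wrapper and never call strcoll() themselves. */
int
db_compare_text(const char *a, const char *b)
{
    int         cmp;

    assert(a != NULL && b != NULL);

    cmp = SKETCH_STRCOLL(a, b);
    return (cmp > 0) - (cmp < 0);
}
```

Routing every comparison through one wrapper means the choice of collation provider is made once, at build time, and the call sites do not change.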

Using custom locales requires the libmdblocales1 package and the
mdb-locales package, which contains the symbol table.

Building requires the libmdblocales-dev package, which provides the headers.

Fixing the symbol order is necessary for OS upgrades. For example, Ubuntu 22.04 reaches EOL in April 2027; Rocky 8 active support ended in May 2024, and its security support ends in 2029.

We use Movable DataBase Locales in Greenplum 6 and in all our PostgreSQL installations (starting with PostgreSQL 12). This patch is adapted from the version in our internal PostgreSQL 14 fork.

* mdb_admin role

This patch introduces a new pseudo-predefined role, "mdb_admin".

It introduces two new functions:
extern bool mdb_admin_allow_bypass_owner_checks(Oid userId,  Oid ownerId);
extern void check_mdb_admin_is_member_of_role(Oid member, Oid role);

They check mdb_admin membership and the correctness of role-to-role
ownership transfers.

Our mdb_admin ACL model is the following:
 * Any user and/or role can be granted mdb_admin.
 * An mdb_admin member can transfer ownership of relations,
   namespaces and functions to other roles, provided the target role is
   none of: superuser, pg_read_server_files, pg_write_server_files, or
   pg_execute_server_program.
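
The transfer rule above can be modeled as follows. This is a simplified sketch, not the patch code: RoleSketch and both functions are hypothetical, and membership is reduced to boolean flags instead of catalog lookups by Oid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened view of a role; not the PostgreSQL catalog. */
typedef struct RoleSketch
{
    bool        is_superuser;
    bool        in_mdb_admin;
    bool        in_pg_read_server_files;
    bool        in_pg_write_server_files;
    bool        in_pg_execute_server_program;
} RoleSketch;

/* True if the role is one that must never receive objects via mdb_admin. */
bool
target_is_privileged(const RoleSketch *target)
{
    assert(target != NULL);

    return target->is_superuser ||
        target->in_pg_read_server_files ||
        target->in_pg_write_server_files ||
        target->in_pg_execute_server_program;
}

/* True if `member` may hand ownership to `target` by virtue of mdb_admin. */
bool
mdb_admin_may_transfer(const RoleSketch *member, const RoleSketch *target)
{
    assert(member != NULL && target != NULL);

    return member->in_mdb_admin && !target_is_privileged(target);
}
```

The restriction on the target is what keeps an mdb_admin member from escalating: an object cannot be handed to a role that can read or write server files or run server programs.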

* mdb_superuser role

This patch introduces a new pseudo-predefined role, "mdb_superuser".

Role is capable of:

GRANT/REVOKE any set of privileges on any object in the database.
Has the power of pg_database_owner in any database, including:
DROP any object in the database (except system catalogs and the like)

Role is NOT capable of:

Create a database, role, or extension, or alter other roles that have
such privileges.

Transfer ownership to, or pass the has_priv check for, the following roles:

PG_READ_ALL_DATA
PG_WRITE_ALL_DATA
PG_EXECUTE_SERVER_PROGRAM
PG_READ_SERVER_FILES
PG_WRITE_SERVER_FILES
PG_DATABASE_OWNER

Allow mdb_superuser to alter objects and grant ACLs on
objects owned by pg_database_owner. Also, during ACL checks,
allow mdb_superuser to use the power of the pg_database_owner role to pass the check.

* Extend multixact SLRU

The issue here is the same as in PostgreSQL; a good detailed description can be found in Nikolay's blog post https://v2.postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful
See also the history of the original PostgreSQL patches at https://commitfest.postgresql.org/patch/2627/. We could get all those fixes after rebasing to PG18, but for now we need to adjust the SLRU structure sizes.
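
Why the buffer count matters can be seen in a toy model. This is a sketch of a generic least-recently-used page cache with linear search, not the PostgreSQL SLRU code: all names and sizes are hypothetical. Once the working set exceeds the number of buffers, a cyclic access pattern misses on every access.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_BUFFERS 64

/* Toy LRU page cache; slots are searched linearly on every access. */
typedef struct SlruSketch
{
    int         nbuffers;
    long        page_no[SKETCH_MAX_BUFFERS];   /* -1 marks an empty slot */
    long        last_used[SKETCH_MAX_BUFFERS]; /* 0 means never used */
    long        clock;
    long        misses;
} SlruSketch;

void
slru_init(SlruSketch *s, int nbuffers)
{
    int         i;

    assert(s != NULL);
    assert(nbuffers > 0 && nbuffers <= SKETCH_MAX_BUFFERS);

    s->nbuffers = nbuffers;
    s->clock = 0;
    s->misses = 0;
    for (i = 0; i < nbuffers; i++)
    {
        s->page_no[i] = -1;
        s->last_used[i] = 0;
    }
}

/* Touch a page; on a miss, replace the least recently used slot. */
void
slru_access(SlruSketch *s, long page)
{
    int         i;
    int         victim = 0;

    s->clock++;
    for (i = 0; i < s->nbuffers; i++)
    {
        if (s->page_no[i] == page)
        {
            s->last_used[i] = s->clock; /* hit */
            return;
        }
        if (s->last_used[i] < s->last_used[victim])
            victim = i;
    }

    s->misses++;
    s->page_no[victim] = page;
    s->last_used[victim] = s->clock;
}

/* Cycle through `npages` pages `rounds` times and return the miss count. */
long
slru_misses_for_cycle(int nbuffers, int npages, int rounds)
{
    SlruSketch  s;
    int         r;
    int         p;

    slru_init(&s, nbuffers);
    for (r = 0; r < rounds; r++)
        for (p = 0; p < npages; p++)
            slru_access(&s, p);
    return s.misses;
}
```

In this model, cycling over 16 pages four times with 8 buffers misses on all 64 accesses, while 16 buffers miss only on the first 16. A larger buffer pool also makes each linear search longer, so the sizes have to be chosen with both costs in mind.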

---------
Co-authored-by: usernamedt <usernamedt@yandex-team.com>
Co-authored-by: reshke <reshkekirill@gmail.com>
56 files changed
tree: 2d408b353d6630d53669bdc133c30f28fed41035
README.md

Apache Cloudberry (Incubating)


Introduction

Apache Cloudberry (Incubating), created by the original developers of Greenplum Database, is an advanced and mature open-source Massively Parallel Processing (MPP) database. It evolved from the open-source version of the Pivotal Greenplum Database® but features a newer PostgreSQL kernel and more advanced enterprise capabilities. It can serve as a data warehouse and can also be used for large-scale analytics and AI/ML workloads.

Build and try out

Build from source

You can follow these guides to build Cloudberry on Linux OS (including RHEL/Rocky Linux, and Ubuntu) and macOS.

Try out quickly

You are welcome to try out Cloudberry by building a Docker-based sandbox, which is tailored to help you gain a basic understanding of Cloudberry's capabilities and features.

Repositories

This is the main repository for Apache Cloudberry (Incubating). Alongside this, there are several ecosystem repositories for Cloudberry, including the website, extensions, connectors, adapters, and other utilities.

Community & Support

We have many channels for community members to discuss, ask for help, give feedback, and chat:

| Type | Description |
|------|-------------|
| Slack | Click to join the real-time chat on Slack for QA, Dev, Events, and more. Don't miss out! Check out the Slack guide to learn more. |
| Q&A | To ask for help when running or developing Cloudberry, visit GitHub Discussions - QA. |
| New ideas / Feature Requests | To share ideas for new features, visit GitHub Discussions - Ideas. |
| Report bugs | Problems and issues in Apache Cloudberry core. If you find bugs, you are welcome to submit them here. |
| Report a security vulnerability | View our security policy to learn how to report and contact us. |
| Community events | Including meetups, webinars, conferences, and more events. Visit the Events page and subscribe to the events calendar. |
| Documentation | Official documentation for Cloudberry. You can explore it to discover more details about us. |

Contribution

Contributions can be diverse, such as code enhancements, bug fixes, feature proposals, documents, marketing, and so on. No contribution is too small, and we encourage all types of contributions. The Cloudberry community welcomes contributions from anyone, new or experienced! Our contribution guide will help you get started.

| Type | Description |
|------|-------------|
| Code contribution | Learn how to contribute code to Cloudberry, including coding preparation, conventions, workflow, review, and checklist, by following the code contribution guide. |
| Submit the proposal | Propose major changes to Cloudberry through the proposal guide. |
| Doc contribution | We need you to join us in improving the documentation; see the doc contribution guide. |

Roadmap

You can check out our Cloudberry Roadmap to see the product plans and goals we want to achieve. You are welcome to share your thoughts and ideas and join us in shaping the future of Apache Cloudberry (Incubating). (We will update the Roadmap after entering the Incubator.)

Acknowledgment

Thanks to PostgreSQL, Greenplum Database, and other great open source projects for giving Apache Cloudberry a sound foundation.

License

Cloudberry is licensed under the Apache License, Version 2.0. For details, see the LICENSE.

ASF Incubator disclaimer

Apache Cloudberry is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required for all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.