Stamp Cloudberry Database 1.5.0
Answer Query Using Materialized Views.

AQUMV, short for Answer Query Using Materialized Views, computes part or
all of a Query from materialized views during planning.
It can greatly reduce query processing time,
especially for aggregation queries over large tables[1].

AQUMV usually uses Incremental Materialized Views (IMVs) as candidates,
because IMVs usually hold real-time data even when there are write
operations on the related tables.

AQUMV is an equivalent transformation on the Query tree.
A materialized view (MV) can be used to compute a Query if:
1. The view contains all rows needed by the query expression.
   If the MV has more rows than the query wants, an additional filter may
   be added if possible.
2. All output expressions can be computed from the output of the view.
   The output expressions can be fully or partially matched from the MV's
   target list.
3. Cost-based.
   There may be multiple valid MV candidates, or selecting from an MV may
   not be better than selecting from the original table (e.g. when it has
   an index), so let the planner decide the best one.
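
The three conditions can be sketched as a small filter-then-pick routine.
This is only an illustration, not the planner code from this commit: the
`Candidate` class, the `choose_relation` function, the string form of the
quals and the cost numbers are all hypothetical, and row containment is
approximated by "every MV qual also appears in the query".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one materialized view."""
    name: str
    quals: frozenset    # WHERE conjuncts of the MV query, as strings
    outputs: frozenset  # base columns exposed by the MV target list
    cost: float         # made-up cost of scanning the MV


def choose_relation(query_quals, needed_cols, base_cost, candidates):
    """Return the name of the cheapest relation able to answer the query."""
    best_name, best_cost = "base_table", base_cost
    for mv in candidates:
        # Condition 1: the MV must not filter out rows the query needs.
        # Approximation: every MV qual is also a query qual.
        if not mv.quals <= query_quals:
            continue
        # Condition 2: every column needed by the query outputs and by
        # the leftover filters must be available from the MV.
        if not needed_cols <= mv.outputs:
            continue
        # Condition 3: keep the MV only if it is cheaper than the best so far.
        if mv.cost < best_cost:
            best_name, best_cost = mv.name, mv.cost
    return best_name
```

An MV that passes the first two checks can still lose to the original
table when the table's cost is lower, which corresponds to the cost-based
condition.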

Construct rows by splitting the MV query quals (mv_query_quals) and the
Query quals (origin_query_quals) into a difference set and an intersection
set.
The post_quals, formed by {origin_query_quals - mv_query_quals}, are
processed against the MV query's target list and rewritten into
expressions over the MV relation's target list.
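
A minimal sketch of the qual splitting, assuming each qual is a simple
(column, operator, value) conjunct. The function names, the tuple
representation and the `mv_target_map` dictionary (base column to MV
column) are hypothetical. Rejecting the MV when it has a qual the query
lacks is an assumption of this sketch, following the rule that the view
must contain all rows the query needs.

```python
def split_quals(origin_query_quals, mv_query_quals):
    """Split quals into (intersection, post_quals), or None if the MV is unusable."""
    origin_set = set(origin_query_quals)
    mv_set = set(mv_query_quals)
    # Assumption: an MV qual missing from the query means the MV may have
    # dropped rows the query needs, so the MV cannot be used.
    if mv_set - origin_set:
        return None
    intersection = [q for q in origin_query_quals if q in mv_set]
    # post_quals = origin_query_quals - mv_query_quals
    post_quals = [q for q in origin_query_quals if q not in mv_set]
    return intersection, post_quals


def rewrite_post_quals(post_quals, mv_target_map):
    """Rewrite post_quals over MV columns, or return None if a column is missing."""
    rewritten = []
    for column, op, value in post_quals:
        if column not in mv_target_map:
            return None  # the MV target list does not expose this column
        rewritten.append((mv_target_map[column], op, value))
    return rewritten


# MV:    SELECT c1 AS mc1, c2 AS mc2 FROM t WHERE c1 > 10
# Query: SELECT c2 FROM t WHERE c1 > 10 AND c2 < 5
origin = [("c1", ">", 10), ("c2", "<", 5)]
mv = [("c1", ">", 10)]
shared, post = split_quals(origin, mv)
extra_filter = rewrite_post_quals(post, {"c1": "mc1", "c2": "mc2"})
```

Here only `c2 < 5` remains as a post qual, and it becomes a filter on the
MV column `mc2`.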

Construct column expressions using a greedy algorithm.
Sort the MV query's target list by complexity, and try to rewrite
expressions in that order.
Expressions that have no Vars (Const expressions) are kept for the upper
level, or are rewritten if there are corresponding expressions.
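
The greedy rewrite can be sketched on toy expression trees. The
nested-tuple representation, the node kinds ("col", "const", "mvcol") and
all function names are hypothetical. Complexity is approximated by node
count and the most complex targets are tried first, which is one possible
reading of "sort by complexity".

```python
LEAF_KINDS = {"col", "const", "mvcol"}


def complexity(expr):
    """Number of nodes in the expression tree."""
    if expr[0] in LEAF_KINDS:
        return 1
    return 1 + sum(complexity(child) for child in expr[1:])


def replace(expr, target, replacement):
    """Replace every occurrence of target inside expr."""
    if expr == target:
        return replacement
    if expr[0] in LEAF_KINDS:
        return expr
    return (expr[0],) + tuple(replace(c, target, replacement) for c in expr[1:])


def has_base_column(expr):
    """True if the expression still refers to a base-table column."""
    if expr[0] == "col":
        return True
    if expr[0] in LEAF_KINDS:
        return False
    return any(has_base_column(child) for child in expr[1:])


def rewrite_expression(expr, mv_target_list):
    """Greedily rewrite expr over MV columns; None if base columns remain."""
    # Try the most complex MV target expressions first.
    ordered = sorted(mv_target_list, key=lambda t: complexity(t[0]), reverse=True)
    for target_expr, mv_column in ordered:
        expr = replace(expr, target_expr, ("mvcol", mv_column))
    # Constants have no Vars and are kept as they are.
    return None if has_base_column(expr) else expr


c1, c2 = ("col", "c1"), ("col", "c2")
# MV target list: c1 AS mc1, c1 + c2 AS mc_sum
targets = [(c1, "mc1"), (("+", c1, c2), "mc_sum")]
# Query expression: (c1 + c2) * 2 + c1
query_expr = ("+", ("*", ("+", c1, c2), ("const", 2)), c1)
rewritten = rewrite_expression(query_expr, targets)
```

If `c1` were replaced before `c1 + c2`, the sum would turn into
`mc1 + c2` and could no longer be matched, which is why the order matters
in this sketch.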

Reference:
   [1] Optimizing Queries Using Materialized Views: A Practical,
       Scalable Solution.
       https://courses.cs.washington.edu/courses/cse591d/01sp/opt_views.pdf

Authored-by: Zhang Mingli avamingli@gmail.com
9 files changed
tree: 1696c942dc4beffd4b3ec7383c964cd924087e04
  1. .github/
  2. concourse/
  3. config/
  4. contrib/
  5. deploy/
  6. doc/
  7. gpAux/
  8. gpcontrib/
  9. gpMgmt/
  10. hd-ci/
  11. hooks/
  12. readmes/
  13. src/
  14. .clang-tidy
  15. .cloudberry-db.spec
  16. .dir-locals.el
  17. .editorconfig
  18. .git-blame-ignore-revs
  19. .gitattributes
  20. .gitignore
  21. .gitmessage
  22. .gitmodules
  23. aclocal.m4
  24. CODE_OF_CONDUCT.md
  25. configure
  26. configure.ac
  27. CONTRIBUTING.md
  28. COPYRIGHT
  29. getversion
  30. GNUmakefile.in
  31. HISTORY
  32. LICENSE
  33. logo_cloudberry_database.svg
  34. Makefile
  35. NOTICE
  36. putversion
  37. README.git
  38. README.md
  39. README.PostgreSQL
  40. SECURITY.md
README.md



Cloudberry Database (CBDB) is shipped with PostgreSQL 14.4 as its kernel and is forked from Greenplum Database 7, which serves as our code base.

Features

Cloudberry Database is 100% compatible with Greenplum, and provides all the Greenplum features you need. In addition, Cloudberry Database possesses some features that Greenplum currently lacks or does not support. Visit this feature comparison doc for details.

Code layout

The directory layout of the repository follows the same general layout as upstream PostgreSQL. There are changes compared to PostgreSQL throughout the codebase, but a few larger additions worth noting:

  • gpMgmt/ : Contains CloudberryDB-specific command-line tools for managing the cluster. Scripts like gpinit, gpstart, and gpstop live here. They are mostly written in Python.

  • gpAux/ : Contains CloudberryDB-specific release management scripts, and vendored dependencies. Some additional directories are submodules and will be made available over time.

  • gpcontrib/ : Much like the PostgreSQL contrib/ directory, this directory contains extensions such as gpfdist, PXF and gpmapreduce which are CloudberryDB-specific.

  • doc/ : In PostgreSQL, the user manual lives here. In Cloudberry Database, the user manual is maintained separately at Cloudberry Database Website Repo.

  • hd-ci/ : Contains configuration files for the CBDB continuous integration system.

  • src/

    • src/backend/cdb/ : Contains larger CloudberryDB-specific backend modules. For example, communication between segments, turning plans into parallelizable plans, mirroring, distributed transaction and snapshot management, etc. cdb stands for Cluster Database - it was a working name used in the early days. That name is no longer used, but the cdb prefix remains.

    • src/backend/gpopt/ : Contains the so-called translator library, for using the GPORCA optimizer with Cloudberry Database. The translator library is written in C++, and contains glue code for translating plans and queries between the DXL format used by GPORCA and the PostgreSQL internal representation.

    • src/backend/gporca/ : Contains the GPORCA optimizer code and tests. This is written in C++. See README.md for more information and how to unit-test GPORCA.

    • src/backend/fts/ : FTS is a process that runs in the coordinator node, and periodically polls the segments to maintain the status of each segment.

Building Cloudberry Database

You can follow these guides to build Cloudberry Database on Linux (including CentOS, RHEL, and Ubuntu) and macOS.

Documentation

For Cloudberry Database documentation, please check the documentation website. Our documentation is still under construction, and help is welcome. If you're interested in contributing to the documentation, you can submit a pull request here.

We also recommend using the PostgreSQL Documentation and the Greenplum Documentation as quick references.

Contribution

Cloudberry Database is actively maintained by a community of database experts, both individuals and companies. We believe in the Apache Way “Community Over Code” and we want to make Cloudberry Database a community-driven project.

Contributions can be diverse, such as code enhancements, bug fixes, feature proposals, documents, marketing, and so on. No contribution is too small, and we encourage all types of contributions. The Cloudberry Database community welcomes contributions from anyone, new and experienced! Our contribution guide will help you get started.

| Type | Description |
|------|-------------|
| Code contribution | Learn how to contribute code to the Cloudberry Database, including coding preparation, conventions, workflow, review, and checklist following the code contribution guide. |
| Submit the proposal | Proposing major changes to Cloudberry Database through proposal guide. |
| Doc contribution | We need you to join us to help us improve the documentation, see the doc contribution guide. |

For better collaboration, it's important for developers to learn how to work well with Git and GitHub, see the guide “Working with Git & GitHub”.

Community & Support

We have many channels for community members to discuss, ask for help, give feedback, and chat:

| Type | Description |
|------|-------------|
| Slack | Click to Join the real-time chat on Slack for QA, Dev, Events, and more. Don't miss out! Check out the Slack guide to learn more. |
| Q&A | Ask for help when running/developing Cloudberry Database, visit GitHub Discussions - QA. |
| New ideas / Feature Requests | Share ideas for new features, visit GitHub Discussions - Ideas. |
| Report bugs | Problems and issues in Cloudberry Database core. If you find bugs, welcome to submit them here. |
| Report a security vulnerability | View our security policy to learn how to report and contact us. |
| Community events | Including meetups, webinars, conferences, and more events, visit the Events page and subscribe events calendar. |
| Documentation | Official documentation for Cloudberry Database. You can explore it to discover more details about us. |

When you are involved, please follow our community Code of Conduct to help create a safe space for everyone.

Acknowledgment

Thanks to PostgreSQL, Greenplum Database and other great open source projects for giving Cloudberry Database a sound foundation.

License

Cloudberry Database is released under the Apache License, Version 2.0.