202c78c8f32bd3ec5e70e021b1494d878f38d678 - druid

commit	202c78c8f32bd3ec5e70e021b1494d878f38d678	[log] [tgz]
author	Gian Merlino <gianmerlino@gmail.com>	Wed Apr 14 10:49:27 2021 -0700
committer	GitHub <noreply@github.com>	Wed Apr 14 10:49:27 2021 -0700
tree	7b2f3dc6d420dabdbedc96a65e15a7903d022284
parent	b51632b0bf38d1ef396e62a4851d84433029713d [diff]

commit

202c78c8f32bd3ec5e70e021b1494d878f38d678

[log] [tgz]

author

Gian Merlino <gianmerlino@gmail.com>

Wed Apr 14 10:49:27 2021 -0700

committer

GitHub <noreply@github.com>

Wed Apr 14 10:49:27 2021 -0700

tree

7b2f3dc6d420dabdbedc96a65e15a7903d022284

parent

b51632b0bf38d1ef396e62a4851d84433029713d [diff]

Enable rewriting certain inner joins as filters. (#11068) * Enable rewriting certain inner joins as filters. The main logic for doing the rewrite is in JoinableFactoryWrapper's segmentMapFn method. The requirements are: - It must be an inner equi-join. - The right-hand columns referenced by the condition must not contain any duplicate values. (If they did, the inner join would not be guaranteed to return at most one row for each left-hand-side row.) - No columns from the right-hand side can be used by anything other than the join condition itself. HashJoinSegmentStorageAdapter is also modified to pass through to the base adapter (even allowing vectorization!) in the case where 100% of join clauses could be rewritten as filters. In support of this goal: - Add Query getRequiredColumns() method to help us figure out whether the right-hand side of a join datasource is being used or not. - Add JoinConditionAnalysis getRequiredColumns() method to help us figure out if the right-hand side of a join is being used by later join clauses acting on the same base. - Add Joinable getNonNullColumnValuesIfAllUnique method to enable retrieving the set of values that will form the "in" filter. - Add LookupExtractor canGetKeySet() and keySet() methods to support LookupJoinable in its efforts to implement the new Joinable method. - Add "enableRewriteJoinToFilter" feature flag to JoinFilterRewriteConfig. The default is disabled. * Test improvements. * Test fixes. * Avoid slow size() call. * Remove invalid test. * Fix style. * Fix mistaken default. * Small fixes. * Fix logic error.

tree: 7b2f3dc6d420dabdbedc96a65e15a7903d022284

README.md

Apache Druid

Druid is a high performance real-time analytics database. Druid's main value add is to reduce time to insight and action.

Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open source alternative to data warehouses for a variety of use cases.

Getting started

You can get started with Druid with our local or Docker quickstart.

Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).

Load data

Load streaming and batch data using a point-and-click wizard to guide you through ingestion setup. Monitor one off tasks and ingestion supervisors.

Manage the cluster

Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and services from one convenient location. All powered by SQL systems tables, allowing you to see the underlying query for each view.

Issue queries

Use the built-in query workbench to prototype DruidSQL and native queries or connect one of the many tools that help you make the most out of Druid.

Documentation

You can find the documentation for the latest Druid release on the project website.

If you would like to contribute documentation, please do so under /docs in this repository and submit a pull request.

Community

Community support is available on the druid-user mailing list, which is hosted at Google Groups.

Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.

Chat with Druid committers and users in real-time on the #druid channel in the Apache Slack team. Please use this invitation link to join the ASF Slack, and once joined, go into the #druid channel.

Building from source

Please note that JDK 8 is required to build Druid.

For instructions on building Druid from source, see docs/development/build.md

Contributing

Please follow the community guidelines for contributing.

For instructions on setting up IntelliJ dev/intellij-setup.md

License

Apache License, Version 2.0