commit ce1faa56352766522610e3ae306c15b55df19fb0
author: Abhishek Agarwal <1477457+abhishekagarwal87@users.noreply.github.com>  Thu Jul 22 18:00:49 2021 +0530
committer: GitHub <noreply@github.com>  Thu Jul 22 18:00:49 2021 +0530
tree e797002d9dcfbe495cec0627d9b39da73a93c41b
parent 167c45260c76057b9856bd073661365663bd80f2
Make SegmentLoader extensible and customizable (#11398)

This PR refactors the code related to segment loading, specifically SegmentLoader and SegmentLoaderLocalCacheManager. SegmentLoader is marked UnstableAPI, which means it can be extended outside core Druid in custom extensions. Summary of changes:

- SegmentLoader returns an instance of ReferenceCountingSegment instead of Segment. Earlier, SegmentManager wrapped Segment objects inside ReferenceCountingSegment; that wrapping is now moved to SegmentLoader. With this, a custom implementation can track the references of segments, and can create custom ReferenceCountingSegment implementations. For this reason, the constructor visibility in ReferenceCountingSegment is changed from private to protected.
- SegmentCacheManager has two additional methods: reserve(DataSegment) and release(DataSegment). These methods let the caller reserve or release space without calling SegmentLoader#getSegment. Similar methods already existed in StorageLocation, and they are now also available in SegmentCacheManager, which wraps multiple locations.
- Refactoring to simplify the code in SegmentCacheManager wherever possible. There is no change in functionality.
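The reserve/release contract described in this commit can be sketched as follows. This is a minimal illustration, not Druid's actual API: the classes below (SimpleSegmentCacheManager, the stripped-down DataSegment) are hypothetical stand-ins for the real types in org.apache.druid.segment.loading, showing how a caller could claim cache space before downloading a segment and give it back on failure.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for Druid's DataSegment metadata object.
final class DataSegment {
    final String id;
    final long sizeBytes;
    DataSegment(String id, long sizeBytes) {
        this.id = id;
        this.sizeBytes = sizeBytes;
    }
}

// Sketch of a cache manager with the reserve/release shape described above:
// reserve() claims space before a segment is fetched; release() returns it
// if loading is skipped or fails. Not Druid's real SegmentCacheManager.
final class SimpleSegmentCacheManager {
    private final long capacityBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    SimpleSegmentCacheManager(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Try to reserve space for the segment; returns true on success. */
    boolean reserve(DataSegment segment) {
        long prev, next;
        do {
            prev = usedBytes.get();
            next = prev + segment.sizeBytes;
            if (next > capacityBytes) {
                return false; // not enough room in this cache
            }
        } while (!usedBytes.compareAndSet(prev, next));
        return true;
    }

    /** Release space previously reserved for the segment. */
    void release(DataSegment segment) {
        usedBytes.addAndGet(-segment.sizeBytes);
    }

    long used() {
        return usedBytes.get();
    }
}

public class Main {
    public static void main(String[] args) {
        SimpleSegmentCacheManager mgr = new SimpleSegmentCacheManager(100);
        DataSegment seg = new DataSegment("example_2021-07-01", 60);
        System.out.println(mgr.reserve(seg)); // fits: 60 <= 100
        System.out.println(mgr.reserve(new DataSegment("example_2021-07-02", 60))); // 120 > 100
        mgr.release(seg);
        System.out.println(mgr.used());
    }
}
```

The compare-and-set loop makes reserve() safe under concurrent callers, which mirrors why a reservation step separate from getSegment is useful: space can be claimed atomically before the (slow) download begins.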
Website | Documentation | Developer Mailing List | User Mailing List | Slack | Twitter | Download
Druid is a high performance real-time analytics database. Druid's main value add is to reduce time to insight and action.
Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open source alternative to data warehouses for a variety of use cases. The design documentation explains the key concepts.
You can get started with Druid with our local or Docker quickstart.
Druid provides a rich set of APIs (via HTTP and JDBC) for loading, managing, and querying your data. You can also interact with Druid via the built-in console (shown below).
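As an illustration of the HTTP API, here is a minimal sketch of a Druid SQL query request. Druid's SQL endpoint accepts a POST with a JSON body of the form {"query": "..."}; the router port (8888, the local quickstart default) and the datasource name are assumptions, and the request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class Main {
    public static void main(String[] args) {
        // Druid SQL over HTTP: POST a JSON body {"query": "<DruidSQL>"}.
        String sql = "SELECT COUNT(*) FROM wikipedia"; // hypothetical datasource
        String body = "{\"query\": \"" + sql + "\"}";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8888/druid/v2/sql/"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
        // Against a running cluster, you would send it with:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```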
Load streaming and batch data using a point-and-click wizard to guide you through ingestion setup. Monitor one-off tasks and ingestion supervisors.
Manage your cluster with ease. Get a view of your datasources, segments, ingestion tasks, and services from one convenient location. All powered by SQL systems tables, allowing you to see the underlying query for each view.
Use the built-in query workbench to prototype DruidSQL and native queries or connect one of the many tools that help you make the most out of Druid.
You can find the documentation for the latest Druid release on the project website.
If you would like to contribute documentation, please do so under /docs in this repository and submit a pull request.
Community support is available on the druid-user mailing list, which is hosted at Google Groups.
Development discussions occur on dev@druid.apache.org, which you can subscribe to by emailing dev-subscribe@druid.apache.org.
Chat with Druid committers and users in real time on the #druid channel in the Apache Slack team. Please use this invitation link to join the ASF Slack, and once joined, go into the #druid channel.
Please note that JDK 8 is required to build Druid.
For instructions on building Druid from source, see docs/development/build.md.
Please follow the community guidelines for contributing.
For instructions on setting up IntelliJ, see dev/intellij-setup.md.