| # Metastore ANALYZE commands |
| |
| Drill provides the functionality to collect, use and store table metadata into Drill Metastore. |
| |
| Set `metastore.enabled` option to true to enable Metastore usage. |
| |
| To collect table metadata, the following command should be used: |
| |
| ``` |
| ANALYZE TABLE [table_name] [COLUMNS (col1, col2, ...)] |
| REFRESH METADATA [partition LEVEL] |
| {COMPUTE | ESTIMATE} | STATISTICS [(column1, column2, ...)] |
| [ SAMPLE numeric PERCENT ] |
| ``` |
| |
| For the case when this command is executed for the first time, whole table metadata will be collected and stored into |
| Metastore. |
| If analyze was already executed for the table, and table data wasn't changed, all further analyze commands wouldn't |
| trigger table analyzing and message that table metadata is up to date will be returned. |
| |
| # Incremental analyze |
| |
| For the case when some table data was updated, Drill will try to execute incremental analyze - calculate metadata only |
| for updated data and reuse required metadata from the Metastore. |
| |
| Incremental analyze wouldn't be produced for the following cases: |
| - list of interesting columns specified in analyze is not a subset of interesting columns from the previous analyze; |
| - specified metadata level differs from the metadata level in previous analyze. |
| |
| # Metadata usage |
| |
| Drill provides the ability to use metadata obtained from the Metastore at the planning stage to prune segments, files |
| and row groups. |
| |
| Tables metadata from the Metastore is exposed to `INFORMATION_SCHEMA` tables (if Metastore usage is enabled). |
| |
| The following tables are populated with table metadata from the Metastore: |
| |
| `TABLES` table has the following additional columns populated from the Metastore: |
| - `TABLE_SOURCE` - table data type: `PARQUET`, `CSV`, `JSON` |
| - `LOCATION` - table location: `/tmp/nation` |
| - `NUM_ROWS` - number of rows in a table if known, `null` if not known |
| - `LAST_MODIFIED_TIME` - table's last modification time |
| |
| `COLUMNS` table has the following additional columns populated from the Metastore: |
| - `COLUMN_DEFAULT` - column default value |
| - `COLUMN_FORMAT` - usually applicable for date time columns: `yyyy-MM-dd` |
| - `NUM_NULLS` - number of nulls in column values |
| - `MIN_VAL` - column min value in String representation: `aaa` |
| - `MAX_VAL` - column max value in String representation: `zzz` |
| - `NDV` - number of distinct values in column, expressed in Double |
| - `EST_NUM_NON_NULLS` - estimated number of non null values, expressed in Double |
| - `IS_NESTED` - if column is nested. Nested columns are extracted from columns with struct type. |
| |
| `PARTITIONS` table has the following additional columns populated from the Metastore: |
| - `TABLE_CATALOG` - table catalog (currently we have only one catalog): `DRILL` |
| - `TABLE_SCHEMA` - table schema: `dfs.tmp` |
| - `TABLE_NAME` - table name: `nation` |
| - `METADATA_KEY` - top level segment key, the same for all nested segments and partitions: `part_int=3` |
| - `METADATA_TYPE` - `SEGMENT` or `PARTITION` |
| - `METADATA_IDENTIFIER` - current metadata identifier: `part_int=3/part_varchar=g` |
| - `PARTITION_COLUMN` - partition column name: `part_varchar` |
| - `PARTITION_VALUE` - partition column value: `g` |
| - `LOCATION` - segment location, `null` for partitions: `/tmp/nation/part_int=3` |
| - `LAST_MODIFIED_TIME` - last modification time |
| |
| # Metastore related options |
| |
| - `metastore.enabled` - enables Drill Metastore usage to be able to store table metadata during `ANALYZE TABLE` commands |
| execution and to be able to read table metadata during regular queries execution or when querying some `INFORMATION_SCHEMA` tables. |
| - `metastore.metadata.store.depth_level` - specifies maximum level depth for collecting metadata. |
| Possible values : `TABLE`, `SEGMENT`, `PARTITION`, `FILE`, `ROW_GROUP`, `ALL`. |
| - `metastore.metadata.use_schema` - enables schema usage, stored to the Metastore. |
| - `metastore.metadata.use_statistics` - enables statistics usage, stored in the Metastore, at the planning stage. |
| - `metastore.metadata.fallback_to_file_metadata` - allows using file metadata cache for the case when required metadata is absent in the Metastore. |
| - `metastore.retrieval.retry_attempts` - specifies the number of attempts for retrying query planning after detecting that query metadata is changed. |
| If the number of retries was exceeded, query will be planned without metadata information from the Metastore. |
| |
| # Analyze operators description |
| |
| Entry point for `ANALYZE` command is `MetastoreAnalyzeTableHandler` class. It creates plan which includes some |
| Metastore specific operators for collecting metadata. |
| |
| `MetastoreAnalyzeTableHandler` uses `AnalyzeInfoProvider` for providing the information |
| required for building a suitable plan for collecting metadata. |
| Each group scan should provide corresponding `AnalyzeInfoProvider` implementation class. |
| |
| Analyze command specific operators: |
| - `MetadataAggBatch` - operator which adds aggregate calls for all incoming table columns to calculate required |
| metadata and produces aggregations. If aggregation is performed on top of another aggregation, |
| required aggregate calls for merging metadata will be added. |
| - `MetadataHandlerBatch` - operator responsible for handling metadata returned by incoming aggregate operators and |
| fetching required metadata form the Metastore to produce further aggregations. |
| - `MetadataControllerBatch` - responsible for converting obtained metadata, fetching absent metadata from the Metastore |
| and storing resulting metadata into the Metastore. |
| |
| `MetastoreAnalyzeTableHandler` forms plan depending on segments count in the following form: |
| |
| ``` |
| MetadataControllerRel |
| ... |
| MetadataHandlerRel |
| MetadataAggRel(dir0, ...) |
| MetadataHandlerRel |
| MetadataAggRel(dir0, dir1, ...) |
| MetadataHandlerRel |
| MetadataAggRel(dir0, dir1, fqn, ...) |
| DrillScanRel(DYNAMIC_STAR **, ANY fqn, ...) |
| ``` |
| |
| For the case when `ANALYZE` uses columns for which statistics is present in parquet metadata, |
| `ConvertMetadataAggregateToDirectScanRule` rule will be applied to the |
| |
| ``` |
| MetadataAggRel(dir0, dir1, fqn, ...) |
| DrillScanRel(DYNAMIC_STAR **, ANY fqn, ...) |
| ``` |
| |
| plan part and convert it to the `DrillDirectScanRel` populated with row group metadata for the case when `ANALYZE` |
| was done for `ROW_GROUP` metadata level. |
| For the case when metadata level in `ANALYZE` is not `ROW_GROUP`, the plan above will be converted into the following plan: |
| |
| ``` |
| MetadataAggRel(metadataLevel=FILE (or another non-ROW_GROUP value), createNewAggregations=false) |
| DrillDirectScanRel |
| ``` |
| |
| When it is converted into the physical plan, two-phase aggregation may be used for the case when incoming row |
| count is greater than `planner.slice_target` option value. In this case, the lowest aggregation will be hash |
| aggregation and it will be executed on the same minor fragments where the scan is produced. `Sort` operator will be |
| placed above hash aggregation. `HashToMergeExchange` operator above `Sort` will send aggregated sorted data to the |
| stream aggregate above. |
| |
| Example of the resulting plan: |
| |
| ``` |
| MetadataControllerPrel |
| ... |
| MetadataStreamAggPrel(PHASE_1of1) |
| SortPrel |
| MetadataHandlerPrel |
| MetadataStreamAggPrel(PHASE_2of2) |
| HashToMergeExchangePrel |
| SortPrel |
| MetadataHashAggPrel(PHASE_1of2) |
| ScanPrel |
| ``` |
| |
| The lowest `MetadataStreamAggBatch` (or `MetadataHashAggBatch` for the case of two-phase aggregation with |
| `MetadataStreamAggBatch` above) creates required aggregate calls for every (or interesting only) table columns |
| and produces aggregations with grouping by segment columns that correspond to specific table level. |
| `MetadataHandlerBatch` above it populates batch with additional information about metadata type and other info. |
| `MetadataStreamAggBatch` above merges metadata calculated before to obtain metadata for parent metadata levels and also stores incoming data to populate it to the Metastore later. |
| |
| `MetadataControllerBatch` obtains all calculated metadata, converts it to the suitable form and sends it to the Metastore. |
| |
| For the case of incremental analyze, `MetastoreAnalyzeTableHandler` creates Scan with updated files only |
| and provides `MetadataHandlerBatch` with information about metadata which should be fetched from the Metastore, so existing actual metadata wouldn't be recalculated. |