Timeseries DataMap is a pre-aggregate table implementation based on ‘pre-aggregate’ DataMap. Difference is that Timeseries DataMap has built-in understanding of time hierarchy and levels: year, month, day, hour, minute, so that it supports automatic roll-up in time dimension for query.
CAUTION: Current version of CarbonData does not support roll-up.It will be implemented in future versions.
The data loading, querying, compaction command and its behavior is the same as preaggregate DataMap. Please refer to Pre-aggregate DataMap for more information.
To use this datamap, user can create multiple timeseries datamap on the main table which has a event_time column, one datamap for one time granularity.
For example, below statement effectively create multiple pre-aggregate tables on main table called timeseries
CREATE DATAMAP agg_year ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'year_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex CREATE DATAMAP agg_month ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'month_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex CREATE DATAMAP agg_day ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex CREATE DATAMAP agg_sales_hour ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex CREATE DATAMAP agg_minute ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'minute_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex
For querying timeseries data, Carbondata has builtin support for following time related UDF
timeseries(timeseries column name, 'aggregation level')
SELECT timeseries(order_time, 'hour'), sum(quantity) FROM sales GROUP BY timeseries(order_time,'hour')
It is not necessary to create pre-aggregate tables for each granularity unless required for query.
For Example: For main table sales , if following timeseries datamaps were created for day level and hour level pre-aggregate
CREATE DATAMAP agg_day ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'day_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex CREATE DATAMAP agg_sales_hour ON TABLE sales USING "timeseries" DMPROPERTIES ( 'event_time'='order_time', 'hour_granularity'='1', ) AS SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price), avg(price) FROM sales GROUP BY order_time, country, sex
Queries like below will not be rolled-up and hit the main table
Select timeseries(order_time, 'month'), sum(quantity) from sales group by timeseries(order_time, 'month') Select timeseries(order_time, 'year'), sum(quantity) from sales group by timeseries(order_time, 'year')
NOTE (RESTRICTION):
Refer to Compaction section in preaggregation datamap. Same applies to timeseries datamap.
Refer to Data Management section in preaggregation datamap. Same applies to timeseries datamap.