blob: b1a25c3b5bb58fc2d0773dc168c110ee5e60c635 [file] [log] [blame]
~~
~~ Licensed to the Apache Software Foundation (ASF) under one
~~ or more contributor license agreements. See the NOTICE file
~~ distributed with this work for additional information
~~ regarding copyright ownership. The ASF licenses this file
~~ to you under the Apache License, Version 2.0 (the
~~ "License"); you may not use this file except in compliance
~~ with the License. You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing,
~~ software distributed under the License is distributed on an
~~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~~ KIND, either express or implied. See the License for the
~~ specific language governing permissions and limitations
~~ under the License.
~~
Hive driver configuration
===
*--+--+---+--+
|<<No.>>|<<Property Name>>|<<Default Value>>|<<Description>>|
*--+--+---+--+
|1|hive.server.read.socket.timeout|10|Socket timeout for the client connection|
*--+--+---+--+
|2|hive.server.tcp.keepalive|true|TCP Keep alive socket option for HiveServer connection|
*--+--+---+--+
|3|hive.server2.thrift.bind.host| |The host on which hive server is running|
*--+--+---+--+
|4|hive.server2.thrift.client.connect.retry.limit|1|Number of times to retry a connection to a Thrift hive server|
*--+--+---+--+
|5|hive.server2.thrift.client.retry.delay.seconds|1|Number of seconds the client should wait between connection attempts.|
*--+--+---+--+
|6|hive.server2.thrift.client.retry.limit|1|Number of times to retry a Thrift service call upon failure|
*--+--+---+--+
|7|hive.server2.thrift.port|10000|The port on which hive server is running|
*--+--+---+--+
|8|lens.cube.query.driver.supported.storages| |List of comma separated storage names that supported by a driver. If no value is specified, all storages are valid|
*--+--+---+--+
|9|lens.cube.query.replace.timedim|true|Tells whether timedim attribute queried in the time range should be replaced with its corresponding partition column name.|
*--+--+---+--+
|10|lens.driver.hive.calculate.priority|true|Whether priority should be calculated for hive mr jobs or not|
*--+--+---+--+
|11|lens.driver.hive.connection.class|org.apache.lens.driver.hive.EmbeddedThriftConnection|The connection class from HiveDriver to HiveServer. The default is an embedded connection which does not require a remote hive server. For connecting to a hiveserver end point, remote connection should be used. The possible values are org.apache.lens.driver.hive.EmbeddedThriftConnection and org.apache.lens.driver.hive.RemoteThriftConnection.|
*--+--+---+--+
|12|lens.driver.hive.cost.calculator.class|org.apache.lens.cube.query.cost.FactPartitionBasedQueryCostCalculator|Cost calculator class. By default calculating cost through fact partitions.|
*--+--+---+--+
|13|lens.driver.hive.hs2.connection.expiry.delay|600000|The idle time (in milliseconds) for expiring connection from hivedriver to HiveServer2|
*--+--+---+--+
|14|lens.driver.hive.priority.ranges|VERY_HIGH,7.0,HIGH,30.0,NORMAL,90,LOW|Priority Ranges. The numbers are the costs of the query. \ |
| | | |The cost is calculated based on partition weights and fact weights. The interpretation of the default config is: \ |
| | | | \ |
| | | |cost \<= 7\ \ \ \ \ \ \ \ \ \ \ :\ \ \ \ \ Priority = VERY_HIGH \ |
| | | |7 \< cost \<= 30\ \ \ \ \ \ \ :\ \ \ \ \ Priority = HIGH \ |
| | | |30 \< cost \<= 90\ \ \ \ \ \ :\ \ \ \ \ Priority = NORMAL \ |
| | | |90 \< cost\ \ \ \ \ \ \ \ \ \ \ :\ \ \ \ \ Priority = LOW \ |
| | | | \ |
| | | |Some perspective wrt default weights and default ranges(1 for hourly, 0.75 for daily, 0.5 for monthly): \ |
| | | |For exclusively hourly data this translates to VERY_HIGH,7days,HIGH,30days,NORMAL,90days,LOW. \ |
| | | |FOR exclusively daily data this translates to VERY_HIGH,9days,HIGH,40days,NORMAL,120days,LOW. \ |
| | | |for exclusively monthly data this translates to VERY_HIGH,never,HIGH,1month,NORMAL,6months,LOW. \ |
| | | | \ |
| | | |One use case in range tuning can be that you never want queries to run with VERY_HIGH, assuming no other changes, you'll modify the value of this param in hivedriver-site.xml to be HIGH,30.0,NORMAL,90,LOW\ |
| | | |via the configs, you can tune both the ranges and partition weights. this would give the end user more control. |
*--+--+---+--+
|15|lens.driver.hive.query.hook.class|org.apache.lens.server.api.driver.NoOpDriverQueryHook|The query hook class for hive driver. By default hook is No op. To add a hook, you should look at the default implementation and from there it'll be easy to derive what value can be added through a new hook|
*--+--+---+--+
|16|lens.driver.hive.query.launching.constraint.factories| |Factories used to instantiate constraints enforced on queries by driver. A query will be launched only if all constraints pass. Every Factory should be an implementation of org.apache.lens.server.api.common.ConfigBasedObjectCreationFactory and create an implementation of org.apache.lens.server.api.query.constraint.QueryLaunchingConstraint.|
*--+--+---+--+
|17|lens.driver.hive.waiting.queries.selection.policy.factories| |Factories used to instantiate driver specific waiting queries selection policies. Every factory should be an implementation of org.apache.lens.server.api.common.ConfigBasedObjectCreationFactory and create an implementation of org.apache.lens.server.api.query.collect.WaitingQueriesSelectionPolicy.|
*--+--+---+--+
The configuration parameters and their default values