src/site/apt/admin/session-config.apt - lens - Git at Google

 ~~
 ~~ Licensed to the Apache Software Foundation (ASF) under one
 ~~ or more contributor license agreements.  See the NOTICE file
 ~~ distributed with this work for additional information
 ~~ regarding copyright ownership.  The ASF licenses this file
 ~~ to you under the Apache License, Version 2.0 (the
 ~~ "License"); you may not use this file except in compliance
 ~~ with the License.  You may obtain a copy of the License at
 ~~
 ~~   http://www.apache.org/licenses/LICENSE-2.0
 ~~
 ~~ Unless required by applicable law or agreed to in writing,
 ~~ software distributed under the License is distributed on an
 ~~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 ~~ KIND, either express or implied.  See the License for the
 ~~ specific language governing permissions and limitations
 ~~ under the License.
 ~~

 Lens session configuration

 ===

 *--+--+---+--+
 |<<No.>>|<<Property Name>>|<<Default Value>>|<<Description>>|
 *--+--+---+--+
 |1|hive.metastore.batch.retrieve.max|100|Maximum number of objects (tables/partitions) can be retrieved from metastore in one batch. The higher the number, the less the number of round trips is needed to the Hive metastore server, but it may also cause higher memory requirement at the client side.|
 *--+--+---+--+
 |2|hive.metastore.batch.retrieve.table.partition.max|500|Maximum number of table partitions that metastore internally retrieves in one batch.|
 *--+--+---+--+
 |3|hive.metastore.client.connect.retry.delay|1|Number of seconds for the client to wait between consecutive connection attempts|
 *--+--+---+--+
 |4|hive.metastore.client.socket.timeout|20|MetaStore Client socket timeout in seconds|
 *--+--+---+--+
 |5|hive.metastore.connect.retries|5|Number of retries while opening a connection to metastore|
 *--+--+---+--+
 |6|hive.metastore.failure.retries|3|Number of call retries when Hive Metastore calls fail with Thrift errros|
 *--+--+---+--+
 |7|hive.metastore.uris| |The hive metastore server URI that the lens server is talking to|
 *--+--+---+--+
 |8|lens.cube.query.completeness.threshold|100|The query will fail if data completeness is less than the set threshold given that the flag "lens.cube.query.fail.if.data.partial" is set as true|
 *--+--+---+--+
 |9|lens.cube.query.disable.aggregate.resolver|false|Tells whether to disable automatic resolution of aggregations for measures in a cube. To enable automatic resolution, this value should be false.|
 *--+--+---+--+
 |10|lens.cube.query.disable.auto.join|false|Tells whether to disable automatic resolution of join conditions between tables involved. To enable automatic resolution, this value should be false.|
 *--+--+---+--+
 |11|lens.cube.query.fail.if.data.partial|true|Whether to fail the query of data is partial|
 *--+--+---+--+
 |12|lens.cube.query.promote.select.togroupby|true|Tells whether to promote select expressions which is not inside any aggregate, to be promoted to groupby clauses, if they are already not part of groupby clauses. To enable automatic promotion, this value should be true.|
 *--+--+---+--+
 |13|lens.query.add.insert.overwrite|true|Prefix query with insert overwrite clause if the query is persistent. User can disable if user gave the clause himself.|
 *--+--+---+--+
 |14|lens.query.cancel.on.timeout|true|Specifies whether to attempt cancellation of a query whose execution takes longer than the timeout value specified while submitting the query for execution. The default value is true.|
 *--+--+---+--+
 |15|lens.query.enable.mail.notify|false|When a query ends, whether to notify the submitter by mail or not.|
 *--+--+---+--+
 |16|lens.query.enable.metrics.per.query|false|Generates gauge metrics for each query to measure time taken with unique id appended for each query. Should be enabled only for performance measurements. Should not be enabled in day to day production environment.|
 *--+--+---+--+
 |17|lens.query.enable.persistent.resultset|false|Whether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensions|
 *--+--+---+--+
 |18|lens.query.enable.persistent.resultset.indriver|true|Whether the result should be persisted by driver. Currently only HiveDriver persists the results in a HDFS location.|
 *--+--+---+--+
 |19|lens.query.hdfs.output.path|hdfsout|The directory under the parent result directory, in which HiveDriver will persist the results, if persisting by driver is enabled. This directory should exist and should have world writable permissions sothat all users will be able put query outputs here.|
 *--+--+---+--+
 |20|lens.query.http.notification.mediatype|application/json|This is the media type for Query Http notifications. Accepted types are "application/json" and "application/xml". The default value is "application/json"|
 *--+--+---+--+
 |21|lens.query.http.notification.type.FINISHED|false|Setting this property to true will enable query FINISHED notifications which includes SUCCESSFUL, FAILED and CANCELLED queries. The notification will have eventtype = "FINISHED", eventtime = long event time and query = org.apache.lens.api.query.LensQuery instance. The mediatype for eventtype and eventtime will be TEXT/PLAIN and the mediatype for query will be based on property lens.query.http.notification.mediatype. Default value of this property is false.|
 *--+--+---+--+
 |22|lens.query.http.notification.urls| |These are the http end points for Query http notifications. Users can specify more than one comma separated end points for a query. Url parameter values that include special characters should be encoded. Please note that if this property is not set, no http notification will be sent out by lens server for the query.|
 *--+--+---+--+
 |23|lens.query.output.charset.encoding|UTF-8|The charset encoding for formatting query result. It supports all the encodings supported by java.io.OutputStreamWriter.|
 *--+--+---+--+
 |24|lens.query.output.compression.codec|org.apache.hadoop.io.compress.GzipCodec|The codec used to compress the query output, if compression is enabled|
 *--+--+---+--+
 |25|lens.query.output.enable.compression|false|Whether to compress the query result output|
 *--+--+---+--+
 |26|lens.query.output.file.extn|.csv|The extension name for the persisted query output file. If file is compressed, the extension from compression codec will be appended to this extension.|
 *--+--+---+--+
 |27|lens.query.output.footer| |The value of custom footer that should be written, if any. This footer will be added in formatting driver persisted results.|
 *--+--+---+--+
 |28|lens.query.output.formatter| |The query result output formatter for the query. If no value is specified, then org.apache.lens.lib.query.FileSerdeFormatter will be used to format in-memory result sets, org.apache.lens.lib.query.FilePersistentFormatter will be used to format driver persisted result sets.|
 *--+--+---+--+
 |29|lens.query.output.header| |The value of custom header that should be written, if any. If no value column names will be used as header.|
 *--+--+---+--+
 |30|lens.query.output.write.footer|false|Whether to write footer as part of query result. When enabled, total number of rows will be written as part of header.|
 *--+--+---+--+
 |31|lens.query.output.write.header|false|Whether to write header as part of query result formatting. When enabled the user given header will be added in case of driver persisted results, and column names chosen will be added as header for in-memory results.|
 *--+--+---+--+
 |32|lens.query.prefetch.inmemory.resultset|true|When set to true, specified number of rows of result set will be pre-fetched if the result set is of type InMemoryResultSet and query execution is not asynchronous i.e. query should be launched with operation as EXECUTE_WITH_TIMEOUT. Suggested usage of this property: It can be used by client to stream as well as persist results in server for queries that finish fast and produce results with fewer rows (should be less than number of rows pre-fetched). Note that the results are streamed to the client early, without waiting for persistence to finish. Default value of this property is true.|
 *--+--+---+--+
 |33|lens.query.prefetch.inmemory.resultset.rows|100|Specifies the number of rows to pre-fetch when lens.query.prefetch.inmemory.resultset is set to true. Default value is 100 rows.|
 *--+--+---+--+
 |34|lens.query.result.email.cc| |When query ends, the result/failure reason will be sent to the user via email. The mail would be cc'ed to the addresses provided in this field.|
 *--+--+---+--+
 |35|lens.query.result.fs.read.url| |Http read URL for FileSystem on which result is present, if available. For example webhdfs as http read url should http://host:port/webhdfs/v1. Currently we support only webhdfs url as the http url for HDFS file system|
 *--+--+---+--+
 |36|lens.query.result.output.dir.format| |The format of the output if result is persisted in hdfs. The format should be expressed in HQL.|
 *--+--+---+--+
 |37|lens.query.result.output.serde|org.apache.lens.lib.query.CSVSerde|The default serde class name that should be used by org.apache.lens.lib.query.FileSerdeFormatter for formatting the output|
 *--+--+---+--+
 |38|lens.query.result.parent.dir|file:///tmp/lensreports|The directory for storing persisted result of query. This directory should exist and should have writable permissions by lens server|
 *--+--+---+--+
 |39|lens.query.result.size.format.threshold|10737418240|The maximum allowed size of the query result. If exceeds, no server side formatting would be done.|
 *--+--+---+--+
 |40|lens.query.result.split.multiple|false|Whether to split the result into multiple files. If enabled, each file will be restricted to max rows configured. All the files will be available as zip.|
 *--+--+---+--+
 |41|lens.query.result.split.multiple.maxrows|100000|The maximum number of rows allowed in each file, when splitting the result into multiple files is enabled.|
 *--+--+---+--+
 |42|lens.query.timeout.millis|86400000|The runtime(millis) of the query after which query will be timedout and cancelled. Default is 1 day.|
 *--+--+---+--+
 |43|lens.session.aux.jars| |List of comma separated jar paths, which will added to the session|
 *--+--+---+--+
 |44|lens.session.cluster.user| |Session level config which will determine which cluster user will access hdfs|
 *--+--+---+--+
 |45|lens.session.loggedin.user| |The username used to log in to lens. e.g. LDAP user|
 *--+--+---+--+
 |46|lens.session.metastore.exclude.cubetables.from.nativetables|true|Exclude cube related tables when fetching native tables|
 *--+--+---+--+
 The configuration parameters and their default values
	~~
	~~ Licensed to the Apache Software Foundation (ASF) under one
	~~ or more contributor license agreements. See the NOTICE file
	~~ distributed with this work for additional information
	~~ regarding copyright ownership. The ASF licenses this file
	~~ to you under the Apache License, Version 2.0 (the
	~~ "License"); you may not use this file except in compliance
	~~ with the License. You may obtain a copy of the License at
	~~
	~~ http://www.apache.org/licenses/LICENSE-2.0
	~~
	~~ Unless required by applicable law or agreed to in writing,
	~~ software distributed under the License is distributed on an
	~~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	~~ KIND, either express or implied. See the License for the
	~~ specific language governing permissions and limitations
	~~ under the License.
	~~

	Lens session configuration

	===

	*--+--+---+--+
	\|<<No.>>\|<<Property Name>>\|<<Default Value>>\|<<Description>>\|
	*--+--+---+--+
	\|1\|hive.metastore.batch.retrieve.max\|100\|Maximum number of objects (tables/partitions) can be retrieved from metastore in one batch. The higher the number, the less the number of round trips is needed to the Hive metastore server, but it may also cause higher memory requirement at the client side.\|
	*--+--+---+--+
	\|2\|hive.metastore.batch.retrieve.table.partition.max\|500\|Maximum number of table partitions that metastore internally retrieves in one batch.\|
	*--+--+---+--+
	\|3\|hive.metastore.client.connect.retry.delay\|1\|Number of seconds for the client to wait between consecutive connection attempts\|
	*--+--+---+--+
	\|4\|hive.metastore.client.socket.timeout\|20\|MetaStore Client socket timeout in seconds\|
	*--+--+---+--+
	\|5\|hive.metastore.connect.retries\|5\|Number of retries while opening a connection to metastore\|
	*--+--+---+--+
	\|6\|hive.metastore.failure.retries\|3\|Number of call retries when Hive Metastore calls fail with Thrift errros\|
	*--+--+---+--+
	\|7\|hive.metastore.uris\| \|The hive metastore server URI that the lens server is talking to\|
	*--+--+---+--+
	\|8\|lens.cube.query.completeness.threshold\|100\|The query will fail if data completeness is less than the set threshold given that the flag "lens.cube.query.fail.if.data.partial" is set as true\|
	*--+--+---+--+
	\|9\|lens.cube.query.disable.aggregate.resolver\|false\|Tells whether to disable automatic resolution of aggregations for measures in a cube. To enable automatic resolution, this value should be false.\|
	*--+--+---+--+
	\|10\|lens.cube.query.disable.auto.join\|false\|Tells whether to disable automatic resolution of join conditions between tables involved. To enable automatic resolution, this value should be false.\|
	*--+--+---+--+
	\|11\|lens.cube.query.fail.if.data.partial\|true\|Whether to fail the query of data is partial\|
	*--+--+---+--+
	\|12\|lens.cube.query.promote.select.togroupby\|true\|Tells whether to promote select expressions which is not inside any aggregate, to be promoted to groupby clauses, if they are already not part of groupby clauses. To enable automatic promotion, this value should be true.\|
	*--+--+---+--+
	\|13\|lens.query.add.insert.overwrite\|true\|Prefix query with insert overwrite clause if the query is persistent. User can disable if user gave the clause himself.\|
	*--+--+---+--+
	\|14\|lens.query.cancel.on.timeout\|true\|Specifies whether to attempt cancellation of a query whose execution takes longer than the timeout value specified while submitting the query for execution. The default value is true.\|
	*--+--+---+--+
	\|15\|lens.query.enable.mail.notify\|false\|When a query ends, whether to notify the submitter by mail or not.\|
	*--+--+---+--+
	\|16\|lens.query.enable.metrics.per.query\|false\|Generates gauge metrics for each query to measure time taken with unique id appended for each query. Should be enabled only for performance measurements. Should not be enabled in day to day production environment.\|
	*--+--+---+--+
	\|17\|lens.query.enable.persistent.resultset\|false\|Whether to enable persistent resultset for queries. When enabled, server will fetch results from driver, custom format them if any and store in a configured location. The file name of query output is queryhandle-id, with configured extensions\|
	*--+--+---+--+
	\|18\|lens.query.enable.persistent.resultset.indriver\|true\|Whether the result should be persisted by driver. Currently only HiveDriver persists the results in a HDFS location.\|
	*--+--+---+--+
	\|19\|lens.query.hdfs.output.path\|hdfsout\|The directory under the parent result directory, in which HiveDriver will persist the results, if persisting by driver is enabled. This directory should exist and should have world writable permissions sothat all users will be able put query outputs here.\|
	*--+--+---+--+
	\|20\|lens.query.http.notification.mediatype\|application/json\|This is the media type for Query Http notifications. Accepted types are "application/json" and "application/xml". The default value is "application/json"\|
	*--+--+---+--+
	\|21\|lens.query.http.notification.type.FINISHED\|false\|Setting this property to true will enable query FINISHED notifications which includes SUCCESSFUL, FAILED and CANCELLED queries. The notification will have eventtype = "FINISHED", eventtime = long event time and query = org.apache.lens.api.query.LensQuery instance. The mediatype for eventtype and eventtime will be TEXT/PLAIN and the mediatype for query will be based on property lens.query.http.notification.mediatype. Default value of this property is false.\|
	*--+--+---+--+
	\|22\|lens.query.http.notification.urls\| \|These are the http end points for Query http notifications. Users can specify more than one comma separated end points for a query. Url parameter values that include special characters should be encoded. Please note that if this property is not set, no http notification will be sent out by lens server for the query.\|
	*--+--+---+--+
	\|23\|lens.query.output.charset.encoding\|UTF-8\|The charset encoding for formatting query result. It supports all the encodings supported by java.io.OutputStreamWriter.\|
	*--+--+---+--+
	\|24\|lens.query.output.compression.codec\|org.apache.hadoop.io.compress.GzipCodec\|The codec used to compress the query output, if compression is enabled\|
	*--+--+---+--+
	\|25\|lens.query.output.enable.compression\|false\|Whether to compress the query result output\|
	*--+--+---+--+
	\|26\|lens.query.output.file.extn\|.csv\|The extension name for the persisted query output file. If file is compressed, the extension from compression codec will be appended to this extension.\|
	*--+--+---+--+
	\|27\|lens.query.output.footer\| \|The value of custom footer that should be written, if any. This footer will be added in formatting driver persisted results.\|
	*--+--+---+--+
	\|28\|lens.query.output.formatter\| \|The query result output formatter for the query. If no value is specified, then org.apache.lens.lib.query.FileSerdeFormatter will be used to format in-memory result sets, org.apache.lens.lib.query.FilePersistentFormatter will be used to format driver persisted result sets.\|
	*--+--+---+--+
	\|29\|lens.query.output.header\| \|The value of custom header that should be written, if any. If no value column names will be used as header.\|
	*--+--+---+--+
	\|30\|lens.query.output.write.footer\|false\|Whether to write footer as part of query result. When enabled, total number of rows will be written as part of header.\|
	*--+--+---+--+
	\|31\|lens.query.output.write.header\|false\|Whether to write header as part of query result formatting. When enabled the user given header will be added in case of driver persisted results, and column names chosen will be added as header for in-memory results.\|
	*--+--+---+--+
	\|32\|lens.query.prefetch.inmemory.resultset\|true\|When set to true, specified number of rows of result set will be pre-fetched if the result set is of type InMemoryResultSet and query execution is not asynchronous i.e. query should be launched with operation as EXECUTE_WITH_TIMEOUT. Suggested usage of this property: It can be used by client to stream as well as persist results in server for queries that finish fast and produce results with fewer rows (should be less than number of rows pre-fetched). Note that the results are streamed to the client early, without waiting for persistence to finish. Default value of this property is true.\|
	*--+--+---+--+
	\|33\|lens.query.prefetch.inmemory.resultset.rows\|100\|Specifies the number of rows to pre-fetch when lens.query.prefetch.inmemory.resultset is set to true. Default value is 100 rows.\|
	*--+--+---+--+
	\|34\|lens.query.result.email.cc\| \|When query ends, the result/failure reason will be sent to the user via email. The mail would be cc'ed to the addresses provided in this field.\|
	*--+--+---+--+
	\|35\|lens.query.result.fs.read.url\| \|Http read URL for FileSystem on which result is present, if available. For example webhdfs as http read url should http://host:port/webhdfs/v1. Currently we support only webhdfs url as the http url for HDFS file system\|
	*--+--+---+--+
	\|36\|lens.query.result.output.dir.format\| \|The format of the output if result is persisted in hdfs. The format should be expressed in HQL.\|
	*--+--+---+--+
	\|37\|lens.query.result.output.serde\|org.apache.lens.lib.query.CSVSerde\|The default serde class name that should be used by org.apache.lens.lib.query.FileSerdeFormatter for formatting the output\|
	*--+--+---+--+
	\|38\|lens.query.result.parent.dir\|file:///tmp/lensreports\|The directory for storing persisted result of query. This directory should exist and should have writable permissions by lens server\|
	*--+--+---+--+
	\|39\|lens.query.result.size.format.threshold\|10737418240\|The maximum allowed size of the query result. If exceeds, no server side formatting would be done.\|
	*--+--+---+--+
	\|40\|lens.query.result.split.multiple\|false\|Whether to split the result into multiple files. If enabled, each file will be restricted to max rows configured. All the files will be available as zip.\|
	*--+--+---+--+
	\|41\|lens.query.result.split.multiple.maxrows\|100000\|The maximum number of rows allowed in each file, when splitting the result into multiple files is enabled.\|
	*--+--+---+--+
	\|42\|lens.query.timeout.millis\|86400000\|The runtime(millis) of the query after which query will be timedout and cancelled. Default is 1 day.\|
	*--+--+---+--+
	\|43\|lens.session.aux.jars\| \|List of comma separated jar paths, which will added to the session\|
	*--+--+---+--+
	\|44\|lens.session.cluster.user\| \|Session level config which will determine which cluster user will access hdfs\|
	*--+--+---+--+
	\|45\|lens.session.loggedin.user\| \|The username used to log in to lens. e.g. LDAP user\|
	*--+--+---+--+
	\|46\|lens.session.metastore.exclude.cubetables.from.nativetables\|true\|Exclude cube related tables when fetching native tables\|
	*--+--+---+--+
	The configuration parameters and their default values