| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| <concept id="config_options"> |
| |
| <title>Modifying Impala Startup Options</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Impala"/> |
| <data name="Category" value="Configuring"/> |
| <data name="Category" value="Administrators"/> |
| <data name="Category" value="Developers"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| <indexterm audience="hidden">defaults file</indexterm> |
| |
| <indexterm audience="hidden">configuration file</indexterm> |
| |
| <indexterm audience="hidden">options</indexterm> |
| |
| <indexterm audience="hidden">IMPALA_STATE_STORE_PORT</indexterm> |
| |
| <indexterm audience="hidden">IMPALA_BACKEND_PORT</indexterm> |
| |
| <indexterm audience="hidden">IMPALA_LOG_DIR</indexterm> |
| |
| <indexterm audience="hidden">IMPALA_STATE_STORE_ARGS</indexterm> |
| |
| <indexterm audience="hidden">IMPALA_SERVER_ARGS</indexterm> |
| |
| <indexterm audience="hidden">ENABLE_CORE_DUMPS</indexterm> |
| |
| <indexterm audience="hidden">core dumps</indexterm> |
| |
| <indexterm audience="hidden">restarting services</indexterm> |
| |
| <indexterm audience="hidden">services</indexterm> |
| The configuration options for the Impala-related daemons let you choose which hosts and |
| ports to use for the services that run on a single host, specify directories for logging, |
| control resource usage and security, and specify other aspects of the Impala software. |
| </p> |
| |
| <p outputclass="toc inpage"/> |
| |
| </conbody> |
| |
| <concept id="config_options_noncm"> |
| |
| <title>Configuring Impala Startup Options through the Command Line</title> |
| |
| <conbody> |
| |
| <p> The Impala server, statestore, and catalog services start up using values provided in a |
| defaults file, <filepath>/etc/default/impala</filepath>. </p> |
| |
| <p> |
| This file includes information about many resources used by Impala. Most of the defaults |
| included in this file should be effective in most cases. For example, typically you |
| would not change the definition of the <codeph>CLASSPATH</codeph> variable, but you |
| would always set the address used by the statestore server. Some of the content you |
| might modify includes: |
| </p> |
| |
| <!-- Note: Update the following example for each release with the associated lines from /etc/default/impala. --> |
| |
| <codeblock rev="ver">IMPALA_STATE_STORE_HOST=127.0.0.1 |
| IMPALA_STATE_STORE_PORT=24000 |
| IMPALA_BACKEND_PORT=22000 |
| IMPALA_LOG_DIR=/var/log/impala |
| IMPALA_CATALOG_SERVICE_HOST=... |
| IMPALA_STATE_STORE_HOST=... |
| |
| export IMPALA_STATE_STORE_ARGS=${IMPALA_STATE_STORE_ARGS:- \ |
| -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT}} |
| IMPALA_SERVER_ARGS=" \ |
| -log_dir=${IMPALA_LOG_DIR} \ |
| -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \ |
| -state_store_port=${IMPALA_STATE_STORE_PORT} \ |
| -state_store_host=${IMPALA_STATE_STORE_HOST} \ |
| -be_port=${IMPALA_BACKEND_PORT}" |
| export ENABLE_CORE_DUMPS=${ENABLE_COREDUMPS:-false}</codeblock> |
| |
| <p> |
| To use alternate values, edit the defaults file, then restart all the Impala-related |
| services so that the changes take effect. Restart the Impala server using the following |
| commands: |
| </p> |
| |
| <codeblock>$ sudo service impala-server restart |
| Stopping Impala Server: [ OK ] |
| Starting Impala Server: [ OK ]</codeblock> |
| |
| <p> |
| Restart the Impala statestore using the following commands: |
| </p> |
| |
| <codeblock>$ sudo service impala-state-store restart |
| Stopping Impala State Store Server: [ OK ] |
| Starting Impala State Store Server: [ OK ]</codeblock> |
| |
| <p> |
| Restart the Impala catalog service using the following commands: |
| </p> |
| |
| <codeblock>$ sudo service impala-catalog restart |
| Stopping Impala Catalog Server: [ OK ] |
| Starting Impala Catalog Server: [ OK ]</codeblock> |
| |
| <p> |
| Some common settings to change include: |
| </p> |
| |
| <ul> |
| <li> |
| <p> |
| Statestore address. Where practical, put the statestore on a separate host not |
| running the <cmdname>impalad</cmdname> daemon. In that recommended configuration, |
| the <cmdname>impalad</cmdname> daemon cannot refer to the statestore server using |
| the loopback address. If the statestore is hosted on a machine with an IP address of |
| 192.168.0.27, change: |
| </p> |
| <codeblock>IMPALA_STATE_STORE_HOST=127.0.0.1</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>IMPALA_STATE_STORE_HOST=192.168.0.27</codeblock> |
| </li> |
| |
| <li rev="1.2"> |
| <p> |
| Catalog server address (including both the hostname and the port number). Update the |
| value of the <codeph>IMPALA_CATALOG_SERVICE_HOST</codeph> variable. Where |
| practical, run the catalog server on the same host as the statestore. In that |
| recommended configuration, the <cmdname>impalad</cmdname> daemon cannot refer to the |
| catalog server using the loopback address. If the catalog service is hosted on a |
| machine with an IP address of 192.168.0.27, add the following line: |
| </p> |
| <codeblock>IMPALA_CATALOG_SERVICE_HOST=192.168.0.27:26000</codeblock> |
| <p> |
| The <filepath>/etc/default/impala</filepath> defaults file currently does not define |
| an <codeph>IMPALA_CATALOG_ARGS</codeph> environment variable, but if you add one it |
| will be recognized by the service startup/shutdown script. Add a definition for this |
| variable to <filepath>/etc/default/impala</filepath> and add the option |
| <codeph>-catalog_service_host=<varname>hostname</varname></codeph>. If the port is |
| different than the default 26000, also add the option |
| <codeph>-catalog_service_port=<varname>port</varname></codeph>. |
| </p> |
| </li> |
| |
| <li id="mem_limit"> |
| <p> |
| Memory limits. You can limit the amount of memory available to Impala. For example, |
| to allow Impala to use no more than 70% of system memory, change: |
| </p> |
| <!-- Note: also needs to be updated for each release to reflect latest /etc/default/impala. --> |
| <codeblock>export IMPALA_SERVER_ARGS=${IMPALA_SERVER_ARGS:- \ |
| -log_dir=${IMPALA_LOG_DIR} \ |
| -state_store_port=${IMPALA_STATE_STORE_PORT} \ |
| -state_store_host=${IMPALA_STATE_STORE_HOST} \ |
| -be_port=${IMPALA_BACKEND_PORT}}</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>export IMPALA_SERVER_ARGS=${IMPALA_SERVER_ARGS:- \ |
| -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT} \ |
| -state_store_host=${IMPALA_STATE_STORE_HOST} \ |
| -be_port=${IMPALA_BACKEND_PORT} -mem_limit=70%}</codeblock> |
| <p> |
| You can specify the memory limit using absolute notation such as |
| <codeph>500m</codeph> or <codeph>2G</codeph>, or as a percentage of physical memory |
| such as <codeph>60%</codeph>. |
| </p> |
| |
| <note> |
| Queries that exceed the specified memory limit are aborted. Percentage limits are |
| based on the physical memory of the machine and do not consider cgroups. |
| </note> |
| </li> |
| |
| <li> |
| <p> Core dump enablement. To enable core dumps, change: </p> |
| <codeblock>export ENABLE_CORE_DUMPS=${ENABLE_COREDUMPS:-false}</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>export ENABLE_CORE_DUMPS=${ENABLE_COREDUMPS:-true}</codeblock> |
| |
| <note conref="../shared/impala_common.xml#common/core_dump_considerations"/> |
| </li> |
| |
| <li> |
| <p> |
| Authorization using the open source Sentry plugin. Specify the |
| <codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> |
| options as part of the <codeph>IMPALA_SERVER_ARGS</codeph> and |
| <codeph>IMPALA_STATE_STORE_ARGS</codeph> settings to enable the core Impala support |
| for authentication. See <xref href="impala_authorization.xml#secure_startup"/> for |
| details. |
| </p> |
| </li> |
| |
| <li> |
| <p> |
| Auditing for successful or blocked Impala queries, another aspect of security. |
| Specify the <codeph>-audit_event_log_dir=<varname>directory_path</varname></codeph> |
| option and optionally the |
| <codeph>-max_audit_event_log_file_size=<varname>number_of_queries</varname></codeph> |
| and <codeph>-abort_on_failed_audit_event</codeph> options as part of the |
| <codeph>IMPALA_SERVER_ARGS</codeph> settings, for each Impala node, to enable and |
| customize auditing. See <xref href="impala_auditing.xml#auditing"/> for details. |
| </p> |
| </li> |
| |
| <li> |
| <p> |
| Password protection for the Impala web UI, which listens on port 25000 by default. |
| This feature involves adding some or all of the |
| <codeph>--webserver_password_file</codeph>, |
| <codeph>--webserver_authentication_domain</codeph>, and |
| <codeph>--webserver_certificate_file</codeph> options to the |
| <codeph>IMPALA_SERVER_ARGS</codeph> and <codeph>IMPALA_STATE_STORE_ARGS</codeph> |
| settings. See <xref href="impala_security_guidelines.xml#security_guidelines"/> for |
| details. |
| </p> |
| </li> |
| |
| <li id="default_query_options"> |
| <p rev="DOCS-677"> |
| Another setting you might add to <codeph>IMPALA_SERVER_ARGS</codeph> is a |
| comma-separated list of query options and values: |
| <codeblock>-default_query_options='<varname>option</varname>=<varname>value</varname>,<varname>option</varname>=<varname>value</varname>,...' |
| </codeblock> |
| These options control the behavior of queries performed by this |
| <cmdname>impalad</cmdname> instance. The option values you specify here override the |
| default values for <xref href="impala_query_options.xml#query_options">Impala query |
| options</xref>, as shown by the <codeph>SET</codeph> statement in |
| <cmdname>impala-shell</cmdname>. |
| </p> |
| </li> |
| |
| <!-- Removing this reference now that the options are de-emphasized / desupported in Impala 2.3 and up. |
| <li rev="1.2"> |
| <p> |
| Options for resource management, in conjunction with the YARN component. These options include |
| <codeph>-enable_rm</codeph> and <codeph>-cgroup_hierarchy_path</codeph>. |
| <ph rev="1.4.0">Additional options to help fine-tune the resource estimates are |
| <codeph>-—rm_always_use_defaults</codeph>, |
| <codeph>-—rm_default_memory=<varname>size</varname></codeph>, and |
| <codeph>-—rm_default_cpu_cores</codeph>.</ph> For details about these options, see |
| <xref href="impala_resource_management.xml#rm_options"/>. See |
| <xref href="impala_resource_management.xml#resource_management"/> for information about resource |
| management in general. |
| </p> |
| </li> |
| --> |
| |
| <li> |
| <p> |
| During troubleshooting, <keyword keyref="support_org"/> might direct you to change other values, |
| particularly for <codeph>IMPALA_SERVER_ARGS</codeph>, to work around issues or |
| gather debugging information. |
| </p> |
| </li> |
| </ul> |
| |
| <!-- Removing this reference now that the options are de-emphasized / desupported in Impala 2.3 and up. |
| <p conref="impala_resource_management.xml#rm_options/resource_management_impalad_options"/> |
| --> |
| |
| <note> |
| <p> |
| These startup options for the <cmdname>impalad</cmdname> daemon are different from the |
| command-line options for the <cmdname>impala-shell</cmdname> command. For the |
| <cmdname>impala-shell</cmdname> options, see |
| <xref href="impala_shell_options.xml#shell_options"/>. |
| </p> |
| </note> |
| |
| <p audience="hidden" outputclass="toc inpage"/> |
| |
| </conbody> |
| |
| <concept audience="hidden" id="config_options_impalad_details"> |
| |
| <title>Configuration Options for impalad Daemon</title> |
| |
| <conbody> |
| |
| <p> |
| Some common settings to change include: |
| </p> |
| |
| <ul> |
| <li> |
| <p> |
| Statestore address. Where practical, put the statestore on a separate host not |
| running the <cmdname>impalad</cmdname> daemon. In that recommended configuration, |
| the <cmdname>impalad</cmdname> daemon cannot refer to the statestore server using |
| the loopback address. If the statestore is hosted on a machine with an IP address |
| of 192.168.0.27, change: |
| </p> |
| <codeblock>IMPALA_STATE_STORE_HOST=127.0.0.1</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>IMPALA_STATE_STORE_HOST=192.168.0.27</codeblock> |
| </li> |
| |
| <li rev="1.2"> |
| <p> |
| Catalog server address. Update the <codeph>IMPALA_CATALOG_SERVICE_HOST</codeph> |
| variable, including both the hostname and the port number in the value. Where |
| practical, run the catalog server on the same host as the statestore. In that |
| recommended configuration, the <cmdname>impalad</cmdname> daemon cannot refer to |
| the catalog server using the loopback address. If the catalog service is hosted on |
| a machine with an IP address of 192.168.0.27, add the following line: |
| </p> |
| <codeblock>IMPALA_CATALOG_SERVICE_HOST=192.168.0.27:26000</codeblock> |
| <p> |
| The <filepath>/etc/default/impala</filepath> defaults file currently does not |
| define an <codeph>IMPALA_CATALOG_ARGS</codeph> environment variable, but if you |
| add one it will be recognized by the service startup/shutdown script. Add a |
| definition for this variable to <filepath>/etc/default/impala</filepath> and add |
| the option <codeph>-catalog_service_host=<varname>hostname</varname></codeph>. If |
| the port is different than the default 26000, also add the option |
| <codeph>-catalog_service_port=<varname>port</varname></codeph>. |
| </p> |
| </li> |
| |
| <li id="mem_limit"> |
| Memory limits. You can limit the amount of memory available to Impala. For example, |
| to allow Impala to use no more than 70% of system memory, change: |
| <!-- Note: also needs to be updated for each release to reflect latest /etc/default/impala. --> |
| <codeblock>export IMPALA_SERVER_ARGS=${IMPALA_SERVER_ARGS:- \ |
| -log_dir=${IMPALA_LOG_DIR} \ |
| -state_store_port=${IMPALA_STATE_STORE_PORT} \ |
| -state_store_host=${IMPALA_STATE_STORE_HOST} \ |
| -be_port=${IMPALA_BACKEND_PORT}}</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>export IMPALA_SERVER_ARGS=${IMPALA_SERVER_ARGS:- \ |
| -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT} \ |
| -state_store_host=${IMPALA_STATE_STORE_HOST} \ |
| -be_port=${IMPALA_BACKEND_PORT} -mem_limit=70%}</codeblock> |
| <p> |
| You can specify the memory limit using absolute notation such as |
| <codeph>500m</codeph> or <codeph>2G</codeph>, or as a percentage of physical |
| memory such as <codeph>60%</codeph>. |
| </p> |
| |
| <note> |
| Queries that exceed the specified memory limit are aborted. Percentage limits are |
| based on the physical memory of the machine and do not consider cgroups. |
| </note> |
| </li> |
| |
| <li> |
| Core dump enablement. To enable core dumps, change: |
| <codeblock>export ENABLE_CORE_DUMPS=${ENABLE_COREDUMPS:-false}</codeblock> |
| <p> |
| to: |
| </p> |
| <codeblock>export ENABLE_CORE_DUMPS=${ENABLE_COREDUMPS:-true}</codeblock> |
| <note> |
| The location of core dump files may vary according to your operating system |
| configuration. Other security settings may prevent Impala from writing core dumps |
| even when this option is enabled. |
| </note> |
| </li> |
| |
| <li> |
| Authorization using the open source Sentry plugin. Specify the |
| <codeph>-server_name</codeph> and <codeph>-authorization_policy_file</codeph> |
| options as part of the <codeph>IMPALA_SERVER_ARGS</codeph> and |
| <codeph>IMPALA_STATE_STORE_ARGS</codeph> settings to enable the core Impala support |
| for authentication. See <xref href="impala_authorization.xml#secure_startup"/> for |
| details. |
| </li> |
| |
| <li> |
| Auditing for successful or blocked Impala queries, another aspect of security. |
| Specify the <codeph>-audit_event_log_dir=<varname>directory_path</varname></codeph> |
| option and optionally the |
| <codeph>-max_audit_event_log_file_size=<varname>number_of_queries</varname></codeph> |
| and <codeph>-abort_on_failed_audit_event</codeph> options as part of the |
| <codeph>IMPALA_SERVER_ARGS</codeph> settings, for each Impala node, to enable and |
| customize auditing. See <xref href="impala_auditing.xml#auditing"/> for details. |
| </li> |
| |
| <li> |
| Password protection for the Impala web UI, which listens on port 25000 by default. |
| This feature involves adding some or all of the |
| <codeph>--webserver_password_file</codeph>, |
| <codeph>--webserver_authentication_domain</codeph>, and |
| <codeph>--webserver_certificate_file</codeph> options to the |
| <codeph>IMPALA_SERVER_ARGS</codeph> and <codeph>IMPALA_STATE_STORE_ARGS</codeph> |
| settings. See <xref href="impala_security_webui.xml"/> for details. |
| </li> |
| |
| <li id="default_query_options"> |
| Another setting you might add to <codeph>IMPALA_SERVER_ARGS</codeph> is: |
| <codeblock>-default_query_options='<varname>option</varname>=<varname>value</varname>,<varname>option</varname>=<varname>value</varname>,...' |
| </codeblock> |
| These options control the behavior of queries performed by this |
| <cmdname>impalad</cmdname> instance. The option values you specify here override the |
| default values for <xref href="impala_query_options.xml#query_options">Impala query |
| options</xref>, as shown by the <codeph>SET</codeph> statement in |
| <cmdname>impala-shell</cmdname>. |
| </li> |
| |
| <!-- Removing this reference now that the options are de-emphasized / desupported in Impala 2.3 and up. |
| <li rev="1.2"> |
| Options for resource management, in conjunction with the YARN component. These options |
| include <codeph>-enable_rm</codeph> and <codeph>-cgroup_hierarchy_path</codeph>. |
| <ph rev="1.4.0">Additional options to help fine-tune the resource estimates are |
| <codeph>-—rm_always_use_defaults</codeph>, |
| <codeph>-—rm_default_memory=<varname>size</varname></codeph>, and |
| <codeph>-—rm_default_cpu_cores</codeph>.</ph> For details about these options, see |
| <xref href="impala_resource_management.xml#rm_options"/>. See |
| <xref href="impala_resource_management.xml#resource_management"/> for information about resource |
| management in general. |
| </li> |
| --> |
| |
| <li> |
| During troubleshooting, <keyword keyref="support_org"/> might direct you to change other values, |
| particularly for <codeph>IMPALA_SERVER_ARGS</codeph>, to work around issues or |
| gather debugging information. |
| </li> |
| </ul> |
| |
| <!-- Removing this reference now that the options are de-emphasized / desupported in Impala 2.3 and up. |
| <p conref="impala_resource_management.xml#rm_options/resource_management_impalad_options"/> |
| --> |
| |
| <note> |
| <p> |
| These startup options for the <cmdname>impalad</cmdname> daemon are different from |
| the command-line options for the <cmdname>impala-shell</cmdname> command. For the |
| <cmdname>impala-shell</cmdname> options, see |
| <xref href="impala_shell_options.xml#shell_options"/>. |
| </p> |
| </note> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept audience="hidden" id="config_options_statestored_details"> |
| |
| <title>Configuration Options for statestored Daemon</title> |
| |
| <conbody> |
| |
| <p></p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept audience="hidden" id="config_options_catalogd_details"> |
| |
| <title>Configuration Options for catalogd Daemon</title> |
| |
| <conbody> |
| |
| <p></p> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |
| |
| <concept id="config_options_checking"> |
| |
| <title>Checking the Values of Impala Configuration Options</title> |
| |
| <conbody> |
| |
| <p> |
| You can check the current runtime value of all these settings through the Impala web |
| interface, available by default at |
| <codeph>http://<varname>impala_hostname</varname>:25000/varz</codeph> for the |
| <cmdname>impalad</cmdname> daemon, |
| <codeph>http://<varname>impala_hostname</varname>:25010/varz</codeph> for the |
| <cmdname>statestored</cmdname> daemon, or |
| <codeph>http://<varname>impala_hostname</varname>:25020/varz</codeph> for the |
| <cmdname>catalogd</cmdname> daemon. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="config_options_impalad"> |
| |
| <title>Startup Options for impalad Daemon</title> |
| |
| <conbody> |
| |
| <p> |
| The <codeph>impalad</codeph> daemon implements the main Impala service, which performs |
| query processing and reads and writes the data files. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept id="config_options_statestored"> |
| |
| <title>Startup Options for statestored Daemon</title> |
| |
| <conbody> |
| |
| <p> |
| The <cmdname>statestored</cmdname> daemon implements the Impala statestore service, |
| which monitors the availability of Impala services across the cluster, and handles |
| situations such as nodes becoming unavailable or becoming available again. |
| </p> |
| |
| </conbody> |
| |
| </concept> |
| |
| <concept rev="1.2" id="config_options_catalogd"> |
| |
| <title>Startup Options for catalogd Daemon</title> |
| |
| <conbody> |
| |
| <p> |
| The <cmdname>catalogd</cmdname> daemon implements the Impala catalog service, which |
| broadcasts metadata changes to all the Impala nodes when Impala creates a table, inserts |
| data, or performs other kinds of DDL and DML operations. |
| </p> |
| |
| <p conref="../shared/impala_common.xml#common/load_catalog_in_background"/> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |