<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="1.2.1" id="timeouts">
<title>Setting Timeout Periods for Daemons, Queries, and Sessions</title>
<titlealts audience="PDF">
<navtitle>Setting Timeouts</navtitle>
</titlealts>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Administrators"/>
<data name="Category" value="Scheduling"/>
<data name="Category" value="Scalability"/>
</metadata>
</prolog>
<conbody>
<p>
Depending on how busy your <keyword keyref="distro"/> cluster is, you might need to increase or decrease various
timeout values. Increase timeouts if Impala is cancelling operations prematurely, for example when
the system is responding more slowly than usual but the operations would still succeed if given
extra time. Decrease timeouts if operations sit idle or hang for long periods, and the idle
or hung operations are consuming resources and reducing concurrency.
</p>
<p outputclass="toc inpage"/>
</conbody>
<concept id="statestore_timeout">
<title>Increasing the Statestore Timeout</title>
<conbody>
<p rev="IMP-1210">
If you have an extensive Impala schema, for example with hundreds of databases, tens of
thousands of tables, and so on, you might encounter timeout errors during startup as the
Impala catalog service broadcasts metadata to all the Impala nodes using the statestore
service. To avoid such timeout errors on startup, increase the statestore timeout value
from its default of 10 seconds. Specify the timeout value using the
<codeph>-statestore_subscriber_timeout_seconds</codeph> option for the statestore
service, using the configuration instructions in
<xref href="impala_config_options.xml#config_options"/>. The symptom of this problem is
messages in the <codeph>impalad</codeph> log such as:
</p>
<codeblock>Connection with state-store lost
Trying to re-register with state-store</codeblock>
<p>
See <xref href="impala_scalability.xml#statestore_scalability"/> for more details about
statestore operation and settings on clusters with a large number of Impala-related
objects such as tables and partitions.
</p>
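<p>
As an illustrative sketch only, raising the timeout from 10 to 30 seconds would mean adding a
flag like the following to the startup options. The value 30 is an assumption chosen for this
example, not a recommendation. Choose a value that reflects how long the metadata broadcast
takes on your cluster.
</p>
<codeblock>-statestore_subscriber_timeout_seconds=30</codeblock>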
</conbody>
</concept>
<concept id="impalad_timeout">
<title>Setting the Idle Query and Idle Session Timeouts for impalad</title>
<conbody>
<p>
To keep idle queries or idle sessions from tying up cluster resources, you can
set timeout intervals for both individual queries and entire sessions.
</p>
<note conref="../shared/impala_common.xml#common/timeout_clock_blurb"/>
<p>
Specify the following startup options for the <cmdname>impalad</cmdname> daemon:
</p>
<ul>
<li>
<p>
The <codeph>--idle_query_timeout</codeph> option specifies the time in seconds after
which an idle query is cancelled. This could be a query whose results were all fetched
but was never closed, or one whose results were partially fetched and then the client
program stopped requesting further results. This condition is most likely to occur in
a client program using the JDBC or ODBC interfaces, rather than in the interactive
<cmdname>impala-shell</cmdname> interpreter. Once the query is cancelled, the client
program cannot retrieve any further results.
</p>
<p rev="2.0.0">
You can reduce the idle query timeout by using the <codeph>QUERY_TIMEOUT_S</codeph>
query option. Any non-zero value specified for the <codeph>--idle_query_timeout</codeph> startup
option serves as an upper limit for the <codeph>QUERY_TIMEOUT_S</codeph> query option.
A zero value for <codeph>--idle_query_timeout</codeph> disables query timeouts.
See <xref href="impala_query_timeout_s.xml#query_timeout_s"/> for details.
</p>
</li>
<li>
<p>
The <codeph>--idle_session_timeout</codeph> option specifies the time in seconds after
which an idle session expires. A session is idle when no activity is occurring for
any of the queries in that session, and the session has not started any new queries.
Once a session is expired, you cannot issue any new query requests to it. The session
remains open, but the only operation you can perform is to close it. The default value
of 0 means that sessions never expire.
</p>
</li>
</ul>
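<p>
For illustration, the two startup options described above might be set together as in the
following sketch. The values (600 seconds for idle queries, 3600 seconds for idle sessions)
are assumptions chosen for this example and should be tuned for your workload.
</p>
<codeblock>--idle_query_timeout=600
--idle_session_timeout=3600</codeblock>
<p>
With those settings in place, a session that wants a shorter idle query timeout could lower
it through the query option, for example from <cmdname>impala-shell</cmdname>. The value 60
is again only an example. It takes effect because it is below the 600-second upper limit
set at startup.
</p>
<codeblock>SET QUERY_TIMEOUT_S=60;</codeblock>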
<p>
For instructions on changing <cmdname>impalad</cmdname> startup options, see
<xref href="impala_config_options.xml#config_options"/>.
</p>
<note>
<p rev="IMPALA-5108">
Impala checks periodically for idle sessions and queries
to cancel. The actual idle time before cancellation might be up to 50% greater than
the specified configuration setting. For example, if the timeout setting was 60, the
session or query might be cancelled after being idle between 60 and 90 seconds.
</p>
</note>
</conbody>
</concept>
<concept id="concept_rfy_jl1_rx">
<title>Setting Timeout and Retries for Thrift Connections to the Backend
Client</title>
<conbody>
<p>Impala connections to the backend client can fail when the network is
momentarily overloaded. To avoid failed queries due to transient network
problems, you can configure the number of Thrift connection retries
using the following option: </p>
<ul id="ul_bj3_ql1_rx">
<li>The <codeph>--backend_client_connection_num_retries</codeph> option
specifies the number of times Impala will try connecting to the
backend client after the first connection attempt fails. By default,
<cmdname>impalad</cmdname> will attempt three re-connections before
it returns a failure. </li>
</ul>
<p>You can also configure a timeout for sending data to and receiving data
from the backend client. If a query hangs for some reason, Impala then
terminates the connection after the configured timeout instead of
waiting indefinitely for a response.</p>
<ul id="ul_vm2_2v1_rx">
<li>The <codeph>--backend_client_rpc_timeout_ms</codeph> option can be
used to specify the number of milliseconds Impala should wait for a
response from the backend client before it terminates the connection
and signals a failure. The default value for this property is 300000
milliseconds, or 5 minutes. </li>
</ul>
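<p>As a minimal sketch, the retry and timeout options described above could be
set together in the <cmdname>impalad</cmdname> startup options as follows. The
values shown (5 retries and 600000 milliseconds, or 10 minutes) are assumptions
chosen for this example, not recommended settings.</p>
<codeblock>--backend_client_connection_num_retries=5
--backend_client_rpc_timeout_ms=600000</codeblock>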
</conbody>
</concept>
<concept id="cancel_query">
<title>Cancelling a Query</title>
<conbody>
<p>
Sometimes, an Impala query might run for an unexpectedly long time, tying up resources
in the cluster. You can cancel the query explicitly, independent of the timeout period,
by going into the web UI for the <cmdname>impalad</cmdname> host (on port 25000 by
default), and using the link on the <codeph>/queries</codeph> tab to cancel the running
query. Alternatively, if you issued the query from <cmdname>impala-shell</cmdname>, press
<codeph>^C</codeph> in that session to cancel it.
</p>
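<p>
With the default port, the web UI page mentioned above would be reached at a URL of the
following form. Here <varname>impalad_host</varname> is a placeholder for the hostname of
the <cmdname>impalad</cmdname> daemon that is coordinating the query. It is an assumption for
this example, not a real address.
</p>
<codeblock>http://<varname>impalad_host</varname>:25000/queries</codeblock>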
</conbody>
</concept>
</concept>