blob: f8c50feefbf758ab17b0ac0d9712cce67c98d2bf [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="schedule_random_replica" rev="2.5.0">
<title>SCHEDULE_RANDOM_REPLICA Query Option (<keyword keyref="impala25"/> or higher only)</title>
<titlealts audience="PDF">
<navtitle>SCHEDULE_RANDOM_REPLICA</navtitle>
</titlealts>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Impala Query Options"/>
<data name="Category" value="Performance"/>
<data name="Category" value="Developers"/>
<data name="Category" value="Data Analysts"/>
</metadata>
</prolog>
<conbody>
<p>
The <codeph>SCHEDULE_RANDOM_REPLICA</codeph> query option fine-tunes the scheduling
algorithm for deciding which host processes each HDFS data block or Kudu tablet to reduce
the chance of CPU hotspots.
</p>
<p>
By default, Impala estimates how much work each host has done for the query, and selects
the host that has the lowest workload. This algorithm is intended to reduce CPU hotspots
arising when the same host is selected to process multiple data blocks / tablets. Use the
<codeph>SCHEDULE_RANDOM_REPLICA</codeph> query option if hotspots still arise for some
combinations of queries and data layout.
</p>
<p>
The <codeph>SCHEDULE_RANDOM_REPLICA</codeph> query option only applies to tables and
partitions that are not enabled for the HDFS caching.
</p>
<p conref="../shared/impala_common.xml#common/type_boolean"/>
<p conref="../shared/impala_common.xml#common/default_false"/>
<p conref="../shared/impala_common.xml#common/added_in_250"/>
<p conref="../shared/impala_common.xml#common/related_info"/>
<p>
<xref href="impala_perf_hdfs_caching.xml#hdfs_caching"/>,
<xref
href="impala_scalability.xml#scalability_hotspots"/> ,
<xref
href="impala_replica_preference.xml#replica_preference"/>
</p>
</conbody>
</concept>