blob: f451417f0dbe1e667f647154e7964f253ef15b0e [file] [log] [blame]
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2016-10-02 -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop 2.6.5 - Hadoop Distributed File System-2.6.5 - Short-Circuit Local Reads</title>
<style type="text/css" media="all">
@import url("./css/maven-base.css");
@import url("./css/maven-theme.css");
@import url("./css/site.css");
</style>
<link rel="stylesheet" href="./css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20161002" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="https://hadoop.apache.org/" id="bannerLeft">
<img src="https://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="https://www.apache.org/" id="bannerRight">
<img src="https://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xleft">
<a href="https://www.apache.org/" class="externalLink">Apache</a>
&gt;
<a href="https://hadoop.apache.org/" class="externalLink">Hadoop</a>
&gt;
<a href="../">Apache Hadoop Project Dist POM</a>
&gt;
Apache Hadoop 2.6.5
</div>
<div class="xright"> <a href="https://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://svn.apache.org/repos/asf/hadoop/" class="externalLink">SVN</a>
|
<a href="https://hadoop.apache.org/" class="externalLink">Apache Hadoop</a>
&nbsp;| Last Published: 2016-10-02
&nbsp;| Version: 2.6.5
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>General</h5>
<ul>
<li class="none">
<a href="../../index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CommandsManual.html">Hadoop Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/FileSystemShell.html">FileSystem Shell</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Compatibility.html">Hadoop Compatibility</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/filesystem/index.html">FileSystem Specification</a>
</li>
</ul>
<h5>Common</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Superusers.html">Superusers</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
</li>
<li class="none">
<a href="../../hadoop-kms/index.html">Hadoop KMS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Tracing.html">Tracing</a>
</li>
</ul>
<h5>HDFS</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS User Guide</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HDFSCommands.html">HDFS Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html">High Availability With QJM</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html">High Availability With NFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ViewFs.html">ViewFs Guide</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">HDFS Snapshots</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">HDFS Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Hftp.html">HFTP</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">C API libhdfs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS REST API</a>
</li>
<li class="none">
<a href="../../hadoop-hdfs-httpfs/index.html">HttpFS Gateway</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">HDFS NFS Gateway</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html">HDFS Rolling Upgrade</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ExtendedAttributes.html">Extended Attributes</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html">Transparent Encryption</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html">HDFS Support for Multihoming</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html">Archival Storage, SSD & Memory</a>
</li>
</ul>
<h5>MapReduce</h5>
<ul>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html">MapReduce Tutorial</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html">MapReduce Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibilty between Hadoop 1.x and Hadoop 2.x</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopStreaming.html">Hadoop Streaming</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/HadoopArchives.html">Hadoop Archives</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistCp.html">DistCp</a>
</li>
</ul>
<h5>MapReduce REST APIs</h5>
<ul>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html">MR Application Master</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html">MR History Server</a>
</li>
</ul>
<h5>YARN</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YARN.html">YARN Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html">ResourceManager Restart</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html">ResourceManager HA</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html">YARN Timeline Server</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">YARN Commands</a>
</li>
<li class="none">
<a href="../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html">NodeManager Restart</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html">DockerContainerExecutor</a>
</li>
</ul>
<h5>YARN REST APIs</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
</li>
</ul>
<h5>Auth</h5>
<ul>
<li class="none">
<a href="../../hadoop-auth/index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Examples.html">Examples</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Configuration.html">Configuration</a>
</li>
<li class="none">
<a href="../../hadoop-auth/BuildingIt.html">Building</a>
</li>
</ul>
<h5>Reference</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/releasenotes.html">Release Notes</a>
</li>
<li class="none">
<a href="../../api/index.html">API docs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CHANGES.txt">Common CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CHANGES.txt">HDFS CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt">MapReduce CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-yarn/CHANGES.txt">YARN CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Metrics.html">Metrics</a>
</li>
</ul>
<h5>Configuration</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
</li>
</ul>
<a href="https://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img alt="Built by Maven" src="./images/logos/maven-feather.png"/>
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!-- Licensed under the Apache License, Version 2.0 (the "License"); --><!-- you may not use this file except in compliance with the License. --><!-- You may obtain a copy of the License at --><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!-- Unless required by applicable law or agreed to in writing, software --><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See the License for the specific language governing permissions and --><!-- limitations under the License. See accompanying LICENSE file. --><div class="section">
<h2>HDFS Short-Circuit Local Reads<a name="HDFS_Short-Circuit_Local_Reads"></a></h2>
<ul>
<li><a href="#HDFS_Short-Circuit_Local_Reads">HDFS Short-Circuit Local Reads</a>
<ul>
<li><a href="#Short-Circuit_Local_Reads">Short-Circuit Local Reads</a>
<ul>
<li><a href="#Background">Background</a></li>
<li><a href="#Setup">Setup</a></li>
<li><a href="#Example_Configuration">Example Configuration</a></li></ul></li>
<li><a href="#Legacy_HDFS_Short-Circuit_Local_Reads">Legacy HDFS Short-Circuit Local Reads</a></li></ul></li></ul>
<div class="section">
<h3><a name="Short-Circuit_Local_Reads">Short-Circuit Local Reads</a></h3>
<div class="section">
<h4>Background<a name="Background"></a></h4>
<p>In <tt>HDFS</tt>, reads normally go through the <tt>DataNode</tt>. Thus, when the client asks the <tt>DataNode</tt> to read a file, the <tt>DataNode</tt> reads that file off of the disk and sends the data to the client over a TCP socket. So-called &quot;short-circuit&quot; reads bypass the <tt>DataNode</tt>, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications.</p></div>
<div class="section">
<h4>Setup<a name="Setup"></a></h4>
<p>To configure short-circuit local reads, you will need to enable <tt>libhadoop.so</tt>. See <a href="../hadoop-common/NativeLibraries.html">Native Libraries</a> for details on enabling this library.</p>
<p>Short-circuit reads make use of a UNIX domain socket. This is a special path in the filesystem that allows the client and the <tt>DataNode</tt>s to communicate. You will need to set a path to this socket. The <tt>DataNode</tt> needs to be able to create this path. On the other hand, it should not be possible for any user except the HDFS user or root to create this path. For this reason, paths under <tt>/var/run</tt> or <tt>/var/lib</tt> are often used.</p>
<p>The client and the <tt>DataNode</tt> exchange information via a shared memory segment on <tt>/dev/shm</tt>.</p>
<p>Short-circuit local reads need to be configured on both the <tt>DataNode</tt> and the client.</p></div>
<div class="section">
<h4>Example Configuration<a name="Example_Configuration"></a></h4>
<p>Here is an example configuration.</p>
<div>
<pre>&lt;configuration&gt;
&lt;property&gt;
&lt;name&gt;dfs.client.read.shortcircuit&lt;/name&gt;
&lt;value&gt;true&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.domain.socket.path&lt;/name&gt;
&lt;value&gt;/var/lib/hadoop-hdfs/dn_socket&lt;/value&gt;
&lt;/property&gt;
&lt;/configuration&gt;</pre></div></div></div>
<div class="section">
<h3>Legacy HDFS Short-Circuit Local Reads<a name="Legacy_HDFS_Short-Circuit_Local_Reads"></a></h3>
<p>Legacy implementation of short-circuit local reads on which the clients directly open the HDFS block files is still available for platforms other than the Linux. Setting the value of <tt>dfs.client.use.legacy.blockreader.local</tt> in addition to <tt>dfs.client.read.shortcircuit</tt> to true enables this feature.</p>
<p>You also need to set the value of <tt>dfs.datanode.data.dir.perm</tt> to <tt>750</tt> instead of the default <tt>700</tt> and chmod/chown the directory tree under <tt>dfs.datanode.data.dir</tt> as readable to the client and the <tt>DataNode</tt>. You must take caution because this means that the client can read all of the block files bypassing HDFS permission.</p>
<p>Because Legacy short-circuit local reads is insecure, access to this feature is limited to the users listed in the value of <tt>dfs.block.local-path-access.user</tt>.</p>
<div>
<pre>&lt;configuration&gt;
&lt;property&gt;
&lt;name&gt;dfs.client.read.shortcircuit&lt;/name&gt;
&lt;value&gt;true&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.client.use.legacy.blockreader.local&lt;/name&gt;
&lt;value&gt;true&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.datanode.data.dir.perm&lt;/name&gt;
&lt;value&gt;750&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.block.local-path-access.user&lt;/name&gt;
&lt;value&gt;foo,bar&lt;/value&gt;
&lt;/property&gt;
&lt;/configuration&gt;</pre></div></div></div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">&#169; 2016
Apache Software Foundation
- <a href="https://maven.apache.org/privacy-policy.html">Privacy Policy</a></div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>