blob: 3006f588b59f6f30204ad5ab24e5aa2c27a89892 [file] [log] [blame]
<!DOCTYPE html>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<html>
<head>
<title>Cluster Setup - Apache Blur (Incubator) Documentation</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!-- Bootstrap -->
<link href="resources/css/bootstrap.min.css" rel="stylesheet" media="screen">
<link href="resources/css/bs-docs.css" rel="stylesheet" media="screen">
</head>
<body>
<div class="navbar navbar-inverse navbar-fixed-top">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="http://incubator.apache.org/blur">Apache Blur (Incubator)</a>
</div>
<div class="collapse navbar-collapse">
<ul class="nav navbar-nav">
<li><a href="index.html">Main</a></li>
<li><a href="getting-started.html">Getting Started</a></li>
<li><a href="platform.html">Platform</a></li>
<li><a href="data-model.html">Data Model</a></li>
<li class="active"><a href="cluster-setup.html">Cluster Setup</a></li>
<li><a href="using-blur.html">Using Blur</a></li>
<li><a href="Blur.html">Blur API</a></li>
<li><a href="console.html">Console</a></li>
</ul>
</div>
</div>
</div>
<div class="container bs-docs-container">
<div class="row">
<div class="col-md-3">
<div class="bs-sidebar hidden-print affix" role="complementary">
<ul class="nav bs-sidenav">
<li>
<a href="#general">General Configuration</a>
<ul class="nav">
<li><a href="#general-blur-site">blur-site.properties</a></li>
<li><a href="#general-hadoop">Hadoop</a></li>
</ul>
</li>
<li>
<a href="#controller">Controller Server Configuration</a>
<ul class="nav">
<li><a href="#controller-blur-site">blur-site.properties</a></li>
<li><a href="#controller-blur-env">blur-env.sh</a></li>
</ul>
</li>
<li>
<a href="#shard">Shard Server Configuration</a>
<ul class="nav">
<li><a href="#shard-blur-site">blur-site.properties</a></li>
<li><a href="#shard-blur-env">blur-env.sh</a></li>
<li><a href="#block-cache">Block Cache</a>
<ul class="nav">
<li><a href="#block-cache-v2">&nbsp;&nbsp;V2 Block Cache Configuration</a></li>
<li><a href="#block-cache-v1">&nbsp;&nbsp;V1 Block Cache Configuration</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#metrics">Metrics</a>
<ul class="nav">
<li><a href="#shard-mbean">Shard Server - MBean</a></li>
<li><a href="#reporters">Other Reporters</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div class="col-md-9" role="main">
<section>
<div class="page-header">
<h1 id="general">General Configuration</h1>
</div>
<p>
The basic cluster setup involves editing the blur-site.properties and the blur-env.sh
files in the $BLUR_HOME/conf directory. It is recommended that a standalone ZooKeeper
be setup. Also a modern version of Hadoop with append support is required for proper data
management (the write ahead log requires the sync operation).
<div class="bs-callout bs-callout-warning"><h4>Caution</h4>If you setup a standalone ZooKeeper
you will need to configure Blur to NOT manage the ZooKeeper. You will need to edit blur-env.sh
file:
<pre><code class="bash">export BLUR_MANAGE_ZK=false</code></pre>
</div>
</p>
<h3 id="general-blur-site">blur-site.properties</h3>
<p>
<pre>
<code class="bash"># The ZooKeeper connection string, consider adding a root path to the string, it
# can help when upgrading Blur.
# Example: zknode1:2181,zknode2:2181,zknode3:2181/blur-0.2.4
#
# NOTE: If you provide the root path "/blur-0.2.4", that will have to be manually
# created before Blur will start.
blur.zookeeper.connection=127.0.0.1
# If you are only going to run a single shard cluster then leave this as default.
blur.cluster.name=default
# Sets the default table location in hdfs. If left null or omitted the table uri property in
# the table descriptor will be required for all tables.
blur.cluster.default.table.uri=hdfs://namenode/blur/tables</code>
</pre></p>
<h4>Default Properties</h4>
<table class="table-bordered table-striped table-condensed">
<tr><td>Property</td><td>Description</td></tr>
|||General-Server-Properties|||
</table>
<h3 id="general-hadoop">Hadoop</h3>
<p>
The current version of Blur has Hadoop 1.2.1 embedded in the &quot;apache-blur-*/lib/hadoop-1.2.1&quot; path. However if
you are using a different version of Hadoop or want Blur to use the Hadoop configuration in your installed
version you will need to set the &quot;HADOOP_HOME&quot; environment variable in the
&quot;blur-env.sh&quot; script found in &quot;apache-blur-*/conf/&quot;.
<pre>
<code class="bash"># Edit the blur-env.sh
export HADOOP_HOME=&lt;path to your Hadoop install directory&gt;</code>
</pre>
</p>
</section>
<section>
<div class="page-header">
<h1 id="controller">Controller Server Configuration</h1>
</div>
<h3 id="controller-blur-site">blur-site.properties</h3>
<p>
These are the default settings for the shard server that can be overridden in the blur-site.properties file. Consider increasing the various thread pool counts (*.thread.count). The blur.controller.server.remote.thread.count is very important to increase for larger clusters, basically one thread is used per shard server per query. Some production clusters have set this thread pool to 2000 or more threads.
</p>
<h4>Default Properties</h4>
<table class="table-bordered table-striped table-condensed">
<tr><td>Property</td><td>Description</td></tr>
|||Controller-Server-Properties|||
</table>
<h3 id="controller-blur-env">blur-env.sh</h3>
<pre><code class="bash"># JAVA JVM OPTIONS for the controller servers, jvm tuning parameters are placed here.
# Consider adding the -XX:OnOutOfMemoryError="kill -9 %p" option to kill jvms that are failing due to memory issues.
export BLUR_CONTROLLER_JVM_OPTIONS="-Xmx1024m -Djava.net.preferIPv4Stack=true "
# Time to sleep between controller server commands.
export BLUR_CONTROLLER_SLEEP=0.1
# The of controller servers to spawn per machine.
export BLUR_NUMBER_OF_CONTROLLER_SERVER_INSTANCES_PER_MACHINE=1</code></pre>
</section>
<section>
<div class="page-header">
<h1 id="shard">Shard Server Configuration</h1>
</div>
<h3>Minimum Settings to Configure</h3>
<p>
It is highly recommended that the ulimits are increase on the server specifically:
<ul>
<li>open files</li>
<li>max user processes</li>
</ul>
<br/>
In Hadoop the dfs.datanode.max.xcievers should be increased to at least 4096 if not more.
<pre>
<code class="bash">&lt;property&gt;
&lt;name&gt;dfs.datanode.max.xcievers&lt;/name&gt;
&lt;value&gt;4096&lt;/value&gt;
&lt;/property&gt;</code></pre>
<br/>
In blur-env.sh set the cache memory for the shard processes. DO NOT over allocate this will
likely crash your server.
<pre><code class="bash">-XX:MaxDirectMemorySize=13g</code></pre>
<div class="bs-callout bs-callout-warning"><h4>Caution</h4>
Swap can kill java perform, you may want to consider disabling swap.</div>
</p>
<h3 id="shard-blur-site">blur-site.properties</h3>
<p>
These are the default settings for the shard server that can be overridden in the blur-site.properties file. Consider increasing the various thread pool counts (*.thread.count). Also the blur.max.clause.count sets the BooleanQuery max clause count for Lucene queries.
</p>
<h4>Default Properties</h4>
<table class="table-bordered table-striped table-condensed">
<tr><td>Property</td><td>Description</td></tr>
|||Shard-Server-Properties|||
</table>
<h3 id="shard-blur-env">blur-env.sh</h3>
<pre><code class="bash"># JAVA JVM OPTIONS for the shard servers, jvm tuning parameters are placed here.
export BLUR_SHARD_JVM_OPTIONS="-Xmx1024m -Djava.net.preferIPv4Stack=true -XX:MaxDirectMemorySize=256m "
# Time to sleep between shard server commands.
export BLUR_SHARD_SLEEP=0.1
# The of shard servers to spawn per machine.
export BLUR_NUMBER_OF_SHARD_SERVER_INSTANCES_PER_MACHINE=1</code></pre>
<h3 id="block-cache">Block Cache</h3>
<h4>Why</h4>
<p>HDFS is a great filesystem for streaming large amounts data across large scale clusters. However the random access latency is typically the same performance you would get in reading from a local drive if the data you are trying to access is not in the operating systems file cache. In other words every access to HDFS is similar to a local read with a cache miss. There have been great performance boosts in HDFS over the past few years but it still can't perform at the level that a search engine needs.</p>
<p>Now you might be thinking that Lucene reads from the local hard drive and performs great, so why wouldn't HDFS perform fairly well on it's own? However most of time the Lucene index files are cached by the operating system's file system cache. So Blur has it's own file system cache allows it to perform low latency data look-ups against HDFS.</p>
<h3 id="block-cache-v2">V2 Block Cache Configuration</h3>
<h4>How</h4>
<p>The Google <a href="http://code.google.com/p/concurrentlinkedhashmap/">concurrentlinkedhashmap</a> library is at the center of the block cache in the shard servers. In version 2, which is enabled by default, the slab allocation is no longer used. <a href="http://mail-archives.apache.org/mod_mbox/incubator-blur-dev/201310.mbox/%3CCAB6tTr0Nr2aDLc4kkHoeqiO-utwzBAhb=Ru==GMhQry4aXPjug@mail.gmail.com%3E">Here</a> is a discussion of the motivations behind the rewrite.</p>
<p>Below are the properties related to V2 of the block cache.</p>
<table class="table-bordered table-striped table-condensed">
<tr><td nowrap="1">blur.shard.block.cache.total.size</td><td>
<p>This is used to limit the amount of off heap cache size. By default the cache is 64MB less than the -XX:MaxDirectMemorySize,
so if you want the block cache to use less than that amount then set this value.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.fileBufferSize</td><td>
<p>This is the size of the buffer when accessing hdfs, by default it is set to 8K. However in most systems this should probably be increased to something closer to 64K. Use the &quot;fstune&quot; command in the shell to help figure out what the best buffer size should be in your system.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.cacheBlockSize</td><td>
<p>This is the size of the cache entry for any file that is NOT explicitly defined. Most of the time you are going to want this value to equal the &quot;blur.shard.block.cache.v2.fileBufferSize&quot; value.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.cacheBlockSize.&lt;ext&gt;</td><td>
<p>This is the size of the cache entry for any file that has the given extension. By default &quot;filter&quot; is the only file that has a none default cache block size, it's current value is 32MB. This means that unless file is larger than 32MB in size, it will be stored as a single value in the cache. For cached filters this is required for performance during the transversal of the logical bitset stored in the file.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.store</td><td>
<p>This property defines how the cache will be stored, by default it's off heap. This means that it is not accounted for in the used heap section that you can find in jconsole or visualvm. However you can track it's size through the &quot;top&quot; command in the shell, MBeans in jconsole, or the metrics call via the Blur thrift API.<br/><br/>Unless you are using a specialized JVM or are debugging problem this should remain off heap, however if you would like to use the cache as on heap allocated blocks change this value to ON_HEAP.</p></td></tr>
blur.shard.block.cache.v2.write.cache.ext=
blur.shard.block.cache.v2.write.nocache.ext=fdt
<tr><td nowrap="1">blur.shard.block.cache.v2.read.default</td><td>
<p>This property defines the default action to cache or not to cache the data during a read operation. By default this is true. This will be the action taken if the file extension is not found in either the &quot;blur.shard.block.cache.v2.read.cache.ext&quot; property or the &quot;blur.shard.block.cache.v2.read.nocache.ext&quot; property.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.read.cache.ext</td><td>
<p>This property defines a comma separated list of file extensions that are to be cached during a read operations.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.read.nocache.ext</td><td>
<p>This property defines a comma separated list of file extensions that are NOT to be cached during a read operations. If the file extension is in the &quot;blur.shard.block.cache.v2.read.cache.ext&quot; property, it will have no effect in this list.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.write.default</td><td>
<p>This property defines the default action to cache or not to cache the data during a write operation. By default this is true. This will be the action taken if the file extension is not found in either the &quot;blur.shard.block.cache.v2.write.cache.ext&quot; property or the &quot;blur.shard.block.cache.v2.write.nocache.ext&quot; property.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.write.cache.ext</td><td>
<p>This property defines a comma separated list of file extensions that are to be cached during a write operations.</p></td></tr>
<tr><td nowrap="1">blur.shard.block.cache.v2.write.nocache.ext</td><td>
<p>This property defines a comma separated list of file extensions that are NOT to be cached during a write operations. If the file extension is in the &quot;blur.shard.block.cache.v2.write.cache.ext&quot; property, it will have no effect in this list.</p></td></tr>
</table>
<h3 id="block-cache-v1">V1 Block Cache Configuration</h3>
<h4>How</h4>
<p>On shard server start-up Blur creates 1 or more block cache slabs blur.shard.blockcache.slab.count that are each 128 MB in size. These slabs can be allocated on or off the heap blur.shard.blockcache.direct.memory.allocation. Each slab is broken up into 16,384 blocks with each block size being 8K. Then on the heap there is a concurrent LRU cache that tracks what blocks of what files are in which slab(s) at what offset. So the more slabs of cache you create the more entries there will be in the LRU thus more heap.</p>
<h4>Configuration</h4>
<p>Scenario:
Say the shard server(s) that you are planning to run Blur on have 32G of ram. These machines are probably also running HDFS data nodes as well with very high xcievers (dfs.datanode.max.xcievers in hdfs-site.xml) say 8K. If the data nodes are configured with 1G of heap then they may consume up to 4G of memory due to the high thread count because of the xcievers. Next let's say you configure Blur to 4G of heap as well, and you want to use 12G of off heap cache.</p>
<h5>Auto Configuration</h5>
<p>In the blur-env.sh file you would need to change BLUR_SHARD_JVM_OPTIONS to include "-XX:MaxDirectMemorySize=12g" and possibly "-XX:+UseLargePages" depending on your Linux setup. If you leave the blur.shard.blockcache.slab.count to the default -1 the shard startup will automatically detect the -XX:MaxDirectMemorySize size and automatically use almost all of the memory. By default the JVM has 64m in reserve for direct memory so by default Blur leaves at least that amount available to the JVM.</p>
<h5>Custom Configuration</h5>
<p>Again in the blur-env.sh file you would need to change BLUR_SHARD_JVM_OPTIONS to include "-XX:MaxDirectMemorySize=13g" and possibly "-XX:+UseLargePages" depending on your Linux setup. I set the MaxDirectMemorySize to more than 12G to make sure we don't hit the maximum limit and cause a OOM exception, this does not reserve 13G it's a control to not allow more than that. Below is a working example, it also contains GC logging and GC configuration:</p>
<pre><code class="bash">export BLUR_SHARD_JVM_OPTIONS="-XX:MaxDirectMemorySize=13g \
-XX:+UseLargePages \
-Xms4g \
-Xmx4g \
-Xmn512m \
-XX:+UseCompressedOops \
-XX:+UseConcMarkSweepGC \
-XX:+CMSIncrementalMode \
-XX:CMSIncrementalDutyCycleMin=10 \
-XX:CMSIncrementalDutyCycle=50 \
-XX:ParallelGCThreads=8 \
-XX:+UseParNewGC \
-XX:MaxGCPauseMillis=200 \
-XX:GCTimeRatio=10 \
-XX:+DisableExplicitGC \
-verbose:gc \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps \
-Xloggc:$BLUR_HOME/logs/gc-blur-shard-server_`date +%Y%m%d_%H%M%S`.log"</code></pre>
<p>Next you will need to setup blur-site.properties by changing blur.shard.blockcache.slab.count to 96. This is telling blur to allocate 96 128MB slabs of memory at shard server start-up. Note, that the first time you do this that the shard servers may take long time to allocate the memory. This is because the OS could be using most of that memory for it's own filesystem caching and it will need to unload it which may cause some IO due the cache synching to disk.</p>
<p>Also the blur.shard.blockcache.direct.memory.allocation is set to true by default, this will tell the JVM to try and allocate the memory off heap. If you want to run the slabs in the heap (which is not recommended) set this value to false.</p>
</section>
<section>
<div class="page-header">
<h1 id="metrics">Metrics</h1>
</div>
<p class="lead">Internally Blur uses the Metrics library from Coda Hale (<a href="http://metrics.codahale.com/">http://metrics.codahale.com/</a>). So by default all metrics are available through JMX here is a screenshot of what is available in the Shard server.</p>
<h3 id="shard-mbean">Shard Server - MBean Screenshot</h3>
<img src="resources/img/BlurShardServer.png" style="max-width:1000px"/>
<h3 id="reporters">Configuring Other Reporters</h3>
<p class="lead">New reporters can be added configured in the blur-site.properties. Multiple reporters can be configured.</p>
<h4>Example</h4>
<pre><code class="bash">blur.metrics.reporters=GangliaReporter
blur.metrics.reporter.ganglia.period=3
blur.metrics.reporter.ganglia.unit=SECONDS
blur.metrics.reporter.ganglia.host=ganglia1
blur.metrics.reporter.ganglia.port=8649</code></pre>
<h4>Reporters to Enable</h4>
<pre><code class="bash">blur.metrics.reporters=[ConsoleReporter,CsvReporter,GangliaReporter,GraphiteReporter]</code></pre>
<h4>ConsoleReporter</h4>
<pre><code class="bash">blur.metrics.reporter.console.period=[5]
blur.metrics.reporter.console.unit=[NANOSECONDS,MICROSECONDS,MILLISECONDS,SECONDS,MINUTES,HOURS,DAYS]</code></pre>
<h4>CsvReporter</h4>
<pre><code class="bash">blur.metrics.reporter.csv.period=[5]
blur.metrics.reporter.csv.unit=[NANOSECONDS,MICROSECONDS,MILLISECONDS,SECONDS,MINUTES,HOURS,DAYS]
blur.metrics.reporter.csv.outputDir=[.]</code></pre>
<h4>GangliaReporter</h4>
<pre><code class="bash">blur.metrics.reporter.ganglia.period=[5]
blur.metrics.reporter.ganglia.unit=[NANOSECONDS,MICROSECONDS,MILLISECONDS,SECONDS,MINUTES,HOURS,DAYS]
blur.metrics.reporter.ganglia.host=[localhost]
blur.metrics.reporter.ganglia.port=[-1]
blur.metrics.reporter.ganglia.prefix=[""]
blur.metrics.reporter.ganglia.compressPackageNames=[false]</code></pre>
<h4>GraphiteReporter</h4>
<pre><code class="bash">blur.metrics.reporter.graphite.period=[5]
blur.metrics.reporter.graphite.unit=[NANOSECONDS,MICROSECONDS,MILLISECONDS,SECONDS,MINUTES,HOURS,DAYS]
blur.metrics.reporter.graphite.host=[localhost]
blur.metrics.reporter.graphite.port=[-1]
blur.metrics.reporter.graphite.prefix=[""]</code></pre>
</section>
</div>
</div>
</div>
<!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
<script src="resources/js/jquery-2.0.3.min.js"></script>
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="resources/js/bootstrap.min.js"></script>
<!-- Enable responsive features in IE8 with Respond.js (https://github.com/scottjehl/Respond) -->
<script src="resources/js/respond.min.js"></script>
<script src="resources/js/docs.js"></script>
</body>
</html>