<!DOCTYPE html>
<!--
| Generated by Apache Maven Doxia at 2017-01-25
| Rendered using Apache Maven Fluido Skin 1.3.0
-->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="Date-Revision-yyyymmdd" content="20170125" />
<meta http-equiv="Content-Language" content="en" />
<title>AsterixDB &#x2013; Table of Contents</title>
<link rel="stylesheet" href="./css/apache-maven-fluido-1.3.0.min.css" />
<link rel="stylesheet" href="./css/site.css" />
<link rel="stylesheet" href="./css/print.css" media="print" />
<script type="text/javascript" src="./js/apache-maven-fluido-1.3.0.min.js"></script>
<script>(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-41536543-1', 'uci.edu');
ga('send', 'pageview');</script>
</head>
<body class="topBarDisabled">
<div class="container-fluid">
<div id="banner">
<div class="pull-left">
<a href="./" id="bannerLeft">
<img src="images/asterixlogo.png" alt="AsterixDB"/>
</a>
</div>
<div class="pull-right"> </div>
<div class="clear"><hr/></div>
</div>
<div id="breadcrumbs">
<ul class="breadcrumb">
<li id="publishDate">Last Published: 2017-01-25</li>
<li id="projectVersion" class="pull-right">Version: 0.9.0</li>
<li class="divider pull-right">|</li>
<li class="pull-right"> <a href="index.html" title="Documentation Home">
Documentation Home</a>
</li>
</ul>
</div>
<div class="row-fluid">
<div id="leftColumn" class="span3">
<div class="well sidebar-nav">
<ul class="nav nav-list">
<li class="nav-header">Get Started - Installation</li>
<li class="active">
<a href="#"><i class="none"></i>Option 1: using NCService</a>
</li>
<li>
<a href="install.html" title="Option 2: using Managix">
<i class="none"></i>
Option 2: using Managix</a>
</li>
<li>
<a href="yarn.html" title="Option 3: using YARN">
<i class="none"></i>
Option 3: using YARN</a>
</li>
<li class="nav-header">AsterixDB Primer</li>
<li>
<a href="sqlpp/primer-sqlpp.html" title="Option 1: using SQL++">
<i class="none"></i>
Option 1: using SQL++</a>
</li>
<li>
<a href="aql/primer.html" title="Option 2: using AQL">
<i class="none"></i>
Option 2: using AQL</a>
</li>
<li class="nav-header">Data Model</li>
<li>
<a href="datamodel.html" title="The Asterix Data Model">
<i class="none"></i>
The Asterix Data Model</a>
</li>
<li class="nav-header">Queries - SQL++</li>
<li>
<a href="sqlpp/manual.html" title="The SQL++ Query Language">
<i class="none"></i>
The SQL++ Query Language</a>
</li>
<li>
<a href="sqlpp/builtins.html" title="Builtin Functions">
<i class="none"></i>
Builtin Functions</a>
</li>
<li class="nav-header">Queries - AQL</li>
<li>
<a href="aql/manual.html" title="The Asterix Query Language (AQL)">
<i class="none"></i>
The Asterix Query Language (AQL)</a>
</li>
<li>
<a href="aql/builtins.html" title="Builtin Functions">
<i class="none"></i>
Builtin Functions</a>
</li>
<li class="nav-header">Advanced Features</li>
<li>
<a href="aql/similarity.html" title="Support of Similarity Queries">
<i class="none"></i>
Support of Similarity Queries</a>
</li>
<li>
<a href="aql/fulltext.html" title="Support of Full-text Queries">
<i class="none"></i>
Support of Full-text Queries</a>
</li>
<li>
<a href="aql/externaldata.html" title="Accessing External Data">
<i class="none"></i>
Accessing External Data</a>
</li>
<li>
<a href="feeds/tutorial.html" title="Support for Data Ingestion">
<i class="none"></i>
Support for Data Ingestion</a>
</li>
<li>
<a href="udf.html" title="User Defined Functions">
<i class="none"></i>
User Defined Functions</a>
</li>
<li>
<a href="aql/filters.html" title="Filter-Based LSM Index Acceleration">
<i class="none"></i>
Filter-Based LSM Index Acceleration</a>
</li>
<li class="nav-header">API/SDK</li>
<li>
<a href="api.html" title="HTTP API">
<i class="none"></i>
HTTP API</a>
</li>
</ul>
<hr class="divider" />
<div id="poweredBy">
<div class="clear"></div>
<div class="clear"></div>
<div class="clear"></div>
<a href="./" title="AsterixDB" class="builtBy">
<img class="builtBy" alt="AsterixDB" src="images/asterixlogo.png" />
</a>
</div>
</div>
</div>
<div id="bodyColumn" class="span9" >
<!-- ! Licensed to the Apache Software Foundation (ASF) under one
! or more contributor license agreements. See the NOTICE file
! distributed with this work for additional information
! regarding copyright ownership. The ASF licenses this file
! to you under the Apache License, Version 2.0 (the
! "License"); you may not use this file except in compliance
! with the License. You may obtain a copy of the License at
!
! http://www.apache.org/licenses/LICENSE-2.0
!
! Unless required by applicable law or agreed to in writing,
! software distributed under the License is distributed on an
! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
! KIND, either express or implied. See the License for the
! specific language governing permissions and limitations
! under the License.
! --><div class="section">
<h2><a name="Table_of_Contents"></a><a name="toc" id="toc">Table of Contents</a></h2>
<ul>
<li><a href="#Small_cluster">Starting a small cluster using the NCService</a></li>
<li><a href="#Parameters">Parameter settings</a></li>
</ul>
<h1><a name="Small_cluster" id="Small_cluster">Starting a small cluster using the NCService</a></h1>
<p>When running a cluster using the <tt>NCService</tt>, three different kinds of processes are involved:</p>
<ol style="list-style-type: decimal">
<li><tt>NCDriver</tt> does the work of a NodeController</li>
<li><tt>NCService</tt> configures and starts an <tt>NCDriver</tt></li>
<li><tt>CCDriver</tt> does the work of a ClusterController and sends the configuration to the <tt>NCServices</tt></li>
</ol>
<p>To start a small cluster consisting of 2 NodeControllers (<tt>red</tt> and <tt>blue</tt>) and 1 ClusterController (<tt>cc</tt>) on a single machine, only 2 configuration files are required. The first one is</p>
<p><tt>blue.conf</tt>:</p>
<div class="source">
<div class="source">
<pre>[ncservice]
port=9091
</pre></div></div>
<p>This is the configuration file for the second <tt>NCService</tt>. It contains only the port that the <tt>NCService</tt> of the second NodeController listens on, because that port is non-standard. The first <tt>NCService</tt> does not need a configuration file, as it uses only default parameters. In a distributed environment with 1 NodeController per machine, no <tt>NCService</tt> needs a configuration file.</p>
<p>The second configuration file is</p>
<p><tt>cc.conf</tt>:</p>
<div class="source">
<div class="source">
<pre>[nc/red]
txnlogdir=/tmp/asterix/red/txnlog
coredumpdir=/tmp/asterix/red/coredump
iodevices=/tmp/asterix/red
[nc/blue]
port=9091
txnlogdir=/tmp/asterix/blue/txnlog
coredumpdir=/tmp/asterix/blue/coredump
iodevices=/tmp/asterix/blue
[nc]
app.class=org.apache.asterix.hyracks.bootstrap.NCApplicationEntryPoint
storagedir=storage
address=127.0.0.1
command=asterixnc
[cc]
cluster.address = 127.0.0.1
http.port = 12345
</pre></div></div>
<p>This is the configuration file for the cluster and it contains information that each <tt>NCService</tt> will use when starting the corresponding <tt>NCDriver</tt> as well as information for the <tt>CCDriver</tt>.</p>
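<p>For the distributed case mentioned above (1 NodeController per machine), a <tt>cc.conf</tt> might look like the following sketch. The IP addresses are hypothetical placeholders, and placing <tt>address</tt> in the per-node sections instead of the shared <tt>[nc]</tt> section is an assumption, as is the use of <tt>;</tt> comment lines. No <tt>port</tt> entry is needed because every <tt>NCService</tt> then listens on its default port.</p>
<div class="source">
<div class="source">
<pre>; sketch only: all addresses below are hypothetical
[nc/red]
address=192.168.0.11
txnlogdir=/tmp/asterix/red/txnlog
coredumpdir=/tmp/asterix/red/coredump
iodevices=/tmp/asterix/red
[nc/blue]
address=192.168.0.12
txnlogdir=/tmp/asterix/blue/txnlog
coredumpdir=/tmp/asterix/blue/coredump
iodevices=/tmp/asterix/blue
[nc]
app.class=org.apache.asterix.hyracks.bootstrap.NCApplicationEntryPoint
storagedir=storage
command=asterixnc
[cc]
cluster.address = 192.168.0.10
http.port = 12345
</pre></div></div>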
<p>To start the cluster, use the following steps:</p>
<ol style="list-style-type: decimal">
<li>
<p>Set <tt>BASEDIR</tt> to the location of an unzipped asterix-server binary assembly (in the source tree, that&#x2019;s under <tt>asterixdb/asterix-server/target</tt>).</p>
<div class="source">
<div class="source">
<pre>$ export BASEDIR=[..]/asterix-server-0.8.9-SNAPSHOT-binary-assembly
</pre></div></div></li>
<li>
<p>Start the 2 <tt>NCServices</tt> for <tt>red</tt> and <tt>blue</tt>.</p>
<div class="source">
<div class="source">
<pre>$ $BASEDIR/bin/asterixncservice -config-file blue.conf &gt; blue-service.log 2&gt;&amp;1 &amp;
$ $BASEDIR/bin/asterixncservice &gt;red-service.log 2&gt;&amp;1 &amp;
</pre></div></div></li>
<li>
<p>Start the <tt>CCDriver</tt>.</p>
<div class="source">
<div class="source">
<pre>$ $BASEDIR/bin/asterixcc -config-file cc.conf &gt; cc.log 2&gt;&amp;1 &amp;
</pre></div></div></li>
</ol>
<p>The <tt>CCDriver</tt> will connect to the <tt>NCServices</tt> and thus initiate the configuration and the start of the <tt>NCDrivers</tt>. After running these scripts, <tt>jps</tt> should show a result similar to this:</p>
<div class="source">
<div class="source">
<pre>$ jps
13184 NCService
13200 NCDriver
13185 NCService
13186 CCDriver
13533 Jps
13198 NCDriver
</pre></div></div>
<p>The logs for the <tt>NCDrivers</tt> will be in <tt>$BASEDIR/logs</tt>.</p>
<p>To stop the cluster again, simply run</p>
<div class="source">
<div class="source">
<pre>$ kill `jps | egrep '(CDriver|NCService)' | awk '{print $1}'`
</pre></div></div>
<p>to kill all processes.</p>
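<p>The pattern in that command works because <tt>CDriver</tt> is a substring of both <tt>CCDriver</tt> and <tt>NCDriver</tt>, so a single alternative covers both drivers. The following sketch checks which process IDs would be selected, using the sample <tt>jps</tt> listing above as canned input instead of live processes. It uses <tt>grep -E</tt>, which is equivalent to <tt>egrep</tt>; the variable names are illustrative only.</p>
<div class="source">
<div class="source">
<pre># Sample jps output copied from the listing above (PIDs are illustrative)
sample='13184 NCService
13200 NCDriver
13185 NCService
13186 CCDriver
13533 Jps
13198 NCDriver'

# Select every line naming a driver or an NCService, then keep only the PID column.
# The Jps line matches neither alternative and is therefore skipped.
pids=$(printf '%s\n' "$sample" | grep -E '(CDriver|NCService)' | awk '{print $1}')

# Print the PIDs on one line; these are what would be passed to kill
echo $pids
</pre></div></div>
<p>Running the same pipeline without the final <tt>kill</tt> against a live cluster is a cheap way to preview what will be stopped.</p>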
<h1><a name="Parameters" id="Parameters">Parameter settings</a></h1>
<p>The following parameters are for the master process, under the &#x201c;[cc]&#x201d; section.</p>
<table border="0" class="table table-striped">
<thead>
<tr class="a">
<th>Parameter </th>
<th>Meaning </th>
<th>Default </th>
</tr>
</thead>
<tbody>
<tr class="b">
<td>instance.name </td>
<td>The name of the AsterixDB instance </td>
<td>&#x201c;DEFAULT_INSTANCE&#x201d; </td>
</tr>
<tr class="a">
<td>max.wait.active.cluster </td>
<td>The maximum time (in seconds) to wait for cluster startup. If the cluster is still not up and running after this threshold, it is considered unavailable. </td>
<td>60 </td>
</tr>
<tr class="b">
<td>metadata.callback.port </td>
<td>The port for metadata communication </td>
<td>0 </td>
</tr>
<tr class="a">
<td>cluster.address </td>
<td>The binding IP address for the AsterixDB instance </td>
<td>N/A </td>
</tr>
</tbody>
</table>
<p>The following parameters are for slave processes, under the &#x201c;[nc]&#x201d; sections.</p>
<table border="0" class="table table-striped">
<thead>
<tr class="a">
<th>Parameter </th>
<th>Meaning </th>
<th>Default </th>
</tr>
</thead>
<tbody>
<tr class="b">
<td>address </td>
<td>The binding IP address for the slave process </td>
<td>N/A </td>
</tr>
<tr class="a">
<td>command </td>
<td>The command for the slave process </td>
<td>N/A (for AsterixDB, it should be &#x201c;asterixnc&#x201d;) </td>
</tr>
<tr class="b">
<td>coredumpdir </td>
<td>The path for core dumps </td>
<td>N/A </td>
</tr>
<tr class="a">
<td>iodevices </td>
<td>Comma-separated directory paths for both storage files and temporary files </td>
<td>N/A </td>
</tr>
<tr class="b">
<td>jvm.args </td>
<td>The JVM arguments </td>
<td>-Xmx1536m </td>
</tr>
<tr class="a">
<td>metadata.port </td>
<td>The metadata communication port on the metadata node. This parameter should only be present in the section of the metadata NC </td>
<td>0 </td>
</tr>
<tr class="b">
<td>metadata.registration.timeout.secs </td>
<td>The timeout threshold (in seconds) for metadata node registration </td>
<td>60 </td>
</tr>
<tr class="a">
<td>port </td>
<td>The port for the NCService that starts the slave process </td>
<td>N/A </td>
</tr>
<tr class="b">
<td>storagedir </td>
<td>The directory for storage files </td>
<td>N/A </td>
</tr>
<tr class="a">
<td>storage.buffercache.maxopenfiles </td>
<td>The maximum number of open files for the buffer cache. Note that this parameter applies only to AsterixDB; the corresponding operating system limit must still be set separately. </td>
<td>2147483647 </td>
</tr>
<tr class="b">
<td>storage.buffercache.pagesize </td>
<td>The page size (in bytes) for the disk buffer cache (for reads) </td>
<td>131072 </td>
</tr>
<tr class="a">
<td>storage.buffercache.size </td>
<td>The overall budget (in bytes) of the disk buffer cache (for reads) </td>
<td>536870912 </td>
</tr>
<tr class="b">
<td>storage.lsm.bloomfilter.falsepositiverate </td>
<td>The false positive rate for the bloom filter of each memory/disk component </td>
<td>0.01 </td>
</tr>
<tr class="a">
<td>storage.memorycomponent.globalbudget </td>
<td>The global budget (in bytes) for all memory components of all datasets and indexes (for writes) </td>
<td>536870912 </td>
</tr>
<tr class="b">
<td>storage.memorycomponent.numcomponents </td>
<td>The number of memory components per data partition per index </td>
<td>2 </td>
</tr>
<tr class="a">
<td>storage.memorycomponent.numpages </td>
<td>The number of pages for all memory components of a dataset, including those for secondary indexes </td>
<td>256 </td>
</tr>
<tr class="b">
<td>storage.memorycomponent.pagesize </td>
<td>The page size (in bytes) of memory components </td>
<td>131072 </td>
</tr>
<tr class="a">
<td>storage.metadata.memorycomponent.numpages </td>
<td>The number of pages for all memory components of a metadata dataset </td>
<td>256 </td>
</tr>
<tr class="b">
<td>txnlogdir </td>
<td>The directory for transaction logs </td>
<td>N/A </td>
</tr>
<tr class="a">
<td>txn.commitprofiler.reportinterval </td>
<td>The interval for reporting commit statistics </td>
<td>5 </td>
</tr>
<tr class="b">
<td>txn.job.recovery.memorysize </td>
<td>The memory budget (in bytes) used for recovery </td>
<td>67108864 </td>
</tr>
<tr class="a">
<td>txn.lock.timeout.sweepthreshold </td>
<td>Interval (in milliseconds) for checking lock timeout </td>
<td>10000 </td>
</tr>
<tr class="b">
<td>txn.lock.timeout.waitthreshold </td>
<td>The timeout (in milliseconds) for waiting on a lock </td>
<td>60000 </td>
</tr>
<tr class="a">
<td>txn.log.buffer.numpages </td>
<td>The number of pages in the transaction log tail </td>
<td>8 </td>
</tr>
<tr class="b">
<td>txn.log.buffer.pagesize </td>
<td>The page size (in bytes) for the transaction log buffer </td>
<td>131072 </td>
</tr>
<tr class="a">
<td>txn.log.checkpoint.history </td>
<td>The number of checkpoints to keep in the transaction log </td>
<td>0 </td>
</tr>
<tr class="b">
<td>txn.log.checkpoint.lsnthreshold </td>
<td>The checkpoint threshold for transaction logs, expressed as the number of LSNs (log sequence numbers) that have been written to the transaction log, i.e., the length of the transaction log </td>
<td>67108864 </td>
</tr>
</tbody>
</table>
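<p>To get a feel for how the memory component defaults relate to each other, the following sketch multiplies them out. It assumes that the write memory of one dataset is <tt>storage.memorycomponent.numpages</tt> times <tt>storage.memorycomponent.pagesize</tt>, and that <tt>storage.memorycomponent.globalbudget</tt> caps the sum over all datasets; this is a reading of the descriptions in the table, not a statement about the exact accounting inside AsterixDB. The shell variable names are illustrative.</p>
<div class="source">
<div class="source">
<pre># Defaults taken from the [nc] parameter table
numpages=256            # storage.memorycomponent.numpages
pagesize=131072         # storage.memorycomponent.pagesize (bytes)
globalbudget=536870912  # storage.memorycomponent.globalbudget (bytes)

# Assumed per-dataset write memory: pages per dataset times page size
per_dataset=$((numpages * pagesize))
echo "bytes per dataset: $per_dataset"

# Assumed number of datasets whose memory components fit in the global budget
echo "datasets within global budget: $((globalbudget / per_dataset))"
</pre></div></div>
<p>With the defaults this gives 33554432 bytes (32 MiB) per dataset and room for 16 such datasets in the 512 MiB global budget.</p>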
<p>The following parameters are for both master and slave processes, under the &#x201c;[app]&#x201d; section.</p>
<table border="0" class="table table-striped">
<thead>
<tr class="a">
<th>Parameter </th>
<th>Meaning </th>
<th>Default </th>
</tr>
</thead>
<tbody>
<tr class="b">
<td>log.level </td>
<td>The logging level for master and slave processes </td>
<td>&#x201c;INFO&#x201d; </td>
</tr>
<tr class="a">
<td>compiler.framesize </td>
<td>The page size (in bytes) for computation </td>
<td>32768 </td>
</tr>
<tr class="b">
<td>compiler.groupmemory </td>
<td>The memory budget (in bytes) for a group by operator instance in a partition </td>
<td>33554432 </td>
</tr>
<tr class="a">
<td>compiler.joinmemory </td>
<td>The memory budget (in bytes) for a join operator instance in a partition </td>
<td>33554432 </td>
</tr>
<tr class="b">
<td>compiler.sortmemory </td>
<td>The memory budget (in bytes) for a sort operator instance in a partition </td>
<td>33554432 </td>
</tr>
<tr class="a">
<td>compiler.parallelism </td>
<td>The degree of parallelism for query execution. Zero means that the storage parallelism is used as the query execution parallelism, while other integer values dictate the number of parallel partitions for query execution. If the number set by a user is too large or too small, the system falls back to using the number of all available CPU cores in the cluster as the degree of parallelism. </td>
<td>0 </td>
</tr>
</tbody>
</table></div>
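<p>As an illustration of how the parameters above could be overridden, the following fragment sets a few of them in their respective sections. It is a sketch only: the values are arbitrary examples rather than recommendations, and it assumes that these sections are added to the same <tt>cc.conf</tt> file used to start the cluster and that <tt>;</tt> comment lines are accepted.</p>
<div class="source">
<div class="source">
<pre>; sketch only: example values, not recommendations
[nc]
jvm.args=-Xmx4096m
storage.buffercache.size=1073741824
[app]
log.level=INFO
compiler.sortmemory=67108864
</pre></div></div>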
</div>
</div>
</div>
<hr/>
<footer>
<div class="container-fluid">
<div class="row span12">Copyright &copy; 2017
<a href="https://www.apache.org/">The Apache Software Foundation</a>.
All Rights Reserved.
</div>
<div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache
feather logo, and the Apache AsterixDB project logo are either
registered trademarks or trademarks of The Apache Software
Foundation in the United States and other countries.
All other marks mentioned may be trademarks or registered
trademarks of their respective owners.</div>
</div>
</footer>
</body>
</html>