|  | <!DOCTYPE html> | 
|  | <!-- | 
|  | | Generated by Apache Maven Doxia Site Renderer 1.8.1 from target/generated-site/markdown/aws.md at 2024-04-01 | 
|  | | Rendered using Apache Maven Fluido Skin 1.7 | 
|  | --> | 
|  | <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> | 
|  | <head> | 
|  | <meta charset="UTF-8" /> | 
|  | <meta name="viewport" content="width=device-width, initial-scale=1.0" /> | 
|  | <meta name="Date-Revision-yyyymmdd" content="20240401" /> | 
|  | <meta http-equiv="Content-Language" content="en" /> | 
|  | <title>AsterixDB – Installation using Amazon Web Services</title> | 
|  | <link rel="stylesheet" href="./css/apache-maven-fluido-1.7.min.css" /> | 
|  | <link rel="stylesheet" href="./css/site.css" /> | 
|  | <link rel="stylesheet" href="./css/print.css" media="print" /> | 
|  | <script type="text/javascript" src="./js/apache-maven-fluido-1.7.min.js"></script> | 
|  |  | 
|  | </head> | 
|  | <body class="topBarDisabled"> | 
|  | <div class="container-fluid"> | 
|  | <div id="banner"> | 
|  | <div class="pull-left"><a href="./" id="bannerLeft"><img src="images/asterixlogo.png"  alt="AsterixDB"/></a></div> | 
|  | <div class="pull-right"></div> | 
|  | <div class="clear"><hr/></div> | 
|  | </div> | 
|  |  | 
|  | <div id="breadcrumbs"> | 
|  | <ul class="breadcrumb"> | 
|  | <li id="publishDate">Last Published: 2024-04-01</li> | 
|  | <li id="projectVersion" class="pull-right">Version: 0.9.9</li> | 
|  | <li class="pull-right"><a href="index.html" title="Documentation Home">Documentation Home</a></li> | 
|  | </ul> | 
|  | </div> | 
|  | <div class="row-fluid"> | 
|  | <div id="bodyColumn"  class="span10" > | 
|  | <!-- | 
|  | ! Licensed to the Apache Software Foundation (ASF) under one | 
|  | ! or more contributor license agreements.  See the NOTICE file | 
|  | ! distributed with this work for additional information | 
|  | ! regarding copyright ownership.  The ASF licenses this file | 
|  | ! to you under the Apache License, Version 2.0 (the | 
|  | ! "License"); you may not use this file except in compliance | 
|  | ! with the License.  You may obtain a copy of the License at | 
|  | ! | 
|  | !   http://www.apache.org/licenses/LICENSE-2.0 | 
|  | ! | 
|  | ! Unless required by applicable law or agreed to in writing, | 
|  | ! software distributed under the License is distributed on an | 
|  | ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | 
|  | ! KIND, either express or implied.  See the License for the | 
|  | ! specific language governing permissions and limitations | 
|  | ! under the License. | 
|  | !--> | 
|  | <h1>Installation using Amazon Web Services</h1> | 
|  | <div class="section"> | 
|  | <h2><a name="Table_of_Contents"></a><a name="atoc" id="#toc">Table of Contents</a></h2> | 
|  | <ul> | 
|  |  | 
|  | <li><a href="#Introduction">Introduction</a></li> | 
|  | <li><a href="#Prerequisites">Prerequisites</a></li> | 
|  | <li><a href="#config">Cluster Configuration</a></li> | 
|  | <li><a href="#lifecycle">Cluster Lifecycle Management</a></li> | 
|  | </ul> | 
|  | </div> | 
|  | <div class="section"> | 
|  | <h2><a name="Introduction" id="Introduction">Introduction</a></h2> | 
|  | <p>Note that you can always manually launch a number of Amazon Web Services EC2 instances and then run the Ansible cluster installation scripts as described <a href="ansible.html">here</a> separately to manage the lifecycle of an AsterixDB cluster on those EC2 instances.</p> | 
|  | <p>However, this installation option provides a combined solution that automates both AWS EC2 and AsterixDB: a single script each is enough to deploy, start, stop, or terminate an AsterixDB cluster on AWS.</p></div> | 
|  | <div class="section"> | 
|  | <h2><a name="Prerequisites" id="Prerequisites">Prerequisites</a></h2> | 
|  | <ul> | 
|  |  | 
|  | <li> | 
|  |  | 
|  | <p>Supported operating systems for the client: <b>Linux</b> and <b>macOS</b></p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Supported operating systems for Amazon Web Services instances: <b>Linux</b></p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Install pip on your client machine:</p> | 
|  | <p>CentOS</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ sudo yum install python-pip | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>Ubuntu</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ sudo apt-get install python-pip | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>macOS</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ brew install pip | 
|  | </pre></div></div> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Install Ansible, boto, and boto3 on your client machine:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ pip install ansible | 
|  | $ pip install boto | 
|  | $ pip install boto3 | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>Note that you might need <tt>sudo</tt> depending on your system configuration.</p> | 
|  | <p><b>Make sure that the version of Ansible is no less than 2.2.1.0</b>:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ ansible --version | 
|  | ansible 2.2.1.0 | 
|  | </pre></div></div> | 
|  |  | 
|  | <p><b>For users with macOS 10.11+</b>, please create a user-level Ansible configuration file at:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> ~/.ansible.cfg | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>and add the following configuration:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> [ssh_connection] | 
|  | control_path = %(directory)s/%%C | 
|  | </pre></div></div> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Download the AsterixDB distribution package, unzip it, and navigate to <tt>opt/aws/</tt>:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> $ cd opt/aws | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>The following files and directories are in the directory <tt>opt/aws</tt>:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> README  bin  conf  yaml | 
|  | </pre></div></div> | 
|  |  | 
|  | <p><tt>bin</tt> contains scripts that start and terminate an AWS-based cluster instance, according to the configuration specified in files under <tt>conf</tt>, and <tt>yaml</tt> contains internal Ansible scripts that the shell scripts in <tt>bin</tt> use.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Create an AWS account and an IAM user.</p> | 
|  | <p>Set up a security group that you’d like to use for your AWS cluster. <b>The security group should at least allow all TCP connections from anywhere.</b> Provide the name of the security group as the value for the <tt>group</tt> field in <tt>conf/aws_settings.yml</tt>.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Retrieve your AWS EC2 key pair name and use it as the <tt>keypair</tt> in <tt>conf/aws_settings.yml</tt>.</p> | 
|  | <p>Retrieve your AWS IAM <tt>access key ID</tt> and use it as the <tt>access_key_id</tt> in <tt>conf/aws_settings.yml</tt>.</p> | 
|  | <p>Retrieve your AWS IAM <tt>secret access key</tt> and use it as the <tt>secret_access_key</tt> in <tt>conf/aws_settings.yml</tt>.</p> | 
|  | <p>Note that the <tt>secret access key</tt> can be read or downloaded from your AWS console only once, when the key is created. If you lose it, you have to create a new access key and delete the old one.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Configure your SSH settings by editing <tt>~/.ssh/config</tt> and adding the following entry:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> Host *.amazonaws.com | 
|  | IdentityFile &lt;path_of_private_key&gt; | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>Note that <tt>&lt;path_of_private_key&gt;</tt> should be replaced by the path to the file that stores the private key for the key pair that you uploaded to AWS and used in <tt>conf/aws_settings.yml</tt>. For example:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> Host *.amazonaws.com | 
|  | IdentityFile ~/.ssh/id_rsa | 
|  | </pre></div></div> | 
|  | </li> | 
|  | </ul></div> | 
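<p>The Ansible version requirement above can be checked mechanically. The following is a minimal sketch and not part of the AsterixDB distribution: the helper names <tt>extract_version</tt> and <tt>version_ge</tt> are hypothetical, and the code assumes bash and purely numeric dotted version strings.</p>

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; not shipped with AsterixDB.

# Prints the first dotted version number found in the text passed as $1.
extract_version() {
    printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Returns 0 when dotted version $1 is greater than or equal to $2.
# Missing components are treated as 0, so 2.3 compares like 2.3.0.0.
version_ge() {
    local -a have want
    IFS=. read -r -a have <<< "$1"
    IFS=. read -r -a want <<< "$2"
    local i n=${#have[@]}
    (( ${#want[@]} > n )) && n=${#want[@]}
    for (( i = 0; i < n; i++ )); do
        local h=${have[i]:-0} w=${want[i]:-0}
        (( 10#$h > 10#$w )) && return 0
        (( 10#$h < 10#$w )) && return 1
    done
    return 0
}

# Only runs the check when ansible is actually installed.
if command -v ansible >/dev/null 2>&1; then
    found=$(extract_version "$(ansible --version | head -n 1)")
    if version_ge "$found" 2.2.1.0; then
        echo "ansible $found satisfies the 2.2.1.0 minimum"
    else
        echo "ansible $found is older than 2.2.1.0" >&2
    fi
else
    echo "ansible not found on PATH" >&2
fi
```

<p>The sketch extracts the first dotted number from the first line of <tt>ansible --version</tt> rather than a fixed field, because the layout of that line may differ between Ansible releases.</p>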
|  | <div class="section"> | 
|  | <h2><a name="Cluster_Configuration"></a><a name="config" id="config">Cluster Configuration</a></h2> | 
|  | <ul> | 
|  |  | 
|  | <li> | 
|  |  | 
|  | <p><b>AWS settings</b>. Edit <tt>conf/aws_settings.yml</tt>. The parameters have the following meanings:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> # The OS image id for ec2 instances. | 
|  | image: ami-76fa4116 | 
|  |  | 
|  | # The data center region for ec2 instances. | 
|  | region: us-west-2 | 
|  |  | 
|  | # The tag for each ec2 machine. Use different tags for isolation. | 
|  | tag: scale_test | 
|  |  | 
|  | # The name of a security group that appears in your AWS console. | 
|  | group: default | 
|  |  | 
|  | # The name of a key pair that appears in your AWS console. | 
|  | keypair: &lt;to be filled&gt; | 
|  |  | 
|  | # The AWS access key id for your IAM user. | 
|  | access_key_id: &lt;to be filled&gt; | 
|  |  | 
|  | # The AWS secret key for your IAM user. | 
|  | secret_access_key: &lt;to be filled&gt; | 
|  |  | 
|  | # The AWS instance type. A full list of available types is at: | 
|  | # https://aws.amazon.com/ec2/instance-types/ | 
|  | instance_type: t2.micro | 
|  |  | 
|  | # The number of ec2 instances that construct a cluster. | 
|  | count: 3 | 
|  |  | 
|  | # The user name. | 
|  | user: ec2-user | 
|  |  | 
|  | # Whether to reuse one node controller (NC) machine to host the cluster controller (CC) process. | 
|  | cc_on_nc: false | 
|  | </pre></div></div> | 
|  |  | 
|  | <p><b>As described in <a href="#Prerequisites">prerequisites</a>, the following parameters must be customized:</b></p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> # The tag for each ec2 machine. Use different tags for isolation. | 
|  | tag: scale_test | 
|  |  | 
|  | # The name of a security group that appears in your AWS console. | 
|  | group: default | 
|  |  | 
|  | # The name of a key pair that appears in your AWS console. | 
|  | keypair: &lt;to be filled&gt; | 
|  |  | 
|  | # The AWS access key id for your IAM user. | 
|  | access_key_id: &lt;to be filled&gt; | 
|  |  | 
|  | # The AWS secret key for your IAM user. | 
|  | secret_access_key: &lt;to be filled&gt; | 
|  | </pre></div></div> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p><b>Remote working directories</b>. Edit <tt>conf/instance_settings.yml</tt> to change the remote binary directory (the variable <tt>binarydir</tt>) when necessary. By default, the binary directory is placed under the home directory of the ssh user account on each node (the value of the Ansible built-in variable <tt>ansible_env.HOME</tt>).</p> | 
|  | </li> | 
|  | </ul></div> | 
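<p>Because several parameters ship with a <tt>&lt;to be filled&gt;</tt> placeholder, it can help to confirm that none are left before deploying. The following is a minimal sketch under stated assumptions: the function name <tt>unfilled_keys</tt> is hypothetical, the settings file is assumed to use simple top-level <tt>key: value</tt> lines as in the listing above, and the path is the <tt>conf/aws_settings.yml</tt> file named in the prerequisites.</p>

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the AsterixDB distribution.

# Prints the name of every top-level key in file $1 whose value is still
# the literal "<to be filled>" placeholder, one key per line.
unfilled_keys() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\):[[:space:]]*<to be filled>[[:space:]]*$/\1/p' "$1"
}

# Assumed location of the settings file, relative to opt/aws.
settings=conf/aws_settings.yml
if [ -f "$settings" ]; then
    missing=$(unfilled_keys "$settings")
    if [ -n "$missing" ]; then
        # Unquoted on purpose so the keys are joined onto one line.
        echo "Still unset in $settings:" $missing >&2
    else
        echo "All placeholders in $settings are filled in."
    fi
fi
```

<p>The check only looks for the placeholder text; it does not validate that the key pair, security group, or credentials actually exist in your AWS account.</p>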
|  | <div class="section"> | 
|  | <h2><a name="Cluster_Lifecycle_Management"></a><a name="lifecycle" id="lifecycle">Cluster Lifecycle Management</a></h2> | 
|  | <ul> | 
|  |  | 
|  | <li> | 
|  |  | 
|  | <p>Allocate AWS EC2 nodes (the number of nodes is the <tt>count</tt> value in the AWS settings described under <a href="#config">Cluster Configuration</a>) and deploy the binary to all allocated EC2 nodes:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> bin/deploy.sh | 
|  | </pre></div></div> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Before starting the AsterixDB cluster, you can modify the instance configuration file <tt>conf/instance/cc.conf</tt>, with the exception of the IP addresses/DNS names, which are generated and cannot be changed. All available parameters and their usage can be found <a href="ncservice.html#Parameters">here</a>.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>Launch your AsterixDB cluster on EC2:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> bin/start.sh | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>Now you can use the multi-node AsterixDB cluster on EC2 by opening the master node listed in <tt>conf/instance/inventory</tt> at port <tt>19001</tt> (which can be customized in <tt>conf/instance/cc.conf</tt>) in your browser.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>If you want to stop the AWS-based AsterixDB cluster, run the following script:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> bin/stop.sh | 
|  | </pre></div></div> | 
|  |  | 
|  | <p>Note that this only stops AsterixDB but does not stop the EC2 nodes.</p> | 
|  | </li> | 
|  | <li> | 
|  |  | 
|  | <p>If you want to terminate the EC2 nodes that run the AsterixDB cluster, run the following script:</p> | 
|  |  | 
|  | <div> | 
|  | <div> | 
|  | <pre class="source"> bin/terminate.sh | 
|  | </pre></div></div> | 
|  |  | 
|  | <p><b>Note that it will destroy everything in the AsterixDB cluster you installed and terminate all EC2 nodes for the cluster.</b></p> | 
|  | </li> | 
|  | </ul></div> | 
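<p>The four lifecycle scripts above can be wrapped so that the destructive step is harder to run by accident. The following is a minimal sketch, not something the distribution provides: the function names <tt>lifecycle_script</tt> and <tt>run_step</tt> are hypothetical, and the code assumes bash and that it is run from the <tt>opt/aws</tt> directory.</p>

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the scripts in bin/; not shipped with AsterixDB.

# Maps a lifecycle step name to its script path; fails on unknown names.
lifecycle_script() {
    case "$1" in
        deploy|start|stop|terminate) printf 'bin/%s.sh\n' "$1" ;;
        *) echo "unknown step: $1" >&2; return 1 ;;
    esac
}

# Runs one step. "terminate" asks for confirmation first, because it
# destroys the cluster contents and terminates all of its EC2 nodes.
run_step() {
    local script answer
    script=$(lifecycle_script "$1") || return 1
    if [ "$1" = terminate ]; then
        read -r -p "Terminate all EC2 nodes and destroy the cluster? [y/N] " answer
        [ "$answer" = y ] || { echo "aborted"; return 1; }
    fi
    "./$script"
}

# Example usage (commented out so that defining the helpers has no side effects):
#   run_step deploy && run_step start
#   run_step stop
#   run_step terminate
```

<p>Keeping <tt>stop</tt> and <tt>terminate</tt> as separate steps mirrors the scripts themselves: stopping only shuts down AsterixDB, while the EC2 nodes stay allocated until they are terminated.</p>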
|  | </div> | 
|  | </div> | 
|  | </div> | 
|  | <hr/> | 
|  | <footer> | 
|  | <div class="container-fluid"> | 
|  | <div class="row-fluid"> | 
|  | <div class="row-fluid">Apache AsterixDB, AsterixDB, Apache, the Apache | 
|  | feather logo, and the Apache AsterixDB project logo are either | 
|  | registered trademarks or trademarks of The Apache Software | 
|  | Foundation in the United States and other countries. | 
|  | All other marks mentioned may be trademarks or registered | 
|  | trademarks of their respective owners. | 
|  | </div> | 
|  | </div> | 
|  | </div> | 
|  | </footer> | 
|  | </body> | 
|  | </html> |