| //// |
| /** |
| * @@@ START COPYRIGHT @@@ |
| * |
| * Licensed to the Apache Software Foundation (ASF) under one |
| * or more contributor license agreements. See the NOTICE file |
| * distributed with this work for additional information |
| * regarding copyright ownership. The ASF licenses this file |
| * to you under the Apache License, Version 2.0 (the |
| * "License"); you may not use this file except in compliance |
| * with the License. You may obtain a copy of the License at |
| * |
| * http://www.apache.org/licenses/LICENSE-2.0 |
| * |
| * Unless required by applicable law or agreed to in writing, |
| * software distributed under the License is distributed on an |
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| * KIND, either express or implied. See the License for the |
| * specific language governing permissions and limitations |
| * under the License. |
| * |
| * @@@ END COPYRIGHT @@@ |
| */ |
| //// |
| |
| [[introduction]] |
| = Introduction |
| |
| {project-name} is a Hadoop add-on service that provides transactional SQL on top of HBase. Typically, you |
| use {project-name} as the database for applications that require Online Transaction Processing (OLTP), |
| Operational Data Store (ODS), and/or strong reporting capabilities. You access {project-name} using |
| standard JDBC and ODBC APIs. |
| |
| You may choose whether to add {project-name} to an existing Hadoop environment or to create a standalone |
| Hadoop environment specifically for Hadoop. |
| |
| This guide assumes that a Hadoop environment exists upon which your provisioning {project-name}. Refer to |
| <<requirements-hadoop-software,Hadoop Software>> for information about what Hadoop software is required |
| {project-name}. |
| |
| [[introduction-security-considerations]] |
| == Security Considerations |
| |
| The following users and principals need be considered for {project-name}: |
| |
| * *Provisioning User*: A Linux-level user that performs the {project-name} provisioning tasks. This user ID |
| requires `sudo` access and passwordless ssh among the nodes where {project-name} is installed. In addition, |
| this user ID requires access to Hadoop distribution, HDFS, and HBase administrative users to change |
| respective environment's configuration settings per {project-name} requirements. Refer to |
| <<requirements-trafodion-provisioning-user,{project-name} Provisioning User>> for more information |
| about the requirements and usage associated with this user ID. |
| |
| * *Runtime User*: A Linux-level user under which the {project-name} software runs, default name is `trafodion`. This user ID must be registered |
| as a user in the Hadoop Distributed File System (HDFS) to store and access objects in HDFS, HBase, and Hive. |
| In addition, this user ID requires passwordless access among the nodes where {project-name} is installed. |
| Refer to <<requirements-trafodion-runtime-user,{project-name} Runtime User>> for more information about this user ID. |
| |
| * *{project-name} Database Users*: {project-name} users are managed by {project-name} security features (grant, revoke, etc.), |
| which can be integrated with LDAP if so desired. These users are referred to as *database users* and |
| do not have direct access to the operating system. Refer to <<enable-security-ldap,LDAP>> for |
| details on enabling LDAP for authenticating database users. |
| Refer to {docs-url}/sql_reference/index.html#register_user_statement[Register User], |
| {docs-url}/sql_reference/index.html#grant_statement[Grant], and other SQL statements |
| in the {docs-url}/sql_reference/index.html[{project-name} SQL Reference Manual] for |
| more information about managing {project-name} Database Users. |
| + |
| + |
| If your environment has been provisioned with Kerberos, then the following additional information is required. |
| |
| * *KDC admin principal*: {project-name} requires administrator access to Kerberos to create principals |
| and keytabs for the `trafodion` user, and to look-up principal names for HDFS and HBase keytabs. Refer to |
| <<enable-security-kerberos,Kerberos>> for more information about the requirements and usage associated with this principal. |
| |
| * *HBase keytab location*: {project-name} requires administrator access to HBase to grant required privileges to the `trafodion` user. Refer to |
| <<enable-security-kerberos,Kerberos>> for more information about the requirements and usage associated with this keytab. |
| |
| * *HDFS keytab location*: {project-name} requires administrator access to HDFS to create directories that store files needed to perform SQL requests |
| such as data loads and backups. Refer to |
| <<enable-security-kerberos,Kerberos>> for more information about the requirements and usage associated with this keytab. |
| + |
| + |
| If your environment is using LDAP for authentication, then the following additional information is required. |
| |
| * *LDAP username for database root access*: When {project-name} is installed, it creates a predefined database user referred to as the DB\__ROOT user. |
| In order to connect to the database as database root, there must be a mapping between the database user DB__ROOT and an LDAP user. Refer to |
| <<enable-security-ldap,LDAP>> for more information about this option. |
| |
| * *LDAP search user name*: {project-name} optionally requests an LDAP username and password in order to perform LDAP operations |
| such as LDAP search. Refer to |
| <<enable-security-ldap,LDAP>> for more information about this option. |
| |
| [[introduction-provisioning-options]] |
| == Provisioning Options |
| |
| {project-name} includes two options for installation: a plug-in integration with Apache Ambari and command-line installation scripts. |
| |
| The Ambari integration provides support for Hortonworks Hadoop distributions, while the command-line {project-name} Installer |
| supports Cloudera and Hortonworks Hadoop distributions, and for select vanilla Hadoop installations. |
| |
| The {project-name} Installer supports Linux distributions SUSE and RedHat/CentOS. There are, however, some differences. |
| Prerequisite software packages are not installed automatically on SUSE. |
| |
| The {project-name} Installer automates many of the tasks required to install/upgrade {project-name}, from downloading and |
| installing required software packages and making required configuration changes to your Hadoop environment via creating |
| the {project-name} runtime user ID to installing and starting {project-name}. It is, therefore, highly recommend that |
| you use the {project-name} Installer for initial installation and upgrades of {project-name}. These steps are referred to as |
| "Script-Based Provisioning" in this guide. Refer to <<introduction-trafodion-installer, {project-name} Installer>> that provides |
| usage information. |
| |
| |
| [[introduction-provisioning-activities]] |
| == Provisioning Activities |
| |
| {project-name} provisioning is divided into the following main activities: |
| |
| * *<<requirements,Requirements>>*: Activities and documentation required to install the {project-name} software. |
| These activities include tasks such as understanding hardware and operating system requirements, |
| Hadoop requirements, what software packages that need to be downloaded, configuration settings that need to be changed, |
| and user ID requirements. |
| |
| * *<<prepare,Prepare>>*: Activities to prepare the operating system and the Hadoop ecosystem to run |
| {project-name}. These activities include tasks such as installing required software packages, configure |
| the {project-name} Installation User, gather information about the Hadoop environment, and the modify configuration |
| for different Hadoop services. |
| |
| * *<<install,Install>>*: Activities related to installing the {project-name} software. These activities |
| include tasks such as unpacking the {project-name} tar files, creating the {project-name} Runtime User, |
| creating {project-name} HDFS directories, installing the {project-name} software, and enabling security features. |
| |
| * *<<upgrade,Upgrade>>*: Activities related to the upgrading the {project-name} software. These activities |
| include tasks such as shutting down {project-name} and installing a new version of the {project-name} software. |
| The upgrade tasks vary depending on the differences between the current and new release of |
| {project-name}. For example, an upgrade may or may not include an upgrade of the {project-name} metadata. |
| |
| * *<<activate,Activate>>*: Activities related to starting the {project-name} software. These actives |
| include basic management tasks such as starting and checking the status of the {project-name} components and performing basic smoke tests. |
| |
| * *<<remove,Remove>>*: Activities related to removing {project-name} from your Hadoop cluster. |
| |
| [[introduction-provisioning-master-node]] |
| == Provisioning Master Node |
| All provisioning tasks are performed from a single node in the cluster, which can be any node |
| as long as it has access to the Hadoop environment you're adding {project-name} to. |
| This node is referred to as the "*Provisioning Master Node*" in this guide. |
| |
| The {project-name} Provisioning User must have access to all other nodes from the Provisioning |
| Master Node in order to perform provisioning tasks on the cluster. |
| |
| [[introduction-trafodion-installer]] |
| == {project-name} Installer |
| |
| The {project-name} Installer is a set of scripts automates most of the tasks requires to install/upgrade {project-name}. |
| You download the {project-name} Installer tar file from the {project-name} {download-url}[download] page. |
| Next, you unpack the tar file. |
| |
| *Example* |
| |
| ``` |
| $ mkdir $HOME/trafodion-installer |
| $ cd $HOME/trafodion-downloads |
| $ tar -zxf apache-trafodion-pyinstaller-2.2.0.tar.gz -C $HOME/trafodion-installer |
| $ |
| ``` |
| |
| <<< |
| The {project-name} Installer supports two different modes: |
| |
| 1. *Guided Setup*: Prompts for information as it works through the installation/upgrade process. This mode is recommended for new users. |
| 2. *Automated Setup*: Required information is provided in a pre-formatted ini configuration file, which is provided |
| via a command argument when running the {project-name} Installer thereby suppressing all prompts. This ini configuration file only exists |
| on the *Provisioning Master Node*, please secure this file or delete it after you installed {project-name} successfully. |
| + |
| A template of the configuration file is available here within the installer directory: `configs/db_config_default.ini`. |
| Make a copy of the file in your directory and populate the needed information. |
| + |
| Automated Setup is recommended since it allows you to record the required provisioning information ahead of time. |
| Refer to <<introduction-trafodion-installer-automated-setup,Automated Setup>> for information about how to |
| populate this file. |
| |
| [[introduction-trafodion-installer-usage]] |
| === Usage |
| |
| The following shows help for the {project-name} Installer. |
| |
| ``` |
| $ ./db_install.py -h |
| ********************************** |
| Trafodion Installation ToolKit |
| ********************************** |
| Usage: db_install.py [options] |
| Trafodion install main script. |
| |
| Options: |
| -h, --help show this help message and exit |
| -c FILE, --config-file=FILE |
| Json format file. If provided, all install prompts |
| will be taken from this file and not prompted for. |
| -u USER, --remote-user=USER |
| Specify ssh login user for remote server, |
| if not provided, use current login user as default. |
| -v, --verbose Verbose mode, will print commands. |
| --silent Do not ask user to confirm configuration result |
| --enable-pwd Prompt SSH login password for remote hosts. |
| If set, 'sshpass' tool is required. |
| --build Build the config file in guided mode only. |
| --reinstall Reinstall Trafodion without restarting Hadoop. |
| --apache-hadoop Install Trafodion on top of Apache Hadoop. |
| --offline Enable local repository for offline installing |
| Trafodion. |
| ``` |
| |
| <<< |
| [[introduction-trafodion-installer-install-vs-upgrade]] |
| === Install vs. Upgrade |
| |
| The {project-name} Installer automatically detects whether you're performing an install |
| or an upgrade by looking for the {project-name} Runtime User in the `/etc/passwd` file. |
| |
| * If the user ID doesn't exist, then the {project-name} Installer runs in install mode. |
| * If the user ID exists, then the {project-name} Installer runs in upgrade mode. |
| * If `--reinstall` option is specified, then the {project-name} Installer will not restart Hadoop. It's only available when |
| you reinstall the same release version, otherwise an error will be reported during installation. |
| |
| |
| [[introduction-trafodion-installer-guided-setup]] |
| === Guided Setup |
| |
| By default, the {project-name} Installer runs in Guided Setup mode, which means |
| that it prompts you for information during the install/upgrade process. |
| |
| Refer to the following sections for examples: |
| |
| * <<install-guided-install, Guided Install>> |
| * <<upgrade-guided-upgrade, Guided Upgrade>> |
| |
| [[introduction-trafodion-installer-automated-setup]] |
| === Automated Setup |
| |
| The `--config-file` option runs the {project-name} in Automated Setup mode. |
| |
| Before running the {project-name} Installer with this option, you do the following: |
| |
| 1. Copy the `db_config_default.ini` file. |
| + |
| *Example* |
| + |
| ``` |
| cp configs/db_config_default.ini my_config |
| ``` |
| |
| 2. Edit the new file using information you collect in the |
| <<prepare-gather-configuration-information,Gather Configuration Information>> |
| section in the <<prepare,Prepare>> chapter. |
| |
| 3. Run the {project-name} Installer in Automated Setup Mode |
| + |
| *Example* |
| + |
| ``` |
| ./db_install.py --config-file my_config |
| ``` |
| |
| NOTE: Your {project-name} Configuration File contains the password for the {project-name} Runtime User |
| and for the Distribution Manager. Therefore, we recommend that you secure the file in a manner |
| that matches the security policies of your organization. |
| |
| ==== Example: Quick start using a {project-name} Configuration File |
| The {project-name} Installer supports a minimum configuration to quick start your installation in two steps. |
| 1. Copy {project-name} server binary file to your installer directory. |
| ``` |
| cp /path/to/apache-trafodion_server-2.2.0-RH-x86_64.tar.gz python-installer/ |
| ``` |
| 2. Modify configuration file `my_config`, add the Hadoop Distribution Manager URL in `mgr_url`. |
| ``` |
| mgr_url = 192.168.0.1:8080 |
| ``` |
| Once completed, run the {project-name} Installer with the --config-file option. |
| |
| ==== Example: Creating a {project-name} Configuration File |
| |
| Using the instructions in <<prepare-gather-configuration-information,Gather Configuration Information>> |
| in the <<prepare,Prepare>> chapter, record the information and edit `my_config` to contain the following: |
| |
| ``` |
| [dbconfigs] |
| # NOTICE: if you are using CDH/HDP hadoop distro, |
| # you can only specifiy management url address for a quick install |
| |
| ################################## |
| # Common Settings |
| ################################## |
| |
| # trafodion username and password |
| traf_user = trafodion |
| traf_pwd = traf123 |
| # trafodion user's home directory |
| home_dir = /home |
| # the directory location of trafodion binary |
| # if not provided, the default value will be {package_name}-{version} |
| traf_dirname = |
| |
| # trafodion used java(JDK) path on trafodion nodes |
| # if not provided, installer will auto detect installed JDK |
| java_home = |
| |
| # cloudera/ambari management url(i.e. http://192.168.0.1:7180 or just 192.168.0.1) |
| # if 'http' or 'https' prefix is not provided, the default one is 'http' |
| # if port is not provided, the default port is cloudera port '7180' |
| mgr_url = 192.168.0.1:8080 |
| # user name for cloudera/ambari management url |
| mgr_user = admin |
| # password for cloudera/ambari management url |
| mgr_pwd = admin |
| # set the cluster number if multiple clusters managed by one Cloudera manager |
| # ignore it if only one cluster being managed |
| cluster_no = 1 |
| |
| # trafodion tar package file location |
| # no need to provide it if the package can be found in current installer's directory |
| traf_package = |
| # the number of dcs servers on each node |
| dcs_cnt_per_node = 4 |
| |
| # scratch file location, seperated by comma if more than one |
| scratch_locs = $TRAF_VAR |
| |
| # start trafodion instance after installation completed |
| traf_start = Y |
| |
| ################################## |
| # DCS HA configuration |
| ################################## |
| |
| # set it to 'Y' if enable DCS HA |
| dcs_ha = N |
| # if HA is enabled, provide floating ip, network interface and the hostname of backup dcs master nodes |
| dcs_floating_ip = |
| # network interface that dcs used |
| dcs_interface = |
| # backup dcs master nodes, seperated by comma if more than one |
| dcs_backup_nodes = |
| |
| ################################## |
| # Offline installation setting |
| ################################## |
| |
| # set offline mode to Y if no internet connection |
| offline_mode = N |
| # if offline mode is set, you must provide a local repository directory with all needed RPMs |
| local_repo_dir = |
| |
| ################################## |
| # LDAP security configuration |
| ################################## |
| |
| # set it to 'Y' if enable LDAP security |
| ldap_security = N |
| # LDAP user name and password to be assigned as DB admin privilege |
| db_admin_user = admin |
| db_admin_pwd = traf123 |
| # LDAP user to be assigned DB root privileges (DB__ROOT) |
| db_root_user = trafodion |
| # if LDAP security is enabled, provide the following items |
| ldap_hosts = |
| # 389 for no encryption or TLS, 636 for SSL |
| ldap_port = 389 |
| ldap_identifiers = |
| ldap_encrypt = 0 |
| ldap_certpath = |
| |
| # set to Y if user info is needed |
| ldap_userinfo = N |
| # provide if ldap_userinfo = Y |
| ladp_user = |
| ladp_pwd = |
| |
| ################################## |
| # Kerberos security configuration |
| ################################## |
| # if kerberos is enabled in your hadoop system, provide below info |
| |
| # KDC server address |
| kdc_server = |
| # include realm, i.e. admin/admin@EXAMPLE.COM |
| admin_principal = |
| # admin password for admin principal, it is used to create trafodion user's principal and keytab |
| kdcadmin_pwd = |
| ``` |
| |
| Once completed, run the {project-name} Installer with the `--config-file` option. |
| |
| Refer to the following sections for examples: |
| |
| * <<install-automated-install, Automated Install>> |
| * <<upgrade-automated-upgrade, Automated Upgrade>> |
| |
| [[introduction-trafodion-provisioning-directories]] |
| == {project-name} Provisioning Directories |
| |
| {project-name} stores its provisioning information in the following directories on each node in the cluster: |
| |
| * `/etc/trafodion`: Configuration information. |