| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd"> |
| |
| <document> |
| <header> |
| <title>Installation from Tarball</title> |
| </header> |
| <body> |
| |
| <section> |
| <title>Server Installation from Source</title> |
| |
| <p><strong>Prerequisites</strong></p> |
| <ul> |
| <li>A machine on which to build the installation tarball.</li> |
| <li>A machine on which the server can be installed. This machine should have |
| access to the Hadoop cluster in question, and be accessible from |
| the machines you launch jobs from.</li> |
| <li>An RDBMS. We recommend MySQL and provide instructions for it.</li> |
| <li>A Hadoop cluster.</li> |
| <li>A Unix user that the server will run as, and, if you are running your |
| cluster in secure mode, an associated Kerberos service principal and keytabs.</li> |
| <li>Apache Ant 1.8 or greater. Version 1.7.x is not supported.</li> |
| </ul> |
| |
| <p>Throughout these instructions, a word in <em>italics</em> marks a |
| place where you should substitute a locally appropriate value, such as |
| a hostname or password.</p> |
| |
| <p><strong>Building a Tarball</strong></p> |
| |
| <p>If you downloaded HCatalog from Apache or another site as a source release, |
| you will need to first build a tarball to install. You can tell if you have |
| a source release by looking at the name of the object you downloaded. If |
| it is named hcatalog-src-0.5.0-incubating.tar.gz (notice the |
| <strong>src</strong> in the name) then you have a source release.</p> |
| |
| <p>If you do not already have Apache Ant installed on your machine, you |
| will need to obtain it. You can get it from the <a href="http://ant.apache.org/"> |
| Apache Ant website</a>. Once you download it, you will need to unpack it |
| somewhere on your machine. The directory where you unpack it will be referred |
| to as <em>ant_home</em> in this document.</p> |
| |
| <p>To produce a binary tarball from the downloaded source tarball, execute the following steps:</p> |
| <p><code>tar xzf hcatalog-src-0.5.0-incubating.tar.gz</code></p> |
| <p><code>cd hcatalog-src-0.5.0-incubating</code></p> |
| <p><em>ant_home</em><code>/bin/ant package</code></p> |
| <p>The tarball for installation should now be at |
| <code>build/hcatalog-0.5.0-incubating.tar.gz</code></p> |
| |
| <p><strong>Database Setup</strong></p> |
| |
| <p>If you do not already have Hive installed with MySQL, the following will |
| walk you through how to do so. If you have already set this up, you can skip |
| this step.</p> |
| |
| <p>Select a machine to install the database on. This need not be the same |
| machine as the Thrift server, which we will set up later. For large |
| clusters we recommend that they not be the same machine. For the |
| purposes of these instructions we will refer to this machine as |
| <em>hivedb.acme.com</em>.</p> |
| |
| <p>Install MySQL server on <em>hivedb.acme.com</em>. You can obtain |
| packages for MySQL from <a href="http://www.mysql.com/downloads/">MySQL's |
| download site</a>. We have developed and tested with versions 5.1.46 |
| and 5.1.48. We suggest you use these versions or later. |
| Once you have MySQL up and running, use the <code>mysql</code> command line |
| tool to add the <code>hive</code> user and <code>hivemetastoredb</code> |
| database. You will need to pick a password for your <code>hive</code> |
| user, and replace <em>dbpassword</em> in the following commands with it.</p> |
| |
| <p><code>mysql -u root</code></p> |
| <p><code>mysql> CREATE USER 'hive'@'</code><em>hivedb.acme.com</em><code>' IDENTIFIED BY '</code><em>dbpassword</em><code>';</code></p> |
| <p><code>mysql> CREATE DATABASE hivemetastoredb DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;</code></p> |
| <p><code>mysql> GRANT ALL PRIVILEGES ON hivemetastoredb.* TO 'hive'@'</code><em>hivedb.acme.com</em><code>' WITH GRANT OPTION;</code></p> |
| <p><code>mysql> flush privileges;</code></p> |
| <p><code>mysql> quit;</code></p> |
| |
| <p>Use the database installation script found in the Hive package to create the |
| database. <code>hive_home</code> in the line below refers to the directory |
| where you have installed Hive. If you are using Hive rpms, then this will |
| be <code>/usr/lib/hive</code>.</p> |
| |
| <p><code>mysql -u hive -D hivemetastoredb -h</code><em>hivedb.acme.com</em><code> -p < </code><em>hive_home</em><code>/scripts/metastore/upgrade/mysql/hive-schema-0.10.0.mysql.sql</code></p> |
| |
| <p><strong>Thrift Server Setup</strong></p> |
| |
| <p>If you do not already have Hive running a metastore server using Thrift, |
| you can use the following instructions to set up and run one. You may skip |
| this step if you are already using a Hive metastore server.</p> |
| |
| <p>Select a machine to install your Thrift server on. For smaller and test |
| installations this can be the same machine as the database. For the |
| purposes of these instructions we will refer to this machine as |
| <em>hcatsvr.acme.com</em>.</p> |
| |
| <p>If you have not already done so, install Hive (for example, version 0.10.0) on this machine. You |
| can use the |
| <a href="http://hive.apache.org/releases.html">binary distributions</a> |
| provided by Hive or rpms available from |
| <a href="http://incubator.apache.org/bigtop/">Apache Bigtop</a>. If you use |
| the Apache Hive binary distribution, select a directory, henceforth |
| referred to as <code>hive_home</code>, and untar the distribution there. |
| If you use the rpms, <code>hive_home</code> will be |
| <code>/usr/lib/hive</code>.</p> |
| |
| <p>Install the MySQL Java connector libraries on <em>hcatsvr.acme.com</em>. |
| You can obtain these from |
| <a href="http://www.mysql.com/downloads/connector/j/5.1.html">MySQL's |
| download site</a>.</p> |
| |
| <p>Select a user to run the Thrift server as. This user should not be a |
| human user, and must be able to act as a proxy for other users. We suggest |
| the name "hive" for the user. Throughout the rest of this documentation |
| we will refer to this user as <em>hive</em>. If necessary, add the user to |
| <em>hcatsvr.acme.com</em>.</p> |
| |
| <p>Select a <em>root</em> directory for your installation of HCatalog. This |
| directory must be owned by the <em>hive</em> user. We recommend |
| <code>/usr/local/hive</code>. If necessary, create the directory. You will |
| need to be the <em>hive</em> user for the operations described in the remainder |
| of this Thrift Server Setup section.</p> |
| |
| <p>Copy the HCatalog installation tarball into a temporary directory, and untar |
| it. Then change directories into the new distribution and run the HCatalog |
| server installation script. You will need to know the directory you chose |
| as <em>root</em> and the |
| directory you installed the MySQL Java connector libraries into (referred |
| to in the command below as <em>dbroot</em>). You will also need your |
| <em>hadoop_home</em>, the directory where you have Hadoop installed, and |
| <em>portnum</em>, the port number you want HCatalog to operate on.</p> |
| |
| <p><code>tar zxf hcatalog-0.5.0-incubating.tar.gz</code></p> |
| <p><code>cd hcatalog-0.5.0-incubating</code></p> |
| <p><code>share/hcatalog/scripts/hcat_server_install.sh -r </code><em>root</em><code> -d </code><em>dbroot</em><code> -h </code><em>hadoop_home</em><code> -p </code><em>portnum</em></p> |
| |
| <p>Now you need to edit your <em>hive_home</em><code>/conf/hive-site.xml</code> file. |
| If there is no such file in the Hive conf directory, copy <em>hcat_home</em><code>/etc/hcatalog/proto-hive-site.xml</code> |
| to <em>hive_home</em><code>/conf/</code> and rename it <code>hive-site.xml</code>. |
| Open this file in a text editor. The following table shows the |
| values you need to configure.</p> |
| |
| <table> |
| <tr> |
| <th>Parameter</th> |
| <th>Value to Set it to</th> |
| </tr> |
| <tr> |
| <td>hive.metastore.local</td> |
| <td>false</td> |
| </tr> |
| <tr> |
| <td>javax.jdo.option.ConnectionURL</td> |
| <td>jdbc:mysql://<em>hostname</em>/hivemetastoredb?createDatabaseIfNotExist=true where <em>hostname</em> is the name of the machine you installed MySQL on.</td> |
| </tr> |
| <tr> |
| <td>javax.jdo.option.ConnectionDriverName</td> |
| <td>com.mysql.jdbc.Driver</td> |
| </tr> |
| |
| <tr> |
| <td>javax.jdo.option.ConnectionUserName</td> |
| <td>hive</td> |
| </tr> |
| <tr> |
| <td>javax.jdo.option.ConnectionPassword</td> |
| <td><em>dbpassword</em> value you used in setting up the MySQL server |
| above.</td> |
| </tr> |
| <tr> |
| <td>hive.semantic.analyzer.factory.impl</td> |
| <td>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</td> |
| </tr> |
| <tr> |
| <td>hive.metastore.warehouse.dir</td> |
| <td>The directory can be a URI or an absolute file path. If it is an absolute file path, it will be resolved to a URI by the metastore: |
| <p>-- If a default HDFS file system is specified in core-site.xml, the path resolves to an HDFS location.</p> |
| <p>-- Otherwise, the path is resolved as a local file: URI.</p> |
| <p>This setting takes effect when new tables are created (it takes precedence over the default DBS.DB_LOCATION_URI at the time of table creation).</p> |
| <p>You only need to set this if you have not yet configured Hive to run on your system.</p> |
| </td> |
| </tr> |
| <tr> |
| <td>hive.metastore.uris</td> |
| <td>thrift://<em>hostname</em>:<em>portnum</em> where <em>hostname</em> is the name of the machine hosting the Thrift server, and <em>portnum</em> is the port number |
| used above in the installation script.</td> |
| </tr> |
| <tr> |
| <td>hive.metastore.execute.setugi</td> |
| <td>true</td> |
| </tr> |
| <tr> |
| <td>hive.metastore.sasl.enabled</td> |
| <td>Set to true if you are using Kerberos security with your Hadoop |
| cluster, false otherwise.</td> |
| </tr> |
| <tr> |
| <td>hive.metastore.kerberos.keytab.file</td> |
| <td>The path to the Kerberos keytab file containing the metastore |
| Thrift server's service principal. Only required if you set |
| hive.metastore.sasl.enabled above to true.</td> |
| </tr> |
| <tr> |
| <td>hive.metastore.kerberos.principal</td> |
| <td>The service principal for the metastore Thrift server. You can |
| reference your host as _HOST and it will be replaced with your |
| actual hostname. Only required if you set |
| hive.metastore.sasl.enabled above to true.</td> |
| </tr> |
| </table> |
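| |
| <p>As a rough illustration of the table above, a <code>hive-site.xml</code> for a |
| cluster without Kerberos might look like the sketch below. It assumes the example |
| hosts used in these instructions (<em>hivedb.acme.com</em> for MySQL and |
| <em>hcatsvr.acme.com</em> for the Thrift server); <em>dbpassword</em> and |
| <em>portnum</em> are placeholders that you must replace with your own values. |
| <code>hive.metastore.warehouse.dir</code> and the two Kerberos properties are left out |
| of this sketch; add them only if the table indicates they apply to your installation.</p> |
| <source><![CDATA[ |
| <configuration> |
|   <!-- Use a remote metastore served over Thrift --> |
|   <property> |
|     <name>hive.metastore.local</name> |
|     <value>false</value> |
|   </property> |
|   <property> |
|     <name>hive.metastore.uris</name> |
|     <!-- portnum is a placeholder: use the port given to the install script --> |
|     <value>thrift://hcatsvr.acme.com:portnum</value> |
|   </property> |
|   <!-- Connection to the MySQL metastore database --> |
|   <property> |
|     <name>javax.jdo.option.ConnectionURL</name> |
|     <value>jdbc:mysql://hivedb.acme.com/hivemetastoredb?createDatabaseIfNotExist=true</value> |
|   </property> |
|   <property> |
|     <name>javax.jdo.option.ConnectionDriverName</name> |
|     <value>com.mysql.jdbc.Driver</value> |
|   </property> |
|   <property> |
|     <name>javax.jdo.option.ConnectionUserName</name> |
|     <value>hive</value> |
|   </property> |
|   <property> |
|     <name>javax.jdo.option.ConnectionPassword</name> |
|     <!-- dbpassword is a placeholder: use the password chosen during database setup --> |
|     <value>dbpassword</value> |
|   </property> |
|   <!-- HCatalog and security settings from the table above --> |
|   <property> |
|     <name>hive.semantic.analyzer.factory.impl</name> |
|     <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value> |
|   </property> |
|   <property> |
|     <name>hive.metastore.execute.setugi</name> |
|     <value>true</value> |
|   </property> |
|   <property> |
|     <name>hive.metastore.sasl.enabled</name> |
|     <!-- false here because this sketch assumes no Kerberos --> |
|     <value>false</value> |
|   </property> |
| </configuration> |
| ]]></source> |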
| |
| <p>You can now proceed to starting the server.</p> |
| </section> |
| |
| <section> |
| <title>Starting the Server</title> |
| |
| <p>To start your server, HCatalog needs to know where Hive is installed. |
| You communicate this by setting the environment variable <code>HIVE_HOME</code> |
| to the location where you installed Hive. Start the HCatalog server by changing directories to |
| <em>root</em> and invoking |
| "<code>HIVE_HOME=</code><em>hive_home</em><code> sbin/hcat_server.sh start</code>".</p> |
| |
| </section> |
| |
| <section> |
| <title>Logging</title> |
| |
| <p>Server activity logs are located in |
| <em>root</em><code>/var/log/hcat_server</code>. Logging configuration is located at |
| <em>root</em><code>/conf/log4j.properties</code>. Server logging uses |
| <code>DailyRollingFileAppender</code> by default. It generates a new |
| file each day and does not expire old log files automatically.</p> |
| |
| </section> |
| |
| <section> |
| <title>Stopping the Server</title> |
| |
| <p>To stop the HCatalog server, change directories to the <em>root</em> |
| directory and invoke |
| "<code>HIVE_HOME=</code><em>hive_home</em><code> sbin/hcat_server.sh stop</code>".</p> |
| |
| </section> |
| |
| <section> |
| <title>Client Installation</title> |
| |
| <p>Select a <em>root</em> directory for your installation of the HCatalog client. |
| We recommend <code>/usr/local/hcat</code>. If necessary, create the directory.</p> |
| |
| <p>Copy the HCatalog installation tarball into a temporary directory, and untar |
| it.</p> |
| |
| <p><code>tar zxf hcatalog-0.5.0-incubating.tar.gz</code></p> |
| |
| <p>Now you need to edit your <em>hive_home</em><code>/conf/hive-site.xml</code> file. |
| You can use the same file as on the server <strong>except the value of |
| </strong><code>javax.jdo.option.ConnectionPassword</code><strong> should be |
| removed</strong>. This avoids having the password available in plain text on |
| all of your clients.</p> |
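| |
| <p>One way to do this, shown here only as a sketch, is to keep the property in the |
| client's file but leave its value empty. This assumes the rest of the file is copied |
| unchanged from the server.</p> |
| <source><![CDATA[ |
| <property> |
|   <name>javax.jdo.option.ConnectionPassword</name> |
|   <!-- left empty on clients so the database password is not stored in plain text --> |
|   <value></value> |
| </property> |
| ]]></source> |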
| |
| <p>The HCatalog command line interface (CLI) can now be invoked as |
| <code>HIVE_HOME=</code><em>hive_home</em> <em>root</em><code>/bin/hcat</code>.</p> |
| |
| </section> |
| |
| </body> |
| </document> |