| --- |
| title: Installing Additional HAWQ Components |
| --- |
| |
| This chapter describes how to install additional HAWQ components. |
| |
| - [Installing Cryptographic Functions for PostgreSQL](#) |
| - [Install pgcrypto](#) |
| - [Uninstalling pgcrypto](#) |
| - [Installing PL/R](#) |
| - [Install PL/R](#) |
| - [Enable PL/R Language Support](#) |
| - [Uninstalling PL/R](#) |
| - [Installing PL/Java](#) |
| - [Install the HAWQ PL/Java Extension](#) |
| - [Enable PL/Java and Install JAR Files](#) |
| - [Uninstalling PL/Java](#) |
| - [Installing MADlib on HAWQ](#) |
| - [Pre-requisites for Installing MADlib on HAWQ](#) |
| - [Install MADlib on HAWQ](#) |
| |
| ## Installing Cryptographic Functions for PostgreSQL <a id="installingcryptographicfunctionsforpostgresql"></a> |
| |
| pgcrypto is available as a package that you can download from the Pivotal Download Center and install using the Package Manager utility \(`gppkg`\). `gppkg` installs pgcrypto and other HAWQ extensions, along with any dependencies, on all hosts across a cluster. It will also automatically install extensions on new hosts in the case of system expansion and segment recovery. |
| |
| **Note:** Before you install the pgcrypto software package, make sure that your HAWQ database is running, you have sourced greenplum\_path.sh, and that the `$MASTER_DATA_DIRECTORY` and `$GPHOME` variables are set. |
| |
| ### Install pgcrypto <a id="installpgcrypto"></a> |
| |
| Download the pgcrypto package from the Pivotal Download Center, then copy it to the master host. Install the software package by running the following command: |
| |
| ``` |
| gppkg -i pgcrypto-1.0-rhel5-x86_64.gppkg |
| ``` |
| |
| You will see output similar to the following. |
| |
| ``` |
| [gpadmin@gp-single-host ~]$ gppkg -i pgcrypto-1.0-rhel5-x86_64.gppkg |
| 20120418:23:54:20:gppkg:gp-single-host:gpadmin-[INFO]:-Starting gppkg with args: -i pgcrypto-1.0-rhel5-x86_64.gppkg |
| 20120418:23:54:20:gppkg:gp-single-host:gpadmin-[INFO]:-Installing package pgcrypto-1.0-rhel5-x86_64.gppkg |
| 20120418:23:54:21:gppkg:gp-single-host:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db/./.tmp/pgcrypto-1.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db/./share/packages/database --prefix /usr/local/greenplum-db/.' |
| 20120418:23:54:22:gppkg:gp-single-host:gpadmin-[INFO]:-Please run psql -d mydatabase -f $GPHOME/share/postgresql/contrib/pgcrypto.sql to enable the package. |
| 20120418:23:54:22:gppkg:gp-single-host:gpadmin-[INFO]:-pgcrypto-1.0-rhel5-x86_64.gppkg successfully installed. |
| ``` |
| |
| ### Uninstalling pgcrypto <a id="uninstallingpgcrypto"></a> |
| |
| #### Uninstall pgcrypto support |
| |
| To uninstall the pgcrypto objects, use uninstall\_pgcrypto.sql to remove pgcrypto support. |
| |
| For each database on which you enabled pgcrypto support, execute the following: |
| |
| ``` |
| psql -d dbname -f $GPHOME/share/postgresql/contrib/uninstall_pgcrypto.sql |
| ``` |
| |
| **Note:** This script does not remove dependent user-created objects. |
| |
| #### Uninstall the software package |
| |
| You can uninstall the pgcrypto software using the Greenplum Package Manager \(gppkg\), as follows: |
| |
| ``` |
| gppkg -r pgcrypto-1.0 |
| ``` |
| |
| ## Installing PL/R <a id="installingplr"></a> |
| |
| PL/R is available as a package that you can download from the Pivotal Download Center and install using the Package Manager utility \(`gppkg`\). `gppkg` installs PL/R and other Greenplum Database extensions, along with any dependencies, on all hosts across a cluster. It will also automatically install extensions on new hosts in the case of system expansion and segment recovery. |
| |
| **Note:** Before you install the PL/R software package, make sure that your HAWQ database is running, you have sourced greenplum\_path.sh, and that the `$MASTER_DATA_DIRECTORY` and `$GPHOME` variables are set. |
| |
| ### Install PL/R <a id="installplr"></a> |
| |
| 1. Download the PL/R package from the Pivotal Download Center, then copy it to the master host. Install the software package by running the following command: |
| |
| ``` |
| $ gppkg -i plr-1.0-rhel5-x86_64.gppkg |
| ``` |
| |
| 2. Source the $GPHOME/greenplum\_path.sh file. The extension and the R environment are installed in the $GPHOME/ext/R-2.13.0/ directory. |
| 3. Restart the database: |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| |
| ### Enable PL/R Language Support <a id="enablingplrlanguagesupport"></a> |
| |
| For each database that requires its use, register the PL/R language with the `CREATE LANGUAGE` SQL command or the `createlang` utility. For example, running the following command as the `gpadmin` user registers the language for a database named `testdb`: |
| |
| ``` |
| $ createlang plr -d testdb |
| ``` |
| |
| PL/R is registered as an untrusted language. |
| |
| You are now ready to create new PLR functions. A library of convenient PLR functions may be found in $GPHOME/share/postgresql/contrib/plr.sql. These functions may be installed by using the `psql` utility to execute plr.sql, as follows: |
| |
| ``` |
| psql -d <dbname> -f $GPHOME/share/postgresql/contrib/plr.sql |
| ``` |
| |
| ### Uninstalling PL/R <a id="uninstallingplr"></a> |
| |
| When you remove PL/R language support from a database, the PL/R routines that you created in the database will no longer work. |
| |
| #### Remove PL/R Support for a Database <a id="removeplrsupportforadatabase"></a> |
| |
| For a database that no long requires the PL/R language, remove support for PL/R with the SQL command `DROP LANGUAGE`or the `droplang` utility. For example, running the following command as the `gpadmin` user removes support for PL/R from the database `testdb`: |
| |
| ``` |
| $ droplang plr -d testdb |
| ``` |
| |
| #### Uninstall the Software Package <a id="uninstallthesoftwarepackage"></a> |
| |
| If no databases have PL/R as a registered language, uninstall the Greenplum PL/R extension with the `gppkg` utility. This example uninstalls PL/R package version 1.0: |
| |
| ``` |
| $ gppkg -r plr-1.0 |
| ``` |
| |
| You can run the `gppkg` utility with the options `-q --all` to list the installed extensions and their versions. |
| |
| Then, restart the database. |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| ##### Downloading and Installing R libraries |
| |
| For a given R library, identify all dependent R libraries and each library's web URL.This can be found by selecting the specific package from the following page: [http://cran.r-project.org/web/packages/available\_packages\_by\_name.html](http://cran.r-project.org/web/packages/available_packages_by_name.html) |
| |
| From the page for the arm library, you can see that this library requires the following R libraries: |
| |
| - Matrix |
| - lattice |
| - lme4 |
| - R2WinBUGS |
| - coda |
| - abind |
| - foreign |
| - MASS |
| |
| From the command line, use `wget` to download the tar.gz files for the required libraries to the master node: |
| |
| ``` |
| $ wget http://cran.r-project.org/src/contrib/arm_1.5-03.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/Archive/Matrix/Matrix_1.0-1.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/Archive/lattice/lattice_0.19-33.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/lme4_0.999375-42.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/R2WinBUGS_2.1-18.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/coda_0.14-7.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/abind_1.4-0.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/foreign_0.8-49.tar.gz |
| $ wget http://cran.r-project.org/src/contrib/MASS_7.3-17.tar.gz |
| ``` |
| |
| Using `gpscp` and the hostname file, copy the tar.gz files to the same directory on all nodes of the HAWQ cluster. You may require `root` access to do this. |
| |
| ``` |
| $ gpscp -f /home/gpadmin/hosts_all lattice_0.19-33.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all Matrix_1.0-1.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all abind_1.4-0.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all coda_0.14-7.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all R2WinBUGS_2.1-18.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all lme4_0.999375-42.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all MASS_7.3-17.tar.gz =:/home/gpadmin |
| $ gpscp -f /home/gpadmin/hosts_all arm_1.5-03.tar.gz =:/home/gpadmin |
| ``` |
| |
| Use `R CMD INSTALL` to install the packages from the command line. You may require `root` access to do this. |
| |
| ``` |
| $ R CMD INSTALL lattice_0.19-33.tar.gz Matrix_1.0-1.tar.gz abind_1.4-0.tar.gz coda_0.14-7.tar.gz R2WinBUGS_2.1-18.tar.gz lme4_0.999375-42.tar.gz MASS_7.3-17.tar.gz arm_1.5-03.tar.gz |
| ``` |
| |
| ## Installing PL/Java <a id="installingpljava"></a> |
| |
| The PL/Java extension is available as a package that you can download from the Pivotal Download Center and then install with the Package Manager utility \(`gppkg`\). |
| |
| **Note:** Before you install PL/Java: |
| |
| - Ensure that the `$JAVA_HOME` variable is set to the same path on the master and all the segments. |
| - Perform the following step on all machines to set up `ldconfig` for JDK: |
| |
| ``` |
| $ echo "$JAVA_HOME/jre/lib/amd64/server" > /etc/ld.so.conf.d/libjdk.conf |
| $ ldconfig |
| ``` |
| |
| - If you are upgrading to the latest version of Java or installing it as part of the expansion process, follow the instructions in the chapter, *Expanding the HAWQ System* in the *HAWQ Administrator Guide*. |
| - PL/Java is compatible with JDK 1.6 and 1.7. |
| |
| The `gppkg` utility installs HAWQ extensions, along with any dependencies, on all hosts across a cluster. It also automatically installs extensions on new hosts in the case of system expansion and segment recovery. |
| |
| To install and use PL/Java: |
| |
| 1. Install the PL/Java extension. |
| 2. Enable the language for each database. |
| 3. Install user-created JAR files containing Java methods on all HAWQ hosts. |
| 4. Add the name of the JAR file to the HAWQ `pljava_classpath` environment variable. The variable lists the installed JAR files. |
| |
| **Note:** Before you install the PL/Java extension, make sure that your Greenplum database is running, you have sourced greenplum\_path.sh, and that the `$MASTER_DATA_DIRECTORY` and `$GPHOME` variables are set. |
| |
| ### Install the HAWQ PL/Java Extension <a id="installingthehawqpljavaextension"></a> |
| |
| 1. Download the PL/Java extension package from the Pivotal Download Center and copy it to the master host. |
| 2. Install the software extension package by running the `gppkg` command. This example installs the PL/Java extension package on a Linux system: |
| |
| ``` |
| $ gppkg -i pljava-1.1-rhel5-x86_64.gppkg |
| ``` |
| |
| 3. Restart the database: |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| 4. Source the $GPHOME/greenplum\_path.sh file. |
| |
| ### Enable PL/Java and Install JAR Files <a id="enablingpljavaandinstallingjarfiles"></a> |
| |
| Perform the following steps as the HAWQ administrator `gpadmin`: |
| |
| 1. Enable PL/Java by running the $GPHOME/share/postgresql/pljava/install.sql SQL script in the databases that use PL/Java. For example, this example enables PL/Java on a database named `mytestdb`: |
| |
| ``` |
| $ psql -d mytestdb -f $GPHOME/share/postgresql/pljava/install.sql |
| ``` |
| |
| The install.sql script registers both the trusted and untrusted PL/Java. |
| |
| 2. Copy your Java archives \(JAR files\) to $GPHOME/lib/postgresql/java/ on all the HAWQ hosts. This example uses the gpscp utility to copy the myclasses.jar file: |
| |
| ``` |
| $ gpscp -f gphosts_file myclasses.jar =:/usr/local/greenplum-db/lib/postgresql/java/ |
| ``` |
| |
| The gphosts\_file file contains a list of the Greenplum Database hosts. |
| |
| 3. Set the `pljava_classpath` server configuration parameter in the master hawq-site.xml file. The parameter value is a colon \(`:`\) separated list of the JAR files containing the Java classes used in any PL/Java functions. For example: |
| |
| ``` |
| $ gpconfig -c pljava_classpath -v \'examples.jar:myclasses.jar\' --masteronly |
| ``` |
| |
| 4. Restart the database: |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| 5. **\(Optional\)** Pivotal provides an examples.sql file containing sample PL/Java functions that you can use for testing. Run the commands in this file to create the test functions \(which use the Java classes in examples.jar\): |
| |
| ``` |
| $ psql -f $GPHOME/share/postgresql/pljava/examples.sql |
| ``` |
| |
| Enabling the PL/Java extension in the `template1` database enables PL/Java in any new Greenplum databases: |
| |
| ``` |
| $ psql template1 -f $GPHOME/share/postgresql/pljava/install.sql |
| ``` |
| |
| |
| #### Configuring PL/Java vmoptions <a id="configuringpljavavmoptions"></a> |
| |
| PL/Java JVM options can be configured via the `pljava_vmoptions` parameter in the hawq-site.xml file. For example, `pljava_vmoptions=-Xmx512M` sets the maximum heap size of the JVM. The default `Xmx` value is set to `-Xmx64M`. |
| |
| ### Uninstalling PL/Java <a id="uninstallingpljava"></a> |
| |
| To uninstall PL/Java, you should: |
| |
| 1. Remove PL/Java Support for a Database |
| 2. Uninstall the Java JAR files and Software Package |
| |
| #### Remove PL/Java Support for a Database <a id="removepljavasupportforadatabase"></a> |
| |
| For a database that no long requires the PL/Java language, remove support for PL/Java by running the uninstall.sql file as the `gpadmin` user. For example, the following command disables the PL/Java language in the specified database: |
| |
| ``` |
| $ psql -d mydatabase -f $GPHOME/share/postgresql/pljava/uninstall.sql |
| ``` |
| |
| #### Uninstall the Java JAR files and Software Package <a id="uninstallthejavajarfilesandsoftwarepackage"></a> |
| |
| If no databases have PL/Java as a registered language, remove the Java JAR files and uninstall the Greenplum PL/Java extension with the `gppkg` utility: |
| |
| 1. Remove the `pljava_classpath` server configuration parameter in the master hawq-site.xml file. |
| 2. Remove the JAR files from the $GPHOME/lib/postgresql/java/ directory of the HAWQ hosts. |
| 3. Use the `gppkg` utility with the `-r` option to uninstall the PL/Java extension. The following example uninstalls the PL/Java extension on a Linux system: |
| |
| ``` |
| $ gppkg -r pljava-1.1 |
| ``` |
| |
| You can run the `gppkg` utility with the options `-q --all` to list the installed extensions and their versions. |
| |
| 4. After you uninstall the extension, restart the database: |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| |
| #### Installing Custom JARS <a id="installingcustomjars"></a> |
| |
| 1. Copy the jar file on the master host in $GPHOME/lib/postgresql/java. |
| 2. Copy the jar file on all segments in the same location using `gpscp` from master: |
| |
| ``` |
| $ cd $GPHOME/lib/postgresql/java |
| $ gpscp -f ~/hosts.txt myfunc.jar =:$GPHOME/lib/postgresql/java/ |
| ``` |
| |
| 3. Set `pljava_classpath` to include the newly-copied jar file: |
| - From the `psql` session, execute set to affect the current session: |
| |
| ``` |
| set pljava_classpath='myfunc.jar'; |
| ``` |
| |
| - To affect all sessions, use `gpconfig`: |
| |
| ``` |
| gpconfig -c pljava_classpath -v \'myfunc.jar\' |
| ``` |
| |
| |
| ## Installing MADlib on HAWQ <a id="installingmadlibonhawq1"></a> |
| |
| The MADlib library adds statistical and machine learning functionality to HAWQ. MADlib is provided as a package that you can download from the Pivotal Download Center and install using the Package Manager utility \(`gppkg`\). `gppkg` installs MADlib and other Greenplum Database extensions, along with any dependencies, on all hosts across a cluster. It also automatically installs extensions on new hosts in the case of system expansion and segment recovery. |
| |
| ### Pre-requisites for Installing MADlib on HAWQ <a id="pre-requisitesforinstallingmadlibonhawq"></a> |
| |
| **Note:** Before you install the MADlib software package, make sure that your HAWQ database is running, that you have sourced greenplum\_path.sh, and that the `$MASTER_DATA_DIRECTORY` and `$GPHOME` variables are set. |
| |
| ### Install MADlib on HAWQ <a id="installingmadlibonhawq2"></a> |
| |
| 1. Download the MADlib package from the Pivotal Download Center, then copy it to the master host. Install the software package by running the command: |
| |
| ``` |
| $ gppkg -i madlib-ossv1.7.1_pv1.9.3_hawq1.3-rhel5-x86_64.gppkg |
| ``` |
| |
| The installation process begins and shows output similar to: |
| |
| ``` |
| 20150330:21:28:33:021734 gppkg:gpdb11:gpdbchina-[INFO]:-Starting |
| gppkg with args: -i /data/home/gpdbchina/pulse2-data/agents/agent1/work/ |
| MADlib%20TINC%20Feature%20Test%20on%20HAWQ%201.3/rhel5_x86_64/madlib/ |
| madlib-ossv1.7.1_pv1.9.3_hawq1.3-rhel5-x86_64.gppkg |
| 20150330:21:28:33:021734 gppkg:gpdb11:gpdbchina-[INFO]:-Installing package |
| madlib-ossv1.7.1_pv1.9.3_hawq1.3-rhel5-x86_64.gppkgInstalled GPDB Version: |
| pg_ctl (HAWQ) 1.3.0.0 build 12954 |
| [...] |
| ``` |
| |
| 2. Restart the database: |
| |
| ``` |
| $ gpstop -r |
| ``` |
| |
| 3. Source the $GPHOME/greenplum\_path.sh file. |
| 4. Deploy the MADlib objects to a database using the `GPHOME/madlib/bin/madpack` utility. The syntax for installing objects is: |
| |
| ``` |
| madpack install [-s schema\_name] -p hawq -c user@host:port/database |
| ``` |
| |
| The default schema name is `madlib`. |
| |
| For example, the following command install the objects to a database named "testdb" on server mdw:5432 using the gpadmin user: |
| |
| ``` |
| $ $GPHOME/madlib/bin/madpack install -s madlib -p hawq -c gpadmin@mdw:5432/testdb |
| ``` |
| |
| Enter the password for the specified user when prompted. |
| |
| 5. To learn more about additional options for the `madpack` utility, enter: |
| |
| ``` |
| $GPHOME/madlib/bin/madpack --help |
| ``` |
| |
| See also the documentation available at [madlib.net](http://madlib.net/documentation/). |