<!---
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
------------------------------------------------------------------------------
Apache Knox Gateway - Usage Examples
------------------------------------------------------------------------------
This guide provides detailed examples of how to perform some basic interactions
with Hadoop via the Apache Knox Gateway.
The first two examples submit a Java MapReduce job and an Oozie workflow using the
KnoxShell DSL.
* Example #1: WebHDFS & Templeton/WebHCat via KnoxShell DSL
* Example #2: WebHDFS & Oozie via KnoxShell DSL
The second two examples submit the same job and workflow but do so using only
the [cURL](http://curl.haxx.se/) command line HTTP client.
* Example #3: WebHDFS & Templeton/WebHCat via cURL
* Example #4: WebHDFS & Oozie via cURL
------------------------------------------------------------------------------
Assumptions
------------------------------------------------------------------------------
This document assumes a few things about your environment in order to
simplify the examples.
1. The JVM is executable as simply java.
2. The Apache Knox Gateway is installed and functional.
3. The example commands are executed with GATEWAY_HOME as the current
directory. The GATEWAY_HOME directory is the directory within the
Apache Knox Gateway installation that contains the README file and the bin,
conf and deployments directories.
4. A few examples optionally use commands from a standard Groovy installation.
To try these optional examples you will need Groovy [installed][gii].
[gii]: http://groovy.codehaus.org/Installing+Groovy
------------------------------------------------------------------------------
Customization
------------------------------------------------------------------------------
These examples may need to be tailored to the execution environment. In
particular, hostnames and ports may need to be changed to match your
environment. There are two example files in the distribution
that may need to be customized. Take a moment to review these files.
All of the values that may need to be customized can be found together at the
top of each file.
* samples/ExampleSubmitJob.groovy
* samples/ExampleSubmitWorkflow.groovy
If you are using the Sandbox VM for your Hadoop cluster you may want to
review [these configuration tips][sb].
[sb]: sandbox.html
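For reference, the values you may need to change sit together at the top of
each script. In samples/ExampleSubmitJob.groovy they look like the following;
adjust the gateway host, port, cluster name and credentials to match your
environment (samples/ExampleSubmitWorkflow.groovy additionally defines
jobTracker and nameNode values).
gateway = "https://localhost:8443/gateway/sample"   // gateway address and cluster name
username = "mapred"                                 // user the job is submitted as
password = "mapred-password"
dataFile = "LICENSE"                                // local file uploaded as job input
jarFile = "samples/hadoop-examples.jar"             // local copy of the examples jar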
------------------------------------------------------------------------------
Example #1: WebHDFS & Templeton/WebHCat via KnoxShell DSL
------------------------------------------------------------------------------
This example will submit the familiar WordCount Java MapReduce job to the
Hadoop cluster via the gateway using the KnoxShell DSL. There are several
ways to do this depending upon your preference.
You can use the "embedded" Groovy interpreter provided with the distribution.
java -jar bin/shell-${gateway-version}.jar samples/ExampleSubmitJob.groovy
You can load the KnoxShell DSL script into the standard Groovy Console.
groovyConsole -cp bin/shell-${gateway-version}.jar samples/ExampleSubmitJob.groovy
You can manually type the KnoxShell DSL script into the "embedded" Groovy
interpreter provided with the distribution.
java -jar bin/shell-${gateway-version}.jar
Each line from the file below will need to be typed or copied into the
interactive shell.
***samples/ExampleSubmitJob.groovy***
import com.jayway.jsonpath.JsonPath
import org.apache.hadoop.gateway.shell.Hadoop
import org.apache.hadoop.gateway.shell.hdfs.Hdfs
import org.apache.hadoop.gateway.shell.job.Job
import static java.util.concurrent.TimeUnit.SECONDS
gateway = "https://localhost:8443/gateway/sample"
username = "mapred"
password = "mapred-password"
dataFile = "LICENSE"
jarFile = "samples/hadoop-examples.jar"
hadoop = Hadoop.login( gateway, username, password )
println "Delete /tmp/test " + Hdfs.rm(hadoop).file( "/tmp/test" ).recursive().now().statusCode
println "Create /tmp/test " + Hdfs.mkdir(hadoop).dir( "/tmp/test").now().statusCode
putData = Hdfs.put(hadoop).file( dataFile ).to( "/tmp/test/input/FILE" ).later() {
println "Put /tmp/test/input/FILE " + it.statusCode }
putJar = Hdfs.put(hadoop).file( jarFile ).to( "/tmp/test/hadoop-examples.jar" ).later() {
println "Put /tmp/test/hadoop-examples.jar " + it.statusCode }
hadoop.waitFor( putData, putJar )
jobId = Job.submitJava(hadoop) \
    .jar( "/tmp/test/hadoop-examples.jar" ) \
    .app( "wordcount" ) \
    .input( "/tmp/test/input" ) \
    .output( "/tmp/test/output" ) \
    .now().jobId
println "Submitted job " + jobId
done = false
count = 0
while( !done && count++ < 60 ) {
    sleep( 1000 )
    json = Job.queryStatus(hadoop).jobId(jobId).now().string
    done = JsonPath.read( json, "\$.status.jobComplete" )
}
println "Done " + done
println "Shutdown " + hadoop.shutdown( 10, SECONDS )
------------------------------------------------------------------------------
Example #2: WebHDFS & Oozie via KnoxShell DSL
------------------------------------------------------------------------------
This example will also submit the familiar WordCount Java MapReduce job to the
Hadoop cluster via the gateway using the KnoxShell DSL. However, in this case
the job will be submitted via an Oozie workflow. There are several ways to do
this depending upon your preference.
You can use the "embedded" Groovy interpreter provided with the distribution.
java -jar bin/shell-${gateway-version}.jar samples/ExampleSubmitWorkflow.groovy
You can load the KnoxShell DSL script into the standard Groovy Console.
groovyConsole -cp bin/shell-${gateway-version}.jar samples/ExampleSubmitWorkflow.groovy
You can manually type the KnoxShell DSL script into the "embedded" Groovy
interpreter provided with the distribution.
java -jar bin/shell-${gateway-version}.jar
Each line from the file below will need to be typed or copied into the
interactive shell.
***samples/ExampleSubmitWorkflow.groovy***
import com.jayway.jsonpath.JsonPath
import org.apache.hadoop.gateway.shell.Hadoop
import org.apache.hadoop.gateway.shell.hdfs.Hdfs
import org.apache.hadoop.gateway.shell.workflow.Workflow
import static java.util.concurrent.TimeUnit.SECONDS
gateway = "https://localhost:8443/gateway/sample"
jobTracker = "sandbox:50300";
nameNode = "sandbox:8020";
username = "mapred"
password = "mapred-password"
inputFile = "LICENSE"
jarFile = "samples/hadoop-examples.jar"
definition = """\
<workflow-app xmlns="uri:oozie:workflow:0.2" name="wordcount-workflow">
    <start to="root-node"/>
    <action name="root-node">
        <java>
            <job-tracker>$jobTracker</job-tracker>
            <name-node>hdfs://$nameNode</name-node>
            <main-class>org.apache.hadoop.examples.WordCount</main-class>
            <arg>/tmp/test/input</arg>
            <arg>/tmp/test/output</arg>
        </java>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Java failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""
configuration = """\
<configuration>
    <property>
        <name>user.name</name>
        <value>$username</value>
    </property>
    <property>
        <name>oozie.wf.application.path</name>
        <value>hdfs://$nameNode/tmp/test</value>
    </property>
</configuration>
"""
hadoop = Hadoop.login( gateway, username, password )
println "Delete /tmp/test " + Hdfs.rm(hadoop).file( "/tmp/test" ).recursive().now().statusCode
println "Mkdir /tmp/test " + Hdfs.mkdir(hadoop).dir( "/tmp/test").now().statusCode
putWorkflow = Hdfs.put(hadoop).text( definition ).to( "/tmp/test/workflow.xml" ).later() {
println "Put /tmp/test/workflow.xml " + it.statusCode }
putData = Hdfs.put(hadoop).file( inputFile ).to( "/tmp/test/input/FILE" ).later() {
println "Put /tmp/test/input/FILE " + it.statusCode }
putJar = Hdfs.put(hadoop).file( jarFile ).to( "/tmp/test/lib/hadoop-examples.jar" ).later() {
println "Put /tmp/test/lib/hadoop-examples.jar " + it.statusCode }
hadoop.waitFor( putWorkflow, putData, putJar )
jobId = Workflow.submit(hadoop).text( configuration ).now().jobId
println "Submitted job " + jobId
status = "UNKNOWN";
count = 0;
while( status != "SUCCEEDED" && count++ < 60 ) {
sleep( 1000 )
json = Workflow.status(hadoop).jobId( jobId ).now().string
status = JsonPath.read( json, "\$.status" )
}
println "Job status " + status;
println "Shutdown " + hadoop.shutdown( 10, SECONDS )
------------------------------------------------------------------------------
Example #3: WebHDFS & Templeton/WebHCat via cURL
------------------------------------------------------------------------------
The example below illustrates the sequence of curl commands that could be used
to run a "word count" MapReduce job. It utilizes the hadoop-examples.jar
from a Hadoop install for running a simple word count job. Take care to
follow the instructions below for steps 3/4 and 5/6 where the Location header
returned by the call to the NameNode is copied for use with the call to the
DataNode that follows it.
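As a point of comparison with Example #1, each create/upload pair below (steps
3/4 and 5/6) is collapsed into a single Hdfs.put call by the KnoxShell DSL. A
minimal sketch of the equivalent, assuming the same gateway URL and credentials
used throughout this guide:
import org.apache.hadoop.gateway.shell.Hadoop
import org.apache.hadoop.gateway.shell.hdfs.Hdfs
import static java.util.concurrent.TimeUnit.SECONDS
hadoop = Hadoop.login( "https://localhost:8443/gateway/sample", "mapred", "mapred-password" )
// One DSL call replaces steps 3 and 4: create the inode, then upload the jar contents.
println Hdfs.put( hadoop ).file( "hadoop-examples.jar" ).to( "/tmp/test/hadoop-examples.jar" ).now().statusCode
println hadoop.shutdown( 10, SECONDS )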
# 0. Optionally cleanup the test directory in case a previous example was run without cleaning up.
curl -i -k -u mapred:mapred-password -X DELETE \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test?op=DELETE&recursive=true'
# 1. Create a test input directory /tmp/test/input
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/input?op=MKDIRS'
# 2. Create a test output directory /tmp/test/output
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/output?op=MKDIRS'
# 3. Create the inode for hadoop-examples.jar in /tmp/test
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/hadoop-examples.jar?op=CREATE'
# 4. Upload hadoop-examples.jar to /tmp/test. Use a hadoop-examples.jar from a Hadoop install.
curl -i -k -u mapred:mapred-password -T hadoop-examples.jar -X PUT '{Value of Location header from command above}'
# 5. Create the inode for a sample file README in /tmp/test/input
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/input/README?op=CREATE'
# 6. Upload README to /tmp/test/input. Use the README file found in {GATEWAY_HOME}.
curl -i -k -u mapred:mapred-password -T README -X PUT '{Value of Location header from command above}'
# 7. Submit the word count job via WebHCat/Templeton.
# Take note of the Job ID in the JSON response as this will be used in the next step.
curl -v -i -k -u mapred:mapred-password -X POST \
-d jar=/tmp/test/hadoop-examples.jar -d class=wordcount \
-d arg=/tmp/test/input -d arg=/tmp/test/output \
'https://localhost:8443/gateway/sample/templeton/api/v1/mapreduce/jar'
# 8. Look at the status of the job
curl -i -k -u mapred:mapred-password -X GET \
'https://localhost:8443/gateway/sample/templeton/api/v1/queue/{Job ID returned in JSON body from previous step}'
# 9. Look at the status of the job queue
curl -i -k -u mapred:mapred-password -X GET \
'https://localhost:8443/gateway/sample/templeton/api/v1/queue'
# 10. List the contents of the output directory /tmp/test/output
curl -i -k -u mapred:mapred-password -X GET \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/output?op=LISTSTATUS'
# 11. Optionally cleanup the test directory
curl -i -k -u mapred:mapred-password -X DELETE \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test?op=DELETE&recursive=true'
------------------------------------------------------------------------------
Example #4: WebHDFS & Oozie via cURL
------------------------------------------------------------------------------
The example below illustrates the sequence of curl commands that could be used
to run a "word count" MapReduce job via an Oozie workflow. It utilizes the
hadoop-examples.jar from a Hadoop install for running a simple word count job.
Take care to follow the instructions below where replacement values are
required. These replacement values are identified with { } markup.
# 0. Optionally cleanup the test directory in case a previous example was run without cleaning up.
curl -i -k -u mapred:mapred-password -X DELETE \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test?op=DELETE&recursive=true'
# 1. Create the inode for workflow definition file in /tmp/test
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/workflow.xml?op=CREATE'
# 2. Upload the workflow definition file. This file can be found in {GATEWAY_HOME}/templates
curl -i -k -u mapred:mapred-password -T templates/workflow-definition.xml -X PUT \
'{Value of Location header from command above}'
# 3. Create the inode for hadoop-examples.jar in /tmp/test/lib
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/lib/hadoop-examples.jar?op=CREATE'
# 4. Upload hadoop-examples.jar to /tmp/test/lib. Use a hadoop-examples.jar from a Hadoop install.
curl -i -k -u mapred:mapred-password -T hadoop-examples.jar -X PUT \
'{Value of Location header from command above}'
# 5. Create the inode for a sample input file README in /tmp/test/input.
curl -i -k -u mapred:mapred-password -X PUT \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/input/README?op=CREATE'
# 6. Upload README to /tmp/test/input. Use the README file found in {GATEWAY_HOME}.
curl -i -k -u mapred:mapred-password -T README -X PUT \
'{Value of Location header from command above}'
# 7. Create the job configuration file by replacing the {NameNode host:port} and {JobTracker host:port}
# in the command below to values that match your Hadoop configuration.
# NOTE: The hostnames must be resolvable by the Oozie daemon. The ports are the RPC ports not the HTTP ports.
# For example {NameNode host:port} might be sandbox:8020 and {JobTracker host:port} sandbox:50300
# The source workflow-configuration.xml file can be found in {GATEWAY_HOME}/templates
# Alternatively, this file can be copied and edited manually for environments without the sed utility.
sed -e s/REPLACE.NAMENODE.RPCHOSTPORT/{NameNode host:port}/ \
-e s/REPLACE.JOBTRACKER.RPCHOSTPORT/{JobTracker host:port}/ \
<templates/workflow-configuration.xml >workflow-configuration.xml
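If sed is not available but you have Groovy installed (see the Assumptions
section), the same substitution can be scripted. A small sketch using the
example Sandbox values from the comment above; substitute your own host:port
values:
// Hypothetical Groovy alternative to the sed command above.
text = new File( "templates/workflow-configuration.xml" ).text
text = text.replace( "REPLACE.NAMENODE.RPCHOSTPORT", "sandbox:8020" )
text = text.replace( "REPLACE.JOBTRACKER.RPCHOSTPORT", "sandbox:50300" )
new File( "workflow-configuration.xml" ).text = text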
# 8. Submit the job via Oozie
# Take note of the Job ID in the JSON response as this will be used in the next step.
curl -i -k -u mapred:mapred-password -T workflow-configuration.xml -H Content-Type:application/xml -X POST \
'https://localhost:8443/gateway/sample/oozie/api/v1/jobs?action=start'
# 9. Query the job status via Oozie.
curl -i -k -u mapred:mapred-password -X GET \
'https://localhost:8443/gateway/sample/oozie/api/v1/job/{Job ID returned in JSON body from previous step}'
# 10. List the contents of the output directory /tmp/test/output
curl -i -k -u mapred:mapred-password -X GET \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test/output?op=LISTSTATUS'
# 11. Optionally cleanup the test directory
curl -i -k -u mapred:mapred-password -X DELETE \
'https://localhost:8443/gateway/sample/namenode/api/v1/tmp/test?op=DELETE&recursive=true'
------------------------------------------------------------------------------
Disclaimer
------------------------------------------------------------------------------
The Apache Knox Gateway is an effort undergoing incubation at the
Apache Software Foundation (ASF), sponsored by the Apache Incubator PMC.
Incubation is required of all newly accepted projects until a further review
indicates that the infrastructure, communications, and decision making process
have stabilized in a manner consistent with other successful ASF projects.
While incubation status is not necessarily a reflection of the completeness
or stability of the code, it does indicate that the project has yet to be
fully endorsed by the ASF.