| <noautolink> |
| |
| [[index][::Go back to Oozie Documentation Index::]] |
| |
| ----- |
| |
| %TOC% |
| |
| ---++ Oozie Web Services API, V1 (Workflow, Coordinator, And Bundle) |
| |
| The Oozie Web Services API is an HTTP REST JSON API. |
| |
| All responses are in =UTF-8=. |
| |
| Assuming Oozie is running at =OOZIE_URL=, the following web services endpoints are supported: |
| |
| * <OOZIE_URL>/versions |
| * <OOZIE_URL>/v1/admin |
| * <OOZIE_URL>/v1/job |
| * <OOZIE_URL>/v1/jobs |
| * <OOZIE_URL>/v2/job |
| * <OOZIE_URL>/v2/jobs |
| * <OOZIE_URL>/v2/admin |
| * <OOZIE_URL>/v2/sla |
| |
| Documentation on the API is below; in some cases, looking at the corresponding command in the |
| [[DG_CommandLineTool][Command Line Documentation]] page will provide additional details and examples. Most of the functionality |
| offered by the Oozie CLI uses the WS API. If you export <code>OOZIE_DEBUG</code>, the Oozie CLI will output the WS API |
| details used by any commands you execute. This is useful for debugging or for seeing how the Oozie CLI works with the WS API. |
| |
| ---+++ Versions End-Point |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| This endpoint is for clients to perform protocol negotiation. |
| |
| It supports only HTTP GET requests and has no sub-resources. |
| |
| It returns the Oozie protocol versions supported by the server. |
| |
| Current returned values are =0, 1, 2=. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/versions |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| [0,1,2] |
| </verbatim> |
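| |
| As a rough illustration of protocol negotiation, the following Python sketch picks the highest version that both sides support. |
| The function names and the =CLIENT_VERSIONS= tuple are hypothetical and are not part of Oozie; the input is the JSON body returned by the versions endpoint. |
| |
```python
import json

# Protocol versions this hypothetical client knows how to speak (an assumption).
CLIENT_VERSIONS = (0, 1, 2)


def negotiate_version(versions_body, client_versions=CLIENT_VERSIONS):
    """Return the highest protocol version shared by client and server.

    versions_body is the JSON text returned by GET <OOZIE_URL>/versions,
    for example "[0,1,2]".
    """
    server_versions = set(json.loads(versions_body))
    common = server_versions.intersection(client_versions)
    if not common:
        raise ValueError("no common Oozie WS API version")
    return max(common)


def base_path(version):
    """Build the versioned path prefix used in this document's examples."""
    return "/oozie/v{}".format(version)


if __name__ == "__main__":
    # A server that only advertises versions 0 and 1.
    print(base_path(negotiate_version("[0,1]")))
```
| |
| A client could call =negotiate_version= once at startup and reuse the resulting prefix for all later requests. |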
| |
| ---+++ Admin End-Point |
| |
| This endpoint is for obtaining Oozie system status and configuration information. |
| |
| It supports the following sub-resources: =status, os-env, java-sys-properties, configuration, instrumentation, systems, available-timezones=. |
| |
| ---++++ System Status |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the system status. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/status |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| {"systemMode":"NORMAL"} |
| </verbatim> |
| |
| An HTTP PUT request changes the system status to one of =NORMAL=, =NOWEBSERVICE=, or =SAFEMODE=. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/admin/status?systemmode=SAFEMODE |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
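| |
| The status calls above could be wrapped in a small helper like the following Python sketch. |
| It assumes the GET response body is valid JSON with a quoted mode string; the function names are hypothetical and only build or parse strings, they do not contact a server. |
| |
```python
import json

# The three system modes named in this section.
VALID_MODES = ("NORMAL", "NOWEBSERVICE", "SAFEMODE")


def parse_system_mode(status_body):
    """Extract the mode from the JSON body of GET /oozie/v1/admin/status."""
    return json.loads(status_body)["systemMode"]


def status_change_path(mode):
    """Build the path and query string for the PUT request that changes the mode."""
    if mode not in VALID_MODES:
        raise ValueError("unknown system mode: {!r}".format(mode))
    return "/oozie/v1/admin/status?systemmode=" + mode
```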
| |
| ---++++ OS Environment |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the Oozie system OS environment. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/os-env |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| TERM: "xterm", |
| JAVA_HOME: "/usr/java/latest", |
| XCURSOR_SIZE: "", |
| SSH_CLIENT: "::ffff:127.0.0.1 49082 22", |
| XCURSOR_THEME: "default", |
| INPUTRC: "/etc/inputrc", |
| HISTSIZE: "1000", |
| PATH: "/usr/java/latest/bin", |
| KDE_FULL_SESSION: "true", |
| LANG: "en_US.UTF-8", |
| ... |
| } |
| </verbatim> |
| |
| ---++++ Java System Properties |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the Oozie Java system properties. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/java-sys-properties |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| java.vm.version: "11.0-b15", |
| sun.jnu.encoding: "UTF-8", |
| java.vendor.url: "http://java.sun.com/", |
| java.vm.info: "mixed mode", |
| ... |
| } |
| </verbatim> |
| |
| ---++++ Oozie Configuration |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the Oozie system configuration. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/configuration |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| oozie.service.SchedulerService.threads: "5", |
| oozie.service.ActionService.executor.classes: " |
| org.apache.oozie.dag.action.decision.DecisionActionExecutor, |
| org.apache.oozie.dag.action.hadoop.HadoopActionExecutor, |
| org.apache.oozie.dag.action.hadoop.FsActionExecutor |
| ", |
| oozie.service.CallableQueueService.threads.min: "10", |
| oozie.service.DBLiteWorkflowStoreService.oozie.autoinstall: "true", |
| ... |
| } |
| </verbatim> |
| |
| ---++++ Oozie Instrumentation |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the Oozie instrumentation information. Keep in mind that timers and counters that the Oozie server |
| hasn't incremented yet will not show up. |
| |
| *Note:* If Instrumentation is enabled, then Metrics is unavailable. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/instrumentation |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| timers: [ |
| { |
| group: "db", |
| data: [ |
| { |
| ownMinTime: 2, |
| ownTimeStdDev: 0, |
| totalTimeStdDev: 0, |
| ownTimeAvg: 3, |
| ticks: 117, |
| name: "update-workflow", |
| ownMaxTime: 32, |
| totalMinTime: 2, |
| totalMaxTime: 32, |
| totalTimeAvg: 3 |
| }, |
| ... |
| ] |
| }, |
| ... |
| ], |
| samplers: [ |
| { |
| group: "callablequeue", |
| data: [ |
| { |
| name: "threads.active", |
| value: 1.8333333333333333 |
| }, |
| { |
| name: "delayed.queue.size", |
| value: 0 |
| }, |
| { |
| name: "queue.size", |
| value: 0 |
| } |
| ] |
| }, |
| ... |
| ], |
| variables: [ |
| { |
| group: "jvm", |
| data: [ |
| { |
| name: "max.memory", |
| value: 506920960 |
| }, |
| { |
| name: "total.memory", |
| value: 56492032 |
| }, |
| { |
| name: "free.memory", |
| value: 45776800 |
| } |
| ] |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| ---++++ Oozie Metrics |
| |
| _Available in the Oozie v2 WS API and later_ |
| |
| An HTTP GET request returns the Oozie metrics information. Keep in mind that timers and counters that the Oozie server |
| hasn't incremented yet will not show up. |
| |
| *Note:* If Metrics is enabled, then Instrumentation is unavailable. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/admin/metrics |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "gauges" : { |
| "jvm.memory.non-heap.committed" : { |
| "value" : 62590976 |
| }, |
| "oozie.mode" : { |
| "value" : "NORMAL" |
| }, |
| ... |
| }, |
| "timers" : { |
| "commands.action.end.call.timer" : { |
| "mean" : 108.5, |
| "p50" : 111.5, |
| "p75" : 165.5, |
| "p999" : 169, |
| "count" : 4, |
| "p95" : 169, |
| "max" : 169, |
| "mean_rate" : 0, |
| "duration_units" : "milliseconds", |
| "p98" : 169, |
| "m1_rate" : 0, |
| "rate_units" : "calls/millisecond", |
| "m15_rate" : 0, |
| "stddev" : 62.9417720330995, |
| "m5_rate" : 0, |
| "p99" : 169, |
| "min" : 42 |
| }, |
| ... |
| }, |
| "histograms" : { |
| "callablequeue.threads.active.histogram" : { |
| "p999" : 1, |
| "mean" : 0.0625, |
| "min" : 0, |
| "p75" : 0, |
| "p95" : 1, |
| "count" : 48, |
| "p98" : 1, |
| "stddev" : 0.24462302739504083, |
| "max" : 1, |
| "p99" : 1, |
| "p50" : 0 |
| }, |
| ... |
| }, |
| "counters" : { |
| "commands.job.info.executions" : { |
| "count" : 9 |
| }, |
| "jpa.CoordJobsGetPendingJPAExecutor" : { |
| "count" : 1 |
| }, |
| "jpa.GET_WORKFLOW" : { |
| "count" : 10 |
| }, |
| ... |
| } |
| } |
| </verbatim> |
| |
| ---++++ Version |
| |
| _Identical to the corresponding Oozie v0 WS API_ |
| |
| An HTTP GET request returns the Oozie build version. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/build-version |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| {buildVersion: "3.0.0-SNAPSHOT" } |
| </verbatim> |
| |
| ---++++ Available Time Zones |
| |
| An HTTP GET request returns the available time zones. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/available-timezones |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "available-timezones":[ |
| { |
| "timezoneDisplayName":"SST (Pacific\/Midway)", |
| "timezoneId":"Pacific\/Midway" |
| }, |
| { |
| "timezoneDisplayName":"NUT (Pacific\/Niue)", |
| "timezoneId":"Pacific\/Niue" |
| }, |
| { |
| "timezoneDisplayName":"SST (Pacific\/Pago_Pago)", |
| "timezoneId":"Pacific\/Pago_Pago" |
| }, |
| { |
| "timezoneDisplayName":"SST (Pacific\/Samoa)", |
| "timezoneId":"Pacific\/Samoa" |
| }, |
| { |
| "timezoneDisplayName":"SST (US\/Samoa)", |
| "timezoneId":"US\/Samoa" |
| }, |
| { |
| "timezoneDisplayName":"HAST (America\/Adak)", |
| "timezoneId":"America\/Adak" |
| }, |
| { |
| "timezoneDisplayName":"HAST (America\/Atka)", |
| "timezoneId":"America\/Atka" |
| }, |
| { |
| "timezoneDisplayName":"HST (HST)", |
| "timezoneId":"HST" |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| ---++++ Queue Dump |
| |
| An HTTP GET request returns the queue dump of the Oozie system. This is an administrator debugging feature. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/admin/queue-dump |
| </verbatim> |
| |
| ---++++ Available Oozie Servers |
| |
| An HTTP GET request returns the list of available Oozie Servers. This is useful when Oozie is configured |
| for [[AG_Install#HA][High Availability]]; if not, it will simply return the one Oozie Server. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/admin/available-oozie-servers |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "hostA": "http://hostA:11000/oozie", |
| "hostB": "http://hostB:11000/oozie", |
| "hostC": "http://hostC:11000/oozie" |
| } |
| </verbatim> |
| |
| ---++++ List available sharelib |
| |
| An HTTP GET request returns the list of available sharelibs. |
| If the name of a sharelib is passed as the =lib= argument (regex supported), all of its files are also listed. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/admin/list_sharelib |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "sharelib": |
| [ |
| "oozie", |
| "hive", |
| "distcp", |
| "hcatalog", |
| "sqoop", |
| "mapreduce-streaming", |
| "pig" |
| ] |
| } |
| </verbatim> |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/admin/list_sharelib?lib=pig* |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "sharelib": |
| [ |
| { |
| "pig": |
| { |
| "sharelibFiles": |
| [ |
| "hdfs://localhost:9000/user/purushah/share/lib/lib_20131114095729/pig/pig.jar", |
| "hdfs://localhost:9000/user/purushah/share/lib/lib_20131114095729/pig/piggybank.jar" |
| ] |
| } |
| } |
| ] |
| } |
| </verbatim> |
| |
| |
| ---++++ Update system sharelib |
| |
| This web service call makes the Oozie server(s) pick up the latest version of the sharelib present |
| under the =oozie.service.WorkflowAppService.system.libpath= directory, based on the sharelib directory timestamp, or reload |
| the sharelib metafile if one is configured. The main purpose is to update the sharelib on the Oozie server without restarting it. |
| |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/admin/update_sharelib |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| [ |
| { |
| "sharelibUpdate":{ |
| "host":"server1", |
| "status":"Server not found" |
| } |
| }, |
| { |
| "sharelibUpdate":{ |
| "host":"server2", |
| "status":"Successful", |
| "sharelibDirOld":"hdfs://localhost:51951/user/purushah/share/lib/lib_20140107181218", |
| "sharelibDirNew":"hdfs://localhost:51951/user/purushah/share/lib/lib_20140107181218" |
| } |
| }, |
| { |
| "sharelibUpdate":{ |
| "host":"server3", |
| "status":"Successful", |
| "sharelibDirOld":"hdfs://localhost:51951/user/purushah/share/lib/lib_20140107181218", |
| "sharelibDirNew":"hdfs://localhost:51951/user/purushah/share/lib/lib_20140107181218" |
| } |
| } |
| ] |
| </verbatim> |
| |
| ---+++ Job and Jobs End-Points |
| |
| _Modified in Oozie v1 WS API_ |
| |
| These endpoints are for submitting, managing, and retrieving information about workflow, coordinator, and bundle jobs. |
| |
| ---++++ Job Submission |
| |
| ---++++ Standard Job Submission |
| |
| An HTTP POST request with an XML configuration as payload creates a job. |
| |
| The type of job is determined by the presence of one of the following 3 properties: |
| |
| * =oozie.wf.application.path= : path to a workflow application directory, creates a workflow job |
| * =oozie.coord.application.path= : path to a coordinator application file, creates a coordinator job |
| * =oozie.bundle.application.path= : path to a bundle application file, creates a bundle job |
| |
| Or, if none of those are present, the =jobtype= parameter determines the type of job to run. It can be =mapreduce=, =pig=, =hive=, or =sqoop=. |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v1/jobs |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>user.name</name> |
| <value>bansalm</value> |
| </property> |
| <property> |
| <name>oozie.wf.application.path</name> |
| <value>hdfs://foo:8020/user/bansalm/myapp/</value> |
| </property> |
| ... |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 201 CREATED |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "job-3" |
| } |
| </verbatim> |
| |
| A created job will be in =PREP= status. If the query string parameter 'action=start' is provided in |
| the POST URL, the job will be started immediately and its status will be =RUNNING=. |
| |
| Coordinator jobs with a start time in the future will not create any actions until the start time |
| is reached. |
| |
| A coordinator job will remain in =PREP= status until it's triggered, in which case it will change to =RUNNING= status. |
| The 'action=start' parameter is not valid for coordinator jobs. |
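| |
| The XML payload for a standard submission can be generated from a plain mapping of property names to values. |
| The following Python sketch uses only the standard library; =build_job_configuration= and =submission_path= are hypothetical helper names, and the sketch only builds the request body and path, it does not send anything. |
| |
```python
import xml.etree.ElementTree as ET


def build_job_configuration(properties):
    """Serialize a dict of job properties as a <configuration> XML document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ElementTree escapes special characters such as & and < in the values.
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def submission_path(start=False):
    """Return the POST path; 'action=start' asks for the job to start immediately."""
    return "/oozie/v1/jobs" + ("?action=start" if start else "")


if __name__ == "__main__":
    print(build_job_configuration({
        "user.name": "bansalm",
        "oozie.wf.application.path": "hdfs://foo:8020/user/bansalm/myapp/",
    }))
```
| |
| The returned string would be sent as the body of the POST request with the =Content-Type: application/xml;charset=UTF-8= header shown above. |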
| |
| ---++++ Proxy MapReduce Job Submission |
| |
| You can submit a Workflow that contains a single MapReduce action without writing a workflow.xml. Any required Jars or other files |
| must already exist in HDFS. |
| |
| The following properties are required; any additional parameters needed by the MapReduce job can also be specified here: |
| * =fs.default.name=: The NameNode |
| * =mapred.job.tracker=: The JobTracker |
| * =mapred.mapper.class=: The map-task classname |
| * =mapred.reducer.class=: The reducer-task classname |
| * =mapred.input.dir=: The map-task input directory |
| * =mapred.output.dir=: The reduce-task output directory |
| * =user.name=: The username of the user submitting the job |
| * =oozie.libpath=: A directory in HDFS that contains necessary Jars for your job |
| * =oozie.proxysubmission=: Must be set to =true= |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v1/jobs?jobtype=mapreduce |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>fs.default.name</name> |
| <value>hdfs://localhost:8020</value> |
| </property> |
| <property> |
| <name>mapred.job.tracker</name> |
| <value>localhost:8021</value> |
| </property> |
| <property> |
| <name>mapred.mapper.class</name> |
| <value>org.apache.oozie.example.SampleMapper</value> |
| </property> |
| <property> |
| <name>mapred.reducer.class</name> |
| <value>org.apache.oozie.example.SampleReducer</value> |
| </property> |
| <property> |
| <name>mapred.input.dir</name> |
| <value>hdfs://localhost:8020/user/rkanter/examples/input-data/text</value> |
| </property> |
| <property> |
| <name>mapred.output.dir</name> |
| <value>hdfs://localhost:8020/user/rkanter/examples/output-data/map-reduce</value> |
| </property> |
| <property> |
| <name>user.name</name> |
| <value>rkanter</value> |
| </property> |
| <property> |
| <name>oozie.libpath</name> |
| <value>hdfs://localhost:8020/user/rkanter/examples/apps/map-reduce/lib</value> |
| </property> |
| <property> |
| <name>oozie.proxysubmission</name> |
| <value>true</value> |
| </property> |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 201 CREATED |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "job-3" |
| } |
| </verbatim> |
| |
| ---++++ Proxy Pig Job Submission |
| |
| You can submit a Workflow that contains a single Pig action without writing a workflow.xml. Any required Jars or other files must |
| already exist in HDFS. |
| |
| The following properties are required: |
| * =fs.default.name=: The NameNode |
| * =mapred.job.tracker=: The JobTracker |
| * =user.name=: The username of the user submitting the job |
| * =oozie.pig.script=: Contains the pig script you want to run (the actual script, not a file path) |
| * =oozie.libpath=: A directory in HDFS that contains necessary Jars for your job |
| * =oozie.proxysubmission=: Must be set to =true= |
| |
| The following properties are optional: |
| * =oozie.pig.script.params.size=: The number of parameters you'll be passing to Pig |
| * =oozie.pig.script.params.n=: A parameter (variable definition for the script) in 'key=value' format, the 'n' should be an integer starting with 0 to indicate the parameter number |
| * =oozie.pig.options.size=: The number of options you'll be passing to Pig |
| * =oozie.pig.options.n=: An argument to pass to Pig, the 'n' should be an integer starting with 0 to indicate the option number |
| |
| The =oozie.pig.options.n= parameters are sent directly to Pig without any modification unless they start with =-D=, in which case |
| they are put into the <code><configuration></code> element of the action. |
| |
| In addition to passing parameters to Pig with =oozie.pig.script.params.n=, you can also create a properties file on HDFS and |
| reference it with the =-param_file= option in =oozie.pig.options.n=; both are shown in the following example. |
| |
| <verbatim> |
| $ hadoop fs -cat /user/rkanter/pig_params.properties |
| INPUT=/user/rkanter/examples/input-data/text |
| </verbatim> |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v1/jobs?jobtype=pig |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>fs.default.name</name> |
| <value>hdfs://localhost:8020</value> |
| </property> |
| <property> |
| <name>mapred.job.tracker</name> |
| <value>localhost:8021</value> |
| </property> |
| <property> |
| <name>user.name</name> |
| <value>rkanter</value> |
| </property> |
| <property> |
| <name>oozie.pig.script</name> |
| <value> |
| A = load '$INPUT' using PigStorage(':'); |
| B = foreach A generate $0 as id; |
| store B into '$OUTPUT' USING PigStorage(); |
| </value> |
| </property> |
| <property> |
| <name>oozie.pig.script.params.size</name> |
| <value>1</value> |
| </property> |
| <property> |
| <name>oozie.pig.script.params.0</name> |
| <value>OUTPUT=/user/rkanter/examples/output-data/pig</value> |
| </property> |
| <property> |
| <name>oozie.pig.options.size</name> |
| <value>2</value> |
| </property> |
| <property> |
| <name>oozie.pig.options.0</name> |
| <value>-param_file</value> |
| </property> |
| <property> |
| <name>oozie.pig.options.1</name> |
| <value>hdfs://localhost:8020/user/rkanter/pig_params.properties</value> |
| </property> |
| <property> |
| <name>oozie.libpath</name> |
| <value>hdfs://localhost:8020/user/rkanter/share/lib/pig</value> |
| </property> |
| <property> |
| <name>oozie.proxysubmission</name> |
| <value>true</value> |
| </property> |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 201 CREATED |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "job-3" |
| } |
| </verbatim> |
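| |
| The numbered =.size= / =.n= properties used for Pig parameters and options follow a simple pattern that can be generated from a list. |
| The Python sketch below is one way to do that; =indexed_properties= is a hypothetical helper name, and the values are taken from the request above. |
| |
```python
def indexed_properties(prefix, values):
    """Expand a list into '<prefix>.size' plus '<prefix>.0', '<prefix>.1', ..."""
    props = {prefix + ".size": str(len(values))}
    for n, value in enumerate(values):
        props["{}.{}".format(prefix, n)] = value
    return props


# Script parameters and Pig options from the example request in this section.
pig_props = {}
pig_props.update(indexed_properties(
    "oozie.pig.script.params",
    ["OUTPUT=/user/rkanter/examples/output-data/pig"],
))
pig_props.update(indexed_properties(
    "oozie.pig.options",
    ["-param_file", "hdfs://localhost:8020/user/rkanter/pig_params.properties"],
))

if __name__ == "__main__":
    for name in sorted(pig_props):
        print(name, "=", pig_props[name])
```
| |
| Generating the =.size= value from the list length keeps the count and the numbered entries from drifting apart. |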
| |
| ---++++ Proxy Hive Job Submission |
| |
| You can submit a Workflow that contains a single Hive action without writing a workflow.xml. Any required Jars or other files must |
| already exist in HDFS. |
| |
| The following properties are required: |
| * =fs.default.name=: The NameNode |
| * =mapred.job.tracker=: The JobTracker |
| * =user.name=: The username of the user submitting the job |
| * =oozie.hive.script=: Contains the hive script you want to run (the actual script, not a file path) |
| * =oozie.libpath=: A directory in HDFS that contains necessary Jars for your job |
| * =oozie.proxysubmission=: Must be set to =true= |
| |
| The following properties are optional: |
| * =oozie.hive.script.params.size=: The number of parameters you'll be passing to Hive |
| * =oozie.hive.script.params.n=: A parameter (variable definition for the script) in 'key=value' format, the 'n' should be an integer starting with 0 to indicate the parameter number |
| * =oozie.hive.options.size=: The number of options you'll be passing to Hive |
| * =oozie.hive.options.n=: An argument to pass to Hive, the 'n' should be an integer starting with 0 to indicate the option number |
| |
| The =oozie.hive.options.n= parameters are sent directly to Hive without any modification unless they start with =-D=, in which case |
| they are put into the <code><configuration></code> element of the action. |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v1/jobs?jobtype=hive |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>fs.default.name</name> |
| <value>hdfs://localhost:8020</value> |
| </property> |
| <property> |
| <name>mapred.job.tracker</name> |
| <value>localhost:8021</value> |
| </property> |
| <property> |
| <name>user.name</name> |
| <value>rkanter</value> |
| </property> |
| <property> |
| <name>oozie.hive.script</name> |
| <value> |
| CREATE EXTERNAL TABLE test (a INT) STORED AS TEXTFILE LOCATION '${INPUT}'; |
| INSERT OVERWRITE DIRECTORY '${OUTPUT}' SELECT * FROM test; |
| </value> |
| </property> |
| <property> |
| <name>oozie.hive.script.params.size</name> |
| <value>2</value> |
| </property> |
| <property> |
| <name>oozie.hive.script.params.0</name> |
| <value>OUTPUT=/user/rkanter/examples/output-data/hive</value> |
| </property> |
| <property> |
| <name>oozie.hive.script.params.1</name> |
| <value>INPUT=/user/rkanter/examples/input-data/table</value> |
| </property> |
| <property> |
| <name>oozie.libpath</name> |
| <value>hdfs://localhost:8020/user/rkanter/share/lib/hive</value> |
| </property> |
| <property> |
| <name>oozie.proxysubmission</name> |
| <value>true</value> |
| </property> |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 201 CREATED |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "job-3" |
| } |
| </verbatim> |
| |
| ---++++ Proxy Sqoop Job Submission |
| |
| You can submit a Workflow that contains a single Sqoop command without writing a workflow.xml. Any required Jars or other |
| files must already exist in HDFS. |
| |
| The following properties are required: |
| * =fs.default.name=: The NameNode |
| * =mapred.job.tracker=: The JobTracker |
| * =user.name=: The username of the user submitting the job |
| * =oozie.sqoop.command=: The Sqoop command you want to run, with each argument on its own line or separated by "\n" |
| * =oozie.libpath=: A directory in HDFS that contains necessary Jars for your job |
| * =oozie.proxysubmission=: Must be set to =true= |
| |
| The following properties are optional: |
| * =oozie.sqoop.options.size=: The number of options you'll be passing to the Sqoop Hadoop job |
| * =oozie.sqoop.options.n=: An argument to pass to the Sqoop Hadoop job conf, the 'n' should be an integer starting with 0 to indicate the option number |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v1/jobs?jobtype=sqoop |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>fs.default.name</name> |
| <value>hdfs://localhost:8020</value> |
| </property> |
| <property> |
| <name>mapred.job.tracker</name> |
| <value>localhost:8021</value> |
| </property> |
| <property> |
| <name>user.name</name> |
| <value>bzhang</value> |
| </property> |
| <property> |
| <name>oozie.sqoop.command</name> |
| <value> |
| import |
| --connect |
| jdbc:mysql://localhost:3306/oozie |
| --username |
| oozie |
| --password |
| oozie |
| --table |
| WF_JOBS |
| --target-dir |
| /user/${wf:user()}/${examplesRoot}/output-data/sqoop |
| </value> |
| </property> |
| <property> |
| <name>oozie.libpath</name> |
| <value>hdfs://localhost:8020/user/bzhang/share/lib/sqoop</value> |
| </property> |
| <property> |
| <name>oozie.proxysubmission</name> |
| <value>true</value> |
| </property> |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 201 CREATED |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "job-3" |
| } |
| </verbatim> |
| |
| ---++++ Managing a Job |
| |
| An HTTP PUT request starts, suspends, resumes, kills, updates, or dry-runs a job. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=start |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
| |
| Valid values for the 'action' parameter are 'start', 'suspend', 'resume', 'kill', 'dryrun', 'rerun', and 'change'. |
| |
| Rerunning and changing a job require additional parameters, and are described below: |
| |
| ---+++++ Re-Running a Workflow Job |
| |
| A workflow job in =SUCCEEDED=, =KILLED= or =FAILED= status can be partially rerun by specifying a list |
| of workflow nodes to skip during the rerun. All the nodes in the skip list must have completed their |
| execution. |
| |
| The rerun job will have the same job ID. |
| |
| A rerun request is done with an HTTP PUT request with a =rerun= action. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=rerun |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>user.name</name> |
| <value>tucu</value> |
| </property> |
| <property> |
| <name>oozie.wf.application.path</name> |
| <value>hdfs://foo:8020/user/tucu/myapp/</value> |
| </property> |
| <property> |
| <name>oozie.wf.rerun.skip.nodes</name> |
| <value>firstAction,secondAction</value> |
| </property> |
| ... |
| </configuration> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
| |
| ---+++++ Re-Running a coordinator job |
| |
| A coordinator job in =RUNNING=, =SUCCEEDED=, =KILLED= or =FAILED= status can be partially rerun by specifying the coordinator actions |
| to re-execute. |
| |
| A rerun request is done with an HTTP PUT request with a =coord-rerun= =action=. |
| |
| The =type= of the rerun can be =date= or =action=. |
| |
| The =scope= of the rerun depends on the type: |
| * =date=: a comma-separated list of date ranges. Each date range element is specified with dates separated by =::= |
| * =action=: a comma-separated list of action ranges. Each action range is specified with two action numbers separated by =-= |
| |
| The =refresh= parameter can be =true= or =false= to specify if the user wants to refresh an action's input and output events. |
| |
| The =nocleanup= parameter can be =true= or =false=; set it to =true= to skip cleaning up the output events of the rerun actions. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=coord-rerun&type=action&scope=1-2&refresh=false&nocleanup=false |
| . |
| </verbatim> |
| |
| or |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=coord-rerun&type=date&scope=2009-02-01T00:10Z::2009-03-01T00:10Z&refresh=false&nocleanup=false |
| . |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
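| |
| The coordinator rerun query string can be assembled programmatically. The following Python sketch builds the request path from the parameters described in this section. |
| =coord_rerun_path= is a hypothetical helper name; leaving =:= and =,= unescaped is a choice made here to mirror the examples above, not a documented server requirement. |
| |
```python
from urllib.parse import urlencode


def coord_rerun_path(job_id, rerun_type, scope, refresh=False, nocleanup=False):
    """Build the PUT path for a coordinator rerun.

    rerun_type is "date" or "action"; scope is a comma-separated list of
    date ranges (dates joined by "::") or action ranges (numbers joined by "-").
    """
    if rerun_type not in ("date", "action"):
        raise ValueError("type must be 'date' or 'action'")
    query = urlencode(
        [
            ("action", "coord-rerun"),
            ("type", rerun_type),
            ("scope", scope),
            ("refresh", str(refresh).lower()),
            ("nocleanup", str(nocleanup).lower()),
        ],
        safe=":,",  # keep date and list separators readable, as in the examples
    )
    return "/oozie/v1/job/{}?{}".format(job_id, query)


if __name__ == "__main__":
    print(coord_rerun_path("job-3", "action", "1-2"))
    print(coord_rerun_path("job-3", "date",
                           "2009-02-01T00:10Z::2009-03-01T00:10Z"))
```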
| |
| ---+++++ Re-Running a bundle job |
| |
| A bundle job in =RUNNING=, =SUCCEEDED=, =KILLED= or =FAILED= status can be partially rerun by specifying the coordinators to |
| re-execute. |
| |
| A rerun request is done with an HTTP PUT request with a =bundle-rerun= =action=. |
| |
| A comma-separated list of coordinator job names (not IDs) can be specified in the =coord-scope= parameter. |
| |
| The =date-scope= parameter is a comma-separated list of date ranges. Each date range element is specified with dates separated |
| by =::=. If it is empty or not included, Oozie determines the date scope itself. |
| |
| The =refresh= parameter can be =true= or =false= to specify if the user wants to refresh the coordinator's input and output events. |
| |
| The =nocleanup= parameter can be =true= or =false=; set it to =true= to skip cleaning up the output events of the rerun coordinators. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=bundle-rerun&coord-scope=coord-1&refresh=false&nocleanup=false |
| . |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
| |
| |
| ---+++++ Changing endtime/concurrency/pausetime of a Coordinator Job |
| |
| A coordinator job not in =KILLED= status can have its endtime, concurrency, or pausetime changed. |
| |
| A change request is done with an HTTP PUT request with a =change= =action=. |
| |
| The =value= parameter can contain any of the following: |
| * endtime: the end time of the coordinator job. |
| * concurrency: the concurrency of the coordinator job. |
| * pausetime: the pause time of the coordinator job. |
| |
| Multiple arguments can be passed to the =value= parameter by separating them with a ';' character. |
| |
| If the end time of an already-succeeded job is changed, its status becomes =RUNNING=. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=change&value=endtime=2011-12-01T05:00Z |
| . |
| </verbatim> |
| |
| or |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=change&value=concurrency=100 |
| . |
| </verbatim> |
| |
| or |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=change&value=pausetime=2011-12-01T05:00Z |
| . |
| </verbatim> |
| |
| or |
| |
| <verbatim> |
| PUT /oozie/v1/job/job-3?action=change&value=endtime=2011-12-01T05:00Z;concurrency=100;pausetime=2011-12-01T05:00Z |
| . |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| </verbatim> |
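| |
| Joining several changes into one =value= parameter is easy to get wrong by hand. The Python sketch below builds the change request path from keyword arguments. |
| =change_path= is a hypothetical helper name, and leaving ===, =;= and =:= unescaped is a choice made here to mirror the examples above. |
| |
```python
from urllib.parse import quote

# The three settings this section says can be changed, in a fixed output order.
CHANGEABLE = ("endtime", "concurrency", "pausetime")


def change_path(job_id, **changes):
    """Build the PUT path that changes one or more coordinator job settings."""
    unknown = set(changes) - set(CHANGEABLE)
    if unknown:
        raise ValueError("unsupported change(s): {}".format(sorted(unknown)))
    if not changes:
        raise ValueError("at least one change is required")
    # Multiple settings are separated with ';' inside the single value parameter.
    value = ";".join(
        "{}={}".format(name, changes[name])
        for name in CHANGEABLE
        if name in changes
    )
    return "/oozie/v1/job/{}?action=change&value={}".format(
        job_id, quote(value, safe="=;:")
    )


if __name__ == "__main__":
    print(change_path("job-3", endtime="2011-12-01T05:00Z", concurrency=100))
```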
| |
| ---+++++ Updating coordinator definition and properties |
| |
| Existing coordinator definition and properties will be replaced by the new definition and properties. Refer to [[DG_CommandLineTool#Updating_coordinator_definition_and_properties][Updating coordinator definition and properties]] for details. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v2/job/0000000-140414102048137-oozie-puru-C?action=update |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| {"update": |
| {"diff":"**********Job definition changes**********\n******************************************\n**********Job conf changes****************\n@@ -8,16 +8,12 @@\n |
| <value>hdfs:\/\/localhost:9000\/user\/purushah\/examples\/apps\/aggregator\/coordinator.xml<\/value>\r\n <\/property>\r\n <property>\r\n |
| - <name>user.name<\/name>\r\n |
| - <value>purushah<\/value>\r\n |
| - <\/property>\r\n |
| - <property>\r\n <name>start<\/name>\r\n |
| <value>2010-01-01T01:00Z<\/value>\r\n <\/property>\r\n <property>\r\n |
| - <name>newproperty<\/name>\r\n |
| - <value>new<\/value>\r\n |
| + <name>user.name<\/name>\r\n |
| + <value>purushah<\/value>\r\n <\/property>\r\n <property>\r\n |
| <name>queueName<\/name>\r\n******************************************\n" |
| } |
| } |
| </verbatim> |
| |
| |
| ---++++ Job Information |
| |
| An HTTP GET request retrieves the job information. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/job/job-3?show=info&timezone=GMT |
| </verbatim> |
| |
| *Response for a workflow job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "0-200905191240-oozie-W", |
| appName: "indexer-workflow", |
| appPath: "hdfs://user/bansalm/indexer.wf", |
| externalId: "0-200905191230-oozie-pepe", |
| user: "bansalm", |
| status: "RUNNING", |
| conf: "<configuration> ... </configuration>", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: null, |
| run: 0, |
| actions: [ |
| { |
| id: "0-200905191240-oozie-W@indexer", |
| name: "indexer", |
| type: "map-reduce", |
| conf: "<configuration> ...</configuration>", |
| startTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| status: "OK", |
| externalId: "job-123-200903101010", |
| externalStatus: "SUCCEEDED", |
| trackerUri: "foo:8021", |
| consoleUrl: "http://foo:50040/jobdetailshistory.jsp?jobId=...", |
| transition: "reporter", |
| data: null, |
| errorCode: null, |
| errorMessage: null, |
| retries: 0 |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| *Response for a coordinator job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| id: "0-200905191240-oozie-C", |
| appName: "indexer-Coord", |
| appPath: "hdfs://user/bansalm/myapp/logprocessor-coord.xml", |
| externalId: "0-200905191230-oozie-pepe", |
| user: "bansalm", |
| status: "RUNNING", |
| conf: "<configuration> ... </configuration>", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| frequency: "${days(1)}", |
| actions: [ |
| { |
| id: "0000010-130426111815091-oozie-bansalm-C@1", |
| createdTime: "Fri, 26 Apr 2013 20:57:07 GMT", |
| externalId: "", |
| missingDependencies: "", |
| runConf: null, |
| createdConf: null, |
| consoleUrl: null, |
| nominalTime: "Fri, 01 Jan 2010 01:00:00 GMT", |
| ... |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| *Response for a bundle job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| jobType: "bundle", |
| id: "0-200905191240-oozie-B", |
| appName: "new-bundle", |
| appPath: "hdfs://user/bansalm/myapp/logprocessor-bundle.xml", |
| externalId: "0-200905191230-oozie-pepe", |
| user: "bansalm", |
| status: "RUNNING", |
| conf: "<configuration> ... </configuration>", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| bundleCoordJobs: [ |
| { |
| status: "RUNNING", |
| concurrency: 1, |
| conf: "<configuration> ... </configuration>", |
| executionPolicy: "FIFO", |
| toString: "Coordinator application id[0000010-130426111815091-oozie-bansalm-C] status[RUNNING]", |
| coordJobName: "coord-1", |
| endTime: "Fri, 01 Jan 2010 03:00:00 GMT", |
| ... |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| *Getting all the Workflows corresponding to a Coordinator Action:* |
| |
| A coordinator action starts a different workflow for its original run and for each subsequent rerun. |
| Listing those workflow IDs helps you keep track of an action's runs and |
| find the right workflow job logs when debugging. Along with the IDs, the response lists each workflow's status |
| and its start and end times for quick reference. |
| |
| Both the v1 and v2 APIs are supported; v0 is not. |
| |
| <verbatim> |
| GET /oozie/v2/job/0000001-111219170928042-oozie-joe-C@1?show=allruns |
| </verbatim> |
| |
| *Response* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| {"workflows":[ |
| { |
| "startTime":"Mon, 24 Mar 2014 23:40:53 GMT", |
| "id":"0000001-140324163709596-oozie-chit-W", |
| "status":"SUCCEEDED", |
| "endTime":"Mon, 24 Mar 2014 23:40:54 GMT" |
| }, |
| { |
| "startTime":"Mon, 24 Mar 2014 23:44:01 GMT", |
| "id":"0000000-140324164318985-oozie-chit-W", |
| "status":"SUCCEEDED", |
| "endTime":"Mon, 24 Mar 2014 23:44:01 GMT" |
| }, |
| { |
| "startTime":"Mon, 24 Mar 2014 23:44:24 GMT", |
| "id":"0000001-140324164318985-oozie-chit-W", |
| "status":"SUCCEEDED", |
| "endTime":"Mon, 24 Mar 2014 23:44:24 GMT" |
| } |
| ]} |
| </verbatim> |
| |
| An alternate API returns the same output. With this API, pass the coordinator *job* ID |
| followed by the query parameters <code>type=action</code> and <code>scope=<action-number></code>. Only a single action number can be passed at a time. |
| |
| <verbatim> |
| GET /oozie/v2/job/0000001-111219170928042-oozie-joe-C?show=allruns&type=action&scope=1 |
| </verbatim> |
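| |
| The two request forms above are equivalent. The following Python sketch builds both URLs and reduces the JSON response to (workflow id, status) pairs; the function names are illustrative assumptions, and the sample body is an abbreviated version of the response shown above: |
| |
```python
import json


def allruns_urls(oozie_url, coord_job_id, action_number):
    """Return the two equivalent request URLs for one coordinator action."""
    by_action_id = f"{oozie_url}/v2/job/{coord_job_id}@{action_number}?show=allruns"
    by_job_id = (
        f"{oozie_url}/v2/job/{coord_job_id}"
        f"?show=allruns&type=action&scope={action_number}"
    )
    return by_action_id, by_job_id


def summarize_runs(body):
    """Turn the JSON response body into (workflow id, status) pairs."""
    return [(wf["id"], wf["status"]) for wf in json.loads(body)["workflows"]]


# Abbreviated sample taken from the response shown above.
sample = (
    '{"workflows":[{"startTime":"Mon, 24 Mar 2014 23:40:53 GMT",'
    '"id":"0000001-140324163709596-oozie-chit-W","status":"SUCCEEDED",'
    '"endTime":"Mon, 24 Mar 2014 23:40:54 GMT"}]}'
)
print(allruns_urls("/oozie", "0000001-111219170928042-oozie-joe-C", 1))
print(summarize_runs(sample))
```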
| |
| *Retrieve a subset of actions* |
| |
| The query parameters =offset= and =len= can be specified with a workflow job to retrieve specific actions. The default is offset=0, len=1000. |
| <verbatim> |
| GET /oozie/v1/job/0000002-130507145349661-oozie-joe-W?show=info&offset=5&len=10 |
| </verbatim> |
| The query parameters =offset=, =len=, and =filter= can be specified with a coordinator job to retrieve specific actions. |
| The query parameter =order= with the value "desc" retrieves the most recently materialized coordinator actions instead of the actions starting from @1. |
| The query parameter =filter= retrieves only the coordinator actions that match a specific status. |
| The default is offset=0, len=0 for v2/job (that is, no coordinator actions are returned) and offset=0, len=1000 for v1/job and v0/job. |
| So if you need actions to be returned by the v2 API, you must specify the =len= parameter. |
| The default =order= is "asc". |
| <verbatim> |
| GET /oozie/v1/job/0000001-111219170928042-oozie-joe-C?show=info&offset=5&len=10&filter=status%3DKILLED&order=desc |
| </verbatim> |
| Note that the filter is URL encoded; its decoded value is <code>status=KILLED</code>. |
| <verbatim> |
| GET /oozie/v1/job/0000001-111219170928042-oozie-joe-C?show=info&filter=status%21%3DSUCCEEDED&order=desc |
| </verbatim> |
| This retrieves all coordinator actions whose status is not SUCCEEDED, which is useful for debugging. The decoded filter value is <code>status!=SUCCEEDED</code>. |
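| |
| The percent-encoding of the filter in the requests above can be produced with Python's standard =urllib.parse= module. This is only a sketch: the helper name =coord_info_url= and its keyword arguments are illustrative assumptions, not part of any Oozie client library. |
| |
```python
from urllib.parse import quote, urlencode


def coord_info_url(oozie_url, job_id, version="v1", offset=None, length=None,
                   action_filter=None, order=None):
    """Build a show=info URL that selects a subset of coordinator actions."""
    params = [("show", "info")]
    if offset is not None:
        params.append(("offset", offset))
    if length is not None:
        params.append(("len", length))
    if action_filter is not None:
        params.append(("filter", action_filter))
    if order is not None:
        params.append(("order", order))
    # quote_via=quote percent-encodes '=' and '!' inside the filter value.
    return f"{oozie_url}/{version}/job/{job_id}?" + urlencode(params, quote_via=quote)


job = "0000001-111219170928042-oozie-joe-C"
print(coord_info_url("/oozie", job, offset=5, length=10,
                     action_filter="status=KILLED", order="desc"))
print(coord_info_url("/oozie", job, action_filter="status!=SUCCEEDED", order="desc"))
```
| |
| With the v2 API, remember to pass a length explicitly, since the default =len= is 0 there. |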
| |
| ---++++ Job Application Definition |
| |
| An HTTP GET request retrieves the workflow, coordinator, or bundle job definition file. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/job/job-3?show=definition |
| </verbatim> |
| |
| *Response for a workflow job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <workflow-app name='xyz-app' xmlns="uri:oozie:workflow:0.1"> |
| <start to='firstaction' /> |
| ... |
| <end name='end' /> |
| </workflow-app> |
| </verbatim> |
| |
| *Response for a coordinator job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <coordinator-app name='abc-app' xmlns="uri:oozie:coordinator:0.1" frequency="${days(1)}" |
| start="2009-01-01T00:00Z" end="2009-12-31T00:00Z" timezone="America/Los_Angeles"> |
| <datasets> |
| ... |
| </datasets> |
| ... |
| </coordinator-app> |
| </verbatim> |
| |
| *Response for a bundle job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8"?> |
| <bundle-app name='abc-app' xmlns="uri:oozie:coordinator:0.1" |
| start="2009-01-01T00:00Z" end="2009-12-31T00:00Z"> |
| <datasets> |
| ... |
| </datasets> |
| ... |
| </bundle-app> |
| </verbatim> |
| |
| ---++++ Job Log |
| |
| An HTTP GET request retrieves the job log. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/job/job-3?show=log |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: text/plain;charset=UTF-8 |
| . |
| ... |
| 23:21:31,272 TRACE oozieapp:526 - USER[bansalm] GROUP[other] TOKEN[-] APP[test-wf] JOB[0-20090518232130-oozie-tucu] ACTION[mr-1] Start |
| 23:21:31,305 TRACE oozieapp:526 - USER[bansalm] GROUP[other] TOKEN[-] APP[test-wf] JOB[0-20090518232130-oozie-tucu] ACTION[mr-1] End |
| ... |
| </verbatim> |
| |
| ---++++ Job Error Log |
| |
| An HTTP GET request retrieves the job error log. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/job/0000000-150121110331712-oozie-puru-B?show=errorlog |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: text/plain;charset=UTF-8 |
| 2015-01-21 11:33:29,090 WARN CoordSubmitXCommand:523 - SERVER[-] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-150121110331712-oozie-puru-B] ACTION[] SAXException : |
| org.xml.sax.SAXParseException; lineNumber: 20; columnNumber: 22; cvc-complex-type.2.4.a: Invalid content was found starting with element 'concurrency'. One of '{"uri:oozie:coordinator:0.2":controls, "uri:oozie:coordinator:0.2":datasets, "uri:oozie:coordinator:0.2":input-events, "uri:oozie:coordinator:0.2":output-events, "uri:oozie:coordinator:0.2":action}' is expected. |
| at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source) |
| at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source) |
| ... |
| </verbatim> |
| |
| |
| ---++++ Job Audit Log |
| |
| An HTTP GET request retrieves the job audit log. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/job/0000000-150322000230582-oozie-puru-C?show=auditlog |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: text/plain;charset=UTF-8 |
| 2015-03-22 00:04:35,494 INFO oozieaudit:520 - IP [-], USER [purushah], GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], OPERATION [start], PARAMETER [null], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null] |
| 2015-03-22 00:05:13,823 INFO oozieaudit:520 - IP [-], USER [purushah], GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], OPERATION [suspend], PARAMETER [0000000-150322000230582-oozie-puru-C], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null] |
| 2015-03-22 00:06:59,561 INFO oozieaudit:520 - IP [-], USER [purushah], GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], OPERATION [suspend], PARAMETER [0000000-150322000230582-oozie-puru-C], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null] |
| 2015-03-22 23:22:20,012 INFO oozieaudit:520 - IP [-], USER [purushah], GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], OPERATION [suspend], PARAMETER [0000000-150322000230582-oozie-puru-C], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null] |
| 2015-03-22 23:28:48,218 INFO oozieaudit:520 - IP [-], USER [purushah], GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], OPERATION [resume], PARAMETER [0000000-150322000230582-oozie-puru-C], STATUS [SUCCESS], HTTPCODE [200], ERRORCODE [null], ERRORMESSAGE [null] |
| </verbatim> |
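| |
| Each audit log line above carries a list of =NAME [value]= fields. A small Python sketch for turning one line into a dictionary follows; it assumes the log format matches the sample above and that field values never contain a closing bracket, and the name =parse_audit_line= is hypothetical: |
| |
```python
import re

# Matches pairs such as "USER [purushah]" or "HTTPCODE [200]".
AUDIT_FIELD = re.compile(r"([A-Z]+) \[([^\]]*)\]")


def parse_audit_line(line):
    """Return the NAME [value] fields of one audit log line as a dict."""
    return dict(AUDIT_FIELD.findall(line))


# First line of the sample response above.
line = (
    "2015-03-22 00:04:35,494 INFO oozieaudit:520 - IP [-], USER [purushah], "
    "GROUP [null], APP [-], JOBID [0000000-150322000230582-oozie-puru-C], "
    "OPERATION [start], PARAMETER [null], STATUS [SUCCESS], HTTPCODE [200], "
    "ERRORCODE [null], ERRORMESSAGE [null]"
)
fields = parse_audit_line(line)
print(fields["USER"], fields["OPERATION"], fields["HTTPCODE"])
```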
| |
| |
| ---++++ Filtering the server logs with logfilter options |
| Users can provide multiple options to filter logs with the =logfilter= parameter, in the form <code>logfilter=opt1=val1;opt2=val1;opt3=val1</code>. This fetches only the logs of interest, which is faster because fetching Oozie server logs is slow due to the overhead of pattern matching. |
| |
| <verbatim> |
| GET /oozie/v1/job/0000003-140319184715726-oozie-puru-C?show=log&logfilter=limit=3;loglevel=WARN |
| </verbatim> |
| |
| |
| Refer to [[DG_CommandLineTool#Filtering_the_server_logs_with_logfilter_options][Filtering the server logs with logfilter options]] for more details. |
| |
| |
| ---++++ Job graph |
| |
| An =HTTP GET= request returns the image of the workflow DAG (rendered as a PNG image). |
| * The nodes that are being executed are painted yellow |
| * The nodes that have successfully executed are painted green |
| * The nodes that have failed execution are painted red |
| * The nodes that are yet to be executed are painted gray |
| * An arc painted green marks the successful path taken so far |
| * An arc painted red marks the failure of the node and highlights the _error_ action |
| * An arc painted gray marks a path not taken yet |
| |
| *Request:* |
| <verbatim> |
| GET /oozie/v1/job/job-3?show=graph[&show-kill=true] |
| </verbatim> |
| |
| *Response:* |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: image/png |
| Content-Length: {image_size_in_bytes} |
| |
| {image_bits} |
| </verbatim> |
| |
| The optional =show-kill= parameter shows the =kill= node in the graph. Valid values for this parameter are =1=, =yes=, and =true=. This parameter has no effect when the workflow fails and the failing node leads to the =kill= node; in that case the =kill= node is always shown. |
| |
| The node labels are the node names provided in the workflow XML. |
| |
| This API returns =HTTP 400= when run on a resource other than a workflow, that is, a bundle or a coordinator. |
| |
| ---++++ Job Status |
| |
| An =HTTP GET= request returns the current status (e.g. =SUCCEEDED=, =KILLED=) of a given job. If you are only interested |
| in the status and don't need the rest of the information that the =info= query provides, use this call, |
| as it is more efficient. |
| |
| *Request* |
| <verbatim> |
| GET /oozie/v2/job/0000000-140908152307821-oozie-rkan-C?show=status |
| </verbatim> |
| |
| *Response* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| "status" : "SUCCEEDED" |
| } |
| </verbatim> |
| |
| It accepts any valid Workflow Job ID, Coordinator Job ID, Coordinator Action ID, or Bundle Job ID. |
| |
| ---++++ Changing job SLA definition and alerting |
| An =HTTP PUT= request changes the SLA alert status or the SLA definition of a job. |
| |
| * All SLA commands take an =action-list= or =date= parameter. |
| * =date=: a comma-separated list of date ranges. Each date range element is specified with dates separated by =::= |
| * =action-list=: a comma-separated list of action numbers and action ranges. Each action range is specified with two action numbers separated by =-= |
| * For bundle jobs, an additional =coordinators= (coord_name/id) parameter can be passed. |
| * The SLA change command needs the extra parameter =value= to specify the new SLA definition. |
| |
| |
| * Changing SLA definition |
| SLA definition of should-start, should-end, nominal-time and max-duration can be changed. |
| |
| <verbatim> |
| PUT /oozie/v2/job/0000003-140319184715726-oozie-puru-C?action=sla-change&value=<key>=<value>;...;<key>=<value> |
| </verbatim> |
| |
| * Disabling SLA alert |
| |
| <verbatim> |
| PUT /oozie/v2/job/0000003-140319184715726-oozie-puru-C?action=sla-disable&action-list=3-4 |
| </verbatim> |
| This disables SLA alerts for actions 3 and 4. |
| |
| <verbatim> |
| PUT /oozie/v1/job/0000003-140319184715726-oozie-puru-C?action=sla-disable&date=2009-02-01T00:10Z::2009-03-01T00:10Z |
| </verbatim> |
| This disables SLA alerts for actions whose nominal time is between 2009-02-01T00:10Z and 2009-03-01T00:10Z (inclusive). |
| |
| |
| <verbatim> |
| PUT /oozie/v1/job/0000004-140319184715726-oozie-puru-B?action=sla-disable&date=2009-02-01T00:10Z::2009-03-01T00:10Z&coordinators=abc |
| </verbatim> |
| For bundle jobs, an additional =coordinators= parameter (a comma-separated list of coord_name/id) can be passed. |
| |
| * Enabling SLA alert |
| |
| <verbatim> |
| PUT /oozie/v2/job/0000003-140319184715726-oozie-puru-C?action=sla-enable&action-list=1,14,17-20 |
| </verbatim> |
| This enables SLA alerts for actions 1, 14, 17, 18, 19, and 20. |
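| |
| To illustrate how an =action-list= value such as <code>1,14,17-20</code> maps to individual action numbers, here is a short Python sketch. It mirrors the expansion described above on the client side only; the function name =expand_action_list= is hypothetical, and Oozie performs its own parsing on the server. |
| |
```python
def expand_action_list(action_list):
    """Expand an action-list such as '1,14,17-20' into action numbers."""
    numbers = []
    for part in action_list.split(","):
        if "-" in part:
            # A range of two action numbers, both ends included.
            first, last = part.split("-")
            numbers.extend(range(int(first), int(last) + 1))
        else:
            numbers.append(int(part))
    return numbers


print(expand_action_list("1,14,17-20"))
print(expand_action_list("3-4"))
```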
| |
| ---++++ Jobs Information |
| |
| An HTTP GET request retrieves workflow, coordinator, and bundle job information. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/jobs?filter=user%3Dbansalm&offset=1&len=50&timezone=GMT |
| </verbatim> |
| |
| Note that the filter is URL encoded; its decoded value is <code>user=bansalm</code>. |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| offset: 1, |
| len: 50, |
| total: 1002, |
| jobs: [ |
| { |
| jobType: "workflow", |
| id: "0-200905191240-oozie-W", |
| appName: "indexer-workflow", |
| appPath: "hdfs://user/tucu/indexer-wf", |
| user: "bansalm", |
| group: "other", |
| status: "RUNNING", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: null, |
| info: "run=0", |
| }, |
| { |
| jobType: "coordinator", |
| id: "0-200905191240-oozie-C", |
| appName: "logprocessor-coord", |
| appPath: "hdfs://user/bansalm/myapp/logprocessor-coord.xml", |
| user: "bansalm", |
| group: "other", |
| status: "RUNNING", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| info: "nextAction=5", |
| }, |
| { |
| jobType: "bundle", |
| id: "0-200905191240-oozie-B", |
| appName: "logprocessor-bundle", |
| appPath: "hdfs://user/bansalm/myapp/logprocessor-bundle.xml", |
| user: "bansalm", |
| group: "other", |
| status: "RUNNING", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| No action information is returned when querying for multiple jobs. |
| |
| The syntax for the filter is <verbatim>[NAME=VALUE][;NAME=VALUE]*</verbatim> |
| |
| Valid filter names are: |
| |
| * name: the application name from the workflow/coordinator/bundle definition |
| * user: the user who submitted the job |
| * group: the group for the job (support for the group filter was discontinued in version 3.2.0; see OOZIE-228). |
| * id: the id of the workflow/coordinator/bundle job |
| * status: the status of the job |
| * startCreatedTime: the start of the window on the workflow job's created time |
| * endCreatedTime: the end of the above window |
| * sortby: order the results. Supported values for =sortby= are: =createdTime= and =lastModifiedTime= |
| |
| The query will do an AND among all the filter names. |
| |
| The query will do an OR among all the filter values for the same name. Multiple values must be specified as different |
| name value pairs. |
| |
| Additionally, the =offset= and =len= parameters can be used for pagination. The =offset= parameter is 1-based. |
| |
| Moreover, the =jobtype= parameter can be used to select the type of job to look for. |
| The valid values of job type are: =wf=, =coordinator= or =bundle=. |
| |
| startCreatedTime and endCreatedTime should be specified either in *ISO8601 (UTC)* format *(yyyy-MM-dd'T'HH:mm'Z')* or |
| as an offset value in days, hours, or minutes from the current time. For example, -2d means (current time - 2 days), |
| -3h means (current time - 3 hours), and -5m means (current time - 5 minutes). |
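| |
| The filter syntax and the AND/OR rules above can be sketched in Python as follows. The helper names are hypothetical; repeating a name (here =status=) is how multiple values for the same filter are expressed, and the whole filter string is percent-encoded as in the request example. |
| |
```python
from urllib.parse import quote


def build_jobs_filter(pairs):
    """Join (name, value) pairs into the [NAME=VALUE][;NAME=VALUE]* syntax."""
    return ";".join(f"{name}={value}" for name, value in pairs)


def jobs_url(oozie_url, pairs, offset=1, length=50, jobtype=None):
    """Build a jobs query URL with a URL-encoded filter."""
    encoded = quote(build_jobs_filter(pairs), safe="")
    query = f"filter={encoded}&offset={offset}&len={length}"
    if jobtype is not None:
        query += f"&jobtype={jobtype}"
    return f"{oozie_url}/v1/jobs?{query}"


# Jobs of user bansalm that are RUNNING or SUSPENDED, created in the last 2 days.
pairs = [
    ("user", "bansalm"),
    ("status", "RUNNING"),
    ("status", "SUSPENDED"),
    ("startCreatedTime", "-2d"),
]
print(build_jobs_filter(pairs))
print(jobs_url("/oozie", pairs, jobtype="wf"))
```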
| |
| ---++++ Bulk modify jobs |
| |
| An HTTP PUT request can kill, suspend, or resume all jobs that satisfy the URL-encoded parameters. |
| |
| *Request:* |
| |
| <verbatim> |
| PUT /oozie/v1/jobs?action=kill&filter=name%3Dcron-coord&offset=1&len=50&jobtype=coordinator |
| </verbatim> |
| |
| This request will kill all the coordinators with name=cron-coord, up to 50 of them. |
| |
| Note that the filter is URL encoded; its decoded value is <code>name=cron-coord</code>. |
| The syntax for the filter is <verbatim>[NAME=VALUE][;NAME=VALUE]*</verbatim> |
| |
| Valid filter names are: |
| |
| * name: the application name from the workflow/coordinator/bundle definition |
| * user: the user that submitted the job |
| * group: the group for the job |
| * status: the status of the job |
| |
| The query will do an AND among all the filter names. |
| |
| The query will do an OR among all the filter values for the same name. Multiple values must be specified as different |
| name value pairs. |
| |
| Additionally, the =offset= and =len= parameters can be used for pagination. The =offset= parameter is 1-based. |
| |
| Moreover, the =jobtype= parameter can be used to select the type of job to look for. |
| The valid values of job type are: =wf=, =coordinator= or =bundle=. |
| |
| *Response:* |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| offset: 1, |
| len: 50, |
| total: 2, |
| jobs: [ |
| { |
| jobType: "coordinator", |
| id: "0-200905191240-oozie-C", |
| appName: "cron-coord", |
| appPath: "hdfs://user/bansalm/app/cron-coord.xml", |
| user: "bansalm", |
| group: "other", |
| status: "KILLED", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| info: "nextAction=5", |
| }, |
| { |
| jobType: "coordinator", |
| id: "0-200905191240-oozie-C", |
| appName: "cron-coord", |
| appPath: "hdfs://user/bansalm/myapp/cron-coord.xml", |
| user: "bansalm", |
| group: "other", |
| status: "KILLED", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 31 Dec 2009 00:00:00 GMT", |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| <verbatim> |
| PUT /oozie/v1/jobs?action=suspend&filter=status%3Drunning&offset=1&len=50&jobtype=wf |
| </verbatim> |
| |
| This request will suspend all the workflows with status=running, up to 50 of them. |
| Note that the filter is URL encoded; its decoded value is <code>status=running</code>. |
| |
| *Response:* |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| offset: 1, |
| len: 50, |
| total: 50, |
| jobs: [ |
| { |
| jobType: "workflow", |
| id: "0-200905191240-oozie-W", |
| appName: "indexer-workflow", |
| appPath: "hdfs://user/tucu/indexer-wf", |
| user: "bansalm", |
| group: "other", |
| status: "SUSPENDED", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: null, |
| info: "run=0", |
| }, |
| { |
| jobType: "workflow", |
| id: "0-200905191240-oozie-W", |
| appName: "logprocessor-wf", |
| appPath: "hdfs://user/bansalm/myapp/workflow.xml", |
| user: "bansalm", |
| group: "other", |
| status: "SUSPENDED", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: null, |
| info: "run=0", |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
| |
| ---++++ Jobs information using Bulk API |
| |
| An HTTP GET request retrieves a bulk response for all actions corresponding to a particular bundle that satisfy user-specified criteria. |
| This is useful for monitoring, where a user can find out the status of downstream jobs with a single bulk request. |
| The criteria are used for filtering the actions returned. Valid options (_case insensitive_) for these request criteria are: |
| |
| * *bundle*: the application name from the bundle definition |
| * *coordinators*: the application name(s) from the coordinator definition. |
| * *actionStatus*: the status of coordinator action (Valid values are WAITING, READY, SUBMITTED, RUNNING, SUSPENDED, TIMEDOUT, SUCCEEDED, KILLED, FAILED) |
| * *startCreatedTime*: the start of the window you want to look at, of the actions' created time |
| * *endCreatedTime*: the end of above window |
| * *startScheduledTime*: the start of the window you want to look at, of the actions' scheduled i.e. nominal time. |
| * *endScheduledTime*: the end of above window |
| |
| Specifying 'bundle' is REQUIRED. All the rest are OPTIONAL, but omitting them might return thousands of results depending on the size of your job (pagination then comes into play). |
| |
| If no 'actionStatus' values are provided, KILLED,FAILED is used by default. |
| For example, if the query string is only "bundle=MyBundle", the response will have all actions (across all coordinators) whose status is KILLED or FAILED. |
| |
| The query will do an AND among all the filter names, and OR among each filter name's values. |
| |
| The syntax for the request criteria is <verbatim>[NAME=VALUE][;NAME=VALUE]*</verbatim> |
| |
| For 'coordinators' and 'actionStatus', multiple values can be passed as a comma-separated list. |
| *Note*: The query will do an OR among them, so there is no need to repeat the criteria name. |
| |
| All the time values should be specified in *ISO8601 (UTC)* format i.e. *yyyy-MM-dd'T'HH:mm'Z'* |
| |
| Additionally, the =offset= and =len= parameters can be used as usual for pagination. The =offset= parameter is 1-based. |
| |
| If you specify a coordinator in the list that does not exist, no error is thrown; the response will simply be empty or pertain to the other valid coordinators. |
| However, if the bundle name provided does not exist, an error is thrown. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v1/jobs?bulk=bundle%3Dmy-bundle-app;coordinators%3Dmy-coord-1,my-coord-5;actionStatus%3DKILLED&offset=1&len=50 |
| </verbatim> |
| |
| Note that the bulk criteria are URL encoded; the decoded value is <code>bundle=my-bundle-app;coordinators=my-coord-1,my-coord-5;actionStatus=KILLED</code>. When typing the URL in a browser, you can type the decoded value itself, i.e. using '='. |
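| |
| As a sketch of how the =bulk= parameter in the request above can be assembled, the following Python helper (a hypothetical name, not part of Oozie) joins the criteria with =;=, joins multiple values with commas, and percent-encodes only the '=' characters, which reproduces the form used in the example: |
| |
```python
from urllib.parse import quote


def build_bulk_query(bundle, coordinators=(), action_status=()):
    """Build the 'bulk=...' query parameter for the jobs end-point."""
    criteria = [f"bundle={bundle}"]  # 'bundle' is the only required criterion
    if coordinators:
        criteria.append("coordinators=" + ",".join(coordinators))
    if action_status:
        criteria.append("actionStatus=" + ",".join(action_status))
    # Keep ';' and ',' literal so the result matches the request example.
    return "bulk=" + quote(";".join(criteria), safe=";,")


query = build_bulk_query("my-bundle-app", ["my-coord-1", "my-coord-5"], ["KILLED"])
print(f"GET /oozie/v1/jobs?{query}&offset=1&len=50")
```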
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| offset: 1, |
| len: 50, |
| total: 1002, |
| bulkresponses: [ |
| { |
| bulkbundle: |
| { |
| bundleJobName: "my-bundle-app", |
| bundleJobId: "0-200905191240-oozie-B", |
| status: "SUSPENDED", |
| }, |
| bulkcoord: |
| { |
| coordJobName: "my-coord-1", |
| status: "SUSPENDED", |
| }, |
| bulkaction: |
| { |
| id: "0-200905191240-oozie-C@21", |
| coordJobId: "0-200905191240-oozie-C", |
| actionNumber: 21, |
| externalId: "job_00076_0009", |
| status: "KILLED", |
| externalStatus: "FAILED", |
| errorCode: "E0902", |
| errorMessage: "Input file corrupt", |
| createdTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| nominalTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| missingDependencies: "hdfs://nn:port/user/joe/file.txt" |
| }, |
| }, |
| { |
| bulkbundle: |
| { |
| bundleJobName: "my-bundle-app", |
| bundleJobId: "0-200905191240-oozie-B", |
| status: "SUSPENDED", |
| }, |
| bulkcoord: |
| { |
| coordJobName: "my-coord-5", |
| status: "SUSPENDED", |
| }, |
| bulkaction: |
| { |
| id: "0-200905191245-oozie-C@114", |
| coordJobId: "0-200905191245-oozie-C", |
| actionNumber: 114, |
| externalId: "job_00076_0265", |
| status: "KILLED", |
| externalStatus: "KILLED", |
| errorCode: "E0603", |
| errorMessage: "SQL error in operation ...", |
| createdTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| nominalTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| missingDependencies: |
| } |
| } |
| ... |
| ] |
| } |
| </verbatim> |
| |
| ---++ Oozie Web Services API, V2 (Workflow, Coordinator, And Bundle) |
| |
| The Oozie Web Services API is an HTTP REST JSON API. |
| |
| All responses are in =UTF-8=. |
| |
| Assuming Oozie is running at =OOZIE_URL=, the following web services end points are supported: |
| |
| * <OOZIE_URL>/versions |
| * <OOZIE_URL>/v2/admin |
| * <OOZIE_URL>/v2/job |
| * <OOZIE_URL>/v2/jobs |
| |
| *Changes in v2 job API:* |
| |
| The JSON format of the Job Information API (*/job) differs for the map-reduce action; |
| other actions are unchanged. |
| In v1, externalId and consoleUrl point to the spawned child job ID, and externalChildIDs is null for a map-reduce action. |
| In v2, externalId and consoleUrl point to the launcher job ID, and externalChildIDs holds the spawned child job ID for a map-reduce action. |
| |
| v2 supports retrieving the JMS topic on which job notifications are sent. |
| |
| *REST API URL:* |
| |
| <verbatim> |
| GET http://localhost:11000/oozie/v2/job/0000002-130507145349661-oozie-vira-W?show=jmstopic |
| </verbatim> |
| |
| *Changes in v2 admin API:* |
| |
| v2 adds support for retrieving JMS connection information related to JMS notifications. |
| |
| *REST API URL:* |
| |
| <verbatim> |
| GET http://localhost:11000/oozie/v2/admin/jmsinfo |
| </verbatim> |
| |
| v2/jobs remains the same as v1/jobs. |
| |
| ---+++ Job and Jobs End-Points |
| |
| ---++++ Job Information |
| |
| An HTTP GET request retrieves the job information. |
| |
| *Request:* |
| |
| <verbatim> |
| GET /oozie/v2/job/job-3?show=info&timezone=GMT |
| </verbatim> |
| |
| *Response for a workflow job:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| jobType: "workflow", |
| id: "0-200905191240-oozie-W", |
| appName: "indexer-workflow", |
| appPath: "hdfs://user/bansalm/indexer.wf", |
| externalId: "0-200905191230-oozie-pepe", |
| user: "bansalm", |
| group: "other", |
| status: "RUNNING", |
| conf: "<configuration> ... </configuration>", |
| createdTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| startTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| endTime: null, |
| run: 0, |
| actions: [ |
| { |
| id: "0-200905191240-oozie-W@indexer", |
| name: "indexer", |
| type: "map-reduce", |
| conf: "<configuration> ...</configuration>", |
| startTime: "Thu, 01 Jan 2009 00:00:00 GMT", |
| endTime: "Fri, 02 Jan 2009 00:00:00 GMT", |
| status: "OK", |
| externalId: "job-123-200903101010", |
| externalStatus: "SUCCEEDED", |
| trackerUri: "foo:8021", |
| consoleUrl: "http://foo:50040/jobdetailshistory.jsp?jobId=job-123-200903101010", |
| transition: "reporter", |
| data: null, |
| stats: null, |
| externalChildIDs: "job-123-200903101011", |
| errorCode: null, |
| errorMessage: null, |
| retries: 0 |
| }, |
| ... |
| ] |
| } |
| </verbatim> |
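| |
| Because the location of the spawned child job ID of a map-reduce action differs between v1 and v2 (see "Changes in v2 job API" above), client code that handles both versions has to branch on the API version. A minimal Python sketch follows; the function name is hypothetical, and treating =externalChildIDs= as a comma-separated list when several child jobs exist is an assumption, not something stated in this document. |
| |
```python
def spawned_child_ids(action, api_version):
    """Return the spawned child job IDs of a map-reduce action.

    `action` is one decoded entry of the `actions` array of a job
    information response; `api_version` is "v1" or "v2".
    """
    if api_version == "v1":
        # In v1, externalId itself is the spawned child job ID.
        return [action["externalId"]] if action.get("externalId") else []
    # In v2, externalId is the launcher job; children are listed separately.
    # Assumption: multiple IDs, if present, are comma-separated.
    child_ids = action.get("externalChildIDs") or ""
    return [job_id for job_id in child_ids.split(",") if job_id]


v2_action = {
    "type": "map-reduce",
    "externalId": "job-123-200903101010",
    "externalChildIDs": "job-123-200903101011",
}
print(spawned_child_ids(v2_action, "v2"))
```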
| |
| ---++++ Managing a Job |
| ---+++++ Ignore a Coordinator Job or Action |
| |
| An ignore request is done with an HTTP PUT request with the =ignore= action. |
| |
| The =type= parameter supports =action= only. |
| The =scope= parameter can contain the coordinator action id(s) to be ignored. |
| Multiple action ids can be passed to the =scope= parameter. |
| |
| *Request:* |
| |
| Ignore a coordinator job |
| <verbatim> |
| PUT /oozie/v2/job/job-3?action=ignore |
| </verbatim> |
| |
| Ignore coordinator actions |
| <verbatim> |
| PUT /oozie/v2/job/job-3?action=ignore&type=action&scope=3-4 |
| </verbatim> |
| |
| ---+++ Validate End-Point |
| |
| This endpoint validates a workflow, coordinator, or bundle XML file. |
| |
| ---++++ Validate a local file |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v2/validate?file=/home/test/myApp/workflow.xml |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| <?xml version="1.0" encoding="UTF-8" standalone="yes"?> |
| <workflow-app xmlns="uri:oozie:workflow:0.3" name="test"> |
| <start to="shell"/> |
| <action name="shell"> |
| <shell xmlns="uri:oozie:shell-action:0.3"> |
| <job-tracker>${jobTracker}</job-tracker> |
| <name-node>${nameNode}</name-node> |
| <exec>script.sh</exec> |
| <argument></argument> |
| <file>script.sh</file> |
| <capture-output/> |
| </shell> |
| <ok to="end"/> |
| <error to="fail"/> |
| </action> |
| <kill name="fail"> |
| <message>failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> |
| </kill> |
| <end name="end"/> |
| </workflow-app> |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| validate: "Valid workflow-app" |
| } |
| </verbatim> |
| |
| ---++++ Validate a file in HDFS |
| |
| You can validate a workflow, coordinator, or bundle XML file in HDFS. The XML file must already exist in HDFS. |
| |
| *Request:* |
| |
| <verbatim> |
| POST /oozie/v2/validate?file=hdfs://localhost:8020/user/test/myApp/workflow.xml |
| Content-Type: application/xml;charset=UTF-8 |
| . |
| </verbatim> |
| |
| *Response:* |
| |
| <verbatim> |
| HTTP/1.1 200 OK |
| Content-Type: application/json;charset=UTF-8 |
| . |
| { |
| validate: "Valid workflow-app" |
| } |
| </verbatim> |
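| |
| The two validate requests above differ only in whether the XML is sent in the request body (local file) or already resides in HDFS (empty body). The following Python sketch prepares such a request with the standard =urllib.request= module without sending it; the helper name and the server address are assumptions. |
| |
```python
from urllib.parse import quote
from urllib.request import Request


def build_validate_request(oozie_url, file_path, xml_body=b""):
    """Prepare a POST request for the v2 validate end-point.

    For a local file, pass its XML content as `xml_body`; for a file that
    already exists in HDFS, leave the body empty.
    """
    url = f"{oozie_url}/v2/validate?file={quote(file_path, safe='/:')}"
    return Request(
        url,
        data=xml_body,
        method="POST",
        headers={"Content-Type": "application/xml;charset=UTF-8"},
    )


# Assumed server address; urllib.request.urlopen(req) would send the request.
req = build_validate_request(
    "http://localhost:11000/oozie",
    "hdfs://localhost:8020/user/test/myApp/workflow.xml",
)
print(req.get_method(), req.full_url)
```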
| |
| </noautolink> |
| |
| |
| |
| |
| |
| |