---
layout: page
title: Jobs Resource
---

Exposes operations at the job scope (as opposed to the cluster, container, or task scope). The initial implementation includes the ability to list all jobs, get the status of a particular job, and start or stop an individual job.

#API

The following sections provide general information about the response structure and detailed descriptions of each of the requests.

##Response Structure

All responses will contain either a job status or an error message.

###Job Status

Job status will be of the form:

{% highlight json %}
{
    "status":"STOPPED",
    "statusDetail":"KILLED",
    "jobName":"wikipedia-parser",
    "jobId":"1"
}
{% endhighlight %}

status is the abstract Samza status for the job. Initially it will be one of {STARTING, STARTED, STOPPED, UNKNOWN}.

statusDetail is the implementation-specific status for the job. For YARN, it will be one of the values in the YarnApplicationState enum.
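As a minimal sketch of how a client might consume this structure, the Python snippet below parses a job status body and checks `status` against the four values listed above. The `JobStatus` class and `parse_job_status` function are hypothetical names chosen for illustration; they are not part of Samza REST.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The abstract Samza statuses listed above.
KNOWN_STATUSES = {"STARTING", "STARTED", "STOPPED", "UNKNOWN"}


@dataclass(frozen=True)
class JobStatus:
    """Hypothetical client-side view of a job status response."""
    job_name: str
    job_id: str
    status: str
    status_detail: Optional[str]  # implementation-specific; may be null


def parse_job_status(body: str) -> JobStatus:
    """Parse a JSON job status body into a JobStatus."""
    data = json.loads(body)
    status = data["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"Unexpected status: {status!r}")
    return JobStatus(
        job_name=data["jobName"],
        job_id=data["jobId"],
        status=status,
        status_detail=data.get("statusDetail"),
    )


sample = '{"status":"STOPPED","statusDetail":"KILLED","jobName":"wikipedia-parser","jobId":"1"}'
print(parse_job_status(sample))
```

Rejecting unrecognized values is a choice made for this sketch. Because the set of statuses is described as an initial one, a more tolerant client could map unfamiliar values to `UNKNOWN` instead of raising.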

###Error Message

Every error response has the following structure:

{% highlight json %}
{
    "message": "Unrecognized status parameter: null"
}
{% endhighlight %}

message is the only field in the response and contains a description of the problem.
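Since a response body is either a job status or an error message, a client can tell them apart by their fields. The sketch below is one way to do that in Python; `describe_response` is a hypothetical helper, and treating "an object whose only key is `message`" as the error test is an assumption based on the description above.

```python
import json


def describe_response(body: str) -> str:
    """Return a one-line summary of a job status or error body."""
    data = json.loads(body)
    # Assumption: an error body is a JSON object whose only key is "message".
    if isinstance(data, dict) and set(data) == {"message"}:
        return "error: " + data["message"]
    if isinstance(data, dict) and "status" in data:
        return f'{data["jobName"]}/{data["jobId"]}: {data["status"]}'
    raise ValueError("Body is neither a job status nor an error message")


print(describe_response('{"message": "Unrecognized status parameter: null"}'))
```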

##Get All Jobs

Lists all the jobs installed on the host and provides their status.

######Request

    GET /v1/jobs

######Response

Status: 200 OK

{% highlight json %}
[
    {
        "status":"STOPPED",
        "statusDetail":"KILLED",
        "jobName":"wikipedia-parser",
        "jobId":"1"
    },
    {
        "status":"STARTED",
        "statusDetail":"RUNNING",
        "jobName":"wikipedia-feed",
        "jobId":"1"
    },
    {
        "status":"STOPPED",
        "statusDetail":null,
        "jobName":"wikipedia-stats",
        "jobId":"1"
    }
]
{% endhighlight %}
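The following Python sketch shows one way a client might call this endpoint and summarize the result. `fetch_jobs` and `group_by_status` are hypothetical helper names, and the base URL passed to `fetch_jobs` is an assumption, since the host and port depend on your Samza REST deployment. The demonstration at the bottom runs offline against the example response above.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_jobs(base_url: str) -> list:
    """GET {base_url}/v1/jobs and decode the JSON array of job statuses."""
    with urlopen(f"{base_url}/v1/jobs") as resp:
        return json.load(resp)


def group_by_status(jobs: list) -> dict:
    """Map each abstract status to the names of the jobs reporting it."""
    grouped = defaultdict(list)
    for job in jobs:
        grouped[job["status"]].append(job["jobName"])
    return dict(grouped)


# Offline demonstration using the example response shown above.
example = json.loads("""
[
    {"status": "STOPPED", "statusDetail": "KILLED", "jobName": "wikipedia-parser", "jobId": "1"},
    {"status": "STARTED", "statusDetail": "RUNNING", "jobName": "wikipedia-feed", "jobId": "1"},
    {"status": "STOPPED", "statusDetail": null, "jobName": "wikipedia-stats", "jobId": "1"}
]
""")
print(group_by_status(example))
```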

######Response codes

##Get Job

Gets the status of the specified job.

######Format

    GET /v1/jobs/{jobName}/{jobId}

The {jobName} and {jobId} path parameters reflect the values of 'job.name' and 'job.id' in the job config.

######Request

    GET /v1/jobs/wikipedia-feed/1

######Response

Status: 200 OK

{% highlight json %}
{
    "status":"STARTED",
    "statusDetail":"RUNNING",
    "jobName":"wikipedia-feed",
    "jobId":"1"
}
{% endhighlight %}
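Because the job name and job ID are path parameters, a client should percent-encode them when building the URL. The sketch below illustrates this in Python; `job_url` is a hypothetical helper and the host name is a placeholder, not a real endpoint.

```python
from urllib.parse import quote


def job_url(base_url: str, job_name: str, job_id: str) -> str:
    """Build the /v1/jobs/{jobName}/{jobId} URL, percent-encoding both parameters."""
    # safe="" also encodes "/" so a value cannot introduce extra path segments.
    return "{}/v1/jobs/{}/{}".format(
        base_url.rstrip("/"), quote(job_name, safe=""), quote(job_id, safe="")
    )


print(job_url("http://samza-rest.example.com", "wikipedia-feed", "1"))
```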

######Response codes

##Start Job

Starts the job with the specified app name if it's not already started. The command will return when it has initiated the start operation.

######Format

    PUT /v1/jobs/{jobName}/{jobId}?status=started

The status query parameter specifies the intended status of the job at the end of the request.

######Example

    PUT /v1/jobs/wikipedia-feed/1?status=started

######Response

Status: 202 Accepted

{% highlight json %}
{
    "status":"STARTING",
    "statusDetail":"ACCEPTED",
    "jobName": "wikipedia-feed",
    "jobId": "1"
}
{% endhighlight %}
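A client can construct this request with the Python standard library, as in the sketch below. `status_request` is a hypothetical helper that only builds the request object without sending it, and the host name is a placeholder. Passing the returned object to `urllib.request.urlopen` would issue the call.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def status_request(base_url: str, job_name: str, job_id: str, status: str) -> Request:
    """Build (but do not send) a PUT request that asks for the given status."""
    if status not in ("started", "stopped"):
        raise ValueError(f"Unsupported target status: {status!r}")
    path = f"/v1/jobs/{quote(job_name, safe='')}/{quote(job_id, safe='')}"
    query = urlencode({"status": status})
    return Request(f"{base_url}{path}?{query}", method="PUT")


req = status_request("http://samza-rest.example.com", "wikipedia-feed", "1", "started")
print(req.get_method(), req.full_url)
```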

######Response codes

##Stop Job

Stops the job with the specified app name if it's not already stopped.

######Format

    PUT /v1/jobs/{jobName}/{jobId}?status=stopped

The status query parameter specifies the intended status of the job at the end of the request.

######Example

    PUT /v1/jobs/wikipedia-feed/1?status=stopped

######Response

Status: 202 Accepted

{% highlight json %}
{
    "status":"STOPPED",
    "statusDetail":"KILLED",
    "jobName": "wikipedia-feed",
    "jobId": "1"
}
{% endhighlight %}
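A 202 Accepted response indicates that the request was accepted, so a client that needs to be sure the job has reached the intended status may want to poll the Get Job endpoint afterwards. The Python sketch below shows one possible polling loop. `wait_for_status` is a hypothetical helper, and the `get_status` callable is assumed to wrap a Get Job request and return the `status` field; here it is replaced by a canned sequence so the example runs offline.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str,
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> bool:
    """Poll get_status until it returns target; give up after `attempts` polls."""
    for attempt in range(attempts):
        if get_status() == target:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Offline demonstration with a canned sequence of statuses.
canned = iter(["STARTED", "STARTED", "STOPPED"])
print(wait_for_status(lambda: next(canned), "STOPPED", delay_seconds=0))
```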

######Response codes

#Design

###Abstractions

There are three primary abstractions used by the JobsResource that users can implement to handle any details specific to their environment.

  1. JobProxy: The JobProxy is the central point of interacting with Samza jobs. It exposes generic methods to start, stop, and get the status of a Samza job. Implementations of this interface can employ custom code to implement these methods tailored to the specific API for any cluster manager.
  2. InstallationFinder: The InstallationFinder provides a generic interface to discover all the installed jobs, hiding any customizations in the job package structure and its location (e.g. local vs remote host). The InstallationFinder also resolves the job configuration, which is used to validate and identify the job.
  3. JobStatusProvider: The JobStatusProvider allows the JobProxy to get the status of a job in a generic way. The same interface can be used to get the status of jobs running on YARN, on Mesos, or standalone. It also enables different implementations for the same cluster manager. With YARN, for example, one implementation may get job status via the command line and another via the ResourceManager REST API.

The configuration must specify a JobProxy factory class explicitly. By contrast, the InstallationFinder and JobStatusProvider abstractions are natural extensions of the JobProxy and are provided solely to demonstrate a pattern for discovering installed jobs and fetching job status; they are not required.

The SimpleYarnJobProxy that ships with Samza REST is intended to demonstrate a functional implementation of a JobProxy which works with the Hello Samza jobs. See the tutorial to try it out. You can implement your own JobProxy to adapt the JobsResource to the specifics of your job packaging and deployment model.

###Request Flow

After validating each request, the JobsResource invokes the appropriate JobProxy command. The JobProxy uses the InstallationFinder to enumerate the installed jobs and the JobStatusProvider to get the runtime status of the jobs.
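The flow described above can be illustrated with a small Python sketch in which a proxy composes a finder and a status provider. This only illustrates the pattern; it is not the Samza REST Java API. All method names, the in-memory implementations, and the choice to report jobs without a recorded status as `UNKNOWN` are assumptions made for this example.

```python
from typing import Dict, List, Protocol, Tuple

JobKey = Tuple[str, str]  # (job.name, job.id)


class InstallationFinder(Protocol):
    def get_all_installed_jobs(self) -> List[JobKey]: ...


class JobStatusProvider(Protocol):
    def get_job_status(self, job: JobKey) -> str: ...


class InMemoryFinder:
    """Toy finder backed by a fixed list instead of a file system crawl."""

    def __init__(self, jobs: List[JobKey]) -> None:
        self._jobs = jobs

    def get_all_installed_jobs(self) -> List[JobKey]:
        return list(self._jobs)


class InMemoryStatusProvider:
    """Toy provider; jobs it has no record of are reported as UNKNOWN."""

    def __init__(self, statuses: Dict[JobKey, str]) -> None:
        self._statuses = statuses

    def get_job_status(self, job: JobKey) -> str:
        return self._statuses.get(job, "UNKNOWN")


class SketchJobProxy:
    """Enumerates jobs via the finder, then asks the provider for each status."""

    def __init__(self, finder: InstallationFinder, provider: JobStatusProvider) -> None:
        self._finder = finder
        self._provider = provider

    def get_all_job_statuses(self) -> Dict[JobKey, str]:
        return {
            job: self._provider.get_job_status(job)
            for job in self._finder.get_all_installed_jobs()
        }


proxy = SketchJobProxy(
    InMemoryFinder([("wikipedia-feed", "1"), ("wikipedia-stats", "1")]),
    InMemoryStatusProvider({("wikipedia-feed", "1"): "STARTED"}),
)
print(proxy.get_all_job_statuses())
```

Keeping the finder and the provider behind separate interfaces means either one can be swapped out, for example to fetch status through a different mechanism, without touching the proxy.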

The provided SimpleInstallationFinder crawls the file system, starting in the directory specified by the job.installations.path property, looking for valid Samza job config files. It extracts the job.name and job.id property values and creates an InstallationRecord for each job instance. The InstallationRecord contains all the information needed to start, stop, and get the status of the job.
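To make the crawl concrete, here is a minimal Python sketch of the same idea. It is not the SimpleInstallationFinder implementation. The `.properties` file extension, the record fields, the fallback job ID of `"1"`, and the helper names `read_properties` and `find_installations` are all assumptions made for illustration.

```python
import os
import tempfile
from typing import Dict, List


def read_properties(path: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def find_installations(root: str) -> List[Dict[str, str]]:
    """Walk root and return one record per config file that names a job."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".properties"):  # assumed config file extension
                continue
            props = read_properties(os.path.join(dirpath, name))
            if "job.name" in props:
                records.append({
                    "jobName": props["job.name"],
                    "jobId": props.get("job.id", "1"),  # assumed default when unset
                    "configPath": os.path.join(dirpath, name),
                })
    return records


# Offline demonstration against a temporary directory.
with tempfile.TemporaryDirectory() as root:
    config_dir = os.path.join(root, "my-job", "config")
    os.makedirs(config_dir)
    with open(os.path.join(config_dir, "feed.properties"), "w", encoding="utf-8") as f:
        f.write("job.name=wikipedia-feed\njob.id=1\n")
    found = find_installations(root)
    print([(r["jobName"], r["jobId"]) for r in found])
```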

The provided YarnCliJobStatusProvider leverages a ScriptRunner to fetch job status using the Yarn ApplicationCLI.

The SimpleYarnJobProxy relies on the scripts in the InstallationRecord scriptFilePath (/bin) directory to start and stop jobs.

The following is a depiction of the implementation that ships with Samza REST:

###Configuration

The JobsResource properties should be specified in the same file as the Samza REST configuration. They are specified here for clarity.