tag	82a71213a518e3caf48340fba33222e89bfb5082
tagger	Zhankun Tang <ztang@apache.org>	Wed Jan 22 15:26:38 2020 +0800
object	c4470ec5d36827b3b350e15e29dc60db38cea1cf

Release candidate - 0.3.0-RC0

commit	c4470ec5d36827b3b350e15e29dc60db38cea1cf
author	Zhankun Tang <ztang@apache.org>	Wed Jan 22 14:39:26 2020 +0800
committer	Zhankun Tang <ztang@apache.org>	Wed Jan 22 15:24:24 2020 +0800
tree	0b9533fa261fe079eb733d653817c2f5d7d4099b
parent	b9240d01535696f38814b24b4c461c29ca673a71

SUBMARINE-349. Support using existing artifacts to build mini-submarine image

### What is this PR for?
The hard-coded Submarine version (0.3.0-SNAPSHOT) in build_mini-submarine.sh is inconvenient: when doing a release, we need the image to package candidate artifacts such as 0.3.0.
This JIRA extends the build script to package specified artifacts. Setting the environment variables submarine_version and release_candidates_path triggers this code path.
The script then copies the binary tarball matching submarine_version to the local directory and builds the image.
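
The behavior described above could look roughly like the following. This is a minimal sketch, not the actual contents of build_mini-submarine.sh: only the variable names submarine_version and release_candidates_path come from this PR, while the helper functions `tarball_name` and `stage_candidate` are hypothetical and the tarball naming is assumed from the file listing in the test section.

```shell
#!/usr/bin/env bash
# Sketch only: tarball_name and stage_candidate are hypothetical helpers,
# not functions taken from build_mini-submarine.sh.

# Fall back to the snapshot version when no release version is exported.
submarine_version="${submarine_version:-0.3.0-SNAPSHOT}"
release_candidates_path="${release_candidates_path:-}"

# Print the binary tarball name for a version (naming assumed from the
# candidate listing, e.g. submarine-dist-0.3.0-hadoop-2.9.tar.gz).
tarball_name() {
  printf 'submarine-dist-%s-hadoop-2.9.tar.gz\n' "$1"
}

# Copy the candidate tarball from src_dir into dest_dir.
# Returns non-zero when the expected tarball is missing.
stage_candidate() {
  local src_dir="$1" version="$2" dest_dir="$3"
  local tarball
  tarball="$(tarball_name "$version")"
  if [ ! -f "${src_dir}/${tarball}" ]; then
    echo "missing ${src_dir}/${tarball}" >&2
    return 1
  fi
  cp "${src_dir}/${tarball}" "${dest_dir}/"
}

# Only take the release-candidate path when the variable is set.
if [ -n "${release_candidates_path}" ]; then
  stage_candidate "${release_candidates_path}" "${submarine_version}" .
fi
```

Keeping the copy step in a function that fails loudly on a missing tarball means a wrong version or path surfaces before the image build starts.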

### What type of PR is it?
Improvement

### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-349

### How should this be tested?
Put the Submarine candidate artifacts in a folder such as "~/releases/submarine-release":
$ ls $release_candidates_path
submarine-dist-0.3.0-hadoop-2.9.tar.gz        submarine-dist-0.3.0-src.tar.gz.asc
submarine-dist-0.3.0-hadoop-2.9.tar.gz.asc    submarine-dist-0.3.0-src.tar.gz.sha512
submarine-dist-0.3.0-hadoop-2.9.tar.gz.sha512 submarine-dist-0.3.0-src.tar.gz

export submarine_version=0.3.0
export release_candidates_path=~/releases/submarine-release
./build_mini-submarine.sh
docker run -it -h submarine-dev --net=bridge --privileged -P local/mini-submarine:0.3.0 /bin/bash
In the container, check that the Submarine jar version is 0.3.0.
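
One way to do this check is to look for leftover snapshot jars. The sketch below is an assumption, not part of this PR: the helper `find_snapshot_jars` is hypothetical, and both the search root and the `submarine-*.jar` naming pattern are guesses about the image layout.

```shell
# Hypothetical helper: list Submarine jars under a directory whose file
# names still contain "SNAPSHOT". The jar naming pattern is an assumption.
find_snapshot_jars() {
  local root="$1"
  find "$root" -type f -name 'submarine-*SNAPSHOT*.jar' 2>/dev/null
}
```

Running `find_snapshot_jars /opt` (or whichever directory holds the jars in the image) should print nothing for an image built from release artifacts.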

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? Yes

Author: Zhankun Tang <ztang@apache.org>

Closes #158 from tangzhankun/submarine-349 and squashes the following commits:

165421a [Zhankun Tang] Simplify variable default value assignment and fix a typo in document
9a807c0 [Zhankun Tang] Refine the doc
099b966 [Zhankun Tang] Add doc
82990d2 [Zhankun Tang] Support using existing artifacts to build mini-submarine image

(cherry picked from commit 15d5c915a4023af6713e0b3d6afd33fb44ad879f)

6 files changed

README.md

What is Apache Submarine?

Apache Submarine is a unified AI platform that allows engineers and data scientists to run machine learning and deep learning workloads on a distributed cluster.

Goals of Submarine:

- Give jobs easy access to data and models in HDFS and other storage systems.
- Launch services to serve TensorFlow/PyTorch models.
- Run distributed TensorFlow jobs with simple configuration.
- Run user-specified Docker images.
- Specify GPU and other resources.
- Launch TensorBoard for training jobs when the user requests it.
- Use customized DNS names for roles (such as TensorBoard.$user.$domain:6006).

Architecture

Components

Submarine Workbench

Submarine Workbench is a web-based system in which algorithm engineers can manage the complete lifecycle of machine learning jobs.

Projects
Manage machine learning jobs through projects.
Data
Data processing, data conversion, feature engineering, etc. in the workbench.
Job
Run data processing, algorithm development, and model training as machine learning jobs.
Model
Algorithm selection, parameter adjustment, model training, model release, and model serving.
Workflow
Automate the complete life cycle of machine learning operations by scheduling workflows for data processing, model training, and model publishing.
Team
Support team development, code sharing, comments, and code and model version management.

Submarine Core

Submarine Core is the execution engine of the system and has the following features:

ML Engine
Supports multiple machine learning frameworks, such as TensorFlow and PyTorch.
Data Engine
Integrates with an externally deployed Spark compute engine for data processing.
SDK
Supports Python, Scala, and R for algorithm development. The SDK helps developers use Submarine's internal data caching, data exchange, and task tracking to develop and run machine learning tasks more efficiently.
Submitter
Works with the underlying hybrid scheduling system of YARN and Kubernetes (k8s) to provide unified task scheduling and resource management, transparently to users.

Hybrid Scheduler
- YARN
- Kubernetes

Quick start

Run mini-submarine in one step

You can use mini-submarine to try out Submarine quickly.

It is a Docker image built for Submarine development and quick-start testing.

Installation and deployment

Read the Quick Start Guide

Apache Submarine Community

Read the Apache Submarine Community Guide

To learn how to contribute, read the Contributing Guide.

License

The Apache Submarine project is licensed under the Apache 2.0 License. See the LICENSE file for details.