Note: The Experiment API is in alpha and is subject to incompatible changes in future releases.
POST /api/v1/experiment
Example Request:
curl -X POST -H "Content-Type: application/json" -d ' { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } ' http://127.0.0.1:8080/api/v1/experiment
Example Response:
{ "status": "OK", "code": 200, "result": { "experimentId": "experiment_1586156073228_0001", "name": "tf-mnist-json", "uid": "28e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:59:29.000+08:00", "spec": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } } }
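For clients that assemble the request body programmatically, the create call above can be sketched in Python with only the standard library. The helper names build_experiment_spec and build_create_request, the fixed namespace, framework, and envVars values, and the fixed resources string are illustrative assumptions rather than part of the Submarine API; the URL and field names are copied from the curl example.

```python
import json
import urllib.request

# Assumed base URL, taken from the curl example; adjust for your deployment.
BASE_URL = "http://127.0.0.1:8080/api/v1/experiment"


def build_experiment_spec(name, cmd, image, replicas):
    """Return a request body shaped like the example request.

    `replicas` maps a role name (for example "Ps" or "Worker") to a count.
    Other values are hard-coded here purely for illustration.
    """
    return {
        "meta": {
            "name": name,
            "namespace": "default",
            "framework": "TensorFlow",
            "cmd": cmd,
            "envVars": {"ENV_1": "ENV1"},
        },
        "environment": {"image": image},
        "spec": {
            role: {"replicas": count, "resources": "cpu=1,memory=512M"}
            for role, count in replicas.items()
        },
    }


def build_create_request(spec, base_url=BASE_URL):
    """Prepare (but do not send) the POST request for an experiment spec."""
    return urllib.request.Request(
        base_url,
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the prepared request to urllib.request.urlopen would send it; keeping construction and sending separate makes the body easy to inspect before anything reaches the server.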
POST /api/v1/experiment
Example Request:
curl -X POST -H "Content-Type: application/json" -d ' { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "name": "my-submarine-env" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } ' http://127.0.0.1:8080/api/v1/experiment
The example above assumes that the environment "my-submarine-env" already exists in Submarine. Refer to the Environment API Reference for the REST APIs that create, update, delete, and list environments.
Example Response:
{ "status": "OK", "code": 200, "result": { "experimentId": "experiment_1586156073228_0001", "name": "tf-mnist-json", "uid": "28e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:59:29.000+08:00", "spec": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "name": "my-submarine-env" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } } }
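The two create examples differ only in the "environment" object: one gives an inline image, the other the name of a stored environment. A minimal sketch of a helper that produces either form follows. The function name environment_block is hypothetical, and treating the two forms as mutually exclusive is an assumption made for this sketch; the server may accept other combinations.

```python
def environment_block(image=None, name=None):
    """Return the "environment" object for an experiment spec.

    Pass `image` for an inline Docker image, or `name` for an environment
    that already exists in Submarine. This sketch requires exactly one.
    """
    if (image is None) == (name is None):
        raise ValueError("pass exactly one of image or name")
    if image is not None:
        return {"image": image}
    return {"name": name}
```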
GET /api/v1/experiment
Example Request:
curl -X GET http://127.0.0.1:8080/api/v1/experiment
Example Response:
{ "status": "OK", "code": 200, "result": [ { "experimentId": "experiment_1592057447228_0001", "name": "tf-mnist-json", "uid": "28e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:59:29.000+08:00", "spec": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } }, { "experimentId": "experiment_1592057447228_0002", "name": "mnist", "uid": "38e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:19:29.000+08:00", "spec": { "meta": { "name": "pytorch-mnist-json", "namespace": "default", "framework": "PyTorch", "cmd": "python /var/mnist.py --backend gloo", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:pytorch-dist-mnist-1.0" }, "spec": { "Master": { "replicas": 1, "resources": "cpu=1,memory=1024M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=1024M" } } } } ] }
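A list response like the one above can be reduced to a compact overview. The sketch below assumes the envelope shown in the example ("code" and a "result" array) and reads only fields that appear there; summarize_experiments is a hypothetical helper name.

```python
import json


def summarize_experiments(response_text):
    """Return (experimentId, framework, status) for each listed experiment."""
    body = json.loads(response_text)
    if body.get("code") != 200:
        raise RuntimeError(f"unexpected response code: {body.get('code')!r}")
    return [
        (
            entry["experimentId"],
            entry["spec"]["meta"]["framework"],
            entry["status"],
        )
        for entry in body["result"]
    ]
```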
GET /api/v1/experiment/{id}
Example Request:
curl -X GET http://127.0.0.1:8080/api/v1/experiment/experiment_1592057447228_0001
Example Response:
{ "status": "OK", "code": 200, "result": { "experimentId": "experiment_1592057447228_0001", "name": "tf-mnist-json", "uid": "28e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:59:29.000+08:00", "spec": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 1, "resources": "cpu=1,memory=512M" } } } } }
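Judging by the examples, the "resources" value of each role is a comma-separated list of key=value pairs such as "cpu=1,memory=512M". The following sketch splits such a string into a dictionary. The helper name parse_resources is hypothetical, values are kept as strings because the unit syntax is not specified here, and the exact grammar accepted by the server is an assumption.

```python
def parse_resources(resources):
    """Split a string like "cpu=1,memory=512M" into {"cpu": "1", "memory": "512M"}."""
    parsed = {}
    for item in resources.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"malformed resource entry: {item!r}")
        parsed[key] = value
    return parsed
```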
PATCH /api/v1/experiment/{id}
Example Request:
curl -X PATCH -H "Content-Type: application/json" -d ' { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 2, "resources": "cpu=1,memory=512M" } } } ' http://127.0.0.1:8080/api/v1/experiment/experiment_1592057447228_0001
Example Response:
{ "status": "OK", "code": 200, "success": true, "result": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 2, "resources": "cpu=1,memory=512M" } } } }
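The PATCH example sends the complete spec with one value changed (Worker replicas raised from 1 to 2). One way to prepare such a body is to copy a spec obtained earlier and change a single field, as sketched below. The helper name with_replicas is hypothetical, and whether the server also accepts partial bodies is not stated here.

```python
import copy


def with_replicas(spec, role, replicas):
    """Return a copy of an experiment spec with one role's replica count changed."""
    updated = copy.deepcopy(spec)  # leave the caller's spec untouched
    updated["spec"][role]["replicas"] = replicas
    return updated
```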
DELETE /api/v1/experiment/{id}
Example Request:
curl -X DELETE http://127.0.0.1:8080/api/v1/experiment/experiment_1592057447228_0001
Example Response:
{ "status": "OK", "code": 200, "result": { "experimentId": "experiment_1586156073228_0001", "name": "tf-mnist-json", "uid": "28e39dcd-77d4-11ea-8dbb-0242ac110003", "status": "Accepted", "acceptedTime": "2020-06-13T22:59:29.000+08:00", "spec": { "meta": { "name": "tf-mnist-json", "namespace": "default", "framework": "TensorFlow", "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150", "envVars": { "ENV_1": "ENV1" } }, "environment": { "image": "apache/submarine:tf-mnist-with-summaries-1.0" }, "spec": { "Ps": { "replicas": 1, "resources": "cpu=1,memory=512M" }, "Worker": { "replicas": 2, "resources": "cpu=1,memory=512M" } } } } }
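The per-experiment calls shown so far (GET, PATCH, and the DELETE in the curl command above) share the same URL shape and differ only in HTTP method and body. A minimal sketch of a request builder for them follows; experiment_request is a hypothetical helper, and the base URL is the one used in the curl examples.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL, taken from the curl examples; adjust for your deployment.
BASE_URL = "http://127.0.0.1:8080/api/v1/experiment"


def experiment_request(method, experiment_id, body=None, base_url=BASE_URL):
    """Prepare (but do not send) a request against /api/v1/experiment/{id}."""
    url = f"{base_url}/{urllib.parse.quote(experiment_id, safe='')}"
    data = None
    headers = {}
    if body is not None:
        # Only requests with a body (such as PATCH) need a content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

For example, experiment_request("DELETE", "experiment_1592057447228_0001") prepares the same call as the curl command above.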
GET /api/v1/experiment/logs
Example Request:
curl -X GET http://127.0.0.1:8080/api/v1/experiment/logs
Example Response:
{ "status": "OK", "code": 200, "success": null, "message": null, "result": [ { "experimentId": "experiment_1589199154923_0001", "logContent": [ { "podName": "mnist-worker-0", "podLog": null } ] }, { "experimentId": "experiment_1589199154923_0002", "logContent": [ { "podName": "pytorch-dist-mnist-gloo-master-0", "podLog": null }, { "podName": "pytorch-dist-mnist-gloo-worker-0", "podLog": null } ] } ], "attributes": {} }
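In the example above, the list endpoint returns pod names with "podLog" set to null, so a client would typically collect the experiment IDs and pod names and then fetch the logs per experiment. A sketch of that first step, operating on an already decoded response body (pods_by_experiment is a hypothetical helper name):

```python
def pods_by_experiment(body):
    """Map each experimentId in a decoded logs listing to its pod names."""
    return {
        entry["experimentId"]: [pod["podName"] for pod in entry["logContent"]]
        for entry in body["result"]
    }
```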
GET /api/v1/experiment/logs/{id}
Example Request:
curl -X GET http://127.0.0.1:8080/api/v1/experiment/logs/experiment_1589199154923_0002
Example Response:
{ "status": "OK", "code": 200, "success": null, "message": null, "result": { "experimentId": "experiment_1589199154923_0002", "logContent": [ { "podName": "pytorch-dist-mnist-gloo-master-0", "podLog": "Using distributed PyTorch with gloo backend\nDownloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\nProcessing...\nDone!\nTrain Epoch: 1 [0/60000 (0%)]\tloss=2.3000\nTrain Epoch: 1 [640/60000 (1%)]\tloss=2.2135\nTrain Epoch: 1 [1280/60000 (2%)]\tloss=2.1704\nTrain Epoch: 1 [1920/60000 (3%)]\tloss=2.0766\nTrain Epoch: 1 [2560/60000 (4%)]\tloss=1.8679\nTrain Epoch: 1 [3200/60000 (5%)]\tloss=1.4135\nTrain Epoch: 1 [3840/60000 (6%)]\tloss=1.0003\nTrain Epoch: 1 [4480/60000 (7%)]\tloss=0.7762\nTrain Epoch: 1 [5120/60000 (9%)]\tloss=0.4598\nTrain Epoch: 1 [5760/60000 (10%)]\tloss=0.4860\nTrain Epoch: 1 [6400/60000 (11%)]\tloss=0.4389\nTrain Epoch: 1 [7040/60000 (12%)]\tloss=0.4084\nTrain Epoch: 1 [7680/60000 (13%)]\tloss=0.4602\nTrain Epoch: 1 [8320/60000 (14%)]\tloss=0.4289\nTrain Epoch: 1 [8960/60000 (15%)]\tloss=0.3990\nTrain Epoch: 1 [9600/60000 (16%)]\tloss=0.3852\n" }, { "podName": "pytorch-dist-mnist-gloo-worker-0", "podLog": "Using distributed PyTorch with gloo backend\nDownloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\nDownloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\nProcessing...\nDone!\nTrain Epoch: 1 [0/60000 (0%)]\tloss=2.3000\nTrain Epoch: 1 [640/60000 (1%)]\tloss=2.2135\nTrain Epoch: 1 [1280/60000 (2%)]\tloss=2.1704\nTrain Epoch: 1 [1920/60000 (3%)]\tloss=2.0766\nTrain Epoch: 1 [2560/60000 (4%)]\tloss=1.8679\nTrain Epoch: 1 [3200/60000 (5%)]\tloss=1.4135\nTrain Epoch: 1 [3840/60000 (6%)]\tloss=1.0003\nTrain Epoch: 1 [4480/60000 (7%)]\tloss=0.7762\nTrain Epoch: 1 [5120/60000 (9%)]\tloss=0.4598\nTrain Epoch: 1 [5760/60000 (10%)]\tloss=0.4860\nTrain Epoch: 1 [6400/60000 (11%)]\tloss=0.4389\nTrain Epoch: 1 [7040/60000 (12%)]\tloss=0.4084\nTrain Epoch: 1 [7680/60000 (13%)]\tloss=0.4602\nTrain Epoch: 1 [8320/60000 (14%)]\tloss=0.4289\nTrain Epoch: 1 [8960/60000 (15%)]\tloss=0.3990\nTrain Epoch: 1 [9600/60000 (16%)]\tloss=0.3852\n" } ] }, "attributes": {} }
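Each "podLog" value is a single string with embedded newlines and tabs. As a sketch of post-processing, the function below extracts (samples seen, loss) pairs from training lines shaped like those in the example. The regular expression matches only this particular script's output format, which is an assumption specific to the example; loss_curve is a hypothetical helper name.

```python
import re

# Matches lines such as "Train Epoch: 1 [640/60000 (1%)]<TAB>loss=2.2135".
LOSS_LINE = re.compile(r"Train Epoch: \d+ \[(\d+)/\d+ \(\d+%\)\]\s+loss=([0-9.]+)")


def loss_curve(pod_log):
    """Return a list of (samples_seen, loss) pairs found in a pod log string."""
    points = []
    for line in pod_log.splitlines():
        match = LOSS_LINE.match(line)
        if match:
            points.append((int(match.group(1)), float(match.group(2))))
    return points
```

Lines that do not match, such as the download and "Processing..." messages, are skipped.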