Posted to dev@submarine.apache.org by by...@apache.org on 2021/07/14 06:16:30 UTC

[submarine] branch master updated: SUBMARINE-925. [User] Python SDK Experiment Client and Tracking

This is an automated email from the ASF dual-hosted git repository.

byronhsu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git


The following commit(s) were added to refs/heads/master by this push:
     new b6f1ad9  SUBMARINE-925. [User] Python SDK Experiment Client and Tracking
b6f1ad9 is described below

commit b6f1ad937d12bea559d14246d60253da032dacb7
Author: jeff-901 <b0...@ntu.edu.tw>
AuthorDate: Tue Jul 13 23:03:38 2021 +0800

    SUBMARINE-925. [User] Python SDK Experiment Client and Tracking
    
    ### What is this PR for?
    Update python sdk experiment client and tracking documentation.
    
    ### What type of PR is it?
    Documentation
    
    ### Todos
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/SUBMARINE-925
    
    ### How should this be tested?
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need updating? No
    * Are there breaking changes for older versions? No
    * Does this need new documentation? No
    
    Author: jeff-901 <b0...@ntu.edu.tw>
    
    Signed-off-by: byronhsu <by...@apache.org>
    
    Closes #672 from jeff-901/SUBMARINE-925 and squashes the following commits:
    
    ddd6f2ef [jeff-901] add example
    f9251a27 [jeff-901] update sdk doc
---
 .../userDocs/submarine-sdk/experiment-client.md    | 236 +++++++++++++++++++++
 website/docs/userDocs/submarine-sdk/tracking.md    |  47 ++++
 2 files changed, 283 insertions(+)

diff --git a/website/docs/userDocs/submarine-sdk/experiment-client.md b/website/docs/userDocs/submarine-sdk/experiment-client.md
index 4448dbb..e238595 100644
--- a/website/docs/userDocs/submarine-sdk/experiment-client.md
+++ b/website/docs/userDocs/submarine-sdk/experiment-client.md
@@ -20,3 +20,239 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+## class ExperimentClient()
+
+Client of a Submarine server that creates and manages experiments and logs.
+
+### `create_experiment(experiment_spec: json) -> dict`
+
+Create an experiment.
+> **Parameters**
+  - **experiment_spec**: Submarine experiment spec. More detailed information can be found at [Experiment API](https://submarine.apache.org/docs/userDocs/api/experiment).
+> **Returns**
+  - The detailed info about the submarine experiment.
+
+Example
+
+```python
+from submarine import ExperimentClient
+client = ExperimentClient()
+client.create_experiment({
+  "meta": {
+    "name": "tf-mnist-json",
+    "namespace": "default",
+    "framework": "TensorFlow",
+    "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
+    "envVars": {
+      "ENV_1": "ENV1"
+    }
+  },
+  "environment": {
+    "image": "apache/submarine:tf-mnist-with-summaries-1.0"
+  },
+  "spec": {
+    "Ps": {
+      "replicas": 1,
+      "resources": "cpu=1,memory=1024M"
+    },
+    "Worker": {
+      "replicas": 1,
+      "resources": "cpu=1,memory=1024M"
+    }
+  }
+})
+```
+
+Output
+
+```python
+{
+  'experimentId': 'experiment_1626160071451_0008', 
+  'name': 'tf-mnist-json', 'uid': '3513233e-33f2-4399-8fba-2a44ca2af730', 
+  'status': 'Accepted', 
+  'acceptedTime': '2021-07-13T21:29:33.000+08:00', 
+  'createdTime': None, 
+  'runningTime': None, 
+  'finishedTime': None, 
+  'spec': {
+    'meta': {
+      'name': 'tf-mnist-json', 
+      'namespace': 'default', 
+      'framework': 'TensorFlow', 
+      'cmd': 'python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150', 
+      'envVars': {'ENV_1': 'ENV1'}
+    }, 
+    'environment': {
+      'name': None, 
+      'dockerImage': None, 
+      'kernelSpec': None, 
+      'description': None, 
+      'image': 'apache/submarine:tf-mnist-with-summaries-1.0'
+    }, 
+    'spec': {
+      'Ps': {
+        'replicas': 1, 
+        'resources': 'cpu=1,memory=1024M', 
+        'name': None, 
+        'image': None, 
+        'cmd': None, 
+        'envVars': None, 
+        'resourceMap': {'memory': '1024M', 'cpu': '1'}
+      }, 
+      'Worker': {
+        'replicas': 1, 
+        'resources': 'cpu=1,memory=1024M', 
+        'name': None, 
+        'image': None, 
+        'cmd': None, 
+        'envVars': None, 
+        'resourceMap': {'memory': '1024M', 'cpu': '1'}
+      }
+    }, 
+    'code': None
+  }
+}
+```
+
+### `patch_experiment(id: str, experiment_spec: json) -> dict`
+
+Patch an experiment.
+> **Parameters**
+  - **id**: Submarine experiment id. 
+  - **experiment_spec**: Submarine experiment spec. More detailed information can be found at [Experiment API](https://submarine.apache.org/docs/userDocs/api/experiment).
+> **Returns**
+  - The detailed info about the submarine experiment.
+
+Example
+
+```python
+client.patch_experiment("experiment_1626160071451_0008", {
+  "meta": {
+    "name": "tf-mnist-json",
+    "namespace": "default",
+    "framework": "TensorFlow",
+    "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
+    "envVars": {
+      "ENV_1": "ENV1"
+    }
+  },
+  "environment": {
+    "image": "apache/submarine:tf-mnist-with-summaries-1.0"
+  },
+  "spec": {
+    "Worker": {
+      "replicas": 2,
+      "resources": "cpu=1,memory=1024M"
+    }
+  }
+})
+```
+
+### `get_experiment(id: str) -> dict`
+
+Get the experiment's detailed info by id.
+> **Parameters**
+  - **id**: Submarine experiment id.
+
+> **Returns**
+  - The detailed info about the submarine experiment.
+
+Example
+
+```python
+experiment = client.get_experiment("experiment_1626160071451_0008")
+```
+
+### `list_experiments(status: Optional[str]=None) -> list[dict]`
+
+List all experiments for the user.
+> **Parameters**
+  - **status**: Optional status to filter by. One of Accepted, Created, Running, Succeeded, Deleted.
+
+> **Returns**
+  - List of submarine experiments.
+
+Example
+
+```python
+experiments = client.list_experiments()
+```
+
+### `delete_experiment(id: str) -> dict`
+
+Delete the submarine experiment.
+> **Parameters**
+  - **id**: Submarine experiment id.
+
+> **Returns**
+  - The detailed info about the deleted submarine experiment.
+
+Example
+
+```python
+client.delete_experiment("experiment_1626160071451_0008")
+```
+
+### `get_log(id: str, onlyMaster: Optional[bool]=False) -> None`
+
+Print the training logs of the experiment's pods.
+By default, the logs of all pods are printed.
+
+> **Parameters**
+  - **id**: Submarine experiment id.
+  - **onlyMaster**: If true, print only the log of the "master" pod, which might be the TensorFlow PS/Chief or the PyTorch master. Defaults to false, which prints the logs of all pods.
+
+Example
+
+```python
+client.get_log("experiment_1626160071451_0009")
+```
+
+Output
+
+```
+The logs of Pod tf-mnist-json-2-ps-0:
+
+The logs of Pod tf-mnist-json-2-worker-0:
+
+```
+
+### `list_log(status: str) -> list[dict]`
+
+List experiment logs.
+> **Parameters**
+  - **status**: Status to filter by. One of Accepted, Created, Running, Succeeded, Deleted.
+> **Returns**
+  - List of submarine experiment logs.
+
+Example
+
+```python
+logs = client.list_log("Succeeded")
+```
+
+Output
+
+```python
+[{'experimentId': 'experiment_1626160071451_0009',
+  'logContent': 
+  [{'podName': 'tf-mnist-json-2-ps-0', 'podLog': []},
+   {'podName': 'tf-mnist-json-2-worker-0', 'podLog': []}]
+}]
+```
+
+### `wait_for_finish(id: str, polling_interval: Optional[int]=10) -> dict`
+
+Wait until the experiment has finished or failed.
+> **Parameters**
+  - **id**: Submarine experiment id.
+  - **polling_interval**: Number of seconds to wait between two consecutive polls of the experiment status.
+> **Returns**
+  - Submarine experiment logs.
+
+Example
+
+```python
+logs = client.wait_for_finish("experiment_1626160071451_0009", 5)
+```
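
A note on `wait_for_finish` as documented in experiment-client.md above: the polling behaviour it describes can be sketched without a running server. This is only an illustration, not the SDK's implementation. `poll_until_finished`, `get_status`, and the `TERMINAL_STATUSES` tuple are hypothetical names, and the set of terminal statuses is an assumption.

```python
import time
from typing import Callable, List, Sequence

# Assumption: illustrative terminal statuses; the SDK's real set may differ.
TERMINAL_STATUSES = ("Succeeded", "Failed", "Deleted")


def poll_until_finished(
    get_status: Callable[[], str],
    polling_interval: float = 10,
    terminal_statuses: Sequence[str] = TERMINAL_STATUSES,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() repeatedly until it returns a terminal status."""
    while True:
        status = get_status()
        if status in terminal_statuses:
            return status
        # Not finished yet: wait before asking again.
        sleep(polling_interval)


# Simulated status sequence standing in for a real server.
fake_statuses = iter(["Accepted", "Running", "Succeeded"])
recorded_waits: List[float] = []
final_status = poll_until_finished(
    lambda: next(fake_statuses),
    polling_interval=5,
    sleep=recorded_waits.append,  # record the waits instead of sleeping
)
print(final_status, recorded_waits)
```

Passing `sleep` as a parameter keeps the sketch testable: the example records the requested waits rather than blocking.
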
diff --git a/website/docs/userDocs/submarine-sdk/tracking.md b/website/docs/userDocs/submarine-sdk/tracking.md
index 072ab5b..887683d 100644
--- a/website/docs/userDocs/submarine-sdk/tracking.md
+++ b/website/docs/userDocs/submarine-sdk/tracking.md
@@ -20,3 +20,50 @@ KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
+
+The tracking module helps developers use Submarine's internal data caching,
+data exchange, and task tracking capabilities to develop and run
+machine learning workloads more efficiently.
+
+- Allows data scientists to track distributed ML experiments
+- Supports storing ML parameters and metrics in the Submarine server
+- Supports HDFS, S3, and MySQL (currently, only MySQL is supported)
+
+## Functions
+
+### `submarine.get_tracking_uri() -> str`
+
+Get the tracking URI. If none has been specified, check the environment variables. If the URI is still unset, return the default Submarine JDBC URL.
+
+> **Returns**
+
+  - The tracking URI.
+
+### `submarine.set_tracking_uri(uri: str) -> None`
+
+Set the tracking URI. You can also set the SUBMARINE_TRACKING_URI environment variable to have Submarine find a URI from there. The URI should be a database connection string.
+
+> **Parameters**
+
+  - **uri** \- Submarine records data to a MySQL server. The database URL is expected in the format ``<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>``.
+  By default it's `mysql+pymysql://submarine:password@localhost:3306/submarine`.
+  More details: [SQLAlchemy docs](https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls)
+
+### `submarine.log_param(key: str, value: str) -> None`
+
+Log a single key-value parameter. The key and value are both strings.
+
+> **Parameters**
+
+  - **key** - Parameter name.
+  - **value** - Parameter value.
+
+### `submarine.log_metric(key: str, value: float, step=0) -> None`
+
+Log a single key-value metric. The value must always be a number.
+
+> **Parameters**
+
+  - **key** - Metric name.
+  - **value** - Metric value.
+  - **step** - A single integer step at which to log the specified metric. Defaults to 0.
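
A note on the tracking URI format documented in tracking.md above: the components of `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>` can be pulled apart with the Python standard library alone. This is a minimal sketch for illustration. `split_tracking_uri` is a hypothetical helper, not part of the Submarine SDK, and the SDK itself may parse the URI differently (for example through SQLAlchemy).

```python
from typing import Dict, Optional, Union
from urllib.parse import urlsplit


def split_tracking_uri(uri: str) -> Dict[str, Optional[Union[str, int]]]:
    """Split a database URL into the fields named in the documented format."""
    parts = urlsplit(uri)
    # The scheme carries "<dialect>+<driver>"; the driver part is optional.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# The default URI quoted in the documentation above.
fields = split_tracking_uri(
    "mysql+pymysql://submarine:password@localhost:3306/submarine"
)
for name, value in fields.items():
    print(f"{name}: {value}")
```

A check like this can help catch a malformed URI before it is handed to `submarine.set_tracking_uri`.
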
