Posted to dev@submarine.apache.org by by...@apache.org on 2021/07/14 06:16:30 UTC
[submarine] branch master updated: SUBMARINE-925. [User] Python SDK
Experiment Client and Tracking
This is an automated email from the ASF dual-hosted git repository.
byronhsu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git
The following commit(s) were added to refs/heads/master by this push:
new b6f1ad9 SUBMARINE-925. [User] Python SDK Experiment Client and Tracking
b6f1ad9 is described below
commit b6f1ad937d12bea559d14246d60253da032dacb7
Author: jeff-901 <b0...@ntu.edu.tw>
AuthorDate: Tue Jul 13 23:03:38 2021 +0800
SUBMARINE-925. [User] Python SDK Experiment Client and Tracking
### What is this PR for?
Update python sdk experiment client and tracking documentation.
### What type of PR is it?
Documentation
### Todos
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-925
### How should this be tested?
### Screenshots (if appropriate)
### Questions:
* Do the license files need updating? No
* Are there breaking changes for older versions? No
* Does this need new documentation? No
Author: jeff-901 <b0...@ntu.edu.tw>
Signed-off-by: byronhsu <by...@apache.org>
Closes #672 from jeff-901/SUBMARINE-925 and squashes the following commits:
ddd6f2ef [jeff-901] add example
f9251a27 [jeff-901] update sdk doc
---
.../userDocs/submarine-sdk/experiment-client.md | 236 +++++++++++++++++++++
website/docs/userDocs/submarine-sdk/tracking.md | 47 ++++
2 files changed, 283 insertions(+)
diff --git a/website/docs/userDocs/submarine-sdk/experiment-client.md b/website/docs/userDocs/submarine-sdk/experiment-client.md
index 4448dbb..e238595 100644
--- a/website/docs/userDocs/submarine-sdk/experiment-client.md
+++ b/website/docs/userDocs/submarine-sdk/experiment-client.md
@@ -20,3 +20,239 @@ KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
+
+## class ExperimentClient()
+
+Client of a submarine server that creates and manages experiments and logs.
+
+### `create_experiment(experiment_spec: json) -> dict`
+
+Create an experiment.
+> **Parameters**
+ - **experiment_spec**: Submarine experiment spec. More detailed information can be found at [Experiment API](https://submarine.apache.org/docs/userDocs/api/experiment).
+> **Returns**
+ - The detailed info about the submarine experiment.
+
+Example
+
+```python
+from submarine import ExperimentClient
+client = ExperimentClient()
+client.create_experiment({
+ "meta": {
+ "name": "tf-mnist-json",
+ "namespace": "default",
+ "framework": "TensorFlow",
+ "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
+ "envVars": {
+ "ENV_1": "ENV1"
+ }
+ },
+ "environment": {
+ "image": "apache/submarine:tf-mnist-with-summaries-1.0"
+ },
+ "spec": {
+ "Ps": {
+ "replicas": 1,
+ "resources": "cpu=1,memory=1024M"
+ },
+ "Worker": {
+ "replicas": 1,
+ "resources": "cpu=1,memory=1024M"
+ }
+ }
+})
+```
+
+Output
+
+```python
+{
+ 'experimentId': 'experiment_1626160071451_0008',
+ 'name': 'tf-mnist-json',
+ 'uid': '3513233e-33f2-4399-8fba-2a44ca2af730',
+ 'status': 'Accepted',
+ 'acceptedTime': '2021-07-13T21:29:33.000+08:00',
+ 'createdTime': None,
+ 'runningTime': None,
+ 'finishedTime': None,
+ 'spec': {
+ 'meta': {
+ 'name': 'tf-mnist-json',
+ 'namespace': 'default',
+ 'framework': 'TensorFlow',
+ 'cmd': 'python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150',
+ 'envVars': {'ENV_1': 'ENV1'}
+ },
+ 'environment': {
+ 'name': None,
+ 'dockerImage': None,
+ 'kernelSpec': None,
+ 'description': None,
+ 'image': 'apache/submarine:tf-mnist-with-summaries-1.0'
+ },
+ 'spec': {
+ 'Ps': {
+ 'replicas': 1,
+ 'resources': 'cpu=1,memory=1024M',
+ 'name': None,
+ 'image': None,
+ 'cmd': None,
+ 'envVars': None,
+ 'resourceMap': {'memory': '1024M', 'cpu': '1'}
+ },
+ 'Worker': {
+ 'replicas': 1,
+ 'resources': 'cpu=1,memory=1024M',
+ 'name': None,
+ 'image': None,
+ 'cmd': None,
+ 'envVars': None,
+ 'resourceMap': {'memory': '1024M', 'cpu': '1'}
+ }
+ },
+ 'code': None
+ }
+}
+```
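In the output above, each `resourceMap` appears to be the `resources` string split into key-value pairs. A minimal sketch of that mapping, assuming a comma-separated `key=value` format; the helper `parse_resources` is hypothetical and is not part of the SDK:

```python
def parse_resources(resources: str) -> dict:
    """Split a string such as "cpu=1,memory=1024M" into a dict of strings.

    Hypothetical helper for illustration; the server may parse this differently.
    """
    resource_map = {}
    for item in resources.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        resource_map[key] = value
    return resource_map


print(parse_resources("cpu=1,memory=1024M"))
```

Values stay strings here because the sample output shows `'cpu': '1'` rather than an integer.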
+
+### `patch_experiment(id: str, experiment_spec: json) -> dict`
+
+Patch an experiment.
+> **Parameters**
+ - **id**: Submarine experiment id.
+ - **experiment_spec**: Submarine experiment spec. More detailed information can be found at [Experiment API](https://submarine.apache.org/docs/userDocs/api/experiment).
+> **Returns**
+ - The detailed info about the submarine experiment.
+
+Example
+
+```python
+client.patch_experiment("experiment_1626160071451_0008", {
+ "meta": {
+ "name": "tf-mnist-json",
+ "namespace": "default",
+ "framework": "TensorFlow",
+ "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
+ "envVars": {
+ "ENV_1": "ENV1"
+ }
+ },
+ "environment": {
+ "image": "apache/submarine:tf-mnist-with-summaries-1.0"
+ },
+ "spec": {
+ "Worker": {
+ "replicas": 2,
+ "resources": "cpu=1,memory=1024M"
+ }
+ }
+})
+```
+
+### `get_experiment(id: str) -> dict`
+
+Get the experiment's detailed info by id.
+> **Parameters**
+ - **id**: Submarine experiment id.
+
+> **Returns**
+ - The detailed info about the submarine experiment.
+
+Example
+
+```python
+experiment = client.get_experiment("experiment_1626160071451_0008")
+```
+
+### `list_experiments(status: Optional[str]=None) -> list[dict]`
+
+List all experiments for the user.
+> **Parameters**
+ - **status**: Experiment status to filter by: Accepted, Created, Running, Succeeded, or Deleted.
+
+> **Returns**
+ - List of submarine experiments.
+
+Example
+
+```python
+experiments = client.list_experiments()
+```
+
+### `delete_experiment(id: str) -> dict`
+
+Delete the submarine experiment.
+> **Parameters**
+ - **id**: Submarine experiment id.
+
+> **Returns**
+ - The detailed info about the deleted submarine experiment.
+
+Example
+
+```python
+client.delete_experiment("experiment_1626160071451_0008")
+```
+
+### `get_log(id: str, onlyMaster: Optional[bool]=False) -> None`
+
+Print the training logs of the experiment's pods.
+By default, the logs of all pods are printed.
+
+> **Parameters**
+ - **id**: Submarine experiment id.
+ - **onlyMaster**: If `True`, print only the log of the "master" pod, which might be the TensorFlow PS/Chief or the PyTorch master. Defaults to `False`.
+
+Example
+
+```python
+client.get_log("experiment_1626160071451_0009")
+```
+
+Output
+
+```
+The logs of Pod tf-mnist-json-2-ps-0:
+
+The logs of Pod tf-mnist-json-2-worker-0:
+
+```
+
+### `list_log(status: str) -> list[dict]`
+
+List experiment logs.
+> **Parameters**
+ - **status**: Experiment status to filter by: Accepted, Created, Running, Succeeded, or Deleted.
+> **Returns**
+ - List of submarine experiment logs.
+
+Example
+
+```python
+logs = client.list_log("Succeeded")
+```
+
+Output
+
+```python
+[{'experimentId': 'experiment_1626160071451_0009',
+ 'logContent':
+ [{'podName': 'tf-mnist-json-2-ps-0', 'podLog': []},
+ {'podName': 'tf-mnist-json-2-worker-0', 'podLog': []}]
+}]
+```
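The nested structure shown in the output above can be walked with two loops. The sketch below uses a copy of that sample data instead of a live `client.list_log("Succeeded")` call, and assumes `podLog` is a list of log entries (it is empty in the sample); `count_log_entries` is a hypothetical helper, not an SDK function:

```python
# Sample data copied from the output above.
logs = [
    {
        "experimentId": "experiment_1626160071451_0009",
        "logContent": [
            {"podName": "tf-mnist-json-2-ps-0", "podLog": []},
            {"podName": "tf-mnist-json-2-worker-0", "podLog": []},
        ],
    }
]


def count_log_entries(logs: list) -> dict:
    """Map (experiment id, pod name) to the number of entries in podLog."""
    counts = {}
    for experiment in logs:
        for pod in experiment["logContent"]:
            key = (experiment["experimentId"], pod["podName"])
            counts[key] = len(pod["podLog"])
    return counts


for (experiment_id, pod_name), n in count_log_entries(logs).items():
    print(f"{experiment_id} {pod_name}: {n} log entries")
```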
+
+### `wait_for_finish(id: str, polling_interval: Optional[int]=10) -> dict`
+
+Wait until the experiment has finished or failed.
+> **Parameters**
+ - **id**: Submarine experiment id.
+ - **polling_interval**: Number of seconds between two polls of the experiment status.
+> **Returns**
+ - Submarine experiment logs.
+
+Example
+
+```python
+logs = client.wait_for_finish("experiment_1626160071451_0009", 5)
+```
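The polling behaviour that `polling_interval` controls can be pictured with the loop below. This is a sketch under stated assumptions, not the SDK's implementation: `poll_until_done` and the `get_status` callable are hypothetical, and the set of terminal statuses (including the name "Failed") is assumed.

```python
import time
from typing import Callable, Iterable


def poll_until_done(
    get_status: Callable[[], str],
    polling_interval: float = 10,
    terminal_statuses: Iterable[str] = ("Succeeded", "Failed", "Deleted"),
) -> str:
    """Call get_status() every polling_interval seconds until it is terminal."""
    terminal = set(terminal_statuses)  # assumed status names, see lead-in
    while True:
        status = get_status()
        if status in terminal:
            return status
        time.sleep(polling_interval)


# Stand-in for a real status lookup: replays a fixed sequence of statuses.
fake_statuses = iter(["Accepted", "Running", "Succeeded"])
print(poll_until_done(lambda: next(fake_statuses), polling_interval=0))
```

With a real client, `get_status` might be something like `lambda: client.get_experiment(id)["status"]`, assuming `get_experiment` returns the same shape as the `create_experiment` output shown earlier.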
diff --git a/website/docs/userDocs/submarine-sdk/tracking.md b/website/docs/userDocs/submarine-sdk/tracking.md
index 072ab5b..887683d 100644
--- a/website/docs/userDocs/submarine-sdk/tracking.md
+++ b/website/docs/userDocs/submarine-sdk/tracking.md
@@ -20,3 +20,50 @@ KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
+
+The tracking SDK helps developers use Submarine's internal data caching,
+data exchange, and task tracking capabilities to develop and run
+machine learning workloads more efficiently.
+
+- Allows data scientists to track distributed ML experiments
+- Supports storing ML parameters and metrics in the Submarine server
+- Supports HDFS, S3, and MySQL (currently only MySQL is supported)
+
+## Functions
+
+### `submarine.get_tracking_uri() -> str`
+
+Get the tracking URI. If none has been specified, check the environment variables. If the URI is still unset, return the default Submarine JDBC URL.
+
+> **Returns**
+
+ - The tracking URI.
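The lookup order described for `get_tracking_uri` (explicit value, then environment, then default) can be sketched as follows. This is an illustration, not the SDK's code: `resolve_tracking_uri` is a hypothetical helper, and it assumes the environment variable is `SUBMARINE_TRACKING_URI` and the default is the `mysql+pymysql` URL quoted in the `set_tracking_uri` description.

```python
import os
from typing import Mapping, Optional

# Both constants are taken from the set_tracking_uri description; treating
# them as what get_tracking_uri uses is an assumption.
TRACKING_URI_ENV_VAR = "SUBMARINE_TRACKING_URI"
DEFAULT_TRACKING_URI = "mysql+pymysql://submarine:password@localhost:3306/submarine"


def resolve_tracking_uri(
    explicit_uri: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicit URI, else the environment value, else the default."""
    if explicit_uri:
        return explicit_uri
    return env.get(TRACKING_URI_ENV_VAR) or DEFAULT_TRACKING_URI


# An empty mapping stands in for an environment without the variable set.
print(resolve_tracking_uri(env={}))
```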
+
+### `submarine.set_tracking_uri(uri: str) -> None`
+
+Set the tracking URI. You can also set the `SUBMARINE_TRACKING_URI` environment variable to have Submarine read the URI from there. The URI should be a database connection string.
+
+> **Parameters**
+
+ - **uri** \- Database URL of the MySQL server where Submarine records data, expected in the format ``<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>``.
+ By default it's `mysql+pymysql://submarine:password@localhost:3306/submarine`.
+ For more detail, see the [SQLAlchemy docs](https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls).
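To see how the URL format maps onto its components, the default URI can be taken apart with the Python standard library. This is only a sketch for inspecting a URI before passing it to `set_tracking_uri`; `describe_tracking_uri` is a hypothetical helper, and the SDK itself relies on SQLAlchemy for parsing.

```python
from urllib.parse import urlsplit


def describe_tracking_uri(uri: str) -> dict:
    """Split <dialect>+<driver>://<username>:<password>@<host>:<port>/<database>."""
    parts = urlsplit(uri)
    # The scheme holds "dialect+driver"; the driver part is optional.
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


info = describe_tracking_uri(
    "mysql+pymysql://submarine:password@localhost:3306/submarine"
)
for name, value in info.items():
    print(f"{name}: {value}")
```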
+
+### `submarine.log_param(key: str, value: str) -> None`
+
+Log a single key-value parameter. The key and value are both strings.
+
+> **Parameters**
+
+ - **key** - Parameter name.
+ - **value** - Parameter value.
+
+### `submarine.log_metric(key: str, value: float, step=0) -> None`
+
+Log a single key-value metric. The value must always be a number.
+
+> **Parameters**
+
+ - **key** - Metric name.
+ - **value** - Metric value.
+ - **step** - The integer step at which to log the metric. Defaults to 0.