Posted to submarine-dev@hadoop.apache.org by zh...@apache.org on 2019/10/10 12:32:03 UTC
[hadoop-submarine] branch master updated: SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
This is an automated email from the ASF dual-hosted git repository.
zhouquan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hadoop-submarine.git
The following commit(s) were added to refs/heads/master by this push:
new e6519d9 SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
e6519d9 is described below
commit e6519d97df18736654298796acc0f5be7e4c28ab
Author: pingsutw <pi...@gmail.com>
AuthorDate: Tue Oct 8 10:09:54 2019 +0800
SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
### What is this PR for?
1. move submarine-sdk/README.md to submarine-sdk/tracking/README.md
2. introduce how to use pysubmarine-tracking
### What type of PR is it?
[Documentation]
### Todos
* [ ] - Task
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-212
### How should this be tested?
https://travis-ci.org/pingsutw/hadoop-submarine/builds/594405780
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: pingsutw <pi...@gmail.com>
Closes #31 from pingsutw/SUBMARINE-212 and squashes the following commits:
551bca6 [pingsutw] Fix review comments
2267193 [pingsutw] SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
---
docs/submarine-sdk/README.md | 72 +++++++++++-------------------
docs/submarine-sdk/pysubmarine/README.md | 45 +++++++++++++++++++
docs/submarine-sdk/pysubmarine/tracking.md | 63 ++++++++++++++++++++++++++
3 files changed, 134 insertions(+), 46 deletions(-)
diff --git a/docs/submarine-sdk/README.md b/docs/submarine-sdk/README.md
index e29f2b2..dd4fc9f 100644
--- a/docs/submarine-sdk/README.md
+++ b/docs/submarine-sdk/README.md
@@ -1,50 +1,30 @@
-<!---
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
+<!---
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Submarine-SDK
+
+## Summary
+- Support Python, Scala, and R for algorithm development
+
+- Support tracking/metrics APIs that allow developers to
+add tracking/metrics and view them in the Submarine Workbench UI.
+
+- [TODO] Support users building ML pipelines
+
+### Python
+- [pysubmarine](pysubmarine)
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. See accompanying LICENSE file.
--->
-# Submarine-SDK
-Support Python, Scala, R language for algorithm development.
-The SDK is provided to help developers use submarine's internal data caching,
-data exchange, and task tracking to more efficiently improve the development
-and execution of machine learning tasks.
-
-- Allow data scients to track distributed ML job
-- Support store ML parameters and metrics in Submarine-server
-- Support store ML job output (e.g. csv,images)
-- Support hdfs,S3 and mysql
-- (WEB) Metric tracking ui in workbench-web
-- (WEB) Metric graphical display in workbench-web
-
-### Project setup
-- Clone repo
-```bash
-git https://github.com/apache/hadoop-submarine.git
-cd hadoop-submarine/pysubmarine/submarine-sdk
-```
-
-- Install pip package
-```
-pip install .
-```
-
-- Run tests
-```
-pytest --cov=submarine -vs
-```
-
-- Run checkstyle
-```
-pylint --msg-template="{path} ({line},{column}): \
-[{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
-```
\ No newline at end of file
diff --git a/docs/submarine-sdk/pysubmarine/README.md b/docs/submarine-sdk/pysubmarine/README.md
new file mode 100644
index 0000000..f82dea5
--- /dev/null
+++ b/docs/submarine-sdk/pysubmarine/README.md
@@ -0,0 +1,45 @@
+<!---
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# PySubmarine
+PySubmarine helps developers use Submarine's internal data caching,
+data exchange, and task tracking capabilities to develop and run
+machine learning tasks more efficiently.
+
+## Package setup
+- Clone repo
+```bash
+git clone https://github.com/apache/hadoop-submarine.git
+cd hadoop-submarine/submarine-sdk/pysubmarine
+```
+
+- Install pip package
+```bash
+pip install .
+```
+
+- Run tests
+```bash
+pytest --cov=submarine -vs
+```
+
+- Run checkstyle
+```bash
+pylint --msg-template="{path} ({line},{column}): \
+[{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
+```
+
+## PySubmarine API Reference
+### Tracking
+- [Tracking](tracking.md)
\ No newline at end of file
diff --git a/docs/submarine-sdk/pysubmarine/tracking.md b/docs/submarine-sdk/pysubmarine/tracking.md
new file mode 100644
index 0000000..abb0134
--- /dev/null
+++ b/docs/submarine-sdk/pysubmarine/tracking.md
@@ -0,0 +1,63 @@
+<!---
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# pysubmarine-tracking
+- Allow data scientists to track distributed ML jobs
+- Support storing ML parameters and metrics in Submarine-server
+- [TODO] Support storing ML job output (e.g. CSV files, images)
+- Support HDFS, S3, and MySQL (currently only MySQL is supported)
+- [TODO] (WEB) Metric tracking UI in workbench-web
+- [TODO] (WEB) Graphical metric display in workbench-web
+
+## Functions
+### `submarine.get_tracking_uri()`
+
+Returns the tracking URI.
+
+### `submarine.set_tracking_uri(URI)`
+
+Sets the tracking URI. You can also set the
+SUBMARINE_TRACKING_URI environment variable to have Submarine read the URI from
+there. The URI should be a database connection string.
+
+**Parameters**
+
+- URI - Database URL of the MySQL server where Submarine records data. The URL
+is expected in the format ``<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>``.
+By default it is `mysql+pymysql://submarine:password@localhost:3306/submarineDB`.
+More detail: [SQLAlchemy docs](https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls)
+
+<!--
+ TODO : get database url from submarine-site.xml
+-->
+
+### `submarine.log_metric(key, value, worker_index, step=0)`
+
+Logs a single key-value metric. The value must always be a number.
+
+**Parameters**
+- key - Metric name (string).
+- value - Metric value (float).
+- worker_index - The index of the worker (string), e.g. "rank-1" or "worker-2".
+- step - The integer step at which to log the metric.
+By default it is 0.
+
+### `submarine.log_param(key, value, worker_index)`
+
+Logs a single key-value parameter. The key and value are both strings.
+
+**Parameters**
+- key - Parameter name (string).
+- value - Parameter value (string).
+- worker_index - The index of the worker (string), e.g. "rank-1" or "worker-2".
\ No newline at end of file
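
For readers of this commit, the call pattern documented in tracking.md can be sketched roughly as follows. This is not pysubmarine itself: `InMemoryTracker` is a hypothetical stand-in that only mirrors the documented signatures (`set_tracking_uri`, `get_tracking_uri`, `log_metric`, `log_param`) and keeps records in Python lists instead of writing to MySQL. The lookup order in `get_tracking_uri` (explicit call, then `SUBMARINE_TRACKING_URI`, then the documented default) and the URL shape check are assumptions made for illustration, not behavior confirmed by the commit.

```python
import os
from urllib.parse import urlsplit

# Default value quoted from tracking.md in the diff above.
DEFAULT_TRACKING_URI = "mysql+pymysql://submarine:password@localhost:3306/submarineDB"


class InMemoryTracker:
    """Hypothetical stand-in mirroring the documented function signatures."""

    def __init__(self):
        self._uri = None
        self.metrics = []  # (key, value, worker_index, step) tuples
        self.params = []   # (key, value, worker_index) tuples

    def set_tracking_uri(self, uri):
        # Loose shape check for <dialect>+<driver>://user:pass@host:port/db.
        parts = urlsplit(uri)
        if not parts.scheme or not parts.hostname or not parts.path.strip("/"):
            raise ValueError("expected a database URL with scheme, host and database")
        self._uri = uri

    def get_tracking_uri(self):
        # Assumed order: explicit value, environment variable, documented default.
        if self._uri is not None:
            return self._uri
        return os.environ.get("SUBMARINE_TRACKING_URI", DEFAULT_TRACKING_URI)

    def log_metric(self, key, value, worker_index, step=0):
        # The documentation states that the value must always be a number.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("metric value must be a number")
        self.metrics.append((key, float(value), worker_index, step))

    def log_param(self, key, value, worker_index):
        # The documentation states that key and value are both strings.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("parameter key and value must be strings")
        self.params.append((key, value, worker_index))


# Example: one worker logs a hyperparameter, then a loss value per step.
tracker = InMemoryTracker()
tracker.set_tracking_uri(DEFAULT_TRACKING_URI)
tracker.log_param("learning_rate", "0.01", "worker-1")
for step, loss in enumerate([0.9, 0.5, 0.3]):
    tracker.log_metric("loss", loss, "worker-1", step=step)
```

With the real package, the same sequence would presumably be written against the module-level functions (`submarine.set_tracking_uri(...)`, `submarine.log_param(...)`, `submarine.log_metric(...)`) and would need a reachable MySQL instance at the configured URL.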