Posted to submarine-dev@hadoop.apache.org by zh...@apache.org on 2019/10/10 12:32:03 UTC

[hadoop-submarine] branch master updated: SUBMARINE-212. [SDK] Update pysubmarine-tracking doc

This is an automated email from the ASF dual-hosted git repository.

zhouquan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hadoop-submarine.git


The following commit(s) were added to refs/heads/master by this push:
     new e6519d9  SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
e6519d9 is described below

commit e6519d97df18736654298796acc0f5be7e4c28ab
Author: pingsutw <pi...@gmail.com>
AuthorDate: Tue Oct 8 10:09:54 2019 +0800

    SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
    
    ### What is this PR for?
    1. move submarine-sdk/README.md to submarine-sdk/tracking/README.md
    2. document how to use pysubmarine-tracking
    
    ### What type of PR is it?
    [Documentation]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/SUBMARINE-212
    
    ### How should this be tested?
    https://travis-ci.org/pingsutw/hadoop-submarine/builds/594405780
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No
    
    Author: pingsutw <pi...@gmail.com>
    
    Closes #31 from pingsutw/SUBMARINE-212 and squashes the following commits:
    
    551bca6 [pingsutw] Fix review comments
    2267193 [pingsutw] SUBMARINE-212. [SDK] Update pysubmarine-tracking doc
---
 docs/submarine-sdk/README.md               | 72 +++++++++++-------------------
 docs/submarine-sdk/pysubmarine/README.md   | 45 +++++++++++++++++++
 docs/submarine-sdk/pysubmarine/tracking.md | 63 ++++++++++++++++++++++++++
 3 files changed, 134 insertions(+), 46 deletions(-)

diff --git a/docs/submarine-sdk/README.md b/docs/submarine-sdk/README.md
index e29f2b2..dd4fc9f 100644
--- a/docs/submarine-sdk/README.md
+++ b/docs/submarine-sdk/README.md
@@ -1,50 +1,30 @@
-<!---
-  Licensed under the Apache License, Version 2.0 (the "License");
-  you may not use this file except in compliance with the License.
-  You may obtain a copy of the License at
+<!---      
+  Licensed under the Apache License, Version 2.0 (the "License");      
+  you may not use this file except in compliance with the License.      
+  You may obtain a copy of the License at      
+      
+   http://www.apache.org/licenses/LICENSE-2.0      
+      
+  Unless required by applicable law or agreed to in writing, software      
+  distributed under the License is distributed on an "AS IS" BASIS,      
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.      
+  See the License for the specific language governing permissions and      
+  limitations under the License. See accompanying LICENSE file.      
+-->    
+    
+# Submarine-SDK
+
+## Summary
+- Supports Python, Scala, and R for algorithm development
+
+- Supports tracking/metrics APIs that allow developers to
+add tracking/metrics and view them from the Submarine Workbench UI.
+
+- [TODO] Support user-built ML pipelines
+
+### Python 
+- [pysubmarine](pysubmarine)
 
-   http://www.apache.org/licenses/LICENSE-2.0
 
-  Unless required by applicable law or agreed to in writing, software
-  distributed under the License is distributed on an "AS IS" BASIS,
-  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  See the License for the specific language governing permissions and
-  limitations under the License. See accompanying LICENSE file.
--->
 
-# Submarine-SDK
 
-Support Python, Scala, R language for algorithm development.
-The SDK is provided to help developers use submarine's internal data caching, 
-data exchange, and task tracking to more efficiently improve the development 
-and execution of machine learning tasks.
-
-- Allow data scients to track distributed ML job 
-- Support store ML parameters and metrics in Submarine-server
-- Support store ML job output (e.g. csv,images)
-- Support hdfs,S3 and mysql 
-- (WEB) Metric tracking ui in workbench-web
-- (WEB) Metric graphical display in workbench-web
-
-### Project setup
-- Clone repo
-```bash
-git https://github.com/apache/hadoop-submarine.git
-cd hadoop-submarine/pysubmarine/submarine-sdk
-```
-
-- Install pip package
-```
-pip install .
-```
-
-- Run tests
-```
-pytest --cov=submarine -vs
-```
-
-- Run checkstyle
-```
-pylint --msg-template="{path} ({line},{column}): \
-[{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
-```
\ No newline at end of file
diff --git a/docs/submarine-sdk/pysubmarine/README.md b/docs/submarine-sdk/pysubmarine/README.md
new file mode 100644
index 0000000..f82dea5
--- /dev/null
+++ b/docs/submarine-sdk/pysubmarine/README.md
@@ -0,0 +1,45 @@
+<!---  
+  Licensed under the Apache License, Version 2.0 (the "License");  
+  you may not use this file except in compliance with the License.  
+  You may obtain a copy of the License at  
+  
+   http://www.apache.org/licenses/LICENSE-2.0  
+  
+  Unless required by applicable law or agreed to in writing, software  
+  distributed under the License is distributed on an "AS IS" BASIS,  
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  
+  See the License for the specific language governing permissions and  
+  limitations under the License. See accompanying LICENSE file.  
+-->  
+
+# PySubmarine
+PySubmarine helps developers use Submarine's internal data caching,
+data exchange, and task tracking capabilities to develop and run
+machine learning tasks more efficiently.
+
+## Package setup
+- Clone repo
+```bash
+git clone https://github.com/apache/hadoop-submarine.git 
+cd hadoop-submarine/submarine-sdk/pysubmarine
+```
+
+- Install pip package
+```bash
+pip install .
+```
+
+- Run tests
+```bash
+pytest --cov=submarine -vs
+```
+
+- Run checkstyle
+```bash
+pylint --msg-template="{path} ({line},{column}): \
+[{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
+```
+
+## PySubmarine API Reference
+### Tracking
+- [Tracking](tracking.md)
\ No newline at end of file
diff --git a/docs/submarine-sdk/pysubmarine/tracking.md b/docs/submarine-sdk/pysubmarine/tracking.md
new file mode 100644
index 0000000..abb0134
--- /dev/null
+++ b/docs/submarine-sdk/pysubmarine/tracking.md
@@ -0,0 +1,63 @@
+<!---  
+  Licensed under the Apache License, Version 2.0 (the "License");
+  you may not use this file except in compliance with the License.
+  You may obtain a copy of the License at
+  
+   http://www.apache.org/licenses/LICENSE-2.0
+  
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  
+  See the License for the specific language governing permissions and 
+  limitations under the License. See accompanying LICENSE file.
+-->
+
+# pysubmarine-tracking
+- Allows data scientists to track distributed ML jobs
+- Supports storing ML parameters and metrics in Submarine-server
+- [TODO] Support storing ML job output (e.g. CSV, images)
+- Supports HDFS, S3, and MySQL (currently only MySQL is supported)
+- [TODO] (WEB) Metric tracking UI in workbench-web
+- [TODO] (WEB) Metric graphical display in workbench-web
+
+## Functions
+### `submarine.get_tracking_uri()`
+
+Returns the tracking URI.
+
+### `submarine.set_tracking_uri(URI)`
+
+Sets the tracking URI. You can also set the
+SUBMARINE_TRACKING_URI environment variable to have Submarine read the URI from
+there. The URI should be a database connection string.
+
+**Parameters**
+
+- URI - Submarine records data to a MySQL server. The database URL
+is expected in the format ``<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>``.
+The default is `mysql+pymysql://submarine:password@localhost:3306/submarineDB`.
+For more detail, see the [SQLAlchemy docs](https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls).
+
+<!--
+    TODO : get database url from submarine-site.xml 
+-->
+
+### `submarine.log_metric(key, value, worker_index, step=0)`
+
+Logs a single key-value metric. The value must always be a number.
+
+**Parameters**
+- key - Metric name (string).
+- value - Metric value (float).
+- worker_index - The index of the worker (string), for example "rank-1" or "worker-2".
+- step - The integer step at which to log the metric.
+The default is 0.
+
+### `submarine.log_param(key, value, worker_index)`
+
+Logs a single key-value parameter. The key and value are both strings.
+
+**Parameters**
+- key - Parameter name (string).
+- value - Parameter value (string).
+- worker_index - The index of the worker (string), for example "rank-1" or "worker-2".
\ No newline at end of file
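
Note on the tracking URI described in tracking.md above: the sketch below shows how the documented `<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>` format breaks down into its parts, using only the Python standard library. It does not call pysubmarine. The helper names `resolve_tracking_uri` and `split_database_url` are hypothetical, and the lookup order (explicit value, then the SUBMARINE_TRACKING_URI environment variable, then the documented default) is an assumption made for illustration, not confirmed behavior of the SDK.

```python
import os
from urllib.parse import urlsplit

# Default URL and environment variable name as quoted in tracking.md.
DEFAULT_TRACKING_URI = "mysql+pymysql://submarine:password@localhost:3306/submarineDB"
ENV_VAR = "SUBMARINE_TRACKING_URI"


def resolve_tracking_uri(explicit_uri=None, environ=None):
    """Hypothetical helper: choose a tracking URI.

    Assumed precedence (illustration only): explicit argument,
    then the environment variable, then the documented default.
    """
    environ = os.environ if environ is None else environ
    return explicit_uri or environ.get(ENV_VAR) or DEFAULT_TRACKING_URI


def split_database_url(uri):
    """Hypothetical helper: split a database URL into its named parts."""
    parts = urlsplit(uri)
    # The scheme carries "<dialect>+<driver>", e.g. "mysql+pymysql".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
```

Calling `split_database_url(resolve_tracking_uri())` with no environment variable set splits the default URL into the dialect `mysql`, the driver `pymysql`, and the database `submarineDB`. A check like this can help catch a malformed URL before it is passed to `submarine.set_tracking_uri(URI)`.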