You are viewing a plain text version of this content. The canonical link for it is here.
Posted to submarine-dev@hadoop.apache.org by zt...@apache.org on 2019/10/01 13:16:42 UTC
[hadoop-submarine] branch master updated: SUBMARINE-149. [SDK] Add
submarine-sdk module
This is an automated email from the ASF dual-hosted git repository.
ztang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hadoop-submarine.git
The following commit(s) were added to refs/heads/master by this push:
new e34a7ee SUBMARINE-149. [SDK] Add submarine-sdk module
e34a7ee is described below
commit e34a7ee8a28752b532156c40287ba7de26b485ab
Author: pingsutw <pi...@gmail.com>
AuthorDate: Tue Oct 1 17:35:18 2019 +0800
SUBMARINE-149. [SDK] Add submarine-sdk module
### What is this PR for?
- Initialize submarine-SDK
- Support to log metrics and parameters to submarine server (Mysql)
- submarine SDK example
- UnitTest
- Checkstyle testing
### What type of PR is it?
[Feature]
### Todos
* [ ] - Task
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-149
### How should this be tested?
https://travis-ci.org/pingsutw/hadoop-submarine/builds/590359651
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? Yes
Author: pingsutw <pi...@gmail.com>
Closes #19 from pingsutw/SUBMARINE-149 and squashes the following commits:
b7475e4 [pingsutw] Fix review comments
61119c4 [pingsutw] Fix review comments
cc6753d [pingsutw] Add test_utils.py
ebc6658 [pingsutw] Refine code
ce34aad [pingsutw] Add test metrics and params
148f09a [pingsutw] Add test_env
382c24c [pingsutw] Initialize submarine-sdk
---
.gitignore | 51 ++
.travis.yml | 11 +
docs/submarine-sdk/README.md | 50 ++
submarine-sdk/pysubmarine/example/Tracking.py | 32 ++
submarine-sdk/pysubmarine/pylintrc | 600 +++++++++++++++++++++
submarine-sdk/pysubmarine/setup.py | 38 ++
submarine-sdk/pysubmarine/submarine/__init__.py | 24 +
.../pysubmarine/submarine/entities/Metric.py | 54 ++
.../pysubmarine/submarine/entities/Param.py | 42 ++
.../pysubmarine/submarine/entities/__init__.py | 22 +
.../submarine/entities/_submarine_object.py | 58 ++
submarine-sdk/pysubmarine/submarine/exceptions.py | 26 +
.../pysubmarine/submarine/store/__init__.py | 18 +
.../pysubmarine/submarine/store/abstract_store.py | 48 ++
.../submarine/store/database/__init__.py | 14 +
.../submarine/store/database/db_types.py | 30 ++
.../pysubmarine/submarine/store/database/models.py | 138 +++++
.../submarine/store/sqlalchemy_store.py | 154 ++++++
.../pysubmarine/submarine/tracking/__init__.py | 26 +
.../pysubmarine/submarine/tracking/client.py | 65 +++
.../pysubmarine/submarine/tracking/fluent.py | 66 +++
.../pysubmarine/submarine/tracking/utils.py | 76 +++
.../pysubmarine/submarine/utils/__init__.py | 36 ++
submarine-sdk/pysubmarine/submarine/utils/env.py | 25 +
.../pysubmarine/submarine/utils/validation.py | 115 ++++
submarine-sdk/pysubmarine/tests/__init__.py | 14 +
.../pysubmarine/tests/entities/test_metrics.py | 38 ++
.../pysubmarine/tests/entities/test_params.py | 32 ++
.../tests/store/test_sqlalchemy_store.py | 73 +++
.../pysubmarine/tests/tracking/test_tracking.py | 67 +++
.../pysubmarine/tests/tracking/test_utils.py | 60 +++
submarine-sdk/pysubmarine/tests/utils/test_env.py | 28 +
.../pysubmarine/tests/utils/test_validation.py | 65 +++
submarine-sdk/pysubmarine/travis/conda.sh | 54 ++
.../pysubmarine/travis/lint-requirements.txt | 16 +
submarine-sdk/pysubmarine/travis/lint.sh | 24 +
.../pysubmarine/travis/test-requirements.txt | 27 +
submodules/tony | 2 +-
38 files changed, 2318 insertions(+), 1 deletion(-)
diff --git a/.gitignore b/.gitignore
index fa80ba1..b8552dd 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,54 @@
+# Byte-compiled / optimized / DLL files
+__pycache__
+*.py[cod]
+*$py.class
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs*/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+node_modules
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+
+# Environments
+env
+env3
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+.python-version
.DS_Store
.idea
*.iml
diff --git a/.travis.yml b/.travis.yml
index a8f8f1b..5a96fc0 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -112,6 +112,17 @@ matrix:
dist: xenial
env: NAME="Test submarine distribution" PROFILE="-Phadoop-2.9" BUILD_FLAG="clean package install -DskipTests" TEST_FLAG="test -DskipRat -am" MODULES="" TEST_MODULES="" TEST_PROJECTS=""
+ # Test submarine-sdk
+ - language: python
+ python: 3.6
+ dist: xenial
+ install:
+ - source ./submarine-sdk/pysubmarine/travis/conda.sh
+ - pip install -r ./submarine-sdk/pysubmarine/travis/lint-requirements.txt
+ script:
+ - ./submarine-sdk/pysubmarine/travis/lint.sh
+ - pytest --cov=submarine -vs
+
install:
- mvn --version
diff --git a/docs/submarine-sdk/README.md b/docs/submarine-sdk/README.md
new file mode 100644
index 0000000..e29f2b2
--- /dev/null
+++ b/docs/submarine-sdk/README.md
@@ -0,0 +1,50 @@
+<!---
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License. See accompanying LICENSE file.
+-->
+
+# Submarine-SDK
+
+Support Python, Scala, R language for algorithm development.
+The SDK is provided to help developers use submarine's internal data caching,
+data exchange, and task tracking to more efficiently improve the development
+and execution of machine learning tasks.
+
+- Allow data scients to track distributed ML job
+- Support store ML parameters and metrics in Submarine-server
+- Support store ML job output (e.g. csv,images)
+- Support hdfs,S3 and mysql
+- (WEB) Metric tracking ui in workbench-web
+- (WEB) Metric graphical display in workbench-web
+
+### Project setup
+- Clone repo
+```bash
+git https://github.com/apache/hadoop-submarine.git
+cd hadoop-submarine/pysubmarine/submarine-sdk
+```
+
+- Install pip package
+```
+pip install .
+```
+
+- Run tests
+```
+pytest --cov=submarine -vs
+```
+
+- Run checkstyle
+```
+pylint --msg-template="{path} ({line},{column}): \
+[{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
+```
\ No newline at end of file
diff --git a/submarine-sdk/pysubmarine/example/Tracking.py b/submarine-sdk/pysubmarine/example/Tracking.py
new file mode 100644
index 0000000..753df3f
--- /dev/null
+++ b/submarine-sdk/pysubmarine/example/Tracking.py
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+from os import environ
+from sklearn.linear_model import LogisticRegression
+import submarine
+
+if __name__ == "__main__":
+ # note: SUBMARINE_JOB_NAME should be set by submarine submitter
+ environ["SUBMARINE_JOB_NAME"] = "application_1234"
+ X = np.array([-2, -1, 0, 1, 2, 1]).reshape(-1, 1)
+ y = np.array([0, 0, 1, 1, 1, 0])
+ lr = LogisticRegression(solver='liblinear', max_iter=100)
+ submarine.log_param("max_iter", 100, "worker-1")
+ lr.fit(X, y)
+ score = lr.score(X, y)
+ print("Score: %s" % score)
+ submarine.log_metric("score", score, "worker-1")
+
diff --git a/submarine-sdk/pysubmarine/pylintrc b/submarine-sdk/pysubmarine/pylintrc
new file mode 100644
index 0000000..b657436
--- /dev/null
+++ b/submarine-sdk/pysubmarine/pylintrc
@@ -0,0 +1,600 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License
+
+[MASTER]
+
+# A comma-separated list of package or module names from where C extensions may
+# be loaded. Extensions are loading into the active Python interpreter and may
+# run arbitrary code.
+extension-pkg-whitelist=
+
+# Add files or directories to the blacklist. They should be base names, not
+# paths.
+ignore=CVS
+
+# Add files or directories matching the regex patterns to the blacklist. The
+# regex matches against base names, not paths.
+ignore-patterns=
+
+# Python code to execute, usually for sys.path manipulation such as
+# pygtk.require().
+#init-hook=
+
+# Use multiple processes to speed up Pylint. Specifying 0 will auto-detect the
+# number of processors available to use.
+jobs=1
+
+# Control the amount of potential inferred values when inferring a single
+# object. This can help the performance when dealing with large functions or
+# complex, nested conditions.
+limit-inference-results=100
+
+# List of plugins (as comma separated values of python modules names) to load,
+# usually to register additional checkers.
+load-plugins=
+
+# Pickle collected data for later comparisons.
+persistent=yes
+
+# Specify a configuration file.
+#rcfile=
+
+# When enabled, pylint would attempt to guess common misconfiguration and emit
+# user-friendly hints instead of false-positive error messages.
+suggestion-mode=yes
+
+# Allow loading of arbitrary C extensions. Extensions are imported into the
+# active Python interpreter and may run arbitrary code.
+unsafe-load-any-extension=no
+
+
+[MESSAGES CONTROL]
+
+# Only show warnings with the listed confidence levels. Leave empty to show
+# all. Valid levels: HIGH, INFERENCE, INFERENCE_FAILURE, UNDEFINED.
+confidence=
+
+# Disable the message, report, category or checker with the given id(s). You
+# can either give multiple identifiers separated by comma (,) or put this
+# option multiple times (only on the command line, not in the configuration
+# file where it should appear only once). You can also use "--disable=all" to
+# disable everything first and then reenable specific checks. For example, if
+# you want to run only the similarities checker, you can use "--disable=all
+# --enable=similarities". If you want to run only the classes checker, but have
+# no Warning level messages displayed, use "--disable=all --enable=classes
+# --disable=W".
+disable=missing-docstring,
+ print-statement,
+ parameter-unpacking,
+ unpacking-in-except,
+ old-raise-syntax,
+ backtick,
+ long-suffix,
+ old-ne-operator,
+ old-octal-literal,
+ import-star-module-level,
+ non-ascii-bytes-literal,
+ raw-checker-failed,
+ bad-inline-option,
+ locally-disabled,
+ locally-enabled,
+ file-ignored,
+ suppressed-message,
+ useless-suppression,
+ deprecated-pragma,
+ use-symbolic-message-instead,
+ apply-builtin,
+ basestring-builtin,
+ buffer-builtin,
+ cmp-builtin,
+ coerce-builtin,
+ execfile-builtin,
+ file-builtin,
+ long-builtin,
+ raw_input-builtin,
+ reduce-builtin,
+ standarderror-builtin,
+ unicode-builtin,
+ xrange-builtin,
+ coerce-method,
+ delslice-method,
+ getslice-method,
+ setslice-method,
+ no-absolute-import,
+ old-division,
+ dict-iter-method,
+ dict-view-method,
+ next-method-called,
+ metaclass-assignment,
+ indexing-exception,
+ raising-string,
+ reload-builtin,
+ oct-method,
+ hex-method,
+ nonzero-method,
+ cmp-method,
+ input-builtin,
+ round-builtin,
+ intern-builtin,
+ unichr-builtin,
+ map-builtin-not-iterating,
+ zip-builtin-not-iterating,
+ range-builtin-not-iterating,
+ filter-builtin-not-iterating,
+ using-cmp-argument,
+ eq-without-hash,
+ div-method,
+ idiv-method,
+ rdiv-method,
+ exception-message-attribute,
+ invalid-str-codec,
+ sys-max-int,
+ bad-python3-import,
+ deprecated-string-function,
+ deprecated-str-translate-call,
+ deprecated-itertools-function,
+ deprecated-types-field,
+ next-method-defined,
+ dict-items-not-iterating,
+ dict-keys-not-iterating,
+ dict-values-not-iterating,
+ # For Submarine, ignore "convention" and "refactor" message types
+ # (see all message types at https://docs.pylint.org/en/1.6.0/tutorial.html#getting-started)
+ C,
+ R,
+ # Additional style-check messages disabled for Submarine
+ invalid-name,
+ no-else-return,
+ protected-access,
+ too-many-instance-attributes,
+ len-as-condition,
+ too-many-locals,
+ too-many-arguments,
+ too-many-return-statements,
+ fixme,
+ global-statement,
+ no-member, # disabled due to https://github.com/PyCQA/pylint/issues/1864
+ # redefined-outer-name disabled to avoid linting errors when using pytest fixtures
+ # (see https://stackoverflow.com/questions/46089480/pytest-fixtures-redefining-name-from-outer-scope-pylint)
+ redefined-outer-name,
+
+# Enable the message, report, category or checker with the given id(s). You can
+# either give multiple identifier separated by comma (,) or put this option
+# multiple time (only on the command line, not in the configuration file where
+# it should appear only once). See also the "--disable" option for examples.
+enable=c-extension-no-member
+
+
+[REPORTS]
+
+# Python expression which should return a note less than 10 (10 is the highest
+# note). You have access to the variables errors warning, statement which
+# respectively contain the number of errors / warnings messages and the total
+# number of statements analyzed. This is used by the global evaluation report
+# (RP0004).
+evaluation=10.0 - ((float(5 * error + warning + refactor + convention) / statement) * 10)
+
+# Template used to display messages. This is a python new-style format string
+# used to format the message information. See doc for all details.
+#msg-template=
+
+# Set the output format. Available formats are text, parseable, colorized, json
+# and msvs (visual studio). You can also give a reporter class, e.g.
+# mypackage.mymodule.MyReporterClass.
+output-format=text
+
+# Tells whether to display a full report or only the messages.
+reports=no
+
+# Activate the evaluation score.
+score=yes
+
+
+[REFACTORING]
+
+# Maximum number of nested blocks for function / method body
+max-nested-blocks=5
+
+# Complete name of functions that never returns. When checking for
+# inconsistent-return-statements if a never returning function is called then
+# it will be considered as an explicit return statement and no message will be
+# printed.
+never-returning-functions=optparse.Values,sys.exit
+
+
+[LOGGING]
+
+# Format style used to check logging format string. `old` means using %
+# formatting, while `new` is for `{}` formatting.
+logging-format-style=old
+
+# Logging modules to check that the string format arguments are in logging
+# function parameter format.
+logging-modules=logging
+
+
+[SIMILARITIES]
+
+# Ignore comments when computing similarities.
+ignore-comments=yes
+
+# Ignore docstrings when computing similarities.
+ignore-docstrings=yes
+
+# Ignore imports when computing similarities.
+ignore-imports=no
+
+# Minimum lines number of a similarity.
+min-similarity-lines=4
+
+
+[VARIABLES]
+
+# List of additional names supposed to be defined in builtins. Remember that
+# you should avoid defining new builtins when possible.
+additional-builtins=
+
+# Tells whether unused global variables should be treated as a violation.
+allow-global-unused-variables=yes
+
+# List of strings which can identify a callback function by name. A callback
+# name must start or end with one of those strings.
+callbacks=cb_,
+ _cb
+
+# A regular expression matching the name of dummy variables (i.e. expected to
+# not be used).
+dummy-variables-rgx=_+$|(_[a-zA-Z0-9_]*[a-zA-Z0-9]+?$)|dummy|^ignored_|^unused_
+
+# Argument names that match this expression will be ignored. Default to name
+# with leading underscore.
+ignored-argument-names=_.*|^ignored_|^unused_
+
+# Tells whether we should check for unused import in __init__ files.
+init-import=no
+
+# List of qualified module names which can have objects that can redefine
+# builtins.
+redefining-builtins-modules=six.moves,past.builtins,future.builtins,builtins,io
+
+
+[SPELLING]
+
+# Limits count of emitted suggestions for spelling mistakes.
+max-spelling-suggestions=4
+
+# Spelling dictionary name. Available dictionaries: none. To make it working
+# install python-enchant package..
+spelling-dict=
+
+# List of comma separated words that should not be checked.
+spelling-ignore-words=
+
+# A path to a file that contains private dictionary; one word per line.
+spelling-private-dict-file=
+
+# Tells whether to store unknown words to indicated private dictionary in
+# --spelling-private-dict-file option instead of raising a message.
+spelling-store-unknown-words=no
+
+
+[FORMAT]
+
+# Expected format of line ending, e.g. empty (any line ending), LF or CRLF.
+expected-line-ending-format=
+
+# Regexp for a line that is allowed to be longer than the limit.
+ignore-long-lines=^\s*(# )?<?https?://\S+>?$
+
+# Number of spaces of indent required inside a hanging or continued line.
+indent-after-paren=4
+
+# String used as indentation unit. This is usually " " (4 spaces) or "\t" (1
+# tab).
+indent-string=' '
+
+# Maximum number of characters on a single line.
+max-line-length=100
+
+# Maximum number of lines in a module.
+max-module-lines=1000
+
+# List of optional constructs for which whitespace checking is disabled. `dict-
+# separator` is used to allow tabulation in dicts, etc.: {1 : 1,\n222: 2}.
+# `trailing-comma` allows a space between comma and closing bracket: (a, ).
+# `empty-line` allows space-only lines.
+no-space-check=trailing-comma,
+ dict-separator
+
+# Allow the body of a class to be on the same line as the declaration if body
+# contains single statement.
+single-line-class-stmt=no
+
+# Allow the body of an if to be on the same line as the test if there is no
+# else.
+single-line-if-stmt=no
+
+
+[BASIC]
+
+# Naming style matching correct argument names.
+argument-naming-style=snake_case
+
+# Regular expression matching correct argument names. Overrides argument-
+# naming-style.
+#argument-rgx=
+
+# Naming style matching correct attribute names.
+attr-naming-style=snake_case
+
+# Regular expression matching correct attribute names. Overrides attr-naming-
+# style.
+#attr-rgx=
+
+# Bad variable names which should always be refused, separated by a comma.
+bad-names=foo,
+ bar,
+ baz,
+ toto,
+ tutu,
+ tata
+
+# Naming style matching correct class attribute names.
+class-attribute-naming-style=any
+
+# Regular expression matching correct class attribute names. Overrides class-
+# attribute-naming-style.
+#class-attribute-rgx=
+
+# Naming style matching correct class names.
+class-naming-style=PascalCase
+
+# Regular expression matching correct class names. Overrides class-naming-
+# style.
+#class-rgx=
+
+# Naming style matching correct constant names.
+const-naming-style=UPPER_CASE
+
+# Regular expression matching correct constant names. Overrides const-naming-
+# style.
+#const-rgx=
+
+# Minimum line length for functions/classes that require docstrings, shorter
+# ones are exempt.
+docstring-min-length=-1
+
+# Naming style matching correct function names.
+function-naming-style=snake_case
+
+# Regular expression matching correct function names. Overrides function-
+# naming-style.
+#function-rgx=
+
+# Good variable names which should always be accepted, separated by a comma.
+good-names=i,
+ j,
+ k,
+ ex,
+ Run,
+ _
+
+# Include a hint for the correct naming format with invalid-name.
+include-naming-hint=no
+
+# Naming style matching correct inline iteration names.
+inlinevar-naming-style=any
+
+# Regular expression matching correct inline iteration names. Overrides
+# inlinevar-naming-style.
+#inlinevar-rgx=
+
+# Naming style matching correct method names.
+method-naming-style=snake_case
+
+# Regular expression matching correct method names. Overrides method-naming-
+# style.
+#method-rgx=
+
+# Naming style matching correct module names.
+module-naming-style=snake_case
+
+# Regular expression matching correct module names. Overrides module-naming-
+# style.
+#module-rgx=
+
+# Colon-delimited sets of names that determine each other's naming style when
+# the name regexes allow several styles.
+name-group=
+
+# Regular expression which should only match function or class names that do
+# not require a docstring.
+no-docstring-rgx=^_
+
+# List of decorators that produce properties, such as abc.abstractproperty. Add
+# to this list to register other decorators that produce valid properties.
+# These decorators are taken in consideration only for invalid-name.
+property-classes=abc.abstractproperty
+
+# Naming style matching correct variable names.
+variable-naming-style=snake_case
+
+# Regular expression matching correct variable names. Overrides variable-
+# naming-style.
+#variable-rgx=
+
+
+[MISCELLANEOUS]
+
+# List of note tags to take in consideration, separated by a comma.
+notes=FIXME,
+ XXX,
+ TODO
+
+
+[TYPECHECK]
+
+# List of decorators that produce context managers, such as
+# contextlib.contextmanager. Add to this list to register other decorators that
+# produce valid context managers.
+contextmanager-decorators=contextlib.contextmanager
+
+# List of members which are set dynamically and missed by pylint inference
+# system, and so shouldn't trigger E1101 when accessed. Python regular
+# expressions are accepted.
+generated-members=
+
+# Tells whether missing members accessed in mixin class should be ignored. A
+# mixin class is detected if its name ends with "mixin" (case insensitive).
+ignore-mixin-members=yes
+
+# Tells whether to warn about missing members when the owner of the attribute
+# is inferred to be None.
+ignore-none=yes
+
+# This flag controls whether pylint should warn about no-member and similar
+# checks whenever an opaque object is returned when inferring. The inference
+# can return multiple potential results while evaluating a Python object, but
+# some branches might not be evaluated, which results in partial inference. In
+# that case, it might be useful to still emit no-member and other checks for
+# the rest of the inferred objects.
+ignore-on-opaque-inference=yes
+
+# List of class names for which member attributes should not be checked (useful
+# for classes with dynamically set attributes). This supports the use of
+# qualified names.
+ignored-classes=optparse.Values,thread._local,_thread._local
+
+# List of module names for which member attributes should not be checked
+# (useful for modules/projects where namespaces are manipulated during runtime
+# and thus existing member attributes cannot be deduced by static analysis. It
+# supports qualified module names, as well as Unix pattern matching.
+ignored-modules=distutils,tensorflow.keras
+
+# Show a hint with possible names when a member name was not found. The aspect
+# of finding the hint is based on edit distance.
+missing-member-hint=yes
+
+# The minimum edit distance a name should have in order to be considered a
+# similar match for a missing member name.
+missing-member-hint-distance=1
+
+# The total number of similar names that should be taken in consideration when
+# showing a hint for a missing member.
+missing-member-max-choices=1
+
+
+[STRING]
+
+# This flag controls whether the implicit-str-concat-in-sequence should
+# generate a warning on implicit string concatenation in sequences defined over
+# several lines.
+check-str-concat-over-line-jumps=no
+
+
+[CLASSES]
+
+# List of method names used to declare (i.e. assign) instance attributes.
+defining-attr-methods=__init__,
+ __new__,
+ setUp
+
+# List of member names, which should be excluded from the protected access
+# warning.
+exclude-protected=_asdict,
+ _fields,
+ _replace,
+ _source,
+ _make
+
+# List of valid names for the first argument in a class method.
+valid-classmethod-first-arg=cls
+
+# List of valid names for the first argument in a metaclass class method.
+valid-metaclass-classmethod-first-arg=cls
+
+
+[IMPORTS]
+
+# Allow wildcard imports from modules that define __all__.
+allow-wildcard-with-all=no
+
+# Analyse import fallback blocks. This can be used to support both Python 2 and
+# 3 compatible code, which means that the block might have code that exists
+# only in one or another interpreter, leading to false positives when analysed.
+analyse-fallback-blocks=no
+
+# Deprecated modules which should not be used, separated by a comma.
+deprecated-modules=optparse,tkinter.tix
+
+# Create a graph of external dependencies in the given file (report RP0402 must
+# not be disabled).
+ext-import-graph=
+
+# Create a graph of every (i.e. internal and external) dependencies in the
+# given file (report RP0402 must not be disabled).
+import-graph=
+
+# Create a graph of internal dependencies in the given file (report RP0402 must
+# not be disabled).
+int-import-graph=
+
+# Force import order to recognize a module as part of the standard
+# compatibility libraries.
+known-standard-library=
+
+# Force import order to recognize a module as part of a third party library.
+known-third-party=enchant
+
+
+[DESIGN]
+
+# Maximum number of arguments for function / method.
+max-args=5
+
+# Maximum number of attributes for a class (see R0902).
+max-attributes=7
+
+# Maximum number of boolean expressions in an if statement.
+max-bool-expr=5
+
+# Maximum number of branch for function / method body.
+max-branches=12
+
+# Maximum number of locals for function / method body.
+max-locals=15
+
+# Maximum number of parents for a class (see R0901).
+max-parents=7
+
+# Maximum number of public methods for a class (see R0904).
+max-public-methods=20
+
+# Maximum number of return / yield for function / method body.
+max-returns=6
+
+# Maximum number of statements in function / method body.
+max-statements=50
+
+# Minimum number of public methods for a class (see R0903).
+min-public-methods=2
+
+
+[EXCEPTIONS]
+
+# Exceptions that will emit a warning when being caught. Defaults to
+# "BaseException, Exception".
+overgeneral-exceptions=BaseException,
+ Exception
diff --git a/submarine-sdk/pysubmarine/setup.py b/submarine-sdk/pysubmarine/setup.py
new file mode 100644
index 0000000..10d71c0
--- /dev/null
+++ b/submarine-sdk/pysubmarine/setup.py
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from setuptools import setup, find_packages
+
+
+setup(
+ name='pysubmarine',
+ version='0.3.0-SNAPSHOT',
+ description="A python SDK for submarine",
+ url="https://github.com/apache/hadoop-submarine",
+ packages=find_packages(exclude=['tests', 'tests.*']),
+ install_requires=[
+ 'six>=1.10.0',
+ 'numpy',
+ 'pandas',
+ 'sqlalchemy',
+ 'sqlparse',
+ 'pymysql',
+ ],
+ classifiers=[
+ 'Intended Audience :: Developers',
+ 'Programming Language :: Python :: 2.7',
+ 'Programming Language :: Python :: 3.6',
+ ],
+)
diff --git a/submarine-sdk/pysubmarine/submarine/__init__.py b/submarine-sdk/pysubmarine/submarine/__init__.py
new file mode 100644
index 0000000..84b8890
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/__init__.py
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import submarine.tracking.fluent
+import submarine.tracking as tracking
+
+log_param = submarine.tracking.fluent.log_param
+log_metric = submarine.tracking.fluent.log_metric
+set_tracking_uri = tracking.set_tracking_uri
+get_tracking_uri = tracking.get_tracking_uri
+
+__all__ = ["log_metric", "log_param", "set_tracking_uri", "get_tracking_uri"]
diff --git a/submarine-sdk/pysubmarine/submarine/entities/Metric.py b/submarine-sdk/pysubmarine/submarine/entities/Metric.py
new file mode 100644
index 0000000..ea69954
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/entities/Metric.py
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.entities._submarine_object import _SubmarineObject
+
+
+class Metric(_SubmarineObject):
+ """
+ Metric object.
+ """
+
+ def __init__(self, key, value, worker_index, timestamp, step):
+ self._key = key
+ self._value = value
+ self._worker_index = worker_index
+ self._timestamp = timestamp
+ self._step = step
+
+ @property
+ def key(self):
+ """String key corresponding to the metric name."""
+ return self._key
+
+ @property
+ def value(self):
+ """Float value of the metric."""
+ return self._value
+
+ @property
+ def worker_index(self):
+ """string value of the metric."""
+ return self._worker_index
+
+ @property
+ def timestamp(self):
+ """Metric timestamp as an integer (milliseconds since the Unix epoch)."""
+ return self._timestamp
+
+ @property
+ def step(self):
+ """Integer metric step (x-coordinate)."""
+ return self._step
diff --git a/submarine-sdk/pysubmarine/submarine/entities/Param.py b/submarine-sdk/pysubmarine/submarine/entities/Param.py
new file mode 100644
index 0000000..af09bd8
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/entities/Param.py
@@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.entities._submarine_object import _SubmarineObject
+
+
+class Param(_SubmarineObject):
+ """
+ Parameter object.
+ """
+
+ def __init__(self, key, value, worker_index):
+ self._key = key
+ self._value = value
+ self._worker_index = worker_index
+
+ @property
+ def key(self):
+ """String key corresponding to the parameter name."""
+ return self._key
+
+ @property
+ def value(self):
+ """String value of the parameter."""
+ return self._value
+
+ @property
+ def worker_index(self):
+ """String value of the parameter."""
+ return self._worker_index
diff --git a/submarine-sdk/pysubmarine/submarine/entities/__init__.py b/submarine-sdk/pysubmarine/submarine/entities/__init__.py
new file mode 100644
index 0000000..e2d8479
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/entities/__init__.py
@@ -0,0 +1,22 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.entities.Metric import Metric
+from submarine.entities.Param import Param
+
+__all__ = [
+ "Metric",
+ "Param",
+]
diff --git a/submarine-sdk/pysubmarine/submarine/entities/_submarine_object.py b/submarine-sdk/pysubmarine/submarine/entities/_submarine_object.py
new file mode 100644
index 0000000..29137b5
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/entities/_submarine_object.py
@@ -0,0 +1,58 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import pprint
+
+
+class _SubmarineObject:
+ def __iter__(self):
+ # Iterate through list of properties and yield as key -> value
+ for prop in self._properties():
+ yield prop, self.__getattribute__(prop)
+
+ @classmethod
+ def _properties(cls):
+ return sorted([p for p in cls.__dict__ if isinstance(getattr(cls, p), property)])
+
+ @classmethod
+ def from_dictionary(cls, the_dict):
+ filtered_dict = {key: value for key, value in the_dict.items() if key in cls._properties()}
+ return cls(**filtered_dict)
+
+ def __repr__(self):
+ return to_string(self)
+
+
+def to_string(obj):
+ return _SubmarineObjectPrinter().to_string(obj)
+
+
+def get_classname(obj):
+ return type(obj).__name__
+
+
+class _SubmarineObjectPrinter:
+
+ def __init__(self):
+ super(_SubmarineObjectPrinter, self).__init__()
+ self.printer = pprint.PrettyPrinter()
+
+ def to_string(self, obj):
+ if isinstance(obj, _SubmarineObject):
+ return "<%s: %s>" % (get_classname(obj), self._entity_to_string(obj))
+ return self.printer.pformat(obj)
+
+ def _entity_to_string(self, entity):
+ return ", ".join(["%s=%s" % (key, self.to_string(value)) for key, value in entity])
diff --git a/submarine-sdk/pysubmarine/submarine/exceptions.py b/submarine-sdk/pysubmarine/submarine/exceptions.py
new file mode 100644
index 0000000..6ce1bca
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/exceptions.py
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+class SubmarineException(Exception):
+ """
+ Generic exception thrown to surface failure information about external-facing operations.
+ """
+ def __init__(self, message):
+ """
+ :param message: The message describing the error that occured.
+ """
+ self.message = message
+ super(SubmarineException, self).__init__(message)
diff --git a/submarine-sdk/pysubmarine/submarine/store/__init__.py b/submarine-sdk/pysubmarine/submarine/store/__init__.py
new file mode 100644
index 0000000..a8362d7
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/__init__.py
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+DEFAULT_SUBMARINE_JDBC_URL = "mysql+pymysql://submarine:password@localhost:3306/submarineDB"
+
+__all__ = ["DEFAULT_SUBMARINE_JDBC_URL"]
diff --git a/submarine-sdk/pysubmarine/submarine/store/abstract_store.py b/submarine-sdk/pysubmarine/submarine/store/abstract_store.py
new file mode 100644
index 0000000..7ca99fb
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/abstract_store.py
@@ -0,0 +1,48 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from abc import ABCMeta
+
+
+class AbstractStore:
+ """
+ Abstract class for Backend Storage.
+ This class defines the API interface for front ends to connect with various types of backends.
+ """
+
+ __metaclass__ = ABCMeta
+
+ def __init__(self):
+ """
+ Empty constructor for now. This is deliberately not marked as abstract, else every
+ derived class would be forced to create one.
+ """
+ pass
+
+ def log_metric(self, job_name, metric):
+ """
+ Log a metric for the specified run
+ :param job_name: String id for the run
+ :param metric: :py:class:`submarine.entities.Metric` instance to log
+ """
+ pass
+
+ def log_param(self, job_name, param):
+ """
+ Log a param for the specified run
+ :param job_name: String id for the run
+ :param param: :py:class:`submarine.entities.Param` instance to log
+ """
+ pass
diff --git a/submarine-sdk/pysubmarine/submarine/store/database/__init__.py b/submarine-sdk/pysubmarine/submarine/store/database/__init__.py
new file mode 100644
index 0000000..a6eb1b5
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/database/__init__.py
@@ -0,0 +1,14 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/submarine-sdk/pysubmarine/submarine/store/database/db_types.py b/submarine-sdk/pysubmarine/submarine/store/database/db_types.py
new file mode 100644
index 0000000..aff53ee
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/database/db_types.py
@@ -0,0 +1,30 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Set of SQLAlchemy database schemas supported in Submarine for tracking server backends.
+"""
+
+POSTGRES = 'postgresql'
+MYSQL = 'mysql'
+SQLITE = 'sqlite'
+MSSQL = 'mssql'
+
+DATABASE_ENGINES = [
+ POSTGRES,
+ MYSQL,
+ SQLITE,
+ MSSQL
+]
diff --git a/submarine-sdk/pysubmarine/submarine/store/database/models.py b/submarine-sdk/pysubmarine/submarine/store/database/models.py
new file mode 100644
index 0000000..c63dc2e
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/database/models.py
@@ -0,0 +1,138 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import time
+import sqlalchemy as sa
+from sqlalchemy import (Column, String, BigInteger,
+ PrimaryKeyConstraint, Boolean)
+from sqlalchemy.ext.declarative import declarative_base
+from submarine.entities import (Metric, Param)
+
+
+Base = declarative_base()
+
+# +-------+----------+--------------+---------------+------+--------+------------------+
+# | key | value | worker_index | timestamp | step | is_nan | job_name |
+# +-------+----------+--------------+---------------+------+--------+------------------+
+# | score | 0.666667 | worker-1 | 1569139525097 | 0 | 0 | application_1234 |
+# | score | 0.666667 | worker-1 | 1569149139731 | 0 | 0 | application_1234 |
+# | score | 0.666667 | worker-1 | 1569169376482 | 0 | 0 | application_1234 |
+# | score | 0.666667 | worker-1 | 1569236290721 | 0 | 0 | application_1234 |
+# | score | 0.666667 | worker-1 | 1569236466722 | 0 | 0 | application_1234 |
+# +-------+----------+--------------+---------------+------+--------+------------------+
+
+
+class SqlMetric(Base):
+ __tablename__ = 'metrics'
+
+ key = Column(String(250))
+ """
+ Metric key: `String` (limit 250 characters). Part of *Primary Key* for ``metrics`` table.
+ """
+ value = Column(sa.types.Float(precision=53), nullable=False)
+ """
+ Metric value: `Float`. Defined as *Non-null* in schema.
+ """
+ worker_index = Column(String(250))
+ """
+ Metric worker_index: `String` (limit 250 characters). Part of *Primary Key* for
+ ``metrics`` table.
+ """
+ timestamp = Column(BigInteger, default=lambda: int(time.time()))
+ """
+ Timestamp recorded for this metric entry: `BigInteger`. Part of *Primary Key* for
+ ``metrics`` table.
+ """
+ step = Column(BigInteger, default=0, nullable=False)
+ """
+ Step recorded for this metric entry: `BigInteger`.
+ """
+ is_nan = Column(Boolean, nullable=False, default=False)
+ """
+ True if the value is in fact NaN.
+ """
+ job_name = Column(String(32))
+ """
+ JOB NAME to which this metric belongs to: Part of *Primary Key* for ``metrics`` table.
+ """
+
+ __table_args__ = (
+ PrimaryKeyConstraint('key', 'timestamp', 'worker_index', 'step', 'job_name',
+ 'value', "is_nan", name='metric_pk'),
+ )
+
+ def __repr__(self):
+ return '<SqlMetric({}, {}, {}, {}, {})>'.format(self.key, self.value, self.worker_index,
+ self.timestamp, self.step)
+
+ def to_submarine_entity(self):
+ """
+ Convert DB model to corresponding Submarine entity.
+ :return: :py:class:`submarine.entities.Metric`.
+ """
+ return Metric(
+ key=self.key,
+ value=self.value if not self.is_nan else float("nan"),
+ worker_index=self.worker_index,
+ timestamp=self.timestamp,
+ step=self.step)
+
+
+# +----------+-------+--------------+-----------------------+
+# | key | value | worker_index | job_name |
+# +----------+-------+--------------+-----------------------+
+# | max_iter | 100 | worker-1 | application_123651651 |
+# | n_jobs | 5 | worker-1 | application_123456898 |
+# | alpha | 20 | worker-1 | application_123456789 |
+# +----------+-------+--------------+-----------------------+
+
+
+class SqlParam(Base):
+ __tablename__ = 'params'
+
+ key = Column(String(250))
+ """
+ Param key: `String` (limit 250 characters). Part of *Primary Key* for ``params`` table.
+ """
+ value = Column(String(250), nullable=False)
+ """
+ Param value: `String` (limit 250 characters). Defined as *Non-null* in schema.
+ """
+ worker_index = Column(String(250), nullable=False)
+ """
+ Param worker_index: `String` (limit 250 characters). Part of *Primary Key* for
+ ``metrics`` table.
+ """
+ job_name = Column(String(32))
+ """
+ JOB NAME to which this parameter belongs to: Part of *Primary Key* for ``params`` table.
+ """
+
+ __table_args__ = (
+ PrimaryKeyConstraint('key', 'job_name', 'worker_index', name='param_pk'),
+ )
+
+ def __repr__(self):
+ return '<SqlParam({}, {}, {})>'.format(self.key, self.value, self.worker_index)
+
+ def to_submarine_entity(self):
+ """
+ Convert DB model to corresponding submarine entity.
+ :return: :py:class:`submarine.entities.Param`.
+ """
+ return Param(
+ key=self.key,
+ value=self.value,
+ worker_index=self.worker_index)
diff --git a/submarine-sdk/pysubmarine/submarine/store/sqlalchemy_store.py b/submarine-sdk/pysubmarine/submarine/store/sqlalchemy_store.py
new file mode 100644
index 0000000..bd29c02
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/store/sqlalchemy_store.py
@@ -0,0 +1,154 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import logging
+from contextlib import contextmanager
+import math
+import sqlalchemy
+
+from submarine.exceptions import SubmarineException
+from submarine.utils import extract_db_type_from_uri
+from submarine.store.database.models import Base, SqlMetric, SqlParam
+from submarine.store.abstract_store import AbstractStore
+
+
+_logger = logging.getLogger(__name__)
+
+
+class SqlAlchemyStore(AbstractStore):
+ """
+ SQLAlchemy compliant backend store for tracking meta data for Submarine entities. Submarine
+ supports the database dialects ``mysql``.
+ As specified in the
+ `SQLAlchemy docs <https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls>`_ ,
+ the database URI is expected in the format
+ ``<dialect>+<driver>://<username>:<password>@<host>:<port>/<database>``. If you do not
+ specify a driver, SQLAlchemy uses a dialect's default driver.
+ This store interacts with SQL store using SQLAlchemy abstractions defined for
+ Submarine entities.
+ :py:class:`submarine.store.database.models.SqlMetric`, and
+ :py:class:`submarine.store.database.models.SqlParam`.
+ """
+
+ def __init__(self, db_uri):
+ """
+ Create a database backed store.
+ :param db_uri: The SQLAlchemy database URI string to connect to the database. See
+ the `SQLAlchemy docs
+ <https://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls>`_
+ for format specifications. Submarine supports the dialects ``mysql``.
+ """
+ super(SqlAlchemyStore, self).__init__()
+ self.db_uri = db_uri
+ self.db_type = extract_db_type_from_uri(db_uri)
+ self.engine = sqlalchemy.create_engine(db_uri, pool_pre_ping=True)
+ insp = sqlalchemy.inspect(self.engine)
+
+ expected_tables = {
+ SqlMetric.__tablename__,
+ SqlParam.__tablename__,
+ }
+ if len(expected_tables & set(insp.get_table_names())) == 0:
+ SqlAlchemyStore._initialize_tables(self.engine)
+ Base.metadata.bind = self.engine
+ SessionMaker = sqlalchemy.orm.sessionmaker(bind=self.engine)
+ self.ManagedSessionMaker = self._get_managed_session_maker(SessionMaker)
+ # Todo Need to check database's schema is not out of date
+ # SqlAlchemyStore._verify_schema(self.engine)
+
+ @staticmethod
+ def _initialize_tables(engine):
+ _logger.info("Creating initial Submarine database tables...")
+ Base.metadata.create_all(engine)
+
+ @staticmethod
+ def _get_managed_session_maker(SessionMaker):
+ """
+ Creates a factory for producing exception-safe SQLAlchemy sessions that are made available
+ using a context manager. Any session produced by this factory is automatically committed
+ if no exceptions are encountered within its associated context. If an exception is
+ encountered, the session is rolled back. Finally, any session produced by this factory is
+ automatically closed when the session's associated context is exited.
+ """
+ @contextmanager
+ def make_managed_session():
+ """Provide a transactional scope around a series of operations."""
+ session = SessionMaker()
+ try:
+ yield session
+ session.commit()
+ except SubmarineException:
+ session.rollback()
+ raise
+ except Exception as e:
+ session.rollback()
+ raise SubmarineException(message=e)
+ finally:
+ session.close()
+
+ return make_managed_session
+
+ @staticmethod
+ def _save_to_db(session, objs):
+ """
+ Store in db
+ """
+ if type(objs) is list:
+ session.add_all(objs)
+ else:
+ # single object
+ session.add(objs)
+
+ def _get_or_create(self, session, model, **kwargs):
+ instance = session.query(model).filter_by(**kwargs).first()
+ created = False
+
+ if instance:
+ return instance, created
+ else:
+ instance = model(**kwargs)
+ self._save_to_db(objs=instance, session=session)
+ created = True
+
+ return instance, created
+
+ def log_metric(self, job_name, metric):
+ is_nan = math.isnan(metric.value)
+ if is_nan:
+ value = 0
+ elif math.isinf(metric.value):
+ # NB: Sql can not represent Infs = > We replace +/- Inf with max/min 64b float value
+ value = 1.7976931348623157e308 if metric.value > 0 else -1.7976931348623157e308
+ else:
+ # some driver doesn't knows float64, so we need convert it to a regular float
+ value = float(metric.value)
+ with self.ManagedSessionMaker() as session:
+ try:
+ self._get_or_create(model=SqlMetric, job_name=job_name, key=metric.key,
+ value=value, worker_index=metric.worker_index,
+ timestamp=metric.timestamp, step=metric.step,
+ session=session, is_nan=is_nan)
+ except sqlalchemy.exc.IntegrityError:
+ session.rollback()
+
+ def log_param(self, job_name, param):
+ with self.ManagedSessionMaker() as session:
+ try:
+ self._get_or_create(model=SqlParam, job_name=job_name, session=session,
+ key=param.key, value=param.value,
+ worker_index=param.worker_index)
+ session.commit()
+ except sqlalchemy.exc.IntegrityError:
+ session.rollback()
diff --git a/submarine-sdk/pysubmarine/submarine/tracking/__init__.py b/submarine-sdk/pysubmarine/submarine/tracking/__init__.py
new file mode 100644
index 0000000..a1cd9aa
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/tracking/__init__.py
@@ -0,0 +1,26 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.tracking.client import SubmarineClient
+from submarine.tracking.utils import set_tracking_uri, get_tracking_uri, _TRACKING_URI_ENV_VAR, \
+ _JOB_NAME_ENV_VAR
+
+__all__ = [
+ "SubmarineClient",
+ "get_tracking_uri",
+ "set_tracking_uri",
+ "_TRACKING_URI_ENV_VAR",
+ "_JOB_NAME_ENV_VAR",
+]
diff --git a/submarine-sdk/pysubmarine/submarine/tracking/client.py b/submarine-sdk/pysubmarine/submarine/tracking/client.py
new file mode 100644
index 0000000..0f04743
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/tracking/client.py
@@ -0,0 +1,65 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+import time
+from submarine.entities import Param, Metric
+from submarine.tracking import utils
+from submarine.utils.validation import validate_metric, validate_param
+
+
+class SubmarineClient(object):
+
+ """
+ Client of an submarine Tracking Server that creates and manages experiments and runs.
+ """
+
+ def __init__(self, tracking_uri=None):
+ """
+ :param tracking_uri: Address of local or remote tracking server. If not provided, defaults
+ to the service set by ``submarine.tracking.set_tracking_uri``. See
+ `Where Runs Get Recorded <../tracking.html#where-runs-get-recorded>`_
+ for more info.
+ """
+ self.tracking_uri = tracking_uri or utils.get_tracking_uri()
+ self.store = utils.get_sqlalchemy_store(self.tracking_uri)
+
+ def log_metric(self, job_name, key, value, worker_index, timestamp=None, step=None):
+ """
+ Log a metric against the run ID.
+ :param job_name: The job name to which the metric should be logged.
+ :param key: Metric name.
+ :param value: Metric value (float). Note that some special values such
+ as +/- Infinity may be replaced by other values depending on the store. For
+ example, the SQLAlchemy store replaces +/- Inf with max / min float values.
+ :param worker_index: Metric worker_index (string).
+ :param timestamp: Time when this metric was calculated. Defaults to the current system time.
+ :param step: Training step (iteration) at which was the metric calculated. Defaults to 0.
+ """
+ timestamp = timestamp if timestamp is not None else int(time.time())
+ step = step if step is not None else 0
+ validate_metric(key, value, timestamp, step)
+ metric = Metric(key, value, worker_index, timestamp, step)
+ self.store.log_metric(job_name, metric)
+
+ def log_param(self, job_name, key, value, worker_index):
+ """
+ Log a parameter against the job name. Value is converted to a string.
+ :param job_name: The job name to which the parameter should be logged.
+ :param key: Parameter name.
+ :param value: Parameter value (string).
+ :param worker_index: Parameter worker_index (string).
+ """
+ validate_param(key, value)
+ param = Param(key, str(value), worker_index)
+ self.store.log_param(job_name, param)
diff --git a/submarine-sdk/pysubmarine/submarine/tracking/fluent.py b/submarine-sdk/pysubmarine/submarine/tracking/fluent.py
new file mode 100644
index 0000000..f0689e5
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/tracking/fluent.py
@@ -0,0 +1,66 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Internal module implementing the fluent API, allowing management of an active
+Submarine run. This module is exposed to users at the top-level :py:mod:`submarine` module.
+"""
+from __future__ import print_function
+from submarine.tracking.client import SubmarineClient
+from submarine.tracking.utils import get_job_name
+
+import time
+import logging
+import random
+import string
+
+
+_RUN_ID_ENV_VAR = "SUBMARINE_RUN_ID"
+_active_run_stack = []
+
+_logger = logging.getLogger(__name__)
+
+
+# Todo need to support unique run_id or change to submarine-job name
+def random_string(stringLength=30):
+ """Generate a random string of fixed length """
+ letters = string.ascii_lowercase
+ return ''.join(random.choice(letters) for i in range(stringLength))
+
+
+def log_param(key, value, worker_index):
+ """
+ Log a parameter under the current run, creating a run if necessary.
+ :param key: Parameter name (string)
+ :param value: Parameter value (string, but will be string-ified if not)
+ :param worker_index
+ """
+ job_name = get_job_name()
+ SubmarineClient().log_param(job_name, key, value, worker_index)
+
+
+def log_metric(key, value, worker_index, step=None):
+ """
+ Log a metric under the current run, creating a run if necessary.
+ :param key: Metric name (string).
+ :param value: Metric value (float). Note that some special values such as +/- Infinity may be
+ replaced by other values depending on the store. For example, sFor example, the
+ SQLAlchemy store replaces +/- Inf with max / min float values.
+ :param worker_index: Metric worker_index (string).
+ :param step: Metric step (int). Defaults to zero if unspecified.
+ """
+ job_name = get_job_name()
+ SubmarineClient().log_metric(
+ job_name, key, value, worker_index, int(time.time() * 1000), step or 0)
diff --git a/submarine-sdk/pysubmarine/submarine/tracking/utils.py b/submarine-sdk/pysubmarine/submarine/tracking/utils.py
new file mode 100644
index 0000000..952de07
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/tracking/utils.py
@@ -0,0 +1,76 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import print_function
+from submarine.store import DEFAULT_SUBMARINE_JDBC_URL
+from submarine.store.sqlalchemy_store import SqlAlchemyStore
+from submarine.utils import env
+
+_TRACKING_URI_ENV_VAR = "SUBMARINE_TRACKING_URI"
+_JOB_NAME_ENV_VAR = "SUBMARINE_JOB_NAME"
+
+# Extra environment variables which take precedence for setting the basic/bearer
+# auth on http requests.
+_TRACKING_USERNAME_ENV_VAR = "SUBMARINE_TRACKING_USERNAME"
+_TRACKING_PASSWORD_ENV_VAR = "SUBMARINE_TRACKING_PASSWORD"
+_TRACKING_TOKEN_ENV_VAR = "SUBMARINE_TRACKING_TOKEN"
+_TRACKING_INSECURE_TLS_ENV_VAR = "SUBMARINE_TRACKING_INSECURE_TLS"
+
+_tracking_uri = None
+
+
+def is_tracking_uri_set():
+ """Returns True if the tracking URI has been set, False otherwise."""
+ if _tracking_uri or env.get_env(_TRACKING_URI_ENV_VAR):
+ return True
+ return False
+
+
+def set_tracking_uri(uri):
+ """
+ Set the tracking server URI. This does not affect the
+ currently active run (if one exists), but takes effect for successive runs.
+ """
+ global _tracking_uri
+ _tracking_uri = uri
+
+
+def get_tracking_uri():
+ """
+ Get the current tracking URI. This may not correspond to the tracking URI of
+ the currently active run, since the tracking URI can be updated via ``set_tracking_uri``.
+ :return: The tracking URI.
+ """
+ # TODO get database url from submarine-site.xml
+ global _tracking_uri
+ if _tracking_uri is not None:
+ return _tracking_uri
+ elif env.get_env(_TRACKING_URI_ENV_VAR) is not None:
+ return env.get_env(_TRACKING_URI_ENV_VAR)
+ else:
+ return DEFAULT_SUBMARINE_JDBC_URL
+
+
+def get_job_name():
+ """
+ Get the current job name.
+ :return The job name:
+ """
+ if env.get_env(_JOB_NAME_ENV_VAR) is not None:
+ return env.get_env(_JOB_NAME_ENV_VAR)
+
+
+def get_sqlalchemy_store(store_uri):
+ return SqlAlchemyStore(store_uri)
diff --git a/submarine-sdk/pysubmarine/submarine/utils/__init__.py b/submarine-sdk/pysubmarine/submarine/utils/__init__.py
new file mode 100644
index 0000000..b36ce34
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/utils/__init__.py
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from six.moves import urllib
+from submarine.exceptions import SubmarineException
+
+
+def extract_db_type_from_uri(db_uri):
+ """
+ Parse the specified DB URI to extract the database type. Confirm the database type is
+ supported. If a driver is specified, confirm it passes a plausible regex.
+ """
+ scheme = urllib.parse.urlparse(db_uri).scheme
+ scheme_plus_count = scheme.count('+')
+
+ if scheme_plus_count == 0:
+ db_type = scheme
+ elif scheme_plus_count == 1:
+ db_type, _ = scheme.split('+')
+ else:
+ error_msg = "Invalid database URI: '%s'. %s" % (db_uri, 'INVALID_DB_URI_MSG')
+ raise SubmarineException(error_msg)
+
+ return db_type
diff --git a/submarine-sdk/pysubmarine/submarine/utils/env.py b/submarine-sdk/pysubmarine/submarine/utils/env.py
new file mode 100644
index 0000000..9d63b0b
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/utils/env.py
@@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+
+
+def get_env(variable_name):
+ return os.environ.get(variable_name)
+
+
+def unset_variable(variable_name):
+ if variable_name in os.environ:
+ del os.environ[variable_name]
diff --git a/submarine-sdk/pysubmarine/submarine/utils/validation.py b/submarine-sdk/pysubmarine/submarine/utils/validation.py
new file mode 100644
index 0000000..25733b1
--- /dev/null
+++ b/submarine-sdk/pysubmarine/submarine/utils/validation.py
@@ -0,0 +1,115 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Utilities for validating user inputs such as metric names and parameter names.
+"""
+import numbers
+import re
+import posixpath
+
+from submarine.exceptions import SubmarineException
+from submarine.store.database.db_types import DATABASE_ENGINES
+
+_VALID_PARAM_AND_METRIC_NAMES = re.compile(r"^[/\w.\- ]*$")
+
+MAX_ENTITY_KEY_LENGTH = 250
+MAX_PARAM_VAL_LENGTH = 250
+
+
+_BAD_CHARACTERS_MESSAGE = (
+ "Names may only contain alphanumerics, underscores (_), dashes (-), periods (.),"
+ " spaces ( ), and slashes (/)."
+)
+
+_UNSUPPORTED_DB_TYPE_MSG = "Supported database engines are {%s}" % ', '.join(DATABASE_ENGINES)
+
+
+def bad_path_message(name):
+ return (
+ "Names may be treated as files in certain cases, and must not resolve to other names"
+ " when treated as such. This name would resolve to '%s'"
+ ) % posixpath.normpath(name)
+
+
+def path_not_unique(name):
+ norm = posixpath.normpath(name)
+ return norm != name or norm == '.' or norm.startswith('..') or norm.startswith('/')
+
+
+def _validate_param_name(name):
+ """Check that `name` is a valid parameter name and raise an exception if it isn't."""
+ if not _VALID_PARAM_AND_METRIC_NAMES.match(name):
+ raise SubmarineException(
+ "Invalid parameter name: '%s'. %s" % (name, _BAD_CHARACTERS_MESSAGE),)
+
+ if path_not_unique(name):
+ raise SubmarineException(
+ "Invalid parameter name: '%s'. %s" % (name, bad_path_message(name)))
+
+
+def _validate_metric_name(name):
+ """Check that `name` is a valid metric name and raise an exception if it isn't."""
+ if not _VALID_PARAM_AND_METRIC_NAMES.match(name):
+ raise SubmarineException("Invalid metric name: '%s'. %s" % (name, _BAD_CHARACTERS_MESSAGE),)
+
+ if path_not_unique(name):
+ raise SubmarineException("Invalid metric name: '%s'. %s" % (name, bad_path_message(name)))
+
+
+def _validate_length_limit(entity_name, limit, value):
+ if len(value) > limit:
+ raise SubmarineException(
+ "%s '%s' had length %s, which exceeded length limit of %s" %
+ (entity_name, value[:250], len(value), limit))
+
+
+def validate_metric(key, value, timestamp, step):
+ """
+ Check that a param with the specified key, value, timestamp is valid and raise an exception if
+ it isn't.
+ """
+ _validate_metric_name(key)
+ if not isinstance(value, numbers.Number):
+ raise SubmarineException(
+ "Got invalid value %s for metric '%s' (timestamp=%s). Please specify value as a valid "
+ "double (64-bit floating point)" % (value, key, timestamp),)
+
+ if not isinstance(timestamp, numbers.Number) or timestamp < 0:
+ raise SubmarineException(
+ "Got invalid timestamp %s for metric '%s' (value=%s). Timestamp must be a nonnegative "
+ "long (64-bit integer) " % (timestamp, key, value),)
+
+ if not isinstance(step, numbers.Number):
+ raise SubmarineException(
+ "Got invalid step %s for metric '%s' (value=%s). Step must be a valid long "
+ "(64-bit integer)." % (step, key, value),)
+
+
+def validate_param(key, value):
+ """
+ Check that a param with the specified key & value is valid and raise an exception if it
+ isn't.
+ """
+ _validate_param_name(key)
+ _validate_length_limit("Param key", MAX_ENTITY_KEY_LENGTH, key)
+ _validate_length_limit("Param value", MAX_PARAM_VAL_LENGTH, str(value))
+
+
+def _validate_db_type_string(db_type):
+ """validates db_type parsed from DB URI is supported"""
+ if db_type not in DATABASE_ENGINES:
+ error_msg = "Invalid database engine: '%s'. '%s'" % (db_type, _UNSUPPORTED_DB_TYPE_MSG)
+ raise SubmarineException(error_msg)
diff --git a/submarine-sdk/pysubmarine/tests/__init__.py b/submarine-sdk/pysubmarine/tests/__init__.py
new file mode 100644
index 0000000..ae1e83e
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/__init__.py
@@ -0,0 +1,14 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
diff --git a/submarine-sdk/pysubmarine/tests/entities/test_metrics.py b/submarine-sdk/pysubmarine/tests/entities/test_metrics.py
new file mode 100644
index 0000000..c8a62e4
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/entities/test_metrics.py
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import time
+
+from submarine.entities import Metric
+
+
+def _check(metric, key, value, worker_index, timestamp, step):
+ assert type(metric) == Metric
+ assert metric.key == key
+ assert metric.value == value
+ assert metric.worker_index == worker_index
+ assert metric.timestamp == timestamp
+ assert metric.step == step
+
+
+def test_creation_and_hydration():
+ key = "alpha"
+ value = 10000
+ worker_index = 1
+ ts = int(time.time())
+ step = 0
+
+ metric = Metric(key, value, worker_index, ts, step)
+ _check(metric, key, value, worker_index, ts, step)
diff --git a/submarine-sdk/pysubmarine/tests/entities/test_params.py b/submarine-sdk/pysubmarine/tests/entities/test_params.py
new file mode 100644
index 0000000..4148b00
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/entities/test_params.py
@@ -0,0 +1,32 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.entities import Param
+
+
+def _check(param, key, value, worker_index):
+ assert type(param) == Param
+ assert param.key == key
+ assert param.value == value
+ assert param.worker_index == worker_index
+
+
+def test_creation_and_hydration():
+ key = "alpha"
+ value = 10000
+ worker_index = 1
+
+ param = Param(key, value, worker_index)
+ _check(param, key, value, worker_index)
diff --git a/submarine-sdk/pysubmarine/tests/store/test_sqlalchemy_store.py b/submarine-sdk/pysubmarine/tests/store/test_sqlalchemy_store.py
new file mode 100644
index 0000000..6798986
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/store/test_sqlalchemy_store.py
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import submarine
+from os import environ
+from submarine.store.database.models import SqlMetric, SqlParam
+from submarine.tracking import utils
+from submarine.store.database import models
+from submarine.entities import Metric, Param
+
+import time
+import unittest
+
+JOB_NAME = "application_123456789"
+
+
+class TestSqlAlchemyStore(unittest.TestCase):
+ def setUp(self):
+ submarine.set_tracking_uri(
+ "mysql+pymysql://submarine_test:password_test@localhost:3306/submarineDB_test")
+ self.tracking_uri = utils.get_tracking_uri()
+ self.store = utils.get_sqlalchemy_store(self.tracking_uri)
+
+ def tearDown(self):
+ submarine.set_tracking_uri(None)
+ models.Base.metadata.drop_all(self.store.engine)
+
+ def test_log_param(self):
+ param1 = Param("name_1", "a", "worker-1")
+ self.store.log_param(JOB_NAME, param1)
+
+ # Validate params
+ with self.store.ManagedSessionMaker() as session:
+ params = session \
+ .query(SqlParam) \
+ .options() \
+ .filter(SqlParam.job_name == JOB_NAME).all()
+ assert params[0].key == "name_1"
+ assert params[0].value == "a"
+ assert params[0].worker_index == "worker-1"
+ assert params[0].job_name == JOB_NAME
+
+ def test_log_metric(self):
+ metric1 = Metric("name_1", 5, "worker-1", int(time.time()), 0)
+ metric2 = Metric("name_1", 6, "worker-2", int(time.time()), 0)
+ self.store.log_metric(JOB_NAME, metric1)
+ self.store.log_metric(JOB_NAME, metric2)
+
+ # Validate params
+ with self.store.ManagedSessionMaker() as session:
+ metrics = session \
+ .query(SqlMetric) \
+ .options() \
+ .filter(SqlMetric.job_name == JOB_NAME).all()
+ assert len(metrics) == 2
+ assert metrics[0].key == "name_1"
+ assert metrics[0].value == 5
+ assert metrics[0].worker_index == "worker-1"
+ assert metrics[0].job_name == JOB_NAME
+ assert metrics[1].value == 6
+ assert metrics[1].worker_index == "worker-2"
diff --git a/submarine-sdk/pysubmarine/tests/tracking/test_tracking.py b/submarine-sdk/pysubmarine/tests/tracking/test_tracking.py
new file mode 100644
index 0000000..6695282
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/tracking/test_tracking.py
@@ -0,0 +1,67 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import submarine
+from os import environ
+from submarine.store.database.models import SqlMetric, SqlParam
+from submarine.tracking import utils
+from submarine.store.database import models
+
+import unittest
+
+JOB_NAME = "application_123456789"
+
+
+class TestTracking(unittest.TestCase):
+ def setUp(self):
+ environ["SUBMARINE_JOB_NAME"] = JOB_NAME
+ submarine.set_tracking_uri(
+ "mysql+pymysql://submarine_test:password_test@localhost:3306/submarineDB_test")
+ self.tracking_uri = utils.get_tracking_uri()
+ self.store = utils.get_sqlalchemy_store(self.tracking_uri)
+
+ def tearDown(self):
+ submarine.set_tracking_uri(None)
+ models.Base.metadata.drop_all(self.store.engine)
+
+ def log_param(self):
+ submarine.log_param("name_1", "a", "worker-1")
+ # Validate params
+ with self.store.ManagedSessionMaker() as session:
+ params = session \
+ .query(SqlParam) \
+ .options() \
+ .filter(SqlParam.job_name == JOB_NAME).all()
+ assert params[0].key == "name_1"
+ assert params[0].value == "a"
+ assert params[0].worker_index == "worker-1"
+ assert params[0].job_name == JOB_NAME
+
+ def test_log_metric(self):
+ submarine.log_metric("name_1", 5, "worker-1")
+ submarine.log_metric("name_1", 6, "worker-2")
+ # Validate params
+ with self.store.ManagedSessionMaker() as session:
+ metrics = session \
+ .query(SqlMetric) \
+ .options() \
+ .filter(SqlMetric.job_name == JOB_NAME).all()
+ assert len(metrics) == 2
+ assert metrics[0].key == "name_1"
+ assert metrics[0].value == 5
+ assert metrics[0].worker_index == "worker-1"
+ assert metrics[0].job_name == JOB_NAME
+ assert metrics[1].value == 6
+ assert metrics[1].worker_index == "worker-2"
diff --git a/submarine-sdk/pysubmarine/tests/tracking/test_utils.py b/submarine-sdk/pysubmarine/tests/tracking/test_utils.py
new file mode 100644
index 0000000..be057bf
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/tracking/test_utils.py
@@ -0,0 +1,60 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import mock
+import os
+from submarine.store import DEFAULT_SUBMARINE_JDBC_URL
+from submarine.store.sqlalchemy_store import SqlAlchemyStore
+from submarine.tracking.utils import is_tracking_uri_set, _TRACKING_URI_ENV_VAR, \
+ get_tracking_uri, _JOB_NAME_ENV_VAR, get_job_name, get_sqlalchemy_store
+
+
+def test_is_tracking_uri_set():
+ env = {
+ _TRACKING_URI_ENV_VAR: DEFAULT_SUBMARINE_JDBC_URL,
+ }
+ with mock.patch.dict(os.environ, env):
+ assert is_tracking_uri_set() is True
+
+
+def test_get_tracking_uri():
+ env = {
+ _TRACKING_URI_ENV_VAR: DEFAULT_SUBMARINE_JDBC_URL,
+ }
+ with mock.patch.dict(os.environ, env):
+ assert get_tracking_uri() == DEFAULT_SUBMARINE_JDBC_URL
+
+
+def test_get_job_name():
+ env = {
+ _JOB_NAME_ENV_VAR:
+ "application_12346789",
+ }
+ with mock.patch.dict(os.environ, env):
+ assert get_job_name() == "application_12346789"
+
+
+def test_get_sqlalchemy_store():
+ patch_create_engine = mock.patch("sqlalchemy.create_engine")
+ uri = DEFAULT_SUBMARINE_JDBC_URL
+ env = {
+ _TRACKING_URI_ENV_VAR: uri
+ }
+ with mock.patch.dict(os.environ, env), patch_create_engine as mock_create_engine, \
+ mock.patch("submarine.store.sqlalchemy_store.SqlAlchemyStore._initialize_tables"):
+ store = get_sqlalchemy_store(uri)
+ assert isinstance(store, SqlAlchemyStore)
+ assert store.db_uri == uri
+ mock_create_engine.assert_called_once_with(uri, pool_pre_ping=True)
diff --git a/submarine-sdk/pysubmarine/tests/utils/test_env.py b/submarine-sdk/pysubmarine/tests/utils/test_env.py
new file mode 100644
index 0000000..4fc26d3
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/utils/test_env.py
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from submarine.utils.env import get_env, unset_variable
+from os import environ
+
+
+class TestEnv:
+ def test_get_env(self):
+ environ["test"] = "hello"
+ assert get_env("test") == "hello"
+
+ def test_unset_variable(self):
+ environ["test"] = "hello"
+ unset_variable("test")
+ assert "test" not in environ
diff --git a/submarine-sdk/pysubmarine/tests/utils/test_validation.py b/submarine-sdk/pysubmarine/tests/utils/test_validation.py
new file mode 100644
index 0000000..1382084
--- /dev/null
+++ b/submarine-sdk/pysubmarine/tests/utils/test_validation.py
@@ -0,0 +1,65 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import pytest
+
+from submarine.exceptions import SubmarineException
+from submarine.utils.validation import _validate_metric_name, _validate_param_name,\
+ _validate_length_limit, _validate_db_type_string
+
+GOOD_METRIC_OR_PARAM_NAMES = [
+ "a", "Ab-5_", "a/b/c", "a.b.c", ".a", "b.", "a..a/._./o_O/.e.", "a b/c d",
+]
+BAD_METRIC_OR_PARAM_NAMES = [
+ "", ".", "/", "..", "//", "a//b", "a/./b", "/a", "a/", ":", "\\", "./", "/./",
+]
+
+
+def test_validate_metric_name():
+ for good_name in GOOD_METRIC_OR_PARAM_NAMES:
+ _validate_metric_name(good_name)
+ for bad_name in BAD_METRIC_OR_PARAM_NAMES:
+ with pytest.raises(SubmarineException, match="Invalid metric name"):
+ _validate_metric_name(bad_name)
+
+
+def test_validate_param_name():
+ for good_name in GOOD_METRIC_OR_PARAM_NAMES:
+ _validate_param_name(good_name)
+ for bad_name in BAD_METRIC_OR_PARAM_NAMES:
+ with pytest.raises(SubmarineException, match="Invalid parameter name"):
+ _validate_param_name(bad_name)
+
+
+def test__validate_length_limit():
+ limit = 10
+ key = "key"
+ good_value = "test-12345"
+ bad_value = "test-123456789"
+ _validate_length_limit(key, limit, good_value)
+ with pytest.raises(SubmarineException, match="which exceeded length limit"):
+ _validate_length_limit(key, limit, bad_value)
+
+
+def test_db_type():
+ for db_type in ["mysql", "mssql", "postgresql", "sqlite"]:
+ # should not raise an exception
+ _validate_db_type_string(db_type)
+
+ # error cases
+ for db_type in ["MySQL", "mongo", "cassandra", "sql", ""]:
+ with pytest.raises(SubmarineException) as e:
+ _validate_db_type_string(db_type)
+ assert "Invalid database engine" in e.value.message
diff --git a/submarine-sdk/pysubmarine/travis/conda.sh b/submarine-sdk/pysubmarine/travis/conda.sh
new file mode 100644
index 0000000..1f8fc36
--- /dev/null
+++ b/submarine-sdk/pysubmarine/travis/conda.sh
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#!/usr/bin/env bash
+
+set -ex
+sudo mkdir -p /travis-install
+sudo chown travis /travis-install
+
+# We do this conditionally because it saves us some downloading if the
+# version is the same.
+if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
+ wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh -O /travis-install/miniconda.sh;
+else
+ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /travis-install/miniconda.sh;
+fi
+
+bash /travis-install/miniconda.sh -b -p $HOME/miniconda
+export PATH="$HOME/miniconda/bin:$PATH"
+hash -r
+conda config --set always_yes yes --set changeps1 no
+# Useful for debugging any issues with conda
+conda info -a
+if [[ -n "$TRAVIS_PYTHON_VERSION" ]]; then
+ conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION
+else
+ conda create -q -n test-environment python=3.6
+fi
+source activate test-environment
+python --version
+pip install --upgrade pip
+pip install -r ./submarine-sdk/pysubmarine/travis/test-requirements.txt
+
+pip install ./submarine-sdk/pysubmarine/.
+export SUBMARINE_HOME=$(pwd)
+
+# Print current environment info
+pip list
+echo $SUBMARINE_HOME
+
+# Turn off trace output & exit-on-errors
+set +ex
\ No newline at end of file
diff --git a/submarine-sdk/pysubmarine/travis/lint-requirements.txt b/submarine-sdk/pysubmarine/travis/lint-requirements.txt
new file mode 100644
index 0000000..e2f58bf
--- /dev/null
+++ b/submarine-sdk/pysubmarine/travis/lint-requirements.txt
@@ -0,0 +1,16 @@
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+prospector[with_pyroma]==0.12.7
+pep8==1.7.1
+pylint==1.8.2
\ No newline at end of file
diff --git a/submarine-sdk/pysubmarine/travis/lint.sh b/submarine-sdk/pysubmarine/travis/lint.sh
new file mode 100755
index 0000000..220bfa8
--- /dev/null
+++ b/submarine-sdk/pysubmarine/travis/lint.sh
@@ -0,0 +1,24 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#!/usr/bin/env bash
+set -e
+
+FWDIR="$(cd "`dirname $0`"; pwd)"
+cd "$FWDIR"
+cd ..
+
+pycodestyle --max-line-length=100 -- submarine tests
+pylint --msg-template="{path} ({line},{column}): [{msg_id} {symbol}] {msg}" --rcfile=pylintrc -- submarine tests
diff --git a/submarine-sdk/pysubmarine/travis/test-requirements.txt b/submarine-sdk/pysubmarine/travis/test-requirements.txt
new file mode 100644
index 0000000..3116e4a
--- /dev/null
+++ b/submarine-sdk/pysubmarine/travis/test-requirements.txt
@@ -0,0 +1,27 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+mock==2.0.0
+moto==1.3.7
+pandas<=0.23.4
+scikit-learn==0.20.2
+scipy==1.2.1
+pyarrow==0.12.1
+attrdict==2.0.0
+pytest==3.2.1
+pytest-cov==2.6.0
+pytest-localserver==0.5.0
+sqlalchemy==1.3.0
+PyMySQL==0.9.3
\ No newline at end of file
diff --git a/submodules/tony b/submodules/tony
index b0e20b9..a07f0fb 160000
--- a/submodules/tony
+++ b/submodules/tony
@@ -1 +1 @@
-Subproject commit b0e20b9c2fcf111679150a829ac43a1edcbba3e2
+Subproject commit a07f0fb68112014d4420ef6d9817eeebddd19ad1