You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@systemml.apache.org by GitBox <gi...@apache.org> on 2020/04/30 14:40:58 UTC

[GitHub] [systemml] lukas-jkl opened a new pull request #904: onnx-systemds implementation

lukas-jkl opened a new pull request #904:
URL: https://github.com/apache/systemml/pull/904


   This PR implements a first poc-implementation for an ONNX importer.
   It adds support for the following operators:
   * Add
   * Sub
   * MatMul
   * Neg
   * Xor
   * Or
   * And
   * Relu
   * Tanh
   * Sigmoid
   * Softmax
   * Dropout
   * MaxPool
   * Conv
   * If
   
   As well as the logic for nested sub-graphs. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] Baunsgaard commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
Baunsgaard commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-628795163


   Cool many thanks, LGTM. :1st_place_medal: 
   (if i had the power i would merge it)
   
   Up for discussion in the future is still https://github.com/apache/systemml/pull/904#issuecomment-627933153
   
   Only real thing left to refine in my opinion is the Output files for validation, that i think is to much, because the tests should not reflect a specific __result__, but a certain __behavior__. But I also acknowledge this is a highly personal thing, and that some of the python tests already in does not follow this principle.
   
   @mboehm7 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] asfgit closed pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #904:
URL: https://github.com/apache/systemml/pull/904


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] j143 removed a comment on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
j143 removed a comment on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-622235625


   I am reviewing this. Should be back in a day. thank you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-628782797


   > > Thank you for the quick review @Baunsgaard ! I've implemented your suggested changes and have rewritten the documentation in rst files. I am however still a bit unsure about the yml files. I guess it would not be sufficient to simply add onnx and Jinja to the pip dependencies as protobuf would also need to be installed?
   > 
   > Since our python tests are executed on Ubuntu-latest it should **hopefully** be as simple as adding a step,
   > 
   > ```yaml
   > - name: install protobuf
   >    run: {INSERT INSTALL COMMAND HERE }
   > ```
   
   Seems to work now!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] mboehm7 commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
mboehm7 commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-628910995


   LGTM - thanks @lukas-jkl for this awesome patch, and @Baunsgaard for the detailed discussions. Overall, this is a great initial importer and completes the AMLS project. During the merge, I only added `**/*.out` to the exclude list of the rat check (for proper licenses).
   
   Down the the road, I would also prefer to remove these expected output files and rather generate them dynamically (e.g., via pytorch). Similarly, we should clean up the license headers to use a common text layout (beginning/end lines).   
   
   Btw, there are additional rat problems in the data slicing implementation (which I fix when merging Svetlana's PR) and the `JolEstimate*` tests (@Baunsgaard you might want to fix that).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-628586817


   Thank you for the quick review @Baunsgaard ! I've implemented your suggested changes and have rewritten the documentation in rst files. I am however still a bit unsure about the yml files. I guess it would not be sufficient to simply add onnx and Jinja to the pip dependencies as protobuf would also need to be installed?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] j143 commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
j143 commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-623142003


   Thank you for the design documentation, @lukas-jkl . I would make a deep pass shortly.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on a change in pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on a change in pull request #904:
URL: https://github.com/apache/systemml/pull/904#discussion_r425085177



##########
File path: src/main/python/onnx_systemds/onnx_helper.py
##########
@@ -0,0 +1,216 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to you under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from functools import reduce
+
+import onnx
+import onnx.version_converter
+
+
+class TreeNode:
+    def __init__(self, node: onnx.NodeProto):
+        self.node = node
+        self.parent_nodes = list()
+        self.child_nodes = list()
+
+
+class NodeTree:
+    """ A simple class for representing a tree structure of nodes """
+
+    def __init__(self, nodes: [onnx.NodeProto]):
+        self.nodes = [TreeNode(node) for node in nodes]
+        self.root_nodes = []  # nodes that have no parents
+        self.end_nodes = []  # nodes that have no children
+
+        # find parents and children for each node
+        for tree_node in self.nodes:
+            for compare_tree_node in self.nodes:
+                if tree_node != compare_tree_node:
+                    for node_output in tree_node.node.output:
+                        if node_output in compare_tree_node.node.input:
+                            tree_node.child_nodes.append(compare_tree_node)
+                            compare_tree_node.parent_nodes.append(tree_node)
+
+        for node in self.nodes:
+            if len(node.child_nodes) == 0:
+                self.end_nodes.append(node)
+            if len(node.parent_nodes) == 0:
+                self.root_nodes.append(node)
+
+    def remove_end_node(self, node: TreeNode):
+        """
+        Removes the given end-node from the tree.
+        Removing a non-existing or non end-node raises an exception.
+        :param node: The node that shall be removed
+        """
+        if node not in self.end_nodes:
+            raise Exception("Can only remove end nodes")
+        self.end_nodes.remove(node)
+        self.nodes.remove(node)
+
+        for parent_node in node.parent_nodes:
+            parent_node.child_nodes.remove(node)
+        self.end_nodes += node.parent_nodes
+        node.parent_nodes = []
+
+
+def load_model(onnx_file: str) -> onnx.ModelProto:
+    """
+    Loads the onnx file, checks the model and converts it to a common version if necessary.
+
+    :param onnx_file:
+    :return: the loaded onnx-model
+    """
+    TARGET_VERSION = 11
+    model = onnx.load(onnx_file)
+    onnx.checker.check_model(model)
+    if len(list(model.opset_import)) == 1 and list(model.opset_import)[0].version == TARGET_VERSION:
+        return model
+    else:
+        return onnx.version_converter.convert_version(model, TARGET_VERSION)
+
+
+def get_value_info(graph: onnx.GraphProto, name: str) -> onnx.ValueInfoProto:
+    """
+    Searches the `graph` for the given `name` and returns the associated ValueInfo,
+    if the name is not found None is returned.
+
+    :param graph: the onnx-graph that shall be searched
+    :param name: the name of the value
+    :return: the value-info or None if it is not found
+    """
+    for info in graph.input:
+        if info.name == name:
+            return info
+
+    for info in graph.value_info:
+        if info.name == name:
+            return info
+
+    for info in graph.output:
+        if info.name == name:
+            return info
+
+    return None
+
+
+def get_graph_inputs_without_initializers(graph: onnx.GraphProto) -> [onnx.ValueInfoProto]:
+    """
+    Returns all inputs of the `graph` that have no associated initializer values.
+
+    :param graph: the onnx-graph
+    :return: list of uninitialized inputs
+    """
+    inputs_without_initializers = []
+    for input in graph.input:
+        has_initializer = False
+        for initializer in graph.initializer:
+            if initializer.name == input.name:
+                has_initializer = True
+                break
+
+        if not has_initializer:
+            inputs_without_initializers.append(input)
+
+    return inputs_without_initializers
+
+
+def get_graph_inputs_with_initializers(graph: onnx.GraphProto) -> [(onnx.ValueInfoProto, onnx.TensorProto)]:
+    """
+    Returns all initialized inputs of the `graph` with their corresponding initializer.
+
+    :param graph: the onnx-graph
+    :return: list of tuples of (input, initializer)
+    """
+    inputs_with_initializers = []
+
+    for input in graph.input:
+        for initializer in graph.initializer:
+            if initializer.name == input.name:
+                inputs_with_initializers.append((input, initializer))
+
+    return inputs_with_initializers
+
+
+class PreparedValue:
+    """ Class for preparing onnx value structures for writing them to the dml script """
+    def __init__(self, value_info: onnx.ValueInfoProto, initializer: onnx.TensorProto = None):
+
+        supported_types = ["int", "boolean", "double"]
+
+        # TODO: these type translations are not correct double -> float
+        type_translation = {
+            1: "double",  # float
+            2: "int",  # uint8_t
+            3: "int",  # int8_t
+            4: "int",  # uint16_t
+            5: "int",  # int16_t
+            6: "int",  # int32_t
+            7: "int",  # int64_t
+            8: "string",
+            9: "boolean",  # bool
+
+            10: "double",  # float16,
+            11: "double",
+            12: "int",  # uint32
+            13: "int",  # uint64

Review comment:
       Yes, you are correct, I am however unsure how big the "Integer" type of systemds is? I guess it should be int32. I've now changed those translations and an exception is thrown if we do not support the type. I've also added an additional check to enforce that a matrix must only contain double values, as stated in the dml-language-reference. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] j143 commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
j143 commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-622235625


   I am reviewing this. Should be back in a day. thank you.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on a change in pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on a change in pull request #904:
URL: https://github.com/apache/systemml/pull/904#discussion_r425085759



##########
File path: src/main/python/onnx_systemds/convert.py
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to you under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import os.path
+import ntpath
+import onnx_systemds.onnx_helper as onnx_helper
+import onnx_systemds.render as render
+
+
+def init_argparse() -> argparse.ArgumentParser:
+    arg_parser = argparse.ArgumentParser(description="Convert onnx models into dml scripts")
+    arg_parser.add_argument("input", type=str)
+    arg_parser.add_argument("-o", "--output", type=str,
+                            help="output file", required=False)
+    return arg_parser
+
+
+def onnx2systemds(input_onnx_file: str, output_dml_file: str = None) -> None:
+    """
+    Loads the model from the input file and generates a dml file.
+
+    :param input_onnx_file: the onnx input file
+    :param output_dml_file: (optional) the dml output file,
+        if this parameter is not given the output file will have the same name as the input file
+    """
+    if not os.path.isfile(input_onnx_file):
+        raise Exception("Invalid input-file: " + str(input_onnx_file))
+
+    if not output_dml_file:
+        output_dml_file = os.path.splitext(ntpath.basename(input_onnx_file))[0] + ".dml"

Review comment:
       I guess os.path can also be used. I've now removed ntpath.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on pull request #904:
URL: https://github.com/apache/systemml/pull/904#issuecomment-623101143


   > On a cursory check, the changeset looks good. Can the design be documented thoroughly in a `.md` file.
   > Feel free to include as much information in the doc as possible with the uses of tables, graphs etc.
   > 
   > I am facing problems with building `onnx` for now.
   
   Thank you for the quick review, I've added a design document which describes the current implementation in detail. 
   I think the conda install of onnx works best, I also had problems installing with other methods. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [systemml] lukas-jkl commented on a change in pull request #904: onnx-systemds implementation

Posted by GitBox <gi...@apache.org>.
lukas-jkl commented on a change in pull request #904:
URL: https://github.com/apache/systemml/pull/904#discussion_r425082563



##########
File path: src/main/python/onnx_systemds/onnx_helper.py
##########
@@ -0,0 +1,216 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to you under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from functools import reduce
+
+import onnx
+import onnx.version_converter
+
+
+class TreeNode:
+    def __init__(self, node: onnx.NodeProto):
+        self.node = node
+        self.parent_nodes = list()
+        self.child_nodes = list()
+
+
+class NodeTree:
+    """ A simple class for representing a tree structure of nodes """
+
+    def __init__(self, nodes: [onnx.NodeProto]):
+        self.nodes = [TreeNode(node) for node in nodes]
+        self.root_nodes = []  # nodes that have no parents
+        self.end_nodes = []  # nodes that have no children
+
+        # find parents and children for each node
+        for tree_node in self.nodes:
+            for compare_tree_node in self.nodes:
+                if tree_node != compare_tree_node:
+                    for node_output in tree_node.node.output:
+                        if node_output in compare_tree_node.node.input:
+                            tree_node.child_nodes.append(compare_tree_node)
+                            compare_tree_node.parent_nodes.append(tree_node)
+
+        for node in self.nodes:
+            if len(node.child_nodes) == 0:
+                self.end_nodes.append(node)
+            if len(node.parent_nodes) == 0:
+                self.root_nodes.append(node)
+
+    def remove_end_node(self, node: TreeNode):
+        """
+        Removes the given end-node from the tree.
+        Removing a non-existing or non end-node raises an exception.
+        :param node: The node that shall be removed
+        """
+        if node not in self.end_nodes:
+            raise Exception("Can only remove end nodes")
+        self.end_nodes.remove(node)
+        self.nodes.remove(node)
+
+        for parent_node in node.parent_nodes:
+            parent_node.child_nodes.remove(node)
+        self.end_nodes += node.parent_nodes
+        node.parent_nodes = []
+
+
+def load_model(onnx_file: str) -> onnx.ModelProto:
+    """
+    Loads the onnx file, checks the model and converts it to a common version if necessary.
+
+    :param onnx_file:
+    :return: the loaded onnx-model
+    """
+    TARGET_VERSION = 11
+    model = onnx.load(onnx_file)
+    onnx.checker.check_model(model)
+    if len(list(model.opset_import)) == 1 and list(model.opset_import)[0].version == TARGET_VERSION:
+        return model
+    else:
+        return onnx.version_converter.convert_version(model, TARGET_VERSION)
+
+
+def get_value_info(graph: onnx.GraphProto, name: str) -> onnx.ValueInfoProto:
+    """
+    Searches the `graph` for the given `name` and returns the associated ValueInfo,
+    if the name is not found None is returned.
+
+    :param graph: the onnx-graph that shall be searched
+    :param name: the name of the value
+    :return: the value-info or None if it is not found
+    """
+    for info in graph.input:
+        if info.name == name:
+            return info
+
+    for info in graph.value_info:
+        if info.name == name:
+            return info
+
+    for info in graph.output:
+        if info.name == name:
+            return info
+
+    return None
+
+
+def get_graph_inputs_without_initializers(graph: onnx.GraphProto) -> [onnx.ValueInfoProto]:
+    """
+    Returns all inputs of the `graph` that have no associated initializer values.
+
+    :param graph: the onnx-graph
+    :return: list of uninitialized inputs
+    """
+    inputs_without_initializers = []
+    for input in graph.input:
+        has_initializer = False
+        for initializer in graph.initializer:
+            if initializer.name == input.name:
+                has_initializer = True
+                break
+
+        if not has_initializer:
+            inputs_without_initializers.append(input)
+
+    return inputs_without_initializers
+
+
+def get_graph_inputs_with_initializers(graph: onnx.GraphProto) -> [(onnx.ValueInfoProto, onnx.TensorProto)]:
+    """
+    Returns all initialized inputs of the `graph` with their corresponding initializer.
+
+    :param graph: the onnx-graph
+    :return: list of tuples of (input, initializer)
+    """
+    inputs_with_initializers = []
+
+    for input in graph.input:
+        for initializer in graph.initializer:
+            if initializer.name == input.name:
+                inputs_with_initializers.append((input, initializer))
+
+    return inputs_with_initializers
+
+
+class PreparedValue:
+    """ Class for preparing onnx value structures for writing them to the dml script """
+    def __init__(self, value_info: onnx.ValueInfoProto, initializer: onnx.TensorProto = None):
+
+        supported_types = ["int", "boolean", "double"]

Review comment:
       Those should be the types supported by systems (I've now renamed the variable to make this more clear). So I guess it should be "integer"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org