You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gearpump.apache.org by huafengw <gi...@git.apache.org> on 2017/09/05 09:00:39 UTC

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

GitHub user huafengw opened a pull request:

    https://github.com/apache/incubator-gearpump/pull/223

    [GEARPUMP-349] Optimize Graph topologicalOrderIterator performance

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [ ] Make sure the commit message is formatted like:
       `[GEARPUMP-<Jira issue #>] Meaningful description of pull request` 
     - [ ] Make sure tests pass via `sbt clean test`.
     - [ ] Make sure old documentation affected by the pull request has been updated and new documentation added for new functionality. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huafengw/incubator-gearpump graph

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-gearpump/pull/223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #223
    
----
commit 2673c99cb26a8a0733af772f01748f818ea08857
Author: huafengw <fv...@gmail.com>
Date:   2017-09-05T08:59:49Z

    [GEARPUMP-349] Optimize Graph topologicalOrderIterator performance

----


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    @manuzhang 


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139280373
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -318,12 +360,8 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * http://www.drdobbs.com/database/topological-sorting/184410262
        */
       def topologicalOrderWithCirclesIterator: Iterator[N] = {
    -    if (hasCycle()) {
    -      val topo = getAcyclicCopy().topologicalOrderIterator
    -      topo.flatMap(_.sortBy(_indexs(_)).iterator)
    -    } else {
    -      topologicalOrderIterator
    -    }
    +    val topo = getAcyclicCopy().topologicalOrderIterator
    +    topo.flatMap(_.sortBy(indexs(_)).iterator)
    --- End diff --
    
    Is `topologicalOrderIterator` already sorted ? Why do we sort again ? 


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-gearpump/pull/223


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139345072
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -318,12 +360,8 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * http://www.drdobbs.com/database/topological-sorting/184410262
        */
       def topologicalOrderWithCirclesIterator: Iterator[N] = {
    -    if (hasCycle()) {
    -      val topo = getAcyclicCopy().topologicalOrderIterator
    -      topo.flatMap(_.sortBy(_indexs(_)).iterator)
    -    } else {
    -      topologicalOrderIterator
    -    }
    +    val topo = getAcyclicCopy().topologicalOrderIterator
    +    topo.flatMap(_.sortBy(indexs(_)).iterator)
    --- End diff --
    
    It's not. `topo` is an `Iterator[List[N]]` so the items of topo is sorted but the item it self is a list and it's not sorted.


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139061240
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -140,64 +155,64 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * Current Graph is not changed.
        */
       def mapVertex[NewNode](fun: N => NewNode): Graph[NewNode, E] = {
    -    val vertexes = vertices.map(node => (node, fun(node)))
    +    val newVertices = getVertices.map(node => (node, fun(node)))
     
    -    val vertexMap: Map[N, NewNode] = vertexes.toMap
    +    val vertexMap: Map[N, NewNode] = newVertices.toMap
     
    -    val newEdges = edges.map { edge =>
    +    val newEdges = getEdges.map { edge =>
           (vertexMap(edge._1), edge._2, vertexMap(edge._3))
         }
    -    new Graph(vertexes.map(_._2), newEdges)
    +    new Graph(newVertices.map(_._2), newEdges)
       }
     
       /**
        * Map a graph to a new graph, with edge converted to new type
        * Current graph is not changed.
        */
       def mapEdge[NewEdge](fun: (N, E, N) => NewEdge): Graph[N, NewEdge] = {
    -    val newEdges = edges.map { edge =>
    +    val newEdges = getEdges.map { edge =>
           (edge._1, fun(edge._1, edge._2, edge._3), edge._3)
         }
    -    new Graph(vertices, newEdges)
    +    new Graph(getVertices, newEdges)
       }
     
       /**
        * edges connected to node
        */
       def edgesOf(node: N): List[(N, E, N)] = {
    -    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).toSet[(N, E, N)].toList.sortBy(_indexs(_))
    +    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).toList
    --- End diff --
    
    no need to sort now ?


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    We also do an extra check for cycle here. Can it be removed ?
    
    ```scala
        if (dag.hasCycle()) {
          LOG.warn(s"Detected cycles in DAG of application $name!")
        }
    
        val indices = dag.topologicalOrderWithCirclesIterator.toList.zipWithIndex.toMap
    ```


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138875971
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +259,34 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    -
    -    while (!newGraph.isEmpty) {
    -      output ++= newGraph.removeZeroInDegree
    +    tryTopologicalOrderIterator.get
    --- End diff --
    
    This will throw exception on failure ? The callers don't know.


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138876219
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -355,19 +393,7 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * check whether there is a loop
        */
       def hasCycle(): Boolean = {
    -    @tailrec
    -    def detectCycle(graph: Graph[N, E]): Boolean = {
    -      if (graph.edges.isEmpty) {
    -        false
    -      } else if (graph.vertices.nonEmpty && !graph.vertices.exists(graph.inDegreeOf(_) == 0)) {
    -        true
    -      } else {
    -        graph.removeZeroInDegree
    -        detectCycle(graph)
    -      }
    -    }
    -
    -    detectCycle(copy)
    +    tryTopologicalOrderIterator.isFailure
    --- End diff --
    
    can we cache result here ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138873545
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -81,28 +86,35 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * out degree
        */
       def outDegreeOf(node: N): Int = {
    -    edges.count(_._1 == node)
    +    outgoingEdgesOf(node).size
       }
     
       /**
        * in degree
        */
       def inDegreeOf(node: N): Int = {
    -    edges.count(_._3 == node)
    +    incomingEdgesOf(node).size
       }
     
       /**
        * out going edges.
        */
       def outgoingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._1 == node)
    +    _outEdges.getOrElse(node, List.empty)
       }
     
       /**
        * incoming edges.
        */
       def incomingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._3 == node)
    +    _inEdges.getOrElse(node, List.empty)
    +  }
    +
    +  /**
    +   * adjacent vertices.
    +   */
    +  def adjacentVertices(node: N): List[N] = {
    --- End diff --
    
    how about vertices of incoming edges ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138880775
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -81,28 +86,35 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * out degree
        */
       def outDegreeOf(node: N): Int = {
    -    edges.count(_._1 == node)
    +    outgoingEdgesOf(node).size
       }
     
       /**
        * in degree
        */
       def inDegreeOf(node: N): Int = {
    -    edges.count(_._3 == node)
    +    incomingEdgesOf(node).size
       }
     
       /**
        * out going edges.
        */
       def outgoingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._1 == node)
    +    _outEdges.getOrElse(node, List.empty)
       }
     
       /**
        * incoming edges.
        */
       def incomingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._3 == node)
    +    _inEdges.getOrElse(node, List.empty)
    +  }
    +
    +  /**
    +   * adjacent vertices.
    +   */
    +  def adjacentVertices(node: N): List[N] = {
    --- End diff --
    
    It's a directed graph


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139061587
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +258,36 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    +    tryTopologicalOrderIterator match {
    +      case Success(intertor) => intertor
    --- End diff --
    
    intertor ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139049494
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -81,28 +86,35 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * out degree
        */
       def outDegreeOf(node: N): Int = {
    -    edges.count(_._1 == node)
    +    outgoingEdgesOf(node).size
       }
     
       /**
        * in degree
        */
       def inDegreeOf(node: N): Int = {
    -    edges.count(_._3 == node)
    +    incomingEdgesOf(node).size
       }
     
       /**
        * out going edges.
        */
       def outgoingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._1 == node)
    +    _outEdges.getOrElse(node, List.empty)
       }
     
       /**
        * incoming edges.
        */
       def incomingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._3 == node)
    +    _inEdges.getOrElse(node, List.empty)
    +  }
    +
    +  /**
    +   * adjacent vertices.
    +   */
    +  def adjacentVertices(node: N): List[N] = {
    --- End diff --
    
    sure


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139049164
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -81,28 +86,35 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * out degree
        */
       def outDegreeOf(node: N): Int = {
    -    edges.count(_._1 == node)
    +    outgoingEdgesOf(node).size
       }
     
       /**
        * in degree
        */
       def inDegreeOf(node: N): Int = {
    -    edges.count(_._3 == node)
    +    incomingEdgesOf(node).size
       }
     
       /**
        * out going edges.
        */
       def outgoingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._1 == node)
    +    _outEdges.getOrElse(node, List.empty)
       }
     
       /**
        * incoming edges.
        */
       def incomingEdgesOf(node: N): List[(N, E, N)] = {
    -    edges.filter(_._3 == node)
    +    _inEdges.getOrElse(node, List.empty)
    +  }
    +
    +  /**
    +   * adjacent vertices.
    +   */
    +  def adjacentVertices(node: N): List[N] = {
    --- End diff --
    
    can this be private ? We don't know this is a DAG from outside.


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138876553
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -318,11 +355,12 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * http://www.drdobbs.com/database/topological-sorting/184410262
        */
       def topologicalOrderWithCirclesIterator: Iterator[N] = {
    -    if (hasCycle()) {
    +    val result = tryTopologicalOrderIterator
    --- End diff --
    
    `match case` looks more clear


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    LGTM


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139061668
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +258,36 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    +    tryTopologicalOrderIterator match {
    +      case Success(intertor) => intertor
    +      case Failure(_) => topologicalOrderWithCirclesIterator
    --- End diff --
    
    should we add a warning here ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138881132
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +259,34 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    -
    -    while (!newGraph.isEmpty) {
    -      output ++= newGraph.removeZeroInDegree
    +    tryTopologicalOrderIterator.get
    --- End diff --
    
    I think exception is better than dead loop


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138875439
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -28,6 +29,8 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
     
       private val _vertices = mutable.Set.empty[N]
       private val _edges = mutable.Set.empty[(N, E, N)]
    +  private val _outEdges = mutable.Map.empty[N, List[(N, E, N)]]
    --- End diff --
    
    Will `Set` perform better than `List` here ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138876832
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +259,34 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    -
    -    while (!newGraph.isEmpty) {
    -      output ++= newGraph.removeZeroInDegree
    +    tryTopologicalOrderIterator.get
    +  }
    +
    +  private def tryTopologicalOrderIterator: Try[Iterator[N]] = {
    +    Try {
    +      val indegreeMap = mutable.Map.empty[N, Int]
    --- End diff --
    
    why not generate `indegreeMap` with `vertices.map` ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138880939
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -165,7 +181,7 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * edges connected to node
        */
       def edgesOf(node: N): List[(N, E, N)] = {
    -    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).toSet[(N, E, N)].toList.sortBy(_indexs(_))
    +    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).distinct.sortBy(_indexs(_))
    --- End diff --
    
    ```
     // This is used to ensure the output of this Graph is always stable
     // Like method vertices(), or edges()
     private var _indexs = Map.empty[Any, Int]
    ````


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by huafengw <gi...@git.apache.org>.
Github user huafengw commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138882103
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -355,19 +393,7 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * check whether there is a loop
        */
       def hasCycle(): Boolean = {
    -    @tailrec
    -    def detectCycle(graph: Graph[N, E]): Boolean = {
    -      if (graph.edges.isEmpty) {
    -        false
    -      } else if (graph.vertices.nonEmpty && !graph.vertices.exists(graph.inDegreeOf(_) == 0)) {
    -        true
    -      } else {
    -        graph.removeZeroInDegree
    -        detectCycle(graph)
    -      }
    -    }
    -
    -    detectCycle(copy)
    +    tryTopologicalOrderIterator.isFailure
    --- End diff --
    
    The graph is mutable so we have to check whether the Graph is changed if caching the result.


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    Verified Beam ValidatesRunnerTests can be run much faster now.


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r139049560
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -156,64 +155,64 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * Current Graph is not changed.
        */
       def mapVertex[NewNode](fun: N => NewNode): Graph[NewNode, E] = {
    -    val vertexes = vertices.map(node => (node, fun(node)))
    +    val newVertexes = getVertices.map(node => (node, fun(node)))
     
    -    val vertexMap: Map[N, NewNode] = vertexes.toMap
    +    val vertexMap: Map[N, NewNode] = newVertexes.toMap
     
    -    val newEdges = edges.map { edge =>
    +    val newEdges = getEdges.map { edge =>
           (vertexMap(edge._1), edge._2, vertexMap(edge._3))
         }
    -    new Graph(vertexes.map(_._2), newEdges)
    +    new Graph(newVertexes.map(_._2), newEdges)
    --- End diff --
    
    I believe the plural of Vertex is Vertices


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    @huafengw could you explain a bit why it's faster now.


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138883853
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -243,13 +259,34 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * The node returned by Iterator is stable sorted.
        */
       def topologicalOrderIterator: Iterator[N] = {
    -    val newGraph = copy
    -    var output = List.empty[N]
    -
    -    while (!newGraph.isEmpty) {
    -      output ++= newGraph.removeZeroInDegree
    +    tryTopologicalOrderIterator.get
    --- End diff --
    
    Then return a `Try` is better since the caller will be aware this method may fail then.


---

[GitHub] incubator-gearpump issue #223: [GEARPUMP-349] Optimize Graph topologicalOrde...

Posted by codecov-io <gi...@git.apache.org>.
Github user codecov-io commented on the issue:

    https://github.com/apache/incubator-gearpump/pull/223
  
    # [Codecov](https://codecov.io/gh/apache/incubator-gearpump/pull/223?src=pr&el=h1) Report
    > Merging [#223](https://codecov.io/gh/apache/incubator-gearpump/pull/223?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gearpump/commit/b79fb2445b88f0e7aab71e72404ca5a449fc97f1?src=pr&el=desc) will **decrease** coverage by `0.02%`.
    > The diff coverage is `96%`.
    
    
    
    ```diff
    @@            Coverage Diff             @@
    ##           master     #223      +/-   ##
    ==========================================
    - Coverage   69.05%   69.03%   -0.03%     
    ==========================================
      Files         191      191              
      Lines        6104     6116      +12     
      Branches      350      352       +2     
    ==========================================
    + Hits         4215     4222       +7     
    - Misses       1889     1894       +5
    ```
    



---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138879455
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -28,6 +29,8 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
     
       private val _vertices = mutable.Set.empty[N]
       private val _edges = mutable.Set.empty[(N, E, N)]
    +  private val _outEdges = mutable.Map.empty[N, List[(N, E, N)]]
    +  private val _inEdges = mutable.Map.empty[N, List[(N, E, N)]]
    --- End diff --
    
    Why do we have "_abc" style variable names here ?


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138884320
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -355,19 +393,7 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * check whether there is a loop
        */
       def hasCycle(): Boolean = {
    -    @tailrec
    -    def detectCycle(graph: Graph[N, E]): Boolean = {
    -      if (graph.edges.isEmpty) {
    -        false
    -      } else if (graph.vertices.nonEmpty && !graph.vertices.exists(graph.inDegreeOf(_) == 0)) {
    -        true
    -      } else {
    -        graph.removeZeroInDegree
    -        detectCycle(graph)
    -      }
    -    }
    -
    -    detectCycle(copy)
    +    tryTopologicalOrderIterator.isFailure
    --- End diff --
    
    sure, this can be left as a future optimization and I'm not a big fan of "mutable"


---

[GitHub] incubator-gearpump pull request #223: [GEARPUMP-349] Optimize Graph topologi...

Posted by manuzhang <gi...@git.apache.org>.
Github user manuzhang commented on a diff in the pull request:

    https://github.com/apache/incubator-gearpump/pull/223#discussion_r138875669
  
    --- Diff: core/src/main/scala/org/apache/gearpump/util/Graph.scala ---
    @@ -165,7 +181,7 @@ class Graph[N, E](vertexList: List[N], edgeList: List[(N, E, N)]) extends Serial
        * edges connected to node
        */
       def edgesOf(node: N): List[(N, E, N)] = {
    -    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).toSet[(N, E, N)].toList.sortBy(_indexs(_))
    +    (incomingEdgesOf(node) ++ outgoingEdgesOf(node)).distinct.sortBy(_indexs(_))
    --- End diff --
    
    why do we need to `distinct` and `sort` here ?


---