You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by spmallette <gi...@git.apache.org> on 2017/03/28 16:31:51 UTC

[GitHub] tinkerpop pull request #587: TINKERPOP-1642 Performance Enhancement on Mutat...

GitHub user spmallette opened a pull request:

    https://github.com/apache/tinkerpop/pull/587

    TINKERPOP-1642 Performance Enhancement on Mutating Traversal Execution

    https://issues.apache.org/jira/browse/TINKERPOP-1642
    
    This is a focused effort to improve the performance of traversals that mutate the graph.  All improvements were made on gremlin-core and should therefore be helpful to all `Graph` implementations. Here's the benchmarks at 3.2.4:
    
    ```text
    Benchmark                                           Mode  Cnt        Score         Error  Units
    GraphMutateBenchmark.testAddE                      thrpt   20    20010.155 �     621.783  ops/s
    GraphMutateBenchmark.testAddEdge                   thrpt   20  3609684.997 �  195545.641  ops/s
    GraphMutateBenchmark.testAddV                      thrpt   20   288621.370 �   12576.655  ops/s
    GraphMutateBenchmark.testAddVAddEWithPropsChained  thrpt   20        5.402 �       0.084  ops/s
    GraphMutateBenchmark.testAddVWithProps             thrpt   20    21924.945 �     230.668  ops/s
    GraphMutateBenchmark.testAddVWithPropsChained      thrpt   20      184.896 �       4.875  ops/s
    GraphMutateBenchmark.testAddVertex                 thrpt   20  5830447.734 �  131727.984  ops/s
    GraphMutateBenchmark.testAddVertexWithProps        thrpt   20   163131.738 �    2647.117  ops/s
    GraphMutateBenchmark.testEdgeProperty              thrpt   20  8971499.256 � 1630382.353  ops/s
    GraphMutateBenchmark.testEdgePropertyStep          thrpt   20   125053.249 �    4770.478  ops/s
    GraphMutateBenchmark.testVertexProperty            thrpt   20  3741129.376 �  286173.817  ops/s
    GraphMutateBenchmark.testVertexPropertyStep        thrpt   20   121742.106 �    4339.508  ops/s
    ```
    
    and here's the benchmarks now in this PR which will apply to 3.2.5 and 3.3.0:
    
    ```text
    Benchmark                                           Mode  Cnt        Score        Error  Units
    GraphMutateBenchmark.testAddE                      thrpt   20    26921.570 �    241.374  ops/s
    GraphMutateBenchmark.testAddEdge                   thrpt   20  3567550.566 � 185748.638  ops/s
    GraphMutateBenchmark.testAddV                      thrpt   20   341523.214 �  10678.074  ops/s
    GraphMutateBenchmark.testAddVAddEWithPropsChained  thrpt   20      462.173 �     46.842  ops/s
    GraphMutateBenchmark.testAddVWithProps             thrpt   20    46946.475 �    508.174  ops/s
    GraphMutateBenchmark.testAddVWithPropsChained      thrpt   20      430.650 �      5.941  ops/s
    GraphMutateBenchmark.testAddVertex                 thrpt   20  6362284.245 � 294203.426  ops/s
    GraphMutateBenchmark.testAddVertexWithProps        thrpt   20   165223.769 �   4788.481  ops/s
    GraphMutateBenchmark.testEdgeProperty              thrpt   20  8490043.427 � 575562.362  ops/s
    GraphMutateBenchmark.testEdgePropertyStep          thrpt   20   142430.605 �   3452.101  ops/s
    GraphMutateBenchmark.testVertexProperty            thrpt   20  4016719.775 � 587856.207  ops/s
    GraphMutateBenchmark.testVertexPropertyStep        thrpt   20   131740.946 �   1525.710  ops/s
    ```
    
    The key things to note:
    
    * A simple `g.addV()` got about 20% faster 
    * A simple `g.addE()` got about 34% faster
    * A long chained set of `addV()` where multiple vertices were added in a single traversal became a bit more than twice as fast.
    * A long chained set of `addV()` and `addE()` where multiple vertices/edges were added in a single traversal became 92 times faster
    * These changes did not hurt performance of the "non-mutating" traversals - those benchmarks remained about the same.
    
    Obviously, these numbers still lag where we want to be, but the changes in place represent a good body of work at this point and have opened up other ideas for improvements which can be dealt with as other specific issues/pull requests.
    
    VOTE +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tinkerpop TINKERPOP-1642

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/587.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #587
    
----
commit 4ffd19e0bba9408aa648521b4b94d1997055a796
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-02T15:44:58Z

    TINKERPOP-1642 Improved performance of addV() and chained mutating statements
    
    Added more tests for Parameters. Performance improved by about 2.5x given the benchmarks. Also long chained mutating traversals are now inserting roughly on par with single insert traversals (prior to this change they were about 25% slower).

commit c8cc470e38b6f2ef3f87fa6cfebb59a92e6301d2
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-03T11:47:56Z

    TINKERPOP-1642 Cached the Traversal list in Parameters
    
    This leads to a pretty good boost in performance from the prior commit especially for long chained mutating traversals that add vertices and edges as there is a 3X improvement in those cases.

commit 632fd219cd4452486dd45fb3a748f897fd751904
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-03T13:45:24Z

    TINKERPOP-1642 Minor fixes

commit 0406a92d2530b2486bc81ca6d6ab4eacd0094a9a
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-03T16:27:58Z

    Added ScopingStrategy which does a single TraversalHelper.getLabels() call and recurssively tells where() and select() steps the path labels accessed by the Traversal.

commit 9f7b7591a3ea0782b9389061d74db2586ff90430
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-06T15:07:39Z

    Add a slight optimization to ScopingStrategy that will not compute step labels if no Scoping steps are defined. Also added JavaDoc and comments accordingly.

commit 2549650f62148c21f1217af629b1ef04fdd9bbe7
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-08T14:37:07Z

    Wow. Can't believe we didn't do this from the start. LABELED_PATH is set if a Step is labeled. Dar. So much simpler than all the recurssion in TraversalHelper.getLabels() and having a ScopingStrategy.... @spmallette -- can you verify that your Mutating traversal profiling is faster now.

commit 9185f45273cefca3a0b622e87e1eab825dea0ffe
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-10T17:31:38Z

    TINKERPOP-1642 Removed some extra iteration in Parameters.getTraversals()

commit 4838f7b0eb0f2f97b3dcad18a72295dc655dd463
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-10T18:27:01Z

    TINKERPOP-1642 Minor refactoring to get rid of duplicate code

commit 9e4942cf83e5add39838605641112438b093dbd7
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-10T20:11:52Z

    TINKERPOP-1642 Made Mutating steps implement Scoping.
    
    This simplified PathUtil.getReferencedLabels() and reduced iterations over Parameter traversals.

commit d4467512345b39dd0a8f89309d26039b0d5e96f8
Author: Daniel Kuppitz <da...@hotmail.com>
Date:   2017-03-13T15:01:12Z

    Fixed VertexProgram test that verifies proper TraversalRequirements.

commit 9893e49ae03cc1a474a6b8085d0a4e3d5eff3b36
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-13T18:50:09Z

    TINKERPOP-1642 Pushed integrateChild() operations into Parameters.set()
    
    There was no need to continually iterate the traversal list and doing integrateChild() over and over again for each newly added traversal. Removing those wasteful cycles and only doing that once, when the Traversal was added helped improve performance.

commit d39ef5caf0e1fc51767202b9bf28a12117a51a40
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-17T17:57:52Z

    I made it so ProfileStrategy uses the MARKER model developed by @dkuppitz and myself to reduce recurssive lookups in strategies. Moving forward, we really need to move to a Traversal.metadata model as using labeled steps is a bit hackish.

commit 3929a6748e3b54692c0d103319965dd0d47ba3e1
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-17T19:04:58Z

    added MARKER model to PathRetractionStrategy to reduce recurssive lookups of invalidating steps. Again, we really need Traversal.metdata to make this more 'pure'.

commit 0824625f4db8e0fdc483183faf8d505f3300b6b4
Author: Stephen Mallette <sp...@genoprime.com>
Date:   2017-03-27T18:03:26Z

    TINKERPOP-1642 Fixed up some execution problems after rebase
    
    The logic still doesn't seem quite right but at least the tests pass and a complete build is achieved. The issue is related to new stuff added in PathRetractionStrategy on TINKERPOP-1652.

commit fc3fcb38255aa49bd4a311172d0d8b84e31dfdb9
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2017-03-28T13:46:31Z

    removed a repeated recurssion in PathRetractionStrategy using the MARKER model of strategies.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop pull request #587: TINKERPOP-1642 Performance Enhancement on Mutat...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tinkerpop/pull/587


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    weird @twilmes - i wonder what that traversal is doing that none of our tests are doing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    Yeah - @okram mentioned the same thing. I thought it had passed for me earlier. I had a number of builds running on different systems. I think I attributed that failure to a different branch by mistake. Anyway - now that we've confirmed that there is a build issue, I wonder what the problem is. I don't recall doing anything with `sack()` in this change. Judging from the failing line, it looks like `sideEffects` field in `B_LP_O_S_SE_SL_Traverser` is null. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    Fixed the bugs that both @dkuppitz and @twilmes discovered.
    
    VOTE +1.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    Note that `tp32/` doesn't have the issue. Thus, the bug is confined to this branch (luckily).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by robertdale <gi...@git.apache.org>.
Github user robertdale commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    My app tests pass. Also noticed a slight reduction in test time. :grinning:
    
    VOTE +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by twilmes <gi...@git.apache.org>.
Github user twilmes commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    This is an impressive set of performance improvements. I was testing and ran into an issue with this following query.  It ran successfully against `tp32`.
    
    ```
    graph = TinkerGraph.open()
    g = graph.traversal()
    g.addV().as("first").repeat(addE("next").to(addV()).inV()).times(5).addE("next").to(select("first"))
    
    java.lang.IllegalStateException: The traversal strategies are complete and the traversal can no longer be modulated
    	at org.apache.tinkerpop.gremlin.process.traversal.Traversal$Exceptions.traversalIsLocked(Traversal.java:274)
    	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.applyStrategies(DefaultTraversal.java:112)
    	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.applyStrategies(DefaultTraversal.java:129)
    	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:186)
    	at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
    	at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:234)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:497)
    	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
    	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
    	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294) ...
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    `docker/build.sh -t -n -i` failed.
    
    Got an NPE in `SparkGraphComputerGroovyProcessIntegrateTest`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/587
  
    `docker/build.sh -t -n -i` succeeded.
    
    VOTE: +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---