You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by spmallette <gi...@git.apache.org> on 2017/03/28 16:31:51 UTC
[GitHub] tinkerpop pull request #587: TINKERPOP-1642 Performance Enhancement on Mutat...
GitHub user spmallette opened a pull request:
https://github.com/apache/tinkerpop/pull/587
TINKERPOP-1642 Performance Enhancement on Mutating Traversal Execution
https://issues.apache.org/jira/browse/TINKERPOP-1642
This is a focused effort to improve the performance of traversals that mutate the graph. All improvements were made on gremlin-core and should therefore be helpful to all `Graph` implementations. Here's the benchmarks at 3.2.4:
```text
Benchmark Mode Cnt Score Error Units
GraphMutateBenchmark.testAddE thrpt 20 20010.155 � 621.783 ops/s
GraphMutateBenchmark.testAddEdge thrpt 20 3609684.997 � 195545.641 ops/s
GraphMutateBenchmark.testAddV thrpt 20 288621.370 � 12576.655 ops/s
GraphMutateBenchmark.testAddVAddEWithPropsChained thrpt 20 5.402 � 0.084 ops/s
GraphMutateBenchmark.testAddVWithProps thrpt 20 21924.945 � 230.668 ops/s
GraphMutateBenchmark.testAddVWithPropsChained thrpt 20 184.896 � 4.875 ops/s
GraphMutateBenchmark.testAddVertex thrpt 20 5830447.734 � 131727.984 ops/s
GraphMutateBenchmark.testAddVertexWithProps thrpt 20 163131.738 � 2647.117 ops/s
GraphMutateBenchmark.testEdgeProperty thrpt 20 8971499.256 � 1630382.353 ops/s
GraphMutateBenchmark.testEdgePropertyStep thrpt 20 125053.249 � 4770.478 ops/s
GraphMutateBenchmark.testVertexProperty thrpt 20 3741129.376 � 286173.817 ops/s
GraphMutateBenchmark.testVertexPropertyStep thrpt 20 121742.106 � 4339.508 ops/s
```
and here's the benchmarks now in this PR which will apply to 3.2.5 and 3.3.0:
```text
Benchmark Mode Cnt Score Error Units
GraphMutateBenchmark.testAddE thrpt 20 26921.570 � 241.374 ops/s
GraphMutateBenchmark.testAddEdge thrpt 20 3567550.566 � 185748.638 ops/s
GraphMutateBenchmark.testAddV thrpt 20 341523.214 � 10678.074 ops/s
GraphMutateBenchmark.testAddVAddEWithPropsChained thrpt 20 462.173 � 46.842 ops/s
GraphMutateBenchmark.testAddVWithProps thrpt 20 46946.475 � 508.174 ops/s
GraphMutateBenchmark.testAddVWithPropsChained thrpt 20 430.650 � 5.941 ops/s
GraphMutateBenchmark.testAddVertex thrpt 20 6362284.245 � 294203.426 ops/s
GraphMutateBenchmark.testAddVertexWithProps thrpt 20 165223.769 � 4788.481 ops/s
GraphMutateBenchmark.testEdgeProperty thrpt 20 8490043.427 � 575562.362 ops/s
GraphMutateBenchmark.testEdgePropertyStep thrpt 20 142430.605 � 3452.101 ops/s
GraphMutateBenchmark.testVertexProperty thrpt 20 4016719.775 � 587856.207 ops/s
GraphMutateBenchmark.testVertexPropertyStep thrpt 20 131740.946 � 1525.710 ops/s
```
The key things to note:
* A simple `g.addV()` got about 20% faster
* A simple `g.addE()` got about 34% faster
* A long chained set of `addV()` where multiple vertices were added in a single traversal became a bit more than twice as fast.
* A long chained set of `addV()` and `addE()` where multiple vertices/edges were added in a single traversal became 92 times faster
* These changes did not hurt performance of the "non-mutating" traversals - those benchmarks remained about the same.
Obviously, these numbers still lag where we want to be, but the changes in place represent a good body of work at this point and have opened up other ideas for improvements which can be dealt with as other specific issues/pull requests.
VOTE +1
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/tinkerpop TINKERPOP-1642
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/tinkerpop/pull/587.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #587
----
commit 4ffd19e0bba9408aa648521b4b94d1997055a796
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-02T15:44:58Z
TINKERPOP-1642 Improved performance of addV() and chained mutating statements
Added more tests for Parameters. Performance improved by about 2.5x given the benchmarks. Also long chained mutating traversals are now inserting roughly on par with single insert traversals (prior to this change they were about 25% slower).
commit c8cc470e38b6f2ef3f87fa6cfebb59a92e6301d2
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-03T11:47:56Z
TINKERPOP-1642 Cached the Traversal list in Parameters
This leads to a pretty good boost in performance from the prior commit especially for long chained mutating traversals that add vertices and edges as there is a 3X improvement in those cases.
commit 632fd219cd4452486dd45fb3a748f897fd751904
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-03T13:45:24Z
TINKERPOP-1642 Minor fixes
commit 0406a92d2530b2486bc81ca6d6ab4eacd0094a9a
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-03T16:27:58Z
Added ScopingStrategy which does a single TraversalHelper.getLabels() call and recurssively tells where() and select() steps the path labels accessed by the Traversal.
commit 9f7b7591a3ea0782b9389061d74db2586ff90430
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-06T15:07:39Z
Add a slight optimization to ScopingStrategy that will not compute step labels if no Scoping steps are defined. Also added JavaDoc and comments accordingly.
commit 2549650f62148c21f1217af629b1ef04fdd9bbe7
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-08T14:37:07Z
Wow. Can't believe we didn't do this from the start. LABELED_PATH is set if a Step is labeled. Dar. So much simpler than all the recurssion in TraversalHelper.getLabels() and having a ScopingStrategy.... @spmallette -- can you verify that your Mutating traversal profiling is faster now.
commit 9185f45273cefca3a0b622e87e1eab825dea0ffe
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-10T17:31:38Z
TINKERPOP-1642 Removed some extra iteration in Parameters.getTraversals()
commit 4838f7b0eb0f2f97b3dcad18a72295dc655dd463
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-10T18:27:01Z
TINKERPOP-1642 Minor refactoring to get rid of duplicate code
commit 9e4942cf83e5add39838605641112438b093dbd7
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-10T20:11:52Z
TINKERPOP-1642 Made Mutating steps implement Scoping.
This simplified PathUtil.getReferencedLabels() and reduced iterations over Parameter traversals.
commit d4467512345b39dd0a8f89309d26039b0d5e96f8
Author: Daniel Kuppitz <da...@hotmail.com>
Date: 2017-03-13T15:01:12Z
Fixed VertexProgram test that verifies proper TraversalRequirements.
commit 9893e49ae03cc1a474a6b8085d0a4e3d5eff3b36
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-13T18:50:09Z
TINKERPOP-1642 Pushed integrateChild() operations into Parameters.set()
There was no need to continually iterate the traversal list and doing integrateChild() over and over again for each newly added traversal. Removing those wasteful cycles and only doing that once, when the Traversal was added helped improve performance.
commit d39ef5caf0e1fc51767202b9bf28a12117a51a40
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-17T17:57:52Z
I made it so ProfileStrategy uses the MARKER model developed by @dkuppitz and myself to reduce recurssive lookups in strategies. Moving forward, we really need to move to a Traversal.metadata model as using labeled steps is a bit hackish.
commit 3929a6748e3b54692c0d103319965dd0d47ba3e1
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-17T19:04:58Z
added MARKER model to PathRetractionStrategy to reduce recurssive lookups of invalidating steps. Again, we really need Traversal.metdata to make this more 'pure'.
commit 0824625f4db8e0fdc483183faf8d505f3300b6b4
Author: Stephen Mallette <sp...@genoprime.com>
Date: 2017-03-27T18:03:26Z
TINKERPOP-1642 Fixed up some execution problems after rebase
The logic still doesn't seem quite right but at least the tests pass and a complete build is achieved. The issue is related to new stuff added in PathRetractionStrategy on TINKERPOP-1652.
commit fc3fcb38255aa49bd4a311172d0d8b84e31dfdb9
Author: Marko A. Rodriguez <ok...@gmail.com>
Date: 2017-03-28T13:46:31Z
removed a repeated recurssion in PathRetractionStrategy using the MARKER model of strategies.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop pull request #587: TINKERPOP-1642 Performance Enhancement on Mutat...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/tinkerpop/pull/587
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:
https://github.com/apache/tinkerpop/pull/587
weird @twilmes - i wonder what that traversal is doing that none of our tests are doing.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:
https://github.com/apache/tinkerpop/pull/587
Yeah - @okram mentioned the same thing. I thought it had passed for me earlier. I had a number of builds running on different systems. I think I attributed that failure to a different branch by mistake. Anyway - now that we've confirmed that there is a build issue, I wonder what the problem is. I don't recall doing anything with `sack()` in this change. Judging from the failing line, it looks like `sideEffects` field in `B_LP_O_S_SE_SL_Traverser` is null.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:
https://github.com/apache/tinkerpop/pull/587
Fixed the bugs that both @dkuppitz and @twilmes discovered.
VOTE +1.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by okram <gi...@git.apache.org>.
Github user okram commented on the issue:
https://github.com/apache/tinkerpop/pull/587
Note that `tp32/` doesn't have the issue. Thus, the bug is confined to this branch (luckily).
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by robertdale <gi...@git.apache.org>.
Github user robertdale commented on the issue:
https://github.com/apache/tinkerpop/pull/587
My app tests pass. Also noticed a slight reduction in test time. :grinning:
VOTE +1
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by twilmes <gi...@git.apache.org>.
Github user twilmes commented on the issue:
https://github.com/apache/tinkerpop/pull/587
This is an impressive set of performance improvements. I was testing and ran into an issue with this following query. It ran successfully against `tp32`.
```
graph = TinkerGraph.open()
g = graph.traversal()
g.addV().as("first").repeat(addE("next").to(addV()).inV()).times(5).addE("next").to(select("first"))
java.lang.IllegalStateException: The traversal strategies are complete and the traversal can no longer be modulated
at org.apache.tinkerpop.gremlin.process.traversal.Traversal$Exceptions.traversalIsLocked(Traversal.java:274)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.applyStrategies(DefaultTraversal.java:112)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.applyStrategies(DefaultTraversal.java:129)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:186)
at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:232)
at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:234)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294) ...
```
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:
https://github.com/apache/tinkerpop/pull/587
`docker/build.sh -t -n -i` failed.
Got an NPE in `SparkGraphComputerGroovyProcessIntegrateTest`.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] tinkerpop issue #587: TINKERPOP-1642 Performance Enhancement on Mutating Tra...
Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:
https://github.com/apache/tinkerpop/pull/587
`docker/build.sh -t -n -i` succeeded.
VOTE: +1
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---