You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by okram <gi...@git.apache.org> on 2016/02/09 15:47:34 UTC

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215

    TINKERPOP-971: TraversalSource should be fluent like GraphComputer

    https://issues.apache.org/jira/browse/TINKERPOP-971
    
    The concept of a `TraversalSource` is now very explicit and used extensively. Any traversal is composed of both a `TraversalSource` and a `Traversal`. Prior to this PR, `TraversalSources` were manipulated via a very awkward/error prone "builder model." Moreover, there were objects that didn't need to exist -- e.g. `TraversalEngine`. With this PR, the API now looks like this:
    
    ```
    g = graph.traversal().withCompute(SparkGraphComputer.class).withStrategies(ReadOnlyStrategy.instance())
    g.V().count()
    ```
    
    Note that `withComputer()` has lots of useful overloads.
    
    ```
    g = graph.traversal().withComputer(h -> h.compute(SparkGraphComputer.class).workers(10))
    ```
    
    This change is "lightly breaking." For users, nothing changes. The previous `Builder` model still exists in a `Deprecated` state. All graph system providers have to do is:
    
    * Implement `XXXGraphProvider.getGraphComputer()` (if and only if they support `GraphComputer`).
    * Change `traverser.getEngine().isGraphComputer()` to `traversal.getStrategies().onGraphComputer()` if they have custom `TraversalStrategy` implementations.
    * Change `implements EngineDependent` to `implements GraphComputing` if they have custom steps that implemented `EngineDependent`.
    
    CHANGELOG:
    
    ```
    * Refactored `TraversalSource` model to allow fluent-method construction of `TraversalSources`.
    * Deprecated the concept of a `TraversalSource.Builder`.
    * Removed the concept of a `TraversalEngine`. All `Traversal` modulations now mediated by `TraversalStrategies`. (*breaking*)
    * Added `SideEffectStrategy` for registering sideEffects in a spawned `Traversal`.
    * Added `SackStrategy` for registering a sack for a spawned `Traversal`.
    * Added `RequirementsStrategy` and `RequirementsStep` for adding dynamic `TraverserRequirements` to a `Traversal`.
    * Removed `EngineDependentStrategy`.
    * Renamed step interface `EngineDependent` to `GraphComputing` with method `onGraphComputer()`. (*breaking*)
    * Cleaned up various `TraversalStrategy` tests now that `TraversalEngine` no longer exists.
    ```
    
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-971

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #215
    
----
commit 5f2cd6770d56b69c3b0eebf07152315237837f12
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2016-02-09T00:35:46Z

    The most brutal refactor. There is no such thing as a TraversalSource.Builder anymore. All there are are TraversalSources. These are simply a wrapper of a TraversalStrategy and a Graph. There is no such thing as a TraversalEngine anymore. What was ComputerTraversalEngine is now simply a strategy of TraversalVertexProgramStrategy. There is no such thing as EngineDependentStrategy -- this is simply TraversalVertexProgramStrategy. In short, all there are are Graphs and TraversalStrategies. So so so so so much cleaner. This WILL NOT be backwards compatible for graph system providers. They will need to update their GraphProvider implementations -- two methods changed. Moreover, if they have an XXXStrategy that calls traversal.getEngine().isComputer(), they will need to change that to traversal.getStrategies().onGraphComputer(). Really simple to for them to fix. It took me less a minute to fix up Spark, Giraph, and Neo4j. These changes WILL be backwards compatible for users at the leve
 l of graph.traversal(computer(SparkGraphComputer)) still working (though Depcrecation introduced). The GraphTraversal.Builder will simply turn into a bunch of TravesalSource calls. It is not used outside of this context.  This commit is NOT complete and will be completed on the next sprint. Moreover, there are lots of cleanups and JavaDocing that I need to do -- I just don't want to lose this work so pushing.

commit 908433faf631f6d585dd2ff662c430da030f2857
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2016-02-09T12:36:13Z

    added back in ComputerTraversalEngine, StandardTraversalEngine, and TraversalSource.Builder. All are deprecated but are functional with respects to Graph.traversal(TravesalSource.Builder). Both TinkerGraphProvider and SparkGraphProvider have a 50-percent chance of either using the new traversal source model or the old deprecated builder model. This way, we are certain that, for users, this push is backwards compatible. Note that for graph system providers, there is only one new method they need to implement in their XXXGraphProvider test code -- getGraphComputer(). Thus, this ticket is extremely gentle on both users and providers even though it is a massive refactor.

commit 229474ab17a20de9be93325f8d170d915f60414a
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2016-02-09T14:29:08Z

    Added lots of JavaDoc. Lots of organization and clean up. Ready for PR.

commit 2db6faacff947a3ee456585714101d6ce0399f35
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2016-02-09T14:42:14Z

    renamed EngineDependent to GraphComputing as there is no longer the concept of a TraversalEngine.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182849111
  
    New code:
    
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-61ffd0f149055394746c1d94ccab52e3R65
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-bd7750c1c674aeb94e058006af320abfR50
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-61f8ab6e66876a272a1fa1f3556b3012R51
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-3f79b4cdbc46d5f73ad1864f7d4402a5R80
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-34c81b4d45231b961aa8f748f3f4a339R72
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-94d7c8c44b3975d2312b04c7f54e500cR47
    * https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-b6bca124a8cddd1cd9a8c0b004cd1634R53
    
    Old code, but touched by the commit:
    
    *  https://github.com/apache/incubator-tinkerpop/pull/215/files#diff-0f4959bc7926d8e5fbd6395e96f19502R76
    
    At least some of these code lines were probably copied from somewhere else as part of the refactoring process. However, as said before, it would probably be better to get rid of `stream()` once we stumble across it in the code we're working on.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182685148
  
    * `mvn clean install`: passed
    * integration tests (incl. Neo4j): passed
    
    VOTE: +1
    
    One negative note though: Strategies are mainly respsonsible for a significant performance gap between graph methods and their respective traversal methods. As we've seen in the past, `.stream()...` performs terrible compared to an old school `for` loop, hence I don't think we should use `.stream()` so frequently in new code. Better we get rid of `.stream()` altogether.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182948628
  
    Did a basic build with `mvn clean install` and reviewed docs - looks good:
    
    VOTE +1 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182636140
  
    Note that when TINKERPOP-1140 is completed, `traversal.getStrategies().onGraphComputer()` will be replaced by `TraversalHelper.onGraphComputer(traversal)`. Minor note as this is one of the breaking changes for providers.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182853040
  
    A response to each item in your list.
    
    * This will go away when TINKERPOP-1140 is complete. This is a "staging method" for now.
    * This is not a critical performance line of code. This is during `TraversalSource` construction which is like saying "a database connection creation method uses stream."
    * Same as previous.
    * Same as previous -- submitting a `GraphComputer` job is crazy costly relative to this simple `stream()`.
    * This will go away when TINKERPOP-1140 is complete. This is a "staging method" for now.
    * This is test suite setup code. Performance doesn't matter.
    * Same as previous.
    
    To your "old code" reference:
    
    * Yes, we should not use `stream()` here. This class needs to be blazing fast. I believe you and I are the author of this class and we should fix it. That is for another ticket.
    
    NOTE: We should also remove all the `stream()` work in `StarGraph` (there is a butt load).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182069459
  
    ```
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 3:16:47.194s
    [INFO] Finished at: Tue Feb 09 14:06:43 MST 2016
    [INFO] Final Memory: 96M/731M
    [INFO] ------------------------------------------------------------------------
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-182811160
  
    @dkuppitz -- where in particular are you seeing the use of `stream()`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-tinkerpop/pull/215


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-971: TraversalSource s...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/215#issuecomment-181957720
  
    VOTE +1. `mvn clean install`, integration tested, and built/published docs.
    
    You can review the docs here: http://tinkerpop.apache.org/docs/3.2.0-SNAPSHOT/reference/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---