You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by spmallette <gi...@git.apache.org> on 2018/07/27 20:32:21 UTC

[GitHub] tinkerpop pull request #897: TINKERPOP-1967 connectedComponent()

GitHub user spmallette opened a pull request:

    https://github.com/apache/tinkerpop/pull/897

    TINKERPOP-1967 connectedComponent()

    https://issues.apache.org/jira/browse/TINKERPOP-1967
    
    Adds a `connectedComponent()` step. Was previously planning to hold this and #882 until i'd explored TINKERPOP-1991 in more detail, but I don't think adding that for 3.4.0 is reasonable. I think we cover a lot of the common problems/questions that we deal with day-to-day with these two algorithms so while we sorta extend the number of steps in the "core" space, Given that a separate "algorithm" space was determined after both of these steps were done, I think we can just include these for now and then work toward the "algorithm" space later. In the mean time we can look to avoid adding new "algorithm" steps to "core".
    
    All tests pass with `docker/build.sh -t -n -i`
    
    VOTE +1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tinkerpop TINKERPOP-1967

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/897.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #897
    
----

----


---

[GitHub] tinkerpop pull request #897: TINKERPOP-1967 connectedComponent()

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/897#discussion_r205981525
  
    --- Diff: docs/src/recipes/connected-components.asciidoc ---
    @@ -35,46 +54,92 @@ g.addV().property(id, "A").as("a").
       addE("link").from("d").to("e").iterate()
     ----
     
    -One way to detect the various subgraphs would be to do something like this:
    +==== Small graph traversals
     
    +Connected components in a small graph can be determined with either an OLTP traversal or the OLAP
    +`connectedComponent()`-step. The `connectedComponent()`-step is available as of TinkerPop 3.4.0 and is
    +described in more detail in the
    +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#connectedcomponent-step[Reference Documentation].
    +The traversal looks like:
     [gremlin-groovy,existing]
     ----
    -g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()).  <1>
    -  path().aggregate("p").                                                       <2>
    -  unfold().dedup().                                                            <3>
    -  map(__.as("v").select("p").unfold().                                         <4>
    -         filter(unfold().where(eq("v"))).
    -         unfold().dedup().order().by(id).fold()).
    -  dedup()                                                                      <5>
    +g.withComputer().V().connectedComponent().
    +    group().by('gremlin.connectedComponentVertexProgram.component').
    --- End diff --
    
    I think you should use the constant `ConnectedComponentVertexProgram.COMPONENT`.


---

[GitHub] tinkerpop pull request #897: TINKERPOP-1967 connectedComponent()

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/897#discussion_r206112146
  
    --- Diff: docs/src/recipes/connected-components.asciidoc ---
    @@ -35,46 +54,92 @@ g.addV().property(id, "A").as("a").
       addE("link").from("d").to("e").iterate()
     ----
     
    -One way to detect the various subgraphs would be to do something like this:
    +==== Small graph traversals
     
    +Connected components in a small graph can be determined with either an OLTP traversal or the OLAP
    +`connectedComponent()`-step. The `connectedComponent()`-step is available as of TinkerPop 3.4.0 and is
    +described in more detail in the
    +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#connectedcomponent-step[Reference Documentation].
    +The traversal looks like:
     [gremlin-groovy,existing]
     ----
    -g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()).  <1>
    -  path().aggregate("p").                                                       <2>
    -  unfold().dedup().                                                            <3>
    -  map(__.as("v").select("p").unfold().                                         <4>
    -         filter(unfold().where(eq("v"))).
    -         unfold().dedup().order().by(id).fold()).
    -  dedup()                                                                      <5>
    +g.withComputer().V().connectedComponent().
    +    group().by('gremlin.connectedComponentVertexProgram.component').
    --- End diff --
    
    This comment combined with the one from @FlorianHockmann is troublesome because GLVs don't have `ConnectedComponentVertexProgram` so the Gremlin won't work.  I think that's why I wrote the docs this way. Maybe I need to move `COMPONENT` to `ConnectedComponent` and then allow GLVs to have access. I'll do it that way. 
    
    @dkuppitz do you have a similar issue with this?
    
    https://github.com/apache/tinkerpop/pull/882/files#diff-36e52ac0c49a08a7f3e6ee54b60a3745R73


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    cool - pushed a test


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    Nice.
    
    VOTE: +1


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    @dkuppitz i just pushed a fix for halted traverser stuff - does that look right to you now? or am i still missing something?


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by dkuppitz <gi...@git.apache.org>.
Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    Mucho mejor.
    
    ```
    gremlin> g.withComputer().V().dedup().connectedComponent().valueMap()
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[ripple],lang:[java]]
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[marko],age:[29]]
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[lop],lang:[java]]
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[vadas],age:[27]]
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[josh],age:[32]]
    ==>[gremlin.connectedComponentVertexProgram.component:[1],name:[peter],age:[35]]
    ```
    
    Perhaps there should be a test case for this query, just in case somebody's gonna try to optimize the extra halted traverser code away.


---

[GitHub] tinkerpop pull request #897: TINKERPOP-1967 connectedComponent()

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on a diff in the pull request:

    https://github.com/apache/tinkerpop/pull/897#discussion_r206114783
  
    --- Diff: docs/src/recipes/connected-components.asciidoc ---
    @@ -35,46 +54,92 @@ g.addV().property(id, "A").as("a").
       addE("link").from("d").to("e").iterate()
     ----
     
    -One way to detect the various subgraphs would be to do something like this:
    +==== Small graph traversals
     
    +Connected components in a small graph can be determined with either an OLTP traversal or the OLAP
    +`connectedComponent()`-step. The `connectedComponent()`-step is available as of TinkerPop 3.4.0 and is
    +described in more detail in the
    +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#connectedcomponent-step[Reference Documentation].
    +The traversal looks like:
     [gremlin-groovy,existing]
     ----
    -g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()).  <1>
    -  path().aggregate("p").                                                       <2>
    -  unfold().dedup().                                                            <3>
    -  map(__.as("v").select("p").unfold().                                         <4>
    -         filter(unfold().where(eq("v"))).
    -         unfold().dedup().order().by(id).fold()).
    -  dedup()                                                                      <5>
    +g.withComputer().V().connectedComponent().
    +    group().by('gremlin.connectedComponentVertexProgram.component').
    --- End diff --
    
    Ended up dong this: https://github.com/apache/tinkerpop/commit/627b6e95b24130dea5582903add52c06b7b64a41


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by FlorianHockmann <gi...@git.apache.org>.
Github user FlorianHockmann commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    Shouldn't we also add the `ConnectedComponent` class to the GLVs so users can easily do something like this in other languages:
    ```java
    g.V().hasLabel('person').
      connectedComponent().
        with(ConnectedComponent.propertyName, 'component').
        with(ConnectedComponent.edges, outE('knows'))
    ```
    ?


---

[GitHub] tinkerpop issue #897: TINKERPOP-1967 connectedComponent()

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the issue:

    https://github.com/apache/tinkerpop/pull/897
  
    @FlorianHockmann regarding your point - yes, those tokens should be added to GLVs. I think we should wait for #893 to merge because it started the pattern for adding such tokens. I think that approach can be generalized better and then we can implement it for this PR and for the @dkuppitz at #882. It may even be best to simply merge all three PRs and then implement that improvement once they are all in a single branch (i'd be in favor of that approach).


---

[GitHub] tinkerpop pull request #897: TINKERPOP-1967 connectedComponent()

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tinkerpop/pull/897


---