Posted to dev@tinkerpop.apache.org by okram <gi...@git.apache.org> on 2015/12/07 19:32:21 UTC

[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172

    TINKERPOP-1027: Merge view prior to writing graphRDD to output format/rdd

    https://issues.apache.org/jira/browse/TINKERPOP-1027
    
    We had a bug in Spark `graphRDD` writing that showed itself for particular providers. @dalaro realized the problem and provided a solution. This PR implements @dalaro's recommended fix. This fix also removes the need for `reduceByKey()` (though it is backwards compatible if you do still have it) and allows us to always use `GryoSerialization` with Spark. This is rad. I added a few more required serialization registrations to `GryoSerialization` and all the test cases pass. I also added some more test cases to ensure proper functioning.
    
    * Spark integration tests passed.
    * `mvn clean install` passed.
    
    VOTE +1.
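
    A minimal sketch of what "merging the view" means, assuming the view holds the values computed by a vertex program keyed by vertex id. Plain Java maps stand in for the Spark RDDs here, and the names `ViewMergeSketch` and `mergeView` are hypothetical; this is not the code in the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the view merge: plain maps replace Spark RDDs.
final class ViewMergeSketch {

    private ViewMergeSketch() {}

    // Returns a new map in which each vertex's properties are overlaid with
    // the computed values held in the view for the same vertex id.
    // Neither input map is modified.
    static Map<String, Map<String, Object>> mergeView(
            Map<String, Map<String, Object>> vertices,
            Map<String, Map<String, Object>> view) {
        Map<String, Map<String, Object>> merged = new TreeMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : vertices.entrySet()) {
            Map<String, Object> properties = new TreeMap<>(entry.getValue());
            Map<String, Object> computed = view.get(entry.getKey());
            if (computed != null) {
                properties.putAll(computed); // computed values win on key clashes
            }
            merged.put(entry.getKey(), properties);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> vertices = new HashMap<>();
        Map<String, Object> v1 = new HashMap<>();
        v1.put("name", "marko");
        vertices.put("v1", v1);

        Map<String, Map<String, Object>> view = new HashMap<>();
        Map<String, Object> v1View = new HashMap<>();
        v1View.put("pageRank", 0.15);
        view.put("v1", v1View);

        // Merge first, then hand the merged result to whatever writes the graph.
        System.out.println(mergeView(vertices, view));
    }
}
```

    The point of the sketch is the ordering: the merge runs before anything is written, so the writer sees the computed properties attached to the vertices.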


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1027

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/172.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #172
    
----
commit 5c7bc38bdb42ae50243f58a22fc74bc094be6333
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2015-12-04T15:36:08Z

    mapReduceRDD makes use of a post view merge. @dalaro realized this was important prior to graph writing. Thus, moved the view merge to pre-mapreduce and pre-graph output. Added more rigorous property checking to PageRankVertexProgramTest. InputFormatRDD and ToyGraphInputRDD no longer require reduceByKey() initiation because of merged views.

commit e45c293425ed4d9c317b5efbb3a81a9874f7e0e6
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2015-12-04T18:14:50Z

    numerous tweaks trying to get things clean and clear. Added more tests to PersistedInputOutputRDDTest that show good long chain vertex programs with various degrees of Persist and Hadoop OLTP access, etc. Looking good. Still BulkLoaderVertexProgram problem with InputRDD... don't know what the problem is still (unfortunately).

commit 42bcd89d7cd3d297d958ad22919377e94a149b0e
Author: Marko A. Rodriguez <ok...@gmail.com>
Date:   2015-12-07T18:14:29Z

    Merge branch 'TINKERPOP-1025' into TINKERPOP-1027

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by dalaro <gi...@git.apache.org>.
Github user dalaro commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162618167
  
    [My comment on TINKERPOP-1025](https://issues.apache.org/jira/browse/TINKERPOP-1025?focusedCommentId=15045307&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15045307) now applies to this PR. The change involving `prepareFinalGraphRDD` is what matters to me, and when I tested a TINKERPOP-1025 HEAD that includes both of these commits, it solved my problem. So, +1 (non-voting) from me.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by twilmes <gi...@git.apache.org>.
Github user twilmes commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162737584
  
    Awesome, looks good and tests pass for me.
    
    VOTE +1



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162709345
  
    I made it so that SparkGremlin works like Spark JobServer (https://github.com/spark-jobserver/spark-jobserver/). It ensures that RDDs are not garbage collected by maintaining a static `Spark` class that holds a `ConcurrentHashMap` of RDDs. Thus, Spark is like a "file system" in that RDDs can be listed with `ls()`, removed with `rm()`, etc. This was necessary to get a "slow" `mvn clean install` to build correctly, since the RDDs must NOT be GC'd by the Spark context cleaner.
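
    A minimal sketch of that registry idea, assuming a static holder keyed by name. The class `NamedRegistry` and its `put()`, `get()` and `clear()` methods are hypothetical, and plain `Object` values stand in for RDDs; only `ls()` and `rm()` are named in the comment above, and their real signatures may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical static registry: while an entry is in the map it is strongly
// reachable, so the JVM garbage collector cannot reclaim it.
final class NamedRegistry {

    private static final Map<String, Object> ENTRIES = new ConcurrentHashMap<>();

    private NamedRegistry() {}

    // Registers a dataset under a name (ConcurrentHashMap rejects nulls).
    static void put(String name, Object dataset) {
        ENTRIES.put(name, dataset);
    }

    // Returns the dataset registered under the name, or null if absent.
    static Object get(String name) {
        return ENTRIES.get(name);
    }

    // Lists the registered names in sorted order, like a directory listing.
    static List<String> ls() {
        List<String> names = new ArrayList<>(ENTRIES.keySet());
        Collections.sort(names);
        return names;
    }

    // Drops the entry; returns true if something was actually removed.
    static boolean rm(String name) {
        return ENTRIES.remove(name) != null;
    }

    // Removes every entry, e.g. between test runs.
    static void clear() {
        ENTRIES.clear();
    }

    public static void main(String[] args) {
        put("graphA", new Object());
        put("graphB", new Object());
        System.out.println(ls());
        rm("graphA");
        System.out.println(ls());
    }
}
```

    Once `rm()` drops the last strong reference, the entry becomes eligible for garbage collection again, which is what makes the "file system" analogy work.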



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162867159
  
    tests run locally for me:
    
    VOTE +1
    
    btw, your intellij settings might need to get fixed up - they are doing wildcards for static imports:
    
    https://github.com/apache/incubator-tinkerpop/pull/172/files#diff-889af44a6f21b3700e537cc41765435aR40



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-tinkerpop/pull/172



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162616761
  
    NOTE: I originally had this work in TINKERPOP-1025, but that ticket is a completely different beast so I merged the work into a new ticket. TINKERPOP-1025 is still being dealt with.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-1027: Merge view prior...

Posted by ds-jenkins-builds <gi...@git.apache.org>.
Github user ds-jenkins-builds commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/172#issuecomment-162858822
  
    Build finished. No test results found.


