Posted to dev@tinkerpop.apache.org by graben1437 <gi...@git.apache.org> on 2015/06/09 20:44:05 UTC

[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

GitHub user graben1437 opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74

    TINKERPOP-714 - remove jsr305 from standalone and job jar

    Excludes jsr305 from the packaging and the standalone build because it contains a Creative Commons license.
    In addition, as a static-analysis jar, it should not be needed at runtime.
    
    Here are the results from the mvn clean install (with test cases):
    [INFO] ------------------------------------------------------------------------
    [INFO] Reactor Summary:
    [INFO] 
    [INFO] Apache TinkerPop ................................... SUCCESS [  2.920 s]
    [INFO] Apache TinkerPop :: Gremlin Shaded ................. SUCCESS [  1.032 s]
    [INFO] Apache TinkerPop :: Gremlin Core ................... SUCCESS [ 25.428 s]
    [INFO] Apache TinkerPop :: Gremlin Test ................... SUCCESS [ 10.694 s]
    [INFO] Apache TinkerPop :: Gremlin Groovy ................. SUCCESS [ 23.034 s]
    [INFO] Apache TinkerPop :: Gremlin Groovy Test ............ SUCCESS [  4.931 s]
    [INFO] Apache TinkerPop :: TinkerGraph Gremlin ............ SUCCESS [ 41.566 s]
    [INFO] Apache TinkerPop :: Hadoop Gremlin ................. SUCCESS [02:23 min]
    [INFO] Apache TinkerPop :: Neo4j Gremlin .................. SUCCESS [  2.809 s]
    [INFO] Apache TinkerPop :: Gremlin Driver ................. SUCCESS [  4.425 s]
    [INFO] Apache TinkerPop :: Gremlin Server ................. SUCCESS [  4.417 s]
    [INFO] Apache TinkerPop :: Gremlin Console ................ SUCCESS [ 15.072 s]
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/graben1437/incubator-tinkerpop tinkerpop-714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/74.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #74
    
----
commit 79d3166db111c1048f2beb77dfde74a9007d825a
Author: David Robinson <dr...@gmail.com>
Date:   2015-06-09T18:35:26Z

    TINKERPOP-714 - remove jsr305 from standalone and job jar

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113270752
  
    Hi -- I just use a single machine and typically the grateful-dead dataset. However, for Hadoop and Spark, make sure you have multiple workers so that you test worker communication, as that is where the jar issues usually make themselves apparent. netty, jetty, javax-servlet... those are usually the culprits.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-120118202
  
    Cancelling this one; I will submit a new pull request that moves the logic into the pom, as requested.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-110464218
  
    Can you please pass along the results of the integration tests?  The unit tests don't tell us much about the integrity of the build.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by spmallette <gi...@git.apache.org>.
Github user spmallette commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113230660
  
    it hasn't been merged yet actually - i just closed #72 which was replaced by this one.  i assume we can accept this since the integration tests pass.  
    
    @okram you know the `gremlin-hadoop` better than I - you ok with my merging this?



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 closed the pull request at:

    https://github.com/apache/incubator-tinkerpop/pull/74



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113267484
  
    Thanks for the guidance.  I will get started.  One question on the data set and test cases for the standalone servers: can I re-use the integration test data sets and queries for the standalone environment tests, or should I use something different?



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113236075
  
    If you are messing with Gremlin-Hadoop dependencies then I would not only verify against the integration tests, but also against standalone servers of both Hadoop and SparkServer as that is where the `<dependency>`-exclusion balancing act makes itself apparent. If everything still works (no weird ClassNotFound, MethodNotFound, etc. exceptions) on standard Gremlin OLAP queries, then it should be good to merge.
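
A minimal sketch of how the exception check described above might be automated, assuming the Hadoop or Spark worker logs are plain-text files under a single directory. The function name `scan_logs_for_classpath_errors` and its directory argument are hypothetical, and the three patterns are standard JVM error names that commonly point to a missing or conflicting jar; the list is not exhaustive.

```shell
# Hypothetical helper (not part of TinkerPop): report class-loading errors
# found in any log file under the given directory.
scan_logs_for_classpath_errors() {
    log_dir="$1"
    # -r recurses into subdirectories, -n prints line numbers,
    # -E enables the alternation between the three error names.
    if grep -rnE 'ClassNotFoundException|NoClassDefFoundError|NoSuchMethodError' "$log_dir"; then
        # grep already printed the matching lines; signal failure to the caller.
        return 1
    fi
    echo "no classpath errors found under $log_dir"
    return 0
}
```

The function returns a non-zero status when a match is found, so it can gate a merge checklist after the OLAP queries have been run, for example `scan_logs_for_classpath_errors path/to/worker/logs && echo ok`.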



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by okram <gi...@git.apache.org>.
Github user okram commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-117803926
  
    Looks good. If `GiraphGraphComputer` works, then we can merge. Thanks.



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-117801830
  
    For the SparkGraphComputer testing:
    
    Set up Spark 1.2.1 (prebuilt) in standalone cluster mode with 1 master and 2 workers.  
    Started the master and the workers and verified they were running via command line and Spark UI.
    
    On the same node, I built the latest (as of 7/1) TinkerPop3 with the JSR patch in this pull request in place.  Verified that the jsr305 jar was not found under the distribution:
    /home/..../tinkerpop3/incubator-tinkerpop/hadoop-gremlin
    find . -name 'jsr305*'
    <<nothing returned>>
    Without the fix in place the following are the results of the find command:
    ./hadoop-gremlin/target/hadoop-gremlin-3.0.0-SNAPSHOT-standalone/lib/jsr305-1.3.9.jar
    
    Also grepped the jar files to verify that jsr305 is not packaged:
    grep jsr305 *.jar
    << no output>>
    
    The following is the output when the jsr305 is present:
    ./incubator-tinkerpop/hadoop-gremlin/target
    grep jsr305 *.jar
    Binary file hadoop-gremlin-3.0.0-SNAPSHOT-job.jar matches
    
    At the very end, after testing, I went to the spark-1.2.1/work directory
    and ran the following command to verify that jsr305 was not in the "jar loads"
    being sent to Spark:
    find . -name '*.jar' -exec grep -H jsr305 {} \;
    << returned nothing>>
    
    Next:
    Under gremlin-console/target I unzipped apache-gremlin-console-3.0.0-SNAPSHOT-distribution.zip
    cd apache-gremlin-console-3.0.0-SNAPSHOT
    vi conf/hadoop-gryo.properties 
    In that file, comment out the local master and point to the standalone master:
    #spark.master=local[4]
    spark.master=spark://machine1.xx.xxx.xxx.com:7077
    which is the Spark 1.2.1 master started above.
    
    I also copied ./ext/hadoop-gremlin/lib jar files over the ./lib files to eliminate Spark errors about class serialization.
    
    The following queries were performed to validate the output was correct as well as checking the Spark master and worker logs to make sure no exceptions were thrown:
    
    bin/gremlin.sh
    
             \,,,/
             (o o)
    -----oOOo-(3)-oOOo-----
    plugin activated: tinkerpop.server
    plugin activated: tinkerpop.utilities
    INFO  org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph  - HADOOP_GREMLIN_LIBS is set to: /home/..../tinkerpop3/incubator-tinkerpop/gremlin-console/target/apache-gremlin-console-3.0.0-SNAPSHOT/ext/hadoop-gremlin/lib
    plugin activated: tinkerpop.hadoop
    plugin activated: tinkerpop.tinkergraph
    gremlin> graph = GraphFactory.open('/home/..../tinkerpop3/incubator-tinkerpop/gremlin-console/target/apache-gremlin-console-3.0.0-SNAPSHOT/conf/hadoop/hadoop-gryo.properties')
    ==>hadoopgraph[gryoinputformat->gryooutputformat]
    gremlin> g=graph.traversal(computer(SparkGraphComputer))
    ==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
    gremlin> g.V().count()
    ==>6
    gremlin> g.V().group().by(bothE().count()) 
    ==>[1:[v[6], v[5], v[2]], 3:[v[4], v[1], v[3]]]
    gremlin> g.V().groupCount('a').by(label).cap('a')
    ==>[software:2, person:4]
    gremlin> g.V().range(0,3)
    WARN  org.apache.hadoop.util.NativeCodeLoader  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    WARN  org.apache.hadoop.io.compress.snappy.LoadSnappy  - Snappy native library not loaded
    ==>v[4]
    ==>v[1]
    ==>v[6]
    
    Based on this small sample of queries running against a standalone Spark cluster, it appears that removing the jsr305 jar from the standalone and/or distribution jar does not adversely impact the SparkGraphComputer functionality.
    
    I will test the GiraphGraphComputer next, assuming this all looks correct here.
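
The jar checks in the comment above could be collected into one helper. This is only a sketch: the function name `check_no_jsr305` is hypothetical, and it relies on the same observation as the `grep jsr305 *.jar` check, namely that a jar bundling jsr305 mentions that string somewhere in its bytes.

```shell
# Hypothetical helper (not part of TinkerPop): fail if any file under the given
# directory is named like a jsr305 jar, or if any jar there mentions jsr305.
check_no_jsr305() {
    dir="$1"
    # Files named like jsr305-1.3.9.jar. The pattern is quoted so that the
    # shell passes it to find instead of expanding it itself.
    named=$(find "$dir" -type f -name 'jsr305*') || true
    # Jars whose contents mention jsr305, e.g. a job jar that bundles it.
    # grep -l prints only the names of matching files.
    embedded=$(find "$dir" -type f -name '*.jar' -exec grep -l 'jsr305' {} +) || true
    if [ -n "$named" ] || [ -n "$embedded" ]; then
        # Print every offending path, dropping empty lines.
        printf '%s\n' "$named" "$embedded" | sed '/^$/d'
        return 1
    fi
    echo "jsr305 not found under $dir"
    return 0
}
```

Running it once against the build output and once against the Spark work directory would cover both the packaging check and the check on the jars sent to the workers.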




[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113268772
  
    Will a single-machine standalone environment work for Spark/Hadoop in this case, or do you suggest a small cluster for these tests?



[GitHub] incubator-tinkerpop pull request: TINKERPOP-714 - remove jsr305 fr...

Posted by graben1437 <gi...@git.apache.org>.
Github user graben1437 commented on the pull request:

    https://github.com/apache/incubator-tinkerpop/pull/74#issuecomment-113187671
  
    Hey,
    Looks like this pull request was accepted.  Thanks!  Is this Jira ready to be closed, or what is the process at this point?

