Posted to reviews@kudu.apache.org by "Dan Burkert (Code Review)" <ge...@cloudera.org> on 2017/03/18 00:57:41 UTC

[kudu-CR] Spark ITBLL

Hello Jean-Daniel Cryans,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6419

to review the following change.

Change subject: Spark ITBLL
......................................................................

Spark ITBLL

This adds a new end-to-end test for Spark based on the existing
MapReduce ITBLL test. Common parts of the two have been abstracted into
a utility class.

The implementation is only compatible with Spark2/Scala2.11, so the new
submodule is deactivated when compiling with the Spark1/Scala2.10
profile.

The new job's command line arguments are different (and hopefully
simpler) than the MR ITBLL, but the generated tables are designed to be
interoperable, meaning the MR verify job should work on tables generated
with the Spark generator, and vice versa.

Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
---
M build-support/jenkins/build-and-test.sh
A java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
M java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/IntegrationTestBigLinkedList.java
A java/kudu-spark-tools/pom.xml
A java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark-tools/src/test/resources/log4j.properties
A java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedListTest.scala
M java/kudu-spark/pom.xml
M java/pom.xml
9 files changed, 1,023 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/6419/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6419
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>

[kudu-CR] Spark ITBLL

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6419/2/build-support/jenkins/build-and-test.sh
File build-support/jenkins/build-and-test.sh:

Line 368:     #   - org.apache.kudu.spark.kudu.* - kudu-spark tests
It's unclear to me what these dashes indicate. Can you elaborate?


http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java:

Line 37:  * should be kept here.
If the goal is to keep the bare minimum that needs to be in-sync for ITBLL, I think it'd be a good idea to keep only the things related to creating/referencing the lists via Kudu (e.g. column names, schema builders, etc.) in this file and leave the more implementation-specific things out of it.

I see the potential value in keeping everything here, but a few of the declarations here never see the light of day in the Scala version.


PS2, Line 85: public static final int WIDTH_DEFAULT = 1000000;
            :   public static final int WRAP_DEFAULT = 25;
Documentation for these would be nice.


Line 101:     EXTRAREFERENCES,
nit: the Scala version uses OVERREFERENCED instead. Maybe update the Java for consistency. It's fairly clear either way, but it's a little weird that the Scala version has access to this.


Line 156:   public static <T> void circularLeftShift(T[] first) {
This doesn't seem to be explicitly related to ITBLL, so I'm not sure it belongs here. It isn't used in the Scala ITBLL either.


http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

Line 35:   *   * Currently, only the generator and verifier jobs are implemented.
Also the looper. What's missing?


PS2, Line 48: generate
Is this "generate" needed?


PS2, Line 109: s, if they don't exist
nit: should be singular


PS2, Line 182:  def createBatch(batchSize: Int, rand: Xoroshiro128PlusRandom): Array[(Long, Long)] = {
             :     (0 until batchSize).map(_ => rand.nextLong() -> rand.nextLong()).toArray
             :   }
Deprecated?


PS2, Line 395: ef testMain(arguments: Array[String], sc: SparkContext): Counts = {
             :     run(Args.parse(arguments), sc)
             :   }
I think there should be a note in either testMain or run saying that these don't "verify" so much as count references.
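
For readers unfamiliar with the distinction, here is a minimal, Spark-free sketch of what "counting references" means in an ITBLL-style check. It is not the code from this patch: Node, Counts, ReferenceCounter, and countReferences are hypothetical names, and keys are simplified to a single Long rather than the pair of Longs the generator appears to produce. Each node is assumed to point at one previous node, and each defined key is classified by how many nodes point at it.

```scala
// Hypothetical sketch, not the patch's implementation.
// A node is a key plus a pointer to the previous node's key.
final case class Node(key: Long, prev: Long)

// Same four categories the review discusses.
final case class Counts(referenced: Long, unreferenced: Long,
                        undefined: Long, overreferenced: Long)

object ReferenceCounter {
  def countReferences(nodes: Seq[Node]): Counts = {
    // Keys that actually exist in the table.
    val defined: Set[Long] = nodes.map(_.key).toSet
    // Number of incoming prev-pointers per target key.
    val refs: Map[Long, Int] =
      nodes.groupBy(_.prev).map { case (target, ns) => target -> ns.size }

    // Count defined keys whose incoming-reference count satisfies p.
    def definedWhere(p: Int => Boolean): Long =
      defined.count(k => p(refs.getOrElse(k, 0))).toLong

    Counts(
      referenced = definedWhere(_ == 1),
      unreferenced = definedWhere(_ == 0),
      // Keys that are pointed at but never defined.
      undefined = refs.keys.count(k => !defined.contains(k)).toLong,
      overreferenced = definedWhere(_ > 1)
    )
  }
}
```

Note that this only produces counts. Deciding whether those counts represent a healthy table is a separate step, which is the point of the comment above.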


PS2, Line 407: if (args.nodes.exists(_ != counts.referenced)) {
             :       System.err.println(s"Found ${counts.referenced} referenced nodes, " +
             :                          s"which does not match the expected count of ${args.nodes.get} nodes")
             :       success = false
             :     }
             : 
             :     if (counts.unreferenced > 0) {
             :       System.err.println(s"Found ${counts.unreferenced} unreferenced nodes")
             :       success = false
             :     }
             : 
             :     if (counts.undefined > 0) {
             :       System.err.println(s"Found ${counts.undefined} undefined nodes")
             :       success = false
             :     }
             : 
             :     if (counts.overreferenced > 0) {
             :       System.err.println(s"Found ${counts.overreferenced} over-referenced nodes")
             :       success = false
             :     }
             : 
             :     if (!success) {
             :       System.exit(1)
             :     }
Maybe put this in a configurable @VisibleForTesting validate() function for use here and in ITBLLTest, or have run actually validate with some expected counts as input args.


Line 445:       Verifier.run(verifyArgs, sc)
If you do end up adding a validate(), here would be a good place to use it as well.
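
One possible shape for the validate() suggested above, as a sketch only. The Counts case class is redeclared here so the block is self-contained, and Validation, validate, and the choice to return messages instead of printing and exiting are assumptions, not what the patch ended up doing. The messages mirror the ones in the quoted snippet.

```scala
// Hypothetical sketch of a reusable validation step.
final case class Counts(referenced: Long, unreferenced: Long,
                        undefined: Long, overreferenced: Long)

object Validation {
  /** Returns one message per failed check; an empty result means the counts are valid. */
  def validate(expectedNodes: Option[Long], counts: Counts): Seq[String] = {
    // Only checked when an expected node count was supplied.
    val mismatch: Option[String] =
      expectedNodes.filter(_ != counts.referenced).map { expected =>
        s"Found ${counts.referenced} referenced nodes, " +
          s"which does not match the expected count of $expected nodes"
      }

    // Any non-zero count in these categories is a failure.
    def nonZero(count: Long, label: String): Option[String] =
      if (count > 0) Some(s"Found $count $label nodes") else None

    Seq(
      mismatch,
      nonZero(counts.unreferenced, "unreferenced"),
      nonZero(counts.undefined, "undefined"),
      nonZero(counts.overreferenced, "over-referenced")
    ).flatten
  }
}
```

With this shape, a command-line entry point could print each message to System.err and exit with status 1 when the result is non-empty, while a test or the loop job could assert on the returned messages directly.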



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6419/1/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java:

Line 37:  * should be kept here.
> Add InterfaceAudience
Done


http://gerrit.cloudera.org:8080/#/c/6419/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

Line 42:   */
> Also missing something to make it Loop. It's handy and usually that's how p
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

Line 275:        | Usage: verify [--nodes] [--master-addrs] [--table-name]
Hrm, I think this would read better as "--nodes=<nodes> --master-addrs=<addrs>" etc. As is, they don't look like flags with values.


Line 439:     val genArgs = Generator.Args.parse(args)
Maybe I'm doing something wrong, but when I run:

spark2-submit  --class org.apache.kudu.spark.tools.IntegrationTestBigLinkedList ./kudu-spark-tools-1.4.0-SNAPSHOT.jar loop --help

I don't get any help message. It seems to just ignore the --help and go on its way with the default configuration.
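
A minimal sketch of how a subcommand's argument parser could honor --help before falling back to defaults. This is not the parser in the patch: FlagParser, Parsed, HelpRequested, and Flags are hypothetical names, and the --key=value form is assumed from the usage string discussed in this review.

```scala
// Hypothetical sketch: check for --help before parsing any other flag.
object FlagParser {
  sealed trait Parsed
  case object HelpRequested extends Parsed
  final case class Flags(values: Map[String, String]) extends Parsed

  def parse(args: Seq[String]): Parsed = {
    if (args.contains("--help")) {
      // The caller is expected to print the usage string and exit.
      HelpRequested
    } else {
      val pairs = args.map { arg =>
        // Split "--key=value" into at most two parts at the first '='.
        arg.stripPrefix("--").split("=", 2) match {
          case Array(key, value) if arg.startsWith("--") => key -> value
          case _ =>
            throw new IllegalArgumentException(s"unrecognized argument: $arg")
        }
      }
      Flags(pairs.toMap)
    }
  }
}
```

Checking for --help first means it takes effect regardless of where it appears among the arguments, and rejecting unrecognized arguments avoids silently running with defaults.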


Line 444:       Generator.run(genArgs, sc)
It would be nice if there were some visible progress reports along the way (e.g. log messages with some extra newlines). As is, I just see lots of Spark logs but no real way to tell how far it has gone.



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Jean-Daniel Cryans (Code Review)" <ge...@cloudera.org>.
Jean-Daniel Cryans has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 3: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] Spark ITBLL

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

PS2, Line 54: e counts after running REFERENCED and
            :        |                    UNREFERENCED are ok any UNDEFINED
nit: add commas



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6419/2/build-support/jenkins/build-and-test.sh
File build-support/jenkins/build-and-test.sh:

Line 368:     FAILURES="$FAILURES"$'spark2 build/test failed\n'
> It's unclear to me what these dashes indicate. Can you elaborate?
Ah, this was a poor attempt at markdown-like formatting. I think just adding kudu-spark-tools to the above comment is probably better.


http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java:

Line 37:  * should be kept here.
> If the goal is to keep the bare minimum that needs to be in-sync for ITBLL 
Done


PS2, Line 85: }
            : 
> Documentation for these would be nice.
This has been moved back to its previous location.


Line 101:         new ColumnSchema.ColumnSchemaBuilder(COLUMN_KEY_ONE, Type.INT64).key(true).build(),
> nit: the scala version uses OVERREFERENCED instead. Maybe update the java f
Done


Line 156:       state0 = z ^ (z >>> 31);
> This doesn't seem to be explicitly related to ITBLL, not sure if it belongs
Done


http://gerrit.cloudera.org:8080/#/c/6419/2/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

Line 35:   *   * Currently, only the generator and verifier jobs are implemented.
> Also the looper. What's missing?
Print, Updater, and Walker.


PS2, Line 48: COMMAND 
> Is this "generate" needed?
Done


PS2, Line 54: job if any UNDEFINED, UNREFERENCED, or
            :        |                    EXTRAREFERENCES nodes are found. 
> nit: add commas
Done


PS2, Line 109: e for the new linked l
> nit: should be singular
Done


PS2, Line 182:  def testMain(args: Array[String], sc: SparkContext): Unit = {
             :     run(Args.parse(args), sc)
             :   }
> Deprecated?
Done


Line 275:        | Usage: verify --nodes=<nodes> --master-addrs=<master-addrs> --table-name=<table-name>
> hrm, I think this would read better as "--nodes=<nodes> --master-addrs=<add
Done


Line 439:                                    tableName = genArgs.tableName)
> Maybe I'm doing something wrong, but when I run:
Done


Line 444:       val count = Verifier.run(verifyArgs, sc)
> it would be nice if there were some visible progress reports along the way 
Done


Line 445:       val expected = verifyArgs.nodes.map(_ + nodesPerLoop)
> If you do end up adding a validate(), here would be a good place to use it 
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged.

Change subject: Spark ITBLL
......................................................................


Spark ITBLL

This adds a new end-to-end test for Spark based on the existing
MapReduce ITBLL test. Common parts of the two have been abstracted into
a utility class.

The implementation is only compatible with Spark2/Scala2.11, so the new
submodule is deactivated when compiling with the Spark1/Scala2.10
profile.

The new job's command line arguments are different (and hopefully
simpler) than the MR ITBLL, but the generated tables are designed to be
interoperable, meaning the MR verify job should work on tables generated
with the Spark generator, and vice versa.

Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Reviewed-on: http://gerrit.cloudera.org:8080/6419
Tested-by: Kudu Jenkins
Reviewed-by: Jean-Daniel Cryans <jd...@apache.org>
---
M build-support/jenkins/build-and-test.sh
A java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
M java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/IntegrationTestBigLinkedList.java
A java/kudu-spark-tools/pom.xml
A java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark-tools/src/test/resources/log4j.properties
A java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedListTest.scala
M java/kudu-spark/pom.xml
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/pom.xml
10 files changed, 1,025 insertions(+), 210 deletions(-)

Approvals:
  Jean-Daniel Cryans: Looks good to me, approved
  Kudu Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Spark ITBLL

Posted by "Jean-Daniel Cryans (Code Review)" <ge...@cloudera.org>.
Jean-Daniel Cryans has posted comments on this change.

Change subject: Spark ITBLL
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6419/1/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java:

Line 37: public class BigLinkedListCommon {
Add InterfaceAudience


http://gerrit.cloudera.org:8080/#/c/6419/1/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala:

Line 42:   */
Also missing something to make it Loop. It's handy and usually that's how people run ITBLL.



Gerrit-MessageType: comment
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] Spark ITBLL

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6419

to look at the new patch set (#3).

Change subject: Spark ITBLL
......................................................................

Spark ITBLL

This adds a new end-to-end test for Spark based on the existing
MapReduce ITBLL test. Common parts of the two have been abstracted into
a utility class.

The implementation is only compatible with Spark2/Scala2.11, so the new
submodule is deactivated when compiling with the Spark1/Scala2.10
profile.

The new job's command line arguments are different (and hopefully
simpler) than the MR ITBLL, but the generated tables are designed to be
interoperable, meaning the MR verify job should work on tables generated
with the Spark generator, and vice versa.

Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
---
M build-support/jenkins/build-and-test.sh
A java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
M java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/IntegrationTestBigLinkedList.java
A java/kudu-spark-tools/pom.xml
A java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark-tools/src/test/resources/log4j.properties
A java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedListTest.scala
M java/kudu-spark/pom.xml
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/pom.xml
10 files changed, 1,025 insertions(+), 210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/6419/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] Spark ITBLL

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6419

to look at the new patch set (#2).

Change subject: Spark ITBLL
......................................................................

Spark ITBLL

This adds a new end-to-end test for Spark based on the existing
MapReduce ITBLL test. Common parts of the two have been abstracted into
a utility class.

The implementation is only compatible with Spark2/Scala2.11, so the new
submodule is deactivated when compiling with the Spark1/Scala2.10
profile.

The new job's command line arguments are different (and hopefully
simpler) than the MR ITBLL, but the generated tables are designed to be
interoperable, meaning the MR verify job should work on tables generated
with the Spark generator, and vice versa.

Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
---
M build-support/jenkins/build-and-test.sh
A java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/BigLinkedListCommon.java
M java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/IntegrationTestBigLinkedList.java
A java/kudu-spark-tools/pom.xml
A java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark-tools/src/test/resources/log4j.properties
A java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedListTest.scala
M java/kudu-spark/pom.xml
M java/pom.xml
9 files changed, 1,043 insertions(+), 229 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/6419/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I968fb236f1e93e548db9fd79443912c664e06a1f
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>