You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Grant Henke (Code Review)" <ge...@cloudera.org> on 2018/12/14 15:59:27 UTC

[kudu-CR] [spark] Add a format alias of “kudu”

Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12083


Change subject: [spark] Add a format alias of “kudu”
......................................................................

[spark] Add a format alias of “kudu”

Implements the spark DataSourceRegister on the
kudu-spark DefaultSource to allow the use of
the alias “kudu” when using the Kudu format.

This patch also updates the documentation examples
to use the new aliased syntax given that is the Spark
defacto approach.

Additionally the old package object that was
effectively an alias for the longer format syntax
is marked as deprecated. This syntax doesn’t match
other spark examples and sources and could be
confusing.

Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
---
M docs/developing.adoc
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/StreamingTest.scala
9 files changed, 101 insertions(+), 59 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/12083/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc@a111
PS1, Line 111: 
> So under the hood, is Spark reading any and all META-INF files that are imp
This was only needed for the implicit`.kudu` functions to be available to the user. 

We version the docs with each release, so I think it makes sense do drop the old style. I could bump the versions though.



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 17 Dec 2018 19:52:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................

[spark] Add a format alias of “kudu”

Implements the spark DataSourceRegister on the
kudu-spark DefaultSource to allow the use of
the alias “kudu” when using the Kudu format.

This patch also updates the documentation examples
to use the new aliased syntax given that is the Spark
defacto approach.

Additionally the old package object that was
effectively an alias for the longer format syntax
is marked as deprecated. This syntax doesn’t match
other spark examples and sources and could be
confusing.

Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Reviewed-on: http://gerrit.cloudera.org:8080/12083
Reviewed-by: Mike Percy <mp...@apache.org>
Tested-by: Kudu Jenkins
---
M docs/developing.adoc
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/StreamingTest.scala
9 files changed, 109 insertions(+), 63 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Andrew Wong, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12083

to look at the new patch set (#3).

Change subject: [spark] Add a format alias of “kudu”
......................................................................

[spark] Add a format alias of “kudu”

Implements the spark DataSourceRegister on the
kudu-spark DefaultSource to allow the use of
the alias “kudu” when using the Kudu format.

This patch also updates the documentation examples
to use the new aliased syntax given that is the Spark
defacto approach.

Additionally the old package object that was
effectively an alias for the longer format syntax
is marked as deprecated. This syntax doesn’t match
other spark examples and sources and could be
confusing.

Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
---
M docs/developing.adoc
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/StreamingTest.scala
9 files changed, 109 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/12083/3
-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc@a111
PS1, Line 111: 
> Ah fair. Then perhaps just include a note that versions 1.8 and below have 
Good idea. I will add a note about earlier versions.



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 19 Dec 2018 19:07:46 +0000
Gerrit-HasComments: Yes

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/2/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/2/docs/developing.adoc@104
PS2, Line 104: documenation
> nit: documentation
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 19 Dec 2018 20:32:42 +0000
Gerrit-HasComments: Yes

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 3: Code-Review+2

Nice. LGTM


-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 19 Dec 2018 20:33:24 +0000
Gerrit-HasComments: No

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/2/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/2/docs/developing.adoc@104
PS2, Line 104: documenation
nit: documentation



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Wed, 19 Dec 2018 19:52:54 +0000
Gerrit-HasComments: Yes

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins, Andrew Wong, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12083

to look at the new patch set (#2).

Change subject: [spark] Add a format alias of “kudu”
......................................................................

[spark] Add a format alias of “kudu”

Implements the spark DataSourceRegister on the
kudu-spark DefaultSource to allow the use of
the alias “kudu” when using the Kudu format.

This patch also updates the documentation examples
to use the new aliased syntax given that is the Spark
defacto approach.

Additionally the old package object that was
effectively an alias for the longer format syntax
is marked as deprecated. This syntax doesn’t match
other spark examples and sources and could be
confusing.

Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
---
M docs/developing.adoc
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala
M java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/IntegrationTestBigLinkedList.scala
A java/kudu-spark/src/main/resources/META-INF/services/org.apache.spark.sql.sources.DataSourceRegister
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduContextTest.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/StreamingTest.scala
9 files changed, 109 insertions(+), 63 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/12083/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc@a111
PS1, Line 111: 
So under the hood, is Spark reading any and all META-INF files that are imported in `--packages`? And that's why this isn't necessary anymore?

This doc references older versions of the Spark bindings; I wonder if it's worth keeping an example of the legacy incantation around, even if we are deprecating it. WDYT?



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 17 Dec 2018 18:30:02 +0000
Gerrit-HasComments: Yes

[kudu-CR] [spark] Add a format alias of “kudu”

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12083 )

Change subject: [spark] Add a format alias of “kudu”
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc
File docs/developing.adoc:

http://gerrit.cloudera.org:8080/#/c/12083/1/docs/developing.adoc@a111
PS1, Line 111: 
> This was only needed for the implicit`.kudu` functions to be available to t
Ah fair. Then perhaps just include a note that versions 1.8 and below have slightly different semantics? That way someone using an older version reading this expecting to have an example will know to look elsewhere.



-- 
To view, visit http://gerrit.cloudera.org:8080/12083
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6bb2c7314652456b2f0fa3c4a110ef156503ef92
Gerrit-Change-Number: 12083
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Comment-Date: Tue, 18 Dec 2018 17:06:00 +0000
Gerrit-HasComments: Yes