Posted to reviews@kudu.apache.org by "Andrew Manlove (Code Review)" <ge...@cloudera.org> on 2016/11/21 21:15:22 UTC

[kudu-CR] Implement fix for KUDU-1493

Andrew Manlove has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5167

Change subject: Implement fix for KUDU-1493
......................................................................

Implement fix for KUDU-1493

Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 78 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 3:

Dan/Chris, can you guys take a look at this when you get a chance? I don't feel qualified to review the Spark stuff in good detail.

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Chris George (Code Review)" <ge...@cloudera.org>.
Chris George has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 3:

(1 comment)

Overall I think it looks good. We should update the @param docs on KuduRelation.
It would also be good to update the examples in the docs.

http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

Line 124:   * @param sqlContext SparkSQL context
update comments


Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


KUDU-1493: Implement SchemaRelationProvider

Implement SchemaRelationProvider in org.apache.kudu.spark.kudu.DefaultSource to
allow specifying a user schema on read, and thus allow DataFrames with different
column orderings.

Change-Id: I8b6073256b61a174f898be222058277be976273c
Reviewed-on: http://gerrit.cloudera.org:8080/5167
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <da...@apache.org>
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 85 insertions(+), 28 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



Gerrit-MessageType: merged
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 5: Code-Review+2

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] Implement fix for KUDU-1493

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: Implement fix for KUDU-1493
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5167/1//COMMIT_MSG
Commit Message:

Line 7: Implement fix for KUDU-1493
nit: please use the format:

---
KUDU-1493: <short summary>

<long summary of problem and approach taken in the patch>
---

so that when we look at a git log it's easier to understand what the commit did without having to go read the JIRA.


http://gerrit.cloudera.org:8080/#/c/5167/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

Line 102:   override def createRelation(sqlContext: SQLContext, parameters: Map[String, String], schema: StructType): BaseRelation = {
nit: do you mind wrapping these lines to 100 columns max?


PS1, Line 102:  override def createRelation(sqlContext: SQLContext, parameters: Map[String, String], schema: StructType): BaseRelation = {
(a few other places too)


Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5167/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

PS5, Line 98:     new KuduRelation(tableName, kuduMaster, operationType, Some(schema))(sqlContext)
Should we add a check to see whether the schema is valid (e.g. that its columns are unique and do exist in the original table)?
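
A minimal sketch of the kind of check raised in this comment, assuming the user schema and the table schema have both been reduced to plain column names. `UserSchemaCheck` and `validate` are hypothetical names for illustration and are not part of kudu-spark; the merged patch may handle this differently or not at all.

```scala
// Hypothetical helper, not part of kudu-spark: checks a user-supplied list of
// column names against the set of columns in the underlying table.
object UserSchemaCheck {
  /** Returns a list of problems; an empty list means no problem was found. */
  def validate(userColumns: Seq[String], tableColumns: Set[String]): Seq[String] = {
    // Column names that appear more than once in the user schema.
    val duplicates = userColumns.groupBy(identity).collect {
      case (name, occurrences) if occurrences.size > 1 => name
    }.toSeq.sorted

    // Column names that the table does not have.
    val unknown = userColumns.distinct.filterNot(tableColumns.contains).sorted

    duplicates.map(c => s"duplicate column: $c") ++
      unknown.map(c => s"unknown column: $c")
  }
}
```

A caller could run such a check before constructing the relation and fail fast with the collected messages, rather than surfacing an error later during the scan.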


Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

PS3, Line 70:       case "update" 
> Could you move this into OperationType.scala and make it package-private (l
Made it private. It makes sense to leave it here.


Line 124: class KuduRelation(private val tableName: String,
> update comments
Done


PS3, Line 151: userSchema match {
             :       case Some(x) =>
> This is dead code now
Done


http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

PS3, Line 25: 
> Please preserve this empty line.
Done


PS3, Line 31: import org.apache.spark.sql.types.{DataTypes, StructField
> Also the empty line here. We order the imports as java -> scala -> other im
Done


PS3, Line 33: import org.junit.runner.RunWith
> This belongs in the block of other imports, sorted by import name.
Done


PS3, Line 377: 
> Are you expecting the columns to be re-ordered in the rows? If so, could yo
Done


Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Andrew Manlove (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5167

to look at the new patch set (#3).

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................

KUDU-1493: Implement SchemaRelationProvider

Implement SchemaRelationProvider in org.apache.kudu.spark.kudu.DefaultSource to
allow specifying a user schema on read, and thus allow DataFrames with different
column orderings.

Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 79 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5167

to look at the new patch set (#5).

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................

KUDU-1493: Implement SchemaRelationProvider

Implement SchemaRelationProvider in org.apache.kudu.spark.kudu.DefaultSource to
allow specifying a user schema on read, and thus allow DataFrames with different
column orderings.

Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 85 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/5

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

PS3, Line 70: def getOperationType
Could you move this into OperationType.scala and make it package-private (like the rest of OperationType)?


PS3, Line 151: val fields: Array[StructField] =
             :       table.getSchema.getColumns.asScala.map { kuduColumnToSparkField }.toArray
This is dead code now


http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

PS3, Line 25: 
Please preserve this empty line.


PS3, Line 31: import org.scalatest.{BeforeAndAfter, FunSuite, Matchers}
Also the empty line here. We order the imports as java -> scala -> other imports -> kudu, with an empty line between each group.


PS3, Line 33: import org.apache.spark.sql.types.{DataTypes, StructField, StructType};
This belongs in the block of other imports, sorted by import name.


PS3, Line 377: dfWithUserSchema.limit(10).collect()
Are you expecting the columns to be re-ordered in the rows? If so, could you verify that here in the test?
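
A rough sketch of what such a verification amounts to, using plain Scala collections as a stand-in for rows rather than the Spark Row API. `ColumnOrderCheck`, `project`, and the column names in the usage note are hypothetical and only illustrate the idea that values should come back in the order the user schema requests.

```scala
// Hypothetical stand-in for a projected row; this is not the Spark Row API.
object ColumnOrderCheck {
  /** Re-orders the values of one row from table order into the requested order. */
  def project(tableColumns: Seq[String], row: Seq[Any], requested: Seq[String]): Seq[Any] = {
    // Map each table column name to its position in the stored row.
    val indexOf = tableColumns.zipWithIndex.toMap
    // Pick the values out in the order the user schema asks for.
    requested.map(name => row(indexOf(name)))
  }
}
```

For instance, with table columns `key`, `c1_i`, `c2_s` and a requested order of `c2_s`, `key`, a test would assert that the first value of each collected row is the `c2_s` value and the second is the `key` value.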


Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5167

to look at the new patch set (#4).

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................

KUDU-1493: Implement SchemaRelationProvider

Implement SchemaRelationProvider in org.apache.kudu.spark.kudu.DefaultSource to
allow specifying a user schema on read, and thus allow DataFrames with different
column orderings.

Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 86 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider

Posted by "Chris George (Code Review)" <ge...@cloudera.org>.
Chris George has posted comments on this change.

Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................


Patch Set 5:

It would be good to get some examples in the documentation about using this.

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource

Posted by "Andrew Manlove (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5167

to look at the new patch set (#2).

Change subject: KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource
......................................................................

KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource

Implement SchemaRelationProvider to allow for specifying a user schema on read and thus allow for DataFrames of different column orderings

Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 78 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>