Posted to reviews@kudu.apache.org by "Andrew Manlove (Code Review)" <ge...@cloudera.org> on 2016/11/21 21:15:22 UTC
[kudu-CR] Implement fix for KUDU-1493
Andrew Manlove has uploaded a new change for review.
http://gerrit.cloudera.org:8080/5167
Change subject: Implement fix for KUDU-1493
......................................................................
Implement fix for KUDU-1493
Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 78 insertions(+), 21 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/1
--
To view, visit http://gerrit.cloudera.org:8080/5167
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 3:
Dan/Chris, can you guys take a look at this when you get a chance? I don't feel qualified to review the Spark stuff in good detail.
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Chris George (Code Review)" <ge...@cloudera.org>.
Chris George has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 3:
(1 comment)
Overall I think it looks good. We should update the @param on KuduRelation.
It would also be good to update the examples in the docs.
http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:
Line 124: * @param sqlContext SparkSQL context
update comments
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
KUDU-1493: Implement SchemaRelationProvider
Implement SchemaRelationProvider org.apache.kudu.spark.kudu.DefaultSource to allow
for specifying a user schema on read and thus allow for DataFrames of different
column orderings
Change-Id: I8b6073256b61a174f898be222058277be976273c
Reviewed-on: http://gerrit.cloudera.org:8080/5167
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <da...@apache.org>
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 85 insertions(+), 28 deletions(-)
Approvals:
Dan Burkert: Looks good to me, approved
Kudu Jenkins: Verified
--
Gerrit-MessageType: merged
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 5: Code-Review+2
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No
[kudu-CR] Implement fix for KUDU-1493
Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.
Change subject: Implement fix for KUDU-1493
......................................................................
Patch Set 1:
(3 comments)
http://gerrit.cloudera.org:8080/#/c/5167/1//COMMIT_MSG
Commit Message:
Line 7: Implement fix for KUDU-1493
nit: please use the format:
---
KUDU-1493: <short summary>
<long summary of problem and approach taken in the patch>
---
so that when we look at a git log it's easier to understand what the commit did without having to go read the JIRA.
http://gerrit.cloudera.org:8080/#/c/5167/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:
Line 102: override def createRelation(sqlContext: SQLContext, parameters: Map[String, String], schema: StructType): BaseRelation = {
nit: do you mind wrapping these lines to 100 columns max?
PS1, Line 102: override def createRelation(sqlContext: SQLContext, parameters: Map[String, String], schema: StructType): BaseRelation = {
(a few other places too)
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 5:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/5167/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:
PS5, Line 98: new KuduRelation(tableName, kuduMaster, operationType, Some(schema))(sqlContext)
Should we add a check to see whether the schema is valid (e.g. that its columns are unique and do exist in the original table)?
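A minimal sketch of the kind of validity check raised in this comment, written in plain Scala with no Spark or Kudu dependency. The object name `UserSchemaCheck`, the helper `validateUserColumns`, and the use of bare column-name strings are all assumptions made for illustration; the patch itself works with Spark's StructType and the Kudu table schema, and nothing below is taken from it.

```scala
// Hypothetical helper (not part of the patch): checks a user-supplied column
// list against the columns of the underlying table.
object UserSchemaCheck {

  /** Returns Right(userColumns) when valid, or Left(problems) otherwise. */
  def validateUserColumns(
      tableColumns: Seq[String],
      userColumns: Seq[String]): Either[Seq[String], Seq[String]] = {
    // Columns that were requested more than once.
    val duplicates = userColumns.diff(userColumns.distinct).distinct

    // Requested columns that the table does not have.
    val known = tableColumns.toSet
    val missing = userColumns.distinct.filterNot(known.contains)

    val problems =
      duplicates.map(c => s"duplicate column: $c") ++
        missing.map(c => s"unknown column: $c")

    if (problems.isEmpty) Right(userColumns) else Left(problems)
  }
}
```

Returning an Either keeps the decision about how to fail (throw, log, or fall back to the table schema) with the caller, which is one possible design and not necessarily the one the project would choose.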
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 4:
(7 comments)
http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:
PS3, Line 70: case "update"
> Could you move this into OperationType.scala and make it package-private (l
Made it private. It makes sense to leave it here.
Line 124: class KuduRelation(private val tableName: String,
> update comments
Done
PS3, Line 151: userSchema match {
: case Some(x) =>
> This is dead code now
Done
http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:
PS3, Line 25:
> Please preserve this empty line.
Done
PS3, Line 31: import org.apache.spark.sql.types.{DataTypes, StructField
> Also the empty line here. We order the imports as java -> scala -> other im
Done
PS3, Line 33: import org.junit.runner.RunWith
> This belongs in the block of other imports, sorted by import name.
Done
PS3, Line 377:
> Are you expecting the columns to be re-ordered in the rows? If so, could yo
Done
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Andrew Manlove (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/5167
to look at the new patch set (#3).
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
KUDU-1493: Implement SchemaRelationProvider
Implement SchemaRelationProvider org.apache.kudu.spark.kudu.DefaultSource to allow
for specifying a user schema on read and thus allow for DataFrames of different
column orderings
Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 79 insertions(+), 21 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/3
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/5167
to look at the new patch set (#5).
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
KUDU-1493: Implement SchemaRelationProvider
Implement SchemaRelationProvider org.apache.kudu.spark.kudu.DefaultSource to allow
for specifying a user schema on read and thus allow for DataFrames of different
column orderings
Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 85 insertions(+), 28 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/5
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 3:
(6 comments)
http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:
PS3, Line 70: def getOperationType
Could you move this into OperationType.scala and make it package-private (like the rest of OperationType)?
PS3, Line 151: val fields: Array[StructField] =
: table.getSchema.getColumns.asScala.map { kuduColumnToSparkField }.toArray
This is dead code now
http://gerrit.cloudera.org:8080/#/c/5167/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:
PS3, Line 25:
Please preserve this empty line.
PS3, Line 31: import org.scalatest.{BeforeAndAfter, FunSuite, Matchers}
Also the empty line here. We order the imports as java -> scala -> other imports -> kudu, with an empty line in between
PS3, Line 33: import org.apache.spark.sql.types.{DataTypes, StructField, StructType};
This belongs in the block of other imports, sorted by import name.
PS3, Line 377: dfWithUserSchema.limit(10).collect()
Are you expecting the columns to be re-ordered in the rows? If so, could you verify that here in the test?
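As a rough illustration of the last comment, the sketch below shows what "verifying the re-ordering" amounts to, using plain Scala collections as a stand-in for a DataFrame. The names `ColumnOrderCheck`, `TableRow`, and `project`, and the column names and values, are hypothetical and do not come from the patch or from DefaultSourceTest.scala.

```scala
// Hypothetical stand-in for the behaviour under test (not the patch's code).
object ColumnOrderCheck {

  // A row as stored in the table: values listed in the table's own column order.
  final case class TableRow(columns: Seq[String], values: Seq[Any])

  // Lay a row's values out in the order requested by the user schema.
  // Throws NoSuchElementException if a requested column is not in the row.
  def project(row: TableRow, requestedOrder: Seq[String]): Seq[Any] = {
    val byName = row.columns.zip(row.values).toMap
    requestedOrder.map(byName)
  }

  def main(args: Array[String]): Unit = {
    val row = TableRow(Seq("key", "count", "name"), Seq(1, 10, "ten"))
    val requested = Seq("name", "key")

    // The kind of assertion the review asks for: values follow the requested
    // order, not the table's order.
    assert(project(row, requested) == Seq("ten", 1))
  }
}
```

In the real test the same idea would presumably be expressed against the collected rows and the DataFrame's schema rather than against these stand-in types.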
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/5167
to look at the new patch set (#4).
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
KUDU-1493: Implement SchemaRelationProvider
Implement SchemaRelationProvider org.apache.kudu.spark.kudu.DefaultSource to allow
for specifying a user schema on read and thus allow for DataFrames of different
column orderings
Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 86 insertions(+), 26 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/4
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider
Posted by "Chris George (Code Review)" <ge...@cloudera.org>.
Chris George has posted comments on this change.
Change subject: KUDU-1493: Implement SchemaRelationProvider
......................................................................
Patch Set 5:
It would be good to get some examples in the documentation about using this.
--
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No
[kudu-CR] KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource
Posted by "Andrew Manlove (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/5167
to look at the new patch set (#2).
Change subject: KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource
......................................................................
KUDU-1493: Implement SchemaRelationProvider for spark kudu DefaultSource
Implement SchemaRelationProvider to allow for specifying a user schema on read and thus allow for DataFrames of different column orderings
Change-Id: I8b6073256b61a174f898be222058277be976273c
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
2 files changed, 78 insertions(+), 21 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/5167/2
--
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b6073256b61a174f898be222058277be976273c
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Manlove <aj...@gmail.com>
Gerrit-Reviewer: Chris George <ch...@rms.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>