Posted to reviews@kudu.apache.org by "Grant Henke (Code Review)" <ge...@cloudera.org> on 2020/10/30 15:49:21 UTC

[kudu-CR] WIP: KUDU-1563 Support ignore operations in kudu-spark

Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16681 )


Change subject: WIP: KUDU-1563 Support ignore operations in kudu-spark
......................................................................

WIP: KUDU-1563 Support ignore operations in kudu-spark

This patch adds support for the INSERT_IGNORE, UPDATE_IGNORE,
and DELETE_IGNORE operations into the Kudu Spark integration.

TODO: After KUDU-3211 automatically detect if INSERT_IGNORE is
supported and fall back to INSERT with
`kudu.ignoreDuplicateRowErrors = true`.

Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 67 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/16681/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16681
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <gr...@apache.org>

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 6: Code-Review+2


-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Nov 2020 15:03:44 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/16681/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@289
PS3, Line 289:   def testInsertIgnoreRowsWriteOption() {
> nit: maybe add a hint to future readers like, "Identical to the above test,
I did this but swapped it so that the legacy test is second.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Nov 2020 13:57:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@247
PS5, Line 247:       case "insert_ignore" => InsertIgnore
nit: can you make the order of underscore and hyphen versions consistent?


http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@486
PS5, Line 486:         "kudu.ignoreDuplicateRowErrors is deprecated and slow. Use the insert_ignore operation instead.")
should we emit this warning even if it is not supported in the cluster?



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Nov 2020 11:39:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@159
PS4, Line 159: transient lazy val supportsIgnoreOperations
Does the '@transient' attribute mean that every worker calls this since this piece of information isn't serialized?  I'm concerned that the Kudu masters might be bombarded with Ping() requests when it would be enough to check for the presence of the IGNORE operations once.

Maybe it's worth making the supportsIgnoreOperations field serializable to spare the Kudu masters the unneeded load?



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 07 Nov 2020 18:03:15 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/16681/3/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@289
PS3, Line 289:   def testInsertIgnoreRowsWriteOption() {
nit: maybe add a hint to future readers like, "Identical to the above test, but supporting ignore operations, ensuring we functionally support the same semantics. Also uses "insert_ignore" instead of "insert-ignore"" or something? Readers (myself included) may otherwise squint for a bit deciphering whether there are other meaningful differences with the above test.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Nov 2020 07:54:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16681

to look at the new patch set (#6).

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................

KUDU-1563 Support ignore operations in kudu-spark

This patch adds support for the INSERT_IGNORE, UPDATE_IGNORE,
and DELETE_IGNORE operations into the Kudu Spark integration.

It leverages `AsyncKuduClient.supportsIgnoreOperations()` to
handle INSERT_IGNORE operations in a compatible way.

Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 83 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/16681/6
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala@247
PS5, Line 247:       case "insert_ignore" => InsertIgnore
> nit: can you make the order of underscore and hyphen versions consistent?
Done


http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@486
PS5, Line 486:         "kudu.ignoreDuplicateRowErrors is deprecated and slow. Use the insert_ignore operation instead.")
> should we emit this warning even if it is not supported in the cluster?
"ignoreDuplicateRowErrors" is the old style and is always "supported".

If you mean "should we emit this when ignore ops aren't supported?", I think we should, since it tells the user that the old style is being used.
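
A minimal sketch of the behaviour discussed in this comment, under stated assumptions: every name here (`PlannedOp`, `WritePlan`, `WritePlanner.plan`) is hypothetical and is not taken from KuduContext. It only illustrates the two ideas in the thread: an `insert_ignore` request can fall back to a plain insert with `ignoreDuplicateRowErrors` when the cluster lacks ignore operations, and the deprecation warning depends on whether the user set the old option, not on cluster support.

```scala
// Hypothetical stand-ins; not the actual kudu-spark types.
sealed trait PlannedOp
case object PlainInsert extends PlannedOp
case object IgnoringInsert extends PlannedOp

final case class WritePlan(
    op: PlannedOp,
    ignoreDuplicateRowErrors: Boolean,
    warnings: List[String])

object WritePlanner {
  // Warning text as quoted in the review comment above.
  val DeprecationWarning: String =
    "kudu.ignoreDuplicateRowErrors is deprecated and slow. " +
      "Use the insert_ignore operation instead."

  def plan(
      requested: PlannedOp,
      ignoreDuplicateRowErrors: Boolean,
      clusterSupportsIgnore: Boolean): WritePlan = {
    // Warn whenever the user asked for the old-style option,
    // whether or not the cluster supports ignore operations.
    val warnings =
      if (ignoreDuplicateRowErrors) List(DeprecationWarning) else Nil
    requested match {
      // Assumed fallback: emulate insert_ignore with the old option.
      case IgnoringInsert if !clusterSupportsIgnore =>
        WritePlan(PlainInsert, ignoreDuplicateRowErrors = true, warnings)
      case other =>
        WritePlan(other, ignoreDuplicateRowErrors, warnings)
    }
  }
}
```

In this sketch the fallback path itself emits no warning, because the user did not set the deprecated option; whether the real patch behaves that way is not stated in the thread.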



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Nov 2020 14:06:51 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP: KUDU-1563 Support ignore operations in kudu-spark

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: WIP: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/16681/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@266
PS2, Line 266: insert-ignore
There is also `insert_ignore` at line 297.  What's the difference?

Also, is it possible to introduce some sort of constant for the operation types and use those in such cases?



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 2
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 31 Oct 2020 00:55:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

http://gerrit.cloudera.org:8080/#/c/16681/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala@266
PS2, Line 266: ess.getMaster
> There is also `insert_ignore` at line 297.  What's the difference?
There is no difference. I made the hyphen and underscore interchangeable in the Spark properties like Kudu's other flags/configs.

Testing with the string literal, rather than a shared constant, ensures that a later change to a constant can't silently break existing applications.
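
A minimal sketch of what "interchangeable" could look like, assuming a hypothetical `WriteOperation.parse` helper; it is not the DefaultSource code. The quoted diff line (`case "insert_ignore" => InsertIgnore`) suggests the patch lists each spelling as its own case, whereas this sketch swaps in a normalization step (hyphens replaced by underscores before matching) to get the same effect.

```scala
// Hypothetical stand-ins for the connector's operation types.
sealed trait WriteOperation

object WriteOperation {
  case object Insert extends WriteOperation
  case object InsertIgnore extends WriteOperation
  case object UpdateIgnore extends WriteOperation
  case object DeleteIgnore extends WriteOperation

  // Treat "insert-ignore" and "insert_ignore" as the same option value
  // by normalizing hyphens to underscores before matching.
  def parse(name: String): WriteOperation =
    name.replace('-', '_') match {
      case "insert" => Insert
      case "insert_ignore" => InsertIgnore
      case "update_ignore" => UpdateIgnore
      case "delete_ignore" => DeleteIgnore
      case other =>
        throw new IllegalArgumentException(s"Unsupported operation type '$other'")
    }
}
```

A test that passes the literal strings `"insert-ignore"` and `"insert_ignore"` to such a parser exercises exactly what user applications pass in, which is the point made above about not testing through a constant.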



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 06 Nov 2020 05:15:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@159
PS4, Line 159: transient lazy val supportsIgnoreOperations
> This is only called on the driver node before the tasks are created. If it 
Ah, I see.  Thank you for the clarification.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 5
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Nov 2020 06:57:11 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16681

to look at the new patch set (#4).

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................

KUDU-1563 Support ignore operations in kudu-spark

This patch adds support for the INSERT_IGNORE, UPDATE_IGNORE,
and DELETE_IGNORE operations into the Kudu Spark integration.

It leverages `AsyncKuduClient.supportsIgnoreOperations()` to
handle INSERT_IGNORE operations in a compatible way.

Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 83 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/16681/4
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/4/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@159
PS4, Line 159: transient lazy val supportsIgnoreOperations
> Does the '@transient' attribute mean that every worker calls this since thi
This is only called on the driver node before the tasks are created. If it were called inside `rdd.foreachPartition` in the writeRows method, then it would run on each worker.
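
The driver-versus-worker distinction can be illustrated without Spark. The sketch below is an assumption-laden stand-in: `SketchContext`, `DriverSideCheck` and the `probe` function are hypothetical, and a plain `map` over in-memory sequences plays the role of `rdd.foreachPartition`. It shows the flag being read once before the per-partition work is built, so the per-partition closure only captures a Boolean and never triggers the probe.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a context object; not the real KuduContext.
class SketchContext(probe: () => Boolean) extends Serializable {
  // Evaluated at most once per instance, on first access.
  @transient lazy val supportsIgnoreOperations: Boolean = probe()
}

object DriverSideCheck {
  // Counts how often the (pretend) master round trip happens.
  val probeCalls = new AtomicInteger(0)

  def run(partitions: Seq[Seq[Int]]): Seq[String] = {
    val ctx = new SketchContext(() => { probeCalls.incrementAndGet(); true })
    // "Driver" side: read the flag once, before any per-partition work.
    val supported: Boolean = ctx.supportsIgnoreOperations
    // "Worker" side: the closure captures only the Boolean above,
    // so it never calls the probe.
    partitions.map { rows =>
      val op = if (supported) "insert_ignore" else "insert"
      s"$op x ${rows.size}"
    }
  }
}
```

If the closure read `ctx.supportsIgnoreOperations` instead of the local `supported`, each deserialized copy of the context on a worker could be expected to run the probe again, which is the load concern raised in the quoted question.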



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 4
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 07 Nov 2020 20:26:46 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16681

to look at the new patch set (#3).

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................

KUDU-1563 Support ignore operations in kudu-spark

This patch adds support for the INSERT_IGNORE, UPDATE_IGNORE,
and DELETE_IGNORE operations into the Kudu Spark integration.

It leverages `AsyncKuduClient.supportsIgnoreOperations()` to
handle INSERT_IGNORE operations in a compatible way.

Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 79 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/81/16681/3
-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 3
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Attila Bukor (Code Review)" <ge...@cloudera.org>.
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

http://gerrit.cloudera.org:8080/#/c/16681/5/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala@486
PS5, Line 486:         "kudu.ignoreDuplicateRowErrors is deprecated and slow. Use the insert_ignore operation instead.")
> "ignoreDuplicateRowErrors" is the old style and is always "supported".
Yea, meant the latter. Thanks for the clarification, it makes sense.



-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 6
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 09 Nov 2020 14:41:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1563 Support ignore operations in kudu-spark

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16681 )

Change subject: KUDU-1563 Support ignore operations in kudu-spark
......................................................................

KUDU-1563 Support ignore operations in kudu-spark

This patch adds support for the INSERT_IGNORE, UPDATE_IGNORE,
and DELETE_IGNORE operations into the Kudu Spark integration.

It leverages `AsyncKuduClient.supportsIgnoreOperations()` to
handle INSERT_IGNORE operations in a compatible way.

Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Reviewed-on: http://gerrit.cloudera.org:8080/16681
Reviewed-by: Attila Bukor <ab...@apache.org>
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/OperationType.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
4 files changed, 83 insertions(+), 5 deletions(-)

Approvals:
  Attila Bukor: Looks good to me, approved
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

-- 

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If4b4dc0ec996a88afead0f9da0024457e568b0f4
Gerrit-Change-Number: 16681
Gerrit-PatchSet: 7
Gerrit-Owner: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)