Posted to reviews@kudu.apache.org by "Hongjiang Zhang (Code Review)" <ge...@cloudera.org> on 2021/09/06 03:26:32 UTC

[kudu-CR] KUDU-1260: Fix prefetching bug on Java scanner

Hello Alexey Serbin, Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17773

to look at the new patch set (#8).

Change subject: KUDU-1260: Fix prefetching bug on Java scanner
......................................................................

KUDU-1260: Fix prefetching bug on Java scanner

Add a unit test for prefetching. The test runs two concurrent threads: a
writer thread and a scanner thread. The writer thread records the
timestamp of each write, and the scanner thread creates two scanners, one
with prefetching and one without. Comparing the scan results of the two
scanners verifies the prefetching result.

When prefetching is enabled, the current implementation lets a
prefetched RowResultIterator overwrite one that has not yet been
consumed, so some data is lost. The fix is simple: use a small queue
to cache the prefetched results.
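
The overwrite problem and the queue-based fix can be illustrated with a
minimal, self-contained sketch. This is not the AsyncKuduScanner code:
the class PrefetchSketch, its method names, and the use of plain strings
in place of RowResultIterator are all hypothetical, and the model assumes
that up to two responses arrive before the consumer reads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the bug; strings stand in for RowResultIterator.
public class PrefetchSketch {

    // Single-slot cache: a prefetched batch overwrites an unconsumed one.
    public static List<String> consumeWithSingleSlot(List<String> batches) {
        List<String> consumed = new ArrayList<>();
        String slot = null;
        for (int i = 0; i < batches.size(); i += 2) {
            slot = batches.get(i);             // current response arrives
            if (i + 1 < batches.size()) {
                slot = batches.get(i + 1);     // prefetched response overwrites it
            }
            consumed.add(slot);                // consumer only sees the latest batch
        }
        return consumed;
    }

    // Small queue: both the current and the prefetched batch are retained.
    public static List<String> consumeWithQueue(List<String> batches) {
        List<String> consumed = new ArrayList<>();
        Queue<String> cache = new ArrayDeque<>();
        for (int i = 0; i < batches.size(); i += 2) {
            cache.add(batches.get(i));         // current response
            if (i + 1 < batches.size()) {
                cache.add(batches.get(i + 1)); // prefetched response is appended
            }
            while (!cache.isEmpty()) {
                consumed.add(cache.poll());    // consumer drains in arrival order
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        List<String> batches = Arrays.asList("b1", "b2", "b3");
        System.out.println(consumeWithSingleSlot(batches)); // prints [b2, b3]
        System.out.println(consumeWithQueue(batches));      // prints [b1, b2, b3]
    }
}
```

In this model the single-slot version drops b1, while the queue version
returns every batch in order.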

Furthermore, at most two consecutive ScanRequests are sent to the
tserver. When the scan reaches the end of the data, only one response
with hasMore=false is returned, so the other ScanRequest fails with a
"scanner not found (it may have expired)" exception. The same issue
occurs for KeepAliveRequest. This error can occasionally cause a Spark
task to fail.

Change-Id: I853a041d86c75ec196d7d4ff45af4673c5c5f5cd
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduScanner.java
A java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduScannerPrefetching.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduReadOptions.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/KuduRDDTest.scala
5 files changed, 441 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/17773/8
-- 
To view, visit http://gerrit.cloudera.org:8080/17773
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I853a041d86c75ec196d7d4ff45af4673c5c5f5cd
Gerrit-Change-Number: 17773
Gerrit-PatchSet: 8
Gerrit-Owner: Hongjiang Zhang <ho...@ebay.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Hongjiang Zhang <ho...@ebay.com>
Gerrit-Reviewer: Kudu Jenkins (120)