Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/11/10 18:17:00 UTC
[jira] [Updated] (ARROW-14665) [Java] JdbcToArrowUtils ResultSet iteration bug
[ https://issues.apache.org/jira/browse/ARROW-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-14665:
-----------------------------------
Labels: pull-request-available (was: )
> [Java] JdbcToArrowUtils ResultSet iteration bug
> -----------------------------------------------
>
> Key: ARROW-14665
> URL: https://issues.apache.org/jira/browse/ARROW-14665
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java
> Affects Versions: 6.0.0
> Reporter: Zac
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When specifying a target batch size, the [iteration logic|https://github.com/apache/arrow/blob/ea42b9e0aa000238fff22fd48f06f3aa516b9f3f/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L266] is currently broken:
> {code:java}
> while (rs.next() && readRowCount < config.getTargetBatchSize()) {
>   compositeConsumer.consume(rs);
>   readRowCount++;
> }
> {code}
> Calling next() on the result set moves the cursor forward to the next row even when we've already reached the target batch size, because the left-hand side of && is evaluated before the row count check.
> For example, consider setting the target batch size to 1 and querying a table that has three rows.
> On the first iteration, we successfully consume the first row. On the next iteration, we move the cursor to row 2, but detect that the read row count is no longer < target batch size and return.
> Upon calling into the method again with the same result set, rs.next() is called again, which moves the cursor to row 3 and consumes it.
> *Problem:* row 2 is skipped!
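The scenario described above can be reproduced without a database. The following is a minimal sketch, not the Arrow code: FakeCursor is a hypothetical stand-in for java.sql.ResultSet that only mimics the advance-and-report behaviour of next(), and BatchReadSketch, readBatch and drain are names invented for illustration. It compares the condition order from the report with one possible fix (checking the row count before calling next()), which is not necessarily the change made in the linked pull request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for java.sql.ResultSet: next() advances the cursor
// and reports whether a row is available, as ResultSet.next() does.
final class FakeCursor {
    private final int rowCount;
    private int position = 0; // 0 means "before the first row"

    FakeCursor(int rowCount) {
        this.rowCount = rowCount;
    }

    boolean next() {
        if (position < rowCount) {
            position++;
            return true;
        }
        return false;
    }

    // 1-based number of the row the cursor is on; valid after next() returned true.
    int currentRow() {
        return position;
    }
}

final class BatchReadSketch {
    // Reads one batch. With countFirst == false the loop uses the condition
    // order from the report; with countFirst == true the batch-size check
    // short-circuits before next() can advance the cursor.
    static List<Integer> readBatch(FakeCursor rs, int targetBatchSize, boolean countFirst) {
        List<Integer> batch = new ArrayList<>();
        int readRowCount = 0;
        if (countFirst) {
            while (readRowCount < targetBatchSize && rs.next()) {
                batch.add(rs.currentRow());
                readRowCount++;
            }
        } else {
            while (rs.next() && readRowCount < targetBatchSize) {
                batch.add(rs.currentRow());
                readRowCount++;
            }
        }
        return batch;
    }

    // Calls readBatch repeatedly on the same cursor until a batch comes back
    // empty, and returns every row number that was consumed.
    static List<Integer> drain(int rows, int targetBatchSize, boolean countFirst) {
        FakeCursor rs = new FakeCursor(rows);
        List<Integer> all = new ArrayList<>();
        while (true) {
            List<Integer> batch = readBatch(rs, targetBatchSize, countFirst);
            if (batch.isEmpty()) {
                break;
            }
            all.addAll(batch);
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println("next() first:  " + drain(3, 1, false));
        System.out.println("count first:   " + drain(3, 1, true));
    }
}
```

With three rows and a target batch size of 1, the first ordering consumes rows 1 and 3 only, while the second consumes rows 1, 2 and 3.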
--
This message was sent by Atlassian Jira
(v8.20.1#820001)