Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/11/10 18:17:00 UTC

[jira] [Updated] (ARROW-14665) [Java] JdbcToArrowUtils ResultSet iteration bug

     [ https://issues.apache.org/jira/browse/ARROW-14665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-14665:
-----------------------------------
    Labels: pull-request-available  (was: )

> [Java] JdbcToArrowUtils ResultSet iteration bug
> -----------------------------------------------
>
>                 Key: ARROW-14665
>                 URL: https://issues.apache.org/jira/browse/ARROW-14665
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 6.0.0
>            Reporter: Zac
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When specifying a target batch size, the [iteration logic|https://github.com/apache/arrow/blob/ea42b9e0aa000238fff22fd48f06f3aa516b9f3f/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L266] is currently broken:
> {code:java}
>         while (rs.next() && readRowCount < config.getTargetBatchSize()) {
>           compositeConsumer.consume(rs);
>           readRowCount++;
>         }
> {code}
> Because rs.next() is evaluated first in the loop condition, it moves the cursor forward to the next row even when we've already reached the target batch size.
> For example, consider setting the target batch size to 1 and querying a table that has three rows.
> On the first iteration, we'll successfully consume the first row. On the next iteration, rs.next() moves the cursor to row 2, but the loop then detects that the read row count is no longer < target batch size and returns without consuming that row.
> Upon calling into the method again with the same result set, rs.next() will be called again, which moves the cursor to row 3 and results in successfully consuming it.
> *Problem:* row 2 is skipped! 
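
The scenario described above can be reproduced with a small, self-contained sketch. FakeCursor is a hypothetical stand-in for java.sql.ResultSet (it only models the cursor-advancing behaviour of next()), and readBatchBuggy, readBatchFixed and drain are illustrative names, not part of the Arrow JDBC adapter. Swapping the two operands of the loop condition is shown as one possible fix, on the assumption that the batch-size check should short-circuit before the cursor moves; it is not necessarily the change made in the linked pull request.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchIterationSketch {

  /** Hypothetical stand-in for java.sql.ResultSet: next() advances the cursor. */
  static final class FakeCursor {
    private final int rowCount;
    private int position = 0; // 0 means "before the first row"

    FakeCursor(int rowCount) {
      this.rowCount = rowCount;
    }

    boolean next() {
      if (position < rowCount) {
        position++;
        return true;
      }
      return false;
    }

    int currentRow() {
      return position;
    }
  }

  /** Same condition order as the snippet quoted in the issue. */
  static List<Integer> readBatchBuggy(FakeCursor rs, int targetBatchSize) {
    List<Integer> batch = new ArrayList<>();
    int readRowCount = 0;
    // next() runs before the size check, so the cursor moves even when the batch is full.
    while (rs.next() && readRowCount < targetBatchSize) {
      batch.add(rs.currentRow());
      readRowCount++;
    }
    return batch;
  }

  /** One possible fix: check the batch size first so next() is short-circuited. */
  static List<Integer> readBatchFixed(FakeCursor rs, int targetBatchSize) {
    List<Integer> batch = new ArrayList<>();
    int readRowCount = 0;
    while (readRowCount < targetBatchSize && rs.next()) {
      batch.add(rs.currentRow());
      readRowCount++;
    }
    return batch;
  }

  /** Reads batches from the same cursor until an empty batch comes back. */
  static List<Integer> drain(int rows, int targetBatchSize, boolean buggy) {
    FakeCursor rs = new FakeCursor(rows);
    List<Integer> consumed = new ArrayList<>();
    while (true) {
      List<Integer> batch = buggy
          ? readBatchBuggy(rs, targetBatchSize)
          : readBatchFixed(rs, targetBatchSize);
      if (batch.isEmpty()) {
        return consumed;
      }
      consumed.addAll(batch);
    }
  }

  public static void main(String[] args) {
    System.out.println(drain(3, 1, true));  // prints [1, 3]
    System.out.println(drain(3, 1, false)); // prints [1, 2, 3]
  }
}
```

With three rows and a target batch size of 1, the buggy ordering consumes rows 1 and 3 only, matching the report, while the reordered condition consumes all three.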



--
This message was sent by Atlassian Jira
(v8.20.1#820001)