Posted to issues@arrow.apache.org by "Zac (Jira)" <ji...@apache.org> on 2021/11/10 17:56:00 UTC

[jira] [Created] (ARROW-14665) [Java] JdbcToArrowUtils ResultSet iteration bug

Zac created ARROW-14665:
---------------------------

             Summary: [Java] JdbcToArrowUtils ResultSet iteration bug
                 Key: ARROW-14665
                 URL: https://issues.apache.org/jira/browse/ARROW-14665
             Project: Apache Arrow
          Issue Type: Bug
          Components: Java
    Affects Versions: 6.0.0
            Reporter: Zac


When a target batch size is specified, the [iteration logic|https://github.com/apache/arrow/blob/ea42b9e0aa000238fff22fd48f06f3aa516b9f3f/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L266] skips a row each time a batch fills up:


{code:java}
        while (rs.next() && readRowCount < config.getTargetBatchSize()) {
          compositeConsumer.consume(rs);
          readRowCount++;
        }

{code}

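Because {{&&}} evaluates its left operand first, rs.next() is called before the batch size check. The call moves the cursor forward to the next row even when the target batch size has already been reached, and that row is never consumed.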

For example, set the target batch size to 1 and query a table that has three rows.

On the first iteration, we successfully consume row 1. When the loop condition is evaluated again, rs.next() moves the cursor to row 2, but the read row count is no longer < the target batch size, so the loop exits and the method returns without consuming row 2.

When the method is called again with the same result set, rs.next() is called again, which moves the cursor to row 3, and row 3 is consumed.

*Problem:* row 2 is skipped! 
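
The scenario above can be reproduced without a database. The following is a minimal sketch, not the adapter's actual code: FakeCursor is a hypothetical stand-in for java.sql.ResultSet that only mimics next(), and readBatchReordered shows one possible fix (checking the batch size before calling next()), which is an assumption rather than a confirmed patch.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSkipDemo {

  /** Hypothetical stand-in for java.sql.ResultSet: next() advances, then reports whether a row exists. */
  static final class FakeCursor {
    private final int rowCount;
    private int position = 0; // 0 = before the first row, like a fresh ResultSet

    FakeCursor(int rowCount) {
      this.rowCount = rowCount;
    }

    boolean next() {
      if (position <= rowCount) {
        position++;
      }
      return position <= rowCount;
    }

    int currentRow() {
      return position;
    }
  }

  /** Condition order from the report: next() runs before the batch size check. */
  static List<Integer> readBatchBuggy(FakeCursor rs, int targetBatchSize) {
    List<Integer> batch = new ArrayList<>();
    int readRowCount = 0;
    while (rs.next() && readRowCount < targetBatchSize) {
      batch.add(rs.currentRow());
      readRowCount++;
    }
    return batch;
  }

  /** Assumed fix: next() is only called while the batch still has room. */
  static List<Integer> readBatchReordered(FakeCursor rs, int targetBatchSize) {
    List<Integer> batch = new ArrayList<>();
    int readRowCount = 0;
    while (readRowCount < targetBatchSize && rs.next()) {
      batch.add(rs.currentRow());
      readRowCount++;
    }
    return batch;
  }

  /** Reads batches from one cursor until an empty batch comes back; returns every consumed row number. */
  static List<Integer> readAll(boolean buggy, int rows, int targetBatchSize) {
    FakeCursor rs = new FakeCursor(rows);
    List<Integer> all = new ArrayList<>();
    while (true) {
      List<Integer> batch =
          buggy ? readBatchBuggy(rs, targetBatchSize) : readBatchReordered(rs, targetBatchSize);
      if (batch.isEmpty()) {
        break;
      }
      all.addAll(batch);
    }
    return all;
  }

  public static void main(String[] args) {
    System.out.println("buggy:     " + readAll(true, 3, 1));  // [1, 3]
    System.out.println("reordered: " + readAll(false, 3, 1)); // [1, 2, 3]
  }
}
```

With three rows and a target batch size of 1, the original condition order consumes rows 1 and 3 only, while the reordered condition consumes all three.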



--
This message was sent by Atlassian Jira
(v8.20.1#820001)