Posted to issues@arrow.apache.org by "Zac (Jira)" <ji...@apache.org> on 2021/11/10 17:56:00 UTC
[jira] [Created] (ARROW-14665) [Java] JdbcToArrowUtils ResultSet iteration bug
Zac created ARROW-14665:
---------------------------
Summary: [Java] JdbcToArrowUtils ResultSet iteration bug
Key: ARROW-14665
URL: https://issues.apache.org/jira/browse/ARROW-14665
Project: Apache Arrow
Issue Type: Bug
Components: Java
Affects Versions: 6.0.0
Reporter: Zac
When specifying a target batch size, the [iteration logic|https://github.com/apache/arrow/blob/ea42b9e0aa000238fff22fd48f06f3aa516b9f3f/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java#L266] is currently broken:
{code:java}
while (rs.next() && readRowCount < config.getTargetBatchSize()) {
  compositeConsumer.consume(rs);
  readRowCount++;
}
{code}
Because rs.next() is evaluated first in the loop condition, it moves the cursor forward to the next row even when the target batch size has already been reached.
For example, consider setting the target batch size to 1 and querying a table that has three rows.
On the first iteration, we successfully consume the first row. On the next iteration, rs.next() moves the cursor to row 2, but the read row count is no longer < target batch size, so the loop exits and the method returns without consuming row 2.
Upon calling into the method again with the same result set, rs.next() is called again, which moves the cursor to row 3 and consumes it.
*Problem:* row 2 is skipped!
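The walkthrough above can be reproduced without a database. The following is a minimal sketch, not the Arrow code itself: FakeCursor is a hypothetical stand-in for java.sql.ResultSet, and readBatchBuggy, readBatchFixed and drain are illustrative names. readBatchBuggy mirrors the loop condition quoted above. readBatchFixed shows one possible fix (checking the row count first, so that && short-circuits and rs.next() is not called once the batch is full). It is not necessarily the change that should be applied to JdbcToArrowUtils.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchIterationSketch {

    /** Hypothetical stand-in for java.sql.ResultSet: next() advances the cursor. */
    static final class FakeCursor {
        private final List<String> rows;
        private int position = -1; // starts before the first row, like a fresh ResultSet

        FakeCursor(List<String> rows) {
            this.rows = rows;
        }

        boolean next() {
            if (position < rows.size()) {
                position++;
            }
            return position < rows.size();
        }

        String get() {
            return rows.get(position);
        }
    }

    /** Same condition order as the reported loop: next() runs before the count check. */
    static List<String> readBatchBuggy(FakeCursor rs, int targetBatchSize) {
        List<String> batch = new ArrayList<>();
        int readRowCount = 0;
        while (rs.next() && readRowCount < targetBatchSize) {
            batch.add(rs.get());
            readRowCount++;
        }
        return batch;
    }

    /** Assumed fix: check the count first so next() is skipped when the batch is full. */
    static List<String> readBatchFixed(FakeCursor rs, int targetBatchSize) {
        List<String> batch = new ArrayList<>();
        int readRowCount = 0;
        while (readRowCount < targetBatchSize && rs.next()) {
            batch.add(rs.get());
            readRowCount++;
        }
        return batch;
    }

    /** Reads batches of size 1 until an empty batch comes back; returns every row seen. */
    static List<String> drain(FakeCursor rs, boolean fixed) {
        List<String> seen = new ArrayList<>();
        while (true) {
            List<String> batch = fixed ? readBatchFixed(rs, 1) : readBatchBuggy(rs, 1);
            if (batch.isEmpty()) {
                return seen;
            }
            seen.addAll(batch);
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3");
        System.out.println("buggy: " + drain(new FakeCursor(rows), false)); // buggy: [row1, row3]
        System.out.println("fixed: " + drain(new FakeCursor(rows), true));  // fixed: [row1, row2, row3]
    }
}
```

With a target batch size of 1 and three rows, the buggy order returns row1 and row3, matching the report, while the reordered condition returns all three rows.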
--
This message was sent by Atlassian Jira
(v8.20.1#820001)