Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/12/08 21:12:00 UTC

[jira] [Commented] (DRILL-8070) format-excel assumes that rowIterator returns every row

    [ https://issues.apache.org/jira/browse/DRILL-8070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17455995#comment-17455995 ] 

ASF GitHub Bot commented on DRILL-8070:
---------------------------------------

pjfanning opened a new pull request #2399:
URL: https://github.com/apache/drill/pull/2399


   # [DRILL-8070](https://issues.apache.org/jira/browse/DRILL-8070): format-excel assumes that rowIterator returns every row
   
   ## Description
   
   Replaces the row-skipping loops in ExcelBatchReader with a version that checks each row's row number as it iterates, instead of assuming the iterator returns every row.
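
As a rough illustration of the difference (this is not the code in this PR), the sketch below contrasts a counting-based skip with one that checks the row number. `SimpleRow`, `sheet`, `skipByCounting` and `skipByRowNum` are hypothetical names standing in for POI's `Row` and the reader's loops. The sample sheet is assumed to have populated rows 0, 1, 4 and 5 only, so its iterator never yields rows 2 and 3.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SparseRowSkip {

    /** Hypothetical stand-in for a POI Row: only the 0-based row number matters here. */
    static final class SimpleRow {
        final int rowNum;
        SimpleRow(int rowNum) { this.rowNum = rowNum; }
    }

    /** Builds a sparse "sheet" that contains only the given populated row numbers. */
    static List<SimpleRow> sheet(int... rowNums) {
        List<SimpleRow> rows = new ArrayList<>();
        for (int n : rowNums) {
            rows.add(new SimpleRow(n));
        }
        return rows;
    }

    /** Counting-based skip: discards `target` rows, assuming none are missing. */
    static SimpleRow skipByCounting(Iterator<SimpleRow> rows, int target) {
        for (int i = 0; i < target && rows.hasNext(); i++) {
            rows.next();
        }
        return rows.hasNext() ? rows.next() : null;
    }

    /** Row-number-based skip: returns the first row at or after `target`, or null. */
    static SimpleRow skipByRowNum(Iterator<SimpleRow> rows, int target) {
        while (rows.hasNext()) {
            SimpleRow row = rows.next();
            if (row.rowNum >= target) {
                return row;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int target = 4;
        SimpleRow counted = skipByCounting(sheet(0, 1, 4, 5).iterator(), target);
        SimpleRow checked = skipByRowNum(sheet(0, 1, 4, 5).iterator(), target);
        // Counting consumes all four populated rows and runs off the end of the sheet.
        System.out.println("counting: " + (counted == null ? "none" : counted.rowNum));
        // Checking the row number stops at the row actually numbered 4.
        System.out.println("rownum:   " + (checked == null ? "none" : checked.rowNum));
    }
}
```

With the row-number check, a caller still has to compare the returned row's number with the target, because the target row itself may be one of the rows the iterator omits.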
   
   ## Documentation
   No changes needed
   
   ## Testing
   Unit tests. Advice about how to add a test that changes the ExcelBatchReader.ExcelReaderConfig would be appreciated. There do not appear to be any existing tests that set the ExcelReaderConfig.
   



> format-excel assumes that rowIterator returns every row
> -------------------------------------------------------
>
>                 Key: DRILL-8070
>                 URL: https://issues.apache.org/jira/browse/DRILL-8070
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>            Reporter: PJ Fanning
>            Priority: Major
>
> In ExcelBatchReader, this code makes the wrong assumption:
> {code:java}
>     for (int i = 1; i < rowNumber; i++) {
>         currentRow = rowIterator.next();
>     }
> {code}
>  
> There are two for loops like this.
> The iterator does not necessarily return empty rows: rows with no populated cells can be skipped entirely, so counting calls to rowIterator.next() is not a reliable way to reach a given row number. The Sheet is stored as a sparse matrix and should be treated as one.


