Posted to issues@drill.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/02/20 16:47:00 UTC

[jira] [Commented] (DRILL-7514) Update Apache POI to Latest Version

    [ https://issues.apache.org/jira/browse/DRILL-7514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041130#comment-17041130 ] 

ASF GitHub Bot commented on DRILL-7514:
---------------------------------------

cgivre commented on pull request #1991: DRILL-7514: Update Apache POI to Latest Version
URL: https://github.com/apache/drill/pull/1991
 
 
   # [DRILL-7514](https://issues.apache.org/jira/browse/DRILL-7514): Update Apache POI to Latest Version
   
   ## Description
   
   Drill's Excel Format Plugin uses Apache POI to parse Excel files. The reader handles formulae and data types well, but it uses memory inefficiently and struggles to read very large Excel files.  
   The latest version of POI addresses some of these memory issues, so Drill should hopefully be able to query larger Excel files without running out of memory.
   
   This PR updates Drill to use the latest version of Apache POI and also updates the User Agent Parser to a more recent version. 
   
   There was a minor change in POI's behavior with respect to empty sheets, so I had to enclose a line in a `try/catch` block.  
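   
   For readers unfamiliar with this kind of guard, here is a minimal sketch of the pattern. It is not the code from this PR: the class `EmptySheetGuard`, the method `columnCountOrZero`, the `Supplier` stand-in for the sheet lookup, and the choice of `RuntimeException` are all hypothetical, since the description does not say which line was wrapped or which exception POI throws.
   
   ```java
   import java.util.function.Supplier;

   public class EmptySheetGuard {

       // Hypothetical helper: runs a lookup that may fail on an empty sheet
       // and falls back to 0 columns. The exception type is an assumption;
       // the PR description does not name it.
       public static int columnCountOrZero(Supplier<Integer> columnCounter) {
           try {
               return columnCounter.get();
           } catch (RuntimeException e) {
               // Treat the failed lookup as "no columns" instead of failing the query.
               return 0;
           }
       }

       public static void main(String[] args) {
           // Stand-in for a sheet with three columns.
           System.out.println(columnCountOrZero(() -> 3));
           // Stand-in for an empty sheet whose lookup throws.
           System.out.println(columnCountOrZero(() -> {
               throw new IllegalStateException("empty sheet");
           }));
       }
   }
   ```
   
   Catching a narrower exception type would be preferable once the exact failure is known, so that unrelated errors are not silently swallowed.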
   
   ## Documentation
   No user-visible changes.
   
   ## Testing
   All relevant unit tests ran and passed.
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Update Apache POI to Latest Version
> -----------------------------------
>
>                 Key: DRILL-7514
>                 URL: https://issues.apache.org/jira/browse/DRILL-7514
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Minor
>             Fix For: 1.18.0
>
>
> Drill's Excel Format Plugin uses Apache POI to parse Excel files. While this reader is effective in that it parses formulae and data types, it uses memory inefficiently and will struggle to read very large Excel files.  
> The latest version of POI addresses some of the memory issues and hopefully Drill will be able to query larger Excel files without running out of memory.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)