Posted to dev@metamodel.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/10/05 00:04:26 UTC

[jira] [Commented] (METAMODEL-187) ExcelDataContext uses more memory than it needs to for File-based resources.

    [ https://issues.apache.org/jira/browse/METAMODEL-187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14942826#comment-14942826 ] 

ASF GitHub Bot commented on METAMODEL-187:
------------------------------------------

GitHub user kaspersorensen opened a pull request:

    https://github.com/apache/metamodel/pull/56

    Metamodel 187/excel file another attempt

    Here's my attempt at solving METAMODEL-187. Heavily inspired by #49 but to get my head around it I had to start over and track resource usage/closing very closely.
    
    I finally found the culprit (maybe one of several), and it was to close the workbook in ExcelUtils.writeWorkbook(...) before flushing the new contents to the resource.
    
    I also updated POI, since the errors given by the later version made a tiny bit more sense.
    
    I also used an InMemoryResource instead of a temp file resource. I am working on the assumption that the file can easily fit into memory in its persisted form, and that the memory improvement we need applies only to the unpacked form.
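
One possible reading of the ordering described above, sketched with the Java standard library only: serialize the new contents into an in-memory buffer, close the file-backed source, and only then flush the buffer to the real target. This is an assumption about the intent, not the actual ExcelUtils code. The class and method names are hypothetical, the RandomAccessFile merely stands in for a file-backed workbook, and the byte buffer stands in for an InMemoryResource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BufferCloseThenFlush {

    /**
     * Appends a row to a file that is also the source of the data being written.
     * Hypothetical helper: the RandomAccessFile stands in for a file-backed workbook.
     */
    static void appendRow(Path target, String row) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            try (RandomAccessFile source = new RandomAccessFile(target.toFile(), "r")) {
                // 1. Serialize the updated contents into memory while the source is open.
                byte[] existing = new byte[(int) source.length()];
                source.readFully(existing);
                buffer.write(existing);
                buffer.write(row.getBytes(StandardCharsets.UTF_8));
            } // 2. The source handle is closed here, before the file is touched.

            // 3. Only now flush the buffered bytes to the real target.
            Files.write(target, buffer.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the sketch against a temporary file and returns the resulting contents. */
    static String demo() {
        try {
            Path file = Files.createTempFile("workbook-sketch", ".txt");
            try {
                Files.write(file, "row1\n".getBytes(StandardCharsets.UTF_8));
                appendRow(file, "row2\n");
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The point of the ordering is that the target file is never overwritten while something still holds it open for reading. Buffering in memory keeps that possible without a temp file, under the same assumption stated above that the persisted form fits comfortably in memory.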

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kaspersorensen/metamodel METAMODEL-187/excel-file-another-attempt

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metamodel/pull/56.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #56
    
----
commit e6dca2049e9eb37a52b47f6e1cd257427e7dac1a
Author: Kasper Sørensen <i....@gmail.com>
Date:   2015-10-04T21:43:46Z

    METAMODEL-187: First attempt at solving this issue. Now I get a more
    regular JUnit error regarding a MALFORMED file instead of a JVM crash!

commit cd41e173ad5f8861050862e07e69d10bac9be3d6
Author: Kasper Sørensen <i....@gmail.com>
Date:   2015-10-04T21:59:47Z

    METAMODEL-187: Fixed it! :-D

----


> ExcelDataContext uses more memory than it needs to for File-based resources.
> ----------------------------------------------------------------------------
>
>                 Key: METAMODEL-187
>                 URL: https://issues.apache.org/jira/browse/METAMODEL-187
>             Project: Apache MetaModel
>          Issue Type: Bug
>            Reporter: Dennis Du Krøger
>            Priority: Minor
>         Attachments: Memory use File-based.png, Memory use InputStream-based.png
>
>
> ExcelDataContext uses the input stream from resources, even if the resource is a FileResource. This is pretty wasteful memory-wise, both according to http://poi.apache.org/spreadsheet/quick-guide.html#FileInputStream and according to my own tests: I made a naïve change that uses the internal File of FileResources and tried it on a huge file. With InputStream, getting the defaultSchema used around 950 MB on average, while it used around 650 MB on average with the File-based version (nothing scientific, just eyeballed in JVisualVM).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)