You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by Jerry Williamson <je...@e2open.com> on 2020/01/16 15:57:51 UTC

Creating an SXSSFWorkbook for an existing workbook

Hi,

I have a system where POI was used to generated an XLSX file with approximately 256 columns x 35,000 rows of data.
We used SXSSFWorkbook to create it in order to maintain a reasonably small memory footprint.
The resulting workbook is 24-28MB depending on whether shared strings are used.

Sometime later, we want to add an additional sheet to the workbook and we, again, want to create an SXSSFWorkbook to do the work.

Unfortunately, the only mechanism available seems to be to create a an XSSFWorkbook first and then pass that into the constructor of the SXSSFWorkbook. The problem is that instantiating the XSSFWorkbook is exhausting a 4GB heap if I bump the heap up to 8GB (in this case), I can get it seems somewhat antithetical to the streaming approach in the first place.

Is there any way to create an SXSSFWorkbook for an existing file without consuming all of that memory?

Thanks


Re: Creating an SXSSFWorkbook for an existing workbook

Posted by Dominik Stadler <do...@gmx.at>.
Hi,

The XSSFWorkbook parameter to SXSSFWorkbook is intended as"template" with
later writing large amounts of data in streaming mode.

If you need to read a large file and produce another large file, you will
need to take a look at some of the event/streaming-based input
functionality in Apache POI, see e.g.
https://github.com/dearshor/poi-examples/blob/master/src/main/java/poi/xssf/eventusermodel/XLSX2CSV.java
for an example of how to parse a sheet from an XLSX-file in a streaming
mode.

The classes OPCPackage and XSSFReader are used to extract the information
from the file, then by implementing the SheetContentsHandler interface you
have callbacks which allow you to process the data and populate a new
SXSSFWorkbook with the values.

Dominik.


On Thu, Jan 16, 2020 at 4:57 PM Jerry Williamson <
jerry.williamson@e2open.com> wrote:

> Hi,
>
> I have a system where POI was used to generated an XLSX file with
> approximately 256 columns x 35,000 rows of data.
> We used SXSSFWorkbook to create it in order to maintain a reasonably small
> memory footprint.
> The resulting workbook is 24-28MB depending on whether shared strings are
> used.
>
> Sometime later, we want to add an additional sheet to the workbook and we,
> again, want to create an SXSSFWorkbook to do the work.
>
> Unfortunately, the only mechanism available seems to be to create a an
> XSSFWorkbook first and then pass that into the constructor of the
> SXSSFWorkbook. The problem is that instantiating the XSSFWorkbook is
> exhausting a 4GB heap if I bump the heap up to 8GB (in this case), I can
> get it seems somewhat antithetical to the streaming approach in the first
> place.
>
> Is there any way to create an SXSSFWorkbook for an existing file without
> consuming all of that memory?
>
> Thanks
>
>