You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by Oliver Kohll - Mailing Lists <ol...@gtwm.co.uk> on 2012/08/10 11:13:55 UTC

Memory consumption

Hi,

I have a servlet which generates Excel files using POI and serves them as downloads. I've hit a heap space issue generating large(ish) files.

A file that ends up as 20MB, 300,000 rows causes a memory spike of about 1.3GB. A smaller 1MB file with a few thousand rows causes a 30 or 40 MB jump.

I've done everything I can think of so far:

I use the streaming excel generation class SXSSFWorkbook which I understand is low memory, buffering to disk.

There are two methods, one to generate the Excel and one to serve it. I save a temporary file to disk rather than transfer data directly in memory between the methods

The second method to serve the content uses buffered input and output streams.

Anything else you can think of? The code is here:

https://github.com/okohll/agileBase/blob/master/gtpb_server/src/com/gtwm/pb/servlets/ReportDownloader.java

getSessionReportAsExcel generates the spreadsheet and

serveSpreadsheet serves it.


Oliver Kohll
www.agilebase.co.uk


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Memory consumption

Posted by Oliver Kohll - Mailing Lists <ol...@gtwm.co.uk>.
I think you're probably right, it does load the whole dataset into memory from the database in one go. There's no quick answer to that, it'll take a bit of redesign, I've experimented with a couple of ideas but none have borne much fruit so far.

However, I have knocked 0.3GB off using statement.setFetchSize(100); I didn't realise you could stream data like that. Thanks.

Oliver

On 10 Aug 2012, at 13:36, Yegor Kozlov <ye...@dinom.ru> wrote:

> You are reading data from a database, can it be the bottleneck?
> SXSSF is a low-memory footprint API and I doubt it cause spikes up to 1.3GB.
> 
> Can your application cache objects ? Also, check settings of the JDBC
> driver and make sure it streams data and does not accumulate it in
> memory.
> 
> Yegor
> 
> On Fri, Aug 10, 2012 at 1:13 PM, Oliver Kohll - Mailing Lists
> <ol...@gtwm.co.uk> wrote:
>> Hi,
>> 
>> I have a servlet which generates Excel files using POI and serves them as downloads. I've hit a heap space issue generating large(ish) files.
>> 
>> A file that ends up as 20MB, 300,000 rows causes a memory spike of about 1.3GB. A smaller 1MB file with a few thousand rows causes a 30 or 40 MB jump.
>> 
>> I've done everything I can think of so far:
>> 
>> I use the streaming excel generation class SXSSFWorkbook which I understand is low memory, buffering to disk.
>> 
>> There are two methods, one to generate the Excel and one to serve it. I save a temporary file to disk rather than transfer data directly in memory between the methods
>> 
>> The second method to serve the content uses buffered input and output streams.
>> 
>> Anything else you can think of? The code is here:
>> 
>> https://github.com/okohll/agileBase/blob/master/gtpb_server/src/com/gtwm/pb/servlets/ReportDownloader.java
>> 
>> getSessionReportAsExcel generates the spreadsheet and
>> 
>> serveSpreadsheet serves it.
>> 
>> 
>> Oliver Kohll
>> www.agilebase.co.uk
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: Memory consumption

Posted by Yegor Kozlov <ye...@dinom.ru>.
You are reading data from a database, can it be the bottleneck?
SXSSF is a low-memory footprint API and I doubt it cause spikes up to 1.3GB.

Can your application cache objects ? Also, check settings of the JDBC
driver and make sure it streams data and does not accumulate it in
memory.

Yegor

On Fri, Aug 10, 2012 at 1:13 PM, Oliver Kohll - Mailing Lists
<ol...@gtwm.co.uk> wrote:
> Hi,
>
> I have a servlet which generates Excel files using POI and serves them as downloads. I've hit a heap space issue generating large(ish) files.
>
> A file that ends up as 20MB, 300,000 rows causes a memory spike of about 1.3GB. A smaller 1MB file with a few thousand rows causes a 30 or 40 MB jump.
>
> I've done everything I can think of so far:
>
> I use the streaming excel generation class SXSSFWorkbook which I understand is low memory, buffering to disk.
>
> There are two methods, one to generate the Excel and one to serve it. I save a temporary file to disk rather than transfer data directly in memory between the methods
>
> The second method to serve the content uses buffered input and output streams.
>
> Anything else you can think of? The code is here:
>
> https://github.com/okohll/agileBase/blob/master/gtpb_server/src/com/gtwm/pb/servlets/ReportDownloader.java
>
> getSessionReportAsExcel generates the spreadsheet and
>
> serveSpreadsheet serves it.
>
>
> Oliver Kohll
> www.agilebase.co.uk
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org