Posted to user@poi.apache.org by Yegor Kozlov <ye...@dinom.ru> on 2011/06/09 13:54:16 UTC

SXSSF API

Hi All,

Questions about producing large .xlsx spreadsheets and the BigGridDemo
example are frequent on the POI mailing lists.
While BigGridDemo is good for demonstration purposes, I was never
comfortable recommending it to end users: it requires some knowledge
of SpreadsheetML, and it makes it easy to generate malformed XML.

Alex Geller in https://issues.apache.org/bugzilla/show_bug.cgi?id=51160
provided an excellent contribution of the SXSSF API built on top of
XSSF.

SXSSF (package: org.apache.poi.xssf.streaming) is an API-compatible
streaming extension of XSSF to be used when very large spreadsheets
have to be produced and heap space is limited. SXSSF achieves its low
memory footprint by limiting access to the rows that are within a
sliding window, while XSSF gives access to all rows in the document.
Older rows that are no longer in the window become inaccessible, as
they are written to the disk.
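
The sliding-window idea can be illustrated without POI at all. The
following is a minimal sketch in plain Java, not the SXSSF
implementation: the class RowWindowSketch, its methods, and the
tab-separated output format are all hypothetical names chosen for the
illustration. It keeps at most windowSize rows in memory and writes the
oldest row to a Writer as soon as the window overflows, after which that
row can no longer be read back.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of a sliding row window; NOT the SXSSF code. */
final class RowWindowSketch {
    private final int windowSize;
    private final Writer sink;
    // Rows still held in memory, ordered by row number.
    private final TreeMap<Integer, String> window = new TreeMap<Integer, String>();
    private int flushedCount = 0;

    RowWindowSketch(int windowSize, Writer sink) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
        this.sink = sink;
    }

    /** Adds a row; rows that fall out of the window are written to the sink. */
    void addRow(int rowNum, String content) {
        window.put(rowNum, content);
        while (window.size() > windowSize) {
            flushOldestRow();
        }
    }

    /** Returns the row if it is still inside the window, otherwise null. */
    String getRow(int rowNum) {
        return window.get(rowNum);
    }

    int getFlushedCount() {
        return flushedCount;
    }

    /** Writes out every row that is still held in memory. */
    void flushAll() {
        while (!window.isEmpty()) {
            flushOldestRow();
        }
    }

    private void flushOldestRow() {
        Map.Entry<Integer, String> oldest = window.pollFirstEntry();
        try {
            sink.write(oldest.getKey() + "\t" + oldest.getValue() + "\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flushedCount++;
    }

    public static void main(String[] args) throws IOException {
        File spill = File.createTempFile("rows-", ".tsv");
        spill.deleteOnExit();
        try (Writer out = new FileWriter(spill)) {
            RowWindowSketch rows = new RowWindowSketch(100, out);
            for (int i = 0; i < 1000; i++) {
                rows.addRow(i, "value " + i);
            }
            System.out.println(rows.getRow(0));   // oldest row: already flushed
            System.out.println(rows.getRow(999)); // newest row: still in the window
            rows.flushAll();
        }
    }
}
```

Memory use in this sketch depends on the window size rather than on the
total number of rows, which is the trade-off described above: bounded
heap in exchange for losing access to older rows.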

API-compatible here means that you can use most of the features of the
Spreadsheet Usermodel API without any knowledge of SpreadsheetML! All
you need to do is construct an SXSSFWorkbook instead of an
XSSFWorkbook. The rest is hidden in the implementation.


Finally, this code is documented and users are advised to use SXSSF
instead of BigGridDemo.

Please re-read the updated documentation:

http://poi.apache.org/spreadsheet/index.html
http://poi.apache.org/spreadsheet/how-to.html#sxssf

I also added SXSSF to the list of components in Bugzilla.

Yegor

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: SXSSF API

Posted by Alex Geller <ag...@4js.com>.
This was fixed in revision "1148056, Mon Jul 18 21:23:32 2011 UTC".
Regards,
Alex

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4771113.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by Nick Burch <ni...@alfresco.com>.
On Sun, 4 Sep 2011, sahil.satpute wrote:
> The SXSSFWorkbook implementation (3.8-beta3) is not supporting JDK 1.5.

I'd suggest you try with 3.8 beta 4. I'm pretty sure we fixed this one
soon after beta 3 came out.

Nick



Re: SXSSF API

Posted by "sahil.satpute" <sa...@gmail.com>.
The SXSSFWorkbook implementation (3.8-beta3) does not support JDK 1.5. I get
the following error:

[trace]
java.lang.NoSuchMethodError: java.util.TreeMap.firstEntry()Ljava/util/Map$Entry;
	at org.apache.poi.xssf.streaming.SXSSFSheet.flushOneRow(SXSSFSheet.java:1203)
	at org.apache.poi.xssf.streaming.SXSSFSheet.flushRows(SXSSFSheet.java:1199)
	at org.apache.poi.xssf.streaming.SXSSFSheet.getWorksheetXMLInputStream(SXSSFSheet.java:61)
	at org.apache.poi.xssf.streaming.SXSSFWorkbook.injectData(SXSSFWorkbook.java:108)
	at org.apache.poi.xssf.streaming.SXSSFWorkbook.write(SXSSFWorkbook.java:496)
[/trace]

Are there any plans for backward compatibility for JDK 1.5 in the final
release?
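
For context on the trace: TreeMap.firstEntry() belongs to the
NavigableMap interface, which was added in Java 6, so a JDK 1.5 runtime
cannot resolve it. The sketch below shows one way the same "take the
lowest row" step can be written using only SortedMap methods that
already exist in JDK 1.5. It is an illustration under that assumption,
not the change that was actually made in POI; the class and method names
are hypothetical.

```java
import java.util.TreeMap;

/** Hypothetical helper showing a JDK 1.5-friendly way to take the first entry. */
final class FirstEntryCompat {

    /**
     * Returns the lowest key, or null for an empty map.
     * Uses firstKey(), which TreeMap has offered since long before Java 6.
     */
    static <K, V> K firstKeyOrNull(TreeMap<K, V> map) {
        // firstKey() throws NoSuchElementException on an empty map,
        // so the emptiness check stands in for firstEntry()'s null result.
        return map.isEmpty() ? null : map.firstKey();
    }

    /** Removes the mapping with the lowest key and returns its value, or null if empty. */
    static <K, V> V removeFirst(TreeMap<K, V> map) {
        if (map.isEmpty()) {
            return null;
        }
        K lowest = map.firstKey();
        return map.remove(lowest);
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> rows = new TreeMap<Integer, String>();
        rows.put(7, "row seven");
        rows.put(3, "row three");
        System.out.println(firstKeyOrNull(rows)); // lowest row number
        System.out.println(removeFirst(rows));    // value stored for that row
        System.out.println(rows.size());          // one row left
    }
}
```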

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4769377.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by rwrozelle <rw...@yahoo.com>.
It would be nice if I could pass in the temp file directory location rather
than being forced to use the system default.

Thanks,
Bob
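
To make the request concrete: in plain Java the system default is the
directory named by the java.io.tmpdir system property, and
File.createTempFile(prefix, suffix, directory) accepts an explicit
directory instead. The sketch below shows that standard-library call
only. The class name, method name, and file prefix are hypothetical, and
nothing here is an existing SXSSF option.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch of creating a temp file in a caller-chosen directory. */
final class TempDirSketch {

    /**
     * Creates an empty temp file in the given directory.
     * Passing null falls back to the default named by java.io.tmpdir.
     */
    static File createSheetTempFile(File directory) {
        try {
            File file = File.createTempFile("sheet-", ".xml", directory);
            file.deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Custom location (assumed to be writable for this demo).
        File custom = new File(System.getProperty("java.io.tmpdir"), "tempdir-sketch");
        custom.mkdirs();
        System.out.println(createSheetTempFile(custom).getParent());

        // System default location.
        System.out.println(createSheetTempFile(null).getParent());
    }
}
```

Without such a parameter, the default itself can still be moved for the
whole JVM by starting it with -Djava.io.tmpdir=<directory>.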

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4670140.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by Yegor Kozlov <ye...@dinom.ru>.
Performance of SXSSF should be on par with XSSF. Please try with the
latest build from trunk - there have been improvements since
3.8-beta3. Daily builds can be downloaded from here:
http://encore.torchbox.com/poi-cvs-build/

If you still think that SXSSF is slow, please post sample code that we
can use to detect the bottleneck.

Yegor

On Mon, Jul 18, 2011 at 5:13 PM, marcesher <ma...@gmail.com> wrote:
> I gave the latest nightlies a shot, and performance has indeed much improved!
>
> Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
> 2.5 minutes.
>
> Using the AutoFlush example, I can generate the same dimension file in about
> 4.5 minutes.
>
> Using the second example, but flushing all rows (flush()), I can generate
> the same dimension file in about 3.5 minutes.
>
> So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
> to read.
>
> Thanks for all the work, folks!


Re: SXSSF API

Posted by Yegor Kozlov <ye...@dinom.ru>.
Play with the window size passed to the constructor of SXSSFWorkbook.

new SXSSFWorkbook(1);     // in theory the fastest: each row is flushed immediately
new SXSSFWorkbook(100);   // default size of the row buffer
new SXSSFWorkbook(1000);

Yegor

On Mon, Jul 18, 2011 at 5:13 PM, marcesher <ma...@gmail.com> wrote:
> I gave the latest nightlies a shot, and performance has indeed much improved!
>
> Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
> 2.5 minutes.
>
> Using the AutoFlush example, I can generate the same dimension file in about
> 4.5 minutes.
>
> Using the second example, but flushing all rows (flush()), I can generate
> the same dimension file in about 3.5 minutes.
>
> So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
> to read.
>
> Thanks for all the work, folks!


Re: SXSSF API

Posted by marcesher <ma...@gmail.com>.
I gave the latest nightlies a shot, and performance has indeed much improved! 

Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
2.5 minutes.

Using the AutoFlush example, I can generate the same dimension file in about
4.5 minutes.

Using the second example, but flushing all rows (flush()), I can generate
the same dimension file in about 3.5 minutes. 

So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
to read.

Thanks for all the work, folks!

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4599448.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 18 Jul 2011, marcesher wrote:
> FWIW, this is using 3.8 beta3, not the latest nightly.
>
> Is it expected that the streaming API will be this much slower than the
> BigGridDemo approach?

The streaming approach will likely be a little bit slower. Instead of just
writing the values out as they go, it needs to track them for a while and
flush them out when the window passes beyond them. It shouldn't be that
much of an overhead, though.

Any chance you could re-try with a recent nightly build / a build from a 
svn checkout of trunk, and see if that's any better? And if not, could you 
run a profiler and see where the time's going?

Thanks
Nick



Re: SXSSF API

Posted by marcesher <ma...@gmail.com>.
The new API is certainly much nicer to work with. However, in my tests, it's
also orders of magnitude slower.  Using the BigGridDemo approach, I can
generate 300K rows, with 125 columns, in about 2 minutes.

With the streaming API, attempting just 10K rows with 100 columns takes
longer than the 300k / 125 col BigGridDemo code

FWIW, this is using 3.8 beta3, not the latest nightly.

Is it expected that the streaming API will be this much slower than the
BigGridDemo approach?

Thanks!





--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4599160.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by Todd Feinstein <tf...@gmail.com>.
Excellent!  I'll get it now.  Thanks for the quick response.

Todd

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4493137.html
Sent from the POI - User mailing list archive at Nabble.com.



Re: SXSSF API

Posted by Dave Fisher <da...@comcast.net>.
The documentation on the website always reflects trunk. I know this causes some confusion. The distributions include built documentation for that release.

This feature is a brand new contribution. Please try with a nightly build - http://encore.torchbox.com/poi-cvs-build/ or the svn trunk - http://poi.apache.org/subversion.html

Regards,
Dave

On Jun 15, 2011, at 5:15 PM, Todd Feinstein wrote:

> I have just downloaded the latest 3.8 files.  Unfortunately the constructor
> referred to in the documentation doesn't exist yet.  I'm hoping a new
> release will contain the functionality you are talking about.  It will be
> extremely useful.
> 
> Cheers,
> Todd


Re: SXSSF API

Posted by Todd Feinstein <tf...@gmail.com>.
I have just downloaded the latest 3.8 files.  Unfortunately the constructor
referred to in the documentation doesn't exist yet.  I'm hoping a new
release will contain the functionality you are talking about.  It will be
extremely useful.

Cheers,
Todd

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4493097.html
Sent from the POI - User mailing list archive at Nabble.com.