Posted to user@poi.apache.org by marcesher <ma...@gmail.com> on 2011/07/18 13:52:10 UTC

Re: SXSSF API

The new API is certainly much nicer to work with. However, in my tests, it's
also orders of magnitude slower.  Using the BigGridDemo approach, I can
generate 300K rows, with 125 columns, in about 2 minutes.

With the streaming API, attempting just 10K rows with 100 columns takes
longer than the 300K-row / 125-column BigGridDemo run.

FWIW, this is using 3.8 beta3, not the latest nightly.

Is it expected that the streaming API will be this much slower than the
BigGridDemo approach?

Thanks!





--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4599160.html
Sent from the POI - User mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: SXSSF API

Posted by Yegor Kozlov <ye...@dinom.ru>.
Performance of SXSSF should be on par with XSSF. Please try with the
latest build from trunk - there have been improvements since
3.8-beta3. Daily builds can be downloaded from here:
http://encore.torchbox.com/poi-cvs-build/

If you still think that SXSSF is slow, please post sample code that we
can use to detect the bottleneck.

Yegor

On Mon, Jul 18, 2011 at 5:13 PM, marcesher <ma...@gmail.com> wrote:
> I gave the latest nightlies a shot, and performance has indeed much improved!
>
> Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
> 2.5 minutes.
>
> Using the AutoFlush example, I can generate the same dimension file in about
> 4.5 minutes.
>
> Using the second example, but flushing all rows (flush()), I can generate
> the same dimension file in about 3.5 minutes.
>
> So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
> to read.
>
> Thanks for all the work, folks!

Re: SXSSF API

Posted by Yegor Kozlov <ye...@dinom.ru>.
Play with the window size passed to the constructor of SXSSFWorkbook.

new SXSSFWorkbook(1);     // in theory it should be the fastest: each row is immediately flushed
new SXSSFWorkbook(100);   // default size of the row buffer
new SXSSFWorkbook(1000);

Yegor

On Mon, Jul 18, 2011 at 5:13 PM, marcesher <ma...@gmail.com> wrote:
> I gave the latest nightlies a shot, and performance has indeed much improved!
>
> Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
> 2.5 minutes.
>
> Using the AutoFlush example, I can generate the same dimension file in about
> 4.5 minutes.
>
> Using the second example, but flushing all rows (flush()), I can generate
> the same dimension file in about 3.5 minutes.
>
> So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
> to read.
>
> Thanks for all the work, folks!

Re: SXSSF API

Posted by marcesher <ma...@gmail.com>.
I gave the latest nightlies a shot, and performance has indeed much improved! 

Using BigGridDemo, I can generate a 300k / 125-col sheet in between 2 and
2.5 minutes.

Using the AutoFlush example, I can generate the same dimension file in about
4.5 minutes.

Using the second example, but flushing all rows (flush()), I can generate
the same dimension file in about 3.5 minutes. 

So, not quite as fast as the BigGridDemo example, but a heckuva lot easier
to read.

Thanks for all the work, folks!

--
View this message in context: http://apache-poi.1045710.n5.nabble.com/SXSSF-API-tp4472443p4599448.html
Sent from the POI - User mailing list archive at Nabble.com.


Re: SXSSF API

Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 18 Jul 2011, marcesher wrote:
> FWIW, this is using 3.8 beta3, not the latest nightly.
>
> Is it expected that the streaming API will be this much slower than the
> BigGridDemo approach?

The streaming approach will likely be a little bit slower. Instead of just
writing the values out as they go, it needs to track them for a while, and
flush them out when the window passes beyond them. It shouldn't be that
much of an overhead, though.
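
To make the "track them for a while, then flush" idea concrete, here is a minimal sketch of a sliding row window in plain Java. It is only an illustration and not POI's actual implementation: the class name RowWindowWriter, its methods, and the use of a StringBuilder in place of the on-disk output are all assumptions made up for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical illustration of a sliding row window.
 * This is NOT how POI implements SXSSF internally; names are invented.
 */
class RowWindowWriter {
    private final int windowSize;                        // max rows kept in memory
    private final Deque<String> window = new ArrayDeque<>();
    private final StringBuilder sink;                    // stands in for the output on disk
    private int flushedRows = 0;

    RowWindowWriter(int windowSize, StringBuilder sink) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
        this.sink = sink;
    }

    /** Adds a row; once the window is over capacity, the oldest row is written out. */
    void addRow(String row) {
        window.addLast(row);
        while (window.size() > windowSize) {
            writeOldest();
        }
    }

    /** Writes out every row still held in the window. */
    void flushAll() {
        while (!window.isEmpty()) {
            writeOldest();
        }
    }

    int rowsInMemory() {
        return window.size();
    }

    int rowsFlushed() {
        return flushedRows;
    }

    private void writeOldest() {
        sink.append(window.removeFirst()).append('\n');
        flushedRows++;
    }
}
```

With a window of 2, the first two rows stay in memory and nothing is written; adding a third row pushes the oldest one out to the sink. The extra bookkeeping per row (queueing, size check, eviction) is the kind of overhead described above, compared with writing each value straight to the output.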

Any chance you could re-try with a recent nightly build / a build from a 
svn checkout of trunk, and see if that's any better? And if not, could you 
run a profiler and see where the time's going?

Thanks
Nick
