Posted to users@openoffice.apache.org by Mariusz `shd` Gliwiński <al...@gmail.com> on 2013/07/15 11:54:47 UTC

[dev] Performance of exporting through open office

Hello,
I'm looking for options for exporting fairly large XLS and XLSX documents
through OpenOffice.
The basic requirement I currently have is to export documents of
300,000 x 30 cells.
We also need colouring and formulas in the documents (at least the smaller ones).

We've been using external libraries for this, but because of their performance
and feature set I've been thinking about going 'native' with OO.
I'm completely new to the OO project; however, I tried an example Java project
with UNO, and it was horribly slow.
My guess is that most of the time and memory goes into rendering and
other UI things.

Finally, my questions to you are:
* Which API should I look at to format my sheets through the OpenOffice
libraries (XLS, XLSX, CSV)? Ideally I'd like to access a document model
instead of formatting the file statelessly (if it's realistic to do that
without using more than 5 GB of memory).
* Do you think that using OO components is the right thing to do in my
situation, or should I look for external libraries? Or maybe a
native / good implementation won't help much, and I should use a simple
implementation for big documents and a feature-rich implementation for
small ones?

Thanks in advance,
Mariusz Gliwiński

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@openoffice.apache.org
For additional commands, e-mail: users-help@openoffice.apache.org


Re: [dev] Performance of exporting through open office

Posted by shd <al...@gmail.com>.
Apache POI is exactly what I needed. Thanks for your advice.



Re: [dev] Performance of exporting through open office

Posted by Mariusz `shd` Gliwiński <al...@gmail.com>.
15.07.2013 12:28, Dennis E. Hamilton said:
> If what you want to *produce* is xls or xlsx, you might want to look into the Apache POI project.  That project provides Java libraries for manipulating and producing such documents.
> <http://poi.apache.org/>
Thank you for your answer. We need to both read and write; most of the
data can be processed row by row, though. It looks like I initially expected
too much convenience. I'll probably read more about stream-based Java
libraries (our software is originally PHP). I don't know how realistic
it is to have formulas and colors in a *stream-based* approach, but if
someone knows of solutions like that, I'd be glad to hear about them.
> Otherwise, I think the best option is the OpenOffice SDK.
I tried filling cells manually and importing from a database (both using
UNO). Unfortunately, both ways are much slower than what we had
before. If there are any other methods of filling the sheets and exporting,
I'd love to try them out.
> There is, of course, document model and library access available within Microsoft Office and Visual Studio extensions for Microsoft Office.  I assume that is not an option for you, although I would think the performance of consuming the documents you are producing might be an issue to address as well, and that might be a better place to address it.
I've been thinking about using C# on MS Windows + Office, but is it
worth setting up and maintaining a whole new server for it? Is the
performance difference that big?
> PS: CSV is always an option.  It doesn't work for formulas and cell coloring, etc., though.  There is a form of single-file Excel spreadsheet in XML that might be suitable.  There are also HTML formats for XML spreadsheets that might be easier to emit.  Your document might be too complicated or too large for these cases.  One way to find out is to make a small version having the essential features you need, in Excel, and then see whether and how well those are produced in flat XML and HTML.  Then you can examine them for the structure that you need to replicate with greater-sized documents.  Scaling experiments can be conducted to determine if these are feasible to produce at your size and also to consume by the intended recipients.

We already have a separate code path for CSV exports; however, the client
insists on XLS. Of course, we can make one export for small files (with
colors and formulas) and another one for simple but large exports. I'll
still look a bit more for a unified solution, though.

Thanks,
Mariusz Gliwiński
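
One constraint that bears on the choice between a unified export and separate small/large exports: a worksheet in the binary XLS format holds at most 65,536 rows and 256 columns, while an XLSX worksheet holds up to 1,048,576 rows and 16,384 columns. Assuming the 300,000 x 30 figure means 300,000 rows of 30 columns (the thread does not say which way round it is), a full-size XLS export would have to be split across several sheets. The sketch below only does that arithmetic; the class and method names are made up for illustration.

```java
/** Hypothetical helper (not from the thread): sheets needed per format for a given row count. */
class SheetLimits {
    static final int XLS_MAX_ROWS = 65_536;      // rows per sheet, binary .xls
    static final int XLSX_MAX_ROWS = 1_048_576;  // rows per sheet, .xlsx

    /** Number of sheets needed for dataRows rows when one sheet holds maxRowsPerSheet rows. */
    static int sheetsNeeded(int dataRows, int maxRowsPerSheet) {
        if (dataRows < 0 || maxRowsPerSheet <= 0) {
            throw new IllegalArgumentException("row counts must be non-negative and limits positive");
        }
        // Ceiling division, done in long to avoid int overflow near Integer.MAX_VALUE.
        return (int) ((dataRows + (long) maxRowsPerSheet - 1) / maxRowsPerSheet);
    }

    public static void main(String[] args) {
        int rows = 300_000; // assumption: 300,000 rows of 30 columns each
        System.out.println("XLS sheets:  " + sheetsNeeded(rows, XLS_MAX_ROWS));   // prints 5
        System.out.println("XLSX sheets: " + sheetsNeeded(rows, XLSX_MAX_ROWS));  // prints 1
    }
}
```

Under that assumption the large export fits in one XLSX sheet but needs five XLS sheets, which may be worth raising with a client who insists on XLS.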



RE: [dev] Performance of exporting through open office

Posted by "Dennis E. Hamilton" <de...@acm.org>.
If what you want to *produce* is xls or xlsx, you might want to look into the Apache POI project.  That project provides Java libraries for manipulating and producing such documents.
<http://poi.apache.org/>


If what you want to *produce* is ods, then the Apache ODF Toolkit might suit you.  It is also Java-based.
<http://incubator.apache.org/odftoolkit/>

Or some combination of the two.  

Otherwise, I think the best option is the OpenOffice SDK.


There is, of course, document model and library access available within Microsoft Office and Visual Studio extensions for Microsoft Office.  I assume that is not an option for you, although I would think the performance of consuming the documents you are producing might be an issue to address as well, and that might be a better place to address it.

 - Dennis

PS: CSV is always an option.  It doesn't work for formulas and cell coloring, etc., though.  There is a form of single-file Excel spreadsheet in XML that might be suitable.  There are also HTML formats for XML spreadsheets that might be easier to emit.  Your document might be too complicated or too large for these cases.  One way to find out is to make a small version having the essential features you need, in Excel, and then see whether and how well those are produced in flat XML and HTML.  Then you can examine them for the structure that you need to replicate with greater-sized documents.  Scaling experiments can be conducted to determine if these are feasible to produce at your size and also to consume by the intended recipients.
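
To make the flat-XML suggestion above concrete, here is a minimal sketch in Java of writing the single-file Excel XML format (the Excel 2003 "XML Spreadsheet" format) one row at a time, with one coloured cell style and one formula per row. The class name FlatXmlSheetWriter, the style ID "hl", the sheet name and the four-column layout are assumptions made for illustration and do not come from this thread. Whether Excel or OpenOffice opens a file of the full size discussed here acceptably is exactly the kind of scaling experiment described above and would have to be measured.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical helper (not from the thread): streams an Excel 2003 XML workbook row by row. */
class FlatXmlSheetWriter {
    private final Writer out;

    FlatXmlSheetWriter(Writer out) {
        this.out = out;
    }

    /** Writes the workbook header, one yellow-fill style named "hl", and opens a single sheet. */
    void begin(String sheetName) throws IOException {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<?mso-application progid=\"Excel.Sheet\"?>\n");
        out.write("<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\"\n");
        out.write("          xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">\n");
        out.write(" <Styles>\n");
        out.write("  <Style ss:ID=\"hl\"><Interior ss:Color=\"#FFFF00\" ss:Pattern=\"Solid\"/></Style>\n");
        out.write(" </Styles>\n");
        out.write(" <Worksheet ss:Name=\"" + escape(sheetName) + "\">\n  <Table>\n");
    }

    /** Writes one row: a highlighted label, two numbers, and their sum as a formula cell. */
    void row(String label, double a, double b) throws IOException {
        out.write("   <Row>");
        out.write("<Cell ss:StyleID=\"hl\"><Data ss:Type=\"String\">" + escape(label) + "</Data></Cell>");
        out.write("<Cell><Data ss:Type=\"Number\">" + a + "</Data></Cell>");
        out.write("<Cell><Data ss:Type=\"Number\">" + b + "</Data></Cell>");
        // Formulas use R1C1 notation relative to the formula cell; the Data value is a cached result.
        out.write("<Cell ss:Formula=\"=RC[-2]+RC[-1]\"><Data ss:Type=\"Number\">" + (a + b) + "</Data></Cell>");
        out.write("</Row>\n");
    }

    /** Closes the sheet and the workbook. */
    void end() throws IOException {
        out.write("  </Table>\n </Worksheet>\n</Workbook>\n");
        out.flush();
    }

    /** Escapes the characters that are special in XML text and attribute values. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Demo: renders a small workbook into a String. */
    static String render(int rows) {
        StringWriter buffer = new StringWriter();
        try {
            FlatXmlSheetWriter w = new FlatXmlSheetWriter(buffer);
            w.begin("Export");
            for (int i = 1; i <= rows; i++) {
                w.row("item <" + i + ">", i, i * 0.5);
            }
            w.end();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(2));
    }
}
```

For a real export the writer would wrap a buffered file stream rather than a StringWriter, so that no more than the current row is held in memory by this code. The result is an XML file, not a binary XLS or an XLSX package, so it only helps if the recipients can accept that format.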
