You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@poi.apache.org by Martin Asenov <mA...@velti.com> on 2010/04/23 15:51:27 UTC

flush workbook from time to time

Hello, everyone!

I've got some DB which contains some contacts. I'm using the Apache POI in order to export that DB to an Excel file. Anyway, let's presume the contacts are 2000. When I start creating the workbook, the contacts are inserted pretty quick, but when the workbook begins to get bigger and bigger, things are terribly slowed down.

Is there a way to flush the workbook from time to time? I didn't see such opportunity in the API.

Best,
Martin

RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Hello Martin,

212 KB is quite large, certainly larger than most anything I have played
with and I do not know how it compares to the sorts of file sizes others
have created. Like you, I am a bit surprised that the time taken for
inserting record increases and think it might be worthwhile profiling the
code. At least that way you can be fairly certain where the problem lies and
attack it from there.

All the best.

Yours

Mark B


Martin Asenov wrote:
> 
> No, I've been using old good system.out.println(); statements - and
> noticed that the gap between different insertions into the workbook
> slighly increases on every insertion. Actually, the output file was 212 kb
> large - it is surprising for me that file of such size could be generated
> in so large amount of time. Something must be wrong...
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:24 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Oh, well in that case the BigGridDemo would not work for you as it was
> only
> applicable to the OOXML based file format. If you are using the older
> binary
> format (.xls) then the answer is a certain no to streaming the file in
> chunks I am sorry to say. The reason lies in the structure of the Excel
> file
> itself, I am not completely certain of the details but understand that it
> consists of a series of streams and is block structured - that is why the
> files sizes are always multiples of 512 or 4096 I believe. As a result, a
> model of the file must be assembled in memory firstly, formatted and then
> written out.
> 
> Must admit that I am surprised you are experiencing problems with
> performance if you are using the HSSF stream as that is a more common
> problem with the OOXML files. Have you tried running a profiler on the
> code
> to see where the bottlenecks are?
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi, Mark, thanks for the quick reply.
>> 
>> I don't know exactly what format we're exporting into, but I think it's
>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>> thing is not written by me, so I'm unaware of more details about it.
>> 
>> Every single contact expands from 2 to up to 20 horizontal cells, but
>> commonly in 4-5 horizontal cells.
>> 
>> The point is that in the UI when the user clicks on 'export' we're
>> supposed to supply him with the exported file. Delaying is not a problem,
>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
>> way to work around this...
>> 
>> Best,
>> Martin
>> 
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 5:17 PM
>> To: user@poi.apache.org
>> Subject: Re: flush workbook from time to time
>> 
>> 
>> Can I ask which version of the file format you are targetting please, the
>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>> have
>> the link at hand currently - where he streams the contents of a file into
>> a
>> template to create large worksheets without encountering out of memory
>> exceptions. It is not a fully fledged utility but rather a proof that you
>> may - and that is may - be able to adapt to handle your current
>> requirement.
>> I do not know if you would be able to modify the program to, for example,
>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>> at
>> a
>> later date but I feel it could be worth looking into.
>> 
>> Aside from that, there is no way to currently stream the document out bit
>> by
>> bit I am sorry to say. It should be possible and has, I believe, been
>> discussed by the developers but would require some programming effort and
>> no
>> one has stepped up to sponsor the work.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hello, everyone!
>>> 
>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>> in
>>> order to export that DB to an Excel file. Anyway, let's presume the
>>> contacts are 2000. When I start creating the workbook, the contacts are
>>> inserted pretty quick, but when the workbook begins to get bigger and
>>> bigger, things are terribly slowed down.
>>> 
>>> Is there a way to flush the workbook from time to time? I didn't see
>>> such
>>> opportunity in the API.
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343097.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: flush workbook from time to time

Posted by Richard Holmes <ri...@shedconsulting.co.uk>.
Hi martin

Have you looked at what the garbage collector is doing.  I have seen this type of behaviour in applications and it is nearly always  due to memory not being released causing the gc to take more and more time from the app.
 
Rich
-----Original Message-----
From: Martin Asenov <mA...@velti.com>
Date: Fri, 23 Apr 2010 19:07:47 
To: POI Users List<us...@poi.apache.org>
Subject: RE: flush workbook from time to time

Hi Mark,

Well I've got both CSV and XLS export - you know about line separators and all this stuff that is system dependent, so we provide the both variants. 

Let's discuss it in Monday... Have a nice weekend

Thank you,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Friday, April 23, 2010 6:59 PM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


Old age is really slowing me down these days - and I can blame working for
the last few days on a 1 in 3 slope removing Hawthorn, still digging the
thorns out of my fingers now! - but have you looked into generating a CSV
file and then 'opening' this with Excel?

I have never tried it myself but I think it should be possible to create a
template using Excel - set all of the cell formats correctly and so on. Then
you could simply open a CSV file and populate the respective cells with data
and I am guessing that the formatting information will be preserved; your
application would then only have to create/edit the CSV file.

It should be easy enough to mock up a simple test to see if it would work
and although it would add some complexity to the task - opening the Excel
template and then importing the data - that might be worthwhile. A macro
could even be used to automate the process - something that runs when the
workbook is opened to import the data from the CSV file maybe.

Yours

Mark B


Martin Asenov wrote:
> 
> No, I've been using old good system.out.println(); statements - and
> noticed that the gap between different insertions into the workbook
> slighly increases on every insertion. Actually, the output file was 212 kb
> large - it is surprising for me that file of such size could be generated
> in so large amount of time. Something must be wrong...
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:24 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Oh, well in that case the BigGridDemo would not work for you as it was
> only
> applicable to the OOXML based file format. If you are using the older
> binary
> format (.xls) then the answer is a certain no to streaming the file in
> chunks I am sorry to say. The reason lies in the structure of the Excel
> file
> itself, I am not completely certain of the details but understand that it
> consists of a series of streams and is block structured - that is why the
> files sizes are always multiples of 512 or 4096 I believe. As a result, a
> model of the file must be assembled in memory firstly, formatted and then
> written out.
> 
> Must admit that I am surprised you are experiencing problems with
> performance if you are using the HSSF stream as that is a more common
> problem with the OOXML files. Have you tried running a profiler on the
> code
> to see where the bottlenecks are?
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi, Mark, thanks for the quick reply.
>> 
>> I don't know exactly what format we're exporting into, but I think it's
>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>> thing is not written by me, so I'm unaware of more details about it.
>> 
>> Every single contact expands from 2 to up to 20 horizontal cells, but
>> commonly in 4-5 horizontal cells.
>> 
>> The point is that in the UI when the user clicks on 'export' we're
>> supposed to supply him with the exported file. Delaying is not a problem,
>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
>> way to work around this...
>> 
>> Best,
>> Martin
>> 
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 5:17 PM
>> To: user@poi.apache.org
>> Subject: Re: flush workbook from time to time
>> 
>> 
>> Can I ask which version of the file format you are targetting please, the
>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>> have
>> the link at hand currently - where he streams the contents of a file into
>> a
>> template to create large worksheets without encountering out of memory
>> exceptions. It is not a fully fledged utility but rather a proof that you
>> may - and that is may - be able to adapt to handle your current
>> requirement.
>> I do not know if you would be able to modify the program to, for example,
>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>> at
>> a
>> later date but I feel it could be worth looking into.
>> 
>> Aside from that, there is no way to currently stream the document out bit
>> by
>> bit I am sorry to say. It should be possible and has, I believe, been
>> discussed by the developers but would require some programming effort and
>> no
>> one has stepped up to sponsor the work.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hello, everyone!
>>> 
>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>> in
>>> order to export that DB to an Excel file. Anyway, let's presume the
>>> contacts are 2000. When I start creating the workbook, the contacts are
>>> inserted pretty quick, but when the workbook begins to get bigger and
>>> bigger, things are terribly slowed down.
>>> 
>>> Is there a way to flush the workbook from time to time? I didn't see
>>> such
>>> opportunity in the API.
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Hello Martin,

Glad you found the problem, I have just got in from work, was just sitting
down with a cup of tea to look through the code when I read your message(s).

Yours

Mark B


Martin Asenov wrote:
> 
> 
> Actually I removed all unnecessary references, and do autoSizeColumn after
> the loop finishes.
> 
> It's bad debugging someone else's code :))))
> 
> Thank you all for the efforts!
> 
> Best,
> Martin
> 
> 
> -----Original Message-----
> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
> Sent: Monday, April 26, 2010 3:55 PM
> To: POI Users List
> Subject: Re: flush workbook from time to time
> 
> Sorry just saw that too,
> 
> On other thing that I noticed, I declared
> 
>             HSSFRow currentContactRow;
>             Cell cellNumber;
>             Cell cellFirstName;
>             Cell cellLastName;
>             Cell cellCurrentGroup;
> Outside of the loop, this had a marked impact too although the
> autoResizeColumn was worse, it definatly overloaded the memory.
> 
> On 26 April 2010 13:51, Martin Asenov <mA...@velti.com> wrote:
> 
>> Hello, everyone!
>>
>> The problem was that sheet.autoResizeColumn() was called on every single
>> cell creation. Fortunately, I saw where the delay was coming from!
>>
>> Thank you all for the help!
>>
>> Best,
>> Martin
>>
>> -----Original Message-----
>> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
>>  Sent: Monday, April 26, 2010 10:23 AM
>> To: Martin Asenov
>> Subject: Re: flush workbook from time to time
>>
>> Hi martin
>>
>> Ok was just wondering how the iterators worked specifically the inner
>> iteration does it use queries against the database and are you explicitly
>> closing result sets
>>
>> Richard
>> -----Original Message-----
>> From: Martin Asenov <mA...@velti.com>
>> Date: Mon, 26 Apr 2010 10:16:09
>> To: POI Users List<us...@poi.apache.org>; richard@shedconsulting.co.uk<
>> richard@shedconsulting.co.uk>
>> Subject: RE: flush workbook from time to time
>>
>> Hi, Richard!
>>
>> The two classes are too big and there is nothing special about them - the
>> manager only uses underlying Hibernate layer to do queries, and the
>> Contact
>> class is just a bean.
>>
>> The problem here is as the workbook grows up, things are terribly slowed
>> down.
>>
>> Best,
>> Martin
>>
>> -----Original Message-----
>> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
>> Sent: Monday, April 26, 2010 10:12 AM
>> To: POI Users List
>> Subject: Re: flush workbook from time to time
>>
>> Hi martin
>>
>> Can you send the manager and contact class too
>>
>> I will look later when I am at my computer
>> Thanks
>> Richard
>> -----Original Message-----
>> From: Martin Asenov <mA...@velti.com>
>> Date: Mon, 26 Apr 2010 09:46:35
>> To: POI Users List<us...@poi.apache.org>
>> Subject: RE: flush workbook from time to time
>>
>>
>> Hello, everyone!
>>
>> I provide the snippet of code I'm using:
>>
>> @Component
>> public class PhonebookExporterXLS {
>>        public static final String GROUP_STR = "*";
>>        public static final String OUTPUT_FILE_NAME =
>> "contactsExportToXLS";
>>
>>        private ManagerBean manager;
>>        private SMSOfficeProperties props;
>>
>>        public void setProps(SMSOfficeProperties props) {
>>                this.props = props;
>>        }
>>
>>        private Domain domain;
>>        private User actor;
>>
>>        FileOutputStream fos;
>>
>>        public void setManager(ManagerBean manager) {
>>                this.manager = manager;
>>        }
>>
>>        public PhonebookExporterXLS() {
>>        }
>>
>>        public PhonebookExporterXLS(User actor, Domain domain) {
>>                this.domain = domain;
>>                this.actor = actor;
>>        }
>>
>>        // Excel tools
>>        HSSFWorkbook wb;
>>        HSSFSheet sheet;
>>
>>        int rowCounterContacts = 0;
>>
>>        public File export(String path) throws SMSOfficeException {
>>
>>                File exportedFile = new File(path +
>> String.format(props.get(OUTPUT_FILE_NAME),
>> StatisticsExporter.fileDateFormat.format(new Date())));
>>                exportedFile.getParentFile().mkdirs();
>>
>>                int currentRowCellsCount;
>>
>>                try {
>>                        fos = new FileOutputStream(exportedFile);
>>                        wb = new HSSFWorkbook();
>>                        sheet = wb.createSheet("Contacts");
>>
>>                        for (Contact currentContact :
>> manager.getContacts(domain)) {
>>                                HSSFRow currentContactRow =
>> sheet.createRow(rowCounterContacts++);
>>
>>                                currentRowCellsCount = 0;
>>
>>                                // Number cell(s) - (String type)
>>                                for (PhoneNumber currentNumber :
>> currentContact.getNumbers()) {
>>                                        Cell cellNumber =
>> currentContactRow.createCell(currentRowCellsCount++);
>>
>>  cellNumber.setCellValue(currentNumber.getFullNumber());
>>
>>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>>                                }
>>
>>                                // First name cell - (String type)
>>                                Cell cellFirstName =
>> currentContactRow.createCell(currentRowCellsCount++);
>>
>>  cellFirstName.setCellValue(currentContact.getFirstName());
>>                                sheet.autoSizeColumn(currentRowCellsCount
>> -
>> 1);
>>
>>                                // Last name cell - (String type)
>>                                Cell cellLastName =
>> currentContactRow.createCell(currentRowCellsCount++);
>>
>>  cellLastName.setCellValue(currentContact.getLastName());
>>                                sheet.autoSizeColumn(currentRowCellsCount
>> -
>> 1);
>>
>>                                for (Group currentGroup :
>> currentContact.getGroups()) {
>>                                        Cell cellCurrentGroup =
>> currentContactRow.createCell(currentRowCellsCount++);
>>
>>  cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
>>
>>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>>                                }
>>                        }
>>
>>                        // Writing the workbook to FileOutputStream...
>>                        wb.write(fos);
>>                        fos.flush();
>>
>>                } catch (IOException ioe) {
>>                        ioe.printStackTrace();
>>                } finally {
>>                        rowCounterContacts = 0;
>>
>>                        try {
>>                                fos.close();
>>                        } catch (IOException e) {
>>                                e.printStackTrace();
>>                        }
>>                }
>>
>>                return exportedFile;
>>
>>        }
>>
>>        public void setUser(User user) {
>>                actor = user;
>>        }
>>
>>        public void setDomain(Domain domain) {
>>                this.domain = domain;
>>        }
>> }
>>
>> Best,
>> Martin
>>
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk]
>> Sent: Sunday, April 25, 2010 10:45 AM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>>
>>
>> I should have expected you to look into CSV files, sorry about that. Must
>> admit, I also like Richard's suggestion; it could all be down to the JVM
>> not
>> reclaiming space.
>>
>> Until Monday.
>>
>> Yours
>>
>> Mark B
>>
>>
>> Martin Asenov wrote:
>> >
>> > Hi Mark,
>> >
>> > Well I've got both CSV and XLS export - you know about line separators
>> and
>> > all this stuff that is system dependent, so we provide the both
>> variants.
>> >
>> > Let's discuss it in Monday... Have a nice weekend
>> >
>> > Thank you,
>> > Martin
>> >
>> > -----Original Message-----
>> > From: MSB [mailto:markbrdsly@tiscali.co.uk]
>> > Sent: Friday, April 23, 2010 6:59 PM
>> > To: user@poi.apache.org
>> > Subject: RE: flush workbook from time to time
>> >
>> >
>> > Old age is really slowing me down these days - and I can blame working
>> for
>> > the last few days on a 1 in 3 slope removing Hawthorn, still digging
>> the
>> > thorns out of my fingers now! - but have you looked into generating a
>> CSV
>> > file and then 'opening' this with Excel?
>> >
>> > I have never tried it myself but I think it should be possible to
>> create
>> a
>> > template using Excel - set all of the cell formats correctly and so on.
>> > Then
>> > you could simply open a CSV file and populate the respective cells with
>> > data
>> > and I am guessing that the formatting information will be preserved;
>> your
>> > application would then only have to create/edit the CSV file.
>> >
>> > It should be easy enough to mock up a simple test to see if it would
>> work
>> > and although it would add some complexity to the task - opening the
>> Excel
>> > template and then importing the data - that might be worthwhile. A
>> macro
>> > could even be used to automate the process - something that runs when
>> the
>> > workbook is opened to import the data from the CSV file maybe.
>> >
>> > Yours
>> >
>> > Mark B
>> >
>> >
>> > Martin Asenov wrote:
>> >>
>> >> No, I've been using old good system.out.println(); statements - and
>> >> noticed that the gap between different insertions into the workbook
>> >> slighly increases on every insertion. Actually, the output file was
>> 212
>> >> kb
>> >> large - it is surprising for me that file of such size could be
>> generated
>> >> in so large amount of time. Something must be wrong...
>> >>
>> >> Best,
>> >> Martin
>> >>
>> >> -----Original Message-----
>> >> From: MSB [mailto:markbrdsly@tiscali.co.uk]
>> >> Sent: Friday, April 23, 2010 6:24 PM
>> >> To: user@poi.apache.org
>> >> Subject: RE: flush workbook from time to time
>> >>
>> >>
>> >> Oh, well in that case the BigGridDemo would not work for you as it was
>> >> only
>> >> applicable to the OOXML based file format. If you are using the older
>> >> binary
>> >> format (.xls) then the answer is a certain no to streaming the file in
>> >> chunks I am sorry to say. The reason lies in the structure of the
>> Excel
>> >> file
>> >> itself, I am not completely certain of the details but understand that
>> it
>> >> consists of a series of streams and is block structured - that is why
>> the
>> >> files sizes are always multiples of 512 or 4096 I believe. As a
>> result,
>> a
>> >> model of the file must be assembled in memory firstly, formatted and
>> then
>> >> written out.
>> >>
>> >> Must admit that I am surprised you are experiencing problems with
>> >> performance if you are using the HSSF stream as that is a more common
>> >> problem with the OOXML files. Have you tried running a profiler on the
>> >> code
>> >> to see where the bottlenecks are?
>> >>
>> >> Yours
>> >>
>> >> Mark B
>> >>
>> >>
>> >> Martin Asenov wrote:
>> >>>
>> >>> Hi, Mark, thanks for the quick reply.
>> >>>
>> >>> I don't know exactly what format we're exporting into, but I think
>> it's
>> >>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the
>> whole
>> >>> thing is not written by me, so I'm unaware of more details about it.
>> >>>
>> >>> Every single contact expands from 2 to up to 20 horizontal cells, but
>> >>> commonly in 4-5 horizontal cells.
>> >>>
>> >>> The point is that in the UI when the user clicks on 'export' we're
>> >>> supposed to supply him with the exported file. Delaying is not a
>> >>> problem,
>> >>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope
>> there's
>> >>> a
>> >>> way to work around this...
>> >>>
>> >>> Best,
>> >>> Martin
>> >>>
>> >>>
>> >>> -----Original Message-----
>> >>> From: MSB [mailto:markbrdsly@tiscali.co.uk]
>> >>> Sent: Friday, April 23, 2010 5:17 PM
>> >>> To: user@poi.apache.org
>> >>> Subject: Re: flush workbook from time to time
>> >>>
>> >>>
>> >>> Can I ask which version of the file format you are targetting please,
>> >>> the
>> >>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may
>> be
>> >>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do
>> not
>> >>> have
>> >>> the link at hand currently - where he streams the contents of a file
>> >>> into
>> >>> a
>> >>> template to create large worksheets without encountering out of
>> memory
>> >>> exceptions. It is not a fully fledged utility but rather a proof that
>> >>> you
>> >>> may - and that is may - be able to adapt to handle your current
>> >>> requirement.
>> >>> I do not know if you would be able to modify the program to, for
>> >>> example,
>> >>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with
>> data
>> >>> at
>> >>> a
>> >>> later date but I feel it could be worth looking into.
>> >>>
>> >>> Aside from that, there is no way to currently stream the document out
>> >>> bit
>> >>> by
>> >>> bit I am sorry to say. It should be possible and has, I believe, been
>> >>> discussed by the developers but would require some programming effort
>> >>> and
>> >>> no
>> >>> one has stepped up to sponsor the work.
>> >>>
>> >>> Yours
>> >>>
>> >>> Mark B
>> >>>
>> >>>
>> >>> Martin Asenov wrote:
>> >>>>
>> >>>> Hello, everyone!
>> >>>>
>> >>>> I've got some DB which contains some contacts. I'm using the Apache
>> POI
>> >>>> in
>> >>>> order to export that DB to an Excel file. Anyway, let's presume the
>> >>>> contacts are 2000. When I start creating the workbook, the contacts
>> are
>> >>>> inserted pretty quick, but when the workbook begins to get bigger
>> and
>> >>>> bigger, things are terribly slowed down.
>> >>>>
>> >>>> Is there a way to flush the workbook from time to time? I didn't see
>> >>>> such
>> >>>> opportunity in the API.
>> >>>>
>> >>>> Best,
>> >>>> Martin
>> >>>>
>> >>>>
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>>
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>> >>> Sent from the POI - User mailing list archive at Nabble.com.
>> >>>
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >>> For additional commands, e-mail: user-help@poi.apache.org
>> >>>
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >>> For additional commands, e-mail: user-help@poi.apache.org
>> >>>
>> >>>
>> >>>
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> >> Sent from the POI - User mailing list archive at Nabble.com.
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >> For additional commands, e-mail: user-help@poi.apache.org
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> >> For additional commands, e-mail: user-help@poi.apache.org
>> >>
>> >>
>> >>
>> >
>> > --
>> > View this message in context:
>> >
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
>> > Sent from the POI - User mailing list archive at Nabble.com.
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> > For additional commands, e-mail: user-help@poi.apache.org
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> > For additional commands, e-mail: user-help@poi.apache.org
>> >
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>>
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28364974.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Actually I removed all unnecessary references, and do autoSizeColumn after the loop finishes.

It's bad debugging someone else's code :))))

Thank you all for the efforts!

Best,
Martin


-----Original Message-----
From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
Sent: Monday, April 26, 2010 3:55 PM
To: POI Users List
Subject: Re: flush workbook from time to time

Sorry just saw that too,

On other thing that I noticed, I declared

            HSSFRow currentContactRow;
            Cell cellNumber;
            Cell cellFirstName;
            Cell cellLastName;
            Cell cellCurrentGroup;
Outside of the loop, this had a marked impact too although the
autoResizeColumn was worse, it definatly overloaded the memory.

On 26 April 2010 13:51, Martin Asenov <mA...@velti.com> wrote:

> Hello, everyone!
>
> The problem was that sheet.autoResizeColumn() was called on every single
> cell creation. Fortunately, I saw where the delay was coming from!
>
> Thank you all for the help!
>
> Best,
> Martin
>
> -----Original Message-----
> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
>  Sent: Monday, April 26, 2010 10:23 AM
> To: Martin Asenov
> Subject: Re: flush workbook from time to time
>
> Hi martin
>
> Ok was just wondering how the iterators worked specifically the inner
> iteration does it use queries against the database and are you explicitly
> closing result sets
>
> Richard
> -----Original Message-----
> From: Martin Asenov <mA...@velti.com>
> Date: Mon, 26 Apr 2010 10:16:09
> To: POI Users List<us...@poi.apache.org>; richard@shedconsulting.co.uk<
> richard@shedconsulting.co.uk>
> Subject: RE: flush workbook from time to time
>
> Hi, Richard!
>
> The two classes are too big and there is nothing special about them - the
> manager only uses underlying Hibernate layer to do queries, and the Contact
> class is just a bean.
>
> The problem here is as the workbook grows up, things are terribly slowed
> down.
>
> Best,
> Martin
>
> -----Original Message-----
> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
> Sent: Monday, April 26, 2010 10:12 AM
> To: POI Users List
> Subject: Re: flush workbook from time to time
>
> Hi martin
>
> Can you send the manager and contact class too
>
> I will look later when I am at my computer
> Thanks
> Richard
> -----Original Message-----
> From: Martin Asenov <mA...@velti.com>
> Date: Mon, 26 Apr 2010 09:46:35
> To: POI Users List<us...@poi.apache.org>
> Subject: RE: flush workbook from time to time
>
>
> Hello, everyone!
>
> I provide the snippet of code I'm using:
>
> @Component
> public class PhonebookExporterXLS {
>        public static final String GROUP_STR = "*";
>        public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";
>
>        private ManagerBean manager;
>        private SMSOfficeProperties props;
>
>        public void setProps(SMSOfficeProperties props) {
>                this.props = props;
>        }
>
>        private Domain domain;
>        private User actor;
>
>        FileOutputStream fos;
>
>        public void setManager(ManagerBean manager) {
>                this.manager = manager;
>        }
>
>        public PhonebookExporterXLS() {
>        }
>
>        public PhonebookExporterXLS(User actor, Domain domain) {
>                this.domain = domain;
>                this.actor = actor;
>        }
>
>        // Excel tools
>        HSSFWorkbook wb;
>        HSSFSheet sheet;
>
>        int rowCounterContacts = 0;
>
>        public File export(String path) throws SMSOfficeException {
>
>                File exportedFile = new File(path +
> String.format(props.get(OUTPUT_FILE_NAME),
> StatisticsExporter.fileDateFormat.format(new Date())));
>                exportedFile.getParentFile().mkdirs();
>
>                int currentRowCellsCount;
>
>                try {
>                        fos = new FileOutputStream(exportedFile);
>                        wb = new HSSFWorkbook();
>                        sheet = wb.createSheet("Contacts");
>
>                        for (Contact currentContact :
> manager.getContacts(domain)) {
>                                HSSFRow currentContactRow =
> sheet.createRow(rowCounterContacts++);
>
>                                currentRowCellsCount = 0;
>
>                                // Number cell(s) - (String type)
>                                for (PhoneNumber currentNumber :
> currentContact.getNumbers()) {
>                                        Cell cellNumber =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellNumber.setCellValue(currentNumber.getFullNumber());
>
>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>                                }
>
>                                // First name cell - (String type)
>                                Cell cellFirstName =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellFirstName.setCellValue(currentContact.getFirstName());
>                                sheet.autoSizeColumn(currentRowCellsCount -
> 1);
>
>                                // Last name cell - (String type)
>                                Cell cellLastName =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellLastName.setCellValue(currentContact.getLastName());
>                                sheet.autoSizeColumn(currentRowCellsCount -
> 1);
>
>                                for (Group currentGroup :
> currentContact.getGroups()) {
>                                        Cell cellCurrentGroup =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
>
>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>                                }
>                        }
>
>                        // Writing the workbook to FileOutputStream...
>                        wb.write(fos);
>                        fos.flush();
>
>                } catch (IOException ioe) {
>                        ioe.printStackTrace();
>                } finally {
>                        rowCounterContacts = 0;
>
>                        try {
>                                fos.close();
>                        } catch (IOException e) {
>                                e.printStackTrace();
>                        }
>                }
>
>                return exportedFile;
>
>        }
>
>        public void setUser(User user) {
>                actor = user;
>        }
>
>        public void setDomain(Domain domain) {
>                this.domain = domain;
>        }
> }
>
> Best,
> Martin
>
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> Sent: Sunday, April 25, 2010 10:45 AM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
>
>
> I should have expected you to look into CSV files, sorry about that. Must
> admit, I also like Richard's suggestion; it could all be down to the JVM
> not
> reclaiming space.
>
> Until Monday.
>
> Yours
>
> Mark B
>
>
> Martin Asenov wrote:
> >
> > Hi Mark,
> >
> > Well I've got both CSV and XLS export - you know about line separators
> and
> > all this stuff that is system dependent, so we provide the both variants.
> >
> > Let's discuss it in Monday... Have a nice weekend
> >
> > Thank you,
> > Martin
> >
> > -----Original Message-----
> > From: MSB [mailto:markbrdsly@tiscali.co.uk]
> > Sent: Friday, April 23, 2010 6:59 PM
> > To: user@poi.apache.org
> > Subject: RE: flush workbook from time to time
> >
> >
> > Old age is really slowing me down these days - and I can blame working
> for
> > the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> > thorns out of my fingers now! - but have you looked into generating a CSV
> > file and then 'opening' this with Excel?
> >
> > I have never tried it myself but I think it should be possible to create
> a
> > template using Excel - set all of the cell formats correctly and so on.
> > Then
> > you could simply open a CSV file and populate the respective cells with
> > data
> > and I am guessing that the formatting information will be preserved; your
> > application would then only have to create/edit the CSV file.
> >
> > It should be easy enough to mock up a simple test to see if it would work
> > and although it would add some complexity to the task - opening the Excel
> > template and then importing the data - that might be worthwhile. A macro
> > could even be used to automate the process - something that runs when the
> > workbook is opened to import the data from the CSV file maybe.
> >
> > Yours
> >
> > Mark B
> >
> >
> > Martin Asenov wrote:
> >>
> >> No, I've been using old good system.out.println(); statements - and
> >> noticed that the gap between different insertions into the workbook
> >> slighly increases on every insertion. Actually, the output file was 212
> >> kb
> >> large - it is surprising for me that file of such size could be
> generated
> >> in so large amount of time. Something must be wrong...
> >>
> >> Best,
> >> Martin
> >>
> >> -----Original Message-----
> >> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> >> Sent: Friday, April 23, 2010 6:24 PM
> >> To: user@poi.apache.org
> >> Subject: RE: flush workbook from time to time
> >>
> >>
> >> Oh, well in that case the BigGridDemo would not work for you as it was
> >> only
> >> applicable to the OOXML based file format. If you are using the older
> >> binary
> >> format (.xls) then the answer is a certain no to streaming the file in
> >> chunks I am sorry to say. The reason lies in the structure of the Excel
> >> file
> >> itself, I am not completely certain of the details but understand that
> it
> >> consists of a series of streams and is block structured - that is why
> the
> >> files sizes are always multiples of 512 or 4096 I believe. As a result,
> a
> >> model of the file must be assembled in memory firstly, formatted and
> then
> >> written out.
> >>
> >> Must admit that I am surprised you are experiencing problems with
> >> performance if you are using the HSSF stream as that is a more common
> >> problem with the OOXML files. Have you tried running a profiler on the
> >> code
> >> to see where the bottlenecks are?
> >>
> >> Yours
> >>
> >> Mark B
> >>
> >>
> >> Martin Asenov wrote:
> >>>
> >>> Hi, Mark, thanks for the quick reply.
> >>>
> >>> I don't know exactly what format we're exporting into, but I think it's
> >>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
> >>> thing is not written by me, so I'm unaware of more details about it.
> >>>
> >>> Every single contact expands from 2 to up to 20 horizontal cells, but
> >>> commonly in 4-5 horizontal cells.
> >>>
> >>> The point is that in the UI when the user clicks on 'export' we're
> >>> supposed to supply him with the exported file. Delaying is not a
> >>> problem,
> >>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
> >>> a
> >>> way to work around this...
> >>>
> >>> Best,
> >>> Martin
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> >>> Sent: Friday, April 23, 2010 5:17 PM
> >>> To: user@poi.apache.org
> >>> Subject: Re: flush workbook from time to time
> >>>
> >>>
> >>> Can I ask which version of the file format you are targetting please,
> >>> the
> >>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
> >>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
> >>> have
> >>> the link at hand currently - where he streams the contents of a file
> >>> into
> >>> a
> >>> template to create large worksheets without encountering out of memory
> >>> exceptions. It is not a fully fledged utility but rather a proof that
> >>> you
> >>> may - and that is may - be able to adapt to handle your current
> >>> requirement.
> >>> I do not know if you would be able to modify the program to, for
> >>> example,
> >>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
> >>> at
> >>> a
> >>> later date but I feel it could be worth looking into.
> >>>
> >>> Aside from that, there is no way to currently stream the document out
> >>> bit
> >>> by
> >>> bit I am sorry to say. It should be possible and has, I believe, been
> >>> discussed by the developers but would require some programming effort
> >>> and
> >>> no
> >>> one has stepped up to sponsor the work.
> >>>
> >>> Yours
> >>>
> >>> Mark B
> >>>
> >>>
> >>> Martin Asenov wrote:
> >>>>
> >>>> Hello, everyone!
> >>>>
> >>>> I've got some DB which contains some contacts. I'm using the Apache
> POI
> >>>> in
> >>>> order to export that DB to an Excel file. Anyway, let's presume the
> >>>> contacts are 2000. When I start creating the workbook, the contacts
> are
> >>>> inserted pretty quick, but when the workbook begins to get bigger and
> >>>> bigger, things are terribly slowed down.
> >>>>
> >>>> Is there a way to flush the workbook from time to time? I didn't see
> >>>> such
> >>>> opportunity in the API.
> >>>>
> >>>> Best,
> >>>> Martin
> >>>>
> >>>>
> >>>
> >>> --
> >>> View this message in context:
> >>>
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
> >>> Sent from the POI - User mailing list archive at Nabble.com.
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >>> For additional commands, e-mail: user-help@poi.apache.org
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >>> For additional commands, e-mail: user-help@poi.apache.org
> >>>
> >>>
> >>>
> >>
> >> --
> >> View this message in context:
> >>
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> >> Sent from the POI - User mailing list archive at Nabble.com.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >> For additional commands, e-mail: user-help@poi.apache.org
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >> For additional commands, e-mail: user-help@poi.apache.org
> >>
> >>
> >>
> >
> > --
> > View this message in context:
> >
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> > Sent from the POI - User mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> > For additional commands, e-mail: user-help@poi.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> > For additional commands, e-mail: user-help@poi.apache.org
> >
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
> Sent from the POI - User mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: flush workbook from time to time

Posted by Richard Holmes <ri...@shedconsulting.co.uk>.
Sorry just saw that too,

On other thing that I noticed, I declared

            HSSFRow currentContactRow;
            Cell cellNumber;
            Cell cellFirstName;
            Cell cellLastName;
            Cell cellCurrentGroup;
Outside of the loop, this had a marked impact too although the
autoResizeColumn was worse, it definatly overloaded the memory.

On 26 April 2010 13:51, Martin Asenov <mA...@velti.com> wrote:

> Hello, everyone!
>
> The problem was that sheet.autoResizeColumn() was called on every single
> cell creation. Fortunately, I saw where the delay was coming from!
>
> Thank you all for the help!
>
> Best,
> Martin
>
> -----Original Message-----
> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
>  Sent: Monday, April 26, 2010 10:23 AM
> To: Martin Asenov
> Subject: Re: flush workbook from time to time
>
> Hi martin
>
> Ok was just wondering how the iterators worked specifically the inner
> iteration does it use queries against the database and are you explicitly
> closing result sets
>
> Richard
> -----Original Message-----
> From: Martin Asenov <mA...@velti.com>
> Date: Mon, 26 Apr 2010 10:16:09
> To: POI Users List<us...@poi.apache.org>; richard@shedconsulting.co.uk<
> richard@shedconsulting.co.uk>
> Subject: RE: flush workbook from time to time
>
> Hi, Richard!
>
> The two classes are too big and there is nothing special about them - the
> manager only uses underlying Hibernate layer to do queries, and the Contact
> class is just a bean.
>
> The problem here is as the workbook grows up, things are terribly slowed
> down.
>
> Best,
> Martin
>
> -----Original Message-----
> From: Richard Holmes [mailto:richard@shedconsulting.co.uk]
> Sent: Monday, April 26, 2010 10:12 AM
> To: POI Users List
> Subject: Re: flush workbook from time to time
>
> Hi martin
>
> Can you send the manager and contact class too
>
> I will look later when I am at my computer
> Thanks
> Richard
> -----Original Message-----
> From: Martin Asenov <mA...@velti.com>
> Date: Mon, 26 Apr 2010 09:46:35
> To: POI Users List<us...@poi.apache.org>
> Subject: RE: flush workbook from time to time
>
>
> Hello, everyone!
>
> I provide the snippet of code I'm using:
>
> @Component
> public class PhonebookExporterXLS {
>        public static final String GROUP_STR = "*";
>        public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";
>
>        private ManagerBean manager;
>        private SMSOfficeProperties props;
>
>        public void setProps(SMSOfficeProperties props) {
>                this.props = props;
>        }
>
>        private Domain domain;
>        private User actor;
>
>        FileOutputStream fos;
>
>        public void setManager(ManagerBean manager) {
>                this.manager = manager;
>        }
>
>        public PhonebookExporterXLS() {
>        }
>
>        public PhonebookExporterXLS(User actor, Domain domain) {
>                this.domain = domain;
>                this.actor = actor;
>        }
>
>        // Excel tools
>        HSSFWorkbook wb;
>        HSSFSheet sheet;
>
>        int rowCounterContacts = 0;
>
>        public File export(String path) throws SMSOfficeException {
>
>                File exportedFile = new File(path +
> String.format(props.get(OUTPUT_FILE_NAME),
> StatisticsExporter.fileDateFormat.format(new Date())));
>                exportedFile.getParentFile().mkdirs();
>
>                int currentRowCellsCount;
>
>                try {
>                        fos = new FileOutputStream(exportedFile);
>                        wb = new HSSFWorkbook();
>                        sheet = wb.createSheet("Contacts");
>
>                        for (Contact currentContact :
> manager.getContacts(domain)) {
>                                HSSFRow currentContactRow =
> sheet.createRow(rowCounterContacts++);
>
>                                currentRowCellsCount = 0;
>
>                                // Number cell(s) - (String type)
>                                for (PhoneNumber currentNumber :
> currentContact.getNumbers()) {
>                                        Cell cellNumber =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellNumber.setCellValue(currentNumber.getFullNumber());
>
>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>                                }
>
>                                // First name cell - (String type)
>                                Cell cellFirstName =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellFirstName.setCellValue(currentContact.getFirstName());
>                                sheet.autoSizeColumn(currentRowCellsCount -
> 1);
>
>                                // Last name cell - (String type)
>                                Cell cellLastName =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellLastName.setCellValue(currentContact.getLastName());
>                                sheet.autoSizeColumn(currentRowCellsCount -
> 1);
>
>                                for (Group currentGroup :
> currentContact.getGroups()) {
>                                        Cell cellCurrentGroup =
> currentContactRow.createCell(currentRowCellsCount++);
>
>  cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
>
>  sheet.autoSizeColumn(currentRowCellsCount - 1);
>                                }
>                        }
>
>                        // Writing the workbook to FileOutputStream...
>                        wb.write(fos);
>                        fos.flush();
>
>                } catch (IOException ioe) {
>                        ioe.printStackTrace();
>                } finally {
>                        rowCounterContacts = 0;
>
>                        try {
>                                fos.close();
>                        } catch (IOException e) {
>                                e.printStackTrace();
>                        }
>                }
>
>                return exportedFile;
>
>        }
>
>        public void setUser(User user) {
>                actor = user;
>        }
>
>        public void setDomain(Domain domain) {
>                this.domain = domain;
>        }
> }
>
> Best,
> Martin
>
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> Sent: Sunday, April 25, 2010 10:45 AM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
>
>
> I should have expected you to look into CSV files, sorry about that. Must
> admit, I also like Richard's suggestion; it could all be down to the JVM
> not
> reclaiming space.
>
> Until Monday.
>
> Yours
>
> Mark B
>
>
> Martin Asenov wrote:
> >
> > Hi Mark,
> >
> > Well I've got both CSV and XLS export - you know about line separators
> and
> > all this stuff that is system dependent, so we provide the both variants.
> >
> > Let's discuss it in Monday... Have a nice weekend
> >
> > Thank you,
> > Martin
> >
> > -----Original Message-----
> > From: MSB [mailto:markbrdsly@tiscali.co.uk]
> > Sent: Friday, April 23, 2010 6:59 PM
> > To: user@poi.apache.org
> > Subject: RE: flush workbook from time to time
> >
> >
> > Old age is really slowing me down these days - and I can blame working
> for
> > the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> > thorns out of my fingers now! - but have you looked into generating a CSV
> > file and then 'opening' this with Excel?
> >
> > I have never tried it myself but I think it should be possible to create
> a
> > template using Excel - set all of the cell formats correctly and so on.
> > Then
> > you could simply open a CSV file and populate the respective cells with
> > data
> > and I am guessing that the formatting information will be preserved; your
> > application would then only have to create/edit the CSV file.
> >
> > It should be easy enough to mock up a simple test to see if it would work
> > and although it would add some complexity to the task - opening the Excel
> > template and then importing the data - that might be worthwhile. A macro
> > could even be used to automate the process - something that runs when the
> > workbook is opened to import the data from the CSV file maybe.
> >
> > Yours
> >
> > Mark B
> >
> >
> > Martin Asenov wrote:
> >>
> >> No, I've been using old good system.out.println(); statements - and
> >> noticed that the gap between different insertions into the workbook
> >> slighly increases on every insertion. Actually, the output file was 212
> >> kb
> >> large - it is surprising for me that file of such size could be
> generated
> >> in so large amount of time. Something must be wrong...
> >>
> >> Best,
> >> Martin
> >>
> >> -----Original Message-----
> >> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> >> Sent: Friday, April 23, 2010 6:24 PM
> >> To: user@poi.apache.org
> >> Subject: RE: flush workbook from time to time
> >>
> >>
> >> Oh, well in that case the BigGridDemo would not work for you as it was
> >> only
> >> applicable to the OOXML based file format. If you are using the older
> >> binary
> >> format (.xls) then the answer is a certain no to streaming the file in
> >> chunks I am sorry to say. The reason lies in the structure of the Excel
> >> file
> >> itself, I am not completely certain of the details but understand that
> it
> >> consists of a series of streams and is block structured - that is why
> the
> >> files sizes are always multiples of 512 or 4096 I believe. As a result,
> a
> >> model of the file must be assembled in memory firstly, formatted and
> then
> >> written out.
> >>
> >> Must admit that I am surprised you are experiencing problems with
> >> performance if you are using the HSSF stream as that is a more common
> >> problem with the OOXML files. Have you tried running a profiler on the
> >> code
> >> to see where the bottlenecks are?
> >>
> >> Yours
> >>
> >> Mark B
> >>
> >>
> >> Martin Asenov wrote:
> >>>
> >>> Hi, Mark, thanks for the quick reply.
> >>>
> >>> I don't know exactly what format we're exporting into, but I think it's
> >>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
> >>> thing is not written by me, so I'm unaware of more details about it.
> >>>
> >>> Every single contact expands from 2 to up to 20 horizontal cells, but
> >>> commonly in 4-5 horizontal cells.
> >>>
> >>> The point is that in the UI when the user clicks on 'export' we're
> >>> supposed to supply him with the exported file. Delaying is not a
> >>> problem,
> >>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
> >>> a
> >>> way to work around this...
> >>>
> >>> Best,
> >>> Martin
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: MSB [mailto:markbrdsly@tiscali.co.uk]
> >>> Sent: Friday, April 23, 2010 5:17 PM
> >>> To: user@poi.apache.org
> >>> Subject: Re: flush workbook from time to time
> >>>
> >>>
> >>> Can I ask which version of the file format you are targetting please,
> >>> the
> >>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
> >>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
> >>> have
> >>> the link at hand currently - where he streams the contents of a file
> >>> into
> >>> a
> >>> template to create large worksheets without encountering out of memory
> >>> exceptions. It is not a fully fledged utility but rather a proof that
> >>> you
> >>> may - and that is may - be able to adapt to handle your current
> >>> requirement.
> >>> I do not know if you would be able to modify the program to, for
> >>> example,
> >>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
> >>> at
> >>> a
> >>> later date but I feel it could be worth looking into.
> >>>
> >>> Aside from that, there is no way to currently stream the document out
> >>> bit
> >>> by
> >>> bit I am sorry to say. It should be possible and has, I believe, been
> >>> discussed by the developers but would require some programming effort
> >>> and
> >>> no
> >>> one has stepped up to sponsor the work.
> >>>
> >>> Yours
> >>>
> >>> Mark B
> >>>
> >>>
> >>> Martin Asenov wrote:
> >>>>
> >>>> Hello, everyone!
> >>>>
> >>>> I've got some DB which contains some contacts. I'm using the Apache
> POI
> >>>> in
> >>>> order to export that DB to an Excel file. Anyway, let's presume the
> >>>> contacts are 2000. When I start creating the workbook, the contacts
> are
> >>>> inserted pretty quick, but when the workbook begins to get bigger and
> >>>> bigger, things are terribly slowed down.
> >>>>
> >>>> Is there a way to flush the workbook from time to time? I didn't see
> >>>> such
> >>>> opportunity in the API.
> >>>>
> >>>> Best,
> >>>> Martin
> >>>>
> >>>>
> >>>
> >>> --
> >>> View this message in context:
> >>>
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
> >>> Sent from the POI - User mailing list archive at Nabble.com.
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >>> For additional commands, e-mail: user-help@poi.apache.org
> >>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >>> For additional commands, e-mail: user-help@poi.apache.org
> >>>
> >>>
> >>>
> >>
> >> --
> >> View this message in context:
> >>
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> >> Sent from the POI - User mailing list archive at Nabble.com.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >> For additional commands, e-mail: user-help@poi.apache.org
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> >> For additional commands, e-mail: user-help@poi.apache.org
> >>
> >>
> >>
> >
> > --
> > View this message in context:
> >
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> > Sent from the POI - User mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> > For additional commands, e-mail: user-help@poi.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> > For additional commands, e-mail: user-help@poi.apache.org
> >
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
> Sent from the POI - User mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
>
>

RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Hello, everyone!

The problem was that sheet.autoResizeColumn() was called on every single cell creation. Fortunately, I saw where the delay was coming from!

Thank you all for the help!

Best,
Martin

-----Original Message-----
From: Richard Holmes [mailto:richard@shedconsulting.co.uk] 
Sent: Monday, April 26, 2010 10:23 AM
To: Martin Asenov
Subject: Re: flush workbook from time to time

Hi martin

Ok was just wondering how the iterators worked specifically the inner iteration does it use queries against the database and are you explicitly closing result sets

Richard
-----Original Message-----
From: Martin Asenov <mA...@velti.com>
Date: Mon, 26 Apr 2010 10:16:09 
To: POI Users List<us...@poi.apache.org>; richard@shedconsulting.co.uk<ri...@shedconsulting.co.uk>
Subject: RE: flush workbook from time to time

Hi, Richard!

The two classes are too big and there is nothing special about them - the manager only uses underlying Hibernate layer to do queries, and the Contact class is just a bean.

The problem here is as the workbook grows up, things are terribly slowed down.

Best,
Martin

-----Original Message-----
From: Richard Holmes [mailto:richard@shedconsulting.co.uk] 
Sent: Monday, April 26, 2010 10:12 AM
To: POI Users List
Subject: Re: flush workbook from time to time

Hi martin 

Can you send the manager and contact class too 

I will look later when I am at my computer
Thanks
Richard
-----Original Message-----
From: Martin Asenov <mA...@velti.com>
Date: Mon, 26 Apr 2010 09:46:35 
To: POI Users List<us...@poi.apache.org>
Subject: RE: flush workbook from time to time


Hello, everyone!

I provide the snippet of code I'm using:

@Component
public class PhonebookExporterXLS {
	public static final String GROUP_STR = "*";
	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";

	private ManagerBean manager;
	private SMSOfficeProperties props;

	public void setProps(SMSOfficeProperties props) {
		this.props = props;
	}

	private Domain domain;
	private User actor;

	FileOutputStream fos;

	public void setManager(ManagerBean manager) {
		this.manager = manager;
	}

	public PhonebookExporterXLS() {
	}

	public PhonebookExporterXLS(User actor, Domain domain) {
		this.domain = domain;
		this.actor = actor;
	}

	// Excel tools
	HSSFWorkbook wb;
	HSSFSheet sheet;

	int rowCounterContacts = 0;

	public File export(String path) throws SMSOfficeException {

		File exportedFile = new File(path + String.format(props.get(OUTPUT_FILE_NAME), StatisticsExporter.fileDateFormat.format(new Date())));
		exportedFile.getParentFile().mkdirs();

		int currentRowCellsCount;

		try {
			fos = new FileOutputStream(exportedFile);
			wb = new HSSFWorkbook();
			sheet = wb.createSheet("Contacts");

			for (Contact currentContact : manager.getContacts(domain)) {
				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);

				currentRowCellsCount = 0;

				// Number cell(s) - (String type)
				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
					Cell cellNumber = currentContactRow.createCell(currentRowCellsCount++);
					cellNumber.setCellValue(currentNumber.getFullNumber());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}

				// First name cell - (String type)
				Cell cellFirstName = currentContactRow.createCell(currentRowCellsCount++);
				cellFirstName.setCellValue(currentContact.getFirstName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				// Last name cell - (String type)
				Cell cellLastName = currentContactRow.createCell(currentRowCellsCount++);
				cellLastName.setCellValue(currentContact.getLastName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				for (Group currentGroup : currentContact.getGroups()) {
					Cell cellCurrentGroup = currentContactRow.createCell(currentRowCellsCount++);
					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}
			}

			// Writing the workbook to FileOutputStream...
			wb.write(fos);
			fos.flush();

		} catch (IOException ioe) {
			ioe.printStackTrace();
		} finally {
			rowCounterContacts = 0;

			try {
				fos.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}

		return exportedFile;

	}

	public void setUser(User user) {
		actor = user;
	}

	public void setDomain(Domain domain) {
		this.domain = domain;
	}
}

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Sunday, April 25, 2010 10:45 AM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


I should have expected you to look into CSV files, sorry about that. Must
admit, I also like Richard's suggestion; it could all be down to the JVM not
reclaiming space.

Until Monday.

Yours

Mark B


Martin Asenov wrote:
> 
> Hi Mark,
> 
> Well I've got both CSV and XLS export - you know about line separators and
> all this stuff that is system dependent, so we provide the both variants. 
> 
> Let's discuss it in Monday... Have a nice weekend
> 
> Thank you,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:59 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Old age is really slowing me down these days - and I can blame working for
> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> thorns out of my fingers now! - but have you looked into generating a CSV
> file and then 'opening' this with Excel?
> 
> I have never tried it myself but I think it should be possible to create a
> template using Excel - set all of the cell formats correctly and so on.
> Then
> you could simply open a CSV file and populate the respective cells with
> data
> and I am guessing that the formatting information will be preserved; your
> application would then only have to create/edit the CSV file.
> 
> It should be easy enough to mock up a simple test to see if it would work
> and although it would add some complexity to the task - opening the Excel
> template and then importing the data - that might be worthwhile. A macro
> could even be used to automate the process - something that runs when the
> workbook is opened to import the data from the CSV file maybe.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> No, I've been using old good system.out.println(); statements - and
>> noticed that the gap between different insertions into the workbook
>> slighly increases on every insertion. Actually, the output file was 212
>> kb
>> large - it is surprising for me that file of such size could be generated
>> in so large amount of time. Something must be wrong...
>> 
>> Best,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:24 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Oh, well in that case the BigGridDemo would not work for you as it was
>> only
>> applicable to the OOXML based file format. If you are using the older
>> binary
>> format (.xls) then the answer is a certain no to streaming the file in
>> chunks I am sorry to say. The reason lies in the structure of the Excel
>> file
>> itself, I am not completely certain of the details but understand that it
>> consists of a series of streams and is block structured - that is why the
>> files sizes are always multiples of 512 or 4096 I believe. As a result, a
>> model of the file must be assembled in memory firstly, formatted and then
>> written out.
>> 
>> Must admit that I am surprised you are experiencing problems with
>> performance if you are using the HSSF stream as that is a more common
>> problem with the OOXML files. Have you tried running a profiler on the
>> code
>> to see where the bottlenecks are?
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hi, Mark, thanks for the quick reply.
>>> 
>>> I don't know exactly what format we're exporting into, but I think it's
>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>> thing is not written by me, so I'm unaware of more details about it.
>>> 
>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>> commonly in 4-5 horizontal cells.
>>> 
>>> The point is that in the UI when the user clicks on 'export' we're
>>> supposed to supply him with the exported file. Delaying is not a
>>> problem,
>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>> a
>>> way to work around this...
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 5:17 PM
>>> To: user@poi.apache.org
>>> Subject: Re: flush workbook from time to time
>>> 
>>> 
>>> Can I ask which version of the file format you are targetting please,
>>> the
>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>> have
>>> the link at hand currently - where he streams the contents of a file
>>> into
>>> a
>>> template to create large worksheets without encountering out of memory
>>> exceptions. It is not a fully fledged utility but rather a proof that
>>> you
>>> may - and that is may - be able to adapt to handle your current
>>> requirement.
>>> I do not know if you would be able to modify the program to, for
>>> example,
>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>> at
>>> a
>>> later date but I feel it could be worth looking into.
>>> 
>>> Aside from that, there is no way to currently stream the document out
>>> bit
>>> by
>>> bit I am sorry to say. It should be possible and has, I believe, been
>>> discussed by the developers but would require some programming effort
>>> and
>>> no
>>> one has stepped up to sponsor the work.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hello, everyone!
>>>> 
>>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>>> in
>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>> contacts are 2000. When I start creating the workbook, the contacts are
>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>> bigger, things are terribly slowed down.
>>>> 
>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>> such
>>>> opportunity in the API.
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Hi, Richard!

The two classes are too big and there is nothing special about them - the manager only uses underlying Hibernate layer to do queries, and the Contact class is just a bean.

The problem here is as the workbook grows up, things are terribly slowed down.

Best,
Martin

-----Original Message-----
From: Richard Holmes [mailto:richard@shedconsulting.co.uk] 
Sent: Monday, April 26, 2010 10:12 AM
To: POI Users List
Subject: Re: flush workbook from time to time

Hi martin 

Can you send the manager and contact class too 

I will look later when I am at my computer
Thanks
Richard
-----Original Message-----
From: Martin Asenov <mA...@velti.com>
Date: Mon, 26 Apr 2010 09:46:35 
To: POI Users List<us...@poi.apache.org>
Subject: RE: flush workbook from time to time


Hello, everyone!

I provide the snippet of code I'm using:

@Component
public class PhonebookExporterXLS {
	public static final String GROUP_STR = "*";
	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";

	private ManagerBean manager;
	private SMSOfficeProperties props;

	public void setProps(SMSOfficeProperties props) {
		this.props = props;
	}

	private Domain domain;
	private User actor;

	FileOutputStream fos;

	public void setManager(ManagerBean manager) {
		this.manager = manager;
	}

	public PhonebookExporterXLS() {
	}

	public PhonebookExporterXLS(User actor, Domain domain) {
		this.domain = domain;
		this.actor = actor;
	}

	// Excel tools
	HSSFWorkbook wb;
	HSSFSheet sheet;

	int rowCounterContacts = 0;

	public File export(String path) throws SMSOfficeException {

		File exportedFile = new File(path + String.format(props.get(OUTPUT_FILE_NAME), StatisticsExporter.fileDateFormat.format(new Date())));
		exportedFile.getParentFile().mkdirs();

		int currentRowCellsCount;

		try {
			fos = new FileOutputStream(exportedFile);
			wb = new HSSFWorkbook();
			sheet = wb.createSheet("Contacts");

			for (Contact currentContact : manager.getContacts(domain)) {
				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);

				currentRowCellsCount = 0;

				// Number cell(s) - (String type)
				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
					Cell cellNumber = currentContactRow.createCell(currentRowCellsCount++);
					cellNumber.setCellValue(currentNumber.getFullNumber());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}

				// First name cell - (String type)
				Cell cellFirstName = currentContactRow.createCell(currentRowCellsCount++);
				cellFirstName.setCellValue(currentContact.getFirstName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				// Last name cell - (String type)
				Cell cellLastName = currentContactRow.createCell(currentRowCellsCount++);
				cellLastName.setCellValue(currentContact.getLastName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				for (Group currentGroup : currentContact.getGroups()) {
					Cell cellCurrentGroup = currentContactRow.createCell(currentRowCellsCount++);
					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}
			}

			// Writing the workbook to FileOutputStream...
			wb.write(fos);
			fos.flush();

		} catch (IOException ioe) {
			ioe.printStackTrace();
		} finally {
			rowCounterContacts = 0;

			try {
				fos.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}

		return exportedFile;

	}

	public void setUser(User user) {
		actor = user;
	}

	public void setDomain(Domain domain) {
		this.domain = domain;
	}
}

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Sunday, April 25, 2010 10:45 AM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


I should have expected you to look into CSV files, sorry about that. Must
admit, I also like Richard's suggestion; it could all be down to the JVM not
reclaiming space.

Until Monday.

Yours

Mark B


Martin Asenov wrote:
> 
> Hi Mark,
> 
> Well I've got both CSV and XLS export - you know about line separators and
> all this stuff that is system dependent, so we provide the both variants. 
> 
> Let's discuss it in Monday... Have a nice weekend
> 
> Thank you,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:59 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Old age is really slowing me down these days - and I can blame working for
> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> thorns out of my fingers now! - but have you looked into generating a CSV
> file and then 'opening' this with Excel?
> 
> I have never tried it myself but I think it should be possible to create a
> template using Excel - set all of the cell formats correctly and so on.
> Then
> you could simply open a CSV file and populate the respective cells with
> data
> and I am guessing that the formatting information will be preserved; your
> application would then only have to create/edit the CSV file.
> 
> It should be easy enough to mock up a simple test to see if it would work
> and although it would add some complexity to the task - opening the Excel
> template and then importing the data - that might be worthwhile. A macro
> could even be used to automate the process - something that runs when the
> workbook is opened to import the data from the CSV file maybe.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> No, I've been using old good system.out.println(); statements - and
>> noticed that the gap between different insertions into the workbook
>> slighly increases on every insertion. Actually, the output file was 212
>> kb
>> large - it is surprising for me that file of such size could be generated
>> in so large amount of time. Something must be wrong...
>> 
>> Best,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:24 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Oh, well in that case the BigGridDemo would not work for you as it was
>> only
>> applicable to the OOXML based file format. If you are using the older
>> binary
>> format (.xls) then the answer is a certain no to streaming the file in
>> chunks I am sorry to say. The reason lies in the structure of the Excel
>> file
>> itself, I am not completely certain of the details but understand that it
>> consists of a series of streams and is block structured - that is why the
>> files sizes are always multiples of 512 or 4096 I believe. As a result, a
>> model of the file must be assembled in memory firstly, formatted and then
>> written out.
>> 
>> Must admit that I am surprised you are experiencing problems with
>> performance if you are using the HSSF stream as that is a more common
>> problem with the OOXML files. Have you tried running a profiler on the
>> code
>> to see where the bottlenecks are?
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hi, Mark, thanks for the quick reply.
>>> 
>>> I don't know exactly what format we're exporting into, but I think it's
>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>> thing is not written by me, so I'm unaware of more details about it.
>>> 
>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>> commonly in 4-5 horizontal cells.
>>> 
>>> The point is that in the UI when the user clicks on 'export' we're
>>> supposed to supply him with the exported file. Delaying is not a
>>> problem,
>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>> a
>>> way to work around this...
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 5:17 PM
>>> To: user@poi.apache.org
>>> Subject: Re: flush workbook from time to time
>>> 
>>> 
>>> Can I ask which version of the file format you are targetting please,
>>> the
>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>> have
>>> the link at hand currently - where he streams the contents of a file
>>> into
>>> a
>>> template to create large worksheets without encountering out of memory
>>> exceptions. It is not a fully fledged utility but rather a proof that
>>> you
>>> may - and that is may - be able to adapt to handle your current
>>> requirement.
>>> I do not know if you would be able to modify the program to, for
>>> example,
>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>> at
>>> a
>>> later date but I feel it could be worth looking into.
>>> 
>>> Aside from that, there is no way to currently stream the document out
>>> bit
>>> by
>>> bit I am sorry to say. It should be possible and has, I believe, been
>>> discussed by the developers but would require some programming effort
>>> and
>>> no
>>> one has stepped up to sponsor the work.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hello, everyone!
>>>> 
>>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>>> in
>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>> contacts are 2000. When I start creating the workbook, the contacts are
>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>> bigger, things are terribly slowed down.
>>>> 
>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>> such
>>>> opportunity in the API.
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: flush workbook from time to time

Posted by Richard Holmes <ri...@shedconsulting.co.uk>.
Hi martin 

Can you send the manager and contact class too 

I will look later when I am at my computer
Thanks
Richard
-----Original Message-----
From: Martin Asenov <mA...@velti.com>
Date: Mon, 26 Apr 2010 09:46:35 
To: POI Users List<us...@poi.apache.org>
Subject: RE: flush workbook from time to time


Hello, everyone!

I provide the snippet of code I'm using:

@Component
public class PhonebookExporterXLS {
	public static final String GROUP_STR = "*";
	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";

	private ManagerBean manager;
	private SMSOfficeProperties props;

	public void setProps(SMSOfficeProperties props) {
		this.props = props;
	}

	private Domain domain;
	private User actor;

	FileOutputStream fos;

	public void setManager(ManagerBean manager) {
		this.manager = manager;
	}

	public PhonebookExporterXLS() {
	}

	public PhonebookExporterXLS(User actor, Domain domain) {
		this.domain = domain;
		this.actor = actor;
	}

	// Excel tools
	HSSFWorkbook wb;
	HSSFSheet sheet;

	int rowCounterContacts = 0;

	public File export(String path) throws SMSOfficeException {

		File exportedFile = new File(path + String.format(props.get(OUTPUT_FILE_NAME), StatisticsExporter.fileDateFormat.format(new Date())));
		exportedFile.getParentFile().mkdirs();

		int currentRowCellsCount;

		try {
			fos = new FileOutputStream(exportedFile);
			wb = new HSSFWorkbook();
			sheet = wb.createSheet("Contacts");

			for (Contact currentContact : manager.getContacts(domain)) {
				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);

				currentRowCellsCount = 0;

				// Number cell(s) - (String type)
				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
					Cell cellNumber = currentContactRow.createCell(currentRowCellsCount++);
					cellNumber.setCellValue(currentNumber.getFullNumber());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}

				// First name cell - (String type)
				Cell cellFirstName = currentContactRow.createCell(currentRowCellsCount++);
				cellFirstName.setCellValue(currentContact.getFirstName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				// Last name cell - (String type)
				Cell cellLastName = currentContactRow.createCell(currentRowCellsCount++);
				cellLastName.setCellValue(currentContact.getLastName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				for (Group currentGroup : currentContact.getGroups()) {
					Cell cellCurrentGroup = currentContactRow.createCell(currentRowCellsCount++);
					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}
			}

			// Writing the workbook to FileOutputStream...
			wb.write(fos);
			fos.flush();

		} catch (IOException ioe) {
			ioe.printStackTrace();
		} finally {
			rowCounterContacts = 0;

			try {
				fos.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}

		return exportedFile;

	}

	public void setUser(User user) {
		actor = user;
	}

	public void setDomain(Domain domain) {
		this.domain = domain;
	}
}

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Sunday, April 25, 2010 10:45 AM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


I should have expected you to look into CSV files, sorry about that. Must
admit, I also like Richard's suggestion; it could all be down to the JVM not
reclaiming space.

Until Monday.

Yours

Mark B


Martin Asenov wrote:
> 
> Hi Mark,
> 
> Well I've got both CSV and XLS export - you know about line separators and
> all this stuff that is system dependent, so we provide the both variants. 
> 
> Let's discuss it in Monday... Have a nice weekend
> 
> Thank you,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:59 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Old age is really slowing me down these days - and I can blame working for
> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> thorns out of my fingers now! - but have you looked into generating a CSV
> file and then 'opening' this with Excel?
> 
> I have never tried it myself but I think it should be possible to create a
> template using Excel - set all of the cell formats correctly and so on.
> Then
> you could simply open a CSV file and populate the respective cells with
> data
> and I am guessing that the formatting information will be preserved; your
> application would then only have to create/edit the CSV file.
> 
> It should be easy enough to mock up a simple test to see if it would work
> and although it would add some complexity to the task - opening the Excel
> template and then importing the data - that might be worthwhile. A macro
> could even be used to automate the process - something that runs when the
> workbook is opened to import the data from the CSV file maybe.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> No, I've been using old good system.out.println(); statements - and
>> noticed that the gap between different insertions into the workbook
>> slighly increases on every insertion. Actually, the output file was 212
>> kb
>> large - it is surprising for me that file of such size could be generated
>> in so large amount of time. Something must be wrong...
>> 
>> Best,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:24 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Oh, well in that case the BigGridDemo would not work for you as it was
>> only
>> applicable to the OOXML based file format. If you are using the older
>> binary
>> format (.xls) then the answer is a certain no to streaming the file in
>> chunks I am sorry to say. The reason lies in the structure of the Excel
>> file
>> itself, I am not completely certain of the details but understand that it
>> consists of a series of streams and is block structured - that is why the
>> files sizes are always multiples of 512 or 4096 I believe. As a result, a
>> model of the file must be assembled in memory firstly, formatted and then
>> written out.
>> 
>> Must admit that I am surprised you are experiencing problems with
>> performance if you are using the HSSF stream as that is a more common
>> problem with the OOXML files. Have you tried running a profiler on the
>> code
>> to see where the bottlenecks are?
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hi, Mark, thanks for the quick reply.
>>> 
>>> I don't know exactly what format we're exporting into, but I think it's
>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>> thing is not written by me, so I'm unaware of more details about it.
>>> 
>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>> commonly in 4-5 horizontal cells.
>>> 
>>> The point is that in the UI when the user clicks on 'export' we're
>>> supposed to supply him with the exported file. Delaying is not a
>>> problem,
>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>> a
>>> way to work around this...
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 5:17 PM
>>> To: user@poi.apache.org
>>> Subject: Re: flush workbook from time to time
>>> 
>>> 
>>> Can I ask which version of the file format you are targetting please,
>>> the
>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>> have
>>> the link at hand currently - where he streams the contents of a file
>>> into
>>> a
>>> template to create large worksheets without encountering out of memory
>>> exceptions. It is not a fully fledged utility but rather a proof that
>>> you
>>> may - and that is may - be able to adapt to handle your current
>>> requirement.
>>> I do not know if you would be able to modify the program to, for
>>> example,
>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>> at
>>> a
>>> later date but I feel it could be worth looking into.
>>> 
>>> Aside from that, there is no way to currently stream the document out
>>> bit
>>> by
>>> bit I am sorry to say. It should be possible and has, I believe, been
>>> discussed by the developers but would require some programming effort
>>> and
>>> no
>>> one has stepped up to sponsor the work.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hello, everyone!
>>>> 
>>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>>> in
>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>> contacts are 2000. When I start creating the workbook, the contacts are
>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>> bigger, things are terribly slowed down.
>>>> 
>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>> such
>>>> opportunity in the API.
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
It causes the problem :)

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Monday, April 26, 2010 10:00 AM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


Morning Martin,

Forgive the stupid question but does this code solve the problem or is it
exhibiting the issue you described?

Yours

Mark B


Martin Asenov wrote:
> 
> 
> Hello, everyone!
> 
> I provide the snippet of code I'm using:
> 
> @Component
> public class PhonebookExporterXLS {
> 	public static final String GROUP_STR = "*";
> 	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";
> 
> 	private ManagerBean manager;
> 	private SMSOfficeProperties props;
> 
> 	public void setProps(SMSOfficeProperties props) {
> 		this.props = props;
> 	}
> 
> 	private Domain domain;
> 	private User actor;
> 
> 	FileOutputStream fos;
> 
> 	public void setManager(ManagerBean manager) {
> 		this.manager = manager;
> 	}
> 
> 	public PhonebookExporterXLS() {
> 	}
> 
> 	public PhonebookExporterXLS(User actor, Domain domain) {
> 		this.domain = domain;
> 		this.actor = actor;
> 	}
> 
> 	// Excel tools
> 	HSSFWorkbook wb;
> 	HSSFSheet sheet;
> 
> 	int rowCounterContacts = 0;
> 
> 	public File export(String path) throws SMSOfficeException {
> 
> 		File exportedFile = new File(path +
> String.format(props.get(OUTPUT_FILE_NAME),
> StatisticsExporter.fileDateFormat.format(new Date())));
> 		exportedFile.getParentFile().mkdirs();
> 
> 		int currentRowCellsCount;
> 
> 		try {
> 			fos = new FileOutputStream(exportedFile);
> 			wb = new HSSFWorkbook();
> 			sheet = wb.createSheet("Contacts");
> 
> 			for (Contact currentContact : manager.getContacts(domain)) {
> 				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);
> 
> 				currentRowCellsCount = 0;
> 
> 				// Number cell(s) - (String type)
> 				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
> 					Cell cellNumber =
> currentContactRow.createCell(currentRowCellsCount++);
> 					cellNumber.setCellValue(currentNumber.getFullNumber());
> 					sheet.autoSizeColumn(currentRowCellsCount - 1);
> 				}
> 
> 				// First name cell - (String type)
> 				Cell cellFirstName =
> currentContactRow.createCell(currentRowCellsCount++);
> 				cellFirstName.setCellValue(currentContact.getFirstName());
> 				sheet.autoSizeColumn(currentRowCellsCount - 1);
> 
> 				// Last name cell - (String type)
> 				Cell cellLastName =
> currentContactRow.createCell(currentRowCellsCount++);
> 				cellLastName.setCellValue(currentContact.getLastName());
> 				sheet.autoSizeColumn(currentRowCellsCount - 1);
> 
> 				for (Group currentGroup : currentContact.getGroups()) {
> 					Cell cellCurrentGroup =
> currentContactRow.createCell(currentRowCellsCount++);
> 					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
> 					sheet.autoSizeColumn(currentRowCellsCount - 1);
> 				}
> 			}
> 
> 			// Writing the workbook to FileOutputStream...
> 			wb.write(fos);
> 			fos.flush();
> 
> 		} catch (IOException ioe) {
> 			ioe.printStackTrace();
> 		} finally {
> 			rowCounterContacts = 0;
> 
> 			try {
> 				fos.close();
> 			} catch (IOException e) {
> 				e.printStackTrace();
> 			}
> 		}
> 
> 		return exportedFile;
> 
> 	}
> 
> 	public void setUser(User user) {
> 		actor = user;
> 	}
> 
> 	public void setDomain(Domain domain) {
> 		this.domain = domain;
> 	}
> }
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Sunday, April 25, 2010 10:45 AM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> I should have expected you to look into CSV files, sorry about that. Must
> admit, I also like Richard's suggestion; it could all be down to the JVM
> not
> reclaiming space.
> 
> Until Monday.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi Mark,
>> 
>> Well I've got both CSV and XLS export - you know about line separators
>> and
>> all this stuff that is system dependent, so we provide the both variants. 
>> 
>> Let's discuss it in Monday... Have a nice weekend
>> 
>> Thank you,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:59 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Old age is really slowing me down these days - and I can blame working
>> for
>> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
>> thorns out of my fingers now! - but have you looked into generating a CSV
>> file and then 'opening' this with Excel?
>> 
>> I have never tried it myself but I think it should be possible to create
>> a
>> template using Excel - set all of the cell formats correctly and so on.
>> Then
>> you could simply open a CSV file and populate the respective cells with
>> data
>> and I am guessing that the formatting information will be preserved; your
>> application would then only have to create/edit the CSV file.
>> 
>> It should be easy enough to mock up a simple test to see if it would work
>> and although it would add some complexity to the task - opening the Excel
>> template and then importing the data - that might be worthwhile. A macro
>> could even be used to automate the process - something that runs when the
>> workbook is opened to import the data from the CSV file maybe.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> No, I've been using old good system.out.println(); statements - and
>>> noticed that the gap between different insertions into the workbook
>>> slighly increases on every insertion. Actually, the output file was 212
>>> kb
>>> large - it is surprising for me that file of such size could be
>>> generated
>>> in so large amount of time. Something must be wrong...
>>> 
>>> Best,
>>> Martin
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 6:24 PM
>>> To: user@poi.apache.org
>>> Subject: RE: flush workbook from time to time
>>> 
>>> 
>>> Oh, well in that case the BigGridDemo would not work for you as it was
>>> only
>>> applicable to the OOXML based file format. If you are using the older
>>> binary
>>> format (.xls) then the answer is a certain no to streaming the file in
>>> chunks I am sorry to say. The reason lies in the structure of the Excel
>>> file
>>> itself, I am not completely certain of the details but understand that
>>> it
>>> consists of a series of streams and is block structured - that is why
>>> the
>>> files sizes are always multiples of 512 or 4096 I believe. As a result,
>>> a
>>> model of the file must be assembled in memory firstly, formatted and
>>> then
>>> written out.
>>> 
>>> Must admit that I am surprised you are experiencing problems with
>>> performance if you are using the HSSF stream as that is a more common
>>> problem with the OOXML files. Have you tried running a profiler on the
>>> code
>>> to see where the bottlenecks are?
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hi, Mark, thanks for the quick reply.
>>>> 
>>>> I don't know exactly what format we're exporting into, but I think it's
>>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>>> thing is not written by me, so I'm unaware of more details about it.
>>>> 
>>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>>> commonly in 4-5 horizontal cells.
>>>> 
>>>> The point is that in the UI when the user clicks on 'export' we're
>>>> supposed to supply him with the exported file. Delaying is not a
>>>> problem,
>>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>>> a
>>>> way to work around this...
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>>> Sent: Friday, April 23, 2010 5:17 PM
>>>> To: user@poi.apache.org
>>>> Subject: Re: flush workbook from time to time
>>>> 
>>>> 
>>>> Can I ask which version of the file format you are targetting please,
>>>> the
>>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>>> have
>>>> the link at hand currently - where he streams the contents of a file
>>>> into
>>>> a
>>>> template to create large worksheets without encountering out of memory
>>>> exceptions. It is not a fully fledged utility but rather a proof that
>>>> you
>>>> may - and that is may - be able to adapt to handle your current
>>>> requirement.
>>>> I do not know if you would be able to modify the program to, for
>>>> example,
>>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>>> at
>>>> a
>>>> later date but I feel it could be worth looking into.
>>>> 
>>>> Aside from that, there is no way to currently stream the document out
>>>> bit
>>>> by
>>>> bit I am sorry to say. It should be possible and has, I believe, been
>>>> discussed by the developers but would require some programming effort
>>>> and
>>>> no
>>>> one has stepped up to sponsor the work.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Martin Asenov wrote:
>>>>> 
>>>>> Hello, everyone!
>>>>> 
>>>>> I've got some DB which contains some contacts. I'm using the Apache
>>>>> POI
>>>>> in
>>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>>> contacts are 2000. When I start creating the workbook, the contacts
>>>>> are
>>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>>> bigger, things are terribly slowed down.
>>>>> 
>>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>>> such
>>>>> opportunity in the API.
>>>>> 
>>>>> Best,
>>>>> Martin
>>>>> 
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28361478.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Morning Martin,

Forgive the stupid question but does this code solve the problem or is it
exhibiting the issue you described?

Yours

Mark B


Martin Asenov wrote:
> 
> 
> Hello, everyone!
> 
> I provide the snippet of code I'm using:
> 
> @Component
> public class PhonebookExporterXLS {
> 	public static final String GROUP_STR = "*";
> 	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";
> 
> 	private ManagerBean manager;
> 	private SMSOfficeProperties props;
> 
> 	public void setProps(SMSOfficeProperties props) {
> 		this.props = props;
> 	}
> 
> 	private Domain domain;
> 	private User actor;
> 
> 	FileOutputStream fos;
> 
> 	public void setManager(ManagerBean manager) {
> 		this.manager = manager;
> 	}
> 
> 	public PhonebookExporterXLS() {
> 	}
> 
> 	public PhonebookExporterXLS(User actor, Domain domain) {
> 		this.domain = domain;
> 		this.actor = actor;
> 	}
> 
> 	// Excel tools
> 	HSSFWorkbook wb;
> 	HSSFSheet sheet;
> 
> 	int rowCounterContacts = 0;
> 
> 	public File export(String path) throws SMSOfficeException {
> 
> 		File exportedFile = new File(path +
> String.format(props.get(OUTPUT_FILE_NAME),
> StatisticsExporter.fileDateFormat.format(new Date())));
> 		exportedFile.getParentFile().mkdirs();
> 
> 		int currentRowCellsCount;
> 
> 		try {
> 			fos = new FileOutputStream(exportedFile);
> 			wb = new HSSFWorkbook();
> 			sheet = wb.createSheet("Contacts");
> 
> 			for (Contact currentContact : manager.getContacts(domain)) {
> 				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);
> 
> 				currentRowCellsCount = 0;
> 
> 				// Number cell(s) - (String type)
> 				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
> 					Cell cellNumber =
> currentContactRow.createCell(currentRowCellsCount++);
> 					cellNumber.setCellValue(currentNumber.getFullNumber());
> 					sheet.autoSizeColumn(currentRowCellsCount - 1);
> 				}
> 
> 				// First name cell - (String type)
> 				Cell cellFirstName =
> currentContactRow.createCell(currentRowCellsCount++);
> 				cellFirstName.setCellValue(currentContact.getFirstName());
> 				sheet.autoSizeColumn(currentRowCellsCount - 1);
> 
> 				// Last name cell - (String type)
> 				Cell cellLastName =
> currentContactRow.createCell(currentRowCellsCount++);
> 				cellLastName.setCellValue(currentContact.getLastName());
> 				sheet.autoSizeColumn(currentRowCellsCount - 1);
> 
> 				for (Group currentGroup : currentContact.getGroups()) {
> 					Cell cellCurrentGroup =
> currentContactRow.createCell(currentRowCellsCount++);
> 					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
> 					sheet.autoSizeColumn(currentRowCellsCount - 1);
> 				}
> 			}
> 
> 			// Writing the workbook to FileOutputStream...
> 			wb.write(fos);
> 			fos.flush();
> 
> 		} catch (IOException ioe) {
> 			ioe.printStackTrace();
> 		} finally {
> 			rowCounterContacts = 0;
> 
> 			try {
> 				fos.close();
> 			} catch (IOException e) {
> 				e.printStackTrace();
> 			}
> 		}
> 
> 		return exportedFile;
> 
> 	}
> 
> 	public void setUser(User user) {
> 		actor = user;
> 	}
> 
> 	public void setDomain(Domain domain) {
> 		this.domain = domain;
> 	}
> }
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Sunday, April 25, 2010 10:45 AM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> I should have expected you to look into CSV files, sorry about that. Must
> admit, I also like Richard's suggestion; it could all be down to the JVM
> not
> reclaiming space.
> 
> Until Monday.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi Mark,
>> 
>> Well I've got both CSV and XLS export - you know about line separators
>> and
>> all this stuff that is system dependent, so we provide the both variants. 
>> 
>> Let's discuss it in Monday... Have a nice weekend
>> 
>> Thank you,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:59 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Old age is really slowing me down these days - and I can blame working
>> for
>> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
>> thorns out of my fingers now! - but have you looked into generating a CSV
>> file and then 'opening' this with Excel?
>> 
>> I have never tried it myself but I think it should be possible to create
>> a
>> template using Excel - set all of the cell formats correctly and so on.
>> Then
>> you could simply open a CSV file and populate the respective cells with
>> data
>> and I am guessing that the formatting information will be preserved; your
>> application would then only have to create/edit the CSV file.
>> 
>> It should be easy enough to mock up a simple test to see if it would work
>> and although it would add some complexity to the task - opening the Excel
>> template and then importing the data - that might be worthwhile. A macro
>> could even be used to automate the process - something that runs when the
>> workbook is opened to import the data from the CSV file maybe.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> No, I've been using old good system.out.println(); statements - and
>>> noticed that the gap between different insertions into the workbook
>>> slighly increases on every insertion. Actually, the output file was 212
>>> kb
>>> large - it is surprising for me that file of such size could be
>>> generated
>>> in so large amount of time. Something must be wrong...
>>> 
>>> Best,
>>> Martin
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 6:24 PM
>>> To: user@poi.apache.org
>>> Subject: RE: flush workbook from time to time
>>> 
>>> 
>>> Oh, well in that case the BigGridDemo would not work for you as it was
>>> only
>>> applicable to the OOXML based file format. If you are using the older
>>> binary
>>> format (.xls) then the answer is a certain no to streaming the file in
>>> chunks I am sorry to say. The reason lies in the structure of the Excel
>>> file
>>> itself, I am not completely certain of the details but understand that
>>> it
>>> consists of a series of streams and is block structured - that is why
>>> the
>>> files sizes are always multiples of 512 or 4096 I believe. As a result,
>>> a
>>> model of the file must be assembled in memory firstly, formatted and
>>> then
>>> written out.
>>> 
>>> Must admit that I am surprised you are experiencing problems with
>>> performance if you are using the HSSF stream as that is a more common
>>> problem with the OOXML files. Have you tried running a profiler on the
>>> code
>>> to see where the bottlenecks are?
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hi, Mark, thanks for the quick reply.
>>>> 
>>>> I don't know exactly what format we're exporting into, but I think it's
>>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>>> thing is not written by me, so I'm unaware of more details about it.
>>>> 
>>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>>> commonly in 4-5 horizontal cells.
>>>> 
>>>> The point is that in the UI when the user clicks on 'export' we're
>>>> supposed to supply him with the exported file. Delaying is not a
>>>> problem,
>>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>>> a
>>>> way to work around this...
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>>> Sent: Friday, April 23, 2010 5:17 PM
>>>> To: user@poi.apache.org
>>>> Subject: Re: flush workbook from time to time
>>>> 
>>>> 
>>>> Can I ask which version of the file format you are targetting please,
>>>> the
>>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>>> have
>>>> the link at hand currently - where he streams the contents of a file
>>>> into
>>>> a
>>>> template to create large worksheets without encountering out of memory
>>>> exceptions. It is not a fully fledged utility but rather a proof that
>>>> you
>>>> may - and that is may - be able to adapt to handle your current
>>>> requirement.
>>>> I do not know if you would be able to modify the program to, for
>>>> example,
>>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>>> at
>>>> a
>>>> later date but I feel it could be worth looking into.
>>>> 
>>>> Aside from that, there is no way to currently stream the document out
>>>> bit
>>>> by
>>>> bit I am sorry to say. It should be possible and has, I believe, been
>>>> discussed by the developers but would require some programming effort
>>>> and
>>>> no
>>>> one has stepped up to sponsor the work.
>>>> 
>>>> Yours
>>>> 
>>>> Mark B
>>>> 
>>>> 
>>>> Martin Asenov wrote:
>>>>> 
>>>>> Hello, everyone!
>>>>> 
>>>>> I've got some DB which contains some contacts. I'm using the Apache
>>>>> POI
>>>>> in
>>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>>> contacts are 2000. When I start creating the workbook, the contacts
>>>>> are
>>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>>> bigger, things are terribly slowed down.
>>>>> 
>>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>>> such
>>>>> opportunity in the API.
>>>>> 
>>>>> Best,
>>>>> Martin
>>>>> 
>>>>> 
>>>> 
>>>> -- 
>>>> View this message in context:
>>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>>> Sent from the POI - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>>> For additional commands, e-mail: user-help@poi.apache.org
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28361478.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Hello, everyone!

I provide the snippet of code I'm using:

@Component
public class PhonebookExporterXLS {
	public static final String GROUP_STR = "*";
	public static final String OUTPUT_FILE_NAME = "contactsExportToXLS";

	private ManagerBean manager;
	private SMSOfficeProperties props;

	public void setProps(SMSOfficeProperties props) {
		this.props = props;
	}

	private Domain domain;
	private User actor;

	FileOutputStream fos;

	public void setManager(ManagerBean manager) {
		this.manager = manager;
	}

	public PhonebookExporterXLS() {
	}

	public PhonebookExporterXLS(User actor, Domain domain) {
		this.domain = domain;
		this.actor = actor;
	}

	// Excel tools
	HSSFWorkbook wb;
	HSSFSheet sheet;

	int rowCounterContacts = 0;

	public File export(String path) throws SMSOfficeException {

		File exportedFile = new File(path + String.format(props.get(OUTPUT_FILE_NAME), StatisticsExporter.fileDateFormat.format(new Date())));
		exportedFile.getParentFile().mkdirs();

		int currentRowCellsCount;

		try {
			fos = new FileOutputStream(exportedFile);
			wb = new HSSFWorkbook();
			sheet = wb.createSheet("Contacts");

			for (Contact currentContact : manager.getContacts(domain)) {
				HSSFRow currentContactRow = sheet.createRow(rowCounterContacts++);

				currentRowCellsCount = 0;

				// Number cell(s) - (String type)
				for (PhoneNumber currentNumber : currentContact.getNumbers()) {
					Cell cellNumber = currentContactRow.createCell(currentRowCellsCount++);
					cellNumber.setCellValue(currentNumber.getFullNumber());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}

				// First name cell - (String type)
				Cell cellFirstName = currentContactRow.createCell(currentRowCellsCount++);
				cellFirstName.setCellValue(currentContact.getFirstName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				// Last name cell - (String type)
				Cell cellLastName = currentContactRow.createCell(currentRowCellsCount++);
				cellLastName.setCellValue(currentContact.getLastName());
				sheet.autoSizeColumn(currentRowCellsCount - 1);

				for (Group currentGroup : currentContact.getGroups()) {
					Cell cellCurrentGroup = currentContactRow.createCell(currentRowCellsCount++);
					cellCurrentGroup.setCellValue(GROUP_STR + currentGroup.getName());
					sheet.autoSizeColumn(currentRowCellsCount - 1);
				}
			}

			// Writing the workbook to FileOutputStream...
			wb.write(fos);
			fos.flush();

		} catch (IOException ioe) {
			ioe.printStackTrace();
		} finally {
			rowCounterContacts = 0;

			try {
				fos.close();
			} catch (IOException e) {
				e.printStackTrace();
			}
		}

		return exportedFile;

	}

	public void setUser(User user) {
		actor = user;
	}

	public void setDomain(Domain domain) {
		this.domain = domain;
	}
}

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Sunday, April 25, 2010 10:45 AM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


I should have expected you to look into CSV files, sorry about that. Must
admit, I also like Richard's suggestion; it could all be down to the JVM not
reclaiming space.

Until Monday.

Yours

Mark B


Martin Asenov wrote:
> 
> Hi Mark,
> 
> Well I've got both CSV and XLS export - you know about line separators and
> all this stuff that is system dependent, so we provide the both variants. 
> 
> Let's discuss it in Monday... Have a nice weekend
> 
> Thank you,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:59 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Old age is really slowing me down these days - and I can blame working for
> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> thorns out of my fingers now! - but have you looked into generating a CSV
> file and then 'opening' this with Excel?
> 
> I have never tried it myself but I think it should be possible to create a
> template using Excel - set all of the cell formats correctly and so on.
> Then
> you could simply open a CSV file and populate the respective cells with
> data
> and I am guessing that the formatting information will be preserved; your
> application would then only have to create/edit the CSV file.
> 
> It should be easy enough to mock up a simple test to see if it would work
> and although it would add some complexity to the task - opening the Excel
> template and then importing the data - that might be worthwhile. A macro
> could even be used to automate the process - something that runs when the
> workbook is opened to import the data from the CSV file maybe.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> No, I've been using old good system.out.println(); statements - and
>> noticed that the gap between different insertions into the workbook
>> slighly increases on every insertion. Actually, the output file was 212
>> kb
>> large - it is surprising for me that file of such size could be generated
>> in so large amount of time. Something must be wrong...
>> 
>> Best,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:24 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Oh, well in that case the BigGridDemo would not work for you as it was
>> only
>> applicable to the OOXML based file format. If you are using the older
>> binary
>> format (.xls) then the answer is a certain no to streaming the file in
>> chunks I am sorry to say. The reason lies in the structure of the Excel
>> file
>> itself, I am not completely certain of the details but understand that it
>> consists of a series of streams and is block structured - that is why the
>> files sizes are always multiples of 512 or 4096 I believe. As a result, a
>> model of the file must be assembled in memory firstly, formatted and then
>> written out.
>> 
>> Must admit that I am surprised you are experiencing problems with
>> performance if you are using the HSSF stream as that is a more common
>> problem with the OOXML files. Have you tried running a profiler on the
>> code
>> to see where the bottlenecks are?
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hi, Mark, thanks for the quick reply.
>>> 
>>> I don't know exactly what format we're exporting into, but I think it's
>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>> thing is not written by me, so I'm unaware of more details about it.
>>> 
>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>> commonly in 4-5 horizontal cells.
>>> 
>>> The point is that in the UI when the user clicks on 'export' we're
>>> supposed to supply him with the exported file. Delaying is not a
>>> problem,
>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>> a
>>> way to work around this...
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 5:17 PM
>>> To: user@poi.apache.org
>>> Subject: Re: flush workbook from time to time
>>> 
>>> 
>>> Can I ask which version of the file format you are targetting please,
>>> the
>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>> have
>>> the link at hand currently - where he streams the contents of a file
>>> into
>>> a
>>> template to create large worksheets without encountering out of memory
>>> exceptions. It is not a fully fledged utility but rather a proof that
>>> you
>>> may - and that is may - be able to adapt to handle your current
>>> requirement.
>>> I do not know if you would be able to modify the program to, for
>>> example,
>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>> at
>>> a
>>> later date but I feel it could be worth looking into.
>>> 
>>> Aside from that, there is no way to currently stream the document out
>>> bit
>>> by
>>> bit I am sorry to say. It should be possible and has, I believe, been
>>> discussed by the developers but would require some programming effort
>>> and
>>> no
>>> one has stepped up to sponsor the work.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hello, everyone!
>>>> 
>>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>>> in
>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>> contacts are 2000. When I start creating the workbook, the contacts are
>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>> bigger, things are terribly slowed down.
>>>> 
>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>> such
>>>> opportunity in the API.
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
I should have expected you to look into CSV files, sorry about that. Must
admit, I also like Richard's suggestion; it could all be down to the JVM not
reclaiming space.

Until Monday.

Yours

Mark B


Martin Asenov wrote:
> 
> Hi Mark,
> 
> Well I've got both CSV and XLS export - you know about line separators and
> all this stuff that is system dependent, so we provide the both variants. 
> 
> Let's discuss it in Monday... Have a nice weekend
> 
> Thank you,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:59 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Old age is really slowing me down these days - and I can blame working for
> the last few days on a 1 in 3 slope removing Hawthorn, still digging the
> thorns out of my fingers now! - but have you looked into generating a CSV
> file and then 'opening' this with Excel?
> 
> I have never tried it myself but I think it should be possible to create a
> template using Excel - set all of the cell formats correctly and so on.
> Then
> you could simply open a CSV file and populate the respective cells with
> data
> and I am guessing that the formatting information will be preserved; your
> application would then only have to create/edit the CSV file.
> 
> It should be easy enough to mock up a simple test to see if it would work
> and although it would add some complexity to the task - opening the Excel
> template and then importing the data - that might be worthwhile. A macro
> could even be used to automate the process - something that runs when the
> workbook is opened to import the data from the CSV file maybe.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> No, I've been using old good system.out.println(); statements - and
>> noticed that the gap between different insertions into the workbook
>> slighly increases on every insertion. Actually, the output file was 212
>> kb
>> large - it is surprising for me that file of such size could be generated
>> in so large amount of time. Something must be wrong...
>> 
>> Best,
>> Martin
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 6:24 PM
>> To: user@poi.apache.org
>> Subject: RE: flush workbook from time to time
>> 
>> 
>> Oh, well in that case the BigGridDemo would not work for you as it was
>> only
>> applicable to the OOXML based file format. If you are using the older
>> binary
>> format (.xls) then the answer is a certain no to streaming the file in
>> chunks I am sorry to say. The reason lies in the structure of the Excel
>> file
>> itself, I am not completely certain of the details but understand that it
>> consists of a series of streams and is block structured - that is why the
>> files sizes are always multiples of 512 or 4096 I believe. As a result, a
>> model of the file must be assembled in memory firstly, formatted and then
>> written out.
>> 
>> Must admit that I am surprised you are experiencing problems with
>> performance if you are using the HSSF stream as that is a more common
>> problem with the OOXML files. Have you tried running a profiler on the
>> code
>> to see where the bottlenecks are?
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hi, Mark, thanks for the quick reply.
>>> 
>>> I don't know exactly what format we're exporting into, but I think it's
>>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>>> thing is not written by me, so I'm unaware of more details about it.
>>> 
>>> Every single contact expands from 2 to up to 20 horizontal cells, but
>>> commonly in 4-5 horizontal cells.
>>> 
>>> The point is that in the UI when the user clicks on 'export' we're
>>> supposed to supply him with the exported file. Delaying is not a
>>> problem,
>>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's
>>> a
>>> way to work around this...
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>>> -----Original Message-----
>>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>>> Sent: Friday, April 23, 2010 5:17 PM
>>> To: user@poi.apache.org
>>> Subject: Re: flush workbook from time to time
>>> 
>>> 
>>> Can I ask which version of the file format you are targetting please,
>>> the
>>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>>> have
>>> the link at hand currently - where he streams the contents of a file
>>> into
>>> a
>>> template to create large worksheets without encountering out of memory
>>> exceptions. It is not a fully fledged utility but rather a proof that
>>> you
>>> may - and that is may - be able to adapt to handle your current
>>> requirement.
>>> I do not know if you would be able to modify the program to, for
>>> example,
>>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>>> at
>>> a
>>> later date but I feel it could be worth looking into.
>>> 
>>> Aside from that, there is no way to currently stream the document out
>>> bit
>>> by
>>> bit I am sorry to say. It should be possible and has, I believe, been
>>> discussed by the developers but would require some programming effort
>>> and
>>> no
>>> one has stepped up to sponsor the work.
>>> 
>>> Yours
>>> 
>>> Mark B
>>> 
>>> 
>>> Martin Asenov wrote:
>>>> 
>>>> Hello, everyone!
>>>> 
>>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>>> in
>>>> order to export that DB to an Excel file. Anyway, let's presume the
>>>> contacts are 2000. When I start creating the workbook, the contacts are
>>>> inserted pretty quick, but when the workbook begins to get bigger and
>>>> bigger, things are terribly slowed down.
>>>> 
>>>> Is there a way to flush the workbook from time to time? I didn't see
>>>> such
>>>> opportunity in the API.
>>>> 
>>>> Best,
>>>> Martin
>>>> 
>>>> 
>>> 
>>> -- 
>>> View this message in context:
>>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>>> Sent from the POI - User mailing list archive at Nabble.com.
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>>> For additional commands, e-mail: user-help@poi.apache.org
>>> 
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28354716.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Hi Mark,

Well I've got both CSV and XLS export - you know about line separators and all this stuff that is system dependent, so we provide the both variants. 

Let's discuss it in Monday... Have a nice weekend

Thank you,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Friday, April 23, 2010 6:59 PM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


Old age is really slowing me down these days - and I can blame working for
the last few days on a 1 in 3 slope removing Hawthorn, still digging the
thorns out of my fingers now! - but have you looked into generating a CSV
file and then 'opening' this with Excel?

I have never tried it myself but I think it should be possible to create a
template using Excel - set all of the cell formats correctly and so on. Then
you could simply open a CSV file and populate the respective cells with data
and I am guessing that the formatting information will be preserved; your
application would then only have to create/edit the CSV file.

It should be easy enough to mock up a simple test to see if it would work
and although it would add some complexity to the task - opening the Excel
template and then importing the data - that might be worthwhile. A macro
could even be used to automate the process - something that runs when the
workbook is opened to import the data from the CSV file maybe.

Yours

Mark B


Martin Asenov wrote:
> 
> No, I've been using old good system.out.println(); statements - and
> noticed that the gap between different insertions into the workbook
> slighly increases on every insertion. Actually, the output file was 212 kb
> large - it is surprising for me that file of such size could be generated
> in so large amount of time. Something must be wrong...
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:24 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Oh, well in that case the BigGridDemo would not work for you as it was
> only
> applicable to the OOXML based file format. If you are using the older
> binary
> format (.xls) then the answer is a certain no to streaming the file in
> chunks I am sorry to say. The reason lies in the structure of the Excel
> file
> itself, I am not completely certain of the details but understand that it
> consists of a series of streams and is block structured - that is why the
> files sizes are always multiples of 512 or 4096 I believe. As a result, a
> model of the file must be assembled in memory firstly, formatted and then
> written out.
> 
> Must admit that I am surprised you are experiencing problems with
> performance if you are using the HSSF stream as that is a more common
> problem with the OOXML files. Have you tried running a profiler on the
> code
> to see where the bottlenecks are?
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi, Mark, thanks for the quick reply.
>> 
>> I don't know exactly what format we're exporting into, but I think it's
>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>> thing is not written by me, so I'm unaware of more details about it.
>> 
>> Every single contact expands from 2 to up to 20 horizontal cells, but
>> commonly in 4-5 horizontal cells.
>> 
>> The point is that in the UI when the user clicks on 'export' we're
>> supposed to supply him with the exported file. Delaying is not a problem,
>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
>> way to work around this...
>> 
>> Best,
>> Martin
>> 
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 5:17 PM
>> To: user@poi.apache.org
>> Subject: Re: flush workbook from time to time
>> 
>> 
>> Can I ask which version of the file format you are targetting please, the
>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>> have
>> the link at hand currently - where he streams the contents of a file into
>> a
>> template to create large worksheets without encountering out of memory
>> exceptions. It is not a fully fledged utility but rather a proof that you
>> may - and that is may - be able to adapt to handle your current
>> requirement.
>> I do not know if you would be able to modify the program to, for example,
>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>> at
>> a
>> later date but I feel it could be worth looking into.
>> 
>> Aside from that, there is no way to currently stream the document out bit
>> by
>> bit I am sorry to say. It should be possible and has, I believe, been
>> discussed by the developers but would require some programming effort and
>> no
>> one has stepped up to sponsor the work.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hello, everyone!
>>> 
>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>> in
>>> order to export that DB to an Excel file. Anyway, let's presume the
>>> contacts are 2000. When I start creating the workbook, the contacts are
>>> inserted pretty quick, but when the workbook begins to get bigger and
>>> bigger, things are terribly slowed down.
>>> 
>>> Is there a way to flush the workbook from time to time? I didn't see
>>> such
>>> opportunity in the API.
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Old age is really slowing me down these days - and I can blame working for
the last few days on a 1 in 3 slope removing Hawthorn, still digging the
thorns out of my fingers now! - but have you looked into generating a CSV
file and then 'opening' this with Excel?

I have never tried it myself but I think it should be possible to create a
template using Excel - set all of the cell formats correctly and so on. Then
you could simply open a CSV file and populate the respective cells with data
and I am guessing that the formatting information will be preserved; your
application would then only have to create/edit the CSV file.

It should be easy enough to mock up a simple test to see if it would work
and although it would add some complexity to the task - opening the Excel
template and then importing the data - that might be worthwhile. A macro
could even be used to automate the process - something that runs when the
workbook is opened to import the data from the CSV file maybe.

Yours

Mark B


Martin Asenov wrote:
> 
> No, I've been using old good system.out.println(); statements - and
> noticed that the gap between different insertions into the workbook
> slighly increases on every insertion. Actually, the output file was 212 kb
> large - it is surprising for me that file of such size could be generated
> in so large amount of time. Something must be wrong...
> 
> Best,
> Martin
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 6:24 PM
> To: user@poi.apache.org
> Subject: RE: flush workbook from time to time
> 
> 
> Oh, well in that case the BigGridDemo would not work for you as it was
> only
> applicable to the OOXML based file format. If you are using the older
> binary
> format (.xls) then the answer is a certain no to streaming the file in
> chunks I am sorry to say. The reason lies in the structure of the Excel
> file
> itself, I am not completely certain of the details but understand that it
> consists of a series of streams and is block structured - that is why the
> files sizes are always multiples of 512 or 4096 I believe. As a result, a
> model of the file must be assembled in memory firstly, formatted and then
> written out.
> 
> Must admit that I am surprised you are experiencing problems with
> performance if you are using the HSSF stream as that is a more common
> problem with the OOXML files. Have you tried running a profiler on the
> code
> to see where the bottlenecks are?
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hi, Mark, thanks for the quick reply.
>> 
>> I don't know exactly what format we're exporting into, but I think it's
>> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
>> thing is not written by me, so I'm unaware of more details about it.
>> 
>> Every single contact expands from 2 to up to 20 horizontal cells, but
>> commonly in 4-5 horizontal cells.
>> 
>> The point is that in the UI when the user clicks on 'export' we're
>> supposed to supply him with the exported file. Delaying is not a problem,
>> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
>> way to work around this...
>> 
>> Best,
>> Martin
>> 
>> 
>> -----Original Message-----
>> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
>> Sent: Friday, April 23, 2010 5:17 PM
>> To: user@poi.apache.org
>> Subject: Re: flush workbook from time to time
>> 
>> 
>> Can I ask which version of the file format you are targetting please, the
>> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
>> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
>> have
>> the link at hand currently - where he streams the contents of a file into
>> a
>> template to create large worksheets without encountering out of memory
>> exceptions. It is not a fully fledged utility but rather a proof that you
>> may - and that is may - be able to adapt to handle your current
>> requirement.
>> I do not know if you would be able to modify the program to, for example,
>> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data
>> at
>> a
>> later date but I feel it could be worth looking into.
>> 
>> Aside from that, there is no way to currently stream the document out bit
>> by
>> bit I am sorry to say. It should be possible and has, I believe, been
>> discussed by the developers but would require some programming effort and
>> no
>> one has stepped up to sponsor the work.
>> 
>> Yours
>> 
>> Mark B
>> 
>> 
>> Martin Asenov wrote:
>>> 
>>> Hello, everyone!
>>> 
>>> I've got some DB which contains some contacts. I'm using the Apache POI
>>> in
>>> order to export that DB to an Excel file. Anyway, let's presume the
>>> contacts are 2000. When I start creating the workbook, the contacts are
>>> inserted pretty quick, but when the workbook begins to get bigger and
>>> bigger, things are terribly slowed down.
>>> 
>>> Is there a way to flush the workbook from time to time? I didn't see
>>> such
>>> opportunity in the API.
>>> 
>>> Best,
>>> Martin
>>> 
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
>> For additional commands, e-mail: user-help@poi.apache.org
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28343277.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
No, I've been using old good system.out.println(); statements - and noticed that the gap between different insertions into the workbook slighly increases on every insertion. Actually, the output file was 212 kb large - it is surprising for me that file of such size could be generated in so large amount of time. Something must be wrong...

Best,
Martin

-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Friday, April 23, 2010 6:24 PM
To: user@poi.apache.org
Subject: RE: flush workbook from time to time


Oh, well in that case the BigGridDemo would not work for you as it was only
applicable to the OOXML based file format. If you are using the older binary
format (.xls) then the answer is a certain no to streaming the file in
chunks I am sorry to say. The reason lies in the structure of the Excel file
itself, I am not completely certain of the details but understand that it
consists of a series of streams and is block structured - that is why the
files sizes are always multiples of 512 or 4096 I believe. As a result, a
model of the file must be assembled in memory firstly, formatted and then
written out.

Must admit that I am surprised you are experiencing problems with
performance if you are using the HSSF stream as that is a more common
problem with the OOXML files. Have you tried running a profiler on the code
to see where the bottlenecks are?

Yours

Mark B


Martin Asenov wrote:
> 
> Hi, Mark, thanks for the quick reply.
> 
> I don't know exactly what format we're exporting into, but I think it's
> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
> thing is not written by me, so I'm unaware of more details about it.
> 
> Every single contact expands from 2 to up to 20 horizontal cells, but
> commonly in 4-5 horizontal cells.
> 
> The point is that in the UI when the user clicks on 'export' we're
> supposed to supply him with the exported file. Delaying is not a problem,
> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
> way to work around this...
> 
> Best,
> Martin
> 
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 5:17 PM
> To: user@poi.apache.org
> Subject: Re: flush workbook from time to time
> 
> 
> Can I ask which version of the file format you are targetting please, the
> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
> have
> the link at hand currently - where he streams the contents of a file into
> a
> template to create large worksheets without encountering out of memory
> exceptions. It is not a fully fledged utility but rather a proof that you
> may - and that is may - be able to adapt to handle your current
> requirement.
> I do not know if you would be able to modify the program to, for example,
> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data at
> a
> later date but I feel it could be worth looking into.
> 
> Aside from that, there is no way to currently stream the document out bit
> by
> bit I am sorry to say. It should be possible and has, I believe, been
> discussed by the developers but would require some programming effort and
> no
> one has stepped up to sponsor the work.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hello, everyone!
>> 
>> I've got some DB which contains some contacts. I'm using the Apache POI
>> in
>> order to export that DB to an Excel file. Anyway, let's presume the
>> contacts are 2000. When I start creating the workbook, the contacts are
>> inserted pretty quick, but when the workbook begins to get bigger and
>> bigger, things are terribly slowed down.
>> 
>> Is there a way to flush the workbook from time to time? I didn't see such
>> opportunity in the API.
>> 
>> Best,
>> Martin
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Oh, well in that case the BigGridDemo would not work for you as it was only
applicable to the OOXML based file format. If you are using the older binary
format (.xls) then the answer is a certain no to streaming the file in
chunks I am sorry to say. The reason lies in the structure of the Excel file
itself, I am not completely certain of the details but understand that it
consists of a series of streams and is block structured - that is why the
files sizes are always multiples of 512 or 4096 I believe. As a result, a
model of the file must be assembled in memory firstly, formatted and then
written out.

Must admit that I am surprised you are experiencing problems with
performance if you are using the HSSF stream as that is a more common
problem with the OOXML files. Have you tried running a profiler on the code
to see where the bottlenecks are?

Yours

Mark B


Martin Asenov wrote:
> 
> Hi, Mark, thanks for the quick reply.
> 
> I don't know exactly what format we're exporting into, but I think it's
> the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole
> thing is not written by me, so I'm unaware of more details about it.
> 
> Every single contact expands from 2 to up to 20 horizontal cells, but
> commonly in 4-5 horizontal cells.
> 
> The point is that in the UI when the user clicks on 'export' we're
> supposed to supply him with the exported file. Delaying is not a problem,
> unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a
> way to work around this...
> 
> Best,
> Martin
> 
> 
> -----Original Message-----
> From: MSB [mailto:markbrdsly@tiscali.co.uk] 
> Sent: Friday, April 23, 2010 5:17 PM
> To: user@poi.apache.org
> Subject: Re: flush workbook from time to time
> 
> 
> Can I ask which version of the file format you are targetting please, the
> binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
> worthwhile looking at Yegor's BigGridDemo program - sorry but I do not
> have
> the link at hand currently - where he streams the contents of a file into
> a
> template to create large worksheets without encountering out of memory
> exceptions. It is not a fully fledged utility but rather a proof that you
> may - and that is may - be able to adapt to handle your current
> requirement.
> I do not know if you would be able to modify the program to, for example,
> fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data at
> a
> later date but I feel it could be worth looking into.
> 
> Aside from that, there is no way to currently stream the document out bit
> by
> bit I am sorry to say. It should be possible and has, I believe, been
> discussed by the developers but would require some programming effort and
> no
> one has stepped up to sponsor the work.
> 
> Yours
> 
> Mark B
> 
> 
> Martin Asenov wrote:
>> 
>> Hello, everyone!
>> 
>> I've got some DB which contains some contacts. I'm using the Apache POI
>> in
>> order to export that DB to an Excel file. Anyway, let's presume the
>> contacts are 2000. When I start creating the workbook, the contacts are
>> inserted pretty quick, but when the workbook begins to get bigger and
>> bigger, things are terribly slowed down.
>> 
>> Is there a way to flush the workbook from time to time? I didn't see such
>> opportunity in the API.
>> 
>> Best,
>> Martin
>> 
>> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
> Sent from the POI - User mailing list archive at Nabble.com.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
> For additional commands, e-mail: user-help@poi.apache.org
> 
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28342812.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


RE: flush workbook from time to time

Posted by Martin Asenov <mA...@velti.com>.
Hi, Mark, thanks for the quick reply.

I don't know exactly what format we're exporting into, but I think it's the .xls one. In code we're using HSSFWorkbook class. Anyway, the whole thing is not written by me, so I'm unaware of more details about it.

Every single contact expands from 2 to up to 20 horizontal cells, but commonly in 4-5 horizontal cells.

The point is that in the UI when the user clicks on 'export' we're supposed to supply him with the exported file. Delaying is not a problem, unless it's 7-8 mins for a workbook of 2000 contacts :-) I hope there's a way to work around this...

Best,
Martin


-----Original Message-----
From: MSB [mailto:markbrdsly@tiscali.co.uk] 
Sent: Friday, April 23, 2010 5:17 PM
To: user@poi.apache.org
Subject: Re: flush workbook from time to time


Can I ask which version of the file format you are targetting please, the
binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
worthwhile looking at Yegor's BigGridDemo program - sorry but I do not have
the link at hand currently - where he streams the contents of a file into a
template to create large worksheets without encountering out of memory
exceptions. It is not a fully fledged utility but rather a proof that you
may - and that is may - be able to adapt to handle your current requirement.
I do not know if you would be able to modify the program to, for example,
fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data at a
later date but I feel it could be worth looking into.

Aside from that, there is no way to currently stream the document out bit by
bit I am sorry to say. It should be possible and has, I believe, been
discussed by the developers but would require some programming effort and no
one has stepped up to sponsor the work.

Yours

Mark B


Martin Asenov wrote:
> 
> Hello, everyone!
> 
> I've got some DB which contains some contacts. I'm using the Apache POI in
> order to export that DB to an Excel file. Anyway, let's presume the
> contacts are 2000. When I start creating the workbook, the contacts are
> inserted pretty quick, but when the workbook begins to get bigger and
> bigger, things are terribly slowed down.
> 
> Is there a way to flush the workbook from time to time? I didn't see such
> opportunity in the API.
> 
> Best,
> Martin
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: flush workbook from time to time

Posted by MSB <ma...@tiscali.co.uk>.
Can I ask which version of the file format you are targetting please, the
binary (.xls) or OOXML (.xlsx) one? If it is the latter, then it may be
worthwhile looking at Yegor's BigGridDemo program - sorry but I do not have
the link at hand currently - where he streams the contents of a file into a
template to create large worksheets without encountering out of memory
exceptions. It is not a fully fledged utility but rather a proof that you
may - and that is may - be able to adapt to handle your current requirement.
I do not know if you would be able to modify the program to, for example,
fill rows 0 to 2000 with data and then fill rows 2001 to 4000 with data at a
later date but I feel it could be worth looking into.

Aside from that, there is no way to currently stream the document out bit by
bit I am sorry to say. It should be possible and has, I believe, been
discussed by the developers but would require some programming effort and no
one has stepped up to sponsor the work.

Yours

Mark B


Martin Asenov wrote:
> 
> Hello, everyone!
> 
> I've got some DB which contains some contacts. I'm using the Apache POI in
> order to export that DB to an Excel file. Anyway, let's presume the
> contacts are 2000. When I start creating the workbook, the contacts are
> inserted pretty quick, but when the workbook begins to get bigger and
> bigger, things are terribly slowed down.
> 
> Is there a way to flush the workbook from time to time? I didn't see such
> opportunity in the API.
> 
> Best,
> Martin
> 
> 

-- 
View this message in context: http://old.nabble.com/flush-workbook-from-time-to-time-tp28341613p28341925.html
Sent from the POI - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org