
Posted to user@poi.apache.org by Dave Madole <dm...@iodalliance.com> on 2008/09/16 23:49:48 UTC

stack overflow using getRecordSize() in HSSFListener in XLS2CSVmra

Hi,

(As far as I can tell, the only way I'll be able to figure out
what sheet I'm in using the event model is to add up the bytes and compare
the total to a table I built from the offsets in the BoundSheetRecord records
when I read them.  If there's a simpler way--and I hope there is--this email is
moot, although the stack overflow seems like a potential problem.)

I get a stack overflow if I put "getRecordSize()" in the processRecord
method while modifying the XLS2CSVmra example app.

        /**
         * Main HSSFListener method, processes events, and outputs the
         * CSV as the file is processed.
         */
        public void processRecord(Record record) {
                int thisRow = -1;
                int thisColumn = -1;
                String thisStr = null;

                size = record.getRecordSize();  // this blows the program up
                switch (record.getSid())
                {

Exception in thread "main" java.lang.StackOverflowError
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        at org.apache.poi.hssf.record.Record.serialize(Record.java:82)
        at org.apache.poi.hssf.record.Record.getRecordSize(Record.java:98)
        etc., etc. kaboom!


Thanks,

Dave




---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org

Re: stack overflow using getRecordSize() in HSSFListener in XLS2CSVmra

Posted by Nick Burch <ni...@torchbox.com>.

On Wed, 17 Sep 2008, Dave Madole wrote:
> The sheet I'm parsing is quite complex so it is reasonable to account
> for the possibility of anything, even though I really only want the CSV
> output.

Try grabbing an svn checkout from this evening and taking a look at the
updated XLS2CSVmra. That'll now output sheet names for you too. (It works
by doing the BOF offset ordering trick I suggested.)

Nick

Re: stack overflow using getRecordSize() in HSSFListener in XLS2CSVmra

Posted by Dave Madole <dm...@iodalliance.com>.

Hi,

This seems to happen with any file I point the program at, regardless of the
type of record last processed.

So this is a bug?  If so, is it on the "to fix" list or should I work around
it? (I'd be happy to fix it if my Java were any good.)

Thanks,

Dave





Re: stack overflow using getRecordSize() in HSSFListener in XLS2CSVmra

Posted by Dave Madole <dm...@iodalliance.com>.

The stack trace just ends when the stack overflows, but it looks
like a getSid() call returns 1 (this may be the error record,
although I'd think blowing the stack would never return). There
are over 500 recursive serialize/getRecordSize calls before it explodes.

The two previous records return Sids of 520, so at least one "520" made
it through.  (Where in the code can one see what those sid numbers
signify?)

> What's the record you're trying to find the size of? (Should be the bottom
> one in the stack trace). In theory most of them should be fine to call
> getRecordSize(), but possibly not all, especially those with whacky
> continue records
> 

I was trying to keep track of my position in the sheet by adding up
the sizes of ALL records.  (And the app initializes using
addListenerForAllRecords, so I expect a callback under any
circumstances.)
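
The byte-counting idea described above can be sketched as a running offset checked against a sorted table of sheet start offsets. This is only an illustration: the class and method names are hypothetical, no POI types are used, and it assumes both that the stored offsets are absolute stream positions and that every record's size can be obtained without error, which is exactly what fails in this thread.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of POI: maps a running byte offset to a sheet name.
class OffsetSheetLookup {
    // Start offset of each sheet's BOF -> sheet name (assumed absolute offsets).
    private final TreeMap<Integer, String> sheetStarts = new TreeMap<>();
    // Total size of all records seen so far.
    private int bytesSeen = 0;

    // Register one sheet, e.g. while reading the bound-sheet entries.
    void addSheet(int bofOffset, String name) {
        sheetStarts.put(bofOffset, name);
    }

    // Call once per record with that record's full size in bytes.
    // Returns the sheet the record starts in, or null before the first sheet.
    String advance(int recordSize) {
        Map.Entry<Integer, String> entry = sheetStarts.floorEntry(bytesSeen);
        bytesSeen += recordSize;
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        OffsetSheetLookup lookup = new OffsetSheetLookup();
        lookup.addSheet(100, "Alpha");
        lookup.addSheet(250, "Beta");
        System.out.println(lookup.advance(100)); // record at offset 0: before any sheet
        System.out.println(lookup.advance(150)); // record at offset 100
        System.out.println(lookup.advance(10));  // record at offset 250
    }
}
```

A sorted map keeps the lookup independent of the order in which sheets are registered, but the whole approach stands or falls with accurate per-record sizes.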

The sheet I'm parsing is quite complex so it is reasonable to
account for the possibility of anything, even though I really only
want the CSV output.

I'll try your "by order" tactic.  I thought of it but wasn't sure it
could be guaranteed to work; being unsure of HSSF internals, I thought
the safer route (an actual number) was better.

Thanks,

Dave


Re: stack overflow using getRecordSize() in HSSFListener in XLS2CSVmra

Posted by Nick Burch <ni...@torchbox.com>.

On Tue, 16 Sep 2008, Dave Madole wrote:
> (As far as I can tell the only way I'll be able to figure out what sheet
> I'm in using the event model is to add up the bytes and compare it to a
> table I built from the offsets in the BoundSheetRecord records when I
> read them.  If there's a simpler way--and I hope there is--this email
> is moot, although the stack overflow seems like a potential problem.)

Each sheet's data starts with a BOFRecord. You could sort the
BoundSheetRecords by the order of their field_1_position_of_BOF entries,
then track the BOFRecords of type WORKSHEET.
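
A minimal sketch of that ordering trick, written without POI so it runs on its own. SheetTracker, SheetEntry and both callback methods are hypothetical stand-ins: in a real listener the name and BOF position would come from each BoundSheetRecord, and onWorksheetBof() would be called only for BOFRecords of type WORKSHEET. It also assumes worksheet BOFs arrive in the same order as the sorted positions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of POI: tracks the current sheet by BOF order.
class SheetTracker {
    // Stand-in for the two BoundSheetRecord values this sketch needs.
    static final class SheetEntry {
        final String name;
        final int bofPosition;

        SheetEntry(String name, int bofPosition) {
            this.name = name;
            this.bofPosition = bofPosition;
        }
    }

    private final List<SheetEntry> entries = new ArrayList<>();
    private List<SheetEntry> ordered;   // sorted copy, built on the first worksheet BOF
    private int sheetIndex = -1;

    // Call for every bound-sheet entry seen in the workbook globals.
    void onBoundSheet(String name, int bofPosition) {
        entries.add(new SheetEntry(name, bofPosition));
    }

    // Call for every worksheet-type BOF; returns the sheet that is starting,
    // or null if more BOFs arrive than sheets were registered.
    String onWorksheetBof() {
        if (ordered == null) {
            ordered = new ArrayList<>(entries);
            ordered.sort(Comparator.comparingInt((SheetEntry e) -> e.bofPosition));
        }
        sheetIndex++;
        return sheetIndex < ordered.size() ? ordered.get(sheetIndex).name : null;
    }

    public static void main(String[] args) {
        SheetTracker tracker = new SheetTracker();
        tracker.onBoundSheet("Second", 900);  // registered first, but starts later
        tracker.onBoundSheet("First", 300);
        System.out.println(tracker.onWorksheetBof()); // First
        System.out.println(tracker.onWorksheetBof()); // Second
    }
}
```

Only the relative order of the BOF positions is used, so no record sizes need to be computed and getRecordSize() is never called.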

> I get a stack overflow if I put "getRecordSize()" in the processRecord
> method while modifying the XLS2CSVmra example app.

What's the record you're trying to find the size of? (Should be the bottom
one in the stack trace.) In theory most of them should be fine to call
getRecordSize() on, but possibly not all, especially those with whacky
continue records.
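
The trace in the original report alternates between Record.getRecordSize and Record.serialize, which is the pattern produced when two base-class defaults are defined in terms of each other and a subclass overrides neither. The sketch below reproduces that pattern with hypothetical classes; it is not POI's actual source, only an illustration of how such a cycle can arise and how overriding one method breaks it.

```java
// Hypothetical demo classes, not POI code.
class MutualRecursionDemo {
    static abstract class BaseRecord {
        // Default size: serialize the record and measure the result.
        int getRecordSize() {
            return serialize().length;
        }

        // Default serialization: allocate a buffer of the record's size.
        byte[] serialize() {
            return new byte[getRecordSize()];
        }
    }

    // Overrides neither method, so the two defaults call each other forever.
    static final class BrokenRecord extends BaseRecord { }

    // Overrides one method, which is enough to break the cycle.
    static final class FixedRecord extends BaseRecord {
        @Override
        int getRecordSize() {
            return 8;
        }
    }

    // Returns true if asking for the size exhausts the stack.
    static boolean overflows(BaseRecord record) {
        try {
            record.getRecordSize();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(new BrokenRecord())); // true
        System.out.println(overflows(new FixedRecord()));  // false
    }
}
```

If the real cause is the same, the fix would belong in whichever record subclass is being processed when the overflow starts, which is why identifying that record matters.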

Nick