Posted to user@poi.apache.org by WATHELET Thomas <th...@europarl.europa.eu> on 2007/09/05 12:24:54 UTC

StackOverflowError when reading msword doc

Hi,
I'm using poi-3.0.1-FINAL-20070705 and I am trying to extract the content
of a Word document with this method:
 
public String getPoiContent(InputStream is) throws Exception {
    String temp = null;
    WordExtractor wd = null;
    // try {
    wd = new WordExtractor(is);
    temp = wd.getText();
    // } catch (Exception e) {
    //     throw new Exception("getPoiContent: " + e.toString());
    // }
    return temp;
}

And I got this error:

Exception in thread "main" java.lang.StackOverflowError
    at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:266)
    at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:277)
    at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:277)
    at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:277)

What's happened?


Re: StackOverflowError when reading msword doc

Posted by Rainer Schwarze <rs...@admadic.de>.
Hi Thomas,

I quickly checked the file and there are at least two issues:

1) There is a style in the file which is stored at index 17 and which
declares its base style to be at index 17 as well, i.e. it is based on
itself. So the style specification in this file is cyclic. When I open
the file in Word and try to look at the style definitions, Word crashes.
(Word prevents cyclic definitions when you try to create them, but
obviously it does not handle them well when the file already contains
them.)

2) With a quick-and-dirty fix to get past the cyclic style definition,
my code runs into a list formatting problem. So the stack overflow
caused by the cyclic style definition is probably not the only problem.
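
To illustrate point 1, here is a minimal sketch of how such a cyclic base
style chain could be detected before any recursive processing starts. It
is not POI code: the class StyleCycleCheck, the method findCyclicStyles
and the int[] representation of the style table are hypothetical, and the
sketch assumes that any out-of-range base index means "no base style".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of POI. */
final class StyleCycleCheck {

    /**
     * baseOf[i] holds the base style index of style i. Any value outside
     * the array bounds is treated as "no base style" (an assumption of
     * this sketch).
     *
     * Returns the indices of all styles whose base chain never ends.
     */
    static List<Integer> findCyclicStyles(int[] baseOf) {
        List<Integer> cyclic = new ArrayList<>();
        for (int start = 0; start < baseOf.length; start++) {
            int current = start;
            int steps = 0;
            // A chain that ends visits each style at most once, so it
            // needs at most baseOf.length steps. More steps mean a cycle.
            while (current >= 0 && current < baseOf.length
                    && steps <= baseOf.length) {
                current = baseOf[current];
                steps++;
            }
            if (steps > baseOf.length) {
                cyclic.add(start);
            }
        }
        return cyclic;
    }
}
```

For a table in which only style 17 is based on itself, the method reports
index 17. A style based on a cyclic style would be reported as well,
because its chain never ends either.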

So I suspect that the file is either damaged or has been
created/exported by an application which is not Word and which does not
create valid output. (Some version information in the first few bytes of
the Word stream of your file looks unfamiliar: the Word files which I
have access to have other version values in the nProduct field of the
FIB record.)
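
For anyone who wants to compare these header values between files, the
following is a rough sketch of reading the first FIB fields from the raw
bytes of the Word stream. The class FibHeaderPeek and its methods are
hypothetical, and the byte offsets used (wIdent at 0, nFib at 2, nProduct
at 4, lid at 6, all little-endian 16-bit values) are an assumption based
on the Word 97 FIB layout that should be checked against the format
documentation. Getting the stream bytes out of the OLE2 container is left
out here.

```java
/** Hypothetical helper for illustration; not part of POI. */
final class FibHeaderPeek {

    /** Reads an unsigned little-endian 16-bit value at the given offset. */
    static int readUInt16LE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    /**
     * Formats the first four 16-bit fields of the FIB.
     * The offsets are assumptions, see the text above.
     */
    static String describe(byte[] wordStream) {
        if (wordStream.length < 8) {
            throw new IllegalArgumentException("stream too short for a FIB header");
        }
        int wIdent = readUInt16LE(wordStream, 0);   // magic number
        int nFib = readUInt16LE(wordStream, 2);     // file format version
        int nProduct = readUInt16LE(wordStream, 4); // product version of the writer
        int lid = readUInt16LE(wordStream, 6);      // language id
        return String.format(
                "wIdent=0x%04X nFib=0x%04X nProduct=0x%04X lid=0x%04X",
                wIdent, nFib, nProduct, lid);
    }
}
```

Printing this line for a known-good Word file and for the suspicious file
side by side makes it easy to see which of the version fields differ.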

Can you be sure that this file was saved by Word and was not changed by
other software?

Best wishes, Rainer

Rainer Schwarze wrote:
> WATHELET Thomas wrote:
>> Hi,
>> I'm using poi-3.0.1-FINAL-20070705 and I am trying to extract the
>> content of a Word document with this method:
> [snip]
>> And I got this error:
>> Exception in thread "main" java.lang.StackOverflowError
> [snip]
> If you like, you can send the file to my email address and I will take
> a look at it.

-- 

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@poi.apache.org
For additional commands, e-mail: user-help@poi.apache.org


Re: StackOverflowError when reading msword doc

Posted by Rainer Schwarze <rs...@admadic.de>.
WATHELET Thomas wrote:
> Hi,
> I'm using poi-3.0.1-FINAL-20070705 and I am trying to extract the
> content of a Word document with this method:
[snip]
> And I got this error:
> Exception in thread "main" java.lang.StackOverflowError
> at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:266)
> at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:277)
[snip]
> 
> What's happened?

Hi,

The createChp method is recursive, and it has gone into infinite
recursion: it keeps calling itself until the stack is exhausted.
Either you have a valid Word file and the implementation of createChp is
incorrect, or your Word file is not really valid. (Is it a "genuine Word
file"?)
If you like, you can send the file to my email address and I will take a
look at it. I can probably (I hope) identify which case it is and what
to do about it.
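
As a rough illustration of how a recursive walk over base styles can
protect itself against this kind of endless self-calling, here is a small
sketch. It is not the actual createChp code: GuardedStyleResolver, its
depth method and the int[] style table are hypothetical, and any
out-of-range index is assumed to mean "no base style". The point is only
that remembering the styles already being resolved turns a
StackOverflowError into a clear error message.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration; not the POI implementation. */
final class GuardedStyleResolver {

    /** baseOf[i] is the base style index of style i (out of range = none). */
    private final int[] baseOf;

    GuardedStyleResolver(int[] baseOf) {
        this.baseOf = baseOf;
    }

    /** Returns the length of the base style chain starting at styleIndex. */
    int depth(int styleIndex) {
        return depth(styleIndex, new HashSet<>());
    }

    private int depth(int styleIndex, Set<Integer> inProgress) {
        if (styleIndex < 0 || styleIndex >= baseOf.length) {
            return 0; // no base style: the recursion ends here
        }
        // add() returns false if the style is already being resolved,
        // which means the chain has come back to an earlier style.
        if (!inProgress.add(styleIndex)) {
            throw new IllegalStateException(
                    "cyclic base style definition at index " + styleIndex);
        }
        return 1 + depth(baseOf[styleIndex], inProgress);
    }
}
```

With a style that names itself as its base, the second call for that
style finds it already in the set and throws an IllegalStateException
instead of recursing until the stack overflows.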

Best wishes, Rainer