Posted to dev@poi.apache.org by bu...@apache.org on 2012/06/07 12:37:50 UTC

[Bug 53379] New: IndexOutOfBoundsException on MS word 2007 doc

https://issues.apache.org/bugzilla/show_bug.cgi?id=53379

          Priority: P2
            Bug ID: 53379
          Assignee: dev@poi.apache.org
           Summary: IndexOutOfBoundsException on MS word 2007 doc
          Severity: major
    Classification: Unclassified
          Reporter: tim.barrett@comcor.nl
          Hardware: Macintosh
            Status: NEW
           Version: unspecified
         Component: HDF
           Product: POI

Created attachment 28900
  --> https://issues.apache.org/bugzilla/attachment.cgi?id=28900&action=edit
offending word document

Error (stack trace below) when parsing an 'old' .doc format Word document. When
the same document is saved in .docx format, the error no longer occurs.
Exception in thread "main"
org.apache.tika.exception.TikaException: Unexpected RuntimeException from
org.apache.tika.parser.microsoft.OfficeParser@4c5cc942
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:133)
    at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:400)
    at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:101)
Caused by: java.lang.IndexOutOfBoundsException: Index: 151, Size: 79
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.poi.hwpf.model.ListTables.getOverride(ListTables.java:196)
    at org.apache.poi.hwpf.usermodel.Paragraph.newParagraph(Paragraph.java:108)
    at org.apache.poi.hwpf.usermodel.Range.getParagraph(Range.java:890)
    at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:96)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:185)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:160)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    ... 5 more
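
For illustration only: the "Caused by" frames show an ArrayList.get call with
index 151 on a list of 79 entries. The sketch below reproduces that failure
mode with plain JDK classes and contrasts it with a bounds-checked lookup. The
class and method names (OverrideLookupSketch, lookupUnchecked, lookupGuarded)
are hypothetical, and this is not POI's ListTables code or a proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

public class OverrideLookupSketch {
    // Hypothetical stand-in for a table of list overrides; not POI's ListTables.
    static List<String> buildOverrides(int size) {
        List<String> overrides = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            overrides.add("override-" + i);
        }
        return overrides;
    }

    // Unguarded lookup: throws IndexOutOfBoundsException when index >= size.
    static String lookupUnchecked(List<String> overrides, int index) {
        return overrides.get(index);
    }

    // Guarded lookup: returns null instead of throwing for an out-of-range index.
    static String lookupGuarded(List<String> overrides, int index) {
        if (index < 0 || index >= overrides.size()) {
            return null;
        }
        return overrides.get(index);
    }

    public static void main(String[] args) {
        // Same numbers as in the trace: 79 entries, requested index 151.
        List<String> overrides = buildOverrides(79);
        try {
            lookupUnchecked(overrides, 151);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getMessage());
        }
        System.out.println("guarded lookup returned: " + lookupGuarded(overrides, 151));
    }
}
```

Whether returning null is acceptable to callers such as Paragraph.newParagraph
is an open question; the sketch only shows where a range check could sit.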

-- 
You are receiving this mail because:
You are the assignee for the bug.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@poi.apache.org
For additional commands, e-mail: dev-help@poi.apache.org


[Bug 53379] IndexOutOfBoundsException on MS word 2007 doc

Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=53379

Tim Barrett <ti...@comcor.nl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|                            |All
