Posted to dev@tika.apache.org by Aami <si...@algotree.com> on 2011/12/30 07:52:33 UTC

ArrayIndexOutOfBoundsException

Hi,
    When I tried to extract text from a PDF using Apache Tika version 0.9,
I got the following exception:
Thread-0/PDFStreamEngine [WARN] java.lang.ArrayIndexOutOfBoundsException: 1
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.pdfbox.util.TextPosition.mergeDiacritic(TextPosition.java:625)
	at org.apache.pdfbox.util.PDFTextStripper.processTextPosition(PDFTextStripper.java:1026)
	at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:494)
	at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
	at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274)
	at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
	at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
	at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:442)
	at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
	at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
	at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
	at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:89)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
	at org.apache.tika.Tika.parseToString(Tika.java:357)
	at org.apache.tika.Tika.parseToString(Tika.java:423)
	at org.apache.tika.Tika.parseToString(Tika.java:403)
Please help me solve this issue. Thanks in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/arrayindex-out-of-bounds-exception-tp3620421p3620421.html
Sent from the Apache Tika - Development mailing list archive at Nabble.com.