Posted to dev@pdfbox.apache.org by "Tomonori Soejima (JIRA)" <ji...@apache.org> on 2017/11/01 00:27:01 UTC

[jira] [Commented] (PDFBOX-3985) IOException thrown from org.apache.fontbox.ttf.CMAPEncodingEntry.processSubtype14

    [ https://issues.apache.org/jira/browse/PDFBOX-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16229424#comment-16229424 ] 

Tomonori Soejima commented on PDFBOX-3985:
------------------------------------------

I can try to get the problematic PDF, but the TimesNewRomanPS-BoldMT font is what is causing the error.

Also, I know I am using 2.0.3, but even the latest version does not solve this.

These are additional messages logged right before the exception:
2017/10/31 00:01:13.348 [WARN ] [elasticsearch[test][bulk][T#3]] [FontManager] Font not found: TimesNewRomanPS-BoldMT
2017/10/31 00:01:13.413 [ERROR] [elasticsearch[test][bulk][T#3]] [TrueTypeFont] An error occured when reading table cmap
java.io.IOException: CMap subtype 14 not yet implemented
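
For reference, here is a minimal sketch of what reading the header of a format 14 (Unicode Variation Sequences) cmap subtable could look like. It is not PDFBox code: the class and method names (Cmap14Sketch, readVariationSelectors) are hypothetical, and the field layout is taken from the OpenType cmap specification as I understand it (uint16 format, uint32 length, uint32 numVarSelectorRecords, then 11-byte records of uint24 varSelector plus two 32-bit offsets). It only lists the variation selectors and does not resolve the default or non-default UVS tables.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not part of FontBox.
public class Cmap14Sketch {

    /** Reads an unsigned 24-bit big-endian value from the buffer. */
    static int readUInt24(ByteBuffer buf) {
        return ((buf.get() & 0xFF) << 16) | ((buf.get() & 0xFF) << 8) | (buf.get() & 0xFF);
    }

    /**
     * Parses the header of a format 14 subtable and returns the variation
     * selectors it declares. The two offsets in each record are skipped.
     */
    static int[] readVariationSelectors(byte[] subtable) {
        if (subtable.length < 10) {
            throw new IllegalArgumentException("Subtable too short for a format 14 header");
        }
        ByteBuffer buf = ByteBuffer.wrap(subtable); // ByteBuffer is big-endian by default
        int format = buf.getShort() & 0xFFFF;
        if (format != 14) {
            throw new IllegalArgumentException("Not a format 14 subtable: " + format);
        }
        buf.getInt(); // length of the subtable in bytes (unused in this sketch)
        long numRecords = buf.getInt() & 0xFFFFFFFFL;
        if (10 + numRecords * 11 > subtable.length) {
            throw new IllegalArgumentException("Truncated variation selector records");
        }
        int[] selectors = new int[(int) numRecords];
        for (int i = 0; i < selectors.length; i++) {
            selectors[i] = readUInt24(buf); // varSelector
            buf.getInt();                   // defaultUVSOffset (skipped)
            buf.getInt();                   // nonDefaultUVSOffset (skipped)
        }
        return selectors;
    }

    public static void main(String[] args) {
        // Hand-built sample: one record for variation selector U+FE00, both offsets zero.
        ByteBuffer sample = ByteBuffer.allocate(21);
        sample.putShort((short) 14).putInt(21).putInt(1);
        sample.put((byte) 0x00).put((byte) 0xFE).put((byte) 0x00);
        sample.putInt(0).putInt(0);
        for (int selector : readVariationSelectors(sample.array())) {
            System.out.println(String.format("U+%04X", selector));
        }
    }
}
```

Since this subtable only describes variation sequences, a parser could presumably also just skip it and keep the other cmap subtables instead of failing the whole table, but that is a guess on my side.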

> IOException thrown from org.apache.fontbox.ttf.CMAPEncodingEntry.processSubtype14
> ---------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3985
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3985
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: FontBox
>    Affects Versions: 2.0.7
>            Reporter: Tomonori Soejima
>
> I ran into this issue while processing a PDF file through Elasticsearch, and it turns out that the error occurs because [the method is not implemented|
> https://apache.googlesource.com/pdfbox/+/refs/heads/trunk/fontbox/src/main/java/org/apache/fontbox/ttf/CmapSubtable.java#327].
> Below is a snippet of the stack trace I ran into.
> Is there any plan to implement this method?
> An error occured when reading table cmap
> java.io.IOException: CMap subtype 14 not yet implemented
>         at org.apache.fontbox.ttf.CMAPEncodingEntry.processSubtype14(CMAPEncodingEntry.java:304)
>         at org.apache.fontbox.ttf.CMAPEncodingEntry.initSubtable(CMAPEncodingEntry.java:114)
>         at org.apache.fontbox.ttf.CMAPTable.initData(CMAPTable.java:100)
>         at org.apache.fontbox.ttf.TrueTypeFont.initializeTable(TrueTypeFont.java:280)
>         at org.apache.fontbox.ttf.AbstractTTFParser.parseTables(AbstractTTFParser.java:128)
>         at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:80)
>         at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:109)
>         at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:25)
>         at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:84)
>         at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:25)
>         at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getTTFFont(PDTrueTypeFont.java:632)
>         at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontWidth(PDTrueTypeFont.java:673)
>         at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getFontWidth(PDSimpleFont.java:231)
>         at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getSpaceWidth(PDSimpleFont.java:533)
>         at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:355)
>         at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
>         at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557)
>         at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
>         at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
>         at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
>         at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:458)
>         at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:383)
>         at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:342)
>         at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:148)
>         at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:148)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>         at org.apache.tika.Tika.parseToString(Tika.java:537)


