Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/10/13 10:49:00 UTC

[jira] [Comment Edited] (TIKA-3875) Add metadata items for "broken" fonts and non-embedded fonts for PDF

    [ https://issues.apache.org/jira/browse/TIKA-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616948#comment-17616948 ] 

Tim Allison edited comment on TIKA-3875 at 10/13/22 10:48 AM:
--------------------------------------------------------------

[~tilman] responded to my question on the PDFBox user list that PDFont has an .isEmbedded() method.  We have access to PDFonts with the document at the end of each page and on every call to showGlyph().

Not sure whether we want a boolean for the document, or per-page character counts like we do for missing Unicode mappings.  Or both?
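
To make the two options concrete, here is a minimal sketch of a counter that tracks both a document-level flag and per-page counts. It is only an illustration, not Tika's or PDFBox's implementation: FontLike is a hypothetical stand-in for the single isEmbedded() method of PDFont mentioned above, and NonEmbeddedFontCounter, onGlyph() and onEndPage() are made-up names for hooks that would presumably be driven from showGlyph() and from the end-of-page callback.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the one PDFont method this sketch relies on. */
interface FontLike {
    boolean isEmbedded();
}

/** Illustrative only: tracks a document-level flag and per-page glyph counts. */
class NonEmbeddedFontCounter {
    private final List<Integer> perPageCounts = new ArrayList<>();
    private int currentPageCount = 0;
    private boolean anyNonEmbedded = false;

    /** Assumed to be called once per glyph, e.g. from a showGlyph() override. */
    void onGlyph(FontLike font) {
        if (!font.isEmbedded()) {
            currentPageCount++;
            anyNonEmbedded = true;
        }
    }

    /** Assumed to be called at the end of each page; stores and resets the page count. */
    void onEndPage() {
        perPageCounts.add(currentPageCount);
        currentPageCount = 0;
    }

    /** Document-level boolean: true if any glyph used a non-embedded font. */
    boolean hasNonEmbeddedFont() {
        return anyNonEmbedded;
    }

    /** One entry per finished page, in page order. */
    List<Integer> getPerPageCounts() {
        return Collections.unmodifiableList(perPageCounts);
    }
}
```

Keeping both values costs one extra int per page, so reporting the boolean and the counts together looks cheap if we decide we want both.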


was (Author: tallison@mitre.org):
[~tilman] responded to my question on the PDFBox user list that PDFont has an .isEmbedded() method.  We have access to PDFonts with the document at the end of each page and on every call to showGlyph().

Not sure whether we want a boolean for the document, or per-page character counts like we do for missing Unicode mappings.

> Add metadata items for "broken" fonts and non-embedded fonts for PDF
> --------------------------------------------------------------------
>
>                 Key: TIKA-3875
>                 URL: https://issues.apache.org/jira/browse/TIKA-3875
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>



