Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2011/03/27 13:59:05 UTC

[jira] [Commented] (PDFBOX-959) Text extraction slow and /tmp fills up with AWT font files

    [ https://issues.apache.org/jira/browse/PDFBOX-959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011783#comment-13011783 ] 

Andreas Lehmkühler commented on PDFBOX-959:
-------------------------------------------

Kevin's patch was only half the battle. The Type1 font constructor still called the getawtFont method right after initializing the Type1C font. I moved that call into the getawtFont method of the Type1 font, so that the AWT font is only created if it is actually needed.
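
The change described above is a lazy-initialization pattern. Below is a minimal sketch of the idea, not the actual PDFBox code: the LazyFontSketch and LazyType1Font classes, the creations counter, and the Supplier-based factory are all hypothetical names chosen for illustration, and a plain Object stands in for the AWT font. Only the getter name getawtFont is taken from the comment.

```java
import java.util.function.Supplier;

// Hypothetical illustration of deferring an expensive object until first use.
// This is NOT the PDFBox implementation; all names here are assumptions.
class LazyFontSketch {
    // Counts how often the expensive object is built (for demonstration only).
    static int creations = 0;

    static class LazyType1Font {
        private final Supplier<Object> awtFontFactory; // stand-in for AWT font creation
        private Object awtFont;                        // stays null until requested

        LazyType1Font(Supplier<Object> awtFontFactory) {
            // The constructor only stores the factory; nothing expensive runs here.
            this.awtFontFactory = awtFontFactory;
        }

        // Builds the font on the first call and reuses it afterwards.
        synchronized Object getawtFont() {
            if (awtFont == null) {
                awtFont = awtFontFactory.get();
            }
            return awtFont;
        }
    }

    public static void main(String[] args) {
        LazyType1Font font = new LazyType1Font(() -> {
            creations++;
            return new Object();
        });
        System.out.println("after construction: " + creations);   // prints 0
        font.getawtFont();
        font.getawtFont();
        System.out.println("after two getter calls: " + creations); // prints 1
    }
}
```

With this shape, a caller that only extracts text never invokes the getter, so the expensive creation step is never reached.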

> Text extraction slow and /tmp fills up with AWT font files
> ----------------------------------------------------------
>
>                 Key: PDFBOX-959
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-959
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.4.0
>            Reporter: Kevin Jackson
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 1.5.0
>
>         Attachments: PDType1CFont.java.patch
>
>
> During text extraction there is NO need to create AWT fonts.
> However, the current Type1C font code always creates the AWT font during initialization.
> This has several really bad side effects:
> 1. Time is wasted creating the AWT font.
> 2. The font files are copied into /tmp, which fills up after a few thousand text extractions.
> 3. The AWT font is created in a synchronized region, so creation is single-threaded.
> The patch is quite simple: just delay creation of the AWT font until it is required.
