Posted to dev@pdfbox.apache.org by "Arjohn Kampman (JIRA)" <ji...@apache.org> on 2010/06/30 14:46:54 UTC

[jira] Commented: (PDFBOX-765) Performance regression in PDFBox 1.2.0

    [ https://issues.apache.org/jira/browse/PDFBOX-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883906#action_12883906 ] 

Arjohn Kampman commented on PDFBOX-765:
---------------------------------------

The performance degradation seems to be related to files that cannot be found. For example, with some PDF files, PDFBox tries to load org/apache/pdfbox/resources/afm/MicrosoftSansSerif.afm over and over again. Normally, the results of such load operations are cached in PDTrueTypeFont.afmObjects, but not when the result is null.

Here's (one of?) the relevant stack trace(s):

ResourceLoader.loadResource(String) line: 54	
PDTrueTypeFont(PDFont).getAFM() line: 305	
PDTrueTypeFont(PDSimpleFont).getFontHeight(byte[], int, int) line: 119	
PDFTextStripper(PDFStreamEngine).processEncodedText(byte[]) line: 402	
ShowTextGlyph.process(PDFOperator, List<COSBase>) line: 61	
PDFTextStripper(PDFStreamEngine).processOperator(PDFOperator, List) line: 567	
PDFTextStripper(PDFStreamEngine).processSubStream(PDPage, PDResources, COSStream) line: 250	
PDFTextStripper(PDFStreamEngine).processStream(PDPage, PDResources, COSStream) line: 208	
PDFTextStripper.processPage(PDPage, COSStream) line: 378	
PDFTextStripper.processPages(List<COSObjectable>) line: 302	
PDFTextStripper.writeText(PDDocument, Writer) line: 258	
PDFTextStripper.getText(PDDocument) line: 184	
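
One way to avoid the repeated lookups described above is to cache the "not found" result as well. The following is only a minimal sketch of that idea, not PDFBox code: the class NegativeCachingLoader, its loader function and the getLoadCalls() counter are hypothetical names made up for illustration, and the real fix in PDFont.getAFM() may look quite different.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical illustration of negative caching; not part of PDFBox.
 * A failed lookup (null) is remembered so the expensive load is not repeated.
 */
final class NegativeCachingLoader<V> {
    private final Map<String, V> cache = new HashMap<>();
    private final Function<String, V> loader; // stands in for the expensive resource load
    private int loadCalls = 0;                // counts how often the loader really ran

    NegativeCachingLoader(Function<String, V> loader) {
        this.loader = loader;
    }

    V get(String name) {
        // containsKey() distinguishes "never tried" from "tried and found nothing";
        // a plain get() == null check cannot tell the two apart.
        if (cache.containsKey(name)) {
            return cache.get(name);
        }
        loadCalls++;
        V value = loader.apply(name);
        cache.put(name, value); // HashMap permits null values, so misses are stored too
        return value;
    }

    int getLoadCalls() {
        return loadCalls;
    }

    public static void main(String[] args) {
        // Simulate a resource that can never be found.
        NegativeCachingLoader<String> afm = new NegativeCachingLoader<>(name -> null);
        for (int i = 0; i < 1000; i++) {
            afm.get("MicrosoftSansSerif.afm");
        }
        System.out.println(afm.getLoadCalls()); // prints 1
    }
}
```

The sketch is not thread-safe; a shared static cache would need synchronization or a concurrent map with an explicit "missing" marker, since ConcurrentHashMap does not accept null values.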



> Performance regression in PDFBox 1.2.0
> --------------------------------------
>
>                 Key: PDFBOX-765
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-765
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Critical
>
> Arjohn Kampman reported a notable performance drop in PDFBox 1.2.0, possibly caused by PDFBOX-754.

-- 
This message is automatically generated by JIRA.