Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2018/08/29 06:54:00 UTC

[jira] [Updated] (PDFBOX-4296) Question: Performance

     [ https://issues.apache.org/jira/browse/PDFBOX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-4296:
------------------------------------
    Labels: optimization performance  (was: performance)

> Question: Performance
> ---------------------
>
>                 Key: PDFBOX-4296
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4296
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.11
>            Reporter: Daniel Persson
>            Priority: Trivial
>              Labels: optimization, performance
>
> Hi Team.
> We built a tool on PDFBox that extracts text from about 10k pages per day. A second tool extracts images using Poppler.
> We would like to use PDFBox for both tasks, but we see a performance hit of roughly a factor of 3 when using PDFBox.
> Do you have any backlog items, technical debt, or ideas on how to improve performance?
> We have tried -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true and that made image generation much slower.
> We have set System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider") in code.
> We use image libraries from twelvemonkeys, pdfbox and the standard jai project.
> I've read in the code that images using transparency are written twice, which might be a culprit.
> I have been allowed to put some time into the project if there are solid leads or a roadmap toward better performance.
> Hope it's okay to track this issue here instead of a question on the mailing list.
> Best regards
> Daniel
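
For readers trying to reproduce the comparison, the two settings named in the report can be applied in code and a run timed as in the minimal sketch below. The property names and values are quoted from the report. The class name RenderTuningSketch, the helper names, and the placeholder workload are hypothetical. Whether the KCMS provider class is available depends on the JDK in use, and both properties are assumed to be set before any rendering or colour-management class is loaded.

```java
import java.util.concurrent.TimeUnit;

public class RenderTuningSketch {

    // Property names and values as quoted in the report.
    static final String CMM_KEY = "sun.java2d.cmm";
    static final String KCMS = "sun.java2d.cmm.kcms.KcmsServiceProvider";
    static final String PURE_JAVA_CMYK_KEY =
            "org.apache.pdfbox.rendering.UsePureJavaCMYKConversion";

    /** Sets both properties; intended to be called first thing at startup. */
    static void applySettings(boolean pureJavaCmyk) {
        System.setProperty(CMM_KEY, KCMS);
        System.setProperty(PURE_JAVA_CMYK_KEY, Boolean.toString(pureJavaCmyk));
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        applySettings(false);
        // Placeholder workload (assumption): replace with the real
        // page-rendering or image-extraction call being measured.
        long elapsed = timeMillis(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

Timing the same input once with applySettings(true) and once with applySettings(false), each in a fresh JVM, would show whether the pure Java CMYK conversion accounts for the slowdown described above. A single run is only a rough indication, since JIT warm-up affects the first pages rendered.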



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
