Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2018/08/29 07:00:00 UTC

[jira] [Closed] (PDFBOX-4296) Question: Performance

     [ https://issues.apache.org/jira/browse/PDFBOX-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr closed PDFBOX-4296.
-----------------------------------
    Resolution: Incomplete

I'm closing this issue because it is not specific enough. I've added the "optimization" label; if you click on it you'll see issues where speed was improved by looking at specific files that people found slow, and where a profiler pointed us to the specific areas that were inefficient.

The sad thing is that PDFBox will become much slower when JDK 8/9 disappear, because JDK 10 and higher use the slow LCMS exclusively instead of the fast KCMS.
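
For reference, a minimal sketch of how the KCMS selection mentioned in the report could be guarded by Java version. The property name and provider class are the ones quoted in the issue description; the class and method names (CmmSetup, isJava8or9, selectKcmsIfAvailable) are hypothetical, and the sketch assumes the property is set before any Java 2D color management code is loaded. Whether the KCMS provider class actually exists depends on the JDK build, so this only requests it.

```java
// Hypothetical helper, not part of PDFBox: requests KCMS only on Java 8/9.
final class CmmSetup {
    // Property name and provider class as quoted in the issue report.
    static final String CMM_PROPERTY = "sun.java2d.cmm";
    static final String KCMS_PROVIDER = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    /** True if the given java.specification.version denotes Java 8 ("1.8") or Java 9 ("9"). */
    static boolean isJava8or9(String specVersion) {
        return "1.8".equals(specVersion) || "9".equals(specVersion);
    }

    /**
     * Sets the CMM system property on Java 8/9 and leaves it untouched elsewhere.
     * Assumption: this runs before any color management class is initialized.
     * Returns true if the property was set.
     */
    static boolean selectKcmsIfAvailable() {
        String spec = System.getProperty("java.specification.version");
        if (!isJava8or9(spec)) {
            return false; // newer JDKs: keep the default (LCMS)
        }
        System.setProperty(CMM_PROPERTY, KCMS_PROVIDER);
        return true;
    }

    public static void main(String[] args) {
        boolean requested = selectKcmsIfAvailable();
        System.out.println("KCMS requested: " + requested);
        // Rendering or image extraction with PDFBox would start after this point.
    }
}
```

Calling selectKcmsIfAvailable() as the first statement of the application's entry point keeps the setting in one place, and the version check avoids pointing newer JDKs at a provider they do not ship.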

> Question: Performance
> ---------------------
>
>                 Key: PDFBOX-4296
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4296
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Rendering
>    Affects Versions: 2.0.11
>            Reporter: Daniel Persson
>            Priority: Trivial
>              Labels: optimization, performance
>
> Hi Team.
> We built a tool on PDFBox that extracts text from about 10k pages per day. We have another tool that extracts images using Poppler.
> We want to use PDFBox for both tasks, but sadly we see a performance hit on the order of 3x when using PDFBox.
> Do you have any backlog / technical debt / ideas on how to improve performance?
> We have tried -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true and that made image generation much slower.
> We have set System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider") in code.
> We use image libraries from twelvemonkeys, pdfbox and the standard jai project.
> I've read in the code that we do double writes for images using transparency, which might be a culprit.
> I have been allowed to put some time into the project if we have some solid leads or a roadmap to reach better performance.
> Hope it's okay to track this issue here instead of a question on the mailing list.
> Best regards
> Daniel


