Posted to users@pdfbox.apache.org by Alex <al...@gmail.com> on 2018/07/30 18:01:02 UTC

Slowness in PDFRenderer.renderImageWithDPI()

Hi,

I use PDFBox's PDFRenderer to render PDF pages into BufferedImages. It handles most PDFs well, but I've come across one that causes it to struggle. I first noticed this in an earlier 2.0.x version, but it still happens in 2.0.11:

File file = new File("A334vu0TFjpSVeK4_NgZNw");
PDDocument doc = PDDocument.load(file);
PDFRenderer renderer = new PDFRenderer(doc);
// Render page 0 (the first page) at 50 DPI
renderer.renderImageWithDPI(0, 50f);

The call to renderImageWithDPI() uses 100% CPU and never completes (at least not within 30 minutes), but only for page 0. The other pages render fine.
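
Until the cause is known, one possible stopgap is to run the render on a worker thread with a deadline, so a single slow page cannot block the caller indefinitely. The following is only a sketch under stated assumptions: the class name RenderWithTimeout and the helper callWithTimeout are hypothetical, and the lambda in main() is a stand-in for `() -> renderer.renderImageWithDPI(0, 50f)`. Note that cancelling only interrupts the worker thread; whether the render actually stops early depends on whether the rendering code responds to interruption, which is not confirmed here.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RenderWithTimeout {

    /**
     * Runs the task on a daemon worker thread and waits up to the given timeout.
     * Returns an empty Optional if the task misses the deadline or the caller
     * is interrupted while waiting.
     */
    public static <T> Optional<T> callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "render-worker");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeout, unit));
        } catch (TimeoutException e) {
            future.cancel(true); // requests interruption; the task may ignore it
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a render call such as: () -> renderer.renderImageWithDPI(0, 50f)
        Optional<String> fast = callWithTimeout(() -> "rendered", 2, TimeUnit.SECONDS);
        Optional<String> slow = callWithTimeout(() -> {
            Thread.sleep(5_000); // simulates a page that takes too long
            return "rendered";
        }, 200, TimeUnit.MILLISECONDS);
        System.out.println("fast task finished: " + fast.isPresent());
        System.out.println("slow task finished: " + slow.isPresent());
    }
}
```

With a real renderer, an empty result would mean the page was skipped or flagged for later processing rather than rendered.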

The file is hosted here for the time being (4.2 MB): https://s3.us-east-2.amazonaws.com/alexd-pdfbox-sscce/A334vu0TFjpSVeK4_NgZNw

Any advice welcome. Thanks.


Re: Slowness in PDFRenderer.renderImageWithDPI()

Posted by Alex <al...@gmail.com>.
Thanks, Tilman.

I figured it was something to do with the way the PDF was encoded, but unfortunately I don't have control over the input files.

In my last email I wrote that renderImageWithDPI() never completes. After sending it, I re-ran my test several times with the various combinations of the `sun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider` and `org.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true` settings suggested on the "Getting Started" page. It finished rendering the first page in about 9.5 minutes, regardless of the settings. So it does complete eventually, at least some of the time. This is under JDKs 8 and 10 on a Core i7 that is several years old.
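
For reference, a minimal sketch of how those two settings might be applied programmatically instead of with -D flags on the command line. The class name CmmSettings and its apply() method are hypothetical; only the two property names and values come from the text above. The assumption is that the properties must be set before any rendering or color-management classes are loaded, and the KCMS provider is not available in every JDK build, so the first property may have no effect on some runtimes.

```java
public class CmmSettings {

    /**
     * Sets the two system properties discussed above.
     * Assumed to be called once at startup, before any page is rendered.
     */
    public static void apply(boolean useKcms, boolean pureJavaCmyk) {
        if (useKcms) {
            // Ask Java 2D to use the KCMS color management module
            System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");
        }
        // Toggle PDFBox's pure-Java CMYK conversion path
        System.setProperty("org.apache.pdfbox.rendering.UsePureJavaCMYKConversion",
                Boolean.toString(pureJavaCmyk));
    }

    public static void main(String[] args) {
        apply(true, true);
        System.out.println("sun.java2d.cmm = " + System.getProperty("sun.java2d.cmm"));
        System.out.println("UsePureJavaCMYKConversion = "
                + System.getProperty("org.apache.pdfbox.rendering.UsePureJavaCMYKConversion"));
    }
}
```

Calling apply() with each of the four boolean combinations in separate JVM runs would reproduce the comparison described above.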

Alex

> On Jul 30, 2018, at 2:11 PM, Tilman Hausherr <TH...@t-online.de> wrote:
> 
> Hi,
> 
> The page renders in 27 seconds with energy settings set to minimal, and 11 seconds when set to max performance. This is slow, but I've seen worse. Have you done what is explained on
> https://pdfbox.apache.org/2.0/getting-started.html ?
> 
> The page contains over 1000 tiny images. This is extremely inefficient, and maybe done to avoid people grabbing the image.
> 
> Tilman


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Slowness in PDFRenderer.renderImageWithDPI()

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

The page renders in 27 seconds with energy settings set to minimal, and 
11 seconds when set to max performance. This is slow, but I've seen 
worse. Have you done what is explained on
https://pdfbox.apache.org/2.0/getting-started.html ?

The page contains over 1000 tiny images. This is extremely inefficient, 
and maybe done to avoid people grabbing the image.

Tilman


