Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2014/04/02 08:36:16 UTC

[jira] [Comment Edited] (PDFBOX-2007) Performance regression since PDFRenderer

    [ https://issues.apache.org/jira/browse/PDFBOX-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957397#comment-13957397 ] 

Tilman Hausherr edited comment on PDFBOX-2007 at 4/2/14 6:34 AM:
-----------------------------------------------------------------

I assume the relevant code is
https://github.com/fbernier/taz-clj/blob/master/src/taz_clj/converter.clj
Could you profile _within_ PDFBox, i.e. see how much time is spent in the private PDFRenderer.renderPage method compared with the old PDPage.convertToImage()?

This way we could find out whether the difference is in the rendering, or in the PDFBox code before the rendering. I ask this because both were changed.

And does this happen with any PDF, or only with one particular PDF?
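
If attaching a profiler is inconvenient, a rough first split can be obtained by timing the public entry points separately from document loading. The following is only a minimal sketch of such a harness: the class name RenderTiming and the method medianNanos are hypothetical, the workload in main is a placeholder, and the PDFBox calls named in the comments are assumptions about how the two versions would be invoked. It cannot see inside the private renderPage method, so a real profiler is still needed for that level of detail.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

public class RenderTiming {

    /**
     * Runs the task `warmup` times without timing it, then `runs` times
     * with timing, and returns the median duration in nanoseconds
     * (the upper median when `runs` is even).
     */
    public static long medianNanos(Callable<?> task, int warmup, int runs) throws Exception {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.call(); // give the JIT a chance to compile hot paths first
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) throws Exception {
        // Placeholder workload so the sketch runs without PDFBox on the classpath.
        // Assumed replacements, with the document loaded once outside the task:
        //   2.0.0 snapshot: () -> new PDFRenderer(document).renderImage(pageIndex)
        //   1.8.x:          () -> page.convertToImage()
        Callable<Object> placeholder = () -> {
            double sum = 0;
            for (int i = 1; i <= 100_000; i++) {
                sum += Math.sqrt(i);
            }
            return sum;
        };
        long median = medianNanos(placeholder, 5, 20);
        System.out.printf("median: %.3f ms%n", median / 1e6);
    }
}
```

Timing the load step and the render step as two separate tasks, on the same PDF under both versions, would show which of the two accounts for the difference.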


was (Author: tilman):
I assume the relevant code is
https://github.com/fbernier/taz-clj/blob/master/src/taz_clj/converter.clj
Is it possible for you to do a profiling _within_ PDFBox, i.e. to see how much time is spent in the private PDFRenderer.renderPage method, as compared to the old PDPage.convertToImage() ?

This way we could find out whether the difference is in the rendering, or in the code before the rendering. I ask this because both were changed.

And does this happen with any PDF, or with a certain, special PDF?

> Performance regression since PDFRenderer
> ----------------------------------------
>
>                 Key: PDFBOX-2007
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2007
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Rendering
>    Affects Versions: 2.0.0
>            Reporter: François Bernier
>              Labels: perfomance, regression
>
> Hi,
> I have the following toy project where I use PDFBox: https://github.com/fbernier/taz-clj
> I've been using the snapshot versions of PDFBox for quite a while. Since the recent move from RenderUtil#convertToImage to PDFRenderer#renderImage (this commit: https://github.com/fbernier/taz-clj/commit/47917d494f2a9a0999da7f36827c45145d4bb42c), there has been quite a big performance regression. If I change the PDFBox dependency to 1.8.x, everything is fine. Here are my benchmarks:
> PDFBox 1.8.x:
> Running 1m test @ http://127.0.0.1:8080/testing.pdf?page=1
>   4 threads and 4 connections
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency   208.98ms   58.27ms 391.43ms   52.08%
>     Req/Sec     4.63      1.73     8.00     62.88%
>   1224 requests in 1.00m, 72.34MB read
> Requests/sec:     20.40
> Transfer/sec:      1.21MB
> PDFBox 2.0.0:
> Running 1m test @ http://127.0.0.1:8080/testing.pdf?page=1
>   4 threads and 4 connections
>   Thread Stats   Avg      Stdev     Max   +/- Stdev
>     Latency   920.25ms  378.94ms   2.76s    91.38%
>     Req/Sec     0.80      0.40     1.00     80.17%
>   275 requests in 1.00m, 15.85MB read
> Requests/sec:      4.58
> Transfer/sec:    270.41KB
> I have not looked any further than this and have no more data to give you (yet).


