Posted to dev@pdfbox.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/04/20 03:38:00 UTC

[jira] [Comment Edited] (PDFBOX-5169) PDFMerger produces overly large output PDF

    [ https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325247#comment-17325247 ] 

Tilman Hausherr edited comment on PDFBOX-5169 at 4/20/21, 3:37 AM:
-------------------------------------------------------------------

No idea yet. Some things that could be investigated when there is more time:
- remove the structure tree
- run the error check from the merge test class on the structure tree
- try the same on a split PDF and compare
- read and save the 3.0.0 result with the 2.0.23 code (is the 3.0 result smaller because of the support for compressed object streams in 3.0?)
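
A rough way to look into the last item without PDFBox itself: object stream dictionaries carry an uncompressed /Type /ObjStm entry, so counting that token in the raw bytes of a saved file hints at whether it uses object streams. The sketch below is only a heuristic under that assumption (the token could in principle also appear in other uncompressed content). The class name ObjStmCounter and the method countToken are made up for illustration, and only the Java standard library is used.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ObjStmCounter {

    // Counts non-overlapping occurrences of an ASCII token in raw bytes.
    public static int countToken(byte[] data, String token) {
        byte[] needle = token.getBytes(StandardCharsets.US_ASCII);
        int count = 0;
        for (int i = 0; i + needle.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += needle.length - 1; // skip past the matched token
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a PDF file path. The whole file is read into
        // memory, so a 400 MB file needs a correspondingly large heap.
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.printf("%s: %d bytes, %d /ObjStm markers%n",
                    name, data.length, countToken(data, "/ObjStm"));
        }
    }
}
```

On Java 11 or later this could be run as `java ObjStmCounter.java merged-2.0.23.pdf merged-3.0.0.pdf`, where both file names are placeholders. A count of zero for one file and a large count for the other would be consistent with object streams explaining part of the size difference, but it would not prove it.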


was (Author: tilman):
No idea yet. Some things that could be investigated when there is more time:
- remove the structure tree
- run the error check from the merge test class on the structure tree
- try the same on a split PDF and compare
- read and save the 3.0.0 result with the 2.0.23 code (is the 3.0 result smaller because of compressed object streams in 3.0?)

> PDFMerger produces overly large output PDF
> ------------------------------------------
>
>                 Key: PDFBOX-5169
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5169
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.22, 2.0.23
>         Environment: Debian 10
>            Reporter: Jakov Vežić
>            Priority: Minor
>
> Using PDFMerger to combine
> [https://www.dropbox.com/s/kprk7aeggni420c/1.pdf?dl=1]
> with
> [https://www.dropbox.com/s/0h8bced4tm3gppz/2.pdf?dl=1]
> results in an overly large file. The two input files are 1.25 MB and 16.3 MB, while the output file is just over 400 MB. The merge also consumes about 1 GB of memory. As far as I can tell, no errors are produced during the merge.
> The command is:
> {code:java}
> java -Xmx2500M -jar pdfbox-app-2.0.23.jar PDFMerger 1.pdf 2.pdf output.pdf
> {code}
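
To put the reported sizes in proportion, the arithmetic can be sketched as follows. The input sizes are the ones quoted in the report; the output is taken as exactly 400 MB, which is an assumption since the report only says "just over 400 MB". The class name MergeGrowth and the method growthFactor are hypothetical.

```java
import java.util.Locale;

public class MergeGrowth {

    // Ratio of the output size to the combined input sizes.
    // All sizes must use the same unit (MB here).
    public static double growthFactor(double outputSize, double... inputSizes) {
        double total = 0;
        for (double size : inputSizes) {
            total += size;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("combined input size must be positive");
        }
        return outputSize / total;
    }

    public static void main(String[] args) {
        // Sizes from the report; 400.0 is a lower bound for the output.
        double factor = growthFactor(400.0, 1.25, 16.3);
        System.out.printf(Locale.ROOT,
                "combined input: %.2f MB, growth factor: %.1f%n",
                1.25 + 16.3, factor);
    }
}
```

Under that assumption the merged file is more than 20 times the combined size of its inputs, which is far beyond what merging two documents would normally add.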


