Posted to users@pdfbox.apache.org by John Lussmyer <Co...@CasaDelGato.com> on 2022/03/17 18:15:39 UTC
Possible PDFBox bug?
We have an app that can generate multi-page PDF Files. We recently ran into a problem where the library we were using would keep ALL the pages in memory. For a quick workaround we have it write out single-page PDF files, then use PDFBox to combine them.
We recently found a bug in the way that the pages get modified when combined into a single PDF.
When we generate the pages, sometimes the MediaBox starts at negative coordinates. When PDFBox adds that page to a document, it offsets it by that negative amount - which moves the page content up and to the right.
Our page-combining code looks like this:
try (PDDocument doc = new PDDocument(MemoryUsageSetting.setupTempFileOnly())) {
    for (File pagFile : srcPages) {
        log.debug("make: page {}", pagFile.getAbsolutePath());
        PDPage page = new PDPage();
        doc.addPage(page);
        try (PDPageContentStream contents = new PDPageContentStream(doc, page)) {
            try (PDDocument sourceDoc = Loader.loadPDF(pagFile, MemoryUsageSetting.setupTempFileOnly())) {
                PDPage srcPage = sourceDoc.getPage(0);
                page.setUserUnit(srcPage.getUserUnit());
                page.setMediaBox(srcPage.getMediaBox());
                page.setCropBox(srcPage.getCropBox());
                page.setTrimBox(srcPage.getTrimBox());
                // Create a Form XObject from the source document using LayerUtility
                LayerUtility layerUtility = new LayerUtility(doc);
                PDFormXObject form = layerUtility.importPageAsForm(sourceDoc, 0);
                // draw the full form
                contents.drawForm(form);
            }
        }
    }
    doc.save(outPDF);
}
The original Page pdf has a TrimBox[0,0,1296,864], MediaBox[-72,-72,1368,936]
The page in the PDFBox combined output has the same TrimBox and MediaBox, BUT the /Form1 it uses to place the contents has a BBox[-72,-72,1368,936] and a Matrix[1,0,0,1,72,72].
I'm not sure why it's adding a Matrix to offset the content.
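To make the reported numbers concrete, here is a small sketch that applies the Matrix[1,0,0,1,72,72] from the output to the corners of the boxes quoted above. It uses only java.awt.geom from the JDK, not PDFBox; the class name FormMatrixDemo and the helper apply are made up for illustration. It assumes the usual PDF convention that a matrix [a b c d e f] maps (x, y) to (a*x + c*y + e, b*x + d*y + f).

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class FormMatrixDemo {
    /** Applies a PDF-style matrix [a b c d e f] to the point (x, y). */
    public static Point2D apply(double[] pdfMatrix, double x, double y) {
        // AffineTransform(double[]) takes {m00, m10, m01, m11, m02, m12},
        // which is the same order as the PDF array [a b c d e f].
        AffineTransform at = new AffineTransform(pdfMatrix);
        return at.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        double[] formMatrix = {1, 0, 0, 1, 72, 72}; // Matrix reported for /Form1

        // Lower-left corner of the form BBox [-72,-72,1368,936]
        Point2D bboxCorner = apply(formMatrix, -72, -72);
        // Lower-left corner of the TrimBox [0,0,1296,864]
        Point2D trimCorner = apply(formMatrix, 0, 0);

        System.out.println("BBox corner -> (" + bboxCorner.getX() + ", " + bboxCorner.getY() + ")");
        System.out.println("Trim corner -> (" + trimCorner.getX() + ", " + trimCorner.getY() + ")");
    }
}
```

With these inputs the BBox corner lands on (0.0, 0.0) and the TrimBox corner on (72.0, 72.0). So the matrix appears to move the form so that it starts at the origin, while the destination page keeps a MediaBox starting at (-72, -72). That mismatch would account for the content sitting 72 units up and to the right.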
Re: Possible PDFBox bug?
Posted by "Hiller, Gerhard" <Ge...@msh.de>.
Hi John,
Try srcPage.getMediaBox().createRetranslatedRectangle(), and do the same for the other boxes.
The rectangle returned by srcPage.getMediaBox() still reflects the negative coordinates.
Greetings
Gerhard
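As a rough illustration of the suggestion above, the sketch below mimics in plain Java what moving a box to the origin does to the numbers from the original post. It does not call PDFBox. The class BoxShiftDemo and the helpers retranslate and shift are hypothetical names, and boxes are plain arrays {llx, lly, urx, ury}. The assumption is that createRetranslatedRectangle() yields a rectangle of the same width and height with its lower-left corner at (0, 0).

```java
import java.util.Arrays;

public class BoxShiftDemo {
    /** Moves a box {llx, lly, urx, ury} so its lower-left corner is (0, 0). */
    public static double[] retranslate(double[] box) {
        return new double[] {0, 0, box[2] - box[0], box[3] - box[1]};
    }

    /** Shifts a box by (dx, dy) without changing its size. */
    public static double[] shift(double[] box, double dx, double dy) {
        return new double[] {box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy};
    }

    public static void main(String[] args) {
        double[] mediaBox = {-72, -72, 1368, 936};
        double[] trimBox = {0, 0, 1296, 864};

        // MediaBox moved to the origin: same size, no negative coordinates.
        System.out.println(Arrays.toString(retranslate(mediaBox)));

        // Alternative for the inner box: shift it by the MediaBox offset
        // so it keeps its position relative to the MediaBox.
        double dx = -mediaBox[0];
        double dy = -mediaBox[1];
        System.out.println(Arrays.toString(shift(trimBox, dx, dy)));
    }
}
```

The MediaBox becomes [0.0, 0.0, 1440.0, 1008.0]. One thing to watch: moving each box to the origin independently would leave the TrimBox at [0,0,1296,864] and drop its 72-unit inset from the MediaBox. Shifting every box by the same offset, as in the second call, gives [72.0, 72.0, 1368.0, 936.0] and keeps the inset. Which of the two is wanted depends on how the boxes are used downstream.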