Posted to users@pdfbox.apache.org by Fabian Zünd SI-Solutions Gmbh <zu...@si-solutions.ch> on 2024/01/03 08:43:13 UTC

Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Good Day

The platform I'm developing for recently switched from PDFBox 2.X to 3.0.0.

I created an add-on which generates PDF documentation of the PBX for customers.
This PDF contains multiple A4 pages, some in the normal portrait format, some rotated into landscape format for more space.

I use «Template» pages, which are single-page PDFs (Cover Sheet.pdf, Normal_page.pdf, Normal_page_landscape.pdf). I create a copy of one of them for every page in the main PDF, depending on what the user chooses for the documentation.

In 2.X I used the integrated PDFCloneUtility to create a copy of the template page(s) and add it to the main PDF using this:

PDPage SelectedPage = PDFSource.getPage(PageNumber);
PDFCloneUtility PDC = new PDFCloneUtility(PDFTarget);
COSDictionary PD = (COSDictionary) PDC.cloneForNewDocument(SelectedPage);
PDPage ClonedPage = new PDPage(PD);
PDFTarget.addPage(ClonedPage);

But since PDFCloneUtility is protected in 3.0.0, I switched over to using the PDDocument importPage method.

PDPage SelectedPage = PDFSource.getDocument().getPage(PageNumber);
PDPage PDCopiedPage = PDFTarget.importPage(SelectedPage);

Everything seemed fine when testing. But when I started to generate the full documentation, the finished PDF did contain all pages, but Adobe throws a lot of errors and all the landscape pages are blank.

If I only generate portrait pages (Generated_PDF_Portraint_only.pdf) or only landscape pages (Generated_PDF_Landscape_only.pdf), everything is fine, but when I mix them (Generated_PDF_Mixed.pdf), the result is broken.
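
One thing that may help when narrowing this down is logging whether each template page is effectively portrait or landscape before it is imported. The sketch below is an illustration only and not part of the original add-on: the class and method names are made up, and it assumes the width, height and rotation have already been read from the page (in PDFBox, PDPage.getMediaBox() and PDPage.getRotation() provide them). A /Rotate value of 90 or 270 swaps the visible width and height.

```java
final class PageOrientation {

    /**
     * Returns true if the page is displayed wider than tall.
     * Width and height are the media box dimensions in PDF points;
     * rotation is the page's /Rotate value in degrees (a multiple of 90).
     */
    static boolean isLandscape(float mediaBoxWidth, float mediaBoxHeight, int rotation) {
        // Normalize negative or large rotation values into the range 0..359
        int normalized = ((rotation % 360) + 360) % 360;
        // A quarter turn swaps the displayed width and height
        boolean swapped = normalized == 90 || normalized == 270;
        float displayedWidth = swapped ? mediaBoxHeight : mediaBoxWidth;
        float displayedHeight = swapped ? mediaBoxWidth : mediaBoxHeight;
        return displayedWidth > displayedHeight;
    }

    public static void main(String[] args) {
        // A4 is roughly 595 x 842 points
        System.out.println("A4, no rotation: landscape=" + isLandscape(595f, 842f, 0));
        System.out.println("A4, /Rotate 90:  landscape=" + isLandscape(595f, 842f, 90));
        System.out.println("842 x 595 box:   landscape=" + isLandscape(842f, 595f, 0));
    }
}
```

A landscape template can be stored either as a wide media box or as a portrait media box with a rotation, so both cases are worth checking.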

I don't know exactly what could be causing this issue. I was hoping somebody might have a clue where it could come from.
Maybe I'm misunderstanding the importPage method, and it is not actually the correct way to clone pages?

Sincerely
Fabian Zünd

AW: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Fabian Zünd SI-Solutions Gmbh <zu...@si-solutions.ch>.
Hello Andreas

I tried the build https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.2-SNAPSHOT/pdfbox-app-3.0.2-20240303.134023-94.jar from 03.03.2024 and it is indeed working as intended.

Thank you for the timely fix!

Kind regards

Fabian Zünd
ICT Technician EFZ / Module Programmer

Industriestrasse 19
CH-9450 Altstätten SG

+41 71 595 10 60
+41 77 261 16 21

www.si-solutions.ch
zuend@si-solutions.ch


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Andreas Lehmkühler <an...@lehmi.de.INVALID>.
Hi,

I guess I've fixed https://issues.apache.org/jira/browse/PDFBOX-5752 and 
the fix works for PDFBOX-5775 as well.

@Fabian please give the newest SNAPSHOT build of 3.0.2 a try

Andreas



AW: AW: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Fabian Zünd SI-Solutions Gmbh <zu...@si-solutions.ch>.
Thank you for the feedback. 

I wanted to try a workaround in which I'd have three reference pages that I would duplicate within the document itself.

While researching whether there is a way other than the importPage method, I stumbled upon this: https://copyprogramming.com/howto/pdfbox-how-to-clone-a-page

"Encountering an issue with large result sets; upon opening the pdf, the first two pages appear perfect, however, the third page causes Acrobat to error out. The template page is visible, but the report data is missing. Suspecting the issue lies with the aforementioned code, any suggestions would be greatly appreciated. Thank you."

And it seems you were able to help him:

"I was able to resolve my query with the assistance of Tilman Hausherr, who guided me towards the correct path."

So for now I'll try this approach: copying the PDF pages internally in the template, filling them, and deleting the three reference pages at the start when done, using the code given on that webpage.

Kind regards
 
Fabian Zünd


Re: AW: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Tilman Hausherr <TH...@t-online.de>.
On 21.02.2024 16:07, Fabian Zünd SI-Solutions Gmbh wrote:
> Hello, I managed to try it all out with the most current build, pdfbox-app-3.0.2-20240221.085334-88.jar.
>
> The issue persists.
>
> Maybe i'm doing the copying of the page completely wrong?

Hi,

You did nothing wrong. Sadly, this is the problem that I mentioned in my 
last mail. I've created https://issues.apache.org/jira/browse/PDFBOX-5775

Tilman

AW: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Fabian Zünd SI-Solutions Gmbh <zu...@si-solutions.ch>.
Hello, I managed to try it all out with the most current build, pdfbox-app-3.0.2-20240221.085334-88.jar.

The issue persists.

Maybe I'm copying the page completely wrong?

This is the code I'm now using to test copying/importing single pages into a different document.
_____________________________________________________________
			
// Imports required by this snippet (PDFBox 3.x)
import java.io.File;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts.FontName;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;

System.out.println("Generating Document");
// Create an empty target document with an AcroForm that defaults to Helvetica
PDDocument Document = new PDDocument();
PDAcroForm Form = new PDAcroForm(Document);
PDResources Resources = new PDResources();
Resources.put(COSName.HELV, new PDType1Font(FontName.HELVETICA));
Document.getDocumentCatalog().setAcroForm(Form);
Form.setDefaultResources(Resources);
Form.setDefaultAppearance("/Helv 0 Tf 0 g");

// Template PDFs (one page each)
File PDFRef1 = new File("C:\\Temp\\Page1.pdf");
File PDFRef2 = new File("C:\\Temp\\Page2.pdf");
File PDFRef3 = new File("C:\\Temp\\Page3.pdf");

// Load the templates
System.out.println("Loading Template Documents");
PDDocument PDF1 = Loader.loadPDF(PDFRef1);
PDDocument PDF2 = Loader.loadPDF(PDFRef2);
PDDocument PDF3 = Loader.loadPDF(PDFRef3);

// Get the first page of each template PDF
PDPage PDF1Page = PDF1.getPage(0);
PDPage PDF2Page = PDF2.getPage(0);
PDPage PDF3Page = PDF3.getPage(0);

// Import all pages into the target document
System.out.println("Importing Pages");
PDPage PDF1Imported = Document.importPage(PDF1Page);
PDPage PDF2Imported = Document.importPage(PDF2Page);
PDPage PDF3Imported = Document.importPage(PDF3Page);

// Remove any previous export, then save the result
System.out.println("Saving Document");
File Export = new File("C:\\Temp\\Export.pdf");
if (Export.exists())
{
	Export.delete();
}

Document.save(Export);
System.out.println("Done");
		

_____________________________________________________________

The result is as follows:

The very first imported page is completely fine.
Any follow-up page loads as long as it's in the same orientation as the original. Its text, however, turns into something quite strange: it resembles the original text, but the characters are scrambled.
If the rotation of any follow-up page differs from that of the first page, the document becomes corrupt to the point where Adobe throws an error.

For example, the text of page two:

Original:
––
Tel.: +49 (0) 721 / 151 042 – 0
Fax.: +49 (0) 721 / 151 042 – 99
E-Mail: info@starface.com
Web: www.starface.com
USt-ID: DE243439720
Geschäftsführung:
Florian Buzin
Barbara Mauve
Jürgen Signer
Thomas Weiss
Bank: Sparkasse Karlsruhe
BLZ: 660 501 01
Konto: 108 015 108
IBAN: DE 0566 0501 0101 0801 5108
BIC: KARSDE66XXX
STARFACE GmbH
Adlerstr.61
76137 Karlsruhe
Amtsgericht Mannheim
HRB 110990

After importing the page into the other document it turns into this:
––
.Tel: 9+4 )0( 127 / 151 240 – 0
:Fax. 9+4 )0( 127 / 151 240 – 9
E - :Mail c.eftara@sfino cmo
:beW c.e.swftarwaw cmo
StU - :ID 02793434E2D
:gnusrfhtüsGceäh
ainFlor zinBu
arabBra vMaue
negJrü reSgin
amsTho siseW
:kBna s kesraSap ehsurlKra
:BZL 06 105 10
t:Kno 801 510 801
:ANIB E D 50 6 6 50 10 0 1 10 1080 15 80
B:IC X6E6KSADR
EACSFATR GHbm
.16stAelrdr
3167 7 ehlsuKra
cth irseAgmt imehnMan
B RH 1 0901
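
To check the impression that the characters are only reordered rather than replaced, comparing the sorted characters of each token can help. This is a minimal sketch in plain Java; the class name is hypothetical, and the token pairs are taken from the first line of the two listings above.

```java
import java.util.Arrays;

final class ScrambleCheck {

    /** Returns true if both strings contain exactly the same characters, in any order. */
    static boolean sameCharacters(String original, String imported) {
        char[] a = original.toCharArray();
        char[] b = imported.toCharArray();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        // Token pairs from "Tel.: +49 (0) 721 / 151 042" and its imported counterpart
        String[][] pairs = {
            {"Tel.:", ".Tel:"},
            {"+49", "9+4"},
            {"(0)", ")0("},
            {"721", "127"},
            {"042", "240"},
        };
        for (String[] pair : pairs) {
            System.out.println(pair[0] + " vs " + pair[1]
                    + " -> same characters: " + sameCharacters(pair[0], pair[1]));
        }
    }
}
```

Tokens that fail this check, for example because a character is missing entirely, would point to something other than a simple reordering.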

I have no idea what I'm doing wrong. Any leads?

The reference PDFs are generated in Word and each saved as a one-page PDF.
We're doing it this way because the customer might want to add their own "templates" with page designs that we add content to.

Sincerely
Fabian Zünd


AW: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Fabian Zünd SI-Solutions Gmbh <zu...@si-solutions.ch>.
Thanks for the feedback.

As I have no control over the PDFBox version, I'll have to contact the software developer.
They'll have to update the library that comes with the PBX.

I can deliver my own .jar files with the add-ons, but loading both the 3.0.0 and 3.0.1 JAR files in the same runtime will surely lead to conflicts.
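
Since the concern is which PDFBox JAR actually wins at runtime, a small check like the following can print where a class was loaded from. It is only a sketch: the class and method names are invented for illustration, it uses only standard Java reflection, and a version is shown only if the JAR's manifest provides an Implementation-Version entry.

```java
import java.security.CodeSource;

final class LoadedLibraryInfo {

    /** Describes the version and origin of the given class, if it can be found. */
    static String describe(String className) {
        try {
            // Look the class up without running its static initializers
            Class<?> type = Class.forName(className, false,
                    LoadedLibraryInfo.class.getClassLoader());

            Package pkg = type.getPackage();
            String version = pkg == null ? null : pkg.getImplementationVersion();

            // The code source is null for classes loaded by the bootstrap loader
            CodeSource source = type.getProtectionDomain().getCodeSource();
            String location = (source == null || source.getLocation() == null)
                    ? "unknown location"
                    : source.getLocation().toString();

            return className + ": version=" + (version == null ? "unknown" : version)
                    + ", loaded from " + location;
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " is not on the classpath";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("org.apache.pdfbox.pdmodel.PDDocument"));
    }
}
```

Running this inside the add-on, rather than standalone, would show which copy the platform's class loader resolves.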


Re: Importing landscape format and portrait format oriented pages into the same PDF causes PDF corruption

Posted by Tilman Hausherr <TH...@t-online.de>.
Please retry with 3.0.1 and, if it still doesn't work, with the current
snapshot version, because there have been several bugs related to
including "foreign" pages in PDFs.
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/3.0.2-SNAPSHOT/
Tilman
