Posted to users@pdfbox.apache.org by "Richter, Michael" <m....@tu-berlin.de> on 2019/01/11 11:22:04 UTC

PDFBox 2.0.12 creates invalid PDF/A

Hi,

I'm trying to create PDF/A files with PDFBox 2.0.12. My code is based on the example code from the PDFBox repo. It creates the PDF file, and I can verify that the file contains XMP data with my parameters. But tools like veraPDF and pdfPilot report it as invalid (no PDF/A info, no metadata).

So I converted the file using pdfPilot, extracted the xpacket XML part from the resulting PDF file, and added this data directly without using the PDFBox API. The resulting PDF validates in both tools.
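
The extraction step described above can be approximated with plain Java. This is only a minimal sketch: it assumes the metadata stream is stored unfiltered (uncompressed) in the file, and the class and method names are made up for illustration. It searches the raw bytes for the xpacket processing instructions instead of parsing the PDF structure.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox.
final class XPacketExtractor {
    private static final String BEGIN = "<?xpacket begin=";
    private static final String END = "<?xpacket end=";

    /** Returns the first XMP packet found in the raw bytes, or null if none is present. */
    static String extractFirstPacket(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data survives the round trip.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        int start = raw.indexOf(BEGIN);
        if (start < 0) {
            return null;
        }
        int endMarker = raw.indexOf(END, start);
        if (endMarker < 0) {
            return null;
        }
        int close = raw.indexOf("?>", endMarker);
        if (close < 0) {
            return null;
        }
        // Restore the original bytes of the packet, then decode them as UTF-8 (assumed encoding).
        byte[] packet = raw.substring(start, close + 2).getBytes(StandardCharsets.ISO_8859_1);
        return new String(packet, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the PDF file to inspect.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(extractFirstPacket(bytes));
    }
}
```

Comparing the packet extracted from the PDFBox output with the one from the converted file is a quick way to see where the two differ.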

So I suspect that the way PDFBox writes XMP data into PDF files makes them invalid. Can someone confirm that, or is there something wrong in the way I do it?

I also posted a question on Stack Overflow with further details:
https://stackoverflow.com/questions/54142847/pdfbox-2-does-not-create-pdf-a-file

Greetings
--
Michael Richter
Abt. Online-Dienste und IT-Entwicklung

Technische Universität Berlin
Universitätsbibliothek
Fasanenstraße 88
10623 Berlin

www.tu-berlin.de

Re: PDFBox 2.0.12 creates invalid PDF/A

Posted by "Richter, Michael" <m....@tu-berlin.de>.
Interesting. Using your XML code works here too. So I need to use another XML Transformer. Actually it should be Saxon here, but I haven't verified this. I'll look into it. Thanks.
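
One way to verify which XML transformer is active is to ask JAXP directly. The sketch below assumes that the XMP serializer obtains its transformer through the standard `TransformerFactory.newInstance()` lookup, which this thread does not confirm; the class name `TransformerCheck` is made up for illustration.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic helper, not part of PDFBox.
final class TransformerCheck {
    /** Returns the class name of the TransformerFactory that the JAXP lookup resolves to. */
    static String activeFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // An explicit override, if any; null means the service-provider lookup decides.
        System.out.println("javax.xml.transform.TransformerFactory property: "
                + System.getProperty("javax.xml.transform.TransformerFactory"));
        System.out.println("Resolved implementation: " + activeFactoryName());
    }
}
```

If the resolved class belongs to Saxon, a library on the classpath has replaced the JDK's built-in transformer. Setting the `javax.xml.transform.TransformerFactory` system property is one possible way to pin a specific implementation.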

Michael

On Friday, 11.01.2019 at 12:46 +0100, Tilman Hausherr wrote:
> I just ran CreatePDF.java from the example project with only one line
> changed ("id.setPart(2);") and then successfully validated it on
> https://www.pdf-online.com/osa/validate.aspx
> and with VeraPDF.
>
> The XML is:
>
> <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta xmlns:x="adobe:ns:meta/">
>    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>      <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
>        <dc:title>
>          <rdf:Alt>
>            <rdf:li xml:lang="x-default">target/test-output/PDFA.pdf</rdf:li>
>          </rdf:Alt>
>        </dc:title>
>      </rdf:Description>
>      <rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" rdf:about="">
>        <pdfaid:part>2</pdfaid:part>
>        <pdfaid:conformance>B</pdfaid:conformance>
>      </rdf:Description>
>    </rdf:RDF>
> </x:xmpmeta><?xpacket end="w"?>
>
> I see one difference in the title line. See this:
> https://www.mail-archive.com/users@pdfbox.apache.org/msg09256.html
>
> Tilman




Re: PDFBox 2.0.12 creates invalid PDF/A

Posted by Tilman Hausherr <TH...@t-online.de>.
I just ran CreatePDF.java from the example project with only one line 
changed ("id.setPart(2);") and then successfully validated it on
https://www.pdf-online.com/osa/validate.aspx
and with VeraPDF.

The XML is:


<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><x:xmpmeta xmlns:x="adobe:ns:meta/">
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
     <rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
       <dc:title>
         <rdf:Alt>
           <rdf:li xml:lang="x-default">target/test-output/PDFA.pdf</rdf:li>
         </rdf:Alt>
       </dc:title>
     </rdf:Description>
     <rdf:Description xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/" rdf:about="">
       <pdfaid:part>2</pdfaid:part>
       <pdfaid:conformance>B</pdfaid:conformance>
     </rdf:Description>
   </rdf:RDF>
</x:xmpmeta><?xpacket end="w"?>

I see one difference in the title line. See this:
https://www.mail-archive.com/users@pdfbox.apache.org/msg09256.html
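
A packet like the one above can be checked for its PDF/A identification with the JDK's own XML parser. This is a rough sketch rather than what veraPDF or any other validator actually does: it only reads `pdfaid:part` and `pdfaid:conformance`, in either element or attribute form, and the class name `PdfaIdReader` is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of PDFBox or xmpbox.
final class PdfaIdReader {
    static final String PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns part + conformance (e.g. "2B"), or null when either value is missing. */
    static String readPdfaId(String xmpPacket) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace lookups below
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmpPacket)));
            String part = lookup(doc, "part");
            String conformance = lookup(doc, "conformance");
            return (part == null || conformance == null) ? null : part + conformance;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("XMP packet is not well-formed XML", e);
        }
    }

    private static String lookup(Document doc, String localName) {
        // Element form: <pdfaid:part>2</pdfaid:part>
        NodeList elements = doc.getElementsByTagNameNS(PDFAID_NS, localName);
        if (elements.getLength() > 0) {
            return elements.item(0).getTextContent().trim();
        }
        // Attribute form: <rdf:Description pdfaid:part="2" ...>
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            if (description.hasAttributeNS(PDFAID_NS, localName)) {
                return description.getAttributeNS(PDFAID_NS, localName);
            }
        }
        return null;
    }
}
```

A null result, or an exception because the packet is not well-formed, would be consistent with a validator reporting "no PDF/A info", although real validators check considerably more than this.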

Tilman




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org