Posted to fop-users@xmlgraphics.apache.org by Wim VN <wi...@gmail.com> on 2010/10/27 14:40:03 UTC

PDF/A validation

http://www.validatepdfa.com/


The above website lets you send in PDF files (attached to an e-mail) and
returns a rather detailed report about their PDF/A conformance. In my
experience the report can list errors and state that the file fails the
conformance tests, even though opening the same PDF file in Acrobat
Reader indicates that the file is PDF/A conformant.


I thought this could help out FOP users who wish to create PDF/A but don't
have the means of testing conformance. I know it's been helpful in my
recent project and I saw some posts about it in the last few weeks.


It might even be helpful to the Apache FOP developers. Although Acrobat
Reader reported a file I created with FOP 1.0 as conformant, the
validation service still reported a failure and the following issues:




  <problem severity="warning" objectID="8" clause="p32" standard="xmp">XMP
packet read-only</problem>

  <problem severity="warning" objectID="8" clause="p33"
standard="xmp">Recommended padding is not found in the XMP packet trailer</problem>

  <problem severity="error" objectID="8" clause="6.7"
standard="pdfa">Property 'dc:date' used incorrect value type 'simple'
instead of 'rdf:Seq'</problem>

  <problem severity="warning" objectID="8" clause="TN0001"
standard="pdfa">Recommended property 'format' for schema 'dc' missing</problem>



Possibly other validators like preflight and pdf/a manager will also stumble
over these issues, while Acrobat Reader is less strict.


I hope this helps some people in their projects.

Regards

Wim

-- 
View this message in context: http://old.nabble.com/PDF-A-validation-tp30066770p30066770.html
Sent from the FOP - Users mailing list archive at Nabble.com.

Re: PDF/A validation

Posted by Tor-Einar Jarnbjo <to...@jarnbjo.name>.
On 27.10.2010 17:57, Jeremias Maerki wrote:
>>    problem severity="warning" objectID="8" clause="p33"
>> standard="xmp">Recommended padding is not found in the XMP packet trailer
> Same as above (a writable packet implies the padding). I have not found
> any recommendation in XMP, PDF 1.4 or PDF/A-1 about this. As mentioned
> above, the recommended padding is 2-4KB which is rather large especially
> for smaller files. Maybe this could be made configurable if someone
> really wants the padding.

Hi Jeremias,

the "p33" clause in the message actually means "page 33" of the XMP
specification, where it is stated that:

"It is recommended that applications place 2 KB to 4 KB of padding
within the packet. This allows the XMP to be edited in place, and
expanded if necessary, without overwriting existing application
data ..."

The severity is a warning, so the missing padding is not breaking PDF/A 
conformance, but as you can see, it obviously violates the quoted 
recommendation.
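
To make the recommendation concrete, here is a minimal Python sketch of
a packet wrapper that adds such padding. The function names and the
2048-byte default are my own assumptions for illustration, not anything
FOP or XML Graphics Commons provides. The layout (lines of spaces, each
ending in a newline, roughly 100 characters wide) follows my reading of
the XMP packet recommendations.

```python
# Packet id string as it appears in XMP packet headers.
XPACKET_ID = "W5M0MpCehiHzreSzNTczkc9d"


def make_padding(total_bytes: int, line_width: int = 100) -> str:
    """Return exactly total_bytes of whitespace, broken into short lines."""
    lines = []
    remaining = total_bytes
    while remaining > 0:
        width = min(line_width, remaining)
        # width - 1 spaces plus one newline gives exactly `width` bytes.
        lines.append(" " * (width - 1) + "\n")
        remaining -= width
    return "".join(lines)


def wrap_xmp_packet(xmp_xml: str, padding_bytes: int = 2048,
                    writable: bool = True) -> bytes:
    """Wrap serialized XMP in an xpacket header and trailer.

    A writable packet (end="w") gets padding so it can be edited in
    place; a read-only packet (end="r") gets none.
    """
    header = '<?xpacket begin="\ufeff" id="' + XPACKET_ID + '"?>\n'
    padding = make_padding(padding_bytes) if writable else ""
    end_mode = "w" if writable else "r"
    trailer = '<?xpacket end="' + end_mode + '"?>'
    body = xmp_xml.rstrip("\n") + "\n"
    return (header + body + padding + trailer).encode("utf-8")


if __name__ == "__main__":
    packet = wrap_xmp_packet('<x:xmpmeta xmlns:x="adobe:ns:meta/"/>')
    print(len(packet), "bytes in the writable packet")
```

The size difference between the writable and the read-only variant is
exactly the padding, which is the cost discussed in this thread.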

Regards,
Tor


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: PDF/A validation

Posted by Wim VN <wi...@gmail.com>.
8 0 obj
<</Type/Metadata/Subtype/XML/Length 2847>>stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:format>application/pdf</dc:format></rdf:Description>
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/"><pdf:Producer>iText 5.0.5 (c) 1T3XT
BVBA</pdf:Producer></rdf:Description>
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/"><xmp:CreateDate>2010-11-19T11:59:33+01:00</xmp:CreateDate><xmp:ModifyDate>2010-11-19T11:59:33+01:00</xmp:ModifyDate></rdf:Description>
<rdf:Description rdf:about=""
xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"><pdfaid:conformance>B</pdfaid:conformance><pdfaid:part>1</pdfaid:part></rdf:Description>
</rdf:RDF></x:xmpmeta>
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
                                                                                                   
<?xpacket end="w"?>
endstream
endobj

------

The above snippet is part of a PDF/A file created with iText. They obviously
follow those recommendations closely. Indeed, the validator service I
mentioned returns a report stating that the test file is PDF/A-1b compliant.

Adding this to Apache FOP - either by default or configurable - would be
welcome. I can imagine it's not high on the demand list though. Maybe I
should start a petition ;-)
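
For anyone who wants to check their own output before mailing it to the
validator, here is a rough Python sketch that looks at the xpacket
trailer of a metadata stream and reports whether the packet is marked
writable and how much whitespace precedes the trailer. The function name
and the returned tuple are my own invention, and it only inspects the
raw bytes of an uncompressed XMP stream; it is not a PDF parser.

```python
import re

# Trailing whitespace followed by the xpacket trailer processing instruction.
TRAILER_RE = re.compile(
    rb"""([ \t\r\n]*)<\?xpacket\s+end=["']([rw])["']\s*\?>"""
)


def inspect_xmp_trailer(packet: bytes):
    """Return (writable, padding_bytes), or None if no trailer is found.

    padding_bytes counts all whitespace directly before the trailer,
    including the newline that ends the XML itself.
    """
    match = TRAILER_RE.search(packet)
    if match is None:
        return None
    whitespace, mode = match.group(1), match.group(2)
    return (mode == b"w", len(whitespace))


if __name__ == "__main__":
    # Hypothetical sample: 20 padding lines of 99 spaces plus a newline.
    sample = (
        b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
        b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
        + (b" " * 99 + b"\n") * 20
        + b'<?xpacket end="w"?>'
    )
    print(inspect_xmp_trailer(sample))
```

A result with the writable flag set but only a handful of padding bytes
would presumably still trigger the "recommended padding" warning.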

Wim


Jeremias Maerki-2 wrote:
> 
> Thank you for the link, Wim! That is very useful.
> 
> I'll look into these issues. Below I'll list my comments to the
> individual findings...
> 
> I have not found anything in the XMP, PDF 1.4 or PDF/A-1 specs that
> indicates that the XMP packet may not be read-only. Of course, using a
> writable XMP packet opens up additional flexibility. PDF allows
> incremental changes to a document thus providing the ability to override
> the metadata. A writable packet increases PDF file size by 2-4KB just
> for the possibility that a non-PDF-specific tool wants to update the XMP
> packet. I know of no such commonly used tool.
> 
> Same as above (a writable packet implies the padding). I have not found
> any recommendation in XMP, PDF 1.4 or PDF/A-1 about this. As mentioned
> above, the recommended padding is 2-4KB which is rather large especially
> for smaller files. Maybe this could be made configurable if someone
> really wants the padding.
> 
> ...
> 
> 
-- 
View this message in context: http://old.nabble.com/PDF-A-validation-tp30066770p30256667.html
Sent from the FOP - Users mailing list archive at Nabble.com.




Re: PDF/A validation

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Thank you for the link, Wim! That is very useful.

I'll look into these issues. Below I'll list my comments to the
individual findings...

On 27.10.2010 13:40:03 Wim VN wrote:
>   problem severity="warning" objectID="8" clause="p32" standard="xmp">XMP
> packet read-only

I have not found anything in the XMP, PDF 1.4 or PDF/A-1 specs that
indicates that the XMP packet may not be read-only. Of course, using a
writable XMP packet opens up additional flexibility. PDF allows
incremental changes to a document thus providing the ability to override
the metadata. A writable packet increases PDF file size by 2-4KB just
for the possibility that a non-PDF-specific tool wants to update the XMP
packet. I know of no such commonly used tool.

>   problem severity="warning" objectID="8" clause="p33"
> standard="xmp">Recommended padding is not found in the XMP packet trailer

Same as above (a writable packet implies the padding). I have not found
any recommendation in XMP, PDF 1.4 or PDF/A-1 about this. As mentioned
above, the recommended padding is 2-4KB which is rather large especially
for smaller files. Maybe this could be made configurable if someone
really wants the padding.

>   problem severity="error" objectID="8" clause="6.7"
> standard="pdfa">Property 'dc:date' used incorrect value type 'simple'
> instead of 'rdf:Seq'

See https://issues.apache.org/bugzilla/show_bug.cgi?id=49499
ISO 19005-1:2005(E) (PDF/A-1) references the XMP specification from
January 2004 which does not explicitly require values to be in the
exact form that they are specified in the schema. There are various ways
a value can be represented in RDF/XML. Currently, XML Graphics Commons
does not normalize values. Only the XMP spec from 2008 seems to have
some comment concerning this, although I can't find the reference
anymore. But maybe this actually refers to clause "6.7.7 Normalization"
in PDF/A-1 although the text is a bit cryptic.

Anyway, it should be relatively simple to add the normalization. But I
make no promises when (and if) I get to this.
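
To illustrate what such a normalization could look like, here is a
minimal Python sketch using the standard library's ElementTree. It is
not the XML Graphics Commons code; the function name is hypothetical and
the date value in the demo is made up. It rewrites a simple-valued
dc:date element into the rdf:Seq form the validator asks for and leaves
already normalized values alone. A dc:date written as an attribute on
rdf:Description is not handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def normalize_dc_date(rdf_xml: str) -> str:
    """Rewrite <dc:date>value</dc:date> as an rdf:Seq with one rdf:li."""
    # Keep the familiar prefixes when serializing again.
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(rdf_xml)
    # Copy to a list first, since the tree is modified inside the loop.
    for date in list(root.iter("{%s}date" % DC_NS)):
        has_children = len(date) > 0
        text = (date.text or "").strip()
        if has_children or not text:
            continue  # already structured, or empty: leave untouched
        date.text = None
        seq = ET.SubElement(date, "{%s}Seq" % RDF_NS)
        item = ET.SubElement(seq, "{%s}li" % RDF_NS)
        item.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    simple = (
        '<rdf:RDF xmlns:rdf="%s">'
        '<rdf:Description rdf:about="" xmlns:dc="%s">'
        "<dc:date>2010-10-27T14:40:03Z</dc:date>"
        "</rdf:Description></rdf:RDF>" % (RDF_NS, DC_NS)
    )
    print(normalize_dc_date(simple))
```

Running the function twice gives the same structure as running it once,
so it should be safe to apply to metadata that is partly normalized.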

>   problem severity="warning" objectID="8" clause="TN0001"
> standard="pdfa">Recommended property 'format' for schema 'dc' missing

That is easily added, but again, I found no requirement anywhere that
mentions this. I guess that's the reason why these are warnings and not
errors.





Jeremias Maerki

