Posted to users@pdfbox.apache.org by titto agustine <ti...@hotmail.com> on 2015/10/08 16:24:50 UTC

PDFA/1B minimum requirements to pass validation

Hello,

I am creating a PDF/A-1b document using PDFBox, but validation against the PDF/A-1b standard fails with the following error messages in Adobe Acrobat Pro Preflight:

1. Metadata does not conform to XMP
2. Font not embedded (and text rendering mode not 3)

I have two questions:

1. What is the minimum set of metadata required to pass validation?
2. Do the base fonts qualify for PDF/A-1b, or do we need to embed them? (The file size gets too big, as this is done for archiving log files.)

Can you point me to a sample that shows how the metadata and font embedding are done?

I would appreciate a response.

Regards
Augustine

Re: PDFA/1B minimum requirements to pass validation

Posted by Tilman Hausherr <TH...@t-online.de>.
On 09.10.2015 at 20:13, titto agustine wrote:
> Thank you all for your input.
>
> 1. Metadata validation
> I resolved the issue by adding:
> A) PDFAIdentificationSchema
> B) AdobePDFSchema
> C) XMPBasicSchema
>
> If I add DublinCoreSchema, it does not validate properly.

It would be best to upload such a file somewhere so that we can see what happens.
You didn't mention any actual error messages.

Tilman


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: PDFA/1B minimum requirements to pass validation

Posted by titto agustine <ti...@hotmail.com>.
Thank you all for your input.

1. Metadata validation

I resolved the issue by adding:
A) PDFAIdentificationSchema
B) AdobePDFSchema
C) XMPBasicSchema

If I add DublinCoreSchema, it does not validate properly.

2. Font embedding

I embedded just one TrueType font (because of the large volume of files and the size concern), and that resolved the issue.

Thanks for the valuable input and support.

Regards
Augustine



RE: PDFA/1B minimum requirements to pass validation

Posted by Petras Petkus <pe...@mitsoft.lt>.
Hi Olaf,

A little correction to your post.
> PDF/A-1b does not require the presence of any metadata at all.

No, it does. See ISO 19005-1 Chapter 6.7.2: "The document catalog dictionary
of a conforming file shall contain the Metadata key." The metadata shall
contain at least the PDF/A version and conformance level identification
information (see Chapter 6.7.11).
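
As a rough illustration of the identification information referred to here, the sketch below builds a bare-bones XMP packet by hand, using only the Java standard library. The class name MinimalPdfaXmp and its build method are made up for this example. With PDFBox one would normally produce the packet through the xmpbox classes instead, as in the CreatePDFA.java example from the source download. Either way, the packet still has to be stored in a metadata stream that the document catalog's Metadata key points to.

```java
// Sketch only: builds a bare-bones XMP packet carrying the PDF/A identification.
// The class and method names are hypothetical, chosen for this illustration.
class MinimalPdfaXmp {

    /** Returns an XMP packet declaring the given PDF/A part and conformance level. */
    static String build(int part, String conformance) {
        return String.join("\n",
            "<?xpacket begin=\"\uFEFF\" id=\"W5M0MpCehiHzreSzNTczkc9d\"?>",
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">",
            " <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">",
            "  <rdf:Description rdf:about=\"\"",
            "    xmlns:pdfaid=\"http://www.aiim.org/pdfa/ns/id/\">",
            "   <pdfaid:part>" + part + "</pdfaid:part>",
            "   <pdfaid:conformance>" + conformance + "</pdfaid:conformance>",
            "  </rdf:Description>",
            " </rdf:RDF>",
            "</x:xmpmeta>",
            "<?xpacket end=\"w\"?>");
    }

    public static void main(String[] args) {
        // PDF/A-1b: part 1, conformance level B.
        System.out.println(build(1, "B"));
    }
}
```

Whether such a packet alone passes a given validator also depends on the rest of the file, for example on the Info entries discussed elsewhere in this thread.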

With best regards,
Petras Petkus



Re: PDFA/1B minimum requirements to pass validation

Posted by Tilman Hausherr <TH...@t-online.de>.
On 08.10.2015 at 17:10, titto agustine wrote:
> Hello Olaf,
>
> Thanks for the quick and detailed response.
>
> About the metadata, I am using the createpdfaidentificationschema() and just setting the type and part.

Their content must also be identical to what you wrote in /Info. You
could also try the 2.0 version of the preflight app:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/preflight-app/2.0.0-SNAPSHOT/
Some of the error messages have been improved there. It would be best if
you posted them here.

But don't trust us, get a second opinion with this tool:
http://www.pdf-tools.com/pdf/validate-pdfa-online.aspx

Tilman



Re: PDFA/1B minimum requirements to pass validation

Posted by titto agustine <ti...@hotmail.com>.
Hello Olaf, 

Thanks for the quick and detailed response.

About the metadata, I am using the createpdfaidentificationschema() and just setting the type and part.

Is it mandatory to set the document information when using a schema?

I will try to download and load the fonts.


Re: PDFA/1B minimum requirements to pass validation

Posted by Olaf Drümmer <ol...@callassoftware.com>.
Hi Augustine,

> 1. Metadata does not conform to XMP

PDF/A-1b does not require the presence of any metadata at all. Nevertheless, if metadata is present, it must be present as XMP metadata. By implication: if the document Info entry for example contains the Title or Author fields, these fields must also be reflected in matching XMP metadata fields, e.g. dc:title (where dc is the recommended prefix for metadata fields according to the Dublin Core metadata standard). For metadata fields that are expressed using non-standard metadata schemas (e.g. a company specific metadata schema), an "extension schema description" must also be embedded in the XMP metadata stream.
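
To make the Info-to-XMP correspondence more concrete, here is a small sketch in plain Java that flags Info entries without a matching XMP value. The class InfoXmpCheck, its method, and the sample values are invented for this illustration. It treats every value as a flat string, which is a simplification (dc:creator, for instance, is really an ordered list), and it leaves out the date entries because Info and XMP use different date formats.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: class, method and map names are hypothetical.
class InfoXmpCheck {

    // Info dictionary key -> XMP property expected to mirror it in PDF/A-1.
    static final Map<String, String> INFO_TO_XMP = new LinkedHashMap<>();
    static {
        INFO_TO_XMP.put("Title", "dc:title");
        INFO_TO_XMP.put("Author", "dc:creator");
        INFO_TO_XMP.put("Subject", "dc:description");
        INFO_TO_XMP.put("Keywords", "pdf:Keywords");
        INFO_TO_XMP.put("Creator", "xmp:CreatorTool");
        INFO_TO_XMP.put("Producer", "pdf:Producer");
    }

    /** Returns the Info keys whose value is missing from, or differs in, the XMP properties. */
    static List<String> mismatches(Map<String, String> info, Map<String, String> xmp) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : INFO_TO_XMP.entrySet()) {
            String infoValue = info.get(e.getKey());
            // Only entries that are actually present in Info need a counterpart.
            if (infoValue != null && !infoValue.equals(xmp.get(e.getValue()))) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Title", "Server log 2015-10-08");
        info.put("Author", "Augustine");

        Map<String, String> xmp = new LinkedHashMap<>();
        xmp.put("dc:title", "Server log 2015-10-08");

        // Author is set in Info but has no dc:creator counterpart.
        System.out.println(mismatches(info, xmp)); // prints [Author]
    }
}
```

An empty result only means the listed text entries agree. It is not a substitute for running a real PDF/A validator.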

To get started I would try to create a PDF with just one metadata field, and try to get that right. Depending on where you get stuck, please report back… 


> 2. Font not embedded(and text rendering mode not 3)

Using the so-called standard 14 fonts (without embedding them) is not an option in PDF/A, not even for very simple text-centric documents like log files. PDF/A requires that all fonts used are embedded (except for fonts used in invisible text mode, i.e. text rendering mode 3). If font embedding is done right, the fonts do not really need a lot of space.


Olaf




Re: PDFA/1B minimum requirements to pass validation

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

In addition to Olaf's answer: there is an example in the source download,
CreatePDFA.java. Use that one as a guide :-)

Tilman
