Posted to dev@pdfbox.apache.org by "Eric Leleu (JIRA)" <ji...@apache.org> on 2012/07/14 22:09:34 UTC

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414487#comment-13414487 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi,

The refactoring of the font validation has been committed.
I hope this new version is clearer than the old one.

All font validators now follow four steps:
- check that all mandatory fields are present
- run the font descriptor validation
- check the encoding rules
- check the rules linked to the ToUnicode entry (nothing to do in PDF/A-1b)
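
As a rough illustration only, the four steps can be pictured as a template method. The class names, method names, key subset and the encoding rule below are hypothetical placeholders, not the ones in the committed Preflight code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these class and method names are illustrative only.
abstract class SketchFontValidator {
    protected final List<String> errors = new ArrayList<>();

    // Runs the four steps in a fixed order and returns the collected errors.
    final List<String> validate(Map<String, Object> fontDict) {
        checkMandatoryFields(fontDict);
        validateDescriptor(fontDict);
        checkEncoding(fontDict);
        checkToUnicode(fontDict);
        return errors;
    }

    abstract void checkMandatoryFields(Map<String, Object> fontDict);
    abstract void validateDescriptor(Map<String, Object> fontDict);
    abstract void checkEncoding(Map<String, Object> fontDict);

    // Nothing to do in PDF/A-1b, so the default is a no-op.
    void checkToUnicode(Map<String, Object> fontDict) { }
}

class SketchSimpleFontValidator extends SketchFontValidator {
    @Override
    void checkMandatoryFields(Map<String, Object> fontDict) {
        // Assumed, illustrative subset of keys.
        for (String key : Arrays.asList("Type", "Subtype", "BaseFont")) {
            if (!fontDict.containsKey(key)) {
                errors.add("missing mandatory field: " + key);
            }
        }
    }

    @Override
    void validateDescriptor(Map<String, Object> fontDict) {
        // A real validator would hand the descriptor to a dedicated class.
        if (!fontDict.containsKey("FontDescriptor")) {
            errors.add("missing FontDescriptor");
        }
    }

    @Override
    void checkEncoding(Map<String, Object> fontDict) {
        // Placeholder rule: an encoding given by name must be a non-empty string.
        Object encoding = fontDict.get("Encoding");
        if (encoding instanceof String && ((String) encoding).isEmpty()) {
            errors.add("empty Encoding name");
        }
    }
}
```

Each concrete validator only fills in the steps, while the order is fixed in one place.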

Font descriptors are processed by dedicated classes, which also follow four steps:
- check that all mandatory fields are present
- extract the FontFile stream
- process the font file stream in order to compute the glyph widths
- check the font Metadata entry
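
A minimal sketch of these descriptor steps, assuming the descriptor is modelled as a plain map and streams as byte arrays. The helper name, the mandatory key subset and the error messages are hypothetical; only the FontFile, FontFile2 and FontFile3 keys come from the PDF format itself:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not the Preflight classes.
class SketchDescriptorHelper {
    private final List<String> errors = new ArrayList<>();

    List<String> validate(Map<String, Object> descriptor) {
        // Step 1: mandatory fields (assumed, illustrative subset).
        for (String key : Arrays.asList("FontName", "Flags")) {
            if (!descriptor.containsKey(key)) {
                errors.add("missing mandatory field: " + key);
            }
        }
        // Step 2: extract the embedded font program, modelled here as a byte[].
        byte[] fontFile = extractFontFile(descriptor);
        if (fontFile == null) {
            errors.add("font program is not embedded");
        } else {
            // Step 3: a real helper would parse the program and compute glyph widths.
            processFontFile(fontFile);
        }
        // Step 4: the Metadata entry, if present, must be a stream (a byte[] here).
        Object metadata = descriptor.get("Metadata");
        if (metadata != null && !(metadata instanceof byte[])) {
            errors.add("Metadata entry is not a stream");
        }
        return errors;
    }

    // FontFile, FontFile2 and FontFile3 are the PDF keys for embedded font programs.
    private byte[] extractFontFile(Map<String, Object> descriptor) {
        for (String key : Arrays.asList("FontFile", "FontFile2", "FontFile3")) {
            Object value = descriptor.get(key);
            if (value instanceof byte[]) {
                return (byte[]) value;
            }
        }
        return null;
    }

    // Stand-in for font parsing: only rejects an empty program.
    private void processFontFile(byte[] fontFile) {
        if (fontFile.length == 0) {
            errors.add("font program is empty");
        }
    }
}
```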

There are two exceptions to this:
- Type3 fonts, which have specific behaviour
- CompositeFont, which calls a FontValidator for the DescendantFont instead of calling a FontDescriptor
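
The composite case could look roughly like the sketch below, where the composite validator delegates to another validator for each descendant instead of validating a descriptor itself. All type names are hypothetical, and the Type3 case is not covered here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist under these names in Preflight.
interface SketchValidator {
    List<String> validate(Map<String, Object> dict);
}

// Validates a descendant font; here it only checks that a descriptor is present.
class SketchDescendantValidator implements SketchValidator {
    @Override
    public List<String> validate(Map<String, Object> dict) {
        List<String> errors = new ArrayList<>();
        if (!dict.containsKey("FontDescriptor")) {
            errors.add("descendant font has no FontDescriptor");
        }
        return errors;
    }
}

// The composite font has no descriptor of its own: it delegates to the
// validator of each entry in its DescendantFonts array.
class SketchCompositeValidator implements SketchValidator {
    private final SketchValidator descendantValidator;

    SketchCompositeValidator(SketchValidator descendantValidator) {
        this.descendantValidator = descendantValidator;
    }

    @Override
    public List<String> validate(Map<String, Object> fontDict) {
        List<String> errors = new ArrayList<>();
        Object descendants = fontDict.get("DescendantFonts");
        if (!(descendants instanceof List)) {
            errors.add("DescendantFonts is missing or not an array");
            return errors;
        }
        for (Object entry : (List<?>) descendants) {
            if (entry instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> descendant = (Map<String, Object>) entry;
                errors.addAll(descendantValidator.validate(descendant));
            } else {
                errors.add("descendant font is not a dictionary");
            }
        }
        return errors;
    }
}
```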

As in the older version, FontContainer objects hold the information needed for the glyph width validation.
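
To give an idea of the role of such a container, here is a minimal sketch that stores the widths computed from the font program and compares them with the widths declared in the PDF. The class name, its methods and the tolerance parameter are assumptions made for illustration, not the actual FontContainer API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a container used for glyph width checks.
class SketchFontContainer {
    // Widths computed from the embedded font program, keyed by character code.
    private final Map<Integer, Float> programWidths = new HashMap<>();

    void addProgramWidth(int code, float width) {
        programWidths.put(code, width);
    }

    // True when the width declared in the PDF matches the width from the font
    // program within the given tolerance; unknown codes are inconsistent.
    boolean isWidthConsistent(int code, float declaredWidth, float tolerance) {
        Float actual = programWidths.get(code);
        return actual != null && Math.abs(actual - declaredWidth) <= tolerance;
    }
}
```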

Any feedback is welcome.

BR,
Eric

PS: I will be offline for one week.
                
> Refactor the PdfA parser
> ------------------------
>
>                 Key: PDFBOX-1312
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1312
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Preflight
>    Affects Versions: 1.7.0
>            Reporter: Eric Leleu
>            Assignee: Eric Leleu
>             Fix For: 1.8.0
>
>         Attachments: patch-PDFBOX-1312.txt.gz
>
>
> To fix the PDFBOX-1274 issue, the validation of PDF/A needs refactoring.
> Currently, each XRef entry is checked independently.
> Most of the time this is enough, because the information required to validate the object is present in the object itself.
> For the issue PDFBOX-1274, the object validation needs access to the page that uses the object.
> After the refactoring, the validation unit will be the PDPage.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira