Posted to dev@pdfbox.apache.org by "Eric Leleu (JIRA)" <ji...@apache.org> on 2012/05/16 09:28:10 UTC

[jira] [Created] (PDFBOX-1312) Refactor the PdfA parser

Eric Leleu created PDFBOX-1312:
----------------------------------

             Summary: Refactor the PdfA parser
                 Key: PDFBOX-1312
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1312
             Project: PDFBox
          Issue Type: Improvement
          Components: Preflight
    Affects Versions: 1.7.0
            Reporter: Eric Leleu
            Assignee: Eric Leleu


To fix issue PDFBOX-1274, the PDF/A validation needs to be refactored.
Currently, each XRef entry is checked independently.
Most of the time this is enough, because the information required to validate an object is present in the object itself.

For issue PDFBOX-1274, however, the object validation needs access to the page that uses the object.

After the refactoring, the validation unit will be the PDPage.
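
To illustrate what making the page the validation unit could look like, here is a minimal sketch in Java. PageSketch and PageUnitValidationSketch are hypothetical stand-ins invented for this example; they are not PDFBox or Preflight classes, and the real implementation may be organised quite differently. The only point shown is that each object is reached through the page that uses it, so the check can see page-level context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: walks objects page by page instead of entry by entry.
class PageUnitValidationSketch {
    // Visits every object through the page that uses it, so each check has
    // access to page-level context (here, only the page number).
    static List<String> validate(List<PageSketch> pages) {
        List<String> visited = new ArrayList<String>();
        for (PageSketch page : pages) {
            for (String objectId : page.objectIds) {
                visited.add("page " + page.number + ": " + objectId);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        List<PageSketch> pages = Arrays.asList(
                new PageSketch(1, "font F1", "image Im0"),
                new PageSketch(2, "font F1"));
        for (String line : validate(pages)) {
            System.out.println(line);
        }
    }
}

// Hypothetical stand-in for a page and the objects it references.
class PageSketch {
    final int number;
    final List<String> objectIds;

    PageSketch(int number, String... objectIds) {
        this.number = number;
        this.objectIds = Arrays.asList(objectIds);
    }
}
```

Note that an object shared by two pages is visited once per page in this sketch; whether the real validator caches such results is not stated in this thread.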



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397567#comment-13397567 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi,


There is a new directory 'pdfbox' under 'apache' (to be consistent with the package naming convention of the other PDFBox modules).
I worked in this new package to develop the refactored Preflight version.

Once the new Preflight implementation has been validated, the 'padaf' directory will be removed.


BR,
Eric
                
> Refactor the PdfA parser
> ------------------------
>
>                 Key: PDFBOX-1312
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1312
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Preflight
>    Affects Versions: 1.7.0
>            Reporter: Eric Leleu
>            Assignee: Eric Leleu
>             Fix For: 1.8.0
>
>         Attachments: patch-PDFBOX-1312.txt.gz
>
>
> To fix issue PDFBOX-1274, the PDF/A validation needs to be refactored.
> Currently, each XRef entry is checked independently.
> Most of the time this is enough, because the information required to validate an object is present in the object itself.
> For issue PDFBOX-1274, however, the object validation needs access to the page that uses the object.
> After the refactoring, the validation unit will be the PDPage.


        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403823#comment-13403823 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi,

I have just committed the previous patch with some minor changes. One of these changes is the renaming of the validateDocument() method of PreflightDocument to validate().

     PreflightParser parser = new PreflightParser(new FileDataSource(target));
     parser.parse();
     PreflightDocument document = (PreflightDocument) parser.getPDDocument();
     document.validate();
     Assert.assertTrue(document.getResult().isValid()); 

Some work still has to be done; the most important item is refactoring the way Font requirements are validated.

BR,
Eric

                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430286#comment-13430286 ] 

William Fausser commented on PDFBOX-1312:
-----------------------------------------

Hi Eric,

I'm OK with removing the "old implementation" of Preflight.

Regards,
Bill
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403877#comment-13403877 ] 

William Fausser commented on PDFBOX-1312:
-----------------------------------------

Hi Eric,

Thank you for all the progress and work so far... :)


I got it compiled and will test soon, after the vacation.

Regards,
Bill
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429428#comment-13429428 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi,


Can I remove the "old implementation" of Preflight to use only the new one?
Currently all my test files return the expected result (226 invalid PDFs [Isartor and some generated PDFs], 63 valid PDF/A-1b files).


BR,
Eric
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399300#comment-13399300 ] 

William Fausser commented on PDFBOX-1312:
-----------------------------------------

whoops found it......

 public abstract void validate() throws ValidationException; in XObjectValidator.java
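
For readers hitting the same compiler message reported in this thread (void found where a java.util.List was required), the general Java rule is that an implementing method must have a return type compatible with the one declared in the interface. The sketch below uses hypothetical names (ValidatorSketch, AbstractValidatorSketch, CheckFailedSketch) rather than the real Preflight classes; it only shows the consistent form, with both sides declaring void and reporting failure through an exception. That the thread's error came from one file still carrying the older signature is an assumption, not something the messages confirm.

```java
// Hypothetical demo; none of these types are Preflight classes.
class ReturnTypeDemo {
    public static void main(String[] args) throws CheckFailedSketch {
        ValidatorSketch validator = new AbstractValidatorSketch() {
            @Override
            protected boolean isValid() {
                return true;
            }
        };
        validator.validate();
        System.out.println("validated without error");
    }
}

// Checked exception used to report a failed validation.
class CheckFailedSketch extends Exception {
    CheckFailedSketch(String message) {
        super(message);
    }
}

interface ValidatorSketch {
    // If this were declared to return a java.util.List instead of void, the
    // void method in AbstractValidatorSketch would fail to compile with an
    // "incompatible return type" error.
    void validate() throws CheckFailedSketch;
}

abstract class AbstractValidatorSketch implements ValidatorSketch {
    protected abstract boolean isValid();

    // Same return type as the interface method, so this compiles.
    @Override
    public void validate() throws CheckFailedSketch {
        if (!isValid()) {
            throw new CheckFailedSketch("object is not valid");
        }
    }
}
```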
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414487#comment-13414487 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi,

The refactoring of Font validation is committed.
I hope this new version is clearer than the old one.

All font validators now have 4 steps:
- check that all mandatory fields are present
- process the FontDescriptor validation
- check the encoding rules
- check the rules linked to the toUnicode entry (nothing to do in PDF/A-1b)

Font Descriptors are processed by specific classes, also in 4 steps:
- check that all mandatory fields are present
- extract the FontFile stream
- process the font file stream in order to compute Glyph Widths
- check the Font Metadata entry

There are two exceptions to this:
- Type3 Font, which has specific behaviour
- CompositeFont, which calls a FontValidator for the DescendantFont instead of calling a FontDescriptor

As in the older version, FontContainer objects contain the information needed for Glyph width validation.
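
The four-step flow described above can be sketched as a template method. This is only an illustration under stated assumptions: FontValidatorSketch, SimpleFontSketch and CompositeFontSketch are hypothetical names, the steps merely record that they ran, and the actual Preflight classes may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; none of these types are Preflight classes.
class FontStepsDemo {
    public static void main(String[] args) {
        FontValidatorSketch composite = new CompositeFontSketch(new SimpleFontSketch());
        for (String step : composite.validate()) {
            System.out.println(step);
        }
    }
}

abstract class FontValidatorSketch {
    final List<String> steps = new ArrayList<String>();

    // Runs the four steps in the order listed in the message.
    final List<String> validate() {
        checkMandatoryFields();
        processFontDescriptor();
        checkEncoding();
        checkToUnicode();
        return steps;
    }

    void checkMandatoryFields() { steps.add("mandatory fields"); }
    void processFontDescriptor() { steps.add("font descriptor"); }
    void checkEncoding() { steps.add("encoding"); }
    void checkToUnicode() { steps.add("toUnicode"); }
}

// A font that follows the default four steps unchanged.
class SimpleFontSketch extends FontValidatorSketch { }

// A composite font delegates to a validator for its descendant font
// instead of processing a font descriptor of its own.
class CompositeFontSketch extends FontValidatorSketch {
    private final FontValidatorSketch descendant;

    CompositeFontSketch(FontValidatorSketch descendant) {
        this.descendant = descendant;
    }

    @Override
    void processFontDescriptor() {
        for (String step : descendant.validate()) {
            steps.add("descendant: " + step);
        }
    }
}
```

Keeping the step order in one final method means a special case such as the composite font only overrides the step that differs.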

Any feedback is welcome.

BR,
Eric

PS: I will be offline for one week.
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435133#comment-13435133 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Hi, 

The old implementation has just been removed.

BR
Eric
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399283#comment-13399283 ] 

William Fausser commented on PDFBOX-1312:
-----------------------------------------

Hi Eric,
I've been trying to compile and get the patch installed, and my brain is starting to hurt... :)

I can't seem to get past this type of error:
/home/fausser/pdfbox-1.7.0/preflight/src/main/java/org/apache/pdfbox/preflight/xobject/AbstractXObjValidator.java:[149,13] validate() in org.apache.pdfbox.preflight.xobject.AbstractXObjValidator cannot implement validate() in org.apache.pdfbox.preflight.xobject.XObjectValidator; attempting to use incompatible return type
found   : void
required: java.util.List<org.apache.pdfbox.preflight.ValidationResult.ValidationError>

Any ideas on the above error?

BR,
Bill

                

        

[jira] [Updated] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Fausser updated PDFBOX-1312:
------------------------------------

    Attachment:     (was: ghostpdlPDFA.pdf)
    

        

[jira] [Updated] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Leleu updated PDFBOX-1312:
-------------------------------

    Fix Version/s: 1.8.0
    

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462164#comment-13462164 ] 

Eric Leleu commented on PDFBOX-1312:
------------------------------------

Syntax validation is done by the Preflight Parser (r1389604)
                

[jira] [Updated] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

William Fausser updated PDFBOX-1312:
------------------------------------

    Attachment: ghostpdlPDFA.pdf
    

        

[jira] [Updated] (PDFBOX-1312) Refactor the PdfA parser

Posted by "Eric Leleu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Leleu updated PDFBOX-1312:
-------------------------------

    Attachment: patch-PDFBOX-1312.txt.gz

Hi,

Attached is a patch that contains the first version of the new Preflight implementation (patch-PDFBOX-1312.txt.gz).

This new implementation and all underlying classes are in the "org.apache.pdfbox.preflight" package.

This implementation passes the Isartor Test Suite.

Here is the new way to validate a document:

PreflightParser parser = new PreflightParser(new FileDataSource(target));
parser.parse();

PreflightDocument document = (PreflightDocument) parser.getPDDocument();
document.validateDocument();
  		
Assert.assertTrue(document.getResult().isValid());


If you encounter any issues with this new version, please update this thread.

BR,
Eric
                

        

[jira] [Commented] (PDFBOX-1312) Refactor the PdfA parser

Posted by "William Fausser (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PDFBOX-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397506#comment-13397506 ] 

William Fausser commented on PDFBOX-1312:
-----------------------------------------

Hi Eric,

I have a question about the patch. Under pdfbox-1.7.0, the directory structure /preflight/src/main/java/org/apache/padaf/preflight is used, but the patch refers to
/preflight/src/main/java/org/apache/pdfbox/preflight. Are you creating a new directory 'pdfbox' under
apache alongside padaf, or is this a mistake? Apologies if I'm overlooking something...

Regards,
Bill
                