Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2015/03/08 15:27:38 UTC

[jira] [Commented] (PDFBOX-1792) Different metadata with NonSequentialPDFParser

    [ https://issues.apache.org/jira/browse/PDFBOX-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352060#comment-14352060 ] 

Andreas Lehmkühler commented on PDFBOX-1792:
--------------------------------------------

The test case is back in place, as Thomas reverted his changes some time ago.

The metadata of the attached PDF can't be extracted because it contains the unsupported namespace "http://ns.adobe.com/xfa/promoted-desc/".

[~msahyoun] I can't find any detailed information about that namespace. It seems to be related to Adobe LiveCycle. Can you shed some light on this?
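
For anyone who wants to check which namespaces an XMP packet declares, here is a minimal sketch using only the JDK's DOM parser. It does not use the PDFBox or xmpbox API. The class name, the KNOWN allow-list and the SAMPLE packet are all made up for illustration; in particular, KNOWN is not the set of namespaces that xmpbox actually supports.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class XmpNamespaceCheck {

    // Hypothetical allow-list, for illustration only.
    static final Set<String> KNOWN = Set.of(
            "adobe:ns:meta/",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
            "http://purl.org/dc/elements/1.1/",
            "http://ns.adobe.com/xap/1.0/",
            "http://ns.adobe.com/pdf/1.3/");

    // Namespace that namespace-aware DOM parsers assign to xmlns declarations.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Made-up XMP packet that declares the namespace mentioned in this issue.
    static final String SAMPLE =
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
            + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
            + "<rdf:Description rdf:about=\"\""
            + " xmlns:desc=\"http://ns.adobe.com/xfa/promoted-desc/\">"
            + "<desc:version rdf:parseType=\"Resource\"/>"
            + "</rdf:Description>"
            + "</rdf:RDF>"
            + "</x:xmpmeta>";

    // Returns every declared namespace URI that is not in KNOWN.
    static Set<String> unknownNamespaces(String xmp) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations, since the input may be untrusted.
            factory.setFeature(
                    "http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));

            Set<String> unknown = new LinkedHashSet<>();
            NodeList elements = doc.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                NamedNodeMap attrs = elements.item(i).getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Node attr = attrs.item(j);
                    // Only look at xmlns / xmlns:prefix declarations.
                    if (XMLNS_NS.equals(attr.getNamespaceURI())
                            && !KNOWN.contains(attr.getNodeValue())) {
                        unknown.add(attr.getNodeValue());
                    }
                }
            }
            return unknown;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XMP packet", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(unknownNamespaces(SAMPLE));
    }
}
```

This only reports what the packet declares. Whether a given namespace should be skipped or treated as an error is a separate decision for the metadata parser.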

> Different metadata with NonSequentialPDFParser
> ----------------------------------------------
>
>                 Key: PDFBOX-1792
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1792
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.3
>            Reporter: Tim Allison
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: PDFBOX-1792.tar.gz, testPDF_acroForm2.pdf
>
>
> The traditional parser is able to extract metadata from a test document from TIKA-738.  The NonSequentialPDFParser is not able to extract metadata from that file.  Another file from the Tika test suite has metadata that can be extracted by the NonSequentialPDFParser but not by the traditional parser.


