Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/15 23:26:38 UTC
[jira] [Commented] (TIKA-1203) Some metadata not extracted from PDF files when NonSequentialPDFParser is used
[ https://issues.apache.org/jira/browse/TIKA-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362596#comment-14362596 ]
Tyler Palsulich commented on TIKA-1203:
---------------------------------------
This looks like it's just a date formatting issue:
{{PDFParserTest.testPdfParsing:104 expected:<\[Sat Sep 15 10:02:31 BST 2007]> but was:<\[2007-09-15T09:02:31Z]>}}
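As a quick sanity check on the "formatting only" reading, here is a minimal sketch that rebuilds both values from the failure message and compares them as instants. It assumes "BST" means British Summer Time (Europe/London); the class and method names are made up for illustration and are not part of Tika or PDFBox.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper, not Tika code: checks whether the two values in the
// failure message denote the same point in time.
class DateFormatCheck {

    // "Sat Sep 15 10:02:31 BST 2007", rebuilt field by field.
    // Assumption: BST is British Summer Time, i.e. the Europe/London zone.
    static Instant expected() {
        return ZonedDateTime
                .of(2007, 9, 15, 10, 2, 31, 0, ZoneId.of("Europe/London"))
                .toInstant();
    }

    // "2007-09-15T09:02:31Z", the ISO 8601 value reported by the test.
    static Instant actual() {
        return Instant.parse("2007-09-15T09:02:31Z");
    }

    public static void main(String[] args) {
        // Prints true when both strings describe the same instant.
        System.out.println(expected().equals(actual()));
    }
}
```

Under that assumption the two instants are equal, so the underlying date appears to be extracted correctly and only its string representation differs.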
> Some metadata not extracted from PDF files when NonSequentialPDFParser is used
> ------------------------------------------------------------------------------
>
> Key: TIKA-1203
> URL: https://issues.apache.org/jira/browse/TIKA-1203
> Project: Tika
> Issue Type: Bug
> Components: parser
> Reporter: Tim Allison
> Priority: Minor
>
> While working on TIKA-1201, I noticed that metadata was not being extracted from the testAnnotations.pdf file when the NonSequentialPDFParser was being used. I opened PDFBOX-1792. This TIKA issue is a placeholder. When PDFBOX-1792 is fixed, we can stop skipping "testAnnotations.pdf" in PDFParserTest.