Posted to dev@tika.apache.org by "Nicolas Guillaumin (JIRA)" <ji...@apache.org> on 2013/08/16 09:31:49 UTC

[jira] [Updated] (TIKA-1161) Dates incorrectly extracted from PDF

     [ https://issues.apache.org/jira/browse/TIKA-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Guillaumin updated TIKA-1161:
-------------------------------------

    Attachment: WF_16_Youth_Coalition.pdf
    
> Dates incorrectly extracted from PDF
> ------------------------------------
>
>                 Key: TIKA-1161
>                 URL: https://issues.apache.org/jira/browse/TIKA-1161
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4
>         Environment: Windows 7 64bit, JDK 1.7
>            Reporter: Nicolas Guillaumin
>            Priority: Minor
>              Labels: pdf
>         Attachments: WF_16_Youth_Coalition.pdf
>
>
> Tika extracts the date from the attached PDF as 5034-09-24T14:03:00Z, whereas the actual date in the PDF appears to be 2007-03-01 10:58:57 according to Foxit Reader.
> Interestingly, PDFBox 1.8.2 also extracts the correct date (when using the PDFDebugger tool).
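
As a rough illustration of the mismatch described above, the sketch below parses a PDF-style date string and flags implausible results such as a year of 5034. It is not Tika's or PDFBox's code. The class and method names are hypothetical, the raw string "D:20070301105857" is an assumption (the report does not give the raw value stored in the file), time zone suffixes are ignored, and java.time requires Java 8 or later rather than the JDK 1.7 listed in the report.

```java
import java.time.LocalDateTime;

// Hypothetical helper for illustration only; not part of Tika or PDFBox.
public class PdfDateSketch {

    // Parses the leading "D:YYYYMMDDHHmmSS" portion of a PDF date string.
    // Any trailing time zone suffix is ignored in this sketch.
    public static LocalDateTime parse(String raw) {
        String s = raw.startsWith("D:") ? raw.substring(2) : raw;
        if (s.length() < 14) {
            throw new IllegalArgumentException("Expected at least 14 digits: " + raw);
        }
        int year = Integer.parseInt(s.substring(0, 4));
        int month = Integer.parseInt(s.substring(4, 6));
        int day = Integer.parseInt(s.substring(6, 8));
        int hour = Integer.parseInt(s.substring(8, 10));
        int minute = Integer.parseInt(s.substring(10, 12));
        int second = Integer.parseInt(s.substring(12, 14));
        return LocalDateTime.of(year, month, day, hour, minute, second);
    }

    // A date is treated as plausible if it is not earlier than 1993
    // (the year PDF was first released) and not more than a day after "now".
    public static boolean isPlausible(LocalDateTime date, LocalDateTime now) {
        LocalDateTime earliest = LocalDateTime.of(1993, 1, 1, 0, 0);
        return !date.isBefore(earliest) && !date.isAfter(now.plusDays(1));
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.now();
        // Assumed raw value matching the date shown by the PDF reader.
        LocalDateTime expected = parse("D:20070301105857");
        // The value reported as extracted by Tika 1.4.
        LocalDateTime reported = LocalDateTime.of(5034, 9, 24, 14, 3, 0);
        System.out.println(expected + " plausible: " + isPlausible(expected, now));
        System.out.println(reported + " plausible: " + isPlausible(reported, now));
    }
}
```

A check like this only detects that an extracted date is out of range; it does not explain why the two tools disagree on this file.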
