Posted to dev@tika.apache.org by "Isabelle Giguere (JIRA)" <ji...@apache.org> on 2018/06/12 22:14:00 UTC

[jira] [Created] (TIKA-2666) Document last printed in the year 27321

Isabelle Giguere created TIKA-2666:
--------------------------------------

             Summary: Document last printed in the year 27321
                 Key: TIKA-2666
                 URL: https://issues.apache.org/jira/browse/TIKA-2666
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.17
            Reporter: Isabelle Giguere
         Attachments: Genetic_Factors_and_the_Directionality_of.ppt, PPT_lastPrinted_00.png, tika-app-1.17.metadata.txt

Tika extracts a strange last print date for the attached PowerPoint (97-2003) document.

In the attached screenshot, PPT_lastPrinted_00.png, the last print date is shown as 00:00.

But when Tika extracts metadata from this document, the last print date is in the year 27321:
Last-Printed: 27321-01-23T08:20:12Z
meta:print-date: 27321-01-23T08:20:12Z

The attached metadata (tika-app-1.17.metadata.txt) was obtained using Tika 1.17.

This weird date is causing issues further down in processing. We can probably filter it out for now, but I do wonder how 00:00 turns into 27321-01-23T08:20:12Z.
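
For reference, the interim filtering mentioned above could look roughly like the sketch below. It is only an illustration of a downstream workaround, not a fix in Tika: the class name PrintDateFilter, the MIN_YEAR/MAX_YEAR bounds, and the use of a plain Map in place of Tika's own metadata object are all assumptions made for the example. The year is read with a regular expression rather than a date parser, since a five-digit year without a sign is not something every ISO-8601 parser accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: drops date values whose year is outside a plausible range.
final class PrintDateFilter {
    // Assumed bounds; adjust to whatever the downstream processing can tolerate.
    static final int MIN_YEAR = 1900;
    static final int MAX_YEAR = 2100;

    // Matches the leading year of values such as 27321-01-23T08:20:12Z.
    private static final Pattern LEADING_YEAR =
            Pattern.compile("^(\\d{1,9})-\\d{2}-\\d{2}T");

    // Returns true only if the value starts with a year inside the assumed bounds.
    static boolean hasPlausibleYear(String value) {
        if (value == null) {
            return false;
        }
        Matcher m = LEADING_YEAR.matcher(value);
        if (!m.find()) {
            return false;
        }
        int year = Integer.parseInt(m.group(1));
        return year >= MIN_YEAR && year <= MAX_YEAR;
    }

    // Returns a copy of the metadata with implausible values removed for the given keys.
    static Map<String, String> dropImplausibleDates(Map<String, String> metadata,
                                                    String... dateKeys) {
        Map<String, String> cleaned = new LinkedHashMap<>(metadata);
        for (String key : dateKeys) {
            String value = cleaned.get(key);
            if (value != null && !hasPlausibleYear(value)) {
                cleaned.remove(key);
            }
        }
        return cleaned;
    }
}
```

Calling dropImplausibleDates(metadata, "Last-Printed", "meta:print-date") on the values reported here would remove both entries and leave the other keys untouched. This only hides the symptom; it does not explain where the year 27321 comes from.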



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)