Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2011/05/06 06:43:03 UTC

[jira] [Created] (TIKA-656) Outlook dates using the wrong metadata key

Outlook dates using the wrong metadata key
------------------------------------------

                 Key: TIKA-656
                 URL: https://issues.apache.org/jira/browse/TIKA-656
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.9
            Reporter: Nick Burch
            Assignee: Nick Burch


Currently, the Outlook extractor fetches the "Accepted By Mail Server" date from POI, and then saves it into Metadata.EDIT_TIME and Metadata.LAST_SAVED. Neither of these looks right, and neither is a date property.

The rfc822 parser uses Metadata.CREATION_DATE, which is a Date property. The mbox parser uses Metadata.DATE, which is also a Date property, but a different one.

All three should probably use the same key. I'd suggest that, for now, they all output the same value to both CREATION_DATE and DATE.
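
As a rough illustration of that suggestion, the sketch below stores one date value under both keys. It is not Tika code: the Map stands in for Tika's Metadata object, and the class name, the helper setMailDate and the two key strings are hypothetical placeholders for Metadata.DATE and Metadata.CREATION_DATE.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MailDateKeys {
    // Hypothetical stand-ins for Metadata.DATE and Metadata.CREATION_DATE;
    // the real constants are defined by Tika and may have different values.
    public static final String DATE = "DATE";
    public static final String CREATION_DATE = "CREATION_DATE";

    /** Stores the same date value under both keys. */
    public static void setMailDate(Map<String, String> metadata, String dateValue) {
        metadata.put(DATE, dateValue);
        metadata.put(CREATION_DATE, dateValue);
    }

    public static void main(String[] args) {
        // A plain map is used here in place of Tika's Metadata object.
        Map<String, String> metadata = new LinkedHashMap<>();
        setMailDate(metadata, "2011-05-06T06:43:03Z");
        System.out.println(metadata);
    }
}
```

If each mail parser went through one shared helper like this, a consumer could read either key and get the same answer regardless of which parser handled the message.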

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (TIKA-656) Outlook dates using the wrong metadata key

Posted by "Nick Burch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-656.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 1.0

Fixed - the three mail parsers now all output their dates in proper ISO 8601 format, as both Metadata.DATE and Metadata.CREATION_DATE.

Also fixed an issue with extracting POIFS dates as ISO 8601.
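
For readers unfamiliar with the format, here is a minimal sketch of turning a java.util.Date into an ISO 8601 string. It uses java.time from the modern JDK, which is an assumption made for brevity and not necessarily how the Tika fix was implemented; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.Date;

public class Iso8601Dates {
    /** Formats a Date as an ISO 8601 timestamp in UTC. */
    public static String toIso8601(Date date) {
        // ISO_INSTANT always renders in UTC with a trailing 'Z'.
        return DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
    }

    public static void main(String[] args) {
        // Example input: a fixed instant standing in for a mail date.
        Date accepted = Date.from(Instant.parse("2011-05-06T06:43:03Z"));
        System.out.println(toIso8601(accepted));
    }
}
```

Formatting in UTC avoids results that depend on the time zone of the machine running the parser.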

