Posted to dev@tika.apache.org by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/10/22 12:25:44 UTC

[jira] [Commented] (TIKA-1186) Missing sender mail address in Outlook 2010

    [ https://issues.apache.org/jira/browse/TIKA-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801685#comment-13801685 ] 

Nick Burch commented on TIKA-1186:
----------------------------------

Currently, Apache POI pretty much only uses the variable-sized property values in Outlook files. In part, this is because much of the early work was done by reverse engineering and comparing with the bits of the spec that were open, and the variable-sized properties are much easier to spot in a hex dump!

There's some work on fixed-sized properties in POI now, but it's not finished, and it will probably need a rework of a lot of the code to support them properly.
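
To illustrate why the variable-sized properties are the easy ones to find: in the .msg format each of them lives in its own stream whose name encodes the property ID and type, whereas the fixed-sized ones are packed together in the properties stream. The following is only a rough sketch of decoding such a stream name, not POI code. The function name parse_stream_name and the small type table are made up for illustration, and it only handles single-valued properties.

```python
import re

# Variable-sized properties are stored in streams named
# "__substg1.0_" + 4 hex digits (property ID) + 4 hex digits (property type).
SUBSTG = re.compile(r"^__substg1\.0_([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})$")

# A few well-known property types; this table is deliberately not exhaustive.
TYPES = {0x001E: "PT_STRING8", 0x001F: "PT_UNICODE", 0x0102: "PT_BINARY"}


def parse_stream_name(name):
    """Return (property_id, type_name) for a variable-sized property stream,
    or None if the name is not such a stream (hypothetical helper)."""
    m = SUBSTG.match(name)
    if m is None:
        return None
    prop_id = int(m.group(1), 16)
    prop_type = int(m.group(2), 16)
    return prop_id, TYPES.get(prop_type, "unknown")


# 0x0C1F is the sender email address property, stored here as a Unicode string.
print(parse_stream_name("__substg1.0_0C1F001F"))
# The fixed-sized properties stream does not match the pattern.
print(parse_stream_name("__properties_version1.0"))
```

If a stream for the sender address is present in the file, the value is there to be read as a string; if the address is only available through a fixed-sized entry, it would not be picked up by code that looks at these streams alone.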

For now, can you try running HMEFDumper against the file, and see if the strings you want show up in the MAPI Properties dump section?

> Missing sender mail address in Outlook 2010
> -------------------------------------------
>
>                 Key: TIKA-1186
>                 URL: https://issues.apache.org/jira/browse/TIKA-1186
>             Project: Tika
>          Issue Type: Improvement
>          Components: metadata, parser
>    Affects Versions: 1.4
>         Environment: Windows 7, 32 bit, CLI version
>            Reporter: Christian Leubner
>            Priority: Minor
>         Attachments: b.msg, b.msg
>
>
> When extracting metadata with Tika from an Outlook 2010 message file the quite important information "sender mail address" is not extracted, but only the "Message-Recipient-Address". However, for the exact identification of a sender/author the mail address is the most important aspect.


