Posted to dev@tika.apache.org by "Dave Meikle (JIRA)" <ji...@apache.org> on 2010/03/27 14:22:27 UTC
[jira] Created: (TIKA-396) Parser Attachements from Outlook Messages
Parser Attachements from Outlook Messages
-----------------------------------------
Key: TIKA-396
URL: https://issues.apache.org/jira/browse/TIKA-396
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 0.6
Environment: All environments.
Reporter: Dave Meikle
Assignee: Dave Meikle
As raised by Albert Jensen on the tika-user mailing list[1], it would be good for the Outlook Parser to iterate through the mails attachements and then extract there content.
[1]http://mail-archives.apache.org/mod_mbox/lucene-tika-user/201003.mbox/%3C002701cacccf$16108b40$4231a1c0$@mail.dk%3E
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by Dave Meikle <lo...@gmail.com>.
Hi,
On 11 April 2010 06:59, For Apache Tika <ol...@gmail.com> wrote:
> Please find attached zip file with msgs.
> Change zip_0 to zip ;-)
>
Thanks for the mail, Oleg, but attachments do not come through on the
mailing list. Feel free to send it to me directly if you still have the ZIP
file.
Thanks,
Dave
Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by For Apache Tika <ol...@gmail.com>.
Please find attached zip file with msgs.
Change zip_0 to zip ;-)
Best regards,
Oleg.
On Fri, Apr 9, 2010 at 9:26 AM, Oleg Tikhonov <ol...@gmail.com> wrote:
> I'll send you on Sunday.
>
> Just wondering, what about Lotus Notes? Do we have something?
>
> -Oleg
>
>
> On Thu, Apr 8, 2010 at 11:13 PM, Dave Meikle <lo...@gmail.com> wrote:
>
>> Hi Oleg,
>>
>> On 8 April 2010 14:56, Oleg Tikhonov <ol...@gmail.com> wrote:
>>
>> > Hi Dave. Which format of Outlook mail do you need? msg?
>> >
>>
>> Yes a msg file, from either Outlook Express or Outlook.
>>
>> Thanks,
>> Dave
>>
>
>
>
> --
> Best regards, Oleg.
>
--
Best regards, Oleg.
Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by Oleg Tikhonov <ol...@gmail.com>.
I'll send it to you on Sunday.
Just wondering, what about Lotus Notes? Do we have something?
-Oleg
On Thu, Apr 8, 2010 at 11:13 PM, Dave Meikle <lo...@gmail.com> wrote:
> Hi Oleg,
>
> On 8 April 2010 14:56, Oleg Tikhonov <ol...@gmail.com> wrote:
>
> > Hi Dave. Which format of Outlook mail do you need? msg?
> >
>
> Yes a msg file, from either Outlook Express or Outlook.
>
> Thanks,
> Dave
>
--
Best regards, Oleg.
Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by Dave Meikle <lo...@gmail.com>.
Hi Oleg,
On 8 April 2010 14:56, Oleg Tikhonov <ol...@gmail.com> wrote:
> Hi Dave. Which format of Outlook mail do you need? msg?
>
Yes, a msg file, from either Outlook Express or Outlook.
Thanks,
Dave
Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by Oleg Tikhonov <ol...@gmail.com>.
Hi Dave. Which format of Outlook mail do you need? msg?
On Thu, Apr 8, 2010 at 4:33 PM, Dave Meikle (JIRA) <ji...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/TIKA-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854933#action_12854933]
>
> Dave Meikle commented on TIKA-396:
> ----------------------------------
>
> Looking to add a test file but everything I have contains an attachment
> with private information. Does anyone have anything suitable available? Or
> do we just need to mock one up?
>
--
Best regards, Oleg.
[jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by "Dave Meikle (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854933#action_12854933 ]
Dave Meikle commented on TIKA-396:
----------------------------------
I'm looking to add a test file, but everything I have contains an attachment with private information. Does anyone have anything suitable available, or do we need to mock one up?
[jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages
Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856826#action_12856826 ]
Jukka Zitting commented on TIKA-396:
------------------------------------
In revision 933903 I modified the OutlookExtractor to use the parser instance from the ParseContext instead of a hardcoded AutoDetectParser when parsing the attachments. This is similar to what the PackageParser does, and it gives clients better control over the parsing process.
Note that an extra "Invalid attachment id" line is now printed to standard output as part of the tika-parsers test suite. I guess this comes from POI.
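The idea of taking the attachment parser from a context rather than hardcoding it can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchParser, SketchContext and AttachmentSketch are hypothetical stand-ins invented for this example, not Tika's Parser, ParseContext or OutlookExtractor, and the real change in revision 933903 may look quite different.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a parser; NOT Tika's Parser interface.
interface SketchParser {
    String parse(byte[] data);
}

// Hypothetical type-keyed context, loosely modelled on the idea of a parse context.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the registered value for the key, or the default if none was set.
    <T> T get(Class<T> key, T defaultValue) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : defaultValue;
    }
}

final class AttachmentSketch {
    // Fallback used when the caller did not register a parser (an assumption
    // of this sketch, chosen only to keep the example self-contained).
    static final SketchParser DEFAULT_PARSER =
            data -> new String(data, StandardCharsets.UTF_8);

    // Walk every attachment and delegate to whichever parser the context holds.
    static String extractAll(List<byte[]> attachments, SketchContext context) {
        SketchParser parser = context.get(SketchParser.class, DEFAULT_PARSER);
        StringBuilder text = new StringBuilder();
        for (byte[] attachment : attachments) {
            text.append(parser.parse(attachment)).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<byte[]> attachments = List.of(
                "first".getBytes(StandardCharsets.UTF_8),
                "second".getBytes(StandardCharsets.UTF_8));

        // No parser registered: the default is used.
        System.out.print(extractAll(attachments, new SketchContext()));

        // The client overrides the parser, here to skip attachment content.
        SketchContext context = new SketchContext();
        context.set(SketchParser.class, data -> "[skipped " + data.length + " bytes]");
        System.out.print(extractAll(attachments, context));
    }
}
```

The point of the lookup is that the caller, not the extractor, decides how attachments are handled: registering a different parser in the context changes the behaviour without touching the extraction loop.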
[jira] Updated: (TIKA-396) Parser Attachements from Outlook Messages
Posted by "Dave Meikle (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/TIKA-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dave Meikle updated TIKA-396:
-----------------------------
Description:
As raised by Albert Jensen on the tika-user mailing list[1], it would be good for the Outlook Parser to iterate through the mails attachments and then extract their content.
[1]http://mail-archives.apache.org/mod_mbox/lucene-tika-user/201003.mbox/%3C002701cacccf$16108b40$4231a1c0$@mail.dk%3E
was:
As raised by Albert Jensen on the tika-user mailing list[1], it would be good for the Outlook Parser to iterate through the mails attachements and then extract there content.
[1]http://mail-archives.apache.org/mod_mbox/lucene-tika-user/201003.mbox/%3C002701cacccf$16108b40$4231a1c0$@mail.dk%3E
Looks like basic English is escaping me this morning ;-)