Posted to solr-dev@lucene.apache.org by "Preetam Rao (JIRA)" <ji...@apache.org> on 2008/12/22 10:34:44 UTC

[jira] Updated: (SOLR-934) Enable importing of mails into a solr index through DIH.

     [ https://issues.apache.org/jira/browse/SOLR-934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Preetam Rao updated SOLR-934:
-----------------------------

    Description: 
Enable importing of mail into Solr through DIH. Take one or more sets of mailbox credentials, then download and index the messages along with the content of their attachments.

The folders to fetch can be made configurable based on various criteria.

Apache Tika can be used for extracting content from different kinds of attachments.
JavaMail can be used for mailbox-related operations such as fetching mail, filtering it, etc.

The basic configuration for one mailbox could look something like this:

<document>
   <entity processor="org.apache.solr.handler.dataimport.MailEntityProcessor"
           user="somebody@gmail.com"
           password="something"
           host="imap.gmail.com"
           protocol="imaps"
           folder="test1"/>
</document>

- This can be enhanced with timeouts, a list to be read from a file, folder filters, delta import, etc.
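
As a rough sketch of the "one or more mailbox credentials" case, the same configuration could be repeated once per mailbox. This assumes each mailbox gets its own entity element, which the issue does not specify. The name attributes, the second account, and its host are hypothetical placeholders, not part of the proposal:

```xml
<document>
   <!-- First mailbox: attribute values taken from the basic configuration. -->
   <!-- The name attribute is an assumption added to tell the entities apart. -->
   <entity name="mailbox1"
           processor="org.apache.solr.handler.dataimport.MailEntityProcessor"
           user="somebody@gmail.com"
           password="something"
           host="imap.gmail.com"
           protocol="imaps"
           folder="test1"/>
   <!-- Second mailbox: hypothetical account and host, for illustration only. -->
   <entity name="mailbox2"
           processor="org.apache.solr.handler.dataimport.MailEntityProcessor"
           user="someone.else@example.com"
           password="something-else"
           host="imap.example.com"
           protocol="imaps"
           folder="inbox"/>
</document>
```

Keeping one entity per mailbox would let each account carry its own host, protocol, and folder settings, at the cost of repeating the processor attribute.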

  was:
Enable importing of mail into Solr through DIH. Take one or more sets of mailbox credentials, then download and index the messages along with the content of their attachments.

The folders to fetch can be made configurable based on various criteria.

Apache Tika can be used for extracting content from different kinds of attachments.
JavaMail can be used for mailbox-related operations such as fetching mail, filtering it, etc.


> Enable importing of mails into a solr index through DIH.
> --------------------------------------------------------
>
>                 Key: SOLR-934
>                 URL: https://issues.apache.org/jira/browse/SOLR-934
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.4
>            Reporter: Preetam Rao
>            Priority: Minor
>             Fix For: 1.4
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Enable importing of mail into Solr through DIH. Take one or more sets of mailbox credentials, then download and index the messages along with the content of their attachments.
> The folders to fetch can be made configurable based on various criteria.
> Apache Tika can be used for extracting content from different kinds of attachments.
> JavaMail can be used for mailbox-related operations such as fetching mail, filtering it, etc.
> The basic configuration for one mailbox could look something like this:
> <document>
>    <entity processor="org.apache.solr.handler.dataimport.MailEntityProcessor"
>            user="somebody@gmail.com"
>            password="something"
>            host="imap.gmail.com"
>            protocol="imaps"
>            folder="test1"/>
> </document>
> - This can be enhanced with timeouts, a list to be read from a file, folder filters, delta import, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.