Posted to solr-user@lucene.apache.org by Rohan Ashok Kumbhar <Ro...@infosys.com> on 2012/03/07 06:14:16 UTC

How to index doc file in solr?

Hi,

I would like to know how to index documents other than XML in Solr.
Any comments would be appreciated!


Thanks,
Rohan


**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely 
for the use of the addressee(s). If you are not the intended recipient, please 
notify the sender by e-mail and delete the original message. Further, you are not 
to copy, disclose, or distribute this e-mail or its contents to any other person and 
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken 
every reasonable precaution to minimize this risk, but is not liable for any damage 
you may sustain as a result of any virus in this e-mail. You should carry out your 
own virus checks before opening the e-mail or attachment. Infosys reserves the 
right to monitor and review the content of all messages sent to or from this e-mail 
address. Messages sent to or from this e-mail address may be stored on the 
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***

RE: How to index doc file in solr?

Posted by Rohan <Ro...@infosys.com>.
Thanks, Erick. Really appreciated.

From: Erick Erickson [via Lucene] [mailto:ml-node+s472066n3819585h66@n3.nabble.com]
Sent: Monday, March 12, 2012 9:05 PM
To: Rohan Ashok Kumbhar
Subject: Re: How to index doc file in solr?

Consider using SolrJ, possibly combined with
Tika (which is what underlies Solr Cell).
http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/

That said, ExtractingRequestHandler can also
index metadata if you map the fields.

See: http://wiki.apache.org/solr/ExtractingRequestHandler

Best
Erick


On Mon, Mar 12, 2012 at 11:09 AM, Rohan <[hidden email]> wrote:

> Hi Erick,
>
> Thanks for the valuable comments on this.
>
> I have a set of Word documents, and I would like to index their metadata
> as well as the content of the pages. Is there any way to accomplish
> this?
>
> Need your comments on this.
>
> Thanks,
> Rohan
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3818938.html
> Sent from the Solr - User mailing list archive at Nabble.com.



--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3821271.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to index doc file in solr?

Posted by Erick Erickson <er...@gmail.com>.
Consider using SolrJ, possibly combined with
Tika (which is what underlies Solr Cell).
http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/

That said, ExtractingRequestHandler can also
index metadata if you map the fields.

See: http://wiki.apache.org/solr/ExtractingRequestHandler

Best
Erick
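
Editorial sketch (not part of Erick's reply): the "map the fields" advice
above might look roughly like the following when posting a Word file to
ExtractingRequestHandler. It relies on the handler's literal.*, fmap.*,
uprefix and commit request parameters. Everything else is an assumption
made for illustration: the Solr URL, the "Author" metadata name (names
vary with the Tika version), the "author" target field, the attr_ dynamic
field prefix, and the helper function names.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed location of the extract handler; adjust to your own Solr install.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, field_map=None, base_url=SOLR_EXTRACT_URL, commit=True):
    """Build the request URL for ExtractingRequestHandler (Solr Cell).

    field_map maps Tika metadata names to Solr schema fields via fmap.*;
    both sides are hypothetical and must match your schema.
    """
    params = [("literal.id", doc_id)]  # supply the unique key ourselves
    for tika_name, solr_field in sorted((field_map or {}).items()):
        params.append(("fmap." + tika_name, solr_field))
    # Unmapped metadata goes to attr_* (assumes such a dynamic field exists).
    params.append(("uprefix", "attr_"))
    if commit:
        params.append(("commit", "true"))
    return base_url + "?" + urlencode(params)


def index_word_file(path, doc_id, field_map=None):
    """POST the raw file bytes to Solr; Tika extracts text and metadata."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = Request(
        build_extract_url(doc_id, field_map),
        data=body,
        headers={"Content-Type": "application/msword"},
    )
    with urlopen(request) as response:  # needs a running Solr instance
        return response.status


if __name__ == "__main__":
    # Only prints the URL; call index_word_file(...) against a live server.
    print(build_extract_url("doc1", {"Author": "author"}))
```

Sending the same request with extractOnly=true instead of commit=true
should return what Tika extracted without indexing it, which can help when
deciding which metadata names to map.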


On Mon, Mar 12, 2012 at 11:09 AM, Rohan <Ro...@infosys.com> wrote:
> Hi Erick,
>
> Thanks for the valuable comments on this.
>
> I have a set of Word documents, and I would like to index their metadata
> as well as the content of the pages. Is there any way to accomplish
> this?
>
> Need your comments on this.
>
> Thanks,
> Rohan
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3818938.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to index doc file in solr?

Posted by Rohan <Ro...@infosys.com>.
Hi Erick,

Thanks for the valuable comments on this.

I have a set of Word documents, and I would like to index their metadata
as well as the content of the pages. Is there any way to accomplish
this?

Need your comments on this.

Thanks,
Rohan

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-doc-file-in-solr-tp3806543p3818938.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to index doc file in solr?

Posted by Erick Erickson <er...@gmail.com>.
Have you looked at ExtractingRequestHandler (aka Solr Cell)? SolrJ?
Tika?

Perhaps if you defined the problem a bit more we'd be able to
give you more comprehensive answers....

Best
Erick

On Wed, Mar 7, 2012 at 12:14 AM, Rohan Ashok Kumbhar
<Ro...@infosys.com> wrote:
> Hi,
>
> I would like to know how to index documents other than XML in Solr.
> Any comments would be appreciated!
>
>
> Thanks,
> Rohan