Posted to solr-user@lucene.apache.org by ram anam <ra...@hotmail.com> on 2012/06/08 19:29:58 UTC

Writing custom data import handler for Solr.

Hi,
 
I am planning to write a custom data import handler for Solr for some data source. Could you give me some pointers to documentation or examples on how to write a custom data import handler and how to integrate it with Solr? Thank you for your help.

Thanks and regards,
Ram Anam

RE: Writing custom data import handler for Solr.

Posted by "Dyer, James" <Ja...@ingrambook.com>.
More specifically, the Data Import Handler (DIH) code for Solr 3.6 can be seen here:

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/

The main wiki page is here:

http://wiki.apache.org/solr/DataImportHandler

The architecture of DIH is such that each import entity is driven by an EntityProcessor that reads data from a DataSource. So you could create a KindaLikeAmazonS3DataSource and then a KindaLikeAmazonS3EntityProcessor. The DataSource reads the raw data and passes it to the EntityProcessor, which hands it on to DIH one row at a time.
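
For illustration, the DataSource/EntityProcessor split described above can be sketched in plain Java. This is only a sketch under stated assumptions: the two interfaces below are simplified stand-ins, not the real DataSource and EntityProcessor classes from the DIH package (the real ones also receive a DIH Context and have more methods, so check the linked source before relying on any signature). BucketDataSource, BucketEntityProcessor, the "keys" property, and the in-memory "bucket" are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Simplified stand-ins for illustration only; these are NOT the real DIH classes.
class DihSketch {

    /** Stand-in for a DataSource: knows how to fetch raw data for a query. */
    interface SimpleDataSource<T> {
        void init(Properties initProps);
        T getData(String query);
        void close();
    }

    /** Stand-in for an EntityProcessor: turns raw data into rows. */
    interface SimpleEntityProcessor {
        void init(SimpleDataSource<Iterator<String>> dataSource, String query);
        Map<String, Object> nextRow(); // null signals "no more rows"
    }

    /** Hypothetical data source backed by an in-memory "bucket" of object keys. */
    static class BucketDataSource implements SimpleDataSource<Iterator<String>> {
        private final List<String> keys = new ArrayList<>();

        @Override
        public void init(Properties initProps) {
            // A real implementation would read endpoints or credentials here.
            for (String key : initProps.getProperty("keys", "").split(",")) {
                if (!key.isEmpty()) {
                    keys.add(key);
                }
            }
        }

        @Override
        public Iterator<String> getData(String prefix) {
            // Return only the keys that start with the requested prefix.
            List<String> matches = new ArrayList<>();
            for (String key : keys) {
                if (key.startsWith(prefix)) {
                    matches.add(key);
                }
            }
            return matches.iterator();
        }

        @Override
        public void close() {
            keys.clear();
        }
    }

    /** Hypothetical entity processor: emits one row (field map) per object key. */
    static class BucketEntityProcessor implements SimpleEntityProcessor {
        private Iterator<String> data;

        @Override
        public void init(SimpleDataSource<Iterator<String>> dataSource, String query) {
            data = dataSource.getData(query);
        }

        @Override
        public Map<String, Object> nextRow() {
            if (data == null || !data.hasNext()) {
                return null;
            }
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("id", data.next());
            return row;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("keys", "docs/a.txt,docs/b.txt,img/c.png");

        BucketDataSource source = new BucketDataSource();
        source.init(props);

        BucketEntityProcessor processor = new BucketEntityProcessor();
        processor.init(source, "docs/");

        // The importer's loop: pull rows until the processor returns null.
        Map<String, Object> row;
        while ((row = processor.nextRow()) != null) {
            System.out.println(row);
        }
        source.close();
    }
}
```

The point of the split is that the data source only knows how to reach the external store, while the entity processor only knows how to shape what comes back into rows, so either half can be swapped independently.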

See also SolrEntityProcessor. This is an EntityProcessor that reads from one Solr core to re-index the same data in another Solr core. I believe this EntityProcessor does its own data reading and doesn't use a DataSource. This might be a simpler approach for you.

On the wiki page, see these 2 sections:

http://wiki.apache.org/solr/DataImportHandler#EntityProcessor
http://wiki.apache.org/solr/DataImportHandler#DataSource

In the code, your extension points are these 2 classes:

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataSource.java
http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/EntityProcessor.java

For a good example that you might want to base your code on, see SolrEntityProcessor:

http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/SolrEntityProcessor.java

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311



Re: Writing custom data import handler for Solr.

Posted by Lance Norskog <go...@gmail.com>.
Nope, the code is all you get.


-- 
Lance Norskog
goksron@gmail.com

RE: Writing custom data import handler for Solr.

Posted by ram anam <ra...@hotmail.com>.
Thanks for the guidance. But is there any documentation that describes the steps to implement a custom data source and integrate it with Solr? The data source I am trying to integrate is similar to Amazon S3 buckets, but from a different provider.

Thanks and regards,
Ram Anam


Re: Writing custom data import handler for Solr.

Posted by Lance Norskog <go...@gmail.com>.
The DataImportHandler is a toolkit in Solr. It has a few different
kinds of plugins. It is very possible that you do not have to write
any Java code.

If you have an unusual external data feed (database, file system,
Amazon S3 buckets) then you would write a DataSource. The only
examples are the source code in trunk/solr/contrib/dataimporthandler.

http://wiki.apache.org/solr/DataImportHandler


-- 
Lance Norskog
goksron@gmail.com

RE: Writing custom data import handler for Solr.

Posted by ram anam <ra...@hotmail.com>.
Hi Erick,
I cannot disclose the data source that we are planning to index in Solr, as it is confidential. But the client wants it to be in the form of an import handler. We plan to install Solr together with our custom data import handlers so that the client can just consume it. Could you please give me pointers to examples of custom data import handlers?

Thanks and regards,
Ram Anam


Re: Writing custom data import handler for Solr.

Posted by Erick Erickson <er...@gmail.com>.
You need to back up a bit and describe _why_ you want to do this,
perhaps there's an easy way to do what you want. This could easily
be an XY problem...

For instance, you can write a SolrJ program to index data, which _might_ be
what you want. It's a separate process runnable anywhere. See:
http://www.lucidimagination.com/blog/2012/02/14/indexing-with-solrj/
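
As a rough illustration of the "separate process" idea, the sketch below indexes one document from a standalone Java program. Note that it deliberately does not use SolrJ: to stay free of dependencies it posts Solr's XML update format over plain HTTP, which SolrJ would otherwise handle for you. The URL http://localhost:8983/solr/update, the class name StandaloneIndexer, and the field names id and title are assumptions made for the example, not something stated in this thread.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical standalone indexer; plain HTTP is used here in place of SolrJ.
class StandaloneIndexer {

    /** Escapes the characters that are special in XML text and attributes. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds an <add> update message containing a single document. */
    static String toAddXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            xml.append("<field name=\"").append(escapeXml(f.getKey())).append("\">")
               .append(escapeXml(f.getValue())).append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    /** POSTs an update message and returns the HTTP status code. */
    static int post(String updateUrl, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        // Assumed URL; adjust host, port, and core for your installation.
        String updateUrl = args.length > 0 ? args[0] : "http://localhost:8983/solr/update";

        // Assumed field names; they must exist in your schema.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "docs/a.txt");
        doc.put("title", "Tom & Jerry");

        System.out.println("add: HTTP " + post(updateUrl, toAddXml(doc)));
        System.out.println("commit: HTTP " + post(updateUrl, "<commit/>"));
    }
}
```

Running main requires a Solr instance listening at the given URL; without one, the connection attempt fails with an IOException. The code that reads from the external data source would replace the hard-coded document.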

Best
Erick
