Posted to solr-user@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2014/04/14 13:26:20 UTC

What's the actual story with new morphline and hadoop contribs?

Hello,

I saw that 4.7.1 has morphline and hadoop contribution libraries, but
I can't figure out the degree to which they are useful to _Solr_
users. I found one Hadoop example in the readme that does some sort of
injection into Solr. Is that the only use case supported?

I thought that maybe there is an UpdateRequestProcessor or Handler
end-point or something that hooks into morphline to do
similar/alternative work to DataImportHandler. But I can't see any
entry points or examples for that.

Does anybody know what the story is and/or what the future holds?

Regards,
    Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency

Re: What's the actual story with new morphline and hadoop contribs?

Posted by rulinma <ru...@gmail.com>.
I think it is useful for distributing index building and then merging the
result into Solr. Cloudera uses it often. But there is too little reference
documentation to understand it.

Re: What's the actual story with new morphline and hadoop contribs?

Posted by Wolfgang Hoschek <wh...@cloudera.com>.
The Solr morphline jars are integrated with Solr by way of the Solr-specific solr/contrib/map-reduce module.

Ingestion from Flume into Solr is available here: http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink
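
For orientation, a Flume agent that uses this sink is configured roughly as in the fragment below. This is a minimal sketch modeled on the Flume User Guide, not part of the original message: the agent name (a1), channel name (c1), sink name (k1) and the morphline file path are placeholders, and the commented-out properties are optional.

```properties
# Hypothetical agent "a1" with one channel "c1" and one sink "k1".
a1.channels = c1
a1.sinks = k1

# The sink runs each Flume event through a morphline and loads the result into Solr.
a1.sinks.k1.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
a1.sinks.k1.channel = c1

# Path to the morphline configuration file (placeholder path).
a1.sinks.k1.morphlineFile = /etc/flume-ng/conf/morphline.conf

# Optional: pick one morphline by id if the file defines several.
# a1.sinks.k1.morphlineId = morphline1
```

The Solr connection details are not set on the sink itself; they would normally live in the morphline file that the sink points to.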

FWIW, for our purposes we see no role for DataImportHandler anymore.

Wolfgang.



Re: What's the actual story with new morphline and hadoop contribs?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
The use case I keep thinking about is Flume/Morphline replacing
DataImportHandler. So, when I saw morphline shipped with Solr, I tried
to understand whether it is a step towards it.

As it is, I am still not sure I understand why those jars are shipped
with Solr, if it is not actually integrating into Solr.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency



Re: What's the actual story with new morphline and hadoop contribs?

Posted by Wolfgang Hoschek <wh...@cloudera.com>.
Currently all Solr morphline use cases I’m aware of run in processes outside of the Solr JVM, e.g. in Flume, in MapReduce, in HBase Lily Indexer, etc. These ingestion processes generate Solr documents for Solr updates. Running in external processes is done to improve scalability, reliability, flexibility and reusability. Not everything needs to run inside of the Solr JVM.
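
To make the "external process" idea concrete, here is a rough Python sketch of the pattern, not of morphlines themselves: a chain of small transformation steps turns a raw record into a Solr document outside the Solr JVM, and the result is packaged as a JSON update request. The input format, the step functions, the collection name and the base URL are all assumptions made up for illustration; the request is only built here, not sent.

```python
import json
from urllib.request import Request


def read_line(record):
    """Split a raw 'id|title|body' line into named fields (hypothetical input format)."""
    ident, title, body = record["raw"].split("|", 2)
    return {"id": ident, "title": title, "body": body}


def trim_fields(record):
    """Strip surrounding whitespace from every field value."""
    return {key: value.strip() for key, value in record.items()}


def drop_empty(record):
    """Remove fields whose value is empty after trimming."""
    return {key: value for key, value in record.items() if value}


# The ordered chain of steps, loosely analogous to the commands of a morphline.
PIPELINE = [read_line, trim_fields, drop_empty]


def to_solr_doc(raw_line):
    """Run one raw input line through every step of the pipeline."""
    record = {"raw": raw_line}
    for command in PIPELINE:
        record = command(record)
    return record


def build_update_request(docs, base_url="http://localhost:8983/solr/collection1"):
    """Build (but do not send) a JSON update request; base_url is an assumed placeholder."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        base_url + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


docs = [to_solr_doc(line) for line in ["1| Hello Solr | first doc ", "2|No body|"]]
request = build_update_request(docs)
```

Passing `request` to `urllib.request.urlopen` would send the documents to a running Solr instance. Because the whole pipeline lives in its own process, it can be scaled or restarted without touching Solr.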

We haven’t found a use case for it so far, but it would be easy to add an UpdateRequestProcessor that runs a morphline inside of the Solr JVM.
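
As a language-neutral illustration of what such a processor would do, the Python sketch below models an update chain in which one link applies a list of transformation commands to each document before handing it on. It is a toy model under stated assumptions, not Solr's Java UpdateRequestProcessor API: every class and function name here is hypothetical.

```python
class Processor:
    """One link in an update chain; forwards each document to the next link."""

    def __init__(self, next_processor=None):
        self.next = next_processor

    def process_add(self, doc):
        if self.next is not None:
            self.next.process_add(doc)


class MorphlineLikeProcessor(Processor):
    """Applies a list of commands to each document before passing it on."""

    def __init__(self, commands, next_processor=None):
        super().__init__(next_processor)
        self.commands = commands

    def process_add(self, doc):
        for command in self.commands:
            doc = command(doc)
            if doc is None:  # a command may drop the document entirely
                return
        super().process_add(doc)


class IndexingProcessor(Processor):
    """Stand-in for the end of the chain; records what would be indexed."""

    def __init__(self):
        super().__init__()
        self.indexed = []

    def process_add(self, doc):
        self.indexed.append(doc)


def skip_drafts(doc):
    """Drop documents flagged as drafts (hypothetical rule)."""
    return None if doc.get("draft") else doc


def lowercase_title(doc):
    """Return a copy of the document with a lowercased title."""
    return {**doc, "title": doc["title"].lower()}


index = IndexingProcessor()
chain = MorphlineLikeProcessor([skip_drafts, lowercase_title], index)
chain.process_add({"id": "1", "title": "Hello SOLR"})
chain.process_add({"id": "2", "title": "WIP", "draft": True})
```

After these two calls only the first document reaches the end of the chain, with its title lowercased. In this model the transformation runs in the same process as indexing, which is the trade-off against the external-process approach described above.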

Here is more background info: 

http://kitesdk.org/docs/current/kite-morphlines/index.html

http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html

http://files.meetup.com/5139282/SHUG10%20-%20Search%20On%20Hadoop.pdf

Wolfgang.
