Posted to solr-user@lucene.apache.org by kamaci <fu...@gmail.com> on 2013/03/11 13:38:48 UTC

How to Integrate Solr With Hbase

I have crawled data into HBase with Nutch. How can I use Solr to index the
data in HBase? (A solution from the Nutch side is also welcome.)

PS: I am new to these technologies, and I run Solr from the example
folder using start.jar.



--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to Integrate Solr With Hbase

Posted by lboutros <bo...@gmail.com>.
Hi Kamaci,

Why don't you use the Nutch indexing functionality?

The Nutch crawl script already includes a Solr indexing step.

http://wiki.apache.org/nutch/bin/nutch%20solrindex

Ludovic.



-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297p4046774.html

Re: How to Integrate Solr With Hbase

Posted by kamaci <fu...@gmail.com>.
How can I store my crawled data in Solr? Which configuration do I need?



--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297p4046699.html

Re: How to Integrate Solr With Hbase

Posted by adfel70 <ad...@gmail.com>.
My needs are different.
I have documents with large fields, larger than Solr can store (at least
unless something has changed in this area).
I want to index the fields but not store them in Solr.
I also need highlighting on these fields.
So I need to integrate Solr and HBase at search time, or have some other
component do it, stepping in after the Solr search results are retrieved.
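
One way to read the "some other component" idea above is a thin layer that takes the document ids returned by the search, fetches the full text from the external store, and builds the snippets itself. The following is a minimal Python sketch of that idea, not a description of any Solr or HBase API: `fetch_text` stands in for an HBase lookup by row key, and the list of hit ids stands in for what a Solr query would return when the large fields are indexed but not stored. All names here are hypothetical.

```python
import re


def make_snippet(text, term, width=30):
    """Return a short excerpt around the first case-insensitive match of term.

    The match is wrapped in <em> tags; None is returned if term is absent.
    """
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - width, 0)
    end = min(match.end() + width, len(text))
    return (
        text[start:match.start()]
        + "<em>" + match.group(0) + "</em>"
        + text[match.end():end]
    )


def highlight_hits(hit_ids, fetch_text, term):
    """Attach a snippet to each search hit using text held outside the index.

    hit_ids    -- ids returned by the search engine (hypothetical input)
    fetch_text -- callable mapping an id to the full text, or None if missing
    """
    results = []
    for doc_id in hit_ids:
        text = fetch_text(doc_id)
        snippet = make_snippet(text, term) if text is not None else None
        results.append({"id": doc_id, "snippet": snippet})
    return results


# In-memory stand-in for the external store (an assumption for illustration).
store = {
    "row1": "Nutch crawls pages and HBase keeps the raw text.",
    "row2": "Solr builds and manages its own index files.",
}

hits = highlight_hits(["row1", "row2", "row3"], store.get, "hbase")
```

Note that a plain substring match like this ignores the analysis (tokenizing, stemming) that the index applies, so a term can match in Solr and still produce no snippet here. It also costs one store lookup per hit, which is usually acceptable only for the page of results being displayed.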


Upayavira wrote
> If you want to be able to use the data in both places, that's what you
> will need. You won't be able to have Solr read indexes from within HBase;
> it needs to manage its own indexes.

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297p4046925.html

Re: How to Integrate Solr With Hbase

Posted by Upayavira <uv...@odoko.co.uk>.
If you want to be able to use the data in both places, that's what you
will need. You won't be able to have Solr read indexes from within HBase;
it needs to manage its own indexes.

Upayavira

On Wed, Mar 13, 2013, at 09:03 AM, adfel70 wrote:
> So you end up having all the data in both HBase and Solr?

Re: How to Integrate Solr With Hbase

Posted by adfel70 <ad...@gmail.com>.
So you end up having all the data in both HBase and Solr?


Bharat Mallampati wrote
> We haven't used Nutch to crawl data into Solr; we used the standard HBase
> API to read and the SolrJ API to write to Solr.
>
> Our documents are also relatively small, with 100 to 150 fields each.

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297p4046909.html

Re: How to Integrate Solr With Hbase

Posted by Bharat Mallampati <ma...@gmail.com>.
We haven't used Nutch to crawl data into Solr; we used the standard HBase
API to read and the SolrJ API to write to Solr.

Our documents are also relatively small, with 100 to 150 fields each.
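
The read-then-write approach described above needs a step that turns one HBase row into one Solr document. The sketch below shows that mapping in Python with plain dictionaries. It is an illustration under assumptions, not the poster's code (the thread says they used the Java HBase API and SolrJ). The `family:qualifier` byte keys mimic how HBase addresses cells, while the function name, the `id` field, and the sample column names are hypothetical.

```python
def row_to_document(row_key, cells, id_field="id"):
    """Flatten one row of family:qualifier cells into a flat field dict.

    row_key -- bytes key of the row, used as the document id
    cells   -- dict mapping b"family:qualifier" to a bytes value
    """
    doc = {id_field: row_key.decode("utf-8")}
    for column, value in cells.items():
        # Drop the column family and keep the qualifier as the field name;
        # a column without a separator is used as the field name unchanged.
        family, sep, qualifier = column.partition(b":")
        field = (qualifier if sep else family).decode("utf-8")
        if field == id_field:
            raise ValueError("column collides with the id field: " + field)
        doc[field] = value.decode("utf-8")
    return doc


doc = row_to_document(
    b"com.example/page1",
    {b"p:title": b"Example page", b"p:text": b"Body of the page"},
)
```

Dropping the column family keeps field names short, but two families that share a qualifier would overwrite each other; with 100 to 150 fields it may be safer to keep the family as a prefix in the field name.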


Thanks
Bharat


On Tue, Mar 12, 2013 at 1:15 AM, adfel70 <ad...@gmail.com> wrote:

> Do you store all your crawled Nutch data in Solr, including the text?
> If you do, don't you run into problems with documents that are too big?
> If you don't, how do you support snippets and highlighting?

Re: How to Integrate Solr With Hbase

Posted by adfel70 <ad...@gmail.com>.
Do you store all your crawled Nutch data in Solr, including the text?
If you do, don't you run into problems with documents that are too big?
If you don't, how do you support snippets and highlighting?



Bharat Mallampati wrote
> We have the same kind of scenario in our application.
>
> We handle it with a batch process that reads the data from HBase using
> the HBase API and writes it to Solr using the SolrJ API.

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integrate-Solr-With-Hbase-tp4046297p4046572.html

Re: How to Integrate Solr With Hbase

Posted by Bharat Mallampati <ma...@gmail.com>.
We have the same kind of scenario in our application.

We handle it with a batch process that reads the data from HBase using
the HBase API and writes it to Solr using the SolrJ API.
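
The batch process described above can be outlined as a loop that scans rows from the source, groups them, and sends each group to the indexer in one request. The Python sketch below shows only that control flow, under stated assumptions: `scan_rows` stands in for an HBase table scan and `send_batch` for whatever call adds documents to Solr. Both are hypothetical callables, not real client APIs, and the batch size is an arbitrary choice.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most size items from iterable, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def index_all(scan_rows, send_batch, batch_size=500):
    """Copy every scanned row to the indexer in fixed-size batches.

    scan_rows  -- callable returning an iterable of (row_key, fields) pairs
    send_batch -- callable that accepts a list of document dicts
    Returns the number of documents sent.
    """
    sent = 0
    for chunk in batched(scan_rows(), batch_size):
        docs = [{"id": key, **fields} for key, fields in chunk]
        send_batch(docs)
        sent += len(docs)
    return sent


# In-memory stand-ins so the sketch runs without HBase or Solr.
rows = [("row%d" % i, {"title": "page %d" % i}) for i in range(5)]
received = []
total = index_all(lambda: iter(rows), received.append, batch_size=2)
```

Sending documents in groups rather than one at a time keeps the number of requests down, and streaming the scan through an iterator means the whole table never has to fit in memory.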


Thanks
Bharat



On Mon, Mar 11, 2013 at 5:38 AM, kamaci <fu...@gmail.com> wrote:

> I have crawled data into HBase with Nutch. How can I use Solr to index
> the data in HBase? (A solution from the Nutch side is also welcome.)
>
> PS: I am new to these technologies, and I run Solr from the example
> folder using start.jar.