Posted to solr-user@lucene.apache.org by Ian Connor <ia...@gmail.com> on 2009/09/17 17:40:40 UTC

Solr via ruby

Hi,

Is there any support for connection pooling or a more optimized data
exchange format? We are looking for further ways to optimize our Solr
queries so that we can make more of them within a single request.

The JSON-like format seems pretty tight, but I understand that distributed
search uses a binary protocol instead of text. I wanted to know whether that
is available, or could be made available, via the Ruby library.

Is it possible to host a local shard and skip HTTP between Ruby and Solr?

-- 
Regards,

Ian Connor
1 Leighton St #723
Cambridge, MA 02141
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Fax: +1(770) 818 5697
Skype: ian.connor

Re: Solr via ruby

Posted by rajan chandi <ch...@gmail.com>.
Thanks, Ian, for sharing your knowledge on this.

We've been going through the recently published "Solr 1.4 Enterprise Search
Server" book, and it suggests that the acts_as_solr schema can be less
flexible when it comes to complex indexing and faceting of fields.

We are still exploring Flare/solr-ruby.

Our concern with using JRuby and solr-ruby is that JRuby is only half as
fast as the C implementation of Ruby.

Flare is not mature enough, so it doesn't fit our release-fast,
release-often plan.

Has anyone on the forum implemented faceting with acts_as_solr? If so, how
complex or flexible was it?

Regards
Rajan

On Wed, Sep 23, 2009 at 12:43 PM, Ian Connor <ia...@gmail.com> wrote:

> Hi,
>
> Thanks for the discussion. We use the distributed option so I am not sure
> embedded is possible.
>
> As you also guessed, we use haproxy for load balancing and failover between
> replicas of the shards so giving this up for a minor performance boost is
> probably not wise.
>
> So essentially we have: User -> HTTP Load Balancer -> Mongrel Cluster ->
> Haproxy -> N x Solr Shards
>
> and it looks like that is the standard setup for performance from what you
> suggest here and most of the performance tweaks I thought of are already in
> use.
>
> Ian.
>
> On Fri, Sep 18, 2009 at 3:09 AM, Erik Hatcher <erik.hatcher@gmail.com> wrote:
>
> >
> > On Sep 18, 2009, at 1:09 AM, rajan chandi wrote:
> >
> >> We are planning to use the external Solr on tomcat for scalability
> >> reasons.
> >>
> >> We thought that EmbeddedSolrServer uses HTTP too to talk with Ruby and
> >> vise-versa as in acts_as_solr ruby plugin.
> >>
> >
> > EmbeddedSolrServer is a way to run Solr as an API (like Lucene) rather than
> > with any web container involved at all.  In other words, only Java can use
> > EmbeddedSolrServer (which means JRuby works great).
> >
> > The acts_as_solr plugin uses the solr-ruby library to communicate with
> > Solr.  Under solr-ruby, it's HTTP with ruby (wt=ruby) formatted responses
> > for searches, and documents being indexed get converted to Solr's XML format
> > and POSTed to the Solr URL used to open the Solr::Connection
> >
> >        Erik
> >
> >
> >
> >
> >> If Ruby is not using the HTTP to talk EmbeddedSolrServer, what is it
> >> using?
> >>
> >> Thanks and Regards
> >> Rajan Chandi
> >>
> >> On Thu, Sep 17, 2009 at 9:44 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:
> >>
> >>
> >>> On Sep 17, 2009, at 11:40 AM, Ian Connor wrote:
> >>>
> >>>> Is there any support for connection pooling or a more optimized data
> >>>> exchange format?
> >>>>
> >>> The solr-ruby library (as do other Solr + Ruby libraries) use the ruby
> >>> response format and eval it.  solr-ruby supports keeping the HTTP
> >>> connection alive too.
> >>>
> >>>> We are looking at any further ways to optimize the solr
> >>>> queries so we can possibly make more of them in the one request.
> >>>>
> >>>> The JSON like format seems pretty tight but I understand when the
> >>>> distributed search takes place it uses a binary protocol instead of
> >>>> text. I wanted to know if that was available or could be available via
> >>>> the ruby library.
> >>>>
> >>>> Is it possible to host a local shard and skip HTTP between ruby and
> >>>> solr?
> >>>>
> >>> If you use JRuby you can do some fancy stuff, like use the javabin update
> >>> and response formats so no XML is involved, and you could also use Solr's
> >>> EmbeddedSolrServer to avoid HTTP.   However, in practice rarely is HTTP
> >>> the bottleneck and actually offers a lot of advantages, such as easy
> >>> commodity load balancing and caching.
> >>>
> >>> But JRuby + Solr is a very beautiful way to go!
> >>>
> >>> If you're using MRI Ruby, though, you don't really have any options other
> >>> than to go over HTTP. You could use json or ruby formatted responses -
> >>> I'd be curious to see some performance numbers comparing those two.
> >>>
> >>>      Erik
> >>>
> >>>
> >>>
> >

Re: Solr via ruby

Posted by Ian Connor <ia...@gmail.com>.
Hi,

Thanks for the discussion. We use the distributed option, so I am not sure
embedded is possible.

As you also guessed, we use HAProxy for load balancing and failover between
replicas of the shards, so giving this up for a minor performance boost is
probably not wise.

So essentially we have: User -> HTTP Load Balancer -> Mongrel Cluster ->
HAProxy -> N x Solr Shards

It looks like that is the standard setup for performance from what you
suggest here, and most of the performance tweaks I thought of are already in
use.

Ian.

On Fri, Sep 18, 2009 at 3:09 AM, Erik Hatcher <er...@gmail.com> wrote:

>
> On Sep 18, 2009, at 1:09 AM, rajan chandi wrote:
>
>> We are planning to use the external Solr on tomcat for scalability
>> reasons.
>>
>> We thought that EmbeddedSolrServer uses HTTP too to talk with Ruby and
>> vise-versa as in acts_as_solr ruby plugin.
>>
>
> EmbeddedSolrServer is a way to run Solr as an API (like Lucene) rather than
> with any web container involved at all.  In other words, only Java can use
> EmbeddedSolrServer (which means JRuby works great).
>
> The acts_as_solr plugin uses the solr-ruby library to communicate with
> Solr.  Under solr-ruby, it's HTTP with ruby (wt=ruby) formatted responses
> for searches, and documents being indexed get converted to Solr's XML format
> and POSTed to the Solr URL used to open the Solr::Connection
>
>        Erik
>
>
>
>
>> If Ruby is not using the HTTP to talk EmbeddedSolrServer, what is it
>> using?
>>
>> Thanks and Regards
>> Rajan Chandi
>>
>> On Thu, Sep 17, 2009 at 9:44 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:
>>
>>
>>> On Sep 17, 2009, at 11:40 AM, Ian Connor wrote:
>>>
>>>> Is there any support for connection pooling or a more optimized data
>>>> exchange format?
>>>>
>>> The solr-ruby library (as do other Solr + Ruby libraries) use the ruby
>>> response format and eval it.  solr-ruby supports keeping the HTTP
>>> connection alive too.
>>>
>>>> We are looking at any further ways to optimize the solr
>>>> queries so we can possibly make more of them in the one request.
>>>>
>>>> The JSON like format seems pretty tight but I understand when the
>>>> distributed search takes place it uses a binary protocol instead of
>>>> text. I wanted to know if that was available or could be available via
>>>> the ruby library.
>>>>
>>>> Is it possible to host a local shard and skip HTTP between ruby and
>>>> solr?
>>>>
>>> If you use JRuby you can do some fancy stuff, like use the javabin update
>>> and response formats so no XML is involved, and you could also use Solr's
>>> EmbeddedSolrServer to avoid HTTP.   However, in practice rarely is HTTP
>>> the bottleneck and actually offers a lot of advantages, such as easy
>>> commodity load balancing and caching.
>>>
>>> But JRuby + Solr is a very beautiful way to go!
>>>
>>> If you're using MRI Ruby, though, you don't really have any options other
>>> than to go over HTTP. You could use json or ruby formatted responses -
>>> I'd be curious to see some performance numbers comparing those two.
>>>
>>>      Erik
>>>
>>>
>>>
>



Re: Solr via ruby

Posted by Erik Hatcher <er...@gmail.com>.
On Sep 18, 2009, at 1:09 AM, rajan chandi wrote:
> We are planning to use the external Solr on tomcat for scalability  
> reasons.
>
> We thought that EmbeddedSolrServer uses HTTP too to talk with Ruby and
> vise-versa as in acts_as_solr ruby plugin.

EmbeddedSolrServer is a way to run Solr as an API (like Lucene), with no
web container involved at all.  In other words, only code running on the
JVM can use EmbeddedSolrServer (which means JRuby works great).

The acts_as_solr plugin uses the solr-ruby library to communicate with
Solr.  Under solr-ruby, it's HTTP with Ruby-formatted (wt=ruby)
responses for searches, and documents being indexed get converted to
Solr's XML format and POSTed to the Solr URL used to open the
Solr::Connection.
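
A minimal sketch of that indexing path using only Ruby's standard library
rather than solr-ruby itself: documents given as Hashes are serialized into
Solr's <add><doc> XML and POSTed to the update handler. The helper names
and the /solr/update path are assumptions for illustration, not part of
solr-ruby.

```ruby
require 'net/http'

# Characters that must be escaped inside XML text and attribute values.
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&apos;'
}.freeze

def xml_escape(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Serialize an array of Hashes into a Solr <add> message.
# An Array value produces one <field> element per entry (multi-valued field).
def solr_add_xml(docs)
  body = docs.map do |doc|
    fields = doc.flat_map do |name, value|
      Array(value).map do |v|
        %(<field name="#{xml_escape(name)}">#{xml_escape(v)}</field>)
      end
    end
    "<doc>#{fields.join}</doc>"
  end
  "<add>#{body.join}</add>"
end

# POST the XML to Solr. The default path is an assumption; adjust it to
# wherever your Solr update handler is mounted.
def post_update(host, port, xml, path = '/solr/update')
  Net::HTTP.start(host, port) do |http|
    http.post(path, xml, 'Content-Type' => 'text/xml; charset=utf-8')
  end
end
```

For example, solr_add_xml([{ 'id' => '1', 'title' => 'Ruby & Solr' }])
builds the message, and post_update('localhost', 8983, xml) would send it.
Added documents only become searchable after a commit.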

	Erik


>
> If Ruby is not using the HTTP to talk EmbeddedSolrServer, what is it  
> using?
>
> Thanks and Regards
> Rajan Chandi
>
> On Thu, Sep 17, 2009 at 9:44 PM, Erik Hatcher <er...@gmail.com> wrote:
>
>>
>> On Sep 17, 2009, at 11:40 AM, Ian Connor wrote:
>>
>>> Is there any support for connection pooling or a more optimized data
>>> exchange format?
>>>
>>
>> The solr-ruby library (as do other Solr + Ruby libraries) use the ruby
>> response format and eval it.  solr-ruby supports keeping the HTTP
>> connection alive too.
>>
>>> We are looking at any further ways to optimize the solr
>>> queries so we can possibly make more of them in the one request.
>>>
>>> The JSON like format seems pretty tight but I understand when the
>>> distributed search takes place it uses a binary protocol instead of
>>> text. I wanted to know if that was available or could be available via
>>> the ruby library.
>>>
>>> Is it possible to host a local shard and skip HTTP between ruby and
>>> solr?
>>>
>>
>> If you use JRuby you can do some fancy stuff, like use the javabin update
>> and response formats so no XML is involved, and you could also use Solr's
>> EmbeddedSolrServer to avoid HTTP.   However, in practice rarely is HTTP
>> the bottleneck and actually offers a lot of advantages, such as easy
>> commodity load balancing and caching.
>>
>> But JRuby + Solr is a very beautiful way to go!
>>
>> If you're using MRI Ruby, though, you don't really have any options other
>> than to go over HTTP. You could use json or ruby formatted responses -
>> I'd be curious to see some performance numbers comparing those two.
>>
>>       Erik
>>
>>


Re: Solr via ruby

Posted by rajan chandi <ch...@gmail.com>.
We are planning to use an external Solr on Tomcat for scalability reasons.

We thought that EmbeddedSolrServer also uses HTTP to talk to Ruby and
vice versa, as in the acts_as_solr Ruby plugin.

If Ruby is not using HTTP to talk to EmbeddedSolrServer, what is it using?

Thanks and Regards
Rajan Chandi

On Thu, Sep 17, 2009 at 9:44 PM, Erik Hatcher <er...@gmail.com> wrote:

>
> On Sep 17, 2009, at 11:40 AM, Ian Connor wrote:
>
>> Is there any support for connection pooling or a more optimized data
>> exchange format?
>>
>
> The solr-ruby library (as do other Solr + Ruby libraries) use the ruby
> response format and eval it.  solr-ruby supports keeping the HTTP connection
> alive too.
>
>> We are looking at any further ways to optimize the solr
>> queries so we can possibly make more of them in the one request.
>>
>> The JSON like format seems pretty tight but I understand when the
>> distributed search takes place it uses a binary protocol instead of text.
>> I wanted to know if that was available or could be available via the ruby
>> library.
>>
>> Is it possible to host a local shard and skip HTTP between ruby and solr?
>>
>
> If you use JRuby you can do some fancy stuff, like use the javabin update
> and response formats so no XML is involved, and you could also use Solr's
> EmbeddedSolrServer to avoid HTTP.   However, in practice rarely is HTTP the
> bottleneck and actually offers a lot of advantages, such as easy commodity
> load balancing and caching.
>
> But JRuby + Solr is a very beautiful way to go!
>
> If you're using MRI Ruby, though, you don't really have any options other
> than to go over HTTP. You could use json or ruby formatted responses - I'd
> be curious to see some performance numbers comparing those two.
>
>        Erik
>
>

Re: Solr via ruby

Posted by Erik Hatcher <er...@gmail.com>.
On Sep 17, 2009, at 11:40 AM, Ian Connor wrote:
> Is there any support for connection pooling or a more optimized data
> exchange format?

The solr-ruby library (like other Solr + Ruby libraries) uses the Ruby
response format and evals it.  solr-ruby also supports keeping the HTTP
connection alive.
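
A minimal sketch of what this looks like with nothing but Ruby's standard
library (not solr-ruby's actual code): the request asks for wt=ruby, the
body is eval'd into a Hash, and one Net::HTTP session is reused for several
queries. The base URL, helper names, and sample queries are assumptions for
illustration.

```ruby
require 'net/http'
require 'uri'

# Build the /select URL for a query, asking Solr for the Ruby response writer.
def solr_select_uri(base_url, query, rows: 10)
  uri = URI.parse("#{base_url}/select")
  uri.query = URI.encode_www_form(q: query, rows: rows, wt: 'ruby')
  uri
end

# Turn a wt=ruby response body into a Hash.
# eval executes whatever text it is given, so only use it on responses
# from a Solr server you control.
def parse_ruby_response(body)
  eval(body)
end

# Run several queries over one persistent (keep-alive) HTTP connection:
# Net::HTTP.start opens the connection once and the block reuses it.
def search_all(base_url, queries)
  base = URI.parse(base_url)
  Net::HTTP.start(base.host, base.port) do |http|
    queries.map do |q|
      response = http.get(solr_select_uri(base_url, q).request_uri)
      parse_ruby_response(response.body)
    end
  end
end
```

With a Solr instance running, search_all('http://localhost:8983/solr',
['title:ruby', 'title:solr']) would issue both queries over the same
connection and return one Hash per query.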

> We are looking at any further ways to optimize the solr
> queries so we can possibly make more of them in the one request.
>
> The JSON like format seems pretty tight but I understand when the
> distributed search takes place it uses a binary protocol instead of
> text. I wanted to know if that was available or could be available via
> the ruby library.
>
> Is it possible to host a local shard and skip HTTP between ruby and
> solr?

If you use JRuby you can do some fancy stuff, like using the javabin
update and response formats so no XML is involved, and you could also
use Solr's EmbeddedSolrServer to avoid HTTP.  However, in practice HTTP
is rarely the bottleneck, and it actually offers a lot of advantages,
such as easy commodity load balancing and caching.

But JRuby + Solr is a very beautiful way to go!

If you're using MRI Ruby, though, you don't really have any options
other than to go over HTTP. You could use JSON- or Ruby-formatted
responses - I'd be curious to see some performance numbers comparing
those two.
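
A rough sketch of how such a comparison could be set up, assuming made-up
sample data: it times only the client-side parsing step (JSON.parse versus
eval), not the network or Solr itself. The helper names and the document
shape are hypothetical, and payload.inspect merely stands in for a real
wt=ruby body.

```ruby
require 'json'

# Build a response-shaped Hash with `doc_count` documents (made-up data).
def sample_response(doc_count)
  docs = (1..doc_count).map { |i| { 'id' => "doc#{i}", 'title' => "Title #{i}" } }
  { 'responseHeader' => { 'status' => 0, 'QTime' => 1 },
    'response' => { 'numFound' => doc_count, 'start' => 0, 'docs' => docs } }
end

# Seconds taken to run the given block `iterations` times.
def time_parse(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

payload   = sample_response(100)
json_body = JSON.generate(payload)  # stands in for a wt=json body
ruby_body = payload.inspect         # stands in for a wt=ruby body

json_secs = time_parse(1_000) { JSON.parse(json_body) }
eval_secs = time_parse(1_000) { eval(ruby_body) }

puts format('JSON.parse: %.4fs', json_secs)
puts format('eval:       %.4fs', eval_secs)
```

The numbers depend on the Ruby version, the JSON implementation, and the
size of the responses, so bodies captured from your own Solr server would
give a more meaningful result than this synthetic payload.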

	Erik