Posted to solr-user@lucene.apache.org by Joachim Martin <jm...@path-works.com> on 2006/06/07 20:06:26 UTC

embedding solr in a webapp?

Hi,

We are looking at running read-only solr nodes embedded in our webapp 
nodes.  This would give us the
additional features of solr over lucene, but would keep it in memory and 
reduce the overhead of http/xml
transport of results.

Looks like we would just create a request handler and call 
handleRequest(req,rsp), and deal with the
search results DocList ourselves.
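
For illustration, the in-process call described above might look roughly like this. This is a pseudocode sketch in Java-like syntax: the class and method names approximate Solr's internal API of this era and should be verified against the actual Solr source, not treated as a definitive recipe.

```java
// Pseudocode sketch: querying an embedded Solr core in-process,
// skipping the HTTP request and XML serialization entirely.
// Names approximate Solr's internal API; check your Solr version.
SolrCore core = SolrCore.getSolrCore();   // core initialized from solrconfig.xml/schema.xml

SolrQueryRequest req = new LocalSolrQueryRequest(core, queryString,
                                                 "standard", 0, 10, params);
SolrQueryResponse rsp = new SolrQueryResponse();
core.execute(req, rsp);                   // dispatches to the registered request handler

// Pull the DocList out of the response and walk the internal doc ids.
DocList docs = (DocList) rsp.getValues().get("response");
DocIterator iter = docs.iterator();
while (iter.hasNext()) {
    int luceneDocId = iter.nextDoc();     // internal Lucene doc id
    // resolve the stored unique-id field via the searcher here,
    // then do the secondary db lookup
}
req.close();                              // release the searcher reference
```

The `req.close()` at the end matters in a long-running webapp: it releases the reference-counted searcher so index replication can swap in a new index underneath.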

Would there be any reason why this sort of setup would prohibit the use 
of index replication in a master/slave
setup?

Does this make sense?  As you might guess, speed is more important than 
flexibility.  We are using solr for
a content search, returning ids, and doing a secondary db lookup for 
extended entity information.

Thanks --Joachim

Re: embedding solr in a webapp?

Posted by Joachim Martin <jm...@path-works.com>.
Certainly running a load balanced solr cluster will be our first 
approach, I was just wondering if there were
any glaring problems with running solr embedded in each webapp node.  
Sounds like there are not.

As for the secondary db lookup, those will be cached, and are necessary 
to filter results further based on
time (schedule) restrictions.

We will probably also implement a custom ResponseWriter that just 
returns a comma-separated list of ids: the IPC time is just one 
component of the overhead; XML parsing is another.
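
The id-joining at the heart of such a writer could be as simple as the following. This is a standalone illustration in plain Java (class name and method are hypothetical); in a real ResponseWriter the ids would come from the DocList's stored unique-key field and the result would be written to the response's Writer rather than returned.

```java
import java.util.Arrays;
import java.util.List;

public class CsvIdJoiner {

    // Join document ids into one comma-separated line -- the entire
    // payload a stripped-down ResponseWriter would emit, in place of
    // the full XML response.
    static String join(List<String> ids) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(ids.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("doc1", "doc2", "doc3")));
    }
}
```

With no angle brackets or escaping to parse on the client side, the consumer can recover the ids with a single split on `,`.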

Thanks  --Joachim

Yonik Seeley wrote:
> On 6/7/06, Joachim Martin <jm...@path-works.com> wrote:
>> We are looking at running read-only solr nodes embedded in our webapp
>> nodes.  This would give us the
>> additional features of solr over lucene, but would keep it in memory and
>> reduce the overhead of http/xml
>> transport of results.
>>
>> Looks like we would just create a request handler and call
>> handleRequest(req,rsp), and deal with the
>> search results DocList ourselves.
>
> Yes, that should work fine.
>
>> Would there be any reason why this sort of setup would prohibit the use
>> of index replication in a master/slave
>> setup?
>
> No, that should still work fine.
>
>> Does this make sense?  As you might guess, speed is more important than
>> flexibility.
>
> It can make sense in certain cases... but it does cut down on your
> flexibility to size the search tier independently of the appserver
> tier.
>
> Eliminating the IPC might get you 5% more performance, but at what
> development & flexibility cost?  It's easier to buy a slightly faster
> box, or simply add another server if you are running behind a
> load-balancer.  You know your situation best of course :-)
>
>>  We are using solr for
>> a content search, returning ids, and doing a secondary db lookup for
>> extended entity information.
>
> You go through the trouble of avoiding one IPC call, but you add it
> back in with the DB lookup... are the fields too large to store in
> Lucene?
>
> -Yonik


Re: embedding solr in a webapp?

Posted by Yonik Seeley <ys...@gmail.com>.
On 6/7/06, Joachim Martin <jm...@path-works.com> wrote:
> We are looking at running read-only solr nodes embedded in our webapp
> nodes.  This would give us the
> additional features of solr over lucene, but would keep it in memory and
> reduce the overhead of http/xml
> transport of results.
>
> Looks like we would just create a request handler and call
> handleRequest(req,rsp), and deal with the
> search results DocList ourselves.

Yes, that should work fine.

> Would there be any reason why this sort of setup would prohibit the use
> of index replication in a master/slave
> setup?

No, that should still work fine.

> Does this make sense?  As you might guess, speed is more important than
> flexibility.

It can make sense in certain cases... but it does cut down on your
flexibility to size the search tier independently of the appserver
tier.

Eliminating the IPC might get you 5% more performance, but at what
development & flexibility cost?  It's easier to buy a slightly faster
box, or simply add another server if you are running behind a
load-balancer.  You know your situation best of course :-)

>  We are using solr for
> a content search, returning ids, and doing a secondary db lookup for
> extended entity information.

You go through the trouble of avoiding one IPC call, but you add it
back in with the DB lookup... are the fields too large to store in
Lucene?

-Yonik