Posted to solr-user@lucene.apache.org by Peter Wolanin <pe...@acquia.com> on 2012/11/12 01:42:23 UTC

Solr 4.0 - distributed updates without zookeeper?

Looking at how we could upgrade some of our infrastructure to Solr 4.0
- I would really like to take advantage of distributed updates to get
NRT, but we want to keep our fixed master and slave server roles since
we use different hardware appropriate to the different roles.

Looking at the solr 4.0 distributed update code, it seems really
hard-coded and bound to zookeeper.  Is there a way to have a solr
master distribute updates without using ZK, or a way to mock the ZK
interface to provide a fixed cluster topology that will work when
sending updates just to the master?

To be clear, if the master goes down I don't want a slave promoted,
nor do I want most of the other SolrCloud features - we have already
built out a system for managing groups of servers.

Thanks,

Peter

Re: Solr 4.0 - distributed updates without zookeeper?

Posted by Peter Wolanin <pe...@acquia.com>.
So, from looking at the code and talking to some of the Lucid guys
today, it seems like there is no good way (currently) to control the
shard leader selection, or even to "fail back" if the preferred leader
server comes back up.

We currently let indexing fail if the one master goes down, but adding
HA there would be helpful in some cases.

-Peter

On Tue, Nov 13, 2012 at 9:12 PM, Peter Wolanin <pe...@acquia.com> wrote:
> Yes, basically I want to at least avoid leader election and the other
> dynamic behaviors.  I don't have any experience with ZK, and a lot of
> "magic" behavior seems baked in now, so I'm concerned I'd need to
> dig into ZK to debug or monitor what's really happening as we scale
> out.
>
> We also have a somewhat non-typical use case: lots of small
> cores/indexes on the same server, rather than large indexes that
> might need multiple shards.
>
> We have master servers that have persistent (but sometimes slower)
> storage, and slaves with faster non-persistent disk.
>
> My colleague noticed that there is a param to flag a server as
> eligible to be a shard leader, so I guess we could enable that for
> only the preferred master?
>
> I'm also having trouble understanding config handling from the docs.
> Even browsing the java code I don't see if Solr is creating the
> instance dirs, or somehow just linking to config files?  It sounds as
> though if I create a core using core admin, it would get associated
> with a collection of the same name.
>
> -Peter



-- 
Peter M. Wolanin, Ph.D.      : Momentum Specialist,  Acquia. Inc.
peter.wolanin@acquia.com : 781-313-8322

"Get a free, hosted Drupal 7 site: http://www.drupalgardens.com"

Re: Solr 4.0 - distributed updates without zookeeper?

Posted by Peter Wolanin <pe...@acquia.com>.
Yes, basically I want to at least avoid leader election and the other
dynamic behaviors.  I don't have any experience with ZK, and a lot of
"magic" behavior seems baked in now, so I'm concerned I'd need to
dig into ZK to debug or monitor what's really happening as we scale
out.

We also have a somewhat non-typical use case: lots of small
cores/indexes on the same server, rather than large indexes that
might need multiple shards.

We have master servers that have persistent (but sometimes slower)
storage, and slaves with faster non-persistent disk.

My colleague noticed that there is a param to flag a server as
eligible to be a shard leader, so I guess we could enable that for
only the preferred master?

I'm also having trouble understanding config handling from the docs.
Even browsing the java code I don't see if Solr is creating the
instance dirs, or somehow just linking to config files?  It sounds as
though if I create a core using core admin, it would get associated
with a collection of the same name.
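
For reference, a minimal sketch of what such a core admin request looks
like. It only builds the CoreAdmin CREATE URL and sends nothing. The
base URL, core name and instance dir are made-up examples, and the
helper name is hypothetical. The assumption that a core created without
a "collection" parameter ends up in a collection named after the core
is only my reading of the docs and should be verified against a running
Solr 4.0 node.

```python
from urllib.parse import urlencode


def core_create_url(base_url, name, instance_dir, collection=None):
    """Build a CoreAdmin CREATE URL (hypothetical helper, sends nothing).

    If `collection` is omitted, the parameter is left out of the request
    and Solr is assumed to fall back to its own default for it.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
    }
    if collection is not None:
        # Explicitly tie the new core to a named collection.
        params["collection"] = collection
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)


if __name__ == "__main__":
    # Example values only; adjust host, port and names for a real setup.
    print(core_create_url("http://localhost:8983/solr", "site1", "site1"))
    print(core_create_url("http://localhost:8983/solr", "site1", "site1",
                          collection="sites"))
```

Passing the collection explicitly makes the core-to-collection mapping
visible in the request instead of relying on a default.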

-Peter

On Mon, Nov 12, 2012 at 9:41 PM, Otis Gospodnetic
<ot...@gmail.com> wrote:
> Hi Peter,
>
> Not sure I have the answer for you, but are you looking to avoid using ZK
> for some reason?
> Or are you OK with ZK per se, but just don't want any leader re-election
> and any other dynamic/cloudy behaviour?
>
> Could you not simply treat 1 node as the "master" to which you send all
> your updates and let SolrCloud distribute that to the rest of the cluster?
> Is your main/only worry around what happens if this 1 node that you
> designated as the master goes down? What would you like to happen?  You'd
> like indexing to start failing, while the search functionality remains up?
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Sun, Nov 11, 2012 at 7:42 PM, Peter Wolanin <pe...@acquia.com> wrote:
>
>> Looking at how we could upgrade some of our infrastructure to Solr 4.0
>> - I would really like to take advantage of distributed updates to get
>> NRT, but we want to keep our fixed master and slave server roles since
>> we use different hardware appropriate to the different roles.
>>
>> Looking at the solr 4.0 distributed update code, it seems really
>> hard-coded and bound to zookeeper.  Is there a way to have a solr
>> master distribute updates without using ZK, or a way to mock the ZK
>> interface to provide a fixed cluster topology that will work when
>> sending updates just to the master?
>>
>> To be clear, if the master goes down I don't want a slave promoted,
>> nor do I want most of the other SolrCloud features - we have already
>> built out a system for managing groups of servers.
>>
>> Thanks,
>>
>> Peter
>>




Re: Solr 4.0 - distributed updates without zookeeper?

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Peter,

Not sure I have the answer for you, but are you looking to avoid using ZK
for some reason?
Or are you OK with ZK per se, but just don't want any leader re-election
and any other dynamic/cloudy behaviour?

Could you not simply treat 1 node as the "master" to which you send all
your updates and let SolrCloud distribute that to the rest of the cluster?
Is your main/only worry around what happens if this 1 node that you
designated as the master goes down? What would you like to happen?  You'd
like indexing to start failing, while the search functionality remains up?
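
The behaviour asked about here (updates only ever go to one fixed node,
searches keep working from the others, nothing gets promoted) can be
sketched on the client side roughly as follows. This is an illustration
under assumptions, not anything SolrCloud provides: the class name and
host names are hypothetical, it only computes URLs, and it does not
involve ZK or change how Solr itself distributes updates.

```python
from itertools import cycle


class FixedRoleRouter:
    """Route updates to one fixed master and queries to slaves in turn.

    Hypothetical client-side helper: there is no failover logic, so if
    the master is down, update requests simply fail while select
    requests keep going to the slaves.
    """

    def __init__(self, master_url, slave_urls):
        if not slave_urls:
            raise ValueError("at least one slave URL is required")
        self.master_url = master_url.rstrip("/")
        # Round-robin over the slaves; the master is never queried here.
        self._slaves = cycle([u.rstrip("/") for u in slave_urls])

    def update_url(self, core):
        # Every update for every core goes to the single fixed master.
        return f"{self.master_url}/{core}/update"

    def select_url(self, core):
        # Each call returns the next slave in the rotation.
        return f"{next(self._slaves)}/{core}/select"


if __name__ == "__main__":
    router = FixedRoleRouter(
        "http://master.example.com:8983/solr",
        ["http://slave1.example.com:8983/solr",
         "http://slave2.example.com:8983/solr"],
    )
    print(router.update_url("site1"))
    print(router.select_url("site1"))
    print(router.select_url("site1"))
```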

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Sun, Nov 11, 2012 at 7:42 PM, Peter Wolanin <pe...@acquia.com> wrote:

> Looking at how we could upgrade some of our infrastructure to Solr 4.0
> - I would really like to take advantage of distributed updates to get
> NRT, but we want to keep our fixed master and slave server roles since
> we use different hardware appropriate to the different roles.
>
> Looking at the solr 4.0 distributed update code, it seems really
> hard-coded and bound to zookeeper.  Is there a way to have a solr
> master distribute updates without using ZK, or a way to mock the ZK
> interface to provide a fixed cluster topology that will work when
> sending updates just to the master?
>
> To be clear, if the master goes down I don't want a slave promoted,
> nor do I want most of the other SolrCloud features - we have already
> built out a system for managing groups of servers.
>
> Thanks,
>
> Peter
>