Posted to dev@lucene.apache.org by Upayavira <uv...@odoko.co.uk> on 2016/02/25 17:12:53 UTC

Zookeeper and Solr Clients

I've recently had a patch merged into Pysolr that adds ZK awareness
(compatible with clusterstate.json). Now I need to update it to be
compatible with the newer state.json, and I just wanted to confirm my
understanding.

If we create a Python 'client' that is tied to a specific collection,
then all I need to do is set up a watch on
/collections/${collection}/state.json, and update the list of nodes
accordingly (as I would have on a watch on clusterstate.json) when
state.json changes.
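
For illustration, here is a minimal sketch of the "update the list of
nodes" part. It assumes state.json holds one top-level key per collection,
with shards and replicas nested beneath it, and that each replica carries
base_url, node_name and state fields. The sample document, host names and
function names are made up for the example.

```python
import json

# Hypothetical sample of what /collections/mycoll/state.json might contain.
SAMPLE_STATE = b"""
{"mycoll": {"shards": {"shard1": {"state": "active", "replicas": {
  "core_node1": {"base_url": "http://host1:8983/solr",
                 "node_name": "host1:8983_solr",
                 "state": "active", "leader": "true"},
  "core_node2": {"base_url": "http://host2:8983/solr",
                 "node_name": "host2:8983_solr",
                 "state": "down"}
}}}}}
"""


def replicas_from_state(raw, collection):
    """Return one dict per replica of `collection`, tagged with its shard name."""
    state = json.loads(raw.decode("utf-8"))
    shards = state.get(collection, {}).get("shards", {})
    replicas = []
    for shard_name, shard in sorted(shards.items()):
        for replica in shard.get("replicas", {}).values():
            replicas.append(dict(replica, shard=shard_name))
    return replicas


def base_urls(raw, collection):
    """Sorted, de-duplicated base URLs of every replica listed for the collection."""
    return sorted({r["base_url"] for r in replicas_from_state(raw, collection)})


print(base_urls(SAMPLE_STATE, "mycoll"))
# prints ['http://host1:8983/solr', 'http://host2:8983/solr']
```

A watch callback would call something like replicas_from_state() each
time state.json changes and swap in the new list.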

There's a lot more that *could* be done, but for the basics, it seems
that's enough.

Is it really this simple?

Upayavira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Zookeeper and Solr Clients

Posted by Upayavira <uv...@odoko.co.uk>.

Perfect. So, when I want to find a node to talk to, I do:
* locate state.json or clusterstate.json
* identify a suitable node
* confirm the node is live, and if not, repeat from the previous step

Then I should be good.
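
The steps above can be sketched roughly as follows. This is only an
illustration, not what Pysolr does: load_replicas and load_live_nodes are
hypothetical callables that re-read ZooKeeper and return, respectively, a
list of replica dicts (with node_name, base_url and state keys, an assumed
layout) and the names of the children under /live_nodes.

```python
import random


def pick_node(load_replicas, load_live_nodes, max_attempts=3, rng=random):
    """Return the base URL of an active replica hosted on a live node."""
    for _ in range(max_attempts):
        # Step 1: locate the published state (re-read on every attempt).
        replicas = load_replicas()
        live = set(load_live_nodes())
        # Step 2: identify suitable candidates, in random order to spread load.
        candidates = [r for r in replicas if r.get("state") == "active"]
        rng.shuffle(candidates)
        # Step 3: confirm the node is live, otherwise try the next candidate.
        for replica in candidates:
            if replica["node_name"] in live:
                return replica["base_url"]
    raise RuntimeError("no active replica on a live node")
```

If no candidate is live, the outer loop re-reads the state before giving
up, since the published state may simply be stale.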

Upayavira


On Fri, Feb 26, 2016, at 08:32 PM, Mark Miller wrote:
> Right, clusterstate.json is never used by itself to determine a
> replica's state. It's always: is it live? Then find its state in
> clusterstate.json. If it's not live, the state can be anything in
> clusterstate.json and should be ignored.
>
> We make some best efforts to keep it up to date, but it should not be
> counted on (and can't always be counted on), and the above logic is
> how all Solr code reads state.
>
> - Mark

Re: Zookeeper and Solr Clients

Posted by Mark Miller <ma...@gmail.com>.

Right, clusterstate.json is never used by itself to determine a replica's
state. It's always: is it live? Then find its state in clusterstate.json.
If it's not live, the state can be anything in clusterstate.json and should
be ignored.

We make some best efforts to keep it up to date, but it should not be
counted on (and can't always be counted on), and the above logic is how all
Solr code reads state.
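
As a rough sketch of that rule (not Solr's actual code): check liveness
first and consult the recorded state only for live nodes. Reporting "down"
for a replica on a non-live node is a choice made for this example, and the
replica dict layout (node_name, state) is assumed.

```python
def effective_state(replica, live_nodes):
    """Liveness first: the recorded state only counts if the node is live."""
    if replica["node_name"] not in live_nodes:
        # Whatever clusterstate.json says here is stale and is ignored.
        return "down"
    return replica.get("state", "down")


# A crashed node can still be recorded as "active" in the published state.
stale = {"node_name": "host1:8983_solr", "state": "active"}
print(effective_state(stale, live_nodes={"host2:8983_solr"}))
# prints down
```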

- Mark

On Fri, Feb 26, 2016 at 2:13 PM Scott Blum <dr...@gmail.com> wrote:

> Published cluster state always lags.  And if a Solr node crashes, the
> status on affected replicas won't actually change until the owning
> instance tries to come back up.  If you're working on a generally reusable
> library, you'd want to also watch live_nodes.
>
- Mark
about.me/markrmiller

Re: Zookeeper and Solr Clients

Posted by Scott Blum <dr...@gmail.com>.

Published cluster state always lags.  And if a Solr node crashes, the
status on affected replicas won't actually change until the owning
instance tries to come back up.  If you're working on a generally reusable
library, you'd want to also watch live_nodes.
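
A minimal sketch of watching both, assuming a client object that offers
DataWatch(path, func) and ChildrenWatch(path, func) the way the kazoo
library's KazooClient does. The class name, attribute names and the replica
dict layout are assumptions made for the example.

```python
import json


class CollectionView:
    """Tracks one collection's replicas together with the set of live nodes."""

    def __init__(self, zk, collection):
        self.collection = collection
        self.replicas = []
        self.live_nodes = set()
        # Data watch on the collection's state, child watch on live_nodes.
        zk.DataWatch("/collections/%s/state.json" % collection, self._on_state)
        zk.ChildrenWatch("/live_nodes", self._on_live_nodes)

    def _on_state(self, data, stat, event=None):
        if data is None:  # the znode does not exist (yet, or any more)
            self.replicas = []
            return
        state = json.loads(data.decode("utf-8"))
        shards = state.get(self.collection, {}).get("shards", {})
        self.replicas = [r for s in shards.values()
                         for r in s.get("replicas", {}).values()]

    def _on_live_nodes(self, children):
        self.live_nodes = set(children)

    def usable_urls(self):
        """Base URLs of replicas that are both live and recorded as active."""
        return sorted(r["base_url"] for r in self.replicas
                      if r.get("state") == "active"
                      and r["node_name"] in self.live_nodes)
```

With kazoo this might be used as CollectionView(zk, "mycoll") after
zk = KazooClient(hosts="zk1:2181") and zk.start(), where the host and the
collection name are placeholders. Watch callbacks run on the client's own
thread, so each callback replaces a whole attribute rather than mutating
it in place.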

On Fri, Feb 26, 2016 at 5:23 AM, Upayavira <uv...@odoko.co.uk> wrote:

> This is for making a ZK aware Pysolr client (i.e. Python equiv of SolrJ
> CloudSolrClient). It clearly needs to watch ZK to be able to update the
> list of hosts that make up a collection. We can't use the API, because
> we don't yet know where the Solr nodes are!
>
> Upayavira

Re: Zookeeper and Solr Clients

Posted by Upayavira <uv...@odoko.co.uk>.

This is for making a ZK aware Pysolr client (i.e. Python equiv of SolrJ
CloudSolrClient). It clearly needs to watch ZK to be able to update the
list of hosts that make up a collection. We can't use the API, because
we don't yet know where the Solr nodes are!

Upayavira

On Fri, Feb 26, 2016, at 09:09 AM, Noble Paul wrote:
> Why do you need to watch anything? You can get the whole clusterstate
> using the API. ZK access is not required.

Re: Zookeeper and Solr Clients

Posted by Noble Paul <no...@gmail.com>.

Why do you need to watch anything? You can get the whole clusterstate
using the API. ZK access is not required.
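
For reference, a small sketch of the API route, assuming the Collections
API CLUSTERSTATUS action and a JSON response with a top-level "cluster"
object holding "collections" and "live_nodes". The helper names and the
base URL are hypothetical, and the approach needs the address of at least
one Solr node up front.

```python
import json
from urllib.parse import urlencode


def clusterstatus_url(base_url, collection=None):
    """Build a CLUSTERSTATUS request URL against one known Solr base URL."""
    params = {"action": "CLUSTERSTATUS", "wt": "json"}
    if collection:
        params["collection"] = collection
    return "%s/admin/collections?%s" % (base_url.rstrip("/"), urlencode(params))


def parse_clusterstatus(body):
    """Return (collections dict, set of live node names) from a response body."""
    cluster = json.loads(body)["cluster"]
    return cluster.get("collections", {}), set(cluster.get("live_nodes", []))


print(clusterstatus_url("http://host1:8983/solr", "mycoll"))
# prints http://host1:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json&collection=mycoll
```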

-- 
-----------------------------------------------------
Noble Paul


Re: Zookeeper and Solr Clients

Posted by Upayavira <uv...@odoko.co.uk>.

How does that help me? The live_nodes watch tells me when nodes go up
and down, but surely I should be waiting for the overseer to do the same
and update state.json. I'd just want live_nodes watched for situations
where someone wants to do, say, collections API calls that aren't
specific to a collection, I presume.

Upayavira


On Thu, Feb 25, 2016, at 04:45 PM, Scott Blum wrote:
> You probably also want a child watch on live_nodes to monitor
> connected nodes.

Re: Zookeeper and Solr Clients

Posted by Scott Blum <dr...@gmail.com>.

You probably also want a child watch on live_nodes to monitor connected
nodes.
