Posted to solr-user@lucene.apache.org by Monica Skidmore <Mo...@careerbuilder.com> on 2018/05/01 13:41:50 UTC
Re: Load Balancing between Two Cloud Clusters
Thank you, Erick. This is exactly the information I needed but hadn't correctly parsed as a new Solr cloud user. You've just made setting up our new configuration much easier!!
Monica Skidmore
Senior Software Engineer
On 4/30/18, 7:29 PM, "Erick Erickson" <er...@gmail.com> wrote:
"We need a way to determine that a node is still 'alive' and should be
in the load balancer, and we need a way to know that a new node is now
available and fully ready with its replicas to add to the load
balancer."
Why? If a Solr node is running but the replicas aren't up yet, it'll
pass the request along to a node that _does_ have live replicas; you
don't have to do anything. As far as knowing the node is alive, there
are lots of ways. Any API endpoint has to have a running Solr instance
to field it, so perhaps just use the Collections API LIST command?
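As a rough sketch of that health-check idea (the endpoint path and response shape below are assumptions based on the Collections API LIST docs; verify against your Solr version before relying on them):

```python
import json

def is_alive(raw_response: str, expected_collection: str) -> bool:
    """Treat a node as healthy if its Collections API LIST call
    returned status 0 and it knows about our collection.
    (Response shape assumed; check your Solr version's docs.)"""
    try:
        body = json.loads(raw_response)
    except json.JSONDecodeError:
        return False
    header_ok = body.get("responseHeader", {}).get("status") == 0
    return header_ok and expected_collection in body.get("collections", [])

# Example of what a node might return for
# GET /solr/admin/collections?action=LIST&wt=json
sample = '{"responseHeader":{"status":0},"collections":["my_collection"]}'
print(is_alive(sample, "my_collection"))  # True
```

A load balancer would call that endpoint on each node and evict nodes whose check fails or times out.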
"How does ZooKeeper make this determination? Does it do something
different if multiple collections are on a single cluster? And, even
with just one cluster, what is best practice for keeping a current
list of active nodes in the cluster, especially for extremely high
query rates?"
This is a common misconception. ZooKeeper isn't interested in Solr at
all. ZooKeeper will ping the nodes it knows about and, perhaps, remove
a node from the live_nodes list, but that's all. It isn't involved in
Solr's operation in terms of routing queries, updates or anything like
that.
_Solr_ keeps track of all this by _watching_ various znodes. Say a
Solr node hosts some replicas in a collection. When it comes up, it
sets a "watch" on the /collections/my_collection/state.json znode. It
also publishes its own state. So say it hosts three replicas for the
collection: as each one is loaded and ready for action, Solr posts an
update to the relevant state.json file.
ZooKeeper is then responsible for telling any other node that set a
watch that the znode has changed. ZK doesn't know or care whether
those watchers are Solr nodes or not.
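The watch mechanism can be sketched in miniature (this is a toy in-memory model, not ZooKeeper's actual API; the class and method names here are made up for illustration):

```python
class ToyZk:
    """A toy stand-in for ZooKeeper's znode-plus-watch behavior:
    stores data per path and fires one-shot callbacks on change."""
    def __init__(self):
        self.data = {}
        self.watchers = {}   # path -> list of callbacks

    def set_watch(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def write(self, path, value):
        self.data[path] = value
        # Notify everyone watching this path. Like real ZK watches,
        # these are one-shot: watchers must re-register after firing.
        # ZK neither knows nor cares that the watchers are Solr nodes.
        for cb in self.watchers.pop(path, []):
            cb(path, value)

zk = ToyZk()
seen = []
zk.set_watch("/collections/my_collection/state.json",
             lambda path, value: seen.append(value))
zk.write("/collections/my_collection/state.json", {"S1R1": "active"})
print(seen)  # [{'S1R1': 'active'}]
```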
So when a request comes in to a Solr node, it knows what other Solr
nodes host what particular replicas and does all the sub-requests
itself, ZK isn't involved at all at that level.
So imagine node1 hosts S1R1 and S2R1, and node2 hosts S1R2 and S2R2
(for collection A). When node1 comes up, it updates the state in ZK to
say S1R1 and S2R1 are "active". Now say node2 is coming up but hasn't
loaded its cores yet. If it receives a request, it can forward it on
to node1.
Now node2 loads both its cores. It updates the ZK node for the
collection, and since node1 is watching, it fetches the updated
state.json. From this point forward, both nodes have complete
information about all the replicas in the collection and don't need to
reference ZK any more at all.
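A rough sketch of what routing off that cached state looks like (a simplified in-memory model; real state.json carries much more structure than this):

```python
# Each node's cached view of the cluster after both nodes are up:
# shard -> replica name -> (hosting node, status)
cached_state = {
    "shard1": {"S1R1": ("node1", "active"), "S1R2": ("node2", "active")},
    "shard2": {"S2R1": ("node1", "active"), "S2R2": ("node2", "active")},
}

def nodes_for_query(state):
    """Pick one active replica per shard -- every shard must be
    covered for a distributed query to return complete results."""
    chosen = {}
    for shard, replicas in state.items():
        for name, (node, status) in sorted(replicas.items()):
            if status == "active":
                chosen[shard] = node
                break
    return chosen

print(nodes_for_query(cached_state))
# {'shard1': 'node1', 'shard2': 'node1'}
```

The real routing also spreads load across replicas rather than always taking the first one; this just shows that the node can answer "who do I sub-query?" entirely from its cache.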
In fact, ZK can completely go away and _queries_ can continue to work
off their cached state.json. Updates will fail since ZK quorums are
required for updates to indexes to prevent "split brain" problems.
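That split (reads keep working, writes refuse) might be sketched like so (a toy model; the quorum rule itself, majority of the ensemble, is the real ZooKeeper requirement):

```python
def zk_has_quorum(live_zk_nodes: int, ensemble_size: int) -> bool:
    # ZooKeeper needs a strict majority of the ensemble for writes.
    return live_zk_nodes > ensemble_size // 2

def handle(request_type, zk_live, zk_ensemble):
    if request_type == "query":
        return "served from cached state"   # ZK not consulted at all
    if not zk_has_quorum(zk_live, zk_ensemble):
        return "rejected: no ZK quorum"     # refuse writes: no split brain
    return "indexed"

print(handle("query",  0, 3))  # served from cached state
print(handle("update", 1, 3))  # rejected: no ZK quorum
print(handle("update", 2, 3))  # indexed
```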
Best,
Erick
On Mon, Apr 30, 2018 at 11:03 AM, Monica Skidmore
<Mo...@careerbuilder.com> wrote:
> Thank you, Erick. That confirms our understanding for a single cluster, or once we select a node from one of the two clusters to query.
>
> As we try to set up an external load balancer to go between two clusters, though, we still have some questions. We need a way to determine that a node is still 'alive' and should be in the load balancer, and we need a way to know that a new node is now available and fully ready with its replicas to add to the load balancer.
>
> How does ZooKeeper make this determination? Does it do something different if multiple collections are on a single cluster? And, even with just one cluster, what is best practice for keeping a current list of active nodes in the cluster, especially for extremely high query rates?
>
> Again, if there's some good documentation on this, I'd love a pointer...
>
> Monica Skidmore
> Senior Software Engineer
>
>
>
> On 4/30/18, 1:09 PM, "Erick Erickson" <er...@gmail.com> wrote:
>
> Multiple clusters with the same dataset aren't load-balanced by Solr,
> you'll have to accomplish that from "outside", e.g. something that sends
> queries to each cluster.
>
> _Within_ a cluster (collection), as long as a request gets to any Solr
> node, sub-requests are distributed with an internal software LB. As far as
> a single collection, you're fine just sending any query to any node. Even
> if you send a query to a node that hosts no replicas for a collection, Solr
> will "do the right thing" and forward it appropriately.
>
> HTH,
> Erick
>
> On Mon, Apr 30, 2018 at 9:46 AM, Monica Skidmore <
> Monica.Skidmore@careerbuilder.com> wrote:
>
> > We are migrating from a master-slave configuration to Solr cloud (7.3) and
> > have questions about the preferred way to load balance between the two
> > clusters. It looks like we want to use a load balancer that directs
> > queries to any of the server nodes in either cluster, trusting that node to
> > handle the query correctly – true? If we auto-scale nodes into the
> > cluster, are there considerations about when a node becomes ‘ready’ to
> > query from a Solr perspective and when it is added to the load balancer?
> > Also, what is the preferred method of doing a health-check for the load
> > balancer – would it be “bin/solr healthcheck -c myCollection”?
> >
> >
> >
> > Pointers in the right direction – especially to any documentation on
> > running multiple clusters with the same dataset – would be appreciated.
> >
> >
> >
> > *Monica Skidmore*
> > *Senior Software Engineer*
> >
> >
> >
> >
> >
> >
>
>
Re: Load Balancing between Two Cloud Clusters
Posted by Erick Erickson <er...@gmail.com>.
Glad to help. Yeah, I thought you might have been making it harder
than it needed to be ;).
In SolrCloud you're constantly running up against "it's just magic
until it's not"; knowing when the magic applies and when it doesn't
can be tricky, very tricky.....
Basically, when using LBs, people just throw nodes at the LB when they
come up. If the Solr endpoints aren't available, they're skipped,
etc.....
I'll also add that SolrJ, the CloudSolrClient specifically, does all
this on the client side: it's ZK-aware, so it knows the topology of the
active Solr nodes and "does the right thing" via internal LBs.
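A client-side sketch of that idea (a toy model of what a ZK-aware client does; CloudSolrClient itself is SolrJ/Java and does far more, including shard-aware routing):

```python
import itertools

class ToyCloudClient:
    """Round-robins requests over the live nodes it learned from
    ZooKeeper, loosely mimicking a ZK-aware client's internal LB."""
    def __init__(self, live_nodes):
        self.live_nodes = list(live_nodes)
        self._cycle = itertools.cycle(self.live_nodes)

    def pick_node(self):
        # A real client would refresh live_nodes via ZK watches
        # and drop nodes that disappear; this just rotates.
        return next(self._cycle)

client = ToyCloudClient(["node1:8983", "node2:8983"])
print([client.pick_node() for _ in range(4)])
# ['node1:8983', 'node2:8983', 'node1:8983', 'node2:8983']
```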
Best,
Erick