Posted to solr-user@lucene.apache.org by Bernd Fehling <be...@uni-bielefeld.de> on 2017/05/08 11:38:45 UTC

distribution of leader and replica in SolrCloud

My assumption was that the strength of SolrCloud is the distribution
of leader and replica within the Cloud, making the Cloud somewhat failsafe.
But after setting up SolrCloud with a collection I have both the leader and
the replica of a shard on the same server. How is that failsafe?

o.a.s.h.a.CollectionsHandler Invoked Collection Action :create with params
replicationFactor=2&routerName=compositeId&collection.configName=boss&
maxShardsPerNode=1&name=boss&router.name=compositeId&action=CREATE&numShards=5

boss ------ shard1 ----- server2:7574
       |             |-- server2:8983 (leader)
       |
        --- shard2 ----- server1:8983
       |             |-- server5:7575 (leader)
       |
        --- shard3 ----- server3:8983 (leader)
       |             |-- server4:8983
       |
        --- shard4 ----- server1:7574 (leader)
       |             |-- server4:7574
       |
        --- shard5 ----- server3:7574 (leader)
                     |-- server5:8983
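
(For reference, the per-shard layout above can be checked at any time against any
live node with the Collections API CLUSTERSTATUS call, e.g.
http://server1:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=boss )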

From my point of view, if server2 crashes then shard1 will disappear and
1/5th of the index will be missing.

What is your opinion?

Regards
Bernd




Re: distribution of leader and replica in SolrCloud

Posted by Erick Erickson <er...@gmail.com>.
Also, you can specify custom placement rules, see:
https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement

But Shawn's statement is the nub of what you're seeing: by default,
multiple JVMs on the same physical machine are considered separate
Solr instances.

Also note that, if you want to, you can specify a createNodeSet when you
create the collection, and in particular the special value EMPTY. That'll
create a collection with no replicas, and you can then ADDREPLICA to
precisely place each one if you require that level of control.
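
For illustration only (an untested sketch using the collection name and hosts
from your setup; the node value format is host:port_solr):

http://server1:8983/solr/admin/collections?action=CREATE&name=boss&numShards=5&replicationFactor=2&collection.configName=boss&createNodeSet=EMPTY
http://server1:8983/solr/admin/collections?action=ADDREPLICA&collection=boss&shard=shard1&node=server1:8983_solr
http://server1:8983/solr/admin/collections?action=ADDREPLICA&collection=boss&shard=shard1&node=server2:8983_solr

...and so on, one ADDREPLICA per replica you want to place.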

Best,
Erick

On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>> boss ------ shard1 ----- server2:7574
>>        |             |-- server2:8983 (leader)
>
> The reason that this happened is because you've got two nodes running on
> every server.  From SolrCloud's perspective, there are ten distinct
> nodes, not five.
>
> SolrCloud doesn't notice the fact that different nodes are running on
> the same server(s).  If your reaction to hearing this is that it
> *should* notice, you're probably right, but in a typical use case, each
> server should only be running one Solr instance, so this would never happen.
>
> There is only one instance where I can think of where I would recommend
> running multiple instances per server, and that is when the required
> heap size for a single instance would be VERY large.  Running two
> instances with smaller heaps can yield better performance.
>
> See this issue:
>
> https://issues.apache.org/jira/browse/SOLR-6027
>
> Thanks,
> Shawn
>

Re: distribution of leader and replica in SolrCloud

Posted by Rick Leir <rl...@leirtech.com>.
Bernd,

Yes, cloud, ahhh. As you say, the world changed.  Do you have any hint 
from the cloud provider as to which physical machine your virtual server 
is on? If so, you can hopefully distribute your replicas across physical 
machines. This is not just for reliability: in a sharded system, each 
query will cause activity in several virtual servers and you would 
prefer that they are on separate physical machines, not competing for 
resources. Maybe, for Solr, you should choose a provider which can lease 
you the whole physical machine. You would prefer a 256G machine over 
several shards on 64G virtual machines.

And many cloud providers assume that servers are mostly idle, so they 
cram too many server containers into a machine. Then, very occasionally, 
you get an OOM even though you did not exceed your advertised RAM. This is 
a topic for some other forum; where should I look?

With AWS you can choose to locate your virtual machine in US-west-Oregon 
or US-east-i-forget or a few other locations, but that is a very coarse 
division. Can you choose physical machine?

With Google, it might be dynamic?
cheers -- Rick


On 2017-05-09 03:44 AM, Bernd Fehling wrote:
> I would name your solution more a work around as any similar solution of this kind.
> The issue SOLR-6027 is now 3 years open and the world has changed.
> Instead of racks full of blades where you had many dedicated bare metal servers
> you have now huge machines with 256GB RAM and many CPUs. Virtualization has taken place.
> To get under these conditions some independance from the physical hardware you have
> to spread the shards across several physical machines with virtual servers.
> >From my point of view it is a good solution to have 5 virtual 64GB servers
> on 5 different huge physical machines and start 2 instances on each virtual server.
> If I would split up each 64GB virtual server into two 32GB virtual server there would
> be no gain. We don't have 10 huge machines (no security win) and we have to admin
> and control 10 virtual servers instead of 5 (plus zookeeper servers).
>
> It is state of the art that you don't have to care about the servers within
> the cloud. This is the main sense of a cloud.
> The leader should always be aware who are the members of his cloud, how to reach
> them (IP address) and how are the users of the cloud (collections) distributed
> across the cloud.
>
> It would be great if a solution of issue SOLR-6027 would lead to some kind of
> "automatic mode" for server distribution, without any special configuring.
>
> Regards,
> Bernd
>
>
> Am 08.05.2017 um 17:47 schrieb Erick Erickson:
>> Also, you can specify custom placement rules, see:
>> https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
>>
>> But Shawn's statement is the nub of what you're seeing, by default
>> multiple JVMs on the same physical machine are considered separate
>> Solr instances.
>>
>> Also note that if you want to, you can specify a nodeSet when you
>> create the nodes, and in particular the special value EMPTY. That'll
>> create a collection with no replicas and you can ADDREPLICA to
>> precisely place each one if you require that level of control.
>>
>> Best,
>> Erick
>>
>> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org> wrote:
>>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>>> boss ------ shard1 ----- server2:7574
>>>>         |             |-- server2:8983 (leader)
>>> The reason that this happened is because you've got two nodes running on
>>> every server.  From SolrCloud's perspective, there are ten distinct
>>> nodes, not five.
>>>
>>> SolrCloud doesn't notice the fact that different nodes are running on
>>> the same server(s).  If your reaction to hearing this is that it
>>> *should* notice, you're probably right, but in a typical use case, each
>>> server should only be running one Solr instance, so this would never happen.
>>>
>>> There is only one instance where I can think of where I would recommend
>>> running multiple instances per server, and that is when the required
>>> heap size for a single instance would be VERY large.  Running two
>>> instances with smaller heaps can yield better performance.
>>>
>>> See this issue:
>>>
>>> https://issues.apache.org/jira/browse/SOLR-6027
>>>
>>> Thanks,
>>> Shawn
>>>


Re: distribution of leader and replica in SolrCloud

Posted by Erick Erickson <er...@gmail.com>.
Bernd:

You rarely have to worry about who the leader is unless and until you
get many 100s of shards. The extra work a leader does is usually
minimal and spending time trying to control where the leaders live is
usually time wasted. Leaders will shift from replica to replica
anyway. Say your leader for shard1 is on instance1, shard1_replica1.
Then you shut instance1 down. The leader will shift to some other
replica in the shard, say shard1_replica4.

If you insist, you can use the Collections API commands BALANCESHARDUNIQUE and
REBALANCELEADERS. The former assigns a "preferredLeader" property to one
replica in each shard and the latter tries to make those replicas the
actual leaders. If you really want to go all-out you can use
ADDREPLICAPROP to make the replica of your choice the preferredLeader.
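
Roughly, and untested (the replica name core_node3 below is only a placeholder;
the real core_nodeX names are in the cluster state):

http://server1:8983/solr/admin/collections?action=BALANCESHARDUNIQUE&collection=boss&property=preferredLeader
http://server1:8983/solr/admin/collections?action=REBALANCELEADERS&collection=boss
http://server1:8983/solr/admin/collections?action=ADDREPLICAPROP&collection=boss&shard=shard1&replica=core_node3&property=preferredLeader&property.value=true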

But this is generally a waste of time and energy. Those abilities were
added for a case where 100s of leaders wound up being in the same JVM
and the performance impact was noticeable. And even if you do assign
the preferredLeader role, that is just a hint, not a requirement. The
collection will tend to have the specified replicas be the leaders,
but only "tend".

Best,
Erick

On Tue, May 9, 2017 at 5:35 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> On 5/9/2017 1:44 AM, Bernd Fehling wrote:
>> From my point of view it is a good solution to have 5 virtual 64GB
>> servers on 5 different huge physical machines and start 2 instances on
>> each virtual server.
>
> If the total amount of memory in the virtual machine is 64GB, then I
> would run one Solr node on it with a heap size between 8 and 16GB.  The
> rest of the memory in the virtual machine would then be available to
> cache whatever index data exists.  That caching is extremely important
> for good performance.
>
> If the *heap* size is what would be 64GB (and you actually do need that
> much heap), then it *does* make sense to split that into two instances,
> each with a 31GB heap.  I would argue that it's better to have those two
> instances on separate machines.
>
> Assuming that you have a bare metal server with 256GB of RAM, you would
> *not* want to divide that up into five virtual machines each with 64GB.
> The physical host would not have enough memory for all five virtual
> machines.  It would have the option of using its disk space as extra
> memory, but as soon as you start swapping memory to disk, performance of
> ANY software becomes unacceptable.  Solr in particular requires actual
> real memory.  Oversubscribing memory on VMs might work for some
> workloads, but it won't work for Solr.
>
> If all your virtual machines are running on the same physical host, then
> you have no redundancy.  Modern servers have redundant power supplies,
> redundant hard drives, and other kinds of fault tolerance.  Even so,
> there are many components in a server that have no redundancy, like the
> motherboard, or the backplane.  If one of those components were to die,
> all of the virtual machines would go down.
>
> Thanks,
> Shawn
>

Re: distribution of leader and replica in SolrCloud

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/9/2017 1:44 AM, Bernd Fehling wrote:
> From my point of view it is a good solution to have 5 virtual 64GB
> servers on 5 different huge physical machines and start 2 instances on
> each virtual server. 

If the total amount of memory in the virtual machine is 64GB, then I
would run one Solr node on it with a heap size between 8 and 16GB.  The
rest of the memory in the virtual machine would then be available to
cache whatever index data exists.  That caching is extremely important
for good performance.

If the *heap* size is what would be 64GB (and you actually do need that
much heap), then it *does* make sense to split that into two instances,
each with a 31GB heap.  I would argue that it's better to have those two
instances on separate machines.
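
As a rough, untested sketch (paths, ports and ZooKeeper addresses below are
placeholders, not a recommendation for your exact setup), one node per 64GB VM
would just be

  SOLR_HEAP="16g"        (in solr.in.sh)

while two 31GB-heap instances on one box would be started along the lines of

  bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181 -p 8983 -s /var/solr/node1 -m 31g
  bin/solr start -cloud -z zk1:2181,zk2:2181,zk3:2181 -p 7574 -s /var/solr/node2 -m 31g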

Assuming that you have a bare metal server with 256GB of RAM, you would
*not* want to divide that up into five virtual machines each with 64GB. 
The physical host would not have enough memory for all five virtual
machines.  It would have the option of using its disk space as extra
memory, but as soon as you start swapping memory to disk, performance of
ANY software becomes unacceptable.  Solr in particular requires actual
real memory.  Oversubscribing memory on VMs might work for some
workloads, but it won't work for Solr.

If all your virtual machines are running on the same physical host, then
you have no redundancy.  Modern servers have redundant power supplies,
redundant hard drives, and other kinds of fault tolerance.  Even so,
there are many components in a server that have no redundancy, like the
motherboard, or the backplane.  If one of those components were to die,
all of the virtual machines would go down.

Thanks,
Shawn


Re: distribution of leader and replica in SolrCloud

Posted by Erick Erickson <er...@gmail.com>.
Bernd:

Short form: Worrying about which node is the leader is wasting your
time. Details below:

Why do you care what nodes the leaders are on? There has to be some
concern you have about co-locating the leaders on the same node or you
wouldn't be spending the time on it. Please articulate that concern.
This really sounds like an XY problem: you're asking about X (how to
ensure leader location) when you're concerned about Y. What's the Y?

Worrying about placing leaders is really a waste of time at the scale
you're talking about. Which replica is the leader changes anyway as nodes
go up and down, or depending on the order you start your Solr
instances. There's no way to specify it with rule-based replica placement
during collection creation because it's not worth the effort.

You may be misunderstanding what the leader does. Its biggest
responsibility is to forward the raw documents to the other
replicas when indexing. The reason updates are sent to the leader is
to ensure ordering (i.e. if the same doc is sent to two nodes in the
cloud for indexing by two different clients at the same time, one of
them has to "win" deterministically). But other than this trivial
check (and forwarding the docs of course, but that's almost entirely
I/O), the leader is just like any other replica in the shard.

The leader does _not_ index the document and then forward the results
to the followers; all replicas in a shard index each document
independently.

When querying, the leader has no additional duties at all; it's just
another replica serving queries.

So all the added duties of a leader amount to forwarding the raw
docs to the followers, collecting the responses, and returning the
status of the indexing request to the caller. During querying, the
leader may or may not participate.

I doubt you'll even be able to measure the increased load on a node
even if all the leaders are located on it. As far as cluster
robustness is concerned, again where the leaders are placed is
irrelevant. The only time I've seen any problems with all the leaders
being on a single physical node, there were several hundred shards.

Best,
Erick

On Wed, May 10, 2017 at 5:15 AM, Rick Leir <rl...@leirtech.com> wrote:
> Myself, I am still in the old camp. For critical machines, I want to know that it is my machine, with my disks, and what software is installed exactly. But maybe the cloud provider's fast network is more important? Cheers--Rick
>
> On May 10, 2017 6:13:27 AM EDT, Bernd Fehling <be...@uni-bielefeld.de> wrote:
>>Hi Rick,
>>
>>yes I have distributed 5 virtual server accross 5 physical machines.
>>So each virtual server is on a separate physical machine.
>>
>>Splitting each virtual server (64GB RAM) into two (32GB RAM), which
>>then
>>will be 10 virtual server accross 5 physical machines, is no option
>>because there is no gain against hardware failure of a physical
>>machine.
>>
>>So I rather go with two Solr instances per 64GB virtual server as first
>>try.
>>
>>Currently I'm still trying to solve the Rule-based Replica Placement.
>>There seams to be no way to report if a node is a "leader" or has the
>>"role="leader".
>>
>>Do you know how to create a rule like:
>>--> "do not create the replica on the same host where his leader
>>exists"
>>
>>Regards,
>>Bernd
>>
>>
>>Am 10.05.2017 um 10:54 schrieb Rick Leir:
>>> Bernd,
>>>
>>> Yes, cloud, ahhh. As you say, the world changed.  Do you have any
>>hint from the cloud provider as to which physical machine your virtual
>>server
>>> is on? If so, you can hopefully distribute your replicas across
>>physical machines. This is not just for reliability: in a sharded
>>system, each
>>> query will cause activity in several virtual servers and you would
>>prefer that they are on separate physical machines, not competing for
>>> resources. Maybe, for Solr, you should choose a provider which can
>>lease you the whole physical machine. You would prefer a 256G machine
>>over
>>> several shards on 64G virtual machines.
>>>
>>> And many cloud providers assume that servers are mostly idle, so they
>>cram too many server containers into a machine. Then, very
>>occasionally,
>>> you get OOM even though you did not exceed your advertised RAM. This
>>is a topic for some other forum, where should I look?
>>>
>>> With AWS you can choose to locate your virtual machine in
>>US-west-Oregon or US-east-i-forget or a few other locations, but that
>>is a very coarse
>>> division. Can you choose physical machine?
>>>
>>> With Google, it might be dynamic?
>>> cheers -- Rick
>>>
>>>
>>> On 2017-05-09 03:44 AM, Bernd Fehling wrote:
>>>> I would name your solution more a work around as any similar
>>solution of this kind.
>>>> The issue SOLR-6027 is now 3 years open and the world has changed.
>>>> Instead of racks full of blades where you had many dedicated bare
>>metal servers
>>>> you have now huge machines with 256GB RAM and many CPUs.
>>Virtualization has taken place.
>>>> To get under these conditions some independance from the physical
>>hardware you have
>>>> to spread the shards across several physical machines with virtual
>>servers.
>>>> >From my point of view it is a good solution to have 5 virtual 64GB
>>servers
>>>> on 5 different huge physical machines and start 2 instances on each
>>virtual server.
>>>> If I would split up each 64GB virtual server into two 32GB virtual
>>server there would
>>>> be no gain. We don't have 10 huge machines (no security win) and we
>>have to admin
>>>> and control 10 virtual servers instead of 5 (plus zookeeper
>>servers).
>>>>
>>>> It is state of the art that you don't have to care about the servers
>>within
>>>> the cloud. This is the main sense of a cloud.
>>>> The leader should always be aware who are the members of his cloud,
>>how to reach
>>>> them (IP address) and how are the users of the cloud (collections)
>>distributed
>>>> across the cloud.
>>>>
>>>> It would be great if a solution of issue SOLR-6027 would lead to
>>some kind of
>>>> "automatic mode" for server distribution, without any special
>>configuring.
>>>>
>>>> Regards,
>>>> Bernd
>>>>
>>>>
>>>> Am 08.05.2017 um 17:47 schrieb Erick Erickson:
>>>>> Also, you can specify custom placement rules, see:
>>>>>
>>https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
>>>>>
>>>>> But Shawn's statement is the nub of what you're seeing, by default
>>>>> multiple JVMs on the same physical machine are considered separate
>>>>> Solr instances.
>>>>>
>>>>> Also note that if you want to, you can specify a nodeSet when you
>>>>> create the nodes, and in particular the special value EMPTY.
>>That'll
>>>>> create a collection with no replicas and you can ADDREPLICA to
>>>>> precisely place each one if you require that level of control.
>>>>>
>>>>> Best,
>>>>> Erick
>>>>>
>>>>> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org>
>>wrote:
>>>>>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>>>>>> boss ------ shard1 ----- server2:7574
>>>>>>>         |             |-- server2:8983 (leader)
>>>>>> The reason that this happened is because you've got two nodes
>>running on
>>>>>> every server.  From SolrCloud's perspective, there are ten
>>distinct
>>>>>> nodes, not five.
>>>>>>
>>>>>> SolrCloud doesn't notice the fact that different nodes are running
>>on
>>>>>> the same server(s).  If your reaction to hearing this is that it
>>>>>> *should* notice, you're probably right, but in a typical use case,
>>each
>>>>>> server should only be running one Solr instance, so this would
>>never happen.
>>>>>>
>>>>>> There is only one instance where I can think of where I would
>>recommend
>>>>>> running multiple instances per server, and that is when the
>>required
>>>>>> heap size for a single instance would be VERY large.  Running two
>>>>>> instances with smaller heaps can yield better performance.
>>>>>>
>>>>>> See this issue:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/SOLR-6027
>>>>>>
>>>>>> Thanks,
>>>>>> Shawn
>>>>>>
>>>
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: distribution of leader and replica in SolrCloud

Posted by Rick Leir <rl...@leirtech.com>.
Myself, I am still in the old camp. For critical machines, I want to know that it is my machine, with my disks, and exactly what software is installed. But maybe the cloud provider's fast network is more important? Cheers--Rick

On May 10, 2017 6:13:27 AM EDT, Bernd Fehling <be...@uni-bielefeld.de> wrote:
>Hi Rick,
>
>yes I have distributed 5 virtual server accross 5 physical machines.
>So each virtual server is on a separate physical machine.
>
>Splitting each virtual server (64GB RAM) into two (32GB RAM), which
>then
>will be 10 virtual server accross 5 physical machines, is no option
>because there is no gain against hardware failure of a physical
>machine.
>
>So I rather go with two Solr instances per 64GB virtual server as first
>try.
>
>Currently I'm still trying to solve the Rule-based Replica Placement.
>There seams to be no way to report if a node is a "leader" or has the
>"role="leader".
>
>Do you know how to create a rule like:
>--> "do not create the replica on the same host where his leader
>exists"
>
>Regards,
>Bernd
>
>
>Am 10.05.2017 um 10:54 schrieb Rick Leir:
>> Bernd,
>> 
>> Yes, cloud, ahhh. As you say, the world changed.  Do you have any
>hint from the cloud provider as to which physical machine your virtual
>server
>> is on? If so, you can hopefully distribute your replicas across
>physical machines. This is not just for reliability: in a sharded
>system, each
>> query will cause activity in several virtual servers and you would
>prefer that they are on separate physical machines, not competing for
>> resources. Maybe, for Solr, you should choose a provider which can
>lease you the whole physical machine. You would prefer a 256G machine
>over
>> several shards on 64G virtual machines.
>> 
>> And many cloud providers assume that servers are mostly idle, so they
>cram too many server containers into a machine. Then, very
>occasionally,
>> you get OOM even though you did not exceed your advertised RAM. This
>is a topic for some other forum, where should I look?
>> 
>> With AWS you can choose to locate your virtual machine in
>US-west-Oregon or US-east-i-forget or a few other locations, but that
>is a very coarse
>> division. Can you choose physical machine?
>> 
>> With Google, it might be dynamic?
>> cheers -- Rick
>> 
>> 
>> On 2017-05-09 03:44 AM, Bernd Fehling wrote:
>>> I would name your solution more a work around as any similar
>solution of this kind.
>>> The issue SOLR-6027 is now 3 years open and the world has changed.
>>> Instead of racks full of blades where you had many dedicated bare
>metal servers
>>> you have now huge machines with 256GB RAM and many CPUs.
>Virtualization has taken place.
>>> To get under these conditions some independance from the physical
>hardware you have
>>> to spread the shards across several physical machines with virtual
>servers.
>>> >From my point of view it is a good solution to have 5 virtual 64GB
>servers
>>> on 5 different huge physical machines and start 2 instances on each
>virtual server.
>>> If I would split up each 64GB virtual server into two 32GB virtual
>server there would
>>> be no gain. We don't have 10 huge machines (no security win) and we
>have to admin
>>> and control 10 virtual servers instead of 5 (plus zookeeper
>servers).
>>>
>>> It is state of the art that you don't have to care about the servers
>within
>>> the cloud. This is the main sense of a cloud.
>>> The leader should always be aware who are the members of his cloud,
>how to reach
>>> them (IP address) and how are the users of the cloud (collections)
>distributed
>>> across the cloud.
>>>
>>> It would be great if a solution of issue SOLR-6027 would lead to
>some kind of
>>> "automatic mode" for server distribution, without any special
>configuring.
>>>
>>> Regards,
>>> Bernd
>>>
>>>
>>> Am 08.05.2017 um 17:47 schrieb Erick Erickson:
>>>> Also, you can specify custom placement rules, see:
>>>>
>https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
>>>>
>>>> But Shawn's statement is the nub of what you're seeing, by default
>>>> multiple JVMs on the same physical machine are considered separate
>>>> Solr instances.
>>>>
>>>> Also note that if you want to, you can specify a nodeSet when you
>>>> create the nodes, and in particular the special value EMPTY.
>That'll
>>>> create a collection with no replicas and you can ADDREPLICA to
>>>> precisely place each one if you require that level of control.
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org>
>wrote:
>>>>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>>>>> boss ------ shard1 ----- server2:7574
>>>>>>         |             |-- server2:8983 (leader)
>>>>> The reason that this happened is because you've got two nodes
>running on
>>>>> every server.  From SolrCloud's perspective, there are ten
>distinct
>>>>> nodes, not five.
>>>>>
>>>>> SolrCloud doesn't notice the fact that different nodes are running
>on
>>>>> the same server(s).  If your reaction to hearing this is that it
>>>>> *should* notice, you're probably right, but in a typical use case,
>each
>>>>> server should only be running one Solr instance, so this would
>never happen.
>>>>>
>>>>> There is only one instance where I can think of where I would
>recommend
>>>>> running multiple instances per server, and that is when the
>required
>>>>> heap size for a single instance would be VERY large.  Running two
>>>>> instances with smaller heaps can yield better performance.
>>>>>
>>>>> See this issue:
>>>>>
>>>>> https://issues.apache.org/jira/browse/SOLR-6027
>>>>>
>>>>> Thanks,
>>>>> Shawn
>>>>>
>> 

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: distribution of leader and replica in SolrCloud

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Rick,

yes, I have distributed 5 virtual servers across 5 physical machines.
So each virtual server is on a separate physical machine.

Splitting each virtual server (64GB RAM) into two (32GB RAM), which would then
be 10 virtual servers across 5 physical machines, is not an option
because there is no gain against hardware failure of a physical machine.

So I would rather go with two Solr instances per 64GB virtual server as a first try.

Currently I'm still trying to work out Rule-based Replica Placement.
There seems to be no way to test whether a node is a "leader" or has the role "leader".

Do you know how to create a rule like:
--> "do not create the replica on the same host where its leader exists"

Regards,
Bernd


Am 10.05.2017 um 10:54 schrieb Rick Leir:
> Bernd,
> 
> Yes, cloud, ahhh. As you say, the world changed.  Do you have any hint from the cloud provider as to which physical machine your virtual server
> is on? If so, you can hopefully distribute your replicas across physical machines. This is not just for reliability: in a sharded system, each
> query will cause activity in several virtual servers and you would prefer that they are on separate physical machines, not competing for
> resources. Maybe, for Solr, you should choose a provider which can lease you the whole physical machine. You would prefer a 256G machine over
> several shards on 64G virtual machines.
> 
> And many cloud providers assume that servers are mostly idle, so they cram too many server containers into a machine. Then, very occasionally,
> you get OOM even though you did not exceed your advertised RAM. This is a topic for some other forum, where should I look?
> 
> With AWS you can choose to locate your virtual machine in US-west-Oregon or US-east-i-forget or a few other locations, but that is a very coarse
> division. Can you choose physical machine?
> 
> With Google, it might be dynamic?
> cheers -- Rick
> 
> 
> On 2017-05-09 03:44 AM, Bernd Fehling wrote:
>> I would name your solution more a work around as any similar solution of this kind.
>> The issue SOLR-6027 is now 3 years open and the world has changed.
>> Instead of racks full of blades where you had many dedicated bare metal servers
>> you have now huge machines with 256GB RAM and many CPUs. Virtualization has taken place.
>> To get under these conditions some independance from the physical hardware you have
>> to spread the shards across several physical machines with virtual servers.
>> >From my point of view it is a good solution to have 5 virtual 64GB servers
>> on 5 different huge physical machines and start 2 instances on each virtual server.
>> If I would split up each 64GB virtual server into two 32GB virtual server there would
>> be no gain. We don't have 10 huge machines (no security win) and we have to admin
>> and control 10 virtual servers instead of 5 (plus zookeeper servers).
>>
>> It is state of the art that you don't have to care about the servers within
>> the cloud. This is the main sense of a cloud.
>> The leader should always be aware who are the members of his cloud, how to reach
>> them (IP address) and how are the users of the cloud (collections) distributed
>> across the cloud.
>>
>> It would be great if a solution of issue SOLR-6027 would lead to some kind of
>> "automatic mode" for server distribution, without any special configuring.
>>
>> Regards,
>> Bernd
>>
>>
>> Am 08.05.2017 um 17:47 schrieb Erick Erickson:
>>> Also, you can specify custom placement rules, see:
>>> https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
>>>
>>> But Shawn's statement is the nub of what you're seeing, by default
>>> multiple JVMs on the same physical machine are considered separate
>>> Solr instances.
>>>
>>> Also note that if you want to, you can specify a nodeSet when you
>>> create the nodes, and in particular the special value EMPTY. That'll
>>> create a collection with no replicas and you can ADDREPLICA to
>>> precisely place each one if you require that level of control.
>>>
>>> Best,
>>> Erick
>>>
>>> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org> wrote:
>>>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>>>> boss ------ shard1 ----- server2:7574
>>>>>         |             |-- server2:8983 (leader)
>>>> The reason that this happened is because you've got two nodes running on
>>>> every server.  From SolrCloud's perspective, there are ten distinct
>>>> nodes, not five.
>>>>
>>>> SolrCloud doesn't notice the fact that different nodes are running on
>>>> the same server(s).  If your reaction to hearing this is that it
>>>> *should* notice, you're probably right, but in a typical use case, each
>>>> server should only be running one Solr instance, so this would never happen.
>>>>
>>>> There is only one instance where I can think of where I would recommend
>>>> running multiple instances per server, and that is when the required
>>>> heap size for a single instance would be VERY large.  Running two
>>>> instances with smaller heaps can yield better performance.
>>>>
>>>> See this issue:
>>>>
>>>> https://issues.apache.org/jira/browse/SOLR-6027
>>>>
>>>> Thanks,
>>>> Shawn
>>>>
> 

Re: distribution of leader and replica in SolrCloud

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
I would call your solution more of a workaround, like any similar solution of this kind.
Issue SOLR-6027 has now been open for 3 years and the world has changed.
Instead of racks full of blades, where you had many dedicated bare-metal servers,
you now have huge machines with 256GB RAM and many CPUs. Virtualization has taken over.
To get some independence from the physical hardware under these conditions, you have
to spread the shards across several physical machines with virtual servers.
From my point of view it is a good solution to have 5 virtual 64GB servers
on 5 different huge physical machines and start 2 instances on each virtual server.
If I split each 64GB virtual server into two 32GB virtual servers there would
be no gain. We don't have 10 huge machines (so no safety win) and we have to admin
and control 10 virtual servers instead of 5 (plus ZooKeeper servers).

It is state of the art that you don't have to care about the servers within
the cloud. That is the main point of a cloud.
The leader should always know who the members of its cloud are, how to reach
them (IP address), and how the users of the cloud (collections) are distributed
across the cloud.

It would be great if a solution to SOLR-6027 led to some kind of
"automatic mode" for server distribution, without any special configuration.

Regards,
Bernd


Am 08.05.2017 um 17:47 schrieb Erick Erickson:
> Also, you can specify custom placement rules, see:
> https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
> 
> But Shawn's statement is the nub of what you're seeing, by default
> multiple JVMs on the same physical machine are considered separate
> Solr instances.
> 
> Also note that if you want to, you can specify a nodeSet when you
> create the nodes, and in particular the special value EMPTY. That'll
> create a collection with no replicas and you can ADDREPLICA to
> precisely place each one if you require that level of control.
> 
> Best,
> Erick
> 
> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org> wrote:
>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>> boss ------ shard1 ----- server2:7574
>>>        |             |-- server2:8983 (leader)
>>
>> The reason that this happened is because you've got two nodes running on
>> every server.  From SolrCloud's perspective, there are ten distinct
>> nodes, not five.
>>
>> SolrCloud doesn't notice the fact that different nodes are running on
>> the same server(s).  If your reaction to hearing this is that it
>> *should* notice, you're probably right, but in a typical use case, each
>> server should only be running one Solr instance, so this would never happen.
>>
>> There is only one instance where I can think of where I would recommend
>> running multiple instances per server, and that is when the required
>> heap size for a single instance would be VERY large.  Running two
>> instances with smaller heaps can yield better performance.
>>
>> See this issue:
>>
>> https://issues.apache.org/jira/browse/SOLR-6027
>>
>> Thanks,
>> Shawn
>>

Re: distribution of leader and replica in SolrCloud

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Erick,

just went through
https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement

I might be wrong, but I didn't see anything to identify the "leader".
To solve my problem I would need a rule like:
--> "do not create the replica on the same host where its leader exists"

Maybe something like "rule=replica:*,host:!leader" or something similar.
But a "leader" role is not available.

Any other rule to make this possible?

I must admit it is very flexible but also very complicated.
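
One rule that might get close (untested; it doesn't mention leaders, but for every
shard it keeps fewer than 2 replicas on any one host, so a leader and its replica
could not share a host, assuming the default snitch exposes the host tag):

rule=shard:*,replica:<2,host:*

passed as an extra (URL-encoded) parameter on the CREATE call.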

Regards,
Bernd

Am 08.05.2017 um 17:47 schrieb Erick Erickson:
> Also, you can specify custom placement rules, see:
> https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
> 
> But Shawn's statement is the nub of what you're seeing, by default
> multiple JVMs on the same physical machine are considered separate
> Solr instances.
> 
> Also note that if you want to, you can specify a nodeSet when you
> create the nodes, and in particular the special value EMPTY. That'll
> create a collection with no replicas and you can ADDREPLICA to
> precisely place each one if you require that level of control.
> 
> Best,
> Erick
> 
> On Mon, May 8, 2017 at 7:44 AM, Shawn Heisey <ap...@elyograg.org> wrote:
>> On 5/8/2017 5:38 AM, Bernd Fehling wrote:
>>> boss ------ shard1 ----- server2:7574
>>>        |             |-- server2:8983 (leader)
>>
>> The reason that this happened is because you've got two nodes running on
>> every server.  From SolrCloud's perspective, there are ten distinct
>> nodes, not five.
>>
>> SolrCloud doesn't notice the fact that different nodes are running on
>> the same server(s).  If your reaction to hearing this is that it
>> *should* notice, you're probably right, but in a typical use case, each
>> server should only be running one Solr instance, so this would never happen.
>>
>> There is only one instance where I can think of where I would recommend
>> running multiple instances per server, and that is when the required
>> heap size for a single instance would be VERY large.  Running two
>> instances with smaller heaps can yield better performance.
>>
>> See this issue:
>>
>> https://issues.apache.org/jira/browse/SOLR-6027
>>
>> Thanks,
>> Shawn
>>

Re: distribution of leader and replica in SolrCloud

Posted by Shawn Heisey <ap...@elyograg.org>.
On 5/8/2017 5:38 AM, Bernd Fehling wrote:
> boss ------ shard1 ----- server2:7574
>        |             |-- server2:8983 (leader)

This happened because you've got two nodes running on every server.
From SolrCloud's perspective, there are ten distinct nodes, not five.

SolrCloud doesn't notice the fact that different nodes are running on
the same server(s).  If your reaction to hearing this is that it
*should* notice, you're probably right, but in a typical use case, each
server should only be running one Solr instance, so this would never happen.

There is only one case I can think of where I would recommend
running multiple Solr instances per server, and that is when the required
heap size for a single instance would be VERY large.  Running two
instances with smaller heaps can yield better performance.

See this issue:

https://issues.apache.org/jira/browse/SOLR-6027

Thanks,
Shawn


Re: distribution of leader and replica in SolrCloud

Posted by Amrit Sarkar <sa...@gmail.com>.
Bernd,

When you create a collection via the Collections API, the internal logic tries
its best to distribute the nodes equally across the shards, but sometimes that
doesn't happen.

The best thing about SolrCloud is that you can manipulate its cloud architecture
on the fly using the Collections API. You can delete a replica of one
particular shard and add a replica (on a specific machine/node) to any of
the shards at any time, depending on your design.

For the above, you can simply:

call DELETEREPLICA api on shard1--->server2:7574 (or the other one)

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DELETEREPLICA:DeleteaReplica

boss ------ shard1
       |             |-- server2:8983 (leader)
       |
        --- shard2 ----- server1:8983
       |             |-- server5:7575 (leader)
       |
        --- shard3 ----- server3:8983 (leader)
       |             |-- server4:8983
       |
        --- shard4 ----- server1:7574 (leader)
       |             |-- server4:7574
       |
        --- shard5 ----- server3:7574 (leader)
                     |-- server5:8983

call ADDREPLICA api on shard1---->server1:8983

https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DELETEREPLICA:DeleteaReplica

boss ------ shard1 ----- server1:8983
       |             |-- server2:8983 (leader)
       |
        --- shard2 ----- server1:8983
       |             |-- server5:7575 (leader)
       |
        --- shard3 ----- server3:8983 (leader)
       |             |-- server4:8983
       |
        --- shard4 ----- server1:7574 (leader)
       |             |-- server4:7574
       |
        --- shard5 ----- server3:7574 (leader)
                     |-- server5:8983
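
Concretely, as an untested sketch (the replica value for DELETEREPLICA is the
core_nodeX name from the cluster state; core_node1 below is only a placeholder,
and the node value format is host:port_solr):

http://server2:8983/solr/admin/collections?action=DELETEREPLICA&collection=boss&shard=shard1&replica=core_node1
http://server2:8983/solr/admin/collections?action=ADDREPLICA&collection=boss&shard=shard1&node=server1:8983_solr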

Hope this helps.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Mon, May 8, 2017 at 5:08 PM, Bernd Fehling <
bernd.fehling@uni-bielefeld.de> wrote:

> My assumption was that the strength of SolrCloud is the distribution
> of leader and replica within the Cloud and make the Cloud somewhat
> failsafe.
> But after setting up SolrCloud with a collection I have both, leader and
> replica, on the same shard. And this should be failsafe?
>
> o.a.s.h.a.CollectionsHandler Invoked Collection Action :create with params
> replicationFactor=2&routerName=compositeId&collection.configName=boss&
> maxShardsPerNode=1&name=boss&router.name=compositeId&action=
> CREATE&numShards=5
>
> boss ------ shard1 ----- server2:7574
>        |             |-- server2:8983 (leader)
>        |
>         --- shard2 ----- server1:8983
>        |             |-- server5:7575 (leader)
>        |
>         --- shard3 ----- server3:8983 (leader)
>        |             |-- server4:8983
>        |
>         --- shard4 ----- server1:7574 (leader)
>        |             |-- server4:7574
>        |
>         --- shard5 ----- server3:7574 (leader)
>                      |-- server5:8983
>
> From my point of view, if server2 is going to crash then shard1 will
> disappear and
> 1/5th of the index is missing.
>
> What is your opinion?
>
> Regards
> Bernd
>
>
>
>

Re: distribution of leader and replica in SolrCloud

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
And then delete replica shard2-->server1:8983 and add replica shard2-->server2:7574 ?

Would be nice to have some automatic logic like ES (_cluster/reroute with move).

Regards
Bernd


Am 08.05.2017 um 14:16 schrieb Amrit Sarkar:
> Bernd,
> 
> When you create a collection via Collections API, the internal logic tries
> its best to equally distribute the nodes across the shards but sometimes it
> don't happen.
> 
> The best thing about SolrCloud is you can manipulate its cloud architecture
> on the fly using Collections API. You can delete a replica of one
> particular shard and add a replica (on a specific machine/node) to any of
> the shards anytime depending to your design.
> 
> For the above, you can simply:
> 
> call DELETEREPLICA api on shard1--->server2:7574 (or the other one)
> 
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DELETEREPLICA:DeleteaReplica
> 
> boss ------ shard1
>        |             |-- server2:8983 (leader)
>        |
>         --- shard2 ----- server1:8983
>        |             |-- server5:7575 (leader)
>        |
>         --- shard3 ----- server3:8983 (leader)
>        |             |-- server4:8983
>        |
>         --- shard4 ----- server1:7574 (leader)
>        |             |-- server4:7574
>        |
>         --- shard5 ----- server3:7574 (leader)
>                      |-- server5:8983
> 
> call ADDREPLICA api on shard1---->server1:8983
> 
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-DELETEREPLICA:DeleteaReplica
> 
> boss ------ shard1 ----- server1:8983
>        |             |-- server2:8983 (leader)
>        |
>         --- shard2 ----- server1:8983
>        |             |-- server5:7575 (leader)
>        |
>         --- shard3 ----- server3:8983 (leader)
>        |             |-- server4:8983
>        |
>         --- shard4 ----- server1:7574 (leader)
>        |             |-- server4:7574
>        |
>         --- shard5 ----- server3:7574 (leader)
>                      |-- server5:8983
> 
> Hope this helps.
> 
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> 
> On Mon, May 8, 2017 at 5:08 PM, Bernd Fehling <
> bernd.fehling@uni-bielefeld.de> wrote:
> 
>> My assumption was that the strength of SolrCloud is the distribution
>> of leader and replica within the Cloud and make the Cloud somewhat
>> failsafe.
>> But after setting up SolrCloud with a collection I have both, leader and
>> replica, on the same shard. And this should be failsafe?
>>
>> o.a.s.h.a.CollectionsHandler Invoked Collection Action :create with params
>> replicationFactor=2&routerName=compositeId&collection.configName=boss&
>> maxShardsPerNode=1&name=boss&router.name=compositeId&action=
>> CREATE&numShards=5
>>
>> boss ------ shard1 ----- server2:7574
>>        |             |-- server2:8983 (leader)
>>        |
>>         --- shard2 ----- server1:8983
>>        |             |-- server5:7575 (leader)
>>        |
>>         --- shard3 ----- server3:8983 (leader)
>>        |             |-- server4:8983
>>        |
>>         --- shard4 ----- server1:7574 (leader)
>>        |             |-- server4:7574
>>        |
>>         --- shard5 ----- server3:7574 (leader)
>>                      |-- server5:8983
>>
>> From my point of view, if server2 is going to crash then shard1 will
>> disappear and
>> 1/5th of the index is missing.
>>
>> What is your opinion?
>>
>> Regards
>> Bernd
>>
>>
>>
>>
>