Posted to users@solr.apache.org by Jan Høydahl <ja...@cominvent.com> on 2021/03/10 14:00:01 UTC

Solr not distributing search requests among replicas

Hi,

A client has a SolrCloud 8.4 setup with two nodes, and one collection with one shard and replicationFactor=2.
Of course we want search traffic to be evenly distributed between the two replicas.
The client is using plain HTTP requests, no SolrJ or anything fancy, and sends all requests to one of the two nodes.
I was expecting Solr to forward about 50% of those requests to the other replica, but it is serving them all locally.

I know we can set up an LB in front or re-program the client to do round robin, but that is not my question.
Is the select-random-replica logic only active when we have a sharded collection, and not for a single-shard one?

Jan

Re: Solr not distributing search requests among replicas

Posted by Chris Hostetter <ho...@fucit.org>.
All of the "routing" logic (preferLocal, shards.preference, etc...) really 
only comes into play once Solr "code" (either CloudSolrClient, or a Solr 
server receiving a request) decides that it needs to make a remote 
connection.

If a node receives a request, and it has a local core capable of handling 
that request, it will process it locally in order to avoid the network 
overhead of sending it somewhere else.

Where things like shards.preference (and the deprecated 
"preferLocalShards") come into play (on the server side) is in situations 
where a Solr node is acting as a Solr client:

1) the Solr node doesn't have any cores suitable for dealing with the 
request (ie: it's a 'top level' request for a collection that has no local 
replicas and must be forwarded)

2) Solr is processing a top-level request and now needs to federate 
distributed sub-requests to each of the shards and is deciding which 
replica of each shard should get the request.
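The behaviour described above can be sketched as a small model. This is only an illustration of the rules as stated in this thread, not Solr's actual code: the function name `plan_request`, the node names, and the "pick a random replica" default (i.e. no `shards.preference` set) are all assumptions made for the example.

```python
import random


def plan_request(receiving_node, shard_replicas, rng=random):
    """Return {shard: node that serves it} for a request sent to receiving_node.

    shard_replicas maps a shard name to the list of nodes hosting a replica.
    Simplified model of the behaviour described in this thread (hypothetical,
    not Solr source code).
    """
    has_local_core = any(receiving_node in nodes for nodes in shard_replicas.values())

    # Single shard and a usable local core: serve locally, no routing logic runs.
    if len(shard_replicas) == 1 and has_local_core:
        (shard,) = shard_replicas
        return {shard: receiving_node}

    # Otherwise the node acts as a Solr client: either it forwards the whole
    # request (no local core) or it fans out per-shard sub-requests. In both
    # cases a replica is chosen per shard; assumed default is a random pick.
    return {shard: rng.choice(nodes) for shard, nodes in shard_replicas.items()}


# A two-node, single-shard collection queried on node1 is always served by node1.
print(plan_request("node1", {"shard1": ["node1", "node2"]}))
```

Sending the same request to a third node that hosts no replica of the collection falls through to the second branch, which is why such a node spreads the load over both replicas.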




: Date: Wed, 10 Mar 2021 18:42:35 +0100
: From: Jan Høydahl <ja...@cominvent.com>
: Reply-To: users@solr.apache.org
: To: users@solr.apache.org
: Subject: Re: Solr not distributing search requests among replicas
: 
: We have not set any shards.preference, and I also think preferLocal defaults to false, i.e. random
: 
: Earlier we had 2 shards for the same collection (both existed on both nodes) and then requests were distributed to both nodes. That’s why, when we went to 1 shard, I was wondering if the “single-shard” code path perhaps never attempts to utilize replicas? But I have not looked in the code yet.
: 
: Guess the next step is to set up a small local test cluster and see what happens.
: 
: Jan Høydahl
: 
: > 10. mar. 2021 kl. 15:46 skrev Michael Gibney <mi...@michaelgibney.net>:
: > 
: > You say not "anything fancy" -- depending on how you define "fancy", if you
: > have an explicit `shards.preference` param, based on the version you're
: > running (8.4) you might also take a look at
: > https://issues.apache.org/jira/browse/SOLR-14471. (If SOLR-14471 is the
: > problem, removing the explicit `shards.preference` param should restore
: > default "shuffling" routing).
: > 
: > I haven't dug too deep, but it looks like for 8.4 preferLocalShards
: > actually defaults to false? I might be missing something though:
: > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/routing/RequestReplicaListTransformerGenerator.java#L85
: > 
: > 
: > 
: >> On Wed, Mar 10, 2021 at 9:10 AM Houston Putman <ho...@gmail.com>
: >> wrote:
: >> 
: >> I could be wrong, but I don't think preferLocalShards is the default in
: >> multi-shard use cases.
: >> 
: >>> On Wed, Mar 10, 2021 at 9:07 AM Mike Drob <md...@mdrob.com> wrote:
: >>> 
: >>> I believe a server will always try to prefer local cores. Can you do an
: >>> experiment with 3 nodes, and send HTTP queries to the node not hosting any
: >>> replicas? That should confirm the balanced distribution.
: >>> 
: >>> If you have multiple shards, the receiving server will forward the requests
: >>> for shards it doesn’t have, but would still prefer local shards when they
: >>> are available.
: >>> 

-Hoss
http://www.lucidworks.com/

Re: Solr not distributing search requests among replicas

Posted by Walter Underwood <wu...@wunderwood.org>.
You could even run a separate Solr on the node just to redistribute the queries.
But if I was going to do that, I’d run a copy of nginx as a load balancer instead.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 10, 2021, at 4:51 PM, Mike Drob <md...@mdrob.com> wrote:
> 
>>> 2) a single "extra" solr node in the cluster can be used as a "self
> configuring" load balancer
> 
> I’ve thought about this a bunch before, are there mechanisms to instruct
> Solr to not host shards for this purpose? Maybe it deserves its own
> discussion.


Re: Solr not distributing search requests among replicas

Posted by Chris Hostetter <ho...@fucit.org>.
: >> 2) a single "extra" solr node in the cluster can be used as a "self
: configuring" load balancer
: 
: I’ve thought about this a bunch before, are there mechanisms to instruct
: Solr to not host shards for this purpose? Maybe it deserves its own
: discussion.

Rule-based replica placement can prevent any replica (of any shard, of 
any collection) from being put on a particular host by IP (or by a sys prop 
set when the node is started) ... 

https://solr.apache.org/guide/8_6/rule-based-replica-placement.html#do-not-create-any-replicas-in-host-192-45-67-3

...similar restrictions can easily be imposed by autoscaling policy 
rules (although AFAIK you have to flip the logic: "all replicas of all 
collections should only live on nodes with prop X") ...

https://solr.apache.org/guide/8_6/solrcloud-autoscaling-policy-preferences.html#node-selector

[ { "replica": "#ALL", "nodeset": {"sysprop.use_for_replicas": "search"} } ]


... but I don't know how much of this functionality has survived "The 
Great Autoscaling Purge of 9.x"




-Hoss
http://www.lucidworks.com/

Re: Solr not distributing search requests among replicas

Posted by Mike Drob <md...@mdrob.com>.
>> 2) a single "extra" solr node in the cluster can be used as a "self
>> configuring" load balancer

I’ve thought about this a bunch before, are there mechanisms to instruct
Solr to not host shards for this purpose? Maybe it deserves its own
discussion.


Re: Solr not distributing search requests among replicas

Posted by Chris Hostetter <ho...@fucit.org>.
: > that seems... dangerous.  you could easily wind up in a situation where 
: > nodes just keep trying to forward forever?
: 
: There is some special http parameter being added when forwarding 
: requests, so I'm sure each node will be able to decide whether it should 
: act as LB or if it is supposed to be the final destination. Or we can 
: add such a param. Of course, if SolrJ on the client side has already 
: selected a replica, the receiving node should not discard that and do 
: its own balancing. So there is some state to get right here.

"Forever" wasn't really what I meant to say ... I'm more concerned about how 
you would implement this to work well in the 'general case' -- ie: 
multiple nodes, multiple collections, multiple shards, multiple replicas 
per shard -- w/o doing "too much" forwarding.


If nodeA gets a request, when exactly should it decide "I *COULD* handle 
this request for collection1 using a local core, but I'll go ahead and 
forward it to nodeB instead." ? ... should it be based on what percentage 
of collection1's total replica list is located on nodeA, or based on what 
percentage of nodeA is dedicated to collection1? ... should nodeB be more 
or less likely than nodeC to get the request based on how many total cores 
each node has for collection1, or how many unique shards each one has?


Also bear in mind that even if you assumed everything was nice and evenly 
distributed, a "simple" round robin based approach would have some pretty 
significant impacts on the number of inter-node network requests....  

Say you have a 5 node cluster, hosting a 1shard/5replica collection such 
that each node has 1 replica:  today any node can process the request 
locally; but if we did a round robin proxy of the request, that means we'd 
only handle it locally 1/5th of the time, and 4/5ths of the time you add an 
extra network hop and the associated network IO involved (plus the 
original node has a thread tied up waiting to proxy the response) .. so 
you'd go from needing 0 "internal" network requests/IO to having internal 
traffic of 80% of the amount of external traffic received.

If those 5 nodes host a collection with 2 shards/5replicas each, spread 
evenly over the 5 nodes: today any given request typically causes 2 
intra-cluster network requests to get the per-shard data; but if we round 
robin proxy the initial request to a different node 4/5ths of the time we 
now typically need 2.8 internal requests for each external request...
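The arithmetic in the two scenarios above can be checked with a few lines. This is just a worked calculation under the same assumptions as the text (evenly spread replicas, a proxy target chosen uniformly over all nodes including the receiving one); the function name is made up for the example.

```python
from fractions import Fraction


def internal_requests_per_external(num_nodes, shard_subrequests, round_robin_proxy):
    """Expected number of internal (node-to-node) requests per external request.

    shard_subrequests: per-shard requests the coordinating node already makes.
    round_robin_proxy: if True, the receiving node proxies the request to a
    node chosen uniformly among all nodes, so it stays local 1/num_nodes of
    the time and costs one extra hop otherwise.
    """
    proxy_hops = Fraction(num_nodes - 1, num_nodes) if round_robin_proxy else Fraction(0)
    return shard_subrequests + proxy_hops


# 1 shard / 5 replicas on 5 nodes: from 0 internal requests to 0.8 per request.
print(float(internal_requests_per_external(5, 0, True)))   # 0.8
# 2 shards / 5 replicas each on 5 nodes: from 2 to 2.8 per request.
print(float(internal_requests_per_external(5, 2, False)))  # 2.0
print(float(internal_requests_per_external(5, 2, True)))   # 2.8
```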


It just seems like adding more forwarding/proxy logic -- that isn't 
strictly necessary to compute complete results -- could introduce a lot 
of complexity risk for a problem that already has multiple solutions:

1) client (or external load balancer) can round robin over live nodes (and 
given that cluster state and metrics are available via HTTP, a client can 
make very sophisticated choices)

2) a single "extra" solr node in the cluster can be used as a "self 
configuring" load balancer that will automatically know when new nodes are 
added to the cluster, or when replicas get moved/added, etc...
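Option 1 can be as small as the sketch below for a plain HTTP client. It only builds the request URLs and leaves the HTTP call to whatever client library is already in use. The class name, node URLs and collection name are hypothetical, and the node list is static here; a real client could refresh it from the cluster state (for example the Collections API CLUSTERSTATUS action) instead.

```python
from itertools import cycle
from urllib.parse import urlencode


class RoundRobinSolrUrls:
    """Rotate /select requests over a fixed list of Solr node base URLs."""

    def __init__(self, node_base_urls, collection):
        if not node_base_urls:
            raise ValueError("at least one node URL is required")
        self._nodes = cycle(node_base_urls)  # endless round robin iterator
        self._collection = collection

    def next_select_url(self, params):
        """Return the select URL on the next node in the rotation."""
        base = next(self._nodes)
        return f"{base}/{self._collection}/select?{urlencode(params)}"


# Hypothetical two-node cluster matching the setup in the original question.
urls = RoundRobinSolrUrls(
    ["http://node1:8983/solr", "http://node2:8983/solr"], "collection1"
)
for _ in range(3):
    print(urls.next_select_url({"q": "solr", "rows": 10}))
```

Because each node serves a single-shard request from its local core, alternating the target node on the client side is enough to split the query load between the two replicas.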






: 
: Jan
: 
: > 10. mar. 2021 kl. 19:32 skrev Chris Hostetter <ho...@fucit.org>:
: > 
: > 
: > : Is there any way whatsoever to solve this on the Solr side only?
: > : 
: > : The only thing I can think of is to send all requests to a 3rd node in the cluster 
: > : that does not have a core for the collection, then it will balance 
: > : between the two :)
: > 
: > correct -- you can create a Solr node w/o any cores that will act as a 
: > "load balancer" to other solr nodes.
: > 
: > : Or create a new, empty collection on the node, which acts as a routing 
: > : collection only to the target collection?
: > 
: > no -- this won't work, because the request your remote client sends will 
: > need to specify the actual collection you want to query, and when the node 
: > gets this it will hand it to the local core for that collection -- it 
: > won't care that there is another local collection that's unrelated.
: > 
: > : Sounds like there should be a way to explicitly disable the 
: > : "optimization" of always handling the request locally in single-shard 
: > : collections, i.e. always try to balance unless shards.preference=local?
: > 
: > that seems... dangerous.  you could easily wind up in a situation where 
: > nodes just keep trying to forward forever?
: > 
: > 
: > 
: > : 
: > : Jan
: > : 
: > : > 10. mar. 2021 kl. 19:06 skrev Chris Hostetter <hossman_lucene@fucit.org <ma...@fucit.org>>:
: > : > 
: > : > 
: > : > : Ah, I missed "single shard" ... this looks relevant:
: > : > : https://issues.apache.org/jira/browse/SOLR-12217 <https://issues.apache.org/jira/browse/SOLR-12217>
: > : > 
: > : > That improvement still isn't going to impact Jan's situation where the 
: > : > *client* isn't SolrJ ... as the description says:
: > : > 
: > : >>> NOTE: This Jira doesn't cover the single-sharded collections cases when 
: > : >>> not using the CloudSolrClient or Streaming Expressions (i.e. if you do 
: > : >>> a non-streaming curl request to a random node in the cluster, the 
: > : >>> shards.preference parameter is not considered in the case of single 
: > : >>> shards collections).
: > : > 
: > : > 
: > : > : 
: > : > : On Wed, Mar 10, 2021 at 12:43 PM Jan Høydahl <jan.asf@cominvent.com <ma...@cominvent.com>> wrote:
: > : > : 
: > : > : > We have not set any shard.preference, and I also think preferLocal
: > : > : > defaults to false, i.e random
: > : > : >
: > : > : > Earlier we had 2 shares for the same collection (both existed on both
: > : > : > nodes) and then requests were distributed to both nodes. That’s why, when
: > : > : > we went to 1 shard, I was wondering if the “single-shard” code path perhaps
: > : > : > never attempts to utilize replicas?? But have not looked in code yet.
: > : > : >
: > : > : > Guess next step is to setup a small local test cluster and see what
: > : > : > happens.
: > : > : >
: > : > : > Jan Høydahl
: > : > : >
: > : > : > > 10. mar. 2021 kl. 15:46 skrev Michael Gibney <michael@michaelgibney.net <ma...@michaelgibney.net>
: > : > : > >:
: > : > : > >
: > : > : > > You say not "anything fancy" -- depending on how you define "fancy", if
: > : > : > you
: > : > : > > have an explicit `shards.preference` param, based on the version you're
: > : > : > > running (8.4) you might also take a look at
: > : > : > > https://issues.apache.org/jira/browse/SOLR-14471 <https://issues.apache.org/jira/browse/SOLR-14471>. (If SOLR-14471 is the
: > : > : > > problem, removing the explicit `shards.preference` param should restore
: > : > : > > default "shuffling" routing).
: > : > : > >
: > : > : > > I haven't dug too deep, but it looks like for 8.4 preferLocalShards
: > : > : > > actually defaults to false? I might be missing something though:
: > : > : > >
: > : > : > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/routing/RequestReplicaListTransformerGenerator.java#L85 <https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/routing/RequestReplicaListTransformerGenerator.java#L85>
: > : > : > >
: > : > : > >
: > : > : > >
: > : > : > >> On Wed, Mar 10, 2021 at 9:10 AM Houston Putman <houstonputman@gmail.com
: > : > : > >
: > : > : > >> wrote:
: > : > : > >>
: > : > : > >> I could be wrong, but i dont think preferLocalShards is the default in
: > : > : > >> multi-shard use cases.
: > : > : > >>
: > : > : > >>> On Wed, Mar 10, 2021 at 9:07 AM Mike Drob <md...@mdrob.com> wrote:
: > : > : > >>>
: > : > : > >>> I believe a server will always try to prefer local cores. Can you do an
: > : > : > >>> experiment with 3 nodes, and send http queries to the node not hosting
: > : > : > >> any
: > : > : > >>> replicas? That should confirm the balanced distribution.
: > : > : > >>>
: > : > : > >>> If you have multiple shards, the receiving server will forward the
: > : > : > >> requests
: > : > : > >>> for shards it doesn’t have, but would still prefer local shards when
: > : > : > they
: > : > : > >>> are available.
: > : > : > >>>
: > : > : > >>> On Wed, Mar 10, 2021 at 8:00 AM Jan Høydahl <ja...@cominvent.com>
: > : > : > >> wrote:
: > : > : > >>>
: > : > : > >>>> Hi,
: > : > : > >>>>
: > : > : > >>>> A client has a SolrCloud 8.4 setup with two nodes, and one collection
: > : > : > >>> with
: > : > : > >>>> one shard and replicationFactor=2.
: > : > : > >>>> Of course we want search traffic to be evenly distributed between the
: > : > : > >> two
: > : > : > >>>> replicas.
: > : > : > >>>> The client is using plain HTTP requests, no SolrJ or anything fancy,
: > : > : > >> and
: > : > : > >>>> sends all requests to one of the two nodes.
: > : > : > >>>> I was expecting Solr to forward about 50% of those requests to the
: > : > : > >> other
: > : > : > >>>> replica, but it is serving them all locally.
: > : > : > >>>>
: > : > : > >>>> I know we can setup an LB in front or re-program the client to do
: > : > : > round
: > : > : > >>>> robin, but that is not my question.
: > : > : > >>>> Is the select-random-replica logic only active when we have a sharded
: > : > : > >>>> oollection, and not for a single-shard?
: > : > : > >>>>
: > : > : > >>>> Jan
: > : > : > >>>
: > : > : > >>
: > : > : >
: > : > : 
: > : > 
: > : > -Hoss
: > : > http://www.lucidworks.com/ <http://www.lucidworks.com/>
: > : 
: > : 
: > 
: > -Hoss
: > http://www.lucidworks.com/ <http://www.lucidworks.com/>
: 

-Hoss
http://www.lucidworks.com/

Re: Solr not distributing search requests among replicas

Posted by Jan Høydahl <ja...@cominvent.com>.
> no -- this won't work, because the request your remote client sends will 
> need to specify the actual collection you want to query, and when the node 

I was thinking more of some explicit &collections=otherColl or &shards=other_1,other_2, but it is easier to just send to a node without that collection; we have 2 more nodes in the cluster.

> that seems... dangerous.  you could easily wind up in a situation where 
> nodes just keep trying to forward forever?

There is some special http parameter being added when forwarding requests, so I'm sure each node will be able to decide whether it should act as LB or if it is supposed to be the final destination. Or we can add such a param. Of course, if SolrJ on the client side has already selected a replica, the receiving node should not discard that and do its own balancing. So there is some state to get right here.
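
To make the idea concrete, here is a toy simulation of forwarding at most once by marking the forwarded request. It is a sketch only: the parameter name "_forwarded" is invented for illustration and is not an actual Solr parameter, and the node names are placeholders.

```python
import random

FORWARDED = "_forwarded"  # hypothetical parameter name, not a real Solr parameter

def handle(node: str, params: dict, replicas: list, rng: random.Random, hops: list) -> str:
    """Return the name of the node that finally serves the request."""
    hops.append(node)
    # A request already forwarded (or pre-routed by a client) is served here.
    if params.get(FORWARDED) == "true":
        return node
    target = rng.choice(replicas)
    if target == node:
        return node
    # Forward exactly once, marking the request so the target does not forward again.
    return handle(target, {**params, FORWARDED: "true"}, replicas, rng, hops)

rng = random.Random(0)
replicas = ["node1", "node2"]
for _ in range(5):
    hops = []
    served_by = handle("node1", {"q": "*:*"}, replicas, rng, hops)
    print(served_by, "after", len(hops) - 1, "forward(s)")
```

Because the marker is set on the first forward, no request can travel more than one hop in this model, which is the property that would rule out the "forward forever" case.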

Jan


Re: Solr not distributing search requests among replicas

Posted by Chris Hostetter <ho...@fucit.org>.
: Is there any way whatsoever to solve this on the Solr side only?
: 
: Only I can think of is to send all requests to a 3rd node in the cluster 
: that does not have a core for the collection, then it will balance 
: between the two :)

correct -- you can create a Solr node w/o any cores that will act as a 
"load balancer" to other solr nodes.

: Or create a new, empty collection on the node, which acts as a routing 
: collection only to the target collection?

no -- this won't work, because the request your remote client sends will 
need to specify the actual collection you want to query, and when the node 
gets this it will hand it to the local core for that collection -- it 
won't care that there is another local collection that's unrelated.

: Sounds like there should be a way to explicitly disable the 
: "optimization" of always handling the request locally in single-shard 
: collections, i.e. always try to balance unless shards.preference=local?

that seems... dangerous.  you could easily wind up in a situation where 
nodes just keep trying to forward forever?



: 
: Jan
: 

-Hoss
http://www.lucidworks.com/

Re: Solr not distributing search requests among replicas

Posted by Jan Høydahl <ja...@cominvent.com>.
Aha, I'm starting to see what is happening here.

So on the server side, if a node hosts one of N replicas for a shard and the collection is single-sharded, no randomization or forwarding will ever take place.
Before SOLR-12217 it would not happen when using SolrJ either, but after SOLR-12217, SolrJ will load balance between replicas when selecting a node to send the request to.
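
As a simplified model of the server-side behaviour described in this thread (not Solr's actual code), the decision could be sketched like this. The node names and the function name are placeholders.

```python
import random

def choose_replica(receiving_node: str, replica_nodes: list, rng: random.Random) -> str:
    """Pick the node that serves a single-shard request, per the behaviour described above."""
    # A node with a local core for the collection serves the request itself.
    if receiving_node in replica_nodes:
        return receiving_node
    # A node without a local core must forward, and picks a replica at random.
    return rng.choice(replica_nodes)

rng = random.Random(42)
replicas = ["node1", "node2"]

# All traffic sent to node1 stays on node1.
print({choose_replica("node1", replicas, rng) for _ in range(100)})  # {'node1'}

# Traffic sent to a third node without a core is spread over both replicas.
counts = {"node1": 0, "node2": 0}
for _ in range(1000):
    counts[choose_replica("node3", replicas, rng)] += 1
print(counts)
```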

So in this case the client is a .NET app and has no SolrJ.

Is there any way whatsoever to solve this on the Solr side only?

The only thing I can think of is to send all requests to a 3rd node in the cluster that does not have a core for the collection, then it will balance between the two :)
Or create a new, empty collection on the node, which acts as a routing collection only to the target collection?

Sounds like there should be a way to explicitly disable the "optimization" of always handling the request locally in single-shard collections, i.e. always try to balance unless shards.preference=local?

Jan



Re: Solr not distributing search requests among replicas

Posted by Chris Hostetter <ho...@fucit.org>.
: Ah, I missed "single shard" ... this looks relevant:
: https://issues.apache.org/jira/browse/SOLR-12217

That improvement still isn't going to impact Jan's situation where the 
*client* isn't SolrJ ... as the description says:

>> NOTE: This Jira doesn't cover the single-sharded collections cases when 
>> not using the CloudSolrClient or Streaming Expressions (i.e. if you do 
>> a non-streaming curl request to a random node in the cluster, the 
>> shards.preference parameter is not considered in the case of single 
>> shards collections).



-Hoss
http://www.lucidworks.com/

Re: Solr not distributing search requests among replicas

Posted by Michael Gibney <mi...@michaelgibney.net>.
Ah, I missed "single shard" ... this looks relevant:
https://issues.apache.org/jira/browse/SOLR-12217


Re: Solr not distributing search requests among replicas

Posted by Jan Høydahl <ja...@cominvent.com>.
We have not set any shards.preference, and I also think preferLocal defaults to false, i.e. random.

Earlier we had 2 shards for the same collection (both existed on both nodes) and then requests were distributed to both nodes. That’s why, when we went to 1 shard, I was wondering if the “single-shard” code path perhaps never attempts to utilize replicas? But I have not looked in the code yet.

Guess the next step is to set up a small local test cluster and see what happens.

Jan Høydahl


Re: Solr not distributing search requests among replicas

Posted by Michael Gibney <mi...@michaelgibney.net>.
You say not "anything fancy" -- depending on how you define "fancy", if you
have an explicit `shards.preference` param, based on the version you're
running (8.4) you might also take a look at
https://issues.apache.org/jira/browse/SOLR-14471. (If SOLR-14471 is the
problem, removing the explicit `shards.preference` param should restore
default "shuffling" routing).
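
To make that suggestion concrete, here is a minimal Python sketch of stripping an explicit `shards.preference` parameter from a request URL before it is sent. The helper name `drop_shards_preference`, the host `solr1` and the collection `mycollection` are made up for illustration; only the `shards.preference` parameter name comes from this thread. It does not talk to Solr at all, it only rewrites the query string.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_shards_preference(url: str) -> str:
    """Return the URL with any explicit shards.preference parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "shards.preference"
    ]
    # Re-encode the remaining parameters and rebuild the URL.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical host and collection names, for illustration only.
example = (
    "http://solr1:8983/solr/mycollection/select"
    "?q=*:*&shards.preference=replica.location:local&rows=10"
)
cleaned = drop_shards_preference(example)
print(cleaned)
```

All other parameters are kept in their original order; only their percent-encoding may change.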

I haven't dug too deep, but it looks like for 8.4 preferLocalShards
actually defaults to false? I might be missing something though:
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.4.1/solr/solrj/src/java/org/apache/solr/client/solrj/routing/RequestReplicaListTransformerGenerator.java#L85



On Wed, Mar 10, 2021 at 9:10 AM Houston Putman <ho...@gmail.com> wrote:

> I could be wrong, but I don't think preferLocalShards is the default in
> multi-shard use cases.
>
> On Wed, Mar 10, 2021 at 9:07 AM Mike Drob <md...@mdrob.com> wrote:
>
> > I believe a server will always try to prefer local cores. Can you do an
> > experiment with 3 nodes, and send HTTP queries to the node not hosting any
> > replicas? That should confirm the balanced distribution.
> >
> > If you have multiple shards, the receiving server will forward the requests
> > for shards it doesn’t have, but would still prefer local shards when they
> > are available.
> >
> > On Wed, Mar 10, 2021 at 8:00 AM Jan Høydahl <ja...@cominvent.com> wrote:
> >
> > > Hi,
> > >
> > > A client has a SolrCloud 8.4 setup with two nodes, and one collection with
> > > one shard and replicationFactor=2.
> > > Of course we want search traffic to be evenly distributed between the two
> > > replicas.
> > > The client is using plain HTTP requests, no SolrJ or anything fancy, and
> > > sends all requests to one of the two nodes.
> > > I was expecting Solr to forward about 50% of those requests to the other
> > > replica, but it is serving them all locally.
> > >
> > > I know we can set up an LB in front or re-program the client to do round
> > > robin, but that is not my question.
> > > Is the select-random-replica logic only active when we have a sharded
> > > collection, and not for a single-shard?
> > >
> > > Jan
> >
>

Re: Solr not distributing search requests among replicas

Posted by Houston Putman <ho...@gmail.com>.
I could be wrong, but I don't think preferLocalShards is the default in
multi-shard use cases.

On Wed, Mar 10, 2021 at 9:07 AM Mike Drob <md...@mdrob.com> wrote:

> I believe a server will always try to prefer local cores. Can you do an
> experiment with 3 nodes, and send HTTP queries to the node not hosting any
> replicas? That should confirm the balanced distribution.
>
> If you have multiple shards, the receiving server will forward the requests
> for shards it doesn’t have, but would still prefer local shards when they
> are available.
>
> On Wed, Mar 10, 2021 at 8:00 AM Jan Høydahl <ja...@cominvent.com> wrote:
>
> > Hi,
> >
> > A client has a SolrCloud 8.4 setup with two nodes, and one collection with
> > one shard and replicationFactor=2.
> > Of course we want search traffic to be evenly distributed between the two
> > replicas.
> > The client is using plain HTTP requests, no SolrJ or anything fancy, and
> > sends all requests to one of the two nodes.
> > I was expecting Solr to forward about 50% of those requests to the other
> > replica, but it is serving them all locally.
> >
> > I know we can set up an LB in front or re-program the client to do round
> > robin, but that is not my question.
> > Is the select-random-replica logic only active when we have a sharded
> > collection, and not for a single-shard?
> >
> > Jan
>

Re: Solr not distributing search requests among replicas

Posted by Mike Drob <md...@mdrob.com>.
I believe a server will always try to prefer local cores. Can you do an
experiment with 3 nodes, and send HTTP queries to the node not hosting any
replicas? That should confirm the balanced distribution.

If you have multiple shards, the receiving server will forward the requests
for shards it doesn’t have, but would still prefer local shards when they
are available.
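
The behaviour described above can be illustrated with a small toy model in Python. This is not Solr's code: the function names, the node names and the use of a uniform random choice among remote replicas are all assumptions made for the sketch. It only captures the rule "serve locally if the receiving node hosts a replica, otherwise forward to one of the replicas".

```python
import random
from collections import Counter


def pick_node(receiving_node, replica_nodes, rng):
    """Toy rule: serve locally when possible, otherwise pick a replica at random."""
    if receiving_node in replica_nodes:
        return receiving_node
    # Assumption: remote replicas are chosen uniformly at random.
    return rng.choice(sorted(replica_nodes))


def simulate(receiving_node, replica_nodes, requests=1000, seed=0):
    """Count which node would serve each of `requests` simulated queries."""
    rng = random.Random(seed)
    return Counter(
        pick_node(receiving_node, replica_nodes, rng) for _ in range(requests)
    )


# Hypothetical node names: node1 and node2 host the replicas, node3 hosts none.
replicas = {"node1", "node2"}
local_counts = simulate("node1", replicas)   # all queries sent to a replica host
remote_counts = simulate("node3", replicas)  # all queries sent to the empty node
print(local_counts)
print(remote_counts)
```

In this model, queries sent to node1 are all served by node1, while queries sent to node3 are spread across node1 and node2, which is the outcome the 3-node experiment is meant to confirm or refute on a real cluster.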

On Wed, Mar 10, 2021 at 8:00 AM Jan Høydahl <ja...@cominvent.com> wrote:

> Hi,
>
> A client has a SolrCloud 8.4 setup with two nodes, and one collection with
> one shard and replicationFactor=2.
> Of course we want search traffic to be evenly distributed between the two
> replicas.
> The client is using plain HTTP requests, no SolrJ or anything fancy, and
> sends all requests to one of the two nodes.
> I was expecting Solr to forward about 50% of those requests to the other
> replica, but it is serving them all locally.
>
> I know we can set up an LB in front or re-program the client to do round
> robin, but that is not my question.
> Is the select-random-replica logic only active when we have a sharded
> collection, and not for a single-shard?
>
> Jan