Posted to solr-user@lucene.apache.org by Jason J Baik <ja...@gmail.com> on 2020/07/20 13:02:36 UTC

How to route requests to a specific core of a node hosting multiple shards?

Hi,

After upgrading from Solr 6.6.2 to 7.6.0, we're seeing an issue with
request routing in CloudSolrClient. It seems that we've lost the ability to
route a request to a specific core of a node.

For example, if a host is serving shard 1 core 1 and shard 2 core 1, then
in 6.6.2 adding a "_route_=<value whose hash falls in shard 1 range>"
param was sufficient for CloudSolrClient to figure out that the request should
go to shard 1 core 1. In 7.6.0, the request is routed to one of them
randomly.

It seems the core-level URL resolution was removed from
CloudSolrClient in commit e001f352895c83652c3cf31e3c724d29a46bb721, around
L1053, as part of SOLR-11444
<https://issues.apache.org/jira/browse/SOLR-11444>. The URL the request is
sent to is now constructed only to the node level, and no longer to the
core level.
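
To make the node-level versus core-level distinction concrete, here is a
toy model in Python. It is not SolrJ code and does not reproduce the actual
CloudSolrClient logic. The host name, the collection name, and the
assumption that a node-level URL is "base URL + collection name" are all
illustrative; the core names follow the collection1_shard1_replica1
pattern used elsewhere in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    base_url: str  # node-level URL (hypothetical host)
    core: str      # name of the core hosted on that node
    shard: str     # shard the core belongs to


def core_level_url(replica: Replica) -> str:
    """URL that pins a request to one specific core on the node."""
    return f"{replica.base_url}/{replica.core}"


def node_level_url(replica: Replica, collection: str) -> str:
    """URL that identifies only the node (assumed form: base URL + collection)."""
    return f"{replica.base_url}/{collection}"


# Hypothetical layout: one node hosting a core for each of two shards.
replicas = [
    Replica("http://host1:8983/solr", "collection1_shard1_replica1", "shard1"),
    Replica("http://host1:8983/solr", "collection1_shard2_replica1", "shard2"),
]

# Suppose the route value hashes into shard1's range.
target = next(r for r in replicas if r.shard == "shard1")

print(core_level_url(target))
print(node_level_url(target, "collection1"))
```

In this model the two replicas have different core-level URLs but the same
node-level URL, so a client that builds only the node-level URL can no
longer express which of the two cores it wants.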

There are related issues at SOLR-10695
<https://issues.apache.org/jira/browse/SOLR-10695> and SOLR-9063
<https://issues.apache.org/jira/browse/SOLR-9063>, but they are not quite the same.
Can somebody please advise what the current way to achieve this is?

Re: How to route requests to a specific core of a node hosting multiple shards?

Posted by Jason J Baik <ja...@gmail.com>.
Thanks for looking into this @Erick Erickson.
What'd be the proper way to get David Smiley's attention on this issue? A
JIRA ticket?

As for the performance difference, we haven't had a chance to test it.
We're still in the dev phase for migrating to Solr 8, so we'll run our
benchmarks afterward, and try to see if it's a serious problem.

On Mon, Jul 20, 2020 at 10:43 AM Erick Erickson <er...@gmail.com>
wrote:

> Hmm, ok.
>
> I’d have to defer to David Smiley about whether that was an intended
> change.
>
> I’m curious whether you can actually measure the difference in
> performance. If you can then that changes the urgency. Of course it’ll
> be a little more expensive for the replica serving shard2 on that machine
> to forward it to the replica serving shard1, but since it’s not going
> across the network IDK if it’s a consequential difference.
>
> Best,
> Erick

Re: How to route requests to a specific core of a node hosting multiple shards?

Posted by Erick Erickson <er...@gmail.com>.
Hmm, ok. 

I’d have to defer to David Smiley about whether that was an intended change.

I’m curious whether you can actually measure the difference in performance. If
you can, then that changes the urgency. Of course it’ll be a little more expensive
for the replica serving shard2 on that machine to forward it to the replica
serving shard1, but since it’s not going across the network, I don’t know if it’s a
consequential difference.

Best,
Erick

> On Jul 20, 2020, at 10:04 AM, Jason J Baik <ja...@gmail.com> wrote:
> 
> Our use case here is that we want to highlight a single document (against
> user-provided keywords), and we know the document's unique key already.
> So this is really not a distributed query, but more of a get by id, but we
> use SolrClient.query() for highlighting capabilities.
> And since we know the unique key, for speed gains, we've been making use of
> the "_route_" param to limit the request to the shard containing the
> document.
> 
> Our use case aside, SOLR-11444
> <https://issues.apache.org/jira/browse/SOLR-11444> generally seems to be at
> odds with the advertised use of the "_route_" param
> https://lucene.apache.org/solr/guide/7_5/solrcloud-query-routing-and-read-tolerance.html#_route_-parameter
> .
> Solr is routing the request to the correct "node", but it no longer routes
> to the correct "shard" on that node?


Re: How to route requests to a specific core of a node hosting multiple shards?

Posted by Jason J Baik <ja...@gmail.com>.
Our use case here is that we want to highlight a single document (against
user-provided keywords), and we know the document's unique key already.
So this is really not a distributed query but more of a get-by-id; we
use SolrClient.query() for its highlighting capabilities.
And since we know the unique key, for speed gains, we've been making use of
the "_route_" param to limit the request to the shard containing the
document.

Our use case aside, SOLR-11444
<https://issues.apache.org/jira/browse/SOLR-11444> generally seems to be at
odds with the advertised use of the "_route_" param:
https://lucene.apache.org/solr/guide/7_5/solrcloud-query-routing-and-read-tolerance.html#_route_-parameter
Solr is routing the request to the correct "node", but it no longer routes
to the correct "shard" on that node?
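
For readers following along, the request described above can be sketched
as a plain set of query parameters. This is only an illustration in
Python, not our actual client code. The unique key field name "id", the
sample document id and keywords, and the choice of passing the document id
itself as the "_route_" value are assumptions (the last one presumes the
default compositeId router); "hl" and "hl.q" are the standard Solr
highlighting parameters.

```python
from urllib.parse import urlencode


def highlight_params(doc_id: str, keywords: str, id_field: str = "id") -> dict:
    """Parameters for a single-document highlight query limited to one shard."""
    return {
        # Match exactly one document by its unique key.
        # (Ids containing double quotes would need escaping first.)
        "q": f'{id_field}:"{doc_id}"',
        # Turn highlighting on and highlight against the user's keywords.
        "hl": "true",
        "hl.q": keywords,
        # Route value: restricts the request to the shard owning this id.
        "_route_": doc_id,
    }


params = highlight_params("doc42", "solr routing")
print(urlencode(params))
```

The same parameter names would be set on the query object passed to
SolrClient.query(); only the transport differs.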


On Mon, Jul 20, 2020 at 9:33 AM Erick Erickson <er...@gmail.com>
wrote:

> First I want to check if this is an XY problem. Why do you want to do this?
>
> If you’re using CloudSolrClient, requests are automatically load balanced.
> And even if you send a top-level request (assuming you do NOT set
> distrib=false), then the request may be forwarded to another Solr node
> anyway. This is to handle the case where people are sending requests to a
> specific node, you don’t really want that node doing all the aggregating.
>
> Of course if you’re using an external load balancer, you can avoid all
> that.
>
> I’m not sure what the value is of sending a general request to a specific
> core in the same JVM. A “node” is really Solr running in a JVM, so there
> may be multiple of these on a particular machine, but the resolution
> takes that into account.
>
> If you have reason to ping a specific replica _only_ (I’ve often done this
> for troubleshooting), address the full replica and add “distrib=false”, i.e.
> http://…../solr/collection1_shard1_replica1?q=*:*&distrib=false
>
> Best,
> Erick

Re: How to route requests to a specific core of a node hosting multiple shards?

Posted by Erick Erickson <er...@gmail.com>.
First I want to check if this is an XY problem. Why do you want to do this?

If you’re using CloudSolrClient, requests are automatically load balanced. And
even if you send a top-level request (assuming you do NOT set distrib=false),
then the request may be forwarded to another Solr node anyway. This is to
handle the case where people are sending requests to a specific node; you don’t
really want that node doing all the aggregating.

Of course if you’re using an external load balancer, you can avoid all that.

I’m not sure what the value is of sending a general request to a specific
core in the same JVM. A “node” is really Solr running in a JVM, so there
may be multiple of these on a particular machine, but the resolution
takes that into account.

If you have reason to ping a specific replica _only_ (I’ve often done this for
troubleshooting), address the full replica and add “distrib=false”, i.e.
http://…../solr/collection1_shard1_replica1?q=*:*&distrib=false
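
A minimal sketch of building such a replica-level URL in Python, for
anyone scripting this kind of troubleshooting. The base URL
http://localhost:8983/solr and the /select handler path are assumptions
for illustration only; substitute whatever your node and request handler
actually are. The helper name replica_query_url is hypothetical.

```python
from urllib.parse import urlencode


def replica_query_url(base_url: str, core: str, params: dict) -> str:
    """Build a URL that addresses one replica and disables distributed search."""
    query = dict(params)
    # distrib=false keeps the request on the addressed core only.
    query["distrib"] = "false"
    # The /select handler is assumed here; other handlers work the same way.
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(query)}"


url = replica_query_url(
    "http://localhost:8983/solr",       # hypothetical node base URL
    "collection1_shard1_replica1",      # full replica (core) name
    {"q": "*:*"},
)
print(url)
```

Note that urlencode percent-encodes the "*:*" query, which Solr decodes
back to the original value.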

Best,
Erick

> On Jul 20, 2020, at 9:02 AM, Jason J Baik <ja...@gmail.com> wrote:
> 
> Hi,
> 
> After upgrading from Solr 6.6.2 to 7.6.0, we're seeing an issue with
> request routing in CloudSolrClient. It seems that we've lost the ability to
> route a request to a specific core of a node.
> 
> For example, if a host is serving shard 1 core 1, and shard 2 core
> 1, @6.6.2, adding a "_route_=<value whose hash falls in shard 1 range>"
> param was sufficient for CloudSolrClient to figure out the request should
> go to shard 1 core 1, but @7.6.0, the request is routed to one of them
> randomly.
> 
> It seems the core-level url resolution has been removed from
> CloudSolrClient at commit e001f352895c83652c3cf31e3c724d29a46bb721 around
> L1053, as part of SOLR-11444
> <https://issues.apache.org/jira/browse/SOLR-11444>. The url the request is
> sent to is now constructed only to the node level, and no longer to the
> core level.
> 
> There's a related issue for this at SOLR-10695
> <https://issues.apache.org/jira/browse/SOLR-10695>, and SOLR-9063
> <https://issues.apache.org/jira/browse/SOLR-9063> but not quite the same.
> Can somebody please advise what the new way to achieve this nowadays is?