Posted to solr-user@lucene.apache.org by Fernando Otero <fe...@olx.com> on 2018/10/23 15:31:05 UTC

Internal Solr communication question

Hey all,
     I'm running some tests on SolrCloud (10 nodes, 3 shards, 3 replicas).
When I run the queries, I end up seeing 7x the traffic (requests/minute) in
New Relic.

Could it be that the internal communication between nodes is done through
HTTP and New Relic counts those calls?

Thanks!

Re: Internal Solr communication question

Posted by Erick Erickson <er...@gmail.com>.
preferLocalShards is a bit of a misnomer. I usually think of it as
"don't go to another Solr node if possible".
On Thu, Oct 25, 2018 at 10:46 AM Fernando Otero <fe...@olx.com> wrote:
>
> Thanks Emir!
>     I was already looking at preferLocalShards, but I wasn't sure it would
> help with only 1 shard. I'll give it a try.

Re: Internal Solr communication question

Posted by Fernando Otero <fe...@olx.com>.
Thanks Emir!
    I was already looking at preferLocalShards, but I wasn't sure it would
help with only 1 shard. I'll give it a try.


On Thu, Oct 25, 2018 at 11:26 AM Emir Arnautović <
emir.arnautovic@sematext.com> wrote:

> Hi Fernando,
> I have not looked at the code and am not sure whether there is special
> handling for a single-shard collection, but Solr does not have to choose
> the local shard to query. It assumes that a single node may receive all
> client requests, so it balances them across the replicas. What you can do
> is add preferLocalShards=true to make sure local shards are queried where
> possible.

-- 

Fernando Otero

Sr Engineering Manager, Panamera

Buenos Aires - Argentina

Mobile: +54 911 67697108

Email:  fernando.otero@olx.com

Re: Internal Solr communication question

Posted by Emir Arnautović <em...@sematext.com>.
Hi Fernando,
I have not looked at the code and am not sure whether there is special handling for a single-shard collection, but Solr does not have to choose the local shard to query. It assumes that a single node may receive all client requests, so it balances them across the replicas. What you can do is add preferLocalShards=true to make sure local shards are queried where possible.
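
For illustration, a minimal sketch of adding that parameter to a query URL. The host, port, and collection name (`mycollection`) are placeholders and not taken from this thread; only the `preferLocalShards=true` parameter comes from the discussion above.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_select_url(base_url, collection, query, prefer_local=True):
    """Build a /select URL, optionally asking Solr to prefer local shards.

    base_url and collection are hypothetical values supplied by the caller.
    """
    params = {"q": query}
    if prefer_local:
        # Parameter suggested in the thread; sent as a normal query parameter.
        params["preferLocalShards"] = "true"
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Placeholder host and collection, for demonstration only.
url = build_select_url("http://localhost:8983/solr", "mycollection", "*:*")
print(url)

# Round-trip check: the parameters decode back to what was passed in.
decoded = parse_qs(urlparse(url).query)
print(decoded["q"], decoded["preferLocalShards"])
```

The parameter can also be set as a default on the request handler instead of on every request; whether that is appropriate depends on how clients spread their requests across nodes.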

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 25 Oct 2018, at 16:18, Fernando Otero <fe...@olx.com> wrote:
> 
> Hey Shawn,
>    Thanks for your answer! I changed the config to 1 shard with 7
> replicas, but I still see communication between nodes. Is that expected?
> Each node has 1 shard, so it should have all the data needed to compute
> the results. I don't get why I'm seeing communication between them.
> 
> Thanks


Re: Internal Solr communication question

Posted by Fernando Otero <fe...@olx.com>.
Hey Shawn,
    Thanks for your answer! I changed the config to 1 shard with 7
replicas, but I still see communication between nodes. Is that expected?
Each node has 1 shard, so it should have all the data needed to compute
the results. I don't get why I'm seeing communication between them.

Thanks

On Tue, Oct 23, 2018 at 2:21 PM Shawn Heisey <ap...@elyograg.org> wrote:

> The inter-node communication is indeed done over HTTP, using the same
> handlers that clients use, and if you have something watching Solr's
> statistics or watching Jetty's counters, one of the counters will go up
> when an inter-node request happens.
>
> With 3 shards, one request coming in will generate as many as six
> additional requests -- one request to a replica for each shard, and then
> another request to each shard that has matches for the query, to
> retrieve the documents that will be in the response. The node that
> received the initial request will compile the results from all the
> shards and send them back in response to the original request.
> Nutshell:  One request from a client expands. With three shards, that
> will be four to seven requests total.  If you have 10 shards, it will be
> between 11 and 21 total requests.
>
> Thanks,
> Shawn
>
>


Re: Internal Solr communication question

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/23/2018 9:31 AM, Fernando Otero wrote:
> Hey all,
>       I'm running some tests on SolrCloud (10 nodes, 3 shards, 3 replicas).
> When I run the queries, I end up seeing 7x the traffic (requests/minute) in
> New Relic.
>
> Could it be that the internal communication between nodes is done through
> HTTP and New Relic counts those calls?

The inter-node communication is indeed done over HTTP, using the same 
handlers that clients use, and if you have something watching Solr's 
statistics or watching Jetty's counters, one of the counters will go up 
when an inter-node request happens.
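
One way to separate the two kinds of traffic when counting requests is sketched below. It assumes that Solr tags the sub-requests it sends to other nodes with `isShard=true` (typically alongside `distrib=false`). That is an assumption to verify against the request logs of your own Solr version, and the sample query strings are made up for the example.

```python
from urllib.parse import parse_qs


def is_internal_shard_request(query_string):
    """Return True if the query string looks like an inter-node sub-request.

    Assumption: internal sub-requests carry isShard=true; check your own logs.
    """
    params = parse_qs(query_string)
    # Use the last value if the parameter was repeated.
    return params.get("isShard", ["false"])[-1].lower() == "true"


def count_by_origin(query_strings):
    """Count client requests versus internal sub-requests."""
    counts = {"client": 0, "internal": 0}
    for qs in query_strings:
        key = "internal" if is_internal_shard_request(qs) else "client"
        counts[key] += 1
    return counts


# Hypothetical query strings, not copied from real Solr logs.
samples = [
    "q=*:*&rows=10",
    "q=*:*&rows=10&distrib=false&isShard=true",
    "q=*:*&rows=10&distrib=false&isShard=true",
]
print(count_by_origin(samples))
```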

With 3 shards, one request coming in will generate as many as six 
additional requests -- one request to a replica for each shard, and then 
another request to each shard that has matches for the query, to 
retrieve the documents that will be in the response. The node that 
received the initial request will compile the results from all the 
shards and send them back in response to the original request.  
Nutshell:  One request from a client expands. With three shards, that 
will be four to seven requests total.  If you have 10 shards, it will be 
between 11 and 21 total requests.
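
The arithmetic above can be written out as a small sketch. The function name is made up for illustration, and it only encodes the rule of thumb from this message (one client request, one request per shard, plus up to one more per shard to fetch documents). It does not model any special handling Solr may have for single-shard collections.

```python
from typing import Tuple


def request_range(num_shards: int) -> Tuple[int, int]:
    """Return (min, max) total HTTP requests caused by one client query."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    client_request = 1
    # First phase: one request to a replica of every shard.
    first_phase = num_shards
    # Second phase: at most one more request per shard that has matches.
    second_phase_max = num_shards
    low = client_request + first_phase
    high = client_request + first_phase + second_phase_max
    return low, high


for shards in (3, 10):
    low, high = request_range(shards)
    print(f"{shards} shards: {low} to {high} requests per client query")
```

For 3 shards this gives 4 to 7 and for 10 shards 11 to 21, matching the figures quoted above.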

Thanks,
Shawn