Posted to issues@solr.apache.org by "Ishan Chattopadhyaya (Jira)" <ji...@apache.org> on 2022/09/12 13:07:00 UTC

[jira] [Comment Edited] (SOLR-15715) Dedicated query coordinator nodes in the solr cluster

    [ https://issues.apache.org/jira/browse/SOLR-15715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17603076#comment-17603076 ] 

Ishan Chattopadhyaya edited comment on SOLR-15715 at 9/12/22 1:06 PM:
----------------------------------------------------------------------

Here are the final benchmark numbers.

*Setup*
Branch: [https://github.com/apache/solr/pull/996]
No. of Solr nodes: 6 (1 dedicated overseer, 1 coordinator node, 4 regular data nodes)
No. of collections: 1
No. of shards: 256
No. of documents: 25 million
No. of queries: 2000 (faceting queries, a few join queries)
Hardware: one machine with 64GB RAM and at least 16 CPUs.

*Comparison:*
Scenario 1) All queries are sent to the dedicated overseer node, which forwards them to the data nodes, where they are executed.
Scenario 2) All queries are sent to the coordinator node and executed on that node.
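For a concrete picture, a single faceting query like those in the benchmark, sent to whichever node a scenario uses as its entry point, could look roughly like the sketch below (the port, collection name, and facet field are placeholders; the actual 2000 queries are generated by solr-bench):

{code}
# Illustrative only: the port, collection and field names are placeholders.
# NODE_PORT is whichever node the scenario routes queries to (the overseer
# node in scenario 1, the coordinator node in scenario 2).
NODE_PORT=50001
curl "http://localhost:${NODE_PORT}/solr/mycollection/select" \
  --data-urlencode 'q=*:*' \
  --data-urlencode 'rows=0' \
  --data-urlencode 'json.facet={categories:{type:terms,field:category_s}}'
{code}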

*Results*
Here are the heap usage graphs.
!coordinator-vs-data-nodes.jpg!

The graphs on the left are for scenario 1 (queries executed on the data nodes) and the graphs on the right are for scenario 2 (queries executed on the coordinator node). It is clear that heap usage on the data nodes (ports 50002 and above) is lower in scenario 2.

*Reproducing these benchmarks*
On a laptop, desktop or VM with at least 64GB RAM and 16 CPUs, do the following:

{code}
# Prerequisites
apt install wget unzip zip ant ivy lsof git netcat make openjdk-11-jdk maven jq

# Clone and build the benchmark suite
git clone https://github.com/fullstorydev/solr-bench
cd solr-bench
mvn clean compile assembly:single

# Run the benchmark
./stress.sh coordinator-node.json
{code}

To run scenario 1, keep {{query-node}} set to 1 in the {{querying}} section of {{task-types}} (in coordinator-node.json). To run scenario 2, change it to 2. Here 1 and 2 are node indices (see {{startup-params-overrides}} in the {{cluster}} section).
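Switching between the two runs can also be scripted with jq (already listed in the prerequisites). This is only a sketch, assuming {{query-node}} sits at the path {{task-types}} -> {{querying}} as described above; adjust the path if coordinator-node.json is laid out differently:

{code}
# Sketch: point the benchmark at node index 2 for scenario 2.
# The JSON path below is assumed from the field names mentioned above.
jq '."task-types".querying."query-node" = 2' coordinator-node.json > tmp.json \
  && mv tmp.json coordinator-node.json
{code}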


> Dedicated query coordinator nodes in the solr cluster
> -----------------------------------------------------
>
>                 Key: SOLR-15715
>                 URL: https://issues.apache.org/jira/browse/SOLR-15715
>             Project: Solr
>          Issue Type: New Feature
>          Components: SearchComponents - other
>    Affects Versions: 8.10.1
>            Reporter: Hitesh Khamesra
>            Assignee: Noble Paul
>            Priority: Major
>         Attachments: coordinator-poc.jpg, coordinator-poc.pdf, coordinator-vs-data-nodes.jpg, regular-node.jpg, regular-node.pdf
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> We have a large collection with thousands of shards in the Solr cluster. We have observed that distributed Solr queries consume significant resources (threads, memory, etc.) on the Solr data nodes (the nodes that hold the indexes). We therefore need dedicated query nodes to execute distributed queries on large Solr collections; that would reduce memory/CPU pressure on the data nodes.
> Elasticsearch has similar functionality, described [here|https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html#coordinating-node].
>  
> [~noble.paul] [~ichattopadhyaya]



