Posted to dev@lucene.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/08/10 01:30:20 UTC

[jira] [Commented] (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read

    [ https://issues.apache.org/jira/browse/SOLR-8146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414569#comment-15414569 ] 

ASF GitHub Bot commented on SOLR-8146:
--------------------------------------

GitHub user susheelks opened a pull request:

    https://github.com/apache/lucene-solr/pull/66

    SOLR-8146: Allowing SolrJ CloudSolrClient to have preferred replica for query/read

    This pull request is to get feedback on the approach to implementing routingRule.
    
    The unit test is not ready yet. I am having trouble working out how to mock or inject a dependency to simulate a cluster of machines with different IP addresses, where only the matching one gets added to the urlList that is ultimately passed to LBHttpSolrClient.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/susheelks/lucene-solr SOLR-8146

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/66.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #66
    
----
commit e761d7c2e1ecf6ce37eb43bc764897fbed8cdc4e
Author: Kumar, Susheel (CORP) <su...@adp.com>
Date:   2016-08-10T00:42:45Z

    changes for limiting query to shard matching routing rule

----


> Allowing SolrJ CloudSolrClient to have preferred replica for query/read
> -----------------------------------------------------------------------
>
>                 Key: SOLR-8146
>                 URL: https://issues.apache.org/jira/browse/SOLR-8146
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>    Affects Versions: 5.3
>            Reporter: Arcadius Ahouansou
>         Attachments: SOLR-8146.patch, SOLR-8146.patch, SOLR-8146.patch
>
>
> h2. Background
> Currently, the CloudSolrClient randomly picks a replica to query.
> This is done by shuffling the list of live URLs and then picking the first item from the list.
> This ticket is to allow more flexibility and, to some extent, control over which URLs are picked for queries.
> Note that this is for queries only and would not affect update/delete/admin operations.
> h2. Implementation
> The current patch uses a regex pattern and moves to the top of the list of URLs only those matching the regex specified by the system property {code}solr.preferredQueryNodePattern{code}
> Initially, I thought it might be good to have Solr nodes tagged with a string pattern (snitch?) and use that pattern for matching the URLs.
> Any comment, recommendation or feedback would be appreciated.
> h2. Use Cases
> There are many cases where the ability to choose the node where queries go can be very handy:
> h3. Special node for manual user queries and analytics
> One may have a SolrCloud cluster where every node hosts the same set of collections, with:
> - multiple large SolrCloud nodes (L) used for production apps, and
> - 1 small node (S) in the same cluster, with less RAM/CPU, used only for manual user queries, data export and other production issue investigation.
> This ticket would allow applications using SolrJ to be configured to query only the (L) nodes.
> This use case is similar to the one described in SOLR-5501, raised by [~manuel lenormand].
> h3. Minimizing network traffic
>  
> For simplicity, let's say that we have a SolrCloud cluster deployed on 2 (or N) separate racks: rack1 and rack2.
> On each rack, we have a set of SolrCloud VMs as well as a couple of client VMs querying Solr using SolrJ.
> All Solr nodes are identical and have the same number of collections.
> What we would like to achieve is:
> - clients on rack1 will by preference query only SolrCloud nodes on rack1, and 
> - clients on rack2 will by preference query only SolrCloud nodes on rack2.
> - Cross-rack reads will happen if and only if one of the racks has no available Solr node to serve a request.
> In other words, we want read operations to be local to a rack whenever possible.
> Note that write/update/delete/admin operations should not be affected.
> Note that in our use case, we have a cross-DC deployment, so replace rack1/rack2 with DC1/DC2.
> Any comment would be much appreciated.
> Thanks.
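
The reordering described under "Implementation" in the quoted issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches or from the pull request: the class PreferredUrlOrdering, the method preferMatching, the example host names and the fallback pattern "rack1" are all hypothetical. Only the system property name solr.preferredQueryNodePattern comes from the issue text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;

// Hypothetical illustration; this class is not part of SolrJ.
class PreferredUrlOrdering {

    // Shuffles the live URLs (the behaviour the issue describes for
    // CloudSolrClient), then moves URLs matching the pattern to the front.
    // Non-matching URLs stay in the list as a fallback, and the shuffled
    // order is kept within each group.
    static List<String> preferMatching(List<String> liveUrls, Pattern preferred, Random random) {
        List<String> shuffled = new ArrayList<>(liveUrls);
        Collections.shuffle(shuffled, random);
        if (preferred == null) {
            return shuffled; // no preference configured: plain random order
        }
        List<String> matching = new ArrayList<>();
        List<String> others = new ArrayList<>();
        for (String url : shuffled) {
            if (preferred.matcher(url).find()) {
                matching.add(url);
            } else {
                others.add(url);
            }
        }
        matching.addAll(others);
        return matching;
    }

    public static void main(String[] args) {
        // Property name taken from the issue; the default "rack1" is an
        // assumption made only so that this demo does something visible.
        String regex = System.getProperty("solr.preferredQueryNodePattern", "rack1");

        // Hypothetical node URLs standing in for the live replicas.
        List<String> liveUrls = Arrays.asList(
                "http://rack2-a:8983/solr/collection1",
                "http://rack1-a:8983/solr/collection1",
                "http://rack2-b:8983/solr/collection1",
                "http://rack1-b:8983/solr/collection1");

        List<String> ordered = preferMatching(liveUrls, Pattern.compile(regex), new Random());
        // URLs matching the pattern come first; the rest follow as fallback.
        ordered.forEach(System.out::println);
    }
}
```

Because the non-matching URLs are kept at the end of the list rather than dropped, a client that walks the list in order would reach another rack only after the preferred URLs have been tried, which is the fallback behaviour the "Minimizing network traffic" use case asks for. An approach that filters the list down to matching URLs only would need separate handling for the case where nothing matches.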


