Posted to dev@lucene.apache.org by "Scott Blum (JIRA)" <ji...@apache.org> on 2016/04/26 01:54:13 UTC

[jira] [Comment Edited] (SOLR-9014) Deprecate and reduce usage of ClusterState methods which may make calls to ZK via the lazy collection reference

    [ https://issues.apache.org/jira/browse/SOLR-9014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257276#comment-15257276 ] 

Scott Blum edited comment on SOLR-9014 at 4/25/16 11:54 PM:
------------------------------------------------------------

Yes, unfortunately.  ClusterState.collectionStates is driven (in part) by /solr/collections/<children>.  In particular, if /solr/collections/foo exists, the foo collection is not being watched, and /solr/collections/foo/state.json does NOT exist, then the collection will appear in collectionStates as a LazyCollectionRef, but it won't resolve to a DocCollection since there's no state.json.  We don't poll or watch for the existence of the state.json for non-watched collections.

Watched collections don't have this problem, due to how interestingCollections vs. watchedCollections is handled in ZkStateReader.

We should figure out if there's a better API for ClusterState to handle this more efficiently.  If you naively remove the guard in ClusterState.getCollections(), a lot of calling code will break.  For example, in Assign.getNodeNameVsShardCount(), the caller loops over the returned set of collection names, calling clusterState.getCollection(collectionName) and expecting the result to be non-null.  We would either need to update all those callers to check for null, or else have LazyCollectionRef.get() return an empty DocCollection if the state.json node doesn't exist.
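
To make the failure mode concrete, here is a minimal sketch using simplified stand-ins rather than the real Solr classes. Doc, State, addLazy, getOrNull and countResolved are hypothetical names invented for illustration; a Supplier plays the role of the lazy collection reference. It models a collection whose znode exists without a state.json, and shows the first of the two options (callers checking for null).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Simplified stand-ins for illustration only; NOT the real Solr classes.
class LazyRefSketch {

    /** Stand-in for DocCollection: only carries a name. */
    static final class Doc {
        final String name;
        Doc(String name) { this.name = name; }
    }

    /** Stand-in for ClusterState: collection name -> lazy reference. */
    static final class State {
        private final Map<String, Supplier<Doc>> refs = new LinkedHashMap<>();

        void addLazy(String name, Supplier<Doc> ref) { refs.put(name, ref); }

        /** Every known name, including ones whose reference cannot resolve. */
        Set<String> names() { return refs.keySet(); }

        /** Returns null for unknown names and for references that resolve to nothing. */
        Doc getOrNull(String name) {
            Supplier<Doc> ref = refs.get(name);
            return ref == null ? null : ref.get();
        }
    }

    /** Caller-side fix: check for null and skip collections that do not resolve. */
    static int countResolved(State state) {
        int resolved = 0;
        for (String name : state.names()) {
            if (state.getOrNull(name) != null) {
                resolved++;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        State state = new State();
        state.addLazy("good", () -> new Doc("good"));
        state.addLazy("foo", () -> null); // models: znode exists, no state.json

        System.out.println(state.names().size()); // prints 2
        System.out.println(countResolved(state)); // prints 1
    }
}
```

The alternative, having the lazy reference return an empty placeholder instead of null, would spare callers the check, but a listed collection would then look real to code that iterates over the names.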


was (Author: dragonsinth):
Yes, unfortunately.  ClusterState.collectionStates is driven (in part) by /solr/collections/<children>.  In particular, if foo unwatched, /solr/collections/foo exists, but /solr/collections/foo/state.json does NOT exist, then the collection will appear in collectionStates as a LazyCollectionRef, but it won't resolve to a DocCollection since there's no state.json.  We don't poll or watch for the existence of the state.json for non-watched collections.

Watched collections don't have this problem, due to how interestingCollections vs. watchedCollections is handled in ZkStateReader.

We should figure out if there's a better API for ClusterState to handle this more efficiently.  If you naively remove the guard in ClusterState.getCollections(), what will happen is a lot of calling code will break.  For example, in Assign.getNodeNameVsShardCount(), the caller loops over the returned set of collection names, calling clusterState.getCollection(collectionName) and expecting the result to be non-null.  We would either need to update all those callers to check, or else have LazyCollectionRef.get() return an empty DocCollection if the node doesn't exist.

> Deprecate and reduce usage of ClusterState methods which may make calls to ZK via the lazy collection reference
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9014
>                 URL: https://issues.apache.org/jira/browse/SOLR-9014
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>             Fix For: master, 6.1
>
>         Attachments: SOLR-9014.patch
>
>
> ClusterState has a bunch of methods such as getSlice and getReplica which internally call getCollectionOrNull that ends up making a call to ZK via the lazy collection reference. Many classes use these methods even though a DocCollection object is available. In such cases, multiple redundant calls to ZooKeeper can happen if the collection is not watched locally. This is especially true for Overseer classes which operate on all collections.
> We should audit all usages of these methods and replace them with calls to appropriate DocCollection methods.
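
The redundancy described in the issue can be illustrated with a rough sketch. The types below are hypothetical stand-ins, not the real ClusterState or DocCollection, and the fetches counter merely simulates reads from ZooKeeper for an unwatched collection. It compares per-slice lookups through the cluster-state stand-in with fetching the collection once and reading slices from it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; NOT the real Solr classes.
class RedundantLookupSketch {

    /** Stand-in for DocCollection: slice name -> placeholder slice payload. */
    static final class Doc {
        final Map<String, String> slices = new HashMap<>();
    }

    /** Stand-in for ClusterState over an unwatched collection. */
    static final class State {
        private final Map<String, Doc> remote = new HashMap<>(); // pretend ZK contents
        int fetches = 0; // counts simulated ZK reads

        void put(String name, Doc doc) { remote.put(name, doc); }

        /** Each call goes to the simulated remote store. */
        Doc getCollectionOrNull(String name) {
            fetches++;
            return remote.get(name);
        }

        /** Convenience lookup that resolves the collection on every call. */
        String getSlice(String collection, String slice) {
            Doc doc = getCollectionOrNull(collection);
            return doc == null ? null : doc.slices.get(slice);
        }
    }

    /** One simulated fetch per slice lookup. */
    static int fetchesViaState(State state, String collection, List<String> sliceNames) {
        int before = state.fetches;
        for (String slice : sliceNames) {
            state.getSlice(collection, slice);
        }
        return state.fetches - before;
    }

    /** One simulated fetch in total; slices are read from the resolved object. */
    static int fetchesViaDoc(State state, String collection, List<String> sliceNames) {
        int before = state.fetches;
        Doc doc = state.getCollectionOrNull(collection);
        if (doc != null) {
            for (String slice : sliceNames) {
                doc.slices.get(slice);
            }
        }
        return state.fetches - before;
    }

    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.slices.put("shard1", "s1");
        doc.slices.put("shard2", "s2");
        doc.slices.put("shard3", "s3");
        State state = new State();
        state.put("foo", doc);

        List<String> names = Arrays.asList("shard1", "shard2", "shard3");
        System.out.println(fetchesViaState(state, "foo", names)); // prints 3
        System.out.println(fetchesViaDoc(state, "foo", names));   // prints 1
    }
}
```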


