Posted to issues@lucene.apache.org by "Andy Vuong (Jira)" <ji...@apache.org> on 2020/07/16 22:31:00 UTC

[jira] [Updated] (SOLR-14658) SolrJ COLSTATUS interface returns all collections when specifying one

     [ https://issues.apache.org/jira/browse/SOLR-14658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Vuong updated SOLR-14658:
------------------------------
    Description: 
In SolrJ, the interface for making COLSTATUS collection calls returns the status of every collection on the cluster, rather than just the one specified.

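The API in question is CollectionAdminRequest.collectionStatus(collection) in the class [CollectionAdminRequest.java|https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/CollectionAdminRequest.java#L903]. This creates and returns a new CollectionAdminRequest#ColStatus instance that extends AsyncCollectionSpecificAdminRequest.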

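The constructor of that class passes the collection to the [parent|https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/solrj/src/java/org/apache/solr/client/solrj/request/CollectionAdminRequest.java#L250] class, which keeps it in a field named “collection”.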

When we call AsyncCollectionSpecificAdminRequest.getParams(), we return:
{code:java}
@Override
public SolrParams getParams() {
  ModifiableSolrParams params = new ModifiableSolrParams(super.getParams());
  // The collection is sent under "name", not under "collection".
  params.set(CoreAdminParams.NAME, collection);
  params.setNonNull(CollectionAdminParams.FOLLOW_ALIASES, followAliases);
  return params;
}
{code}
This sets “name” to the collection name. In [CollectionsHandler|https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L514], where Solr handles the collection API operation request, the COLSTATUS operation copies only a fixed list of params, when present, into a new map of properties. “name” is missing from that list, so it is never copied even when the request includes it.

In the same block we then call ColStatus.java#getColStatus. That method tries to read "collection" from the properties, but the value is always null because the SolrJ interface never sets it. As a result, it grabs the status of all collections instead of the one specified through SolrJ.
{code:java}
String col = props.getStr(ZkStateReader.COLLECTION_PROP);
if (col == null) {
  collections = new HashSet<>(clusterState.getCollectionsMap().keySet());
} else {
  collections = Collections.singleton(col);
}
{code}
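[ColStatus.java|https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L514]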

This is unfortunate because the command sends a request to every leader replica of every collection, when only one collection was intended.
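The flow described above can be modeled in a few lines of plain Java. This is only a simplified sketch, not Solr's code: the class and the helpers `copyIfPresent` and `selectCollections` are hypothetical stand-ins, the request and properties are plain maps, and the list of copied keys is shortened to "collection" (the real list is assumed to contain more keys, none of them "name").

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Simplified, hypothetical model of the flow described above; not Solr's actual code.
final class ColStatusFlowSketch {

  // Stand-in for the handler's copy step: only listed keys that are present get copied.
  static Map<String, String> copyIfPresent(Map<String, String> request, String... keys) {
    Map<String, String> props = new HashMap<>();
    for (String key : keys) {
      String value = request.get(key);
      if (value != null) {
        props.put(key, value);
      }
    }
    return props;
  }

  // Stand-in for the selection in getColStatus: a missing "collection" means "all collections".
  static Set<String> selectCollections(Map<String, String> props, Set<String> allCollections) {
    String col = props.get("collection");
    return col == null ? new TreeSet<>(allCollections) : Set.of(col);
  }

  public static void main(String[] args) {
    Set<String> cluster = Set.of("test", "other");

    // What the SolrJ request carries: the collection travels under "name".
    Map<String, String> solrjRequest = Map.of("action", "COLSTATUS", "name", "test");

    // "name" is not among the copied keys, so it never reaches the property map.
    Map<String, String> props = copyIfPresent(solrjRequest, "collection");

    // With no "collection" property, every collection in the cluster is selected.
    System.out.println(selectCollections(props, cluster));
  }
}
```

Passing a request map that carries "collection" instead of "name" through the same two helpers selects only that collection, which is the asymmetry this issue describes.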

To reproduce, spin up a cluster with 2+ collections and run this short snippet, which returns the status for all collections as opposed to one:
{code:java}
String host = "http://localhost:8983/solr";
HttpSolrClient.Builder builder = new HttpSolrClient.Builder(host);
HttpSolrClient solrClient = builder.build();
String collection = "test";
final NamedList<Object> response =
    solrClient.request(CollectionAdminRequest.collectionStatus(collection));
System.out.println(response);
{code}
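For comparison, the two query strings involved can be written out side by side. This is a rough sketch that builds only the parameters relevant here: the class and method names are hypothetical, any other parameters SolrJ may attach to the request are left out, and no request is actually sent.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that only formats query strings; it does not talk to Solr.
final class ColStatusQuerySketch {

  // Joins URL-encoded key=value pairs with '&', preserving insertion order.
  static String toQuery(Map<String, String> params) {
    StringJoiner query = new StringJoiner("&");
    for (Map.Entry<String, String> entry : params.entrySet()) {
      query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
          + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
    }
    return query.toString();
  }

  // Builds a COLSTATUS query that carries the collection under the given parameter name.
  static String colStatusQuery(String paramName, String collection) {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("action", "COLSTATUS");
    params.put(paramName, collection);
    return toQuery(params);
  }

  public static void main(String[] args) {
    // Shape of what the SolrJ request carries, per the getParams() override quoted above.
    System.out.println(colStatusQuery("name", "test"));
    // Shape that getColStatus looks for, per the lookup of "collection" quoted above.
    System.out.println(colStatusQuery("collection", "test"));
  }
}
```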
I think the simplest fix is to add "NAME" to the list of params that CollectionsHandler copies; I can try that a bit later.
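A rough sketch of what that fix might look like, again in plain Java with hypothetical helper names rather than Solr's real classes. Judging only by the snippets quoted in this description, copying "name" by itself would leave the lookup of "collection" unchanged, so the sketch also assumes a fallback from "name" to "collection"; that fallback is an assumption of this example, not something confirmed here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed fix; the helpers are illustrative, not Solr's code.
final class ColStatusFixSketch {

  // Stand-in for the handler's copy step: only listed keys that are present get copied.
  static Map<String, String> copyIfPresent(Map<String, String> request, String... keys) {
    Map<String, String> props = new HashMap<>();
    for (String key : keys) {
      String value = request.get(key);
      if (value != null) {
        props.put(key, value);
      }
    }
    return props;
  }

  // Copies "name" as well, then falls back to it when "collection" is absent (assumed step).
  static Map<String, String> buildProps(Map<String, String> request) {
    Map<String, String> props = copyIfPresent(request, "collection", "name");
    if (!props.containsKey("collection") && props.containsKey("name")) {
      props.put("collection", props.get("name"));
    }
    return props;
  }

  public static void main(String[] args) {
    Map<String, String> solrjRequest = Map.of("action", "COLSTATUS", "name", "test");
    // The collection sent under "name" is now visible under "collection".
    System.out.println(buildProps(solrjRequest).get("collection"));
  }
}
```

An explicit "collection" param still takes precedence in this sketch, so requests that already set it would behave as before.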

  was:
In SolrJ, using the interface available for making COLSTATUS collection calls will return the collection status for all collections on the cluster as opposed to just one when specifying the collection.

The API in question is  CollectionAdminRequest.collectionStatus(collection) in the class [CollectionAdminRequest.java|[https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/request/CollectionAdminRequest.java#L903]]. This will create and return a new CollectionAdminRequest#ColStatus instance that extends AsyncCollectionSpecificAdminRequest.

The constructor of that class will pass the collection passed in to the [parent|[https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/solrj/src/java/org/apache/solr/client/solrj/request/CollectionAdminRequest.java#L250]] class and keep it in a param “collection”.


When we call AsyncCollectionSpecificAdminRequest.getParams(), we return:

{code:java}
@Override 
public SolrParams getParams() { 
  ModifiableSolrParams params = new ModifiableSolrParams(super.getParams());   
  params.set(CoreAdminParams.NAME, collection);   
  params.setNonNull(CollectionAdminParams.FOLLOW_ALIASES, followAliases); 
  return params;
}{code}

 With “name” set to the collection name. In [CollectionsHandler|https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L514] where Solr handles the collection API operation request, it only copies the follow params if they exist into a new map of properties. Notice how it is missing “name” so we'll never copy that param if it's included.



We then call ColStatus.java#getColStatus in the same block, getColStatus method will try to retrieve "collection" but will always return null since it's not used in the solrj interfaces and therefore grab the status from all collections vs the one specified when using solrj.  
{code:java}
String col = props.getStr(ZkStateReader.COLLECTION_PROP);    
if (col == null) {      
  collections = new HashSet<>(clusterState.getCollectionsMap().keySet());    
} else {      
  collections = Collections.singleton(col);    
}

{code}
[ColStatus.java|[https://github.com/apache/lucene-solr/blob/03d658a7bc306370cfce6ef92f34f151db7ad3dc/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L514]] 

This is unfortunate as the command will send a request to every leader replica per collection when that was not the intention.

This can be reproduced by spinning up a cluster with 2+ collections and a short snippet of code which will return the status for all collections as opposed to one:
{code:java}
 String host = "http://localhost:8983/solr";
HttpSolrClient.Builder builder = new HttpSolrClient.Builder(host);
HttpSolrClient solrClient = builder.build(); String collection = "test";
final NamedList<Object> response = 
   solrClient.request(CollectionAdminRequest.collectionStatus(collection)); System.out.println(response);{code}

I think the simplest fix is to just add "NAME" to the list of params that get copied in CollectionsHandler and can try that a bit later.


> SolrJ COLSTATUS interface returns all collections when specifying one
> ---------------------------------------------------------------------
>
>                 Key: SOLR-14658
>                 URL: https://issues.apache.org/jira/browse/SOLR-14658
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: master (9.0), 8.3
>            Reporter: Andy Vuong
>            Priority: Minor
>


