Posted to dev@lucene.apache.org by "Ben DeMott (JIRA)" <ji...@apache.org> on 2017/03/14 17:22:41 UTC

[jira] [Updated] (SOLR-10284) Solr connection to Standalone node in Ensemble causes cluster failure

     [ https://issues.apache.org/jira/browse/SOLR-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben DeMott updated SOLR-10284:
------------------------------
    Description: 
I posted this issue on the dev mailing list and was encouraged to create a Jira ticket.  This isn't a bug per se.

Solr will connect or reconnect to a ZooKeeper node that is running in "standalone" mode even though that node belongs to an ensemble, which causes absolute havoc across the cluster.

I work for Dice.com as one of the core search developers.
I'm happy to write a patch, as we'll probably do that internally anyway.  I just want to get consensus from the community about how to provide the best solution.

My original email describing the issue: 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201703.mbox/raw/%3CCACbtCQ2cSPA8NbnqCbXZE9nZdT40xFHjpUhAOqUnd%3DqZaRMEsA%40mail.gmail.com%3E/2

Proposed Solution:

Hi Jan,

My thought was an explicit setting in solr.in.sh, "ZK_STANDALONE" (which would default to true in the solr.in.sh file found next to bin/solr).  Upon connection or reconnection, the ZooKeeper client would ask the server "are you standalone?"  If the server is standalone and ZK_STANDALONE=false, the client would disconnect and try the next host.  If all hosts are standalone, an error would be shown: "No ZooKeeper hosts available that aren't in standalone operation. The setting ZK_STANDALONE=false prevents connecting to a standalone ZooKeeper."
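
A rough sketch of that check, written in Python for brevity (a real patch would live in Solr's Java ZooKeeper client code). It assumes the mode is read from ZooKeeper's "srvr" four-letter command, whose reply contains a "Mode:" line. The function names, the probe callable, and the choice to skip hosts whose mode cannot be determined are all illustrative assumptions, and ZK_STANDALONE is only the setting proposed here, not an existing one.

```python
import socket
from typing import Callable, Iterable, Optional


def parse_mode(srvr_output: str) -> Optional[str]:
    """Return the lower-cased value of the 'Mode:' line, or None if absent."""
    for line in srvr_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "mode":
            return value.strip().lower()
    return None


def query_srvr(host: str, port: int = 2181, timeout: float = 2.0) -> str:
    """Send the 'srvr' four-letter command to one server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def pick_host(
    hosts: Iterable[str],
    allow_standalone: bool,
    probe: Callable[[str], str] = query_srvr,
) -> str:
    """Return the first reachable host that the proposed setting permits.

    allow_standalone mirrors the proposed ZK_STANDALONE value. When it is
    False, hosts reporting 'standalone' are skipped; hosts whose mode cannot
    be determined are also skipped (a conservative assumption of this sketch).
    """
    for host in hosts:
        try:
            mode = parse_mode(probe(host))
        except OSError:
            continue  # unreachable host: try the next one
        if allow_standalone or mode not in (None, "standalone"):
            return host
    raise ConnectionError(
        "No ZooKeeper hosts available that aren't in standalone operation. "
        "The setting ZK_STANDALONE=false prevents connecting to a standalone ZooKeeper."
    )
```

Passing the probe in as a callable keeps the extra round trip in one place and lets the selection logic be exercised without a live ensemble.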

To urge users to use the setting, I would possibly also log a warning if ZK_HOST is set, has multiple hosts in the connection string, and ZK_STANDALONE is not false.

I can't think of any implicit way to derive the setting, other than this: if the ZK_HOST connection string has multiple hosts, there should be no scenario in which any node is standalone, so you could assume there should be no standalone servers.  But maybe an explicit setting is preferable.
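
The warning condition and the implicit rule both come down to counting hosts in the connection string. A minimal sketch in Python, assuming the usual "host:port,host:port/chroot" connection-string form; the function names are hypothetical and ZK_STANDALONE is again only the proposed setting:

```python
from typing import List, Optional


def split_zk_hosts(connection_string: str) -> List[str]:
    """Return the host entries of a ZooKeeper connection string.

    Any chroot suffix (the part from the first '/') is dropped first.
    """
    hosts_part = connection_string.split("/", 1)[0]
    return [h.strip() for h in hosts_part.split(",") if h.strip()]


def should_warn(connection_string: Optional[str], zk_standalone: Optional[str]) -> bool:
    """True when several hosts are listed but ZK_STANDALONE is not 'false'."""
    if not connection_string:
        return False  # no external ZooKeeper configured, nothing to warn about
    multiple_hosts = len(split_zk_hosts(connection_string)) > 1
    explicitly_false = (zk_standalone or "").strip().lower() == "false"
    return multiple_hosts and not explicitly_false
```

The same multiple_hosts test could serve as the implicit default if an explicit setting turns out to be unwanted.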

This solution should:
1.) be backwards compatible
2.) have very little performance impact (1 extra call upon connection to ZK)
3.) be isolated to one part of the code.

  was:
I posted this issue on the Dev mailing list and was encouraged to create a Jira ticket.  This isn't a bug per-se.

Solr connects / reconnects to "Standalone" Zookeeper nodes, within an ensemble cluster, which causes absolute havoc. 

I work for Dice.com, as one of the core search developers.
I'm happy to write a patch, as we'll probably do that internally anyways.  I just want to get consensus from the community about how to provide the best solution.

My original email describing the issue: 
http://mail-archives.apache.org/mod_mbox/lucene-dev/201703.mbox/browser

Proposed Solution:

Hi Jan,

My thought was an explicit setting in solr.in.sh "ZK_STANDALONE" (which would default to TRUE for the solr.in.sh file found next to bin/solr).  Upon connection or reconnection of the Zookeeper Client, it would ask the server "are you standalone", and disconnect if it is and ZK_STANDALONE=false, and try the next host.  If all hosts are in standalone, an error would be shown - "No zookeeper hosts available, that aren't in standalone operation - The setting ZK_STANDALONE=false prevents connecting to a standalone Zookeeper"

In order to urge users to use the setting, I would possibly also have a warning shown in the logs, if your ZK_HOSTS is set, has multiple hosts in the connection string, and ZK_STANDALONE is not false.

I can't think of any implicit way to internalize a setting.... Other than....  ZK_HOSTS connection string setting has multiple hosts, there should be no scenario in which any node is standalone, so you could assume there should be no standalone servers.  But maybe an explicit setting is preferable.

This solution should be:
1.) backwards compatible
2.) have very little performance impact (1 extra call upon connection to ZK)
3.) be isolated to one part of the code.


> Solr connection to Standalone node in Ensemble causes cluster failure
> ---------------------------------------------------------------------
>
>                 Key: SOLR-10284
>                 URL: https://issues.apache.org/jira/browse/SOLR-10284
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.3, 6.4
>         Environment: Solrcloud, with Zookeeper <any version>
>            Reporter: Ben DeMott
>


