Posted to dev@lucene.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2017/08/24 13:45:01 UTC
[jira] [Updated] (SOLR-11285) Support simulations at scale in the autoscaling framework

     [ https://issues.apache.org/jira/browse/SOLR-11285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated SOLR-11285:
-------------------------------------
    Attachment: SOLR-11285.patch

This patch extends the scope of the {{ClusterDataProvider}} interface, which was already used in the policy framework, to include ZK-like and Solr-like operations. These operations can then be delegated either to real ZK / Solr or to their mocks.
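
To illustrate the delegation idea, here is a minimal sketch and not the code from the patch. The names {{ClusterStateSource}}, {{SimulatedClusterStateSource}} and {{LiveNodeCounter}} are hypothetical and exist only for this example; the real {{ClusterDataProvider}} interface has a different shape. The point is only that a component written against an interface can run unchanged on top of an in-memory mock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified stand-in for a provider interface; NOT the real Solr API. */
interface ClusterStateSource {
    /** Returns the data stored at a ZK-like path, if any. */
    Optional<String> getData(String path);

    /** Stores data at a ZK-like path, creating the path if needed. */
    void setData(String path, String data);
}

/** In-memory implementation for simulations: no ZooKeeper or Solr process is needed. */
final class SimulatedClusterStateSource implements ClusterStateSource {
    private final Map<String, String> nodes = new HashMap<>();

    @Override
    public Optional<String> getData(String path) {
        return Optional.ofNullable(nodes.get(path));
    }

    @Override
    public void setData(String path, String data) {
        nodes.put(path, data);
    }
}

/** Hypothetical consumer that depends only on the interface, never on ZK classes. */
final class LiveNodeCounter {
    private final ClusterStateSource source;

    LiveNodeCounter(ClusterStateSource source) {
        this.source = source;
    }

    /** Counts comma-separated node names stored at the path; 0 if absent or empty. */
    int countLiveNodes(String path) {
        return source.getData(path)
                .filter(s -> !s.isEmpty())
                .map(s -> s.split(",").length)
                .orElse(0);
    }
}
```

A second implementation of the same interface could forward each call to a real ZK client; the consumer would not need to change.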

Changes in this patch allow (almost...) running {{OverseerTriggerThread}} with simulated ZK / Solr. One open issue is that {{Assign}} uses {{ReplicaAssigner}}, which uses snitches and {{CoreContainer}}. In this patch I punted on changing this because it's too entangled, but it could probably be changed to use the same approach as {{SolrClientDataProvider.getNodeValues}}.

Another open issue is the refactoring of {{DistributedQueue}} and the widening of "throws" clauses, which are no longer ZK-specific. This probably needs to be partially reverted, or a set of specialized exception classes needs to be introduced to replace the ZK-specific ones.
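
As a rough sketch of the second option (specialized exception classes), the example below translates a backend-specific exception into a backend-neutral one at the abstraction boundary. Every name here ({{ClusterStateException}}, {{NoSuchPathException}}, {{BackendNoNodeException}}, {{QueueBackend}}, {{NeutralQueue}}) is hypothetical and chosen for illustration; none of them comes from the patch or from Solr.

```java
/** Hypothetical backend-neutral base exception; the name is illustrative only. */
class ClusterStateException extends Exception {
    ClusterStateException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical neutral subtype for a missing path. */
class NoSuchPathException extends ClusterStateException {
    NoSuchPathException(String path, Throwable cause) {
        super("No such path: " + path, cause);
    }
}

/** Stand-in for a backend-specific exception (for example, one thrown by a ZK client). */
class BackendNoNodeException extends Exception {
    BackendNoNodeException(String path) {
        super(path);
    }
}

/** Hypothetical backend that may be real or simulated. */
interface QueueBackend {
    String peek(String path) throws BackendNoNodeException;
}

/** Queue facade whose "throws" clause mentions only neutral exception types. */
final class NeutralQueue {
    private final QueueBackend backend;

    NeutralQueue(QueueBackend backend) {
        this.backend = backend;
    }

    String peek(String path) throws ClusterStateException {
        try {
            return backend.peek(path);
        } catch (BackendNoNodeException e) {
            // Translate at the boundary; keep the original exception as the cause.
            throw new NoSuchPathException(path, e);
        }
    }
}
```

With this shape, callers catch narrow, neutral exception types instead of a widened "throws Exception", and the original cause stays available for debugging.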

The patch came out quite large, but most of it is pretty rote substitutions / renames to use the {{ClusterDataProvider}} interface instead of {{ZkStateReader}}, {{SolrZkClient}} etc. It's probably best to review the changes using branch {{jira/solr-11285}}.

> Support simulations at scale in the autoscaling framework
> ---------------------------------------------------------
>
>                 Key: SOLR-11285
>                 URL: https://issues.apache.org/jira/browse/SOLR-11285
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>         Attachments: SOLR-11285.patch
>
>
> This is a spike to investigate how difficult it would be to modify the autoscaling framework so that it's possible to run simulated large-scale experiments and test its dynamic behavior without actually spinning up a large cluster.
> Currently many components rely heavily on actual Solr, ZK and the behavior of ZK watches, or insist on making actual HTTP calls. A notable exception is the core Policy framework, where most of the ZK / Solr details are abstracted.
> As the algorithms for autoscaling that we implement become more and more complex, the ability to effectively run multiple large simulations will be crucial - it's very easy to unknowingly introduce catastrophic instabilities that don't manifest themselves in regular unit tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
