Posted to commits@aurora.apache.org by wf...@apache.org on 2015/09/09 17:08:43 UTC

aurora git commit: Adding notes on changing the scheduler quorum size.

Repository: aurora
Updated Branches:
  refs/heads/master 277382633 -> 4577de4dd


Adding notes on changing the scheduler quorum size.

Bugs closed: AURORA-1484

Reviewed at https://reviews.apache.org/r/38200/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/4577de4d
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/4577de4d
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/4577de4d

Branch: refs/heads/master
Commit: 4577de4dd4b48b4519d120aace8b94215cd1299d
Parents: 2773826
Author: Jeffrey Schroeder <je...@computer.org>
Authored: Wed Sep 9 08:08:38 2015 -0700
Committer: Bill Farner <wf...@apache.org>
Committed: Wed Sep 9 08:08:38 2015 -0700

----------------------------------------------------------------------
 docs/deploying-aurora-scheduler.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/4577de4d/docs/deploying-aurora-scheduler.md
----------------------------------------------------------------------
diff --git a/docs/deploying-aurora-scheduler.md b/docs/deploying-aurora-scheduler.md
index 8a1e68e..73f7b19 100644
--- a/docs/deploying-aurora-scheduler.md
+++ b/docs/deploying-aurora-scheduler.md
@@ -31,6 +31,9 @@ machines.  This guide helps you get the scheduler set up and troubleshoot some c
   - [Tasks are stuck in PENDING forever](#tasks-are-stuck-in-pending-forever)
     - [Symptoms](#symptoms-2)
     - [Solution](#solution-2)
+- [Changing Scheduler Quorum Size](#changing-scheduler-quorum-size)
+  - [Preparation](#preparation)
+  - [Adding New Schedulers](#adding-new-schedulers)
 
 ## Installing Aurora
 The Aurora scheduler is a standalone Java server. As part of the build process it creates a bundle
@@ -287,3 +290,19 @@ slaves are tagged with these two common failure domains to ensure that it can sa
 such that jobs are resilient to failure.
 
 See our [vagrant example](examples/vagrant/upstart/mesos-slave.conf) for details.
+
+## Changing Scheduler Quorum Size
+Take special care when changing the size of the Aurora scheduler quorum.
+Because Aurora uses a Mesos replicated log, follow steps similar to those for
+[changing the Mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide).
+
+### Preparation
+Increase [-native_log_quorum_size](storage-config.md#-native_log_quorum_size) on each
+existing scheduler, then restart each one. For example, when growing from 3 to 5 schedulers,
+the quorum size grows from 2 to 3.
+
+### Adding New Schedulers
+Start the new schedulers with `-native_log_quorum_size` set to the new value. Failing to
+first increase the quorum size on the running schedulers can, in some cases, corrupt or
+truncate the replicated log used by Aurora. If that happens, see the documentation on
+[recovering from backup](storage-config.md#recovering-from-a-scheduler-backup).
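
The quorum sizes in the Preparation example (2 for 3 schedulers, 3 for 5) match a strict majority of replicas. The following is a minimal sketch of that arithmetic, assuming the majority rule `floor(n / 2) + 1`. The helper name `majority_quorum` is hypothetical and is not part of Aurora or Mesos.

```python
def majority_quorum(num_schedulers: int) -> int:
    """Return the smallest replica count that forms a strict majority.

    Assumption: the quorum is a simple majority, floor(n / 2) + 1.
    """
    if num_schedulers < 1:
        raise ValueError("need at least one scheduler")
    return num_schedulers // 2 + 1


if __name__ == "__main__":
    # Print the flag value suggested by the majority rule for each cluster size.
    for n in (3, 5):
        print(f"{n} schedulers -> -native_log_quorum_size={majority_quorum(n)}")
```

Under this assumption, the value computed for the target cluster size is the one to apply to the existing schedulers before any new scheduler is started.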