You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/07/22 00:59:49 UTC

[GitHub] [pulsar] yjshen commented on a change in pull request #4770: Add Upgrade Guide to Apache Pulsar

yjshen commented on a change in pull request #4770:  Add Upgrade Guide to Apache Pulsar
URL: https://github.com/apache/pulsar/pull/4770#discussion_r305646222
 
 

 ##########
 File path: site2/docs/administration-upgrade.md
 ##########
 @@ -0,0 +1,182 @@
+---
+id: administration-upgrade
+title: The Pulsar Upgrade Guide
+sidebar_label: Upgrade
+---
+
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+## Overview
+
+Apache Pulsar is comprised of multiple components, zookeeper, bookies and brokers. These components are either
+stateful or stateless. They will be taken care of differently. In general, the ZooKeeper nodes don't necessarily
+need to be upgraded. Bookies are stateful so that's where you need to take the most care. Brokers and proxies
+are stateless so less care is needed.
+
+This page provides a guideline on upgrading a Pulsar cluster. Please consider the below guidelines in preparation
+for upgrading.
+
+- Always back up all your configuration files before upgrading.
+- Read through the documentation and draft an upgrade plan that matches your specific requirements and environment
+  before starting the upgrade process. In other words, don't start working through this guide on a live cluster.
+  Read guide entirely, make a plan, and then execute the plan.
+- Pay careful consideration to the order in which components are upgraded. In general, you don't need to upgrade
+  your zookeeper or configuration store cluster (the global zookeeper cluster). For the remaining components, you
+  need to upgrade bookies first, upgrade brokers and proxies next, and then upgrade your clients.
+- If autorecovery is enabled, you need to disable `autorecovery` during the whole upgrade process,
+  and re-enable it after the process is finished.
+- Read the release notes carefully for each release. Not only do they contain information about noteworthy features,
+  but also changes to configuration that may impact your upgrade.
+- Always upgrade a small subset of nodes of each type to canary test the new version before upgrading all nodes of that type
+  in the cluster. When you have upgraded the canary nodes, allow them to run for a while to ensure that they are working correctly.
+- Always upgrade one datacenter to verify new version before upgrading all datacenters if your cluster is running in multi-clusters
+  replicated mode.
+
+Following is the detailed sequence to upgrade an Apache Pulsar cluster. If there is anything specials needed to be taken care,
+we will describe the details for individual versions in "Upgrade Guides" section.
+
+## Sequence
+
+Follow the sequence below on upgrading an Apache Pulsar cluster
+
+> Typically you don't need to upgrade zookeeper unless there are strong reasons documented in "Upgrade Guides" section below.
+
+1. Upgrade ZooKeeper (optionally)
+  1. Canary Test: test an upgraded version in one or small set of zookeeper nodes.
+  2. Rolling Upgrade: rollout the upgraded version to all zookeeper servers incrementally, one at a time. Don't be too fast and always
+     monitor your dashboard during the whole rolling upgrade process.
+2. Upgrade Bookies
+  1. Canary Test: test an upgraded version in one or small set of bookies.
+  2. Rolling Upgrade:
+    1. Disable *autorecovery* by running the following command:
+       ```shell
+       bin/bookkeeper shell autorecovery -disable
+       ```
+    2. rollout the upgraded version to all bookies in the cluster once you determined a version is safe after canary.
+    3. After all the bookies are upgraded to run the new version, re-enable *autorecovery*:
+       ```shell
+       bin/bookkeeper shell autorecovery -enable
+       ```
+3. Upgrade Brokers
+  1. Canary Test: test an upgraded version in one or small set of brokers.
+  2. Rolling Upgrade: rollout the upgraded version to all brokers in the cluster once you determined a version is safe after canary.
+4. Upgrade Proxies
+  1. Canary Test: test an upgraded version in one or small set of proxies.
+  2. Rolling Upgrade: rollout the upgraded version to all proxies in the cluster once you determined a version is safe after canary.
+
+## Upgrade Bookies
+
+> You can also read Apache BookKeeper [Upgrade Guide](http://bookkeeper.apache.org/docs/latest/admin/upgrade) for more details.
+
+### Canary Test
+
+It is wise to test an upgraded version in one or small set of bookies before upgrading all bookies in your live cluster.
+
+You can follow below steps on how to canary a upgraded version:
+
+1. Stop a Bookie.
+2. Upgrade the binary and configuration.
+3. Start the Bookie in `ReadOnly` mode. This can be used to verify if the Bookie of this new version can run well for read workload.
+   ```shell
+   bin/pulsar bookie --readOnly
+   ```
+4. Once the Bookie is running at `ReadOnly` mode successfully for a while, stop the bookie and restart it in `Write/Read` mode.
+   ```shell
+   bin/pulsar bookie
+   ```
+5. After step 4, the Bookie will serve both write and read traffic.
+
+#### Canary Rollback
+
+If problems occur during canarying an upgraded version, you can simply take down the problematic Bookie node. The remain bookies in the old cluster
+will repair this problematic bookie node by autorecovery. Nothing needs to be worried about.
 
 Review comment:
   Does it assume autorecovery is enabled? Is it a standard configuration for bookies? What if it's disabled? 
   
   Does it mean all the ledgers in this bookie are replicated to the left nodes? What should I do if I don't want to shrink the size of my bookie cluster? Should I manually delete all the data in this node and added it back to bookie cluster then?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services