Posted to commits@uniffle.apache.org by zh...@apache.org on 2022/10/22 15:10:06 UTC

[incubator-uniffle-website] 01/01: Add deploy doc (v2)

This is an automated email from the ASF dual-hosted git repository.

zhifgli pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-uniffle-website.git

commit e77d73f1fb8b7264ee93451113ea730c4b5b0719
Author: frankzfli <fr...@tencent.com>
AuthorDate: Sat Oct 22 21:50:15 2022 +0800

    Add deploy doc (v2)
---
 docs/01-intro.md                                   |   4 +
 docs/02-client-guide.md                            | 182 +++++++++++++++++++++
 docs/02-client_guide.md                            |   0
 docs/03-coordinator_guide.md                       |   0
 .../00-coordinator-guide.md                        | 111 +++++++++++++
 .../01-server-guide.md}                            |   0
 .../00-uniffle-operator-design.md                  | 137 ++++++++++++++++
 .../04-deploy-uniffle-cluster-on-k8s/01-install.md |  76 +++++++++
 .../02-examples.md                                 |  31 ++++
 docs/img/rss-crd-state-transition.png              | Bin 0 -> 71668 bytes
 10 files changed, 541 insertions(+)

diff --git a/docs/01-intro.md b/docs/01-intro.md
index f4a6da9..4773fc9 100644
--- a/docs/01-intro.md
+++ b/docs/01-intro.md
@@ -15,6 +15,10 @@
   ~ limitations under the License.
   -->
 
+---
+sidebar_position: 1
+---
+
 # What is Uniffle
 
 Uniffle is a Remote Shuffle Service, and provides the capability for Apache Spark applications
diff --git a/docs/02-client-guide.md b/docs/02-client-guide.md
new file mode 100644
index 0000000..dce7120
--- /dev/null
+++ b/docs/02-client-guide.md
@@ -0,0 +1,182 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+# Uniffle Shuffle Client Guide
+
+Uniffle is designed as a unified shuffle engine for multiple computing frameworks, including Apache Spark and Apache Hadoop.
+Uniffle provides pluggable client plugins to enable remote shuffle in Spark and MapReduce.
+
+## Deploy
+This document introduces how to deploy the Uniffle client plugins with Spark and MapReduce.
+
+### Deploy Spark Client Plugin
+
+1. Add the client jar to the Spark classpath, e.g., <SPARK_HOME>/jars/
+
+   The jar for Spark2 is located at <RSS_HOME>/jars/client/spark2/rss-client-XXXXX-shaded.jar
+
+   The jar for Spark3 is located at <RSS_HOME>/jars/client/spark3/rss-client-XXXXX-shaded.jar
+
+2. Update Spark conf to enable Uniffle, e.g.:
+
+   ```
+   spark.shuffle.manager org.apache.spark.shuffle.RssShuffleManager
+   spark.rss.coordinator.quorum <coordinatorIp1>:19999,<coordinatorIp2>:19999
+   # Note: For Spark2, spark.sql.adaptive.enabled should be false because Spark2 doesn't support AQE.
+   ```
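+
+   The same settings can also be passed at submission time. A minimal sketch (the main class and application jar are placeholders):
+
+   ```
+   spark-submit \
+     --conf spark.shuffle.manager=org.apache.spark.shuffle.RssShuffleManager \
+     --conf spark.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999 \
+     --class <main_class> <application_jar>
+   ```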
+
+### Support Spark Dynamic Allocation
+
+To support Spark dynamic allocation with Uniffle, the Spark code needs to be patched.
+Two reference patches, for spark-2.4.6 and spark-3.1.2, are provided in the spark-patches folder.
+
+After applying the patch and rebuilding Spark, add the following configuration to the Spark conf to enable dynamic allocation:
+  ```
+  spark.shuffle.service.enabled false
+  spark.dynamicAllocation.enabled true
+  ```
+
+### Deploy MapReduce Client Plugin
+
+1. Add the client jar to the classpath of each NodeManager, e.g., <HADOOP_HOME>/share/hadoop/mapreduce/
+
+   The jar for MapReduce is located at <RSS_HOME>/jars/client/mr/rss-client-mr-XXXXX-shaded.jar
+
+2. Update MapReduce conf to enable Uniffle, e.g.:
+
+   ```
+   -Dmapreduce.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999
+   -Dyarn.app.mapreduce.am.command-opts=org.apache.hadoop.mapreduce.v2.app.RssMRAppMaster
+   -Dmapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.RssMapOutputCollector
+   -Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.RssShuffle
+   ```
+Note that RssMRAppMaster automatically disables slow start (i.e., `mapreduce.job.reduce.slowstart.completedmaps=1`)
+and job recovery (i.e., `yarn.app.mapreduce.am.job.recovery.enable=false`).
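+
+Putting it together, a full job submission might look like the following minimal sketch (the example jar and the input/output paths are placeholders):
+
+```
+hadoop jar hadoop-mapreduce-examples.jar wordcount \
+  -Dmapreduce.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999 \
+  -Dyarn.app.mapreduce.am.command-opts=org.apache.hadoop.mapreduce.v2.app.RssMRAppMaster \
+  -Dmapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.RssMapOutputCollector \
+  -Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.RssShuffle \
+  <input_path> <output_path>
+```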
+
+
+## Configuration
+
+The important client configurations are listed below.
+
+### Common Setting
+These configurations are shared by all types of clients.
+
+|Property Name|Default|Description|
+|---|---|---|
+|<client_type>.rss.coordinator.quorum|-|Coordinator quorum|
+|<client_type>.rss.writer.buffer.size|3m|Buffer size for single partition data|
+|<client_type>.rss.storage.type|-|Supports MEMORY_LOCALFILE, MEMORY_HDFS, MEMORY_LOCALFILE_HDFS|
+|<client_type>.rss.client.read.buffer.size|14m|The max data size read from storage|
+|<client_type>.rss.client.send.threadPool.size|5|The thread pool size for sending shuffle data to shuffle servers|
+|<client_type>.rss.client.assignment.tags|-|The comma-separated list of tags for deciding the assignment of shuffle servers. Note that SHUFFLE_SERVER_VERSION is always used as an assignment tag, whether this conf is set or not|
+|<client_type>.rss.client.data.commit.pool.size|The number of assigned shuffle servers|The thread pool size for sending commits to shuffle servers|
+|<client_type>.rss.client.assignment.shuffle.nodes.max|-1|The number of required assignment shuffle servers. If it is less than or equal to 0, or greater than the coordinator's "rss.coordinator.shuffle.nodes.max", the value of "rss.coordinator.shuffle.nodes.max" is used by default|
+
+Notice:
+
+1. `<client_type>` should be `spark` or `mapreduce`
+
+2. `<client_type>.rss.coordinator.quorum` is compulsory, and other configurations are optional when coordinator dynamic configuration is enabled.
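+
+For example, the same quorum setting is expressed differently per client type (a sketch with placeholder addresses):
+
+```
+# Spark (set in the Spark conf)
+spark.rss.coordinator.quorum <coordinatorIp1>:19999,<coordinatorIp2>:19999
+
+# MapReduce (passed as -D options)
+-Dmapreduce.rss.coordinator.quorum=<coordinatorIp1>:19999,<coordinatorIp2>:19999
+```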
+
+### Adaptive Remote Shuffle Enabling
+
+To choose between the built-in shuffle and remote shuffle intelligently, Uniffle supports adaptive enabling.
+The client should use `DelegationRssShuffleManager` and provide a unique <access_id> so that the coordinator can decide whether to enable remote shuffle for it.
+
+```
+spark.shuffle.manager org.apache.spark.shuffle.DelegationRssShuffleManager
+spark.rss.access.id <access_id>
+```
+
+Notice:
+Currently, this feature only supports Spark.
+
+Other configuration:
+
+|Property Name|Default|Description|
+|---|---|---|
+|spark.rss.access.timeout.ms|10000|The timeout to access Uniffle coordinator|
+|spark.rss.client.access.retry.interval.ms|20000|The retry interval (ms) before falling back to SortShuffleManager|
+|spark.rss.client.access.retry.times|0|The number of retries before falling back to SortShuffleManager|
+
+
+### Client Quorum Setting
+
+Uniffle supports a client-side quorum protocol to tolerate shuffle server crashes.
+This feature is purely client-side behaviour: the shuffle writer sends each block to multiple servers, and shuffle readers can fetch block data from any one of them.
+Since sending multiple replicas of each block degrades shuffle performance and increases resource consumption, it is designed as an optional feature.
+
+|Property Name|Default|Description|
+|---|---|---|
+|<client_type>.rss.data.replica|1|The max number of servers to which the client sends each block in the quorum protocol|
+|<client_type>.rss.data.replica.write|1|The min number of servers to which each block must be sent successfully|
+|<client_type>.rss.data.replica.read|1|The min number of servers from which the client must fetch metadata successfully|
+
+Notice:
+
+1. `spark.rss.data.replica.write` + `spark.rss.data.replica.read` > `spark.rss.data.replica`
+
+Recommended examples:
+
+1. Performance First (default)
+```
+spark.rss.data.replica 1
+spark.rss.data.replica.write 1
+spark.rss.data.replica.read 1
+```
+
+2. Fault-tolerant First
+```
+spark.rss.data.replica 3
+spark.rss.data.replica.write 2
+spark.rss.data.replica.read 2
+```
+
+### Spark Specialized Setting
+
+The important configurations are listed below.
+
+|Property Name|Default|Description|
+|---|---|---|
+|spark.rss.writer.buffer.spill.size|128m|Buffer size for total partition data|
+|spark.rss.client.send.size.limit|16m|The max data size sent to shuffle server|
+|spark.rss.client.unregister.thread.pool.size|10|The max thread pool size for unregistering|
+|spark.rss.client.unregister.request.timeout.sec|10|The max timeout (seconds) when unregistering from remote shuffle servers|
+
+
+### MapReduce Specialized Setting
+
+|Property Name|Default|Description|
+|---|---|---|
+|mapreduce.rss.client.max.buffer.size|3k|The max buffer size on the map side|
+|mapreduce.rss.client.batch.trigger.num|50|The max batch of buffers for sending data on the map side|
+
+### Remote Spill (Experimental)
+
+In cloud environments, VMs may have very limited disk space and performance.
+This experimental feature allows reduce tasks to spill data to remote storage (e.g., HDFS).
+
+|Property Name|Default|Description|
+|---|---|---|
+|mapreduce.rss.reduce.remote.spill.enable|false|Whether to use remote spill|
+|mapreduce.rss.reduce.remote.spill.attempt.inc|1|The increment of reduce task attempts, since HDFS is more failure-prone than local disk|
+|mapreduce.rss.reduce.remote.spill.replication|1|The replication number to spill data to hdfs|
+|mapreduce.rss.reduce.remote.spill.retries|5|The retry number to spill data to hdfs|
+
+Notice: this feature requires the MEMORY_LOCALFILE_HDFS storage type.
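+
+For example, enabling remote spill for a MapReduce job might look like the following sketch (values are illustrative, taken from the defaults above):
+
+```
+-Dmapreduce.rss.storage.type=MEMORY_LOCALFILE_HDFS
+-Dmapreduce.rss.reduce.remote.spill.enable=true
+-Dmapreduce.rss.reduce.remote.spill.replication=1
+-Dmapreduce.rss.reduce.remote.spill.retries=5
+```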
\ No newline at end of file
diff --git a/docs/02-client_guide.md b/docs/02-client_guide.md
deleted file mode 100644
index e69de29..0000000
diff --git a/docs/03-coordinator_guide.md b/docs/03-coordinator_guide.md
deleted file mode 100644
index e69de29..0000000
diff --git a/docs/03-deploy-uniffle-cluster/00-coordinator-guide.md b/docs/03-deploy-uniffle-cluster/00-coordinator-guide.md
new file mode 100644
index 0000000..0afb99c
--- /dev/null
+++ b/docs/03-deploy-uniffle-cluster/00-coordinator-guide.md
@@ -0,0 +1,111 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+# Uniffle Coordinator Guide
+
+Uniffle is a unified remote shuffle service for compute engines. The coordinator is responsible for collecting the
+status of shuffle servers and assigning shuffle servers to jobs.
+
+## Deploy
+This document introduces how to deploy Uniffle coordinators.
+
+### Steps
+1. Unzip the package to <RSS_HOME>
+2. Update <RSS_HOME>/bin/rss-env.sh, e.g.:
+   ```
+     JAVA_HOME=<java_home>
+     HADOOP_HOME=<hadoop home>
+     XMX_SIZE="16g"
+   ```
+3. Update <RSS_HOME>/conf/coordinator.conf, e.g.:
+   ```
+     rss.rpc.server.port 19999
+     rss.jetty.http.port 19998
+     rss.coordinator.server.heartbeat.timeout 30000
+     rss.coordinator.app.expired 60000
+     rss.coordinator.shuffle.nodes.max 5
+     # enable dynamicClientConf, and coordinator will be responsible for most of client conf
+     rss.coordinator.dynamicClientConf.enabled true
+     # config the path of client conf
+     rss.coordinator.dynamicClientConf.path <RSS_HOME>/conf/dynamic_client.conf
+     # config the path of excluded shuffle server
+     rss.coordinator.exclude.nodes.file.path <RSS_HOME>/conf/exclude_nodes
+   ```
+4. Update <RSS_HOME>/conf/dynamic_client.conf. The RSS client will fetch the default conf from the coordinator, e.g.:
+   ```
+    # MEMORY_LOCALFILE_HDFS is recommended for production environments
+    rss.storage.type MEMORY_LOCALFILE_HDFS
+    # multiple remote storages are supported, and client will get assignment from coordinator
+    rss.coordinator.remote.storage.path hdfs://cluster1/path,hdfs://cluster2/path
+    rss.writer.require.memory.retryMax 1200
+    rss.client.retry.max 100
+    rss.writer.send.check.timeout 600000
+    rss.client.read.buffer.size 14m
+   ```
+
+5. Update <RSS_HOME>/conf/exclude_nodes. The coordinator reloads the excluded nodes from this file, e.g.:
+   ```
+    # each line is a shuffle server's IP and port, joined with "-"
+    110.23.15.36-19999
+    110.23.15.35-19996
+   ```
+
+6. Start the coordinator:
+   ```
+    bash <RSS_HOME>/bin/start-coordinator.sh
+   ```
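+
+7. (Optional) Verify the coordinator is up by probing the Jetty HTTP port configured in step 3 (a quick sketch; the host is a placeholder):
+   ```
+    curl http://<coordinator_host>:19998
+   ```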
+
+## Configuration
+
+### Common settings
+|Property Name|Default|Description|
+|---|---|---|
+|rss.coordinator.server.heartbeat.timeout|30000|Timeout (ms) if no heartbeat is received from a shuffle server|
+|rss.coordinator.server.periodic.output.interval.times|30|The multiplier that controls how often alive nodes are output to the log. The output interval (ms) is rss.coordinator.server.heartbeat.timeout / 3 * rss.coordinator.server.periodic.output.interval.times; the default output interval is 5 min.|
+|rss.coordinator.assignment.strategy|PARTITION_BALANCE|Strategy for assigning shuffle servers; PARTITION_BALANCE should be used for workload balance|
+|rss.coordinator.app.expired|60000|Application expired time (ms); the heartbeat interval should be less than it|
+|rss.coordinator.shuffle.nodes.max|9|The max number of shuffle servers used in an assignment|
+|rss.coordinator.exclude.nodes.file.path|-|The path of the configuration file listing excluded nodes|
+|rss.coordinator.exclude.nodes.check.interval.ms|60000|Update interval (ms) for excluded nodes|
+|rss.coordinator.access.checkers|org.apache.uniffle.coordinator.AccessClusterLoadChecker|The access checkers used when the Spark client uses DelegationRssShuffleManager; whether to use RSS is decided by the result of the specified access checkers|
+|rss.coordinator.access.loadChecker.memory.percentage|15.0|The minimal percentage of available memory of a server|
+|rss.coordinator.dynamicClientConf.enabled|false|Whether to enable dynamic client conf, which will be fetched by the Spark client|
+|rss.coordinator.dynamicClientConf.path|-|The path of the configuration file that holds the default conf for RSS clients; it can be stored in HDFS or locally|
+|rss.coordinator.dynamicClientConf.updateIntervalSec|120|The dynamic client conf update interval in seconds|
+|rss.coordinator.remote.storage.cluster.conf|-|Remote storage cluster related conf with format $clusterId,$key=$value, separated by ';'|
+|rss.rpc.server.port|-|RPC port for the coordinator|
+|rss.jetty.http.port|-|HTTP port for the coordinator|
+|rss.coordinator.remote.storage.select.strategy|APP_BALANCE|Strategy for selecting the remote path|
+|rss.coordinator.remote.storage.io.sample.schedule.time|60000|The interval (ms) at which sample reads and writes are scheduled to measure the I/O time of the different HDFS paths|
+|rss.coordinator.remote.storage.io.sample.file.size|204800000|The size of the file that the scheduled thread reads and writes|
+|rss.coordinator.remote.storage.io.sample.access.times|3|The number of times to read and write HDFS files|
+|rss.coordinator.startup-silent-period.enabled|false|Enable the startup silent period to reject assignment requests, avoiding partial assignments. To avoid service interruption, this mechanism is disabled by default. It is especially recommended in coordinator HA mode when restarting a single coordinator.|
+|rss.coordinator.startup-silent-period.duration|20000|The waiting duration (ms) when rss.coordinator.startup-silent-period.enabled is enabled.|
+
+### AccessClusterLoadChecker settings
+|Property Name|Default|Description|
+|---|---|---|
+|rss.coordinator.access.loadChecker.serverNum.threshold|-|The minimal required number of healthy shuffle servers when accessed by a client. When not specified, the required shuffle-server number from the client is used as the checking condition; if the client does not specify one either, the coordinator conf rss.coordinator.shuffle.nodes.max is adopted|
+
+### AccessCandidatesChecker settings
+AccessCandidatesChecker is one of the built-in access checkers; it allows users to define the list of candidates permitted to use RSS.
+
+|Property Name|Default|Description|
+|---|---|---|
+|rss.coordinator.access.candidates.updateIntervalSec|120|Accessed candidates update interval in seconds, which is only valid when AccessCandidatesChecker is enabled.|
+|rss.coordinator.access.candidates.path|-|Accessed candidates file path, the file can be stored on HDFS|
diff --git a/docs/04-server_guide.md b/docs/03-deploy-uniffle-cluster/01-server-guide.md
similarity index 100%
rename from docs/04-server_guide.md
rename to docs/03-deploy-uniffle-cluster/01-server-guide.md
diff --git a/docs/04-deploy-uniffle-cluster-on-k8s/00-uniffle-operator-design.md b/docs/04-deploy-uniffle-cluster-on-k8s/00-uniffle-operator-design.md
new file mode 100644
index 0000000..13c81da
--- /dev/null
+++ b/docs/04-deploy-uniffle-cluster-on-k8s/00-uniffle-operator-design.md
@@ -0,0 +1,137 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+# Uniffle Operator Design
+
+## Summary
+
+The purpose is to develop an operator to facilitate the rapid deployment of Uniffle in Kubernetes environments.
+
+## Motivation
+
+Using the advantages of Kubernetes in container orchestration, elastic scaling, and rolling upgrades, Uniffle can more
+easily manage coordinator and shuffle server clusters.
+
+In addition, based on the operating characteristics of shuffle servers, we hope to take them offline safely:
+
+1. Before a shuffle server is scaled down or upgraded, it should be added to the Coordinator's blacklist in advance.
+2. After ensuring that the number of remaining applications is 0, allow its corresponding pod to be deleted and removed
+   from the blacklist.
+
+We don't just want to simply bring up the coordinators and shuffle servers; we also want to ensure that running jobs
+are not affected. Therefore, we decided to develop a dedicated operator.
+
+## Goals
+
+Operator will implement the following functions:
+
+1. Bring up two coordinator deployments (to ensure active-active) and a shuffle server statefulSet.
+2. Support replica scaling and upgrades of coordinators and shuffle servers; shuffle servers also support
+   grayscale (canary) upgrades.
+3. Using the webhook mechanism, add a shuffle server's name to the coordinator's blacklist before it is deleted,
+   check the number of applications still running, and release the pod deletion request only after ensuring it is safe.
+
+## Design Details
+
+This operator consists of two components: a CRD controller and a webhook that admits CRD and pod requests.
+
+The CRD controller observes status changes of the CRD objects and drives the corresponding workload changes.
+
+The webhook validates changes to the CRD, and admits pod deletion requests according to whether the number of
+remaining applications is 0.
+
+The webhook will add the pod to be deleted to the coordinator's blacklist. When the pod is actually deleted, the
+controller will remove it from the blacklist.
+
+## CRD Definition
+
+An example of a CRD object is as follows:
+
+```yaml
+apiVersion: uniffle.apache.org/v1alpha1
+kind: RemoteShuffleService
+metadata:
+  name: rss-demo
+  namespace: kube-system
+spec:
+  # ConfigMapName indicates configMap name stores configurations of coordinators and shuffle servers.
+  configMapName: rss-demo
+  # Coordinator represents the relevant configuration of the coordinators.
+  coordinator:
+    # Image represents the container image used by coordinators.
+    image: ${coordinator-image}
+    # InitContainerImage is optional, mainly for non-root users to initialize host path permissions.
+    initContainerImage: "busybox:latest"
+    # Count is the number of coordinator workloads to be generated.
+    # By default, we will deploy two coordinators to ensure active-active.
+    count: 2
+    # RpcNodePort represents the port required by the rpc protocol of the coordinators,
+    # and the range is the same as the port range of the NodePort type service in kubernetes.
+    # Two ports are listed because two coordinators are deployed by default.
+    rpcNodePort:
+      - 30001
+      - 30011
+    # HttpNodePort represents the port required by the http protocol of the coordinators,
+    # and the range is the same as the port range of the NodePort type service in kubernetes.
+    # Two ports are listed because two coordinators are deployed by default.
+    httpNodePort:
+      - 30002
+      - 30012
+    # XmxSize indicates the xmx size configured for coordinators.
+    xmxSize: "10G"
+    # ConfigDir records the directory where the configuration of coordinators reside.
+    configDir: "/data/rssadmin/rss/conf"
+    # Replicas field is the replicas of each coordinator's deployment.
+    replicas: 1
+    # ExcludeNodesFilePath indicates exclude nodes file path in coordinators' containers.
+    excludeNodesFilePath: "/data/rssadmin/rss/coo/exclude_nodes"
+    # SecurityContext holds pod-level security attributes and common container settings.
+    securityContext:
+      # RunAsUser specifies the user ID of all processes in coordinator pods.
+      runAsUser: 1000
+      # FsGroup specifies the group ID of the owner of the volume within coordinator pods.
+      fsGroup: 1000
+    # LogHostPath represents the host path used to save logs of coordinators.
+    logHostPath: "/data/logs/rss"
+    # HostPathMounts field indicates host path volumes and their mounting path within coordinators' containers.
+    hostPathMounts:
+      /data/logs/rss: /data/rssadmin/rss/logs
+  # shuffleServer represents the relevant configuration of the shuffleServers
+  shuffleServer:
+    # Sync marks whether the shuffle server needs to be updated or restarted.
+    # When the user needs to update the shuffle servers, it needs to be set to true.
+    # After the update is successful, the controller will modify it to false.
+    sync: true
+    # Replicas field is the replicas of the shuffle servers' statefulSet.
+    replicas: 3
+    # Image represents the container image used by shuffle servers.
+    image: ${shuffle-server-image}
+```
+
+After a user creates an rss object, the rss-controller component will create the corresponding workloads.
+
+For coordinators, the user directly modifies the rss object, and the controller synchronizes the corresponding state to
+the workloads.
+
+For shuffle servers, the controller applies the corresponding updates to the workloads only after the
+spec.shuffleServer.sync field is changed to true, as in the sketch below.
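+
+For example, a minimal sketch of triggering a shuffle server update with kubectl (assuming the resource name remoteshuffleservice implied by the CRD file name, and the rss-demo object above):
+
+```
+kubectl -n kube-system patch remoteshuffleservice rss-demo --type=merge \
+  -p '{"spec":{"shuffleServer":{"sync":true}}}'
+```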
+
+For more examples, please see [examples](02-examples.md).
+
+## State Transition
+
+![state transition](../img/rss-crd-state-transition.png)
diff --git a/docs/04-deploy-uniffle-cluster-on-k8s/01-install.md b/docs/04-deploy-uniffle-cluster-on-k8s/01-install.md
new file mode 100644
index 0000000..400c50f
--- /dev/null
+++ b/docs/04-deploy-uniffle-cluster-on-k8s/01-install.md
@@ -0,0 +1,76 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+
+# Installation
+
+This section shows how to install the operator in a Kubernetes cluster.
+
+## Requirements
+
+1. Kubernetes 1.14+
+2. Kubectl 1.14+
+
+Please make sure kubectl is properly configured to interact with the Kubernetes environment.
+
+## Preparing Images of Coordinators and Shuffle Servers
+
+Run the following command from the root of the incubator-uniffle repository:
+
+```
+cd deploy/kubernetes/docker && sh build.sh --registry ${our-registry}
+```
+
+## Creating or Updating CRD
+
+We can refer
+to [crd yaml file](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/config/crd/bases/uniffle.apache.org_remoteshuffleservices.yaml).
+
+Run the following command:
+
+```
+kubectl apply -f ${crd-yaml-file}
+```
+
+## Setup or Update Uniffle Webhook
+
+We can refer to the [webhook yaml file](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/config/manager/rss-webhook.yaml).
+
+Run the following command:
+
+```
+kubectl apply -f ${webhook-yaml-file}
+```
+
+## Setup or Update Uniffle Controller
+
+We can refer to the [controller yaml file](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/config/manager/rss-controller.yaml).
+
+Run the following command:
+
+```
+kubectl apply -f ${controller-yaml-file}
+```
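+
+After applying the manifests, a quick check that the operator components are running (assuming they are deployed in the kube-system namespace, as in the CRD example):
+
+```
+kubectl -n kube-system get pods
+```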
+
+## How To Use
+
+We can learn more details about the usage of the CRD
+from the [uniffle operator design](00-uniffle-operator-design.md).
+
+## Examples
+
+Example uses of CRD have been [provided](02-examples.md).
diff --git a/docs/04-deploy-uniffle-cluster-on-k8s/02-examples.md b/docs/04-deploy-uniffle-cluster-on-k8s/02-examples.md
new file mode 100644
index 0000000..f8514c0
--- /dev/null
+++ b/docs/04-deploy-uniffle-cluster-on-k8s/02-examples.md
@@ -0,0 +1,31 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one or more
+  ~ contributor license agreements.  See the NOTICE file distributed with
+  ~ this work for additional information regarding copyright ownership.
+  ~ The ASF licenses this file to You under the Apache License, Version 2.0
+  ~ (the "License"); you may not use this file except in compliance with
+  ~ the License.  You may obtain a copy of the License at
+  ~
+  ~    http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing, software
+  ~ distributed under the License is distributed on an "AS IS" BASIS,
+  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  ~ See the License for the specific language governing permissions and
+  ~ limitations under the License.
+  -->
+
+
+# Examples
+
+We need to create a ConfigMap first, which stores the configuration of coordinators, shuffle servers and log4j (we can
+refer to [configuration](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/examples/configuration.yaml)).
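+
+The ConfigMap can then be created with a standard apply (a sketch; the file name follows the linked example):
+
+```
+kubectl apply -f configuration.yaml
+```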
+
+The coordinator is a stateless service; when upgrading, we can directly update the configuration and then update the image.
+
+The shuffle server is a stateful service, and its upgrade operation is more complicated, so we show examples of
+different upgrade modes:
+- [Full Upgrade](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/examples/full-upgrade)
+- [Full Restart](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/examples/full-restart)
+- [Partition Upgrade](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/examples/partition-upgrade)
+- [Specific Upgrade](https://github.com/apache/incubator-uniffle/tree/master/deploy/kubernetes/operator/examples/specific-upgrade)
diff --git a/docs/img/rss-crd-state-transition.png b/docs/img/rss-crd-state-transition.png
new file mode 100644
index 0000000..f5329b8
Binary files /dev/null and b/docs/img/rss-crd-state-transition.png differ