Posted to notifications@skywalking.apache.org by ke...@apache.org on 2021/03/17 07:14:42 UTC
[skywalking] branch master updated: Refine FAQ (#6560)

This is an automated email from the ASF dual-hosted git repository.

kezhenxu94 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/skywalking.git


The following commit(s) were added to refs/heads/master by this push:
     new bc77726  Refine FAQ (#6560)
bc77726 is described below

commit bc77726333f704e059c85e682e3a4becce17c306
Author: Wing <69...@users.noreply.github.com>
AuthorDate: Wed Mar 17 15:14:24 2021 +0800

    Refine FAQ (#6560)
---
 docs/en/FAQ/v6-version-upgrade.md  | 34 ++++++++++++++++------------------
 docs/en/FAQ/v8-version-upgrade.md  | 14 +++++++-------
 docs/en/FAQ/vnode.md               | 20 ++++++++------------
 docs/en/FAQ/why_mq_not_involved.md | 38 +++++++++++++++++---------------------
 4 files changed, 48 insertions(+), 58 deletions(-)

diff --git a/docs/en/FAQ/v6-version-upgrade.md b/docs/en/FAQ/v6-version-upgrade.md
index 1ee6989..117a934 100644
--- a/docs/en/FAQ/v6-version-upgrade.md
+++ b/docs/en/FAQ/v6-version-upgrade.md
@@ -1,30 +1,28 @@
 # V6 upgrade
-SkyWalking v6 is widely used in many production environments. Users may wants to upgrade to an old release to new.
-This is a guidance to tell users how to do that.
+SkyWalking v6 is widely used in many production environments. Follow the steps in the guide below to learn how to upgrade to a new release.
 
-**NOTICE**, the following ways are not the only ways to do upgrade.
+**NOTE**: The ways to upgrade are not limited to the steps below. 
 
 ## Use Canary Release
-Like all applications, SkyWalking could use `canary release` method to upgrade by following these steps
-1. Deploy a new cluster by using the latest(or new) version of SkyWalking OAP cluster with new database cluster.
-1. Once the target(being monitored) service has chance to upgrade the agent.jar(or just simply reboot), change the `collector.backend_service`
+Like all applications, you may upgrade SkyWalking using the `canary release` method through the following steps.
+1. Deploy a new cluster by using the latest version of SkyWalking OAP cluster with the new database cluster.
+1. Once the target service (i.e. the service being monitored) has upgraded the agent.jar (or has simply been rebooted), have `collector.backend_service`
 pointing to the new OAP backend, and use/add a new namespace(`agent.namespace` in [Table of Agent Configuration Properties](../setup/service-agent/java-agent/README.md#table-of-agent-configuration-properties)).
-The namespace will avoid the conflict between different versions.
+The namespace will prevent conflicts from arising between different versions.
 1. When all target services have been rebooted, the old OAP clusters could be discarded.
 
-`Canary Release` methods works for any version upgrade.
+The `Canary Release` method works for any version upgrade.
 
 ## Online Hot Reboot Upgrade
-The reason we required `Canary Release` is, SkyWalking agent has cache mechanisms, switching to a new cluster makes the 
-cache unavailable for new OAP cluster.
-In the 6.5.0+(especially for agent version), we have [**Agent hot reboot trigger mechanism**](../setup/backend/backend-setup.md#agent-hot-reboot-trigger-mechanism-in-oap-server-upgrade).
-By using that, we could do upgrade an easier way, **deploy a new cluster by using the latest(or new) version of SkyWalking OAP cluster with new database cluster**,
-and shift the traffic to the new cluster once for all. Based on the mechanism, all agents will go into `cool_down` mode, then
-back online. More detail, read the backend setup document.
+The reason we require `Canary Release` is that the SkyWalking agent has cache mechanisms, and switching to a new cluster causes the 
+cache to become unavailable for new OAP clusters.
+In version 6.5.0+ (especially for agent versions), we have [**Agent hot reboot trigger mechanism**](../setup/backend/backend-setup.md#agent-hot-reboot-trigger-mechanism-in-oap-server-upgrade).
+This streamlines the upgrade process as we **deploy a new cluster by using the latest version of SkyWalking OAP cluster with the new database cluster**,
+and shift the traffic to the new cluster once and for all. Based on the mechanism, all agents will enter the `cool_down` mode, and come
+back online. For more details, see the backend setup documentation.
 
-**NOTICE**, as a known bug in 6.4.0, its agent could have re-connection issue, so, even this bot reboot mechanism included in 6.4.0,
-it may not work in some network scenarios, especially in k8s.
+**NOTE**: A known bug in 6.4.0 is that its agent may have re-connection issues; therefore, even though this hot reboot mechanism has been included in 6.4.0, it may not work under some network scenarios, especially in Kubernetes.
 
 ## Agent Compatibility
-All versions of SkyWalking 6.x(even 7.x) are compatible with each others, so users could only upgrade the OAP servers first. 
-The agent is also enhanced from version to version, so from SkyWalking team's recommendations, upgrade the agent once you have the chance.
+All versions of SkyWalking 6.x (and even 7.x) are compatible with each other, so users could simply upgrade the OAP servers. 
+As the agent has also been enhanced in the latest versions, according to the SkyWalking team's recommendation, upgrade the agent as soon as practicable.
diff --git a/docs/en/FAQ/v8-version-upgrade.md b/docs/en/FAQ/v8-version-upgrade.md
index 2fc6e65..df0d001 100644
--- a/docs/en/FAQ/v8-version-upgrade.md
+++ b/docs/en/FAQ/v8-version-upgrade.md
@@ -1,10 +1,10 @@
 # V8 upgrade
-SkyWalking v8 begins to use [v3 protocol](../protocols/README.md), so, it is incompatible with previous releases.
-Users who intend to upgrade in v8 series releases could follow this guidance.
+Starting from SkyWalking v8, the [v3 protocol](../protocols/README.md) has been used. This makes it incompatible with previous releases.
+Users who intend to upgrade in v8 series releases could follow the steps below.
 
 
-Register in v6 and v7 has been removed in v8 for better scaling out performance, please upgrade in the following ways.
-1. Use a different storage or a new namespace. Also, could consider erasing the whole storage index/table(s) related to SkyWalking.
-1. Deploy the whole SkyWalking cluster, and expose in a new network address.
-1. If you are using the language agents, upgrade the new agents too, meanwhile, make sure the agent has supported the different language.
-And set up the backend address to the new SkyWalking OAP cluster.
\ No newline at end of file
+Registers in v6 and v7 have been removed in v8 for better scaling out performance. Please upgrade following the instructions below.
+1. Use a different storage or a new namespace. You may also consider erasing the whole storage indexes or tables related to SkyWalking.
+2. Deploy the whole SkyWalking cluster, and expose it in a new network address.
+3. If you are using language agents, upgrade the agents too; meanwhile, make sure each language you use has a supported agent.
+Then, set up the backend address to the new SkyWalking OAP cluster.
diff --git a/docs/en/FAQ/vnode.md b/docs/en/FAQ/vnode.md
index 71fa44f..4ad94e3 100644
--- a/docs/en/FAQ/vnode.md
+++ b/docs/en/FAQ/vnode.md
@@ -1,19 +1,15 @@
 # What is VNode?
-In the trace page, sometimes, people could find there are nodes named **VNode** as the span name, and there is no attribute 
-for this span.
+On the trace page, you may sometimes find nodes with their spans named **VNode**, and that there are no attributes for such spans.
 
-**VNode** is created by the UI itself, rather than reported from the agent or tracing SDK. It represents there are some
-span(s) missed from the trace data in this query.
+**VNode** is created by the UI itself, rather than being reported by the agent or tracing SDK. It indicates that some spans are missing from the trace data in this query.
 
 ## How does the UI detect the missing span(s)?
-The UI real check the parent spans and reference segments of all spans, if a parent id(segment id + span id) can't be found,
+The UI checks the parent spans and reference segments of all spans in real time. If a parent ID (segment ID + span ID) cannot be found,
 then it creates a VNode automatically.
 
-## How does this happen?
-The VNode was introduced, because there are some cases which could cause the trace data are not always completed.
-1. The agent fail-safe mechanism activated. The SkyWalking agent has the capability to abandon the trace data, if
-there is agent->OAP network issue(unconnected, slow network speed), or the performance of the OAP cluster is not enough
-to process all traces. 
-1. Some plugins could have bugs, then some segments in the trace never stop correctly, it is hold in the memory.
+## How did this happen?
+The VNode appears when the trace data is incomplete.
+1. The agent fail-safe mechanism has been activated. The SkyWalking agent could abandon the trace data if there are any network issues between the agent and the OAP (e.g. failure to connect, slow network speeds, etc.), or if the OAP cluster is not capable of processing all traces. 
+2. Some plug-ins may have bugs, and some segments in the trace do not stop correctly and are held in the memory.
 
-In these cases, the trace would not exist in the query. Then VNode shows up. 
+In such cases, the trace would not exist in the query, so the VNode shows up. 
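
Editor's sketch (not part of the commit): the detection rule described in vnode.md above can be illustrated roughly as follows. This is a minimal sketch, not the actual UI code; the `Span` shape and the `add_vnodes` helper are hypothetical names chosen only for illustration, and the real UI also considers reference segments, which this sketch folds into a single `parent` field.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    # Hypothetical span shape: a span is identified by (segment_id, span_id).
    segment_id: str
    span_id: int
    name: str
    # (segment id, span id) of the parent; None for a root span.
    parent: Optional[Tuple[str, int]] = None


def add_vnodes(spans: List[Span]) -> List[Span]:
    """Return the spans plus one placeholder VNode per missing parent ID."""
    known = {(s.segment_id, s.span_id) for s in spans}
    missing: List[Tuple[str, int]] = []
    for s in spans:
        # A parent that is referenced but absent from the query result
        # is what the FAQ calls a missing span.
        if s.parent is not None and s.parent not in known and s.parent not in missing:
            missing.append(s.parent)
    # The placeholder carries no attributes, only the name "VNode".
    vnodes = [Span(seg, sid, "VNode") for seg, sid in missing]
    return list(spans) + vnodes


if __name__ == "__main__":
    trace = [
        Span("seg-a", 0, "GET /order"),
        Span("seg-a", 1, "db", ("seg-a", 0)),
        Span("seg-b", 0, "consumer", ("seg-x", 3)),  # parent never reported
    ]
    for span in add_vnodes(trace):
        print(span)
```

Creating one placeholder per missing parent ID (rather than one per orphaned span) keeps sibling spans grouped under the same VNode, which is an assumption of this sketch rather than documented behavior.
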
diff --git a/docs/en/FAQ/why_mq_not_involved.md b/docs/en/FAQ/why_mq_not_involved.md
index 08e285f..21b1bb3 100644
--- a/docs/en/FAQ/why_mq_not_involved.md
+++ b/docs/en/FAQ/why_mq_not_involved.md
@@ -1,30 +1,26 @@
-# Why doesn't SkyWalking involve MQ in the architecture?
-People usually ask about these questions when they know SkyWalking at the first time.
-They think MQ should be better in the performance and supporting high throughput, like the following
+# Why doesn't SkyWalking involve MQ in its architecture?
+This is often asked by those who are first introduced to SkyWalking. Many believe that MQ should have better performance and should be able to support higher throughput, like the following:
 
 <img src="MQ-involved-architecture.png"/>
 
-Here are the reasons the SkyWalking's opinions.
+Here's what we think.
 
-### Is MQ a good or right way to communicate with OAP backend?
-This question comes out when people think about what happens when the OAP cluster is not powerful enough or offline. 
-But I want to ask the questions before answer this.
-1. Why do you think OAP should be not powerful enough? As it is not, the speed of data analysis wouldn't catch up with producers(agents). Then what is the point of adding new deployment requirement?
-1. Maybe you will argue says, the payload is sometimes higher than usual as there is hot business time. But, my question is how much higher? 
-1. If less than 40%, how many resources will you use for the new MQ cluster? How about moving them to new OAP and ES nodes?
-1. If higher than 40%, such as 70%-2x times? Then, I could say, your MQ wastes more resources than it saves. 
-Your MQ would support 2x-3x payload, and with 10%-20% cost in general time. Furthermore, in this case, 
-if the payload/throughput are so high, how long the OAP cluster could catch up. I would say never before it catches up, the next hot time event is coming.
+### Is MQ appropriate for communicating with the OAP backend?
+This question arises when users consider the circumstances where the OAP cluster may not be powerful enough or becomes offline. 
+But the following issues must first be addressed:
+1. Why do you think that the OAP is not powerful enough? If it is not, the speed of data analysis will never catch up with the producers (i.e. the agents). Then what is the point of adding new deployment requirements?
+1. Some may argue that the payload is sometimes higher than usual during peak times. But we must consider how much higher the payload really is.
+1. If it is higher by less than 40%, how many resources would you use for the new MQ cluster? How about moving them to new OAP and ES nodes?
+1. Say it is higher by 40% or more, such as by 70% to 200%. Then, it is likely that your MQ would use up more resources than it saves. 
+Your MQ would support 2 to 3 times the payload using 10%-20% of the cost during usual times. Furthermore, in this case, 
+if the payload/throughput are so high, how long would it take for the OAP cluster to catch up? The challenge here is that well before it catches up, the next peak times would have come.
 
-Besides all this analysis, why do you want the traces still 100%, as you are costing so many resources? 
-Better than this, 
-we could consider adding a better dynamic trace sampling mechanism at the backend, 
-when throughput goes over the threshold, active the sampling rate to 100%->10% step by step, 
-which means you could get the OAP and ES 3 times more powerful than usual, just ignore the traces at hot time.
+With the analysis above in mind, why would you still want to keep 100% of the traces, given the resources they would cost? 
+A better approach would be to add a dynamic trace sampling mechanism at the backend. When throughput exceeds the threshold, gradually lower the sampling rate from 100% to 10%, which means you could get the OAP and ES 3 times more powerful than usual, while ignoring some of the traces at peak times.
 
-### Is MQ transport acceptable even there are several side effects?
-Even MQ transport is not recommended from the production perspective, SkyWalking still has optional plugins named
+### Is MQ transport recommended despite its side effects?
+Even though MQ transport is not recommended from the production perspective, SkyWalking still provides optional plugins named
 `kafka-reporter` and `kafka-fetcher` for this feature since 8.1.0. 
 
 ### How about MQ metrics data exporter?
-I would say, it is already available there. Exporter module with gRPC default mechanism is there. It is easy to provide a new implementor of that module.
\ No newline at end of file
+The answer is that the MQ metrics data exporter is already available. The exporter module, with gRPC as its default mechanism, is there, and you can easily provide a new implementor of this module.
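
Editor's sketch (not part of the commit): the step-by-step sampling idea proposed in why_mq_not_involved.md above could look roughly like this. It is a minimal illustration under stated assumptions, not an existing SkyWalking feature or API: the function name, the 10-point step, and the behavior of raising the rate again once throughput drops are all assumptions made here.

```python
def next_sampling_rate(current_rate: float, throughput: float, threshold: float,
                       step: float = 0.1, floor: float = 0.1) -> float:
    """Move the sampling rate one step per evaluation.

    While throughput is above the threshold, lower the rate by `step`
    until it reaches `floor` (10% by default). Otherwise raise it by
    `step` until it is back at 100% (recovery is an assumption of this
    sketch; the FAQ only describes the step-down).
    """
    if throughput > threshold:
        return max(floor, round(current_rate - step, 2))
    return min(1.0, round(current_rate + step, 2))


if __name__ == "__main__":
    rate = 1.0
    # Hypothetical throughput samples (segments per second) around a peak.
    for observed in [800, 1500, 1800, 2000, 1900, 900, 700]:
        rate = next_sampling_rate(rate, observed, threshold=1000)
        print(f"throughput={observed} -> sampling rate {rate:.0%}")
```

Stepping gradually instead of jumping straight to 10% avoids discarding most traces during a brief spike, at the cost of reacting more slowly to a sustained one.
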