Posted to notifications@skywalking.apache.org by wu...@apache.org on 2023/03/29 13:44:14 UTC

[skywalking] branch master updated: Add Profiling related documentations (#10612)

This is an automated email from the ASF dual-hosted git repository.

wusheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/skywalking.git


The following commit(s) were added to refs/heads/master by this push:
     new fffedd3914 Add Profiling related documentations (#10612)
fffedd3914 is described below

commit fffedd391493feed07ac34ef9046daf6c3778679
Author: mrproliu <74...@qq.com>
AuthorDate: Wed Mar 29 21:44:02 2023 +0800

    Add Profiling related documentations (#10612)
---
 docs/en/changes/changes.md                         |   1 +
 docs/en/concepts-and-designs/profiling.md          |  26 ++-
 docs/en/guides/README.md                           |   6 -
 .../setup/backend/backend-continuous-profiling.md  |  71 ++++++++
 docs/en/setup/backend/backend-ebpf-profiling.md    | 198 +++++++++++++++++++++
 .../backend/backend-profile-thread-merging.md}     |   2 +-
 docs/en/setup/backend/backend-trace-profiling.md   | 102 +++++++++++
 docs/menu.yml                                      |   8 +
 8 files changed, 405 insertions(+), 9 deletions(-)

diff --git a/docs/en/changes/changes.md b/docs/en/changes/changes.md
index b76c3f3272..19a0ac4121 100644
--- a/docs/en/changes/changes.md
+++ b/docs/en/changes/changes.md
@@ -34,5 +34,6 @@
 
 #### Documentation
 
+* Add Profiling related documentation.
 
 All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/169?closed=1)
diff --git a/docs/en/concepts-and-designs/profiling.md b/docs/en/concepts-and-designs/profiling.md
index d198a372aa..6ddd84a039 100644
--- a/docs/en/concepts-and-designs/profiling.md
+++ b/docs/en/concepts-and-designs/profiling.md
@@ -8,10 +8,11 @@ These typical scenarios usually are suitable for profiling through various profi
 3. Massive RPC requests congest the network, causing slow responses.
 4. Unexpected network requests caused by security issues or code bugs.
 
-In the SkyWalking landscape, we provided two ways to support profiling within reasonable resource cost.
+In the SkyWalking landscape, we provide three ways to support profiling at a reasonable resource cost.
 
 1. In-process profiling is bundled with auto-instrument agents.
 2. Out-of-process profiling is powered by eBPF agent.
+3. Continuous profiling is powered by the eBPF agent.
 
 ## In-process profiling
 
@@ -79,4 +80,25 @@ Network profiling provides
 5. Observe time costs for local I/O costing on the OS. Such as the time of Linux process HTTP request/response.
 
 Learn more tech details from the post, [**Diagnose Service Mesh Network Performance with
-eBPF**](../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
\ No newline at end of file
+eBPF**](../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
+
+## Continuous Profiling
+
+Continuous Profiling monitors the system, processes, and network,
+and automatically initiates profiling tasks when the collected data meets the configured thresholds within the configured time windows.
+
+### Monitor Type
+
+Continuous profiling periodically collects the following types of performance metrics for processes and systems:
+1. System Load: Monitor current system load value.
+2. Process CPU: Monitor process CPU usage percent, value in [0-100].
+3. Process Thread Count: Monitor process thread count.
+4. HTTP Error Rate: Monitor the percentage of HTTP/1.x error responses (response status >= 500) of the process, value in [0-100].
+5. HTTP Avg Response Time: Monitor the average HTTP/1.x response duration (ms) of the process.
+
+### Trigger Target
+
+When the collected metric data matches the configured threshold, the following types of profiling tasks could be triggered:
+1. On CPU Profiling: Perform eBPF On CPU Profiling on processes that meet the threshold.
+2. Off CPU Profiling: Perform eBPF Off CPU Profiling on processes that meet the threshold.
+3. Network Profiling: Perform eBPF Network Profiling on all processes within the same instance as the processes that meet the threshold.
diff --git a/docs/en/guides/README.md b/docs/en/guides/README.md
index 4a26f9bbbf..e8c5a1cda9 100755
--- a/docs/en/guides/README.md
+++ b/docs/en/guides/README.md
@@ -128,11 +128,5 @@ We use [license-eye](https://github.com/apache/skywalking-eyes) to help you make
 - Add the new dependencies' notice files (if any) to `./dist-material/release-docs/NOTICE` if they are Apache 2.0 license. Copy their license files to `./dist-material/release-docs/licenses` if they are not standard Apache 2.0 license.
 - Copy the new dependencies' license file to `./dist-material/release-docs/licenses` if they are not standard Apache 2.0 license.
 
-## Profile
-The performance profile is an enhancement feature in the APM system. We use thread dump to estimate the method execution time, rather than adding multiple local spans. In this way, the cost would be significantly reduced compared to using distributed tracing to locate the slow method. This feature is suitable in the production environment. The following documents are key to understanding the essential parts of this feature.
-- [Profile data report protocol](https://github.com/apache/skywalking-data-collect-protocol/tree/master/profile) is provided through gRPC, just like other traces and JVM data.
-- [Thread dump merging mechanism](backend-profile.md) introduces the merging mechanism. This mechanism helps end users understand profile reports.
-- [Exporter tool of profile raw data](backend-profile-export.md) guides you on how to package the original profile data for issue reports when the visualization doesn't work well on the official UI.
-
 ## Release
 If you're a committer, read the [Apache Release Guide](How-to-release.md) to learn about how to create an official Apache version release in accordance with Apache's rules. As long as you keep our LICENSE and NOTICE, the Apache license allows everyone to redistribute.
diff --git a/docs/en/setup/backend/backend-continuous-profiling.md b/docs/en/setup/backend/backend-continuous-profiling.md
new file mode 100644
index 0000000000..5348e1b891
--- /dev/null
+++ b/docs/en/setup/backend/backend-continuous-profiling.md
@@ -0,0 +1,71 @@
+# Continuous Profiling
+
+Continuous profiling utilizes [eBPF](https://ebpf.io), process monitoring, and other technologies to collect data. 
+When the configured thresholds are met, it automatically starts profiling tasks. This corresponds to [Continuous Profiling](../../concepts-and-designs/profiling.md#continuous-profiling) in the concepts and designs.
+This approach helps identify performance bottlenecks and potential issues in a proactive manner, 
+allowing users to optimize their applications and systems more effectively.
+
+## Activate in the OAP
+Continuous profiling uses the same protocol service as eBPF Profiling, so you only need to ensure that the eBPF Profiling receiver is running.
+
+```yaml
+receiver-ebpf:
+  selector: ${SW_RECEIVER_EBPF:default}
+  default:
+```
+
+## Configuration of Continuous Profiling Policy
+
+Continuous profiling can be configured on a service entity, with the following fields in the configuration (a sketch of the structure follows the list):
+
+1. **Service**: The service entity for which you want to monitor the processes.
+2. **Targets**: Configuration conditions.
+   1. **Target Type**: Target profiling type, currently supporting On CPU Profiling, Off CPU Profiling, and Network Profiling.
+   2. **Check Items**: Detection conditions; only one of the configured rules needs to be met to start the task.
+       1. **Type**: Monitoring type, currently supporting "System Load", "Process CPU", "Process Thread Count", "HTTP Error Rate", "HTTP Avg Response Time". 
+       2. **Threshold**: Check if the monitoring value meets the specified expectations. 
+       3. **Period**: The time period for monitoring data, which can also be understood as the most recent duration. 
+       4. **Count**: The number of times the threshold is triggered within the detection period, which can also be understood as the total number of times the specified threshold rule is triggered in the most recent duration. Once the count check is met, the specified Profiling task will be started.
+       5. **URI**: For HTTP-related monitoring types, used to filter specific URIs.
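+
+For illustration, the policy structure could be modeled as below. This is a hypothetical sketch in Go; the names are illustrative and do not mirror the OAP's actual schema:
+
+```go
+package policy
+
+// ContinuousProfilingPolicy is a hypothetical model of the configuration
+// described above; field names are illustrative, not the OAP's schema.
+type ContinuousProfilingPolicy struct {
+    Service string         // service entity whose processes are monitored
+    Targets []PolicyTarget // one entry per profiling target type
+}
+
+type PolicyTarget struct {
+    TargetType string      // "ON_CPU", "OFF_CPU", or "NETWORK"
+    CheckItems []CheckItem // meeting any single rule starts the task
+}
+
+type CheckItem struct {
+    Type      string  // e.g. "PROCESS_CPU", "HTTP_ERROR_RATE"
+    Threshold float64 // expected bound for the monitored value
+    Period    int     // length of the monitoring window, in cycles
+    Count     int     // triggering cycles required within the window
+    URI       string  // optional URI filter for HTTP-related types
+}
+```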
+
+## Monitoring
+
+After saving the configuration, the eBPF agent can perform monitoring operations on the processes under the specified service based on the service-level configuration. 
+
+### Metrics
+
+While performing monitoring, the eBPF agent reports the monitoring data to the OAP for storage, making it more convenient to understand the real-time monitoring status. The main metrics include:
+
+| Monitor Type | Unit | Description |
+|--------------|------|-------------|
+| System Load | Load | System load average over a specified period. |
+| Process CPU | Percentage | The CPU usage of the process as a percentage. |
+| Process Thread Count | Count | The number of threads in the process. |
+| HTTP Error Rate | Percentage | The percentage of HTTP requests that result in error responses (e.g., 4xx or 5xx status codes). |
+| HTTP Avg Response Time | Millisecond | The average response time for HTTP requests. |
+
+### Threshold and Trigger
+
+In the eBPF agent, data is collected periodically, and a sliding time window is used to store the data from the most recent **Period** cycles. 
+The **Threshold** rule is used to verify whether the data within each cycle meets the specified criteria. 
+If the number of times the conditions are met within the sliding time window reaches the **Count** value, the corresponding profiling task is triggered.
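+
+As a minimal sketch of this mechanism (assumed logic, not Rover's actual implementation):
+
+```go
+package trigger
+
+// window is a minimal sliding-window sketch: it keeps the latest
+// `period` cycle results and reports whether at least `count` of
+// them breached the threshold.
+type window struct {
+    period, count int
+    results       []bool // true when a cycle's value met the threshold
+}
+
+// add records one collection cycle and reports whether a profiling
+// task should be triggered.
+func (w *window) add(value, threshold float64) bool {
+    w.results = append(w.results, value >= threshold)
+    if len(w.results) > w.period {
+        w.results = w.results[1:] // drop the oldest cycle
+    }
+    breaches := 0
+    for _, breached := range w.results {
+        if breached {
+            breaches++
+        }
+    }
+    return breaches >= w.count
+}
+```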
+
+The sliding time window technique ensures that the most recent and relevant data is considered when evaluating the conditions. 
+This approach allows for a more accurate and dynamic assessment of the system's performance, 
+making it possible to identify and respond to issues in a timely manner. 
+By triggering Profiling tasks when specific conditions are met, the system can automatically initiate performance analysis and help uncover potential bottlenecks or areas for improvement.
+
+#### Causes
+
+When the eBPF agent reports a Profiling task, it also reports the reason for triggering the Profiling task, which mainly includes the following information:
+
+1. **Process**: The specific process that triggered the policy.
+2. **Monitor Type**: The type of monitoring that was triggered.
+3. **Threshold**: The configured threshold value.
+4. **Current**: The monitoring value at the time the rule was triggered.
+
+#### Silence Period
+
+Upon triggering a continuous profiling task, the eBPF agent supports a feature that prevents re-triggering tasks within a specified period. 
+This feature is designed to prevent an unlimited number of profiling tasks from being initiated if the process continuously reaches the threshold, 
+which could potentially cause system issues.
\ No newline at end of file
diff --git a/docs/en/setup/backend/backend-ebpf-profiling.md b/docs/en/setup/backend/backend-ebpf-profiling.md
new file mode 100644
index 0000000000..9b039efd1b
--- /dev/null
+++ b/docs/en/setup/backend/backend-ebpf-profiling.md
@@ -0,0 +1,198 @@
+# eBPF Profiling
+
+eBPF Profiling utilizes the [eBPF](https://ebpf.io/) technology to monitor applications without requiring any modifications to the application itself. This corresponds to [Out-of-Process Profiling](../../concepts-and-designs/profiling.md#out-of-process-profiling).
+
+To use eBPF Profiling, the SkyWalking Rover application (eBPF Agent) needs to be installed on the host machine. 
+When the agent receives a Profiling task, it starts the Profiling task for the specific application to analyze performance bottlenecks for the corresponding type of Profiling.
+
+Learn more about eBPF profiling in the following blogs:
+1. [**Pinpoint Service Mesh Critical Performance Impact by using eBPF**](../../concepts-and-designs/ebpf-cpu-profiling.md)
+2. [**Diagnose Service Mesh Network Performance with eBPF**](../../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
+
+## Activate in the OAP
+OAP and the agent use a brand-new protocol to exchange eBPF Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-ebpf:
+  selector: ${SW_RECEIVER_EBPF:default}
+  default:
+```
+
+## Profiling type
+
+eBPF Profiling leverages eBPF technology to provide support for the following types of tasks:
+
+1. **On CPU Profiling**: Periodically samples the thread stacks of the current program while it's executing on the CPU using `PERF_COUNT_SW_CPU_CLOCK`.
+2. **Off CPU Profiling**: Collects and aggregates thread stacks when the program executes the kernel function `finish_task_switch`.
+3. **Network Profiling**: Collects the execution details of the application when performing network-related syscalls, and then aggregates them into a topology map and metrics for different network protocols.
+
+### On CPU Profiling
+
+On CPU Profiling periodically samples the thread stacks of the target program while it's executing on the CPU and aggregates the thread stacks to create a flame graph. 
+This helps users identify performance bottlenecks based on the flame graph information.
+
+#### Create task
+
+When creating an On CPU Profiling task, you need to specify which eligible processes need to be sampled. The required configuration information is as follows:
+
+1. **Service**: The service entity whose processes need to perform profiling tasks.
+2. **Labels**: Specifies which processes with certain labels under the service entity can perform profiling tasks. If left blank, all processes under the specified service will require profiling.
+3. **Start Time**: Whether the current task needs to be executed immediately or at a future point in time.
+4. **Duration**: The execution time of the current profiling task.
+
+The eBPF agent periodically asks the OAP whether there are any eligible tasks among all the processes it has collected. 
+When the eBPF agent receives a task, it starts the profiling task for the matching process, as in the sketch below.
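+
+A simplified sketch of that polling loop; the interval and function signatures are assumptions, since the real exchange happens over the eBPF profiling gRPC service:
+
+```go
+package agent
+
+import "time"
+
+// Task is an illustrative task descriptor; the real protocol carries
+// richer fields (target type, duration, and so on).
+type Task struct {
+    ID        string
+    ProcessID int32
+}
+
+// pollTasks sketches the agent's task-polling loop; fetch would call
+// the OAP and return tasks matching locally collected processes.
+func pollTasks(fetch func() []Task, start func(Task)) {
+    ticker := time.NewTicker(20 * time.Second) // hypothetical interval
+    defer ticker.Stop()
+    for range ticker.C {
+        for _, task := range fetch() {
+            go start(task) // run the profiling task concurrently
+        }
+    }
+}
+```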
+
+#### Profiling analysis
+
+Once the eBPF agent starts a profiling task for a specific process, it periodically collects data and reports it to the OAP. 
+At this point, a task schedule is generated. The scheduling data contains the following information:
+
+1. **Schedule ID**: The ID of the current schedule.
+2. **Task**: The task to which the current scheduling data belongs.
+3. **Process**: The process for which the current scheduling Profiling data is being collected.
+4. **Start Time**: The execution start time of the current schedule.
+5. **End Time**: The time when the last sampling of the current schedule was completed.
+
+Once the schedule is created, we can use the existing scheduling ID and time range to query the CPU execution situation of the specified process within a specific time period. 
+The query contains the following fields:
+1. **Schedule ID**: The schedule ID you want to query.
+2. **Time**: The start and end times you want to query.
+
+After the query, the following data would be returned. With the data, it's easy to generate a flame graph (see the sketch after this list):
+1. **Id**: Element ID.
+2. **Parent ID**: Parent element ID. The dependency relationship between elements can be determined using the element ID and parent element ID.
+3. **Symbol**: The symbol name of the current element. Usually, it represents the method names of thread stacks in different languages.
+4. **Stack Type**: The type of thread stack where the current element is located. Supports `KERNEL_SPACE` and `USER_SPACE`, which represent kernel mode and user mode, respectively.
+5. **Dump Count**: The number of times the current element was sampled. The more samples of a symbol, the longer the method's execution time.
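+
+For example, a minimal sketch that turns these elements into folded stacks, the input format common flame graph tooling consumes. It assumes **Dump Count** is inclusive of child samples:
+
+```go
+package analyze
+
+import "fmt"
+
+// Element mirrors the query result fields described above.
+type Element struct {
+    ID, ParentID string
+    Symbol       string
+    DumpCount    int
+}
+
+// PrintFolded emits "frame;frame;...;leaf <count>" lines, the folded
+// stack format that common flame graph tooling consumes.
+func PrintFolded(elements []Element) {
+    byID := make(map[string]Element, len(elements))
+    childSum := make(map[string]int)
+    for _, e := range elements {
+        byID[e.ID] = e
+        childSum[e.ParentID] += e.DumpCount
+    }
+    for _, e := range elements {
+        self := e.DumpCount - childSum[e.ID] // samples ending at this frame
+        if self <= 0 {
+            continue
+        }
+        stack := e.Symbol
+        for p, ok := byID[e.ParentID]; ok; p, ok = byID[p.ParentID] {
+            stack = p.Symbol + ";" + stack // prepend ancestor symbols
+        }
+        fmt.Printf("%s %d\n", stack, self)
+    }
+}
+```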
+
+### Off CPU Profiling
+
+Off CPU Profiling can analyze the thread state when a thread switch occurs in the current process, thereby determining performance loss caused by blocking on I/O, locks, timers, paging/swapping, and other reasons. 
+The execution flow between the eBPF agent and OAP in Off CPU Profiling is the same as in On CPU Profiling, but the data content being analyzed is different.
+
+#### Create task
+
+The process of creating an Off CPU Profiling task is the same as creating an On CPU Profiling task, 
+with the only difference being that the Profiling task type is changed to OFF CPU Profiling. For specific parameters, please refer to the [previous section](#on-cpu-profiling).
+
+#### Profiling analysis
+
+When the eBPF agent receives an Off CPU Profiling task, it also collects data and generates a schedule. 
+When analyzing the data, unlike On CPU Profiling, Off CPU Profiling can generate different flame graphs based on the following two aggregation methods (see the sketch after this list):
+1. **By Time**: Aggregate based on the time consumed by each method, allowing you to analyze which methods take longer.
+2. **By Count**: Aggregate based on the number of times a method switches to non-CPU execution, allowing you to analyze which methods cause more non-CPU executions for the task.
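+
+Conceptually, the two aggregations differ only in what is summed per stack, as in this illustrative sketch:
+
+```go
+package analyze
+
+// Sample is an illustrative off-CPU record: the folded stack plus how
+// long the thread stayed off-CPU; each record is one thread switch.
+type Sample struct {
+    Stack      string // e.g. "main;readFile;syscall.Read"
+    OffCPUNano int64  // nanoseconds spent off-CPU
+}
+
+// Aggregate produces the per-stack values behind the two flame graph
+// variants: off-CPU time ("By Time") or switch count ("By Count").
+func Aggregate(samples []Sample, byTime bool) map[string]int64 {
+    out := make(map[string]int64)
+    for _, s := range samples {
+        if byTime {
+            out[s.Stack] += s.OffCPUNano // long waits dominate
+        } else {
+            out[s.Stack]++ // frequent switches dominate
+        }
+    }
+    return out
+}
+```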
+
+### Network Profiling
+
+Network Profiling can analyze and monitor network requests related to a process and, based on the data, generate topology diagrams, metrics, and other information. 
+Furthermore, it can be integrated with existing Tracing systems to enhance the data content.
+
+#### Create task
+
+Unlike On/Off CPU Profiling, Network Profiling requires specifying the instance entity information when creating a task. 
+For example, in a Service Mesh, there may be multiple processes under a single instance (Pod), such as an application and Envoy. 
+In network analysis, they usually work together, so analyzing them together can give you a better understanding of the network execution situation of the Pod. 
+The following parameters are needed:
+
+1. **Instance**: The current Instance entity.
+2. **Sampling**: Sampling information for network requests.
+
+Sampling defines how the current system samples raw data and combines it with the existing tracing system, 
+allowing you to see the complete network data corresponding to a tracing span. 
+Currently, it supports sampling raw information for spans that use HTTP/1.x as the RPC protocol, and it can parse the SkyWalking and Zipkin protocols. 
+The sampling information configuration is as follows (a matching sketch follows the list):
+
+1. **URI Regex**: Only collect requests that match the specified URI. If empty, all requests will be collected.
+2. **Min Duration**: Only sample data with a response time greater than or equal to the specified duration. If empty, all requests will be collected.
+3. **When 4XX**: Only sample data with a response status code between 400 (inclusive) and 500 (exclusive).
+4. **When 5XX**: Only sample data with a response status code between 500 (inclusive) and 600 (exclusive).
+5. **Settings**: How to collect the data when network data meets the above rules.
+   1. **Require Complete Request**: Whether to collect request data.
+   2. **Max Request Size**: The maximum data size for collecting requests. If empty, all data will be collected.
+   3. **Require Complete Response**: Whether to collect response data.
+   4. **Max Response Size**: The maximum data size for collecting responses. If empty, all data will be collected.
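+
+A rough sketch of how these rules could be evaluated per request/response pair, assuming the URI filter gates the other conditions and the remaining conditions act as alternatives (the exact evaluation order is agent-internal):
+
+```go
+package sampling
+
+import "regexp"
+
+// Rule mirrors the sampling fields above; names are illustrative.
+type Rule struct {
+    URIRegex      *regexp.Regexp // nil means match all URIs
+    MinDurationMs int64          // 0 means no duration filter
+    When4XX       bool
+    When5XX       bool
+}
+
+// Matches reports whether a finished request/response pair should have
+// its raw data collected under this rule.
+func (r Rule) Matches(uri string, durationMs int64, status int) bool {
+    if r.URIRegex != nil && !r.URIRegex.MatchString(uri) {
+        return false // URI filter gates everything else
+    }
+    if r.MinDurationMs > 0 && durationMs >= r.MinDurationMs {
+        return true
+    }
+    if r.When4XX && status >= 400 && status < 500 {
+        return true
+    }
+    if r.When5XX && status >= 500 && status < 600 {
+        return true
+    }
+    return false
+}
+```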
+
+#### Profiling analysis
+
+After starting the task, the following data can be analyzed:
+
+1. **Topology**: Analyze the data flow and data types when the current instance interacts internally and externally.
+2. **TCP Metrics**: Network Layer-4 metrics between two processes.
+3. **HTTP/1.x Metrics**: If there are HTTP/1.x requests between two nodes, the HTTP/1.x metrics would be analyzed based on the data content.
+4. **HTTP Request**: If two nodes use HTTP/1.x and employ a tracing system, the tracing data would be extended with events.
+
+##### Topology
+
+The topology can generate two types of data:
+1. **Internal entities**: The network call relationships between all processes within the current instance.
+2. **Entities and external**: The call relationships between processes inside the entity and external network nodes.
+
+For external nodes, since eBPF can only collect remote IP and port information during data collection, 
+OAP can use Kubernetes cluster information to recognize the corresponding **Service** or **Pod** names.
+
+Between two nodes, data flow direction can be detected, and the following types of data protocols can be identified:
+
+1. **HTTP**: Two nodes communicate using HTTP/1.x or HTTP/2.x protocol.
+2. **HTTPS**: Two nodes communicate using HTTPS.
+3. **TLS**: Two nodes use encrypted data for transmission, such as when using `OpenSSL`.
+4. **TCP**: There is TCP data transmission between two nodes.
+
+##### TCP Metrics
+
+In the TCP metrics, each metric includes both **client-side** and **server-side** data. The metrics are as follows:
+
+|Name|Unit|Description|
+|----|----|------|
+|Write CPM|Count|Number of write requests initiated per minute|
+|Write Total Bytes|B|Total data size written per minute|
+|Write Avg Execute Time|ns|Average execution time for each write operation|
+|Write RTT|ns|Round Trip Time (RTT)|
+|Read CPM|Count|Number of read requests per minute|
+|Read Total Bytes|B|Total data size read per minute|
+|Read Avg Execute Time|ns|Average execution time for each read operation|
+|Connect CPM|Count|Number of new connections established per minute|
+|Connect Execute Time|ns|Time taken to establish a connection|
+|Close CPM|Count|Number of connections closed per minute|
+|Close Execute Time|ns|Time taken to close a connection|
+|Retransmit CPM|Count|Number of data retransmissions per minute|
+|Drop CPM|Count|Number of dropped packets per minute|
+
+##### HTTP/1.x Metrics
+
+If there is HTTP/1.x protocol communication between two nodes, the eBPF agent can recognize the request data and parse the following metric information:
+
+|Name|Unit|Description|
+|----|----|------|
+|Request CPM|Count|Number of requests received per minute|
+|Response Status CPM|Count|Number of occurrences of each response status code per minute|
+|Request Package Size|B|Average request package data size|
+|Response Package Size|B|Average response package data size|
+|Client Duration|ns|Time taken for the client to receive a response|
+|Server Duration|ns|Time taken for the server to send a response|
+
+##### HTTP Request
+
+If two nodes communicate using the HTTP/1.x protocol, and they employ a distributed tracing system, 
+then the eBPF agent can collect raw data according to the sampling rules configured in the previous sections.
+
+###### Sampling Raw Data
+
+When the sampling conditions are met, the original request or response data would be collected, including the following fields:
+
+1. **Data Size**: The data size of the current request/response content.
+2. **Data Content**: The raw data content. **Non-plain** format content would not be collected.
+3. **Data Direction**: The data transfer direction, either Ingress or Egress.
+4. **Data Type**: The data type, either Request or Response.
+5. **Connection Role**: The current node's role as a client or server.
+6. **Entity**: The entity information of the current process.
+7. **Time**: The time the request or response was sent/received.
+
+###### Syscall Event
+
+When sampling rules are applied, the related Syscall invocations for the request or response would also be collected, including the following information:
+
+1. **Method Name**: Syscall names such as `read`, `write`, `readv`, `writev`, etc.
+2. **Packet Size**: The current TCP packet size.
+3. **Packet Count**: The number of sent or received packets.
+4. **Network Interface Information**: The network interface from which the packet was sent.
diff --git a/docs/en/guides/backend-profile.md b/docs/en/setup/backend/backend-profile-thread-merging.md
similarity index 90%
rename from docs/en/guides/backend-profile.md
rename to docs/en/setup/backend/backend-profile-thread-merging.md
index e6d93404b0..fa687c71fa 100644
--- a/docs/en/guides/backend-profile.md
+++ b/docs/en/setup/backend/backend-profile-thread-merging.md
@@ -49,4 +49,4 @@ The reason for generating multiple top-level trees is that original data can be
     3. Calculate each node execution in parallel. For each node, the duration of the current node should deduct the time consumed by all children.
 
 ## Profile data debugging
-Please follow the [exporter tool](backend-profile-export.md#export-using-command-line) to package profile data. Unzip the profile data and use [analyzer main function](../../../oap-server/server-tools/profile-exporter/tool-profile-snapshot-bootstrap/src/test/java/org/apache/skywalking/oap/server/tool/profile/exporter/ProfileExportedAnalyze.java) to run it.
+Please follow the [exporter tool](../../guides/backend-profile-export.md#export-using-command-line) to package profile data. Unzip the profile data and use [analyzer main function](../../../../oap-server/server-tools/profile-exporter/tool-profile-snapshot-bootstrap/src/test/java/org/apache/skywalking/oap/server/tool/profile/exporter/ProfileExportedAnalyze.java) to run it.
\ No newline at end of file
diff --git a/docs/en/setup/backend/backend-trace-profiling.md b/docs/en/setup/backend/backend-trace-profiling.md
new file mode 100644
index 0000000000..4689e8582a
--- /dev/null
+++ b/docs/en/setup/backend/backend-trace-profiling.md
@@ -0,0 +1,102 @@
+# Trace Profiling
+
+Trace Profiling is bundled with the auto-instrument agent and corresponds to [In-Process Profiling](../../concepts-and-designs/profiling.md#in-process-profiling). 
+
+It is delivered to the agent in the form of a task, allowing for dynamic enabling or disabling. 
+Trace Profiling tasks can be created when an `endpoint` within a service experiences high latency. 
+When the agent receives the task, it periodically samples the thread stacks of requests hitting that endpoint. 
+Once the sampling is complete, the thread stack within the endpoint can be analyzed to determine the specific line of business code causing the performance issue.
+
+Learn more about trace profiling in [this blog](../../concepts-and-designs/sdk-profiling.md).
+
+## Activate in the OAP
+OAP and the agent use a brand-new protocol to exchange Trace Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-profile:
+  selector: ${SW_RECEIVER_PROFILE:default}
+  default:
+```
+
+## Trace Profiling Task with Analysis
+
+To use the Trace Profiling feature, please follow these steps:
+
+1. **Create profiling task**: Use the UI or CLI tool to create a task.
+2. **Generate requests**: Ensure that the service has generated requests.
+3. **Query task details**: Check that the created task has Trace data generated.
+4. **Analyze the data**: Analyze the Trace data to determine where performance bottlenecks exist in the service.
+
+### Create profiling task
+
+Creating a Trace Profiling task notifies all agent nodes belonging to the service entity which endpoint needs to perform the Trace Profiling feature. 
+This endpoint is typically an HTTP request path or an RPC request address.
+
+When creating a task, the following configuration fields are required (a sketch of the agent-side sampling logic follows the list):
+
+1. **Service**: The service whose agents need to execute the task.
+2. **Endpoint**: The specific endpoint name, such as "POST:/path/to/request".
+3. **Start Time**: The start time of the task, which can be executed immediately or at a future time.
+4. **Duration**: The duration of the task execution.
+5. **Min Duration Threshold**: The monitoring will only be triggered when the specified endpoint's execution time exceeds this threshold. This effectively prevents the collection of ineffective data due to short execution times.
+6. **Dump Period**: The thread stack collection period, which will trigger thread sampling every specified number of milliseconds.
+7. **Max Sampling Count**: The maximum number of traces that can be collected in a task. This effectively prevents the program execution from being affected by excessive trace sampling, such as the Stop The World situation in Java.
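+
+The last three fields interact roughly as in this sketch (written in Go for consistency with the other sketches; the Java agent's actual logic differs):
+
+```go
+package profiling
+
+import "time"
+
+// taskState shows how the last three fields bound a task; this is an
+// illustrative sketch, not the Java agent's actual implementation.
+type taskState struct {
+    minDurationThreshold time.Duration // skip requests that finish fast
+    dumpPeriod           time.Duration // interval between stack dumps
+    maxSamplingCount     int           // cap on profiled traces per task
+    profiledTraces       int
+}
+
+// shouldProfile decides whether a request already running for `elapsed`
+// should start being sampled every dumpPeriod.
+func (t *taskState) shouldProfile(elapsed time.Duration) bool {
+    if elapsed < t.minDurationThreshold {
+        return false // too fast to be worth profiling
+    }
+    if t.profiledTraces >= t.maxSamplingCount {
+        return false // sampling budget for this task is exhausted
+    }
+    t.profiledTraces++
+    return true
+}
+```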
+
+When the Agent receives a Trace Profiling task from OAP, it automatically generates a log to notify that the task has been acknowledged. The log contains the following field information:
+
+1. **Instance**: The name of the instance where the Agent is located.
+2. **Type**: Supports "NOTIFIED" and "EXECUTION_FINISHED", with the current log displaying "NOTIFIED".
+3. **Time**: The time when the Agent received the task.
+
+### Generate Requests
+
+At this point, tracing requests matching the specified endpoint and other conditions undergo profiling.
+
+Note that whether profiling is thread-sensitive depends on the agent-side implementation. The Java agent already supports cross-thread requests, so when a request involves cross-thread operations, those threads' stacks are also periodically sampled.
+
+### Query task details
+
+Once the Tracing request is completed, we can query the Tracing data associated with this Trace Profiling task, which includes the following information:
+
+1. **TraceId**: The Trace ID of the current request.
+2. **Instance**: The instance to which the current profiling data belongs.
+3. **Duration**: The total time taken by the current instance to process the Tracing request.
+4. **Spans**: The list of Spans associated with the current Tracing.
+   1. **SpanId**: The ID of the current span.
+   2. **Parent Span Id**: The ID of the parent span, allowing for a tree structure.
+   3. **SegmentId**: The ID of the segment to which the span belongs.
+   4. **Refs**: References of the current span, note that it only includes "CROSS_THREAD" type references.
+   5. **Service**: The service entity information to which the current span belongs.
+   6. **Instance**: The instance entity information to which the current span belongs.
+   7. **Time**: The start and end time of the current span.
+   8. **Endpoint Name**: The name of the current Span.
+   9. **Type**: The type of the current span, either "Entry", "Local", or "Exit".
+   10. **Peer**: The remote network address.
+   11. **Component**: The name of the component used by the current span.
+   12. **Layer**: The layer to which the current span belongs.
+   13. **Tags**: The tags information contained in the current span.
+   14. **Logs**: The log information in the current span.
+   15. **Profiled**: Whether the current span supports Profiling data analysis.
+   
+### Analyze the data
+
+Once we know which segments can be analyzed for profiling, we can then determine the time ranges available for thread stack analysis based on the "profiled" field in the span. Next, we can provide the following query content to analyze the data:
+
+1. **segmentId**: The segment to be analyzed. Segments are usually bound to individual threads, so we can determine which thread needs to be analyzed.
+2. **time range**: Includes the start and end time.
+
+By combining the segmentId with the time range, we can confirm the data for a specific thread during a specific time period. 
+This allows us to merge the thread stack data from the specified thread and time range and analyze which lines of code take longer to execute.
+The following fields help you understand the program execution (see the sketch after this list):
+1. **Id**: Used to identify the current thread stack frame.
+2. **Parent Id**: Combined with "id" to determine the hierarchical relationship.
+3. **Code Signature**: The method signature of the current thread stack frame.
+4. **Duration**: The total time consumed by the current thread stack frame.
+5. **Duration Child Excluded**: Excludes the child method calls of the current method, only obtaining the time consumed by the current method.
+6. **Count**: The number of times the current thread stack frame was sampled.
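+
+For instance, **Duration Child Excluded** can be derived from the tree as in this sketch:
+
+```go
+package analyze
+
+// Frame mirrors the fields above; durations are in the same unit the
+// query returns (illustrative).
+type Frame struct {
+    ID, ParentID string
+    Duration     int64
+}
+
+// SelfDurations computes "Duration Child Excluded" for every frame:
+// a frame's own time is its duration minus its children's durations.
+func SelfDurations(frames []Frame) map[string]int64 {
+    childSum := make(map[string]int64)
+    for _, f := range frames {
+        childSum[f.ParentID] += f.Duration
+    }
+    self := make(map[string]int64, len(frames))
+    for _, f := range frames {
+        self[f.ID] = f.Duration - childSum[f.ID]
+    }
+    return self
+}
+```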
+
+If you want to learn more about the thread stack merging mechanism, please read [this documentation](backend-profile-thread-merging.md).
+
+## Exporter
+
+If you find that the results of profiling data are not correct, you can report an issue through [this documentation](../../guides/backend-profile-export.md).
\ No newline at end of file
diff --git a/docs/menu.yml b/docs/menu.yml
index eed5df0560..fbf1bd695c 100644
--- a/docs/menu.yml
+++ b/docs/menu.yml
@@ -135,6 +135,14 @@ catalog:
             path: "/en/setup/backend/log-analyzer"
           - name: "On Demand Pod Logs"
             path: "/en/setup/backend/on-demand-pod-log"
+      - name: "Profiling"
+        catalog:
+          - name: "Trace Profiling"
+            path: "/en/setup/backend/backend-trace-profiling"
+          - name: "eBPF Profiling"
+            path: "/en/setup/backend/backend-ebpf-profiling"
+          - name: "Continuous Profiling"
+            path: "/en/setup/backend/backend-continuous-profiling"
       - name: "Extension"
         catalog:
           - name: "Exporter"