Posted to notifications@skywalking.apache.org by "mrproliu (via GitHub)" <gi...@apache.org> on 2023/03/29 12:23:32 UTC

[GitHub] [skywalking] mrproliu opened a new pull request, #10612: Add Profiling related documentations

mrproliu opened a new pull request, #10612:
URL: https://github.com/apache/skywalking/pull/10612

   
   - [x] If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #10558 .
   - [x] Update the [`CHANGES` log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@skywalking.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10612:
URL: https://github.com/apache/skywalking/pull/10612#discussion_r1151949783


##########
docs/en/setup/backend/backend-ebpf-profiling.md:
##########
@@ -0,0 +1,198 @@
+# EBPF Profiling
+
+eBPF Profiling uses [eBPF](https://ebpf.io/) technology to monitor applications without requiring any modifications to the application itself. It corresponds to [Out-of-Process Profiling](../../concepts-and-designs/profiling.md#out-of-process-profiling).
+
+To use eBPF Profiling, the SkyWalking Rover application (eBPF Agent) needs to be installed on the host machine. 
+When the agent receives a Profiling task, it starts the Profiling task for the specific application to analyze performance bottlenecks for the corresponding type of Profiling.
+
+Learn more about eBPF profiling in the following blogs:
+1. [**Pinpoint Service Mesh Critical Performance Impact by using eBPF**](../../concepts-and-designs/ebpf-cpu-profiling.md)
+2. [**Diagnose Service Mesh Network Performance with eBPF**](../../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
+
+## Active in the OAP
+OAP and the agent use a brand-new protocol to exchange eBPF Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-ebpf:
+  selector: ${SW_RECEIVER_EBPF:default}
+  default:
+```
+
+## Profiling type
+
+eBPF Profiling leverages eBPF technology to provide support for the following types of tasks:
+
+1. **On CPU Profiling**: Periodically samples the thread stacks of the current program while it's executing on the CPU using `PERF_COUNT_SW_CPU_CLOCK`.
+2. **Off CPU Profiling**: Collects and aggregates thread stacks when the program executes the kernel function `finish_task_switch`.
+3. **Network Profiling**: Collects the execution details of the application when performing network-related syscalls, and then aggregates them into a topology map and metrics for different network protocols.
+
+### On CPU Profiling
+
+On CPU Profiling periodically samples the thread stacks of the target program while it's executing on the CPU and aggregates the thread stacks to create a flame graph. 
+This helps users identify performance bottlenecks based on the flame graph information.
+
+#### Creating task
+
+When creating an On CPU Profiling task, you need to specify which eligible processes need to be sampled. The required configuration information is as follows:
+
+1. **Service**: The service entity whose processes should be profiled.
+2. **Labels**: Restricts profiling to the processes under the service entity that carry the given labels. If left blank, all processes under the specified service are profiled.
+3. **Start Time**: Whether the current task needs to be executed immediately or at a future point in time.
+4. **Duration**: The execution time of the current profiling task.
+
+Once the task is created, the eBPF agent would periodically request from the OAP whether there are any eligible tasks among all the processes collected by the current eBPF agent. 

Review Comment:
   ```suggestion
   The eBPF agent would periodically request from the OAP whether there are any eligible tasks among all the processes collected by the current eBPF agent. 
   ```
   
   I think the periodic request doesn't depend on task creation.
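
The point of this comment can be sketched in a few lines of Python. This is only an illustration and not Rover's actual code: `Task`, `Process`, `eligible`, and `poll_once` are hypothetical names, and the rule that a process must carry every label of the task is an assumption. It shows that a polling round runs whether or not a task exists; when there is no task, the round simply returns nothing.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Task:
    # Hypothetical shape of a profiling task returned by the OAP.
    task_id: str
    service: str
    labels: frozenset  # empty set means "all processes under the service"


@dataclass(frozen=True)
class Process:
    # Hypothetical shape of a process collected by the eBPF agent.
    pid: int
    service: str
    labels: frozenset


def eligible(task: Task, process: Process) -> bool:
    """Assumed rule: same service, and the process carries every task label."""
    return task.service == process.service and task.labels <= process.labels


def poll_once(
    query_tasks: Callable[[], List[Task]], processes: List[Process]
) -> List[Tuple[str, int]]:
    """One polling round: ask the OAP for tasks, then pair them with local processes."""
    tasks = query_tasks()
    return [(t.task_id, p.pid) for t in tasks for p in processes if eligible(t, p)]


processes = [
    Process(101, "checkout", frozenset({"mesh-envoy"})),
    Process(102, "checkout", frozenset()),
]
# No task has been created yet: the round still runs and matches nothing.
print(poll_once(lambda: [], processes))
# A task with blank labels matches every process of the service.
print(poll_once(lambda: [Task("t1", "checkout", frozenset())], processes))
```

In a real agent such a round would run on a timer; the callable passed as `query_tasks` stands in for the request to the OAP.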





[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10612:
URL: https://github.com/apache/skywalking/pull/10612#discussion_r1151948354


##########
docs/en/setup/backend/backend-ebpf-profiling.md:
##########
@@ -0,0 +1,198 @@
+# EBPF Profiling

Review Comment:
   ```suggestion
   # eBPF Profiling
   ```
   
   eBPF should be the only form.





[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10612:
URL: https://github.com/apache/skywalking/pull/10612#discussion_r1151962482


##########
docs/en/setup/backend/backend-trace-profiling.md:
##########
@@ -0,0 +1,102 @@
+# Trace Profiling
+
+Trace Profiling is bound within the auto-instrument agent and corresponds to [In-Process Profiling](../../concepts-and-designs/profiling.md#in-process-profiling). 
+
+It is delivered to the agent in the form of a task, allowing for dynamic enabling or disabling. 
+Trace Profiling tasks can be created when an `endpoint` within a service experiences high latency. 
+When the agent receives the task, it periodically samples the thread stack related to the endpoint when requested. 
+Once the sampling is complete, the thread stack within the endpoint can be analyzed to determine the specific line of business code causing the performance issue.
+
+To learn more about trace profiling, [please read this blog](../../concepts-and-designs/sdk-profiling.md).
+
+## Active in the OAP
+OAP and the agent use a brand-new protocol to exchange Trace Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-profile:
+  selector: ${SW_RECEIVER_PROFILE:default}
+  default:
+```
+
+## Trace Profiling Task with Analysis
+
+To use the Trace Profiling feature, please follow these steps:
+
+1. **Create profiling task**: Use the UI or CLI tool to create a task.
+2. **Generate requests**: Ensure that the service has generated requests.
+3. **Query task details**: Check that the created task has Trace data generated.
+4. **Analyze the data**: Analyze the Trace data to determine where performance bottlenecks exist in the service.
+
+### Create profiling task
+
+Creating a Trace Profiling task notifies all agent nodes that run the service entity which endpoint should be profiled. 
+This Endpoint is typically an HTTP request or an RPC request address.
+
+When creating a task, the following configuration fields are required:
+
+1. **Service**: Which agent under the service needs to be monitored.
+2. **Endpoint**: The specific endpoint name, such as "POST:/path/to/request."
+3. **Start Time**: The start time of the task, which can be executed immediately or at a future time.
+4. **Duration**: The duration of the task execution.
+5. **Min Duration Threshold**: The monitoring will only be triggered when the specified endpoint's execution time exceeds this threshold. This effectively prevents the collection of ineffective data due to short execution times.
+6. **Dump Period**: The thread stack collection period, which will trigger thread sampling every specified number of milliseconds.
+7. **Max Sampling Count**: The maximum number of traces that can be collected in a task. This effectively prevents the program execution from being affected by excessive trace sampling, such as the Stop The World situation in Java.
+
+When the Agent receives a Trace Profiling task from OAP, it automatically generates a log to notify that the task has been acknowledged. The log contains the following field information:
+
+1. **Instance**: The name of the instance where the Agent is located.
+2. **Type**: Supports "NOTIFIED" and "EXECUTION_FINISHED", with the current log displaying "NOTIFIED".
+3. **Time**: The time when the Agent received the task.
+
+### Generate Requests
+
+At this point, Tracing requests matching the specified Endpoint would undergo Profiling.
+
+Notice, the Java Agent already supports cross-thread requests, so when a request involves cross-thread operations, it would also be periodically sampled for thread stack.

Review Comment:
   ```suggestion
   Note that whether profiling is thread-sensitive depends on the agent-side implementation. The Java Agent already supports cross-thread requests, so when a request involves cross-thread operations, those threads are also sampled periodically for their thread stacks.
   ```





[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10612:
URL: https://github.com/apache/skywalking/pull/10612#discussion_r1151951236


##########
docs/en/setup/backend/backend-ebpf-profiling.md:
##########
@@ -0,0 +1,198 @@
+# EBPF Profiling
+
+eBPF Profiling uses [eBPF](https://ebpf.io/) technology to monitor applications without requiring any modifications to the application itself. It corresponds to [Out-of-Process Profiling](../../concepts-and-designs/profiling.md#out-of-process-profiling).
+
+To use eBPF Profiling, the SkyWalking Rover application (eBPF Agent) needs to be installed on the host machine. 
+When the agent receives a Profiling task, it starts the Profiling task for the specific application to analyze performance bottlenecks for the corresponding type of Profiling.
+
+Learn more about eBPF profiling in the following blogs:
+1. [**Pinpoint Service Mesh Critical Performance Impact by using eBPF**](../../concepts-and-designs/ebpf-cpu-profiling.md)
+2. [**Diagnose Service Mesh Network Performance with eBPF**](../../academy/diagnose-service-mesh-network-performance-with-ebpf.md)
+
+## Active in the OAP
+OAP and the agent use a brand-new protocol to exchange eBPF Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-ebpf:
+  selector: ${SW_RECEIVER_EBPF:default}
+  default:
+```
+
+## Profiling type
+
+eBPF Profiling leverages eBPF technology to provide support for the following types of tasks:
+
+1. **On CPU Profiling**: Periodically samples the thread stacks of the current program while it's executing on the CPU using `PERF_COUNT_SW_CPU_CLOCK`.
+2. **Off CPU Profiling**: Collects and aggregates thread stacks when the program executes the kernel function `finish_task_switch`.
+3. **Network Profiling**: Collects the execution details of the application when performing network-related syscalls, and then aggregates them into a topology map and metrics for different network protocols.
+
+### On CPU Profiling
+
+On CPU Profiling periodically samples the thread stacks of the target program while it's executing on the CPU and aggregates the thread stacks to create a flame graph. 
+This helps users identify performance bottlenecks based on the flame graph information.
+
+#### Creating task
+
+When creating an On CPU Profiling task, you need to specify which eligible processes need to be sampled. The required configuration information is as follows:
+
+1. **Service**: The service entity whose processes should be profiled.
+2. **Labels**: Restricts profiling to the processes under the service entity that carry the given labels. If left blank, all processes under the specified service are profiled.
+3. **Start Time**: Whether the current task needs to be executed immediately or at a future point in time.
+4. **Duration**: The execution time of the current profiling task.
+
+Once the task is created, the eBPF agent would periodically request from the OAP whether there are any eligible tasks among all the processes collected by the current eBPF agent. 
+When the eBPF agent receives a task, it would start the profiling task with the process.
+
+#### Profiling analyze
+
+Once the eBPF agent starts a profiling task for a specific process, it would periodically collect data and report it to the OAP. 
+At this point, a schedule for the task is generated. The scheduling data contains the following information:
+
+1. **Schedule ID**: The ID of the current schedule.
+2. **Task**: The task to which the current scheduling data belongs.
+3. **Process**: The process for which the current scheduling Profiling data is being collected.
+4. **Start Time**: The execution start time of the current schedule.
+5. **End Time**: The time when the last sampling of the current schedule was completed.
+
+At this point, we can use the existing scheduling ID and time range to query the CPU execution situation of the specified process within a specific time period. 

Review Comment:
   ```suggestion
   Once the schedule is created, we can use the existing scheduling ID and time range to query the CPU execution situation of the specified process within a specific time period. 
   ```





[GitHub] [skywalking] wu-sheng commented on a diff in pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng commented on code in PR #10612:
URL: https://github.com/apache/skywalking/pull/10612#discussion_r1151961147


##########
docs/en/setup/backend/backend-trace-profiling.md:
##########
@@ -0,0 +1,102 @@
+# Trace Profiling
+
+Trace Profiling is bound within the auto-instrument agent and corresponds to [In-Process Profiling](../../concepts-and-designs/profiling.md#in-process-profiling). 
+
+It is delivered to the agent in the form of a task, allowing for dynamic enabling or disabling. 
+Trace Profiling tasks can be created when an `endpoint` within a service experiences high latency. 
+When the agent receives the task, it periodically samples the thread stack related to the endpoint when requested. 
+Once the sampling is complete, the thread stack within the endpoint can be analyzed to determine the specific line of business code causing the performance issue.
+
+To learn more about trace profiling, [please read this blog](../../concepts-and-designs/sdk-profiling.md).
+
+## Active in the OAP
+OAP and the agent use a brand-new protocol to exchange Trace Profiling data, so it is necessary to start OAP with the following configuration:
+
+```yaml
+receiver-profile:
+  selector: ${SW_RECEIVER_PROFILE:default}
+  default:
+```
+
+## Trace Profiling Task with Analysis
+
+To use the Trace Profiling feature, please follow these steps:
+
+1. **Create profiling task**: Use the UI or CLI tool to create a task.
+2. **Generate requests**: Ensure that the service has generated requests.
+3. **Query task details**: Check that the created task has Trace data generated.
+4. **Analyze the data**: Analyze the Trace data to determine where performance bottlenecks exist in the service.
+
+### Create profiling task
+
+Creating a Trace Profiling task notifies all agent nodes that run the service entity which endpoint should be profiled. 
+This Endpoint is typically an HTTP request or an RPC request address.
+
+When creating a task, the following configuration fields are required:
+
+1. **Service**: Which agent under the service needs to be monitored.
+2. **Endpoint**: The specific endpoint name, such as "POST:/path/to/request."
+3. **Start Time**: The start time of the task, which can be executed immediately or at a future time.
+4. **Duration**: The duration of the task execution.
+5. **Min Duration Threshold**: The monitoring will only be triggered when the specified endpoint's execution time exceeds this threshold. This effectively prevents the collection of ineffective data due to short execution times.
+6. **Dump Period**: The thread stack collection period, which will trigger thread sampling every specified number of milliseconds.
+7. **Max Sampling Count**: The maximum number of traces that can be collected in a task. This effectively prevents the program execution from being affected by excessive trace sampling, such as the Stop The World situation in Java.
+
+When the Agent receives a Trace Profiling task from OAP, it automatically generates a log to notify that the task has been acknowledged. The log contains the following field information:
+
+1. **Instance**: The name of the instance where the Agent is located.
+2. **Type**: Supports "NOTIFIED" and "EXECUTION_FINISHED", with the current log displaying "NOTIFIED".
+3. **Time**: The time when the Agent received the task.
+
+### Generate Requests
+
+At this point, Tracing requests matching the specified Endpoint would undergo Profiling.

Review Comment:
   ```suggestion
   At this point, Tracing requests matching the specified Endpoint and other conditions would undergo Profiling.
   ```
   
   We also have the `duration` condition.
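
A minimal sketch of what "endpoint and other conditions" could look like, using only the task fields listed in the diff above. The class `TraceProfilingTask`, the function `should_profile`, and the exact combination of checks are assumptions made for illustration; the real agent logic may differ.

```python
from dataclasses import dataclass


@dataclass
class TraceProfilingTask:
    # Field names mirror the documented task configuration; the class itself is hypothetical.
    endpoint: str
    start_time_ms: int
    duration_ms: int
    min_duration_threshold_ms: int
    max_sampling_count: int
    sampled: int = 0  # traces already picked up by this task


def should_profile(
    task: TraceProfilingTask, endpoint: str, now_ms: int, elapsed_ms: int
) -> bool:
    """Assumed check: endpoint match plus the time window, threshold, and count limits."""
    in_window = task.start_time_ms <= now_ms < task.start_time_ms + task.duration_ms
    return (
        endpoint == task.endpoint
        and in_window
        and elapsed_ms > task.min_duration_threshold_ms
        and task.sampled < task.max_sampling_count
    )


task = TraceProfilingTask(
    endpoint="POST:/path/to/request",
    start_time_ms=0,
    duration_ms=60_000,
    min_duration_threshold_ms=500,
    max_sampling_count=5,
)
# Matching endpoint, but the request has only run for 100 ms so far: not profiled.
print(should_profile(task, "POST:/path/to/request", now_ms=1_000, elapsed_ms=100))
# Same endpoint after 800 ms of execution: profiled.
print(should_profile(task, "POST:/path/to/request", now_ms=1_000, elapsed_ms=800))
```

A request on the right endpoint is therefore not enough on its own, which is why the suggested wording mentions the other conditions.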





[GitHub] [skywalking] wu-sheng merged pull request #10612: Add Profiling related documentations

Posted by "wu-sheng (via GitHub)" <gi...@apache.org>.
wu-sheng merged PR #10612:
URL: https://github.com/apache/skywalking/pull/10612

