You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/12/03 09:56:56 UTC

[GitHub] [flink] 1996fanrui opened a new pull request, #21447: [FLINK-30185] Provide the flame graph to the subtask level

1996fanrui opened a new pull request, #21447:
URL: https://github.com/apache/flink/pull/21447

   ## What is the purpose of the change
   
   Provide the flame graph to the subtask level
   
   ## Brief change log
   
   - [FLINK-30185][rest][refactor] Change the cache key of flame graph info from job vertex to execution vertex
   - [FLINK-30185][rest] Add the subtask index parameter for flame graph handler
   - [FLINK-30185][ui] Allow choose the subtask index in the flame graph page
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
     - If yes, how is the feature documented? docs
   
   
   
   Currently, this pr is a draft, I'm writing the unit tests and add the doc later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1357004999

   Hi @xintongsong , I have finished the step1, could you help take a look in your free time?
   
   And could I finish the step2 in the next PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1054272283


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   > I'd suggest to change TaskThreadInfoResponse to contain a Map<ExecutionAttemptID, ThreadInfoSample>. This class, as indicated by its name, is specific for a Flink task thread.
   
   Updated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1051736030


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   I'd suggest to change `TaskThreadInfoResponse` to contain a `Map<ExecutionAttemptID, ThreadInfoSample>`. This class, as indicated by its name, is specific for a Flink task thread.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1053207231


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   Sorry for missing this message yesterday. I added the `<OWNER>`  last weekend, could you help take a look first? 
   
   If it isn't ok, I can refactor it according to your suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1354053526

   > The execution-vertex cache just saves the result when requesting the flame graph for subtask level, right?
   
   Yes.
   
   > Is it typo? For the subtask flame graph, we should first check the execution-vertex cache, then the job-vertex cache, right?
   
   Well, there's no typo, but I think the order doesn't really matter. Checking the job-vertex cache first would allow us to access the execution-vertex cache with a simple computeIfAbsent-style logic. But that depends on the detailed implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1053913594


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   Thanks for your quick feedback, I will refactor it asap.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1354154092

   > In the first step, we do not touch how thread samples are captured / cached, and only provide apis for building flame graph with samples from a specific subtask. This should be good enough for addressing the original demand, which is to provide drill-down visualizations for the operator-level flame graph.
   
   Do you mean that we get the result of subtask level from job-vertex-cache in the first step, right? And don't trigger any thread sampling when get the result of subtask level, right?
   
   > I think the order doesn't really matter. Checking the job-vertex cache first would allow us to access the execution-vertex cache with a simple computeIfAbsent-style logic. But that depends on the detailed implementation.
   
   In the second step, we get the result of subtask level from `job-vertex cache` and `execution-vertex cache`, and choose the newer one, right? And trigger the subtask thread sampling when the `execution-vertex cache` is empty or stale, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1372327247

   Thanks @xintongsong and @yangjunhan 's review🤔


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1049319867


##########
flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/threadinfo/JobVertexThreadInfoTracker.java:
##########
@@ -151,60 +146,88 @@
     }
 
     @Override
-    public Optional<T> getVertexStats(JobID jobId, AccessExecutionJobVertex vertex) {
+    public Optional<ThreadInfoStats> getVertexStats(JobID jobId, AccessExecutionJobVertex vertex) {
         synchronized (lock) {
-            final Key key = getKey(jobId, vertex);
+            List<AccessExecutionVertex> needRefreshedExecutionVertices = new ArrayList<>();
+            List<ThreadInfoSample> results = new ArrayList<>();
+
+            int requestId = Integer.MAX_VALUE;
+            long startTime = Long.MAX_VALUE;
+            long endTime = Long.MIN_VALUE;
+            for (AccessExecutionVertex executionVertex : vertex.getTaskVertices()) {
+                Key key = getKey(jobId, executionVertex);
+                final ThreadInfoStats stats = executionVertexStatsCache.getIfPresent(key);
+                if (stats != null) {
+                    results.addAll(stats.getSamples());
+                    requestId = Math.min(requestId, stats.getRequestId());
+                    startTime = Math.min(startTime, stats.getStartTime());
+                    endTime = Math.max(endTime, stats.getEndTime());

Review Comment:
   Not sure about these aggregations.
   - For `requestId`, there's no longer 1-1 mappings between the pending requests and the thread info stats.
   - For `start/endTime`, it does not sounds right sampling a group of tasks with different start / end time.



##########
flink-runtime/src/main/java/org/apache/flink/runtime/util/JvmUtils.java:
##########
@@ -53,29 +56,32 @@ public static Collection<ThreadInfo> createThreadDump() {
      * Creates a {@link ThreadInfoSample} for a specific thread. Contains thread traces if
      * maxStackTraceDepth > 0.
      *
+     * @param executionId The execution id of this threadInfo.
      * @param threadId The ID of the thread to create the thread dump for.
      * @param maxStackTraceDepth The maximum number of entries in the stack trace to be collected.
      * @return The thread information of a specific thread.
      */
     public static Optional<ThreadInfoSample> createThreadInfoSample(
-            long threadId, int maxStackTraceDepth) {
+            ExecutionAttemptID executionId, long threadId, int maxStackTraceDepth) {

Review Comment:
   A `JvmUtils` should not be aware of the Flink DAG concepts such as `ExecutionAttemptID` and `SampleableTask`.



##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   Not sure about including the `executionId` in `ThreadInfoSample`. I think it changes `ThreadInfoSample` from a generalized class that wraps parts of `java.lang.management.ThreadInfo` to a specialized class that only serves for a flink execution. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1347814792

   The flame graph with this PR:
   
   FlameGraph of all subtasks: 
   
   <img width="1368" alt="FlameGraph of all subtasks" src="https://user-images.githubusercontent.com/38427477/207243774-9fc3ff2d-5c39-4643-81e4-b9387367d6fd.png">
   
   choose subtask:
   
   <img width="1391" alt="choose subtask" src="https://user-images.githubusercontent.com/38427477/207243762-80dd05e3-1dbd-41f9-8c71-2c4ac49f2eb0.png">
   
   FlameGraph of subtask0:
   
   <img width="1318" alt="FlameGraph of subtask0" src="https://user-images.githubusercontent.com/38427477/207243786-01fbfc64-ac80-40b8-98f2-8a993c215d72.png">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] flinkbot commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
flinkbot commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1336126337

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "b8b928d7e32faab538d2d81d2087df5ced006efc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b8b928d7e32faab538d2d81d2087df5ced006efc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b8b928d7e32faab538d2d81d2087df5ced006efc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1053207231


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   Sorry for missing this message yesterday. I added the `<OWNER>` to the `ThreadInfoSample` last weekend, could you help take a look first? 
   
   If it isn't ok, I can refactor it according to your suggestion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1371029585

   Hi @yangjunhan , thanks for your review, and sorry for the late address due to I was on vacation recently.
   
   I have updated your comments, please help check again in your free time, thanks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1347816498

   Hi @xintongsong , the PR is ready, and I have updated some pictures here. Please help take a look in your free time, thanks~
   
   I will update the doc after you think it's ok.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1053911284


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   I don't think generic type solves the problem here, because the concept **owner** itself is not general.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] yangjunhan commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
yangjunhan commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1059229187


##########
flink-runtime-web/web-dashboard/src/app/pages/job/overview/flamegraph/job-overview-drawer-flamegraph.component.ts:
##########
@@ -100,9 +118,38 @@ export class JobOverviewDrawerFlameGraphComponent implements OnInit, OnDestroy {
       );
   }
 
+  private requestRunningSubtasks(): void {
+    this.jobLocalService
+      .jobWithVertexChanges()
+      .pipe(
+        tap(data => (this.selectedVertex = data.vertex)),
+        mergeMap(data => {
+          return this.jobService.loadSubTasks(data.job.jid, data.vertex!.id);
+        }),
+        takeUntil(this.destroy$)
+      )
+      .subscribe(
+        data => {
+          const runningSubtasks = data?.subtasks

Review Comment:
   Can `data` be null or undefined in this case? If not, you can skip the optional chaining (`?.`).
    
   `runningSubtasks` can be undefined due to the use of optional chaining. This causes the spread operator on line 136 unsafe. Instead, you should wrap it with a IF statement checking whether `runningSubtasks` is not undefined.



##########
flink-runtime-web/web-dashboard/src/app/pages/job/overview/flamegraph/job-overview-drawer-flamegraph.component.ts:
##########
@@ -100,9 +118,38 @@ export class JobOverviewDrawerFlameGraphComponent implements OnInit, OnDestroy {
       );
   }
 
+  private requestRunningSubtasks(): void {
+    this.jobLocalService
+      .jobWithVertexChanges()
+      .pipe(
+        tap(data => (this.selectedVertex = data.vertex)),
+        mergeMap(data => {
+          return this.jobService.loadSubTasks(data.job.jid, data.vertex!.id);
+        }),
+        takeUntil(this.destroy$)
+      )
+      .subscribe(
+        data => {
+          const runningSubtasks = data?.subtasks
+            .filter(subtaskInfo => subtaskInfo.status === 'RUNNING')
+            .map(subtaskInfo => subtaskInfo.subtask.toString());
+          this.listOfRunningSubtasks = [this.allSubtasks, ...runningSubtasks];
+          this.cdr.markForCheck();
+        },
+        () => {
+          this.cdr.markForCheck();
+        }
+      );
+  }
+
+  public selectSubtask(subtaskIndex: string): void {
+    this.destroy$.next();
+    this.subtaskIndex = subtaskIndex;

Review Comment:
   Miss a markForCheck here



##########
flink-runtime-web/web-dashboard/src/app/pages/job/overview/flamegraph/job-overview-drawer-flamegraph.component.ts:
##########
@@ -100,9 +118,38 @@ export class JobOverviewDrawerFlameGraphComponent implements OnInit, OnDestroy {
       );
   }
 
+  private requestRunningSubtasks(): void {
+    this.jobLocalService
+      .jobWithVertexChanges()
+      .pipe(
+        tap(data => (this.selectedVertex = data.vertex)),
+        mergeMap(data => {
+          return this.jobService.loadSubTasks(data.job.jid, data.vertex!.id);
+        }),
+        takeUntil(this.destroy$)
+      )
+      .subscribe(
+        data => {
+          const runningSubtasks = data?.subtasks
+            .filter(subtaskInfo => subtaskInfo.status === 'RUNNING')
+            .map(subtaskInfo => subtaskInfo.subtask.toString());
+          this.listOfRunningSubtasks = [this.allSubtasks, ...runningSubtasks];
+          this.cdr.markForCheck();
+        },
+        () => {
+          this.cdr.markForCheck();

Review Comment:
   This line alone is useless if no view data is changed. Do something in the error callback if you want such as `this.listOfRunningSubtasks = [];`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] yangjunhan commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
yangjunhan commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1367708053

   Hi, @1996fanrui . Sorry for the late review. I left a few comments to be improved regarding the last commit (ui). The rest looks good to me :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on a diff in pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on code in PR #21447:
URL: https://github.com/apache/flink/pull/21447#discussion_r1050345039


##########
flink-runtime/src/main/java/org/apache/flink/runtime/messages/ThreadInfoSample.java:
##########
@@ -32,25 +36,33 @@
  */
 public class ThreadInfoSample implements Serializable {
 
+    private final ExecutionAttemptID executionId;

Review Comment:
   Do you have any suggestions about it? When a TM has multiple slots, tm will return the thread result of all subtasks only once, JM needs to distinguish them, which subtask each ThreadInfoSample belongs to.
   
   How about using `<GenenicType> id` here? it can be any type, and not only serves for a flink execution.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1354188912

   > Do you mean that we get the result of subtask level from job-vertex-cache in the first step, right? And don't trigger any thread sampling when get the result of subtask level, right?
   
   I meant, regardless of operator / subtask level, we always use job-vertex cache, and always trigger sampling of all subtasks if the target vertex is not cached. The only difference would be which samples we use to assemble the flame graph, all or of a specific subtask.
   
   > In the second step, we get the result of subtask level from job-vertex cache and execution-vertex cache, and choose the newer one, right? And trigger the subtask thread sampling when the execution-vertex cache is empty or stale, right?
   
   Yes. It's just there should not be a newer execution-vertex cache if there's already a valid job-vertex cache. So checking the job-vertex cache first practically means choosing the newer one. Alternatively, you can also update both job-vertex cache and execution-vertex cache when retrieving samples of the entire operator. In this way, you can only check the execution-vertex cache upon a subtask level query.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1352862976

   BTW, I'm sorry that I didn't complete reviewing of all the code changes, because I soon run into the above mentioned issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] 1996fanrui commented on pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
1996fanrui commented on PR #21447:
URL: https://github.com/apache/flink/pull/21447#issuecomment-1352953783

   Hi @xintongsong , thanks a lot for your review.
   
   > There's a job-vertex cache and an execution-vertex cache. For the operator flame graph, we always use the job-vertex cache. For the subtask flame graph, we first check the job-vertex cache, then the execution-vertex cache.
   
   Is it typo? For the subtask flame graph, we should first check the `execution-vertex cache`, then the `job-vertex` cache, right?
   
   The `execution-vertex cache` just saves the result when requesting the flame graph for subtask level, right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] xintongsong closed pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level

Posted by GitBox <gi...@apache.org>.
xintongsong closed pull request #21447: [FLINK-30185] Provide the flame graph to the subtask level
URL: https://github.com/apache/flink/pull/21447


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org