Posted to issues@flink.apache.org by "陈梓立 (JIRA)" <ji...@apache.org> on 2018/09/12 05:49:00 UTC

[jira] [Created] (FLINK-10320) Introduce JobMaster schedule micro-benchmark

陈梓立 created FLINK-10320:
---------------------------

             Summary: Introduce JobMaster schedule micro-benchmark
                 Key: FLINK-10320
                 URL: https://issues.apache.org/jira/browse/FLINK-10320
             Project: Flink
          Issue Type: Improvement
          Components: Tests
            Reporter: 陈梓立
            Assignee: 陈梓立


Based on the {{org.apache.flink.streaming.runtime.io.benchmark}} package and the [flink-benchmark|https://github.com/dataArtisans/flink-benchmarks] repo, I propose introducing another micro-benchmark that focuses on {{JobMaster}} scheduling performance.

h3. Target
Benchmark how long it takes from {{JobMaster}} startup (receiving the {{JobGraph}} and initializing) until all tasks are RUNNING. Technically, we use a bounded stream and the TM finishes tasks as soon as they arrive, so the interval we actually measure ends when all tasks are FINISHED.

h3. Cases
1. A JobGraph that covers EAGER + PIPELINED edges
2. A JobGraph that covers LAZY_FROM_SOURCES + PIPELINED edges
3. A JobGraph that covers LAZY_FROM_SOURCES + BLOCKING edges
P.S. Maybe also benchmark the case where the source gets its input from {{InputSplit}}s?
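
For illustration, the three cases above could be written down as a small parameter matrix. This is only a sketch: {{BenchmarkCases}}, {{Schedule}}, {{Edge}} and {{Case}} are hypothetical stand-ins defined here, not Flink's own types. In a JMH benchmark such a matrix would typically be driven by a {{@Param}} field instead.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical case matrix; the enums are local stand-ins, not Flink classes. */
final class BenchmarkCases {

    enum Schedule { EAGER, LAZY_FROM_SOURCES }

    enum Edge { PIPELINED, BLOCKING }

    /** One benchmark case: a schedule mode combined with an edge type. */
    static final class Case {
        final Schedule schedule;
        final Edge edge;

        Case(Schedule schedule, Edge edge) {
            this.schedule = schedule;
            this.edge = edge;
        }

        String label() {
            return schedule + "+" + edge;
        }
    }

    /** The three combinations listed in the proposal, in the same order. */
    static List<Case> all() {
        return Arrays.asList(
                new Case(Schedule.EAGER, Edge.PIPELINED),
                new Case(Schedule.LAZY_FROM_SOURCES, Edge.PIPELINED),
                new Case(Schedule.LAZY_FROM_SOURCES, Edge.BLOCKING));
    }
}
```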

h3. Implementation
Following the flink-benchmark repo, the benchmark itself is ultimately run with JMH, so the whole test suite is split across two repos. The testing environment could live in the main repo, perhaps under flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/benchmark.
To measure the performance of {{JobMaster}} scheduling, we need to simulate an environment that:
1. has a real {{JobMaster}}
2. has a mock/testing {{ResourceManager}} that has infinite resources and reacts immediately.
3. has a (many?) mock/testing {{TaskExecutor}} that deploys and finishes tasks immediately.
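
To make the shape of this environment concrete, here is a toy, Flink-free sketch. All names ({{ScheduleBenchmarkSketch}}, {{InstantResourceManager}}, {{InstantTaskExecutor}}, {{ToyJobMaster}}) are hypothetical and exist only for this example. In the real benchmark the {{JobMaster}} would be the actual Flink class and the timing would be handled by JMH rather than {{System.nanoTime()}}.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the proposed environment; none of these types are Flink classes. */
final class ScheduleBenchmarkSketch {

    /** Stand-in for a testing ResourceManager: unlimited slots, answers immediately. */
    static final class InstantResourceManager {
        private final AtomicInteger nextSlot = new AtomicInteger();

        CompletableFuture<Integer> requestSlot() {
            // Already-completed future: the "slot" is available with no delay.
            return CompletableFuture.completedFuture(nextSlot.getAndIncrement());
        }
    }

    /** Stand-in for a testing TaskExecutor: a deployed task finishes right away. */
    static final class InstantTaskExecutor {
        void deploy(int slot, Runnable onFinished) {
            onFinished.run();
        }
    }

    /** Placeholder for the component under test (the real JobMaster in the proposal). */
    static final class ToyJobMaster {
        private final InstantResourceManager resourceManager;
        private final InstantTaskExecutor taskExecutor;

        ToyJobMaster(InstantResourceManager resourceManager, InstantTaskExecutor taskExecutor) {
            this.resourceManager = resourceManager;
            this.taskExecutor = taskExecutor;
        }

        /** Requests one slot per task, deploys it, and returns how many tasks FINISHED. */
        int scheduleAll(int numTasks) {
            AtomicInteger finished = new AtomicInteger();
            for (int i = 0; i < numTasks; i++) {
                resourceManager.requestSlot()
                        .thenAccept(slot -> taskExecutor.deploy(slot, finished::incrementAndGet));
            }
            return finished.get();
        }
    }

    /** Wall-clock nanoseconds from "job submitted" until all tasks are FINISHED. */
    static long measureNanos(int numTasks) {
        ToyJobMaster jobMaster =
                new ToyJobMaster(new InstantResourceManager(), new InstantTaskExecutor());
        long start = System.nanoTime();
        int finished = jobMaster.scheduleAll(numTasks);
        long elapsed = System.nanoTime() - start;
        if (finished != numTasks) {
            throw new IllegalStateException("only " + finished + " of " + numTasks + " tasks finished");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        measureNanos(1_000); // warm-up run, result discarded
        System.out.println("ns until all tasks FINISHED: " + measureNanos(1_000));
    }
}
```

Because the mock components respond synchronously, the measured interval is dominated by the scheduling logic itself, which is the point of the proposed setup.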

[~trohrmann@apache.org] [~GJL] [~pnowojski] could you please review this proposal to help clarify the goal and concrete details? Thanks in advance.

Any suggestions are welcome.


