You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "jiaan.geng (Jira)" <ji...@apache.org> on 2020/05/26 09:30:00 UTC

[jira] [Created] (SPARK-31823) Improve the current Spark Scheduler test framework

jiaan.geng created SPARK-31823:
----------------------------------

             Summary: Improve the current Spark Scheduler test framework
                 Key: SPARK-31823
                 URL: https://issues.apache.org/jira/browse/SPARK-31823
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.1.0
            Reporter: jiaan.geng


The major source of Spark Scheduler unit test cases are DAGSchedulerSuite、TaskSchedulerImplSuite、TaskSetManagerSuite. These test suites have played an important role to ensure the Spark Scheduler behaves as we expected, however, we should significantly improve these suites to provide better organized and more extendable test framework now, to further support the evolution of the Spark Scheduler.

The major limitations of the current Spark Scheduler test framework:
* The test framework was designed at the very early stage of Spark, so it doesn’t integrate well with the features introduced later, e.g. barrier execution, indeterminate stage, zombie taskset, resource profile.
* Many test cases are added in a hacky way, don’t fully utilize or expend the original test framework (while they could have been), this leads to a heavy maintenance burden.
* The test cases are not organized well, many test cases are appended case by case, each test file consists of thousands of LOCs.
* Frequently introducing flaky test cases because there is no standard way to generate test data and verify the result.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org