Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/06/18 03:14:00 UTC

[jira] [Created] (SPARK-8422) Introduce a module abstraction inside of dev/run-tests

Josh Rosen created SPARK-8422:
---------------------------------

             Summary: Introduce a module abstraction inside of dev/run-tests
                 Key: SPARK-8422
                 URL: https://issues.apache.org/jira/browse/SPARK-8422
             Project: Spark
          Issue Type: Sub-task
          Components: Build, Project Infra
            Reporter: Josh Rosen
            Assignee: Josh Rosen


At a high level, we have Spark modules / components which

1. are affected by file changes (e.g. a module is associated with a set of source files, so changes to those files change the module; see the sketch below),
2. contain a set of tests to run, which are triggered via Maven, SBT, or Python / R scripts, and
3. depend on other modules and have dependent modules: if we change core, then every downstream test should be run.  If we change only MLlib, then we can skip the SQL tests but should probably run the Python MLlib tests, etc.
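
To illustrate point 1, here is a minimal, hypothetical sketch of mapping changed files to modules by source-path prefix (the module names, prefixes, and function name below are made up for the example and are not the actual script's API):

{code:python}
# Hypothetical illustration of point 1: a module is associated with a set of
# source-path prefixes, and a changed file maps to a module when its path
# starts with one of those prefixes. The names and prefixes below are made
# up for the example.
MODULE_PREFIXES = {
    "core": ["core/"],
    "mllib": ["mllib/", "python/pyspark/mllib/"],
    "sql": ["sql/", "python/pyspark/sql/"],
}


def modules_for_changed_files(changed_files):
    """Return the names of the modules whose source prefixes match a changed file."""
    changed = set()
    for path in changed_files:
        for name, prefixes in MODULE_PREFIXES.items():
            if any(path.startswith(prefix) for prefix in prefixes):
                changed.add(name)
    return changed


# e.g. modules_for_changed_files(["mllib/src/main/scala/Foo.scala"]) -> {"mllib"}
{code}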

Right now, the per-module logic is spread across several places inside the {{dev/run-tests}} script: one function describes how to detect changes for all modules, another (implicitly) deals with module dependencies, and so on.

Instead, I propose that we introduce a class for describing a module, use instances of this class to build up a dependency graph, and then phrase the "find which tests to run" operations in terms of that graph.  I think this will be easier to understand and maintain.
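
As a rough sketch of what this could look like (the names below, such as {{Module}} and {{modules_to_test}}, are hypothetical and not a final API):

{code:python}
# A minimal sketch of the proposed abstraction. All names here (Module,
# modules_to_test, the constructor arguments) are hypothetical.

class Module(object):
    def __init__(self, name, dependencies, source_file_prefixes,
                 sbt_test_goals=(), python_test_goals=()):
        self.name = name
        self.dependencies = dependencies            # modules this module depends on
        self.source_file_prefixes = source_file_prefixes
        self.sbt_test_goals = sbt_test_goals        # SBT/Maven test targets to run
        self.python_test_goals = python_test_goals  # Python test scripts to run

    def contains_file(self, filename):
        return any(filename.startswith(prefix)
                   for prefix in self.source_file_prefixes)


def modules_to_test(changed_modules, all_modules):
    """Expand the set of changed modules to include every downstream module:
    if a module changed, the tests of everything that depends on it
    (directly or transitively) should be re-run."""
    result = set(changed_modules)
    # Fixed-point iteration over the dependency graph: keep adding modules
    # whose dependencies intersect the current set until nothing changes.
    grew = True
    while grew:
        grew = False
        for module in all_modules:
            if module not in result and any(dep in result for dep in module.dependencies):
                result.add(module)
                grew = True
    return result


# Example: a change to core pulls in every downstream module, while a change
# to MLlib alone leaves SQL out, matching point 3 above.
core = Module("core", dependencies=[], source_file_prefixes=["core/"])
mllib = Module("mllib", dependencies=[core], source_file_prefixes=["mllib/"])
sql = Module("sql", dependencies=[core], source_file_prefixes=["sql/"])

assert modules_to_test({core}, [core, mllib, sql]) == {core, mllib, sql}
assert modules_to_test({mllib}, [core, mllib, sql]) == {mllib}
{code}

Phrasing the computation as a graph traversal also means each module declares its files, test goals, and dependencies in one place, so adding a new module would not require touching the change-detection logic.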


