You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/16 08:49:30 UTC

[GitHub] [spark] EnricoMi opened a new pull request, #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

EnricoMi opened a new pull request, #36888:
URL: https://github.com/apache/spark/pull/36888

   ### What changes were proposed in this pull request?
   This makes the CI aware when instances of matrix jobs `build` and `pyspark` do actually not have any changes and there do not run any tests. Then, uploading files can be skipped as well, allowing to enable #36413.
   
   ### Why are the changes needed?
   All instances of matrix jobs `build` and `pyspark` are skipped if none of their respective modules have changes. When changes exists, all instances are run, but some instances then do not run any tests, as their sub-set of modules do not have changes.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Tested on GitHub Actions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1158568912

   Yes, the idea was to define the strategy matrix as JSON in `precondition`. GitHub gives this working example:
   https://docs.github.com/en/actions/learn-github-actions/expressions#fromjson
   
   Then, module sets would be defined once in one place. There, it could run `is-changed.py` for each of the module sets and add that to the JSON. Here is a sketch: https://docs.github.com/en/actions/learn-github-actions/expressions#fromjson
   
   Ideally, individual jobs of the `build` and `pyspark` matrix then would be skipped as easily as this:
   ```yaml
   build:
     strategy:
       matrix: ${{ fromJSON( needs.precondition.outputs.spark-matrix ) }}
     if: matrix.run != 'false' && matrix.changed != 'false'
   ```
   
   But this is not allowed, `matrix` does not work in the `if` clause: https://docs.github.com/en/actions/learn-github-actions/contexts#context-availability


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1157560112

   Yeah, would that need a very big change? If we can make the change surgically, that would be great.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1157483928

   Shall I try to sketch out how that could look like?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1158381431

   > I wouldn't call it surgically, more a rework :-/
   
   Ah, now I got your point. So `precondition` defines the modules to build/test, and the following jobs actully test/build those modules (from `precondition`)   .... I think it sounds like it would be a great improvement.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1158430892

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1157706632

   I wouldn't call it surgically, more a rework :-/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1160553251

   @HyukjinKwon here is an improved version: https://github.com/G-Research/spark/pull/9/files
   
   It is based on https://github.com/apache/spark/pull/36928.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1157480526

   Yeah, I think ideally all has to be skipped properly at precondition, and your previous fix makes sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1157458123

   @HyukjinKwon do you agree that ideally, the subset of modules build and tested in matrix jobs should be defined in `precondition` job, where it can check each subset for changes and then configure the matrix strategy and skip respective jobs? Than all this magic happens in one place.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi commented on pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi commented on PR #36888:
URL: https://github.com/apache/spark/pull/36888#issuecomment-1158130503

   This is the most surgically I could get. Individual matrix jobs cannot be skipped (which I would have preferred) because the `job.<job-id>.if` field cannot evaluate `matrix`, so there is no way to have individual conditions (e.g. sql being skipped entirely while core being tested).
   
   If you are happy with it, I'll create a corresponding JIRA. If not, please feel free to close this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] EnricoMi closed pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs

Posted by GitBox <gi...@apache.org>.
EnricoMi closed pull request #36888: [CI] Check if tests are to be run for build and pyspark matrix jobs
URL: https://github.com/apache/spark/pull/36888


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org