Posted to commits@spark.apache.org by do...@apache.org on 2020/11/05 05:59:09 UTC
[spark] branch branch-3.0 updated: [SPARK-33239][INFRA][3.0] Use pre-built image at GitHub Action SparkR job
This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 14eb8b16 [SPARK-33239][INFRA][3.0] Use pre-built image at GitHub Action SparkR job
14eb8b16 is described below
commit 14eb8b164df5fdb3715b7212ba3f5b2e88ec7c53
Author: Dongjoon Hyun <dh...@apple.com>
AuthorDate: Wed Nov 4 21:56:21 2020 -0800
[SPARK-33239][INFRA][3.0] Use pre-built image at GitHub Action SparkR job
### What changes were proposed in this pull request?
This is a backport of https://github.com/apache/spark/pull/30066.
This PR aims to use a pre-built image for the GitHub Actions SparkR job.
### Why are the changes needed?
This will reduce both the execution time and the flakiness of the job.
**BEFORE (branch-3.0: 21 minutes 7 seconds)**
![Screen Shot 2020-11-04 at 8 53 50 PM](https://user-images.githubusercontent.com/9700541/98199386-e39a1b80-1edf-11eb-8dec-c6819ebb3f0d.png)
**AFTER**
No R and R package installation steps.
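The R toolchain is baked into the pre-built image instead of being installed on every run. The actual Dockerfile for `dongjoon/apache-spark-github-action-image` is not part of this commit; the following is only a hypothetical provisioning sketch whose package list mirrors the workflow steps removed in the diff below:

```dockerfile
# Hypothetical sketch of how the pre-built SparkR CI image could be
# provisioned. The real Dockerfile for
# dongjoon/apache-spark-github-action-image:20201025 is not included in this
# commit; the packages listed here simply mirror the removed workflow steps.
FROM ubuntu:20.04

# qpdf is required to reduce the size of PDFs to make CRAN check pass (SPARK-32497).
RUN apt-get update && \
    apt-get install -y r-base libcurl4-openssl-dev qpdf

# R packages previously installed by the deleted "Install R packages" step.
RUN Rscript -e "install.packages(c('knitr', 'rmarkdown', 'testthat', 'devtools', 'e1071', 'survival', 'arrow', 'roxygen2'), repos='https://cloud.r-project.org/')"
```

Building the image once and pinning it by tag moves the slow, network-dependent package installation out of every CI run.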
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the GitHub Action `sparkr` job in this PR.
Closes #30258 from dongjoon-hyun/SPARK-33239-3.0.
Authored-by: Dongjoon Hyun <dh...@apple.com>
Signed-off-by: Dongjoon Hyun <dh...@apple.com>
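At a high level, the change replaces per-run R setup steps with a dedicated job that runs inside a pinned container image. Condensed to its essentials (caching, artifact-upload, and checkout options omitted), the pattern in the new workflow is:

```yaml
# Illustrative excerpt of the container-based job pattern added by this
# commit; see the full diff below for the complete job definition.
jobs:
  sparkr:
    name: Build modules - sparkr
    runs-on: ubuntu-20.04
    container:
      image: dongjoon/apache-spark-github-action-image:20201025
    steps:
      - uses: actions/checkout@v2
      - name: Run tests
        run: |
          export TZ=UTC
          export _R_CHECK_SYSTEM_CLOCK_=FALSE
          ./dev/run-tests --parallelism 2 --modules sparkr
```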
---
.github/workflows/build_and_test.yml | 79 ++++++++++++++++++++++++++++--------
1 file changed, 63 insertions(+), 16 deletions(-)
diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 7956d9e..9b4f41a 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -37,8 +37,6 @@ jobs:
streaming, sql-kafka-0-10, streaming-kafka-0-10,
mllib-local, mllib,
yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl
- - >-
- sparkr
# Here, we split Hive and SQL tests into some of slow ones and the rest of them.
included-tags: [""]
# Some tests are disabled in GitHub Actions. Ideally, we should remove this tag
@@ -131,20 +129,6 @@ jobs:
run: |
python3.8 -m pip install numpy 'pyarrow<3.0.0' pandas scipy xmlrunner
python3.8 -m pip list
- # SparkR
- - name: Install R 4.0
- uses: r-lib/actions/setup-r@v1
- if: contains(matrix.modules, 'sparkr')
- with:
- r-version: 4.0
- - name: Install R packages
- if: contains(matrix.modules, 'sparkr')
- run: |
- # qpdf is required to reduce the size of PDFs to make CRAN check pass. See SPARK-32497.
- sudo apt-get install -y libcurl4-openssl-dev qpdf
- sudo Rscript -e "install.packages(c('knitr', 'rmarkdown', 'testthat', 'devtools', 'e1071', 'survival', 'arrow', 'roxygen2'), repos='https://cloud.r-project.org/')"
- # Show installed packages in R.
- sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
# Run the tests.
- name: Run tests
run: |
@@ -246,6 +230,69 @@ jobs:
name: unit-tests-log-${{ matrix.modules }}--1.8-hadoop2.7-hive2.3
path: "**/target/unit-tests.log"
+ sparkr:
+ name: Build modules - sparkr
+ runs-on: ubuntu-20.04
+ container:
+ image: dongjoon/apache-spark-github-action-image:20201025
+ env:
+ HADOOP_PROFILE: hadoop2.7
+ HIVE_PROFILE: hive2.3
+ GITHUB_PREV_SHA: ${{ github.event.before }}
+ steps:
+ - name: Checkout Spark repository
+ uses: actions/checkout@v2
+ # In order to fetch changed files
+ with:
+ fetch-depth: 0
+ # Cache local repositories. Note that GitHub Actions cache has a 2G limit.
+ - name: Cache Scala, SBT, Maven and Zinc
+ uses: actions/cache@v2
+ with:
+ path: |
+ build/apache-maven-*
+ build/zinc-*
+ build/scala-*
+ build/*.jar
+ key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
+ restore-keys: |
+ build-
+ - name: Cache Maven local repository
+ uses: actions/cache@v2
+ with:
+ path: ~/.m2/repository
+ key: sparkr-maven-${{ hashFiles('**/pom.xml') }}
+ restore-keys: |
+ sparkr-maven-
+ - name: Cache Ivy local repository
+ uses: actions/cache@v2
+ with:
+ path: ~/.ivy2/cache
+ key: sparkr-ivy-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
+ restore-keys: |
+ sparkr-ivy-
+ - name: Run tests
+ run: |
+ mkdir -p ~/.m2
+ # The following are also used by `r-lib/actions/setup-r` to avoid
+ # R issues in the Docker environment
+ export TZ=UTC
+ export _R_CHECK_SYSTEM_CLOCK_=FALSE
+ ./dev/run-tests --parallelism 2 --modules sparkr
+ rm -rf ~/.m2/repository/org/apache/spark
+ - name: Upload test results to report
+ if: always()
+ uses: actions/upload-artifact@v2
+ with:
+ name: test-results-sparkr--1.8-hadoop2.7-hive2.3
+ path: "**/target/test-reports/*.xml"
+ - name: Upload unit tests log files
+ if: failure()
+ uses: actions/upload-artifact@v2
+ with:
+ name: unit-tests-log-sparkr--1.8-hadoop2.7-hive2.3
+ path: "**/target/unit-tests.log"
+
# Static analysis, and documentation build
lint:
name: Linters, licenses, dependencies and documentation generation
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org