Posted to commits@arrow.apache.org by cu...@apache.org on 2018/07/27 22:41:32 UTC

[arrow] branch master updated: ARROW-2923: [DOC] Adding Apache Spark integration test instructions

This is an automated email from the ASF dual-hosted git repository.

cutlerb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 432dd93  ARROW-2923: [DOC] Adding Apache Spark integration test instructions
432dd93 is described below

commit 432dd936e97bfbf4c9b4a4536d8b0267ffc97074
Author: Bryan Cutler <cu...@gmail.com>
AuthorDate: Fri Jul 27 15:41:24 2018 -0700

    ARROW-2923: [DOC] Adding Apache Spark integration test instructions
    
    This adds instructions to dev/README for running the Docker-based Spark integration tests
    
    Author: Bryan Cutler <cu...@gmail.com>
    
    Closes #2333 from BryanCutler/doc-spark-integration-instr-ARROW-2923 and squashes the following commits:
    
    8af4119 <Bryan Cutler> typo
    79643a8 <Bryan Cutler> Added instructions to dev/README
---
 dev/README.md | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/dev/README.md b/dev/README.md
index 971fb5f..276d75f 100644
--- a/dev/README.md
+++ b/dev/README.md
@@ -128,3 +128,30 @@ bash dev/release/js-verify-release-candidate.sh 0.7.0 0
 ```shell
 run_docker_compose.sh hdfs_integration
 ```
+
+## Apache Spark Integration Tests
+
+Tests can be run to ensure that the current snapshot of Java and Python Arrow
+works with Spark. This will run a Docker image to build Arrow C++
+and Python in a Conda environment, build and install Arrow Java to the local
+Maven repository, build Spark with the new Arrow artifact, and run the
+Arrow-related unit tests in Spark for Java and Python. If any step fails, the
+script exits with a non-zero value. To run, use the following command:
+
+```shell
+./run_docker_compose.sh spark_integration
+```
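+
+The wrapper script is assumed here to be a thin shim over `docker-compose`; if
+you prefer to invoke Compose directly (for example, to rebuild the image
+without re-running the tests), something like the following should be
+equivalent, assuming the Compose file defines a `spark_integration` service and
+you run these from the directory that contains it:
+
+```shell
+# assumes the compose file defines a spark_integration service
+docker-compose build spark_integration
+# run the tests and remove the container when it exits
+docker-compose run --rm spark_integration
+```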
+
+Alternatively, you can build and run the Docker images separately. If you
+are already building Spark, these commands will map your local Maven repo
+into the container and save time by not having to download all dependencies.
+These should be run in a directory one level up from your Arrow repository:
+
+```shell
+docker build -f arrow/dev/spark_integration/Dockerfile -t spark-arrow .
+docker run -v $HOME/.m2:/root/.m2 spark-arrow
+```
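+
+If the run fails and you want to inspect the environment, an interactive shell
+can help. The flags below are standard Docker options; depending on how the
+image defines its entrypoint, you may need `--entrypoint bash` instead of
+passing `bash` as the command:
+
+```shell
+# drop into a shell with the same Maven repo mount; the container is
+# removed on exit (bash as the command is an assumption about the image)
+docker run --rm -it -v $HOME/.m2:/root/.m2 spark-arrow bash
+```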
+
+NOTE: If the Java API has breaking changes, a patched version of Spark might
+need to be used to successfully build.
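+
+To check which Spark branch or tag the image builds against before deciding
+whether a patch is needed, one option is to inspect the Dockerfile (the path
+below is the same one used in the build command above):
+
+```shell
+# list the lines of the Dockerfile that mention Spark
+grep -in "spark" arrow/dev/spark_integration/Dockerfile
+```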