Posted to commits@spark.apache.org by do...@apache.org on 2021/08/24 20:30:53 UTC

[spark] branch master updated: [SPARK-36573][BUILD][TEST] Add a default value to ORACLE_DOCKER_IMAGE

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e03afc9  [SPARK-36573][BUILD][TEST] Add a default value to ORACLE_DOCKER_IMAGE
e03afc9 is described below

commit e03afc906fd87b0354783b438fa9f7e36231b778
Author: Luca Canali <lu...@cern.ch>
AuthorDate: Tue Aug 24 13:30:21 2021 -0700

    [SPARK-36573][BUILD][TEST] Add a default value to ORACLE_DOCKER_IMAGE
    
    ### What changes were proposed in this pull request?
    Currently, the procedure to run the Oracle Integration Suite is based on building the Oracle RDBMS image from the Dockerfiles provided by Oracle.
    Recently, Oracle has started providing database images; see https://container-registry.oracle.com
    Moreover, an Oracle employee maintains Oracle XE images that are streamlined for testing at https://hub.docker.com/r/gvenzl/oracle-xe and https://github.com/gvenzl/oci-oracle-xe. These address the problem that the official images are quite large, which makes testing resource-intensive and slow.
    This PR proposes to document the available options and to introduce a default value for ORACLE_DOCKER_IMAGE_NAME.
    
    ### Why are the changes needed?
    This change will make it easier and faster to run the Oracle Integration Suite, removing the need to manually build an Oracle DB image.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Manually tested:
    ```
    export ENABLE_DOCKER_INTEGRATION_TESTS=1
    ./build/sbt -Pdocker-integration-tests "testOnly org.apache.spark.sql.jdbc.OracleIntegrationSuite"
    ./build/sbt -Pdocker-integration-tests "testOnly org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite"
    ```
    
    Closes #33821 from LucaCanali/oracleDockerIntegration.
    
    Authored-by: Luca Canali <lu...@cern.ch>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 .github/workflows/build_and_test.yml               | 20 +----------
 .../spark/sql/jdbc/OracleIntegrationSuite.scala    | 40 ++++++++++++----------
 .../spark/sql/jdbc/v2/OracleIntegrationSuite.scala | 40 ++++++++++++----------
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 85a26b5..77b7111 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -686,7 +686,7 @@ jobs:
       HIVE_PROFILE: hive2.3
       GITHUB_PREV_SHA: ${{ github.event.before }}
       SPARK_LOCAL_IP: localhost
-      ORACLE_DOCKER_IMAGE_NAME: oracle/database:18.4.0-xe
+      ORACLE_DOCKER_IMAGE_NAME: gvenzl/oracle-xe:18.4.0
       SKIP_MIMA: true
     steps:
     - name: Checkout Spark repository
@@ -724,24 +724,6 @@ jobs:
       uses: actions/setup-java@v1
       with:
         java-version: 8
-    - name: Cache Oracle docker-images repository
-      id: cache-oracle-docker-images
-      uses: actions/cache@v2
-      with:
-        path: ./oracle/docker-images
-        # key should contains the commit hash of the Oracle docker images to be checkout.
-        key: oracle-docker-images-3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
-    - name: Checkout Oracle docker-images repository
-      uses: actions/checkout@v2
-      with:
-        fetch-depth: 0
-        repository: oracle/docker-images
-        ref: 3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
-        path: ./oracle/docker-images
-    - name: Install Oracle Docker image
-      run: |
-        cd oracle/docker-images/OracleDatabase/SingleInstance/dockerfiles
-        ./buildContainerImage.sh -v 18.4.0 -x
     - name: Run tests
       run: |
         ./dev/run-tests --parallelism 1 --modules docker-integration-tests --included-tags org.apache.spark.tags.DockerTest
diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
index eb8d286..8972c53 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala
@@ -34,42 +34,46 @@ import org.apache.spark.sql.types._
 import org.apache.spark.tags.DockerTest
 
 /**
- * The following would be the steps to test this
- * 1. Build Oracle database in Docker, please refer below link about how to.
- *    https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
- * 2. export ORACLE_DOCKER_IMAGE_NAME=$ORACLE_DOCKER_IMAGE_NAME
- *    Pull oracle $ORACLE_DOCKER_IMAGE_NAME image - docker pull $ORACLE_DOCKER_IMAGE_NAME
- * 3. Start docker - sudo service docker start
- * 4. Run spark test - ./build/sbt -Pdocker-integration-tests
- *    "testOnly org.apache.spark.sql.jdbc.OracleIntegrationSuite"
+ * The following are the steps to test this:
  *
- * An actual sequence of commands to run the test is as follows
+ * 1. Choose to use a prebuilt image or build Oracle database in a container
+ *    - The documentation on how to build Oracle RDBMS in a container is at
+ *      https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
+ *    - Official Oracle container images can be found at https://container-registry.oracle.com
+ *    - A trustable and streamlined Oracle XE database image can be found on Docker Hub at
+ *      https://hub.docker.com/r/gvenzl/oracle-xe see also https://github.com/gvenzl/oci-oracle-xe
+ * 2. Run: export ORACLE_DOCKER_IMAGE_NAME=image_you_want_to_use_for_testing
+ *    - Example: export ORACLE_DOCKER_IMAGE_NAME=gvenzl/oracle-xe:latest
+ * 3. Run: export ENABLE_DOCKER_INTEGRATION_TESTS=1
+ * 4. Start docker: sudo service docker start
+ *    - Optionally, docker pull $ORACLE_DOCKER_IMAGE_NAME
+ * 5. Run Spark integration tests for Oracle with: ./build/sbt -Pdocker-integration-tests
+ *    "testOnly org.apache.spark.sql.jdbc.OracleIntegrationSuite"
  *
+ * A sequence of commands to build the Oracle XE database container image:
  *  $ git clone https://github.com/oracle/docker-images.git
- *  // Head SHA: 3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
  *  $ cd docker-images/OracleDatabase/SingleInstance/dockerfiles
  *  $ ./buildContainerImage.sh -v 18.4.0 -x
  *  $ export ORACLE_DOCKER_IMAGE_NAME=oracle/database:18.4.0-xe
- *  $ export ENABLE_DOCKER_INTEGRATION_TESTS=1
- *  $ cd $SPARK_HOME
- *  $ ./build/sbt -Pdocker-integration-tests
- *    "testOnly org.apache.spark.sql.jdbc.OracleIntegrationSuite"
  *
- * It has been validated with 18.4.0 Express Edition.
+ * This procedure has been validated with Oracle 18.4.0 Express Edition.
  */
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with SharedSparkSession {
   import testImplicits._
 
   override val db = new DatabaseOnDocker {
-    lazy override val imageName = sys.env("ORACLE_DOCKER_IMAGE_NAME")
+    lazy override val imageName =
+      sys.env.getOrElse("ORACLE_DOCKER_IMAGE_NAME", "gvenzl/oracle-xe:18.4.0")
+    val oracle_password = "Th1s1sThe0racle#Pass"
     override val env = Map(
-      "ORACLE_PWD" -> "oracle"
+      "ORACLE_PWD" -> oracle_password,      // oracle images uses this
+      "ORACLE_PASSWORD" -> oracle_password  // gvenzl/oracle-xe uses this
     )
     override val usesIpc = false
     override val jdbcPort: Int = 1521
     override def getJdbcUrl(ip: String, port: Int): String =
-      s"jdbc:oracle:thin:system/oracle@//$ip:$port/xe"
+      s"jdbc:oracle:thin:system/$oracle_password@//$ip:$port/xe"
   }
 
   override val connectionTimeout = timeout(7.minutes)
diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/OracleIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/OracleIntegrationSuite.scala
index 45d793a..ef8fe53 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/OracleIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/OracleIntegrationSuite.scala
@@ -29,41 +29,45 @@ import org.apache.spark.sql.types._
 import org.apache.spark.tags.DockerTest
 
 /**
- * The following would be the steps to test this
- * 1. Build Oracle database in Docker, please refer below link about how to.
- *    https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
- * 2. export ORACLE_DOCKER_IMAGE_NAME=$ORACLE_DOCKER_IMAGE_NAME
- *    Pull oracle $ORACLE_DOCKER_IMAGE_NAME image - docker pull $ORACLE_DOCKER_IMAGE_NAME
- * 3. Start docker - sudo service docker start
- * 4. Run spark test - ./build/sbt -Pdocker-integration-tests
- *    "testOnly org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite"
+ * The following are the steps to test this:
  *
- * An actual sequence of commands to run the test is as follows
+ * 1. Choose to use a prebuilt image or build Oracle database in a container
+ *    - The documentation on how to build Oracle RDBMS in a container is at
+ *      https://github.com/oracle/docker-images/blob/master/OracleDatabase/SingleInstance/README.md
+ *    - Official Oracle container images can be found at https://container-registry.oracle.com
+ *    - A trustable and streamlined Oracle XE database image can be found on Docker Hub at
+ *      https://hub.docker.com/r/gvenzl/oracle-xe see also https://github.com/gvenzl/oci-oracle-xe
+ * 2. Run: export ORACLE_DOCKER_IMAGE_NAME=image_you_want_to_use_for_testing
+ *    - Example: export ORACLE_DOCKER_IMAGE_NAME=gvenzl/oracle-xe:latest
+ * 3. Run: export ENABLE_DOCKER_INTEGRATION_TESTS=1
+ * 4. Start docker: sudo service docker start
+ *    - Optionally, docker pull $ORACLE_DOCKER_IMAGE_NAME
+ * 5. Run Spark integration tests for Oracle with: ./build/sbt -Pdocker-integration-tests
+ *    "testOnly org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite"
  *
+ * A sequence of commands to build the Oracle XE database container image:
  *  $ git clone https://github.com/oracle/docker-images.git
- *  // Head SHA: 3f422c4a35b423dfcdbcc57a84f01db6c82eb6c1
  *  $ cd docker-images/OracleDatabase/SingleInstance/dockerfiles
  *  $ ./buildContainerImage.sh -v 18.4.0 -x
  *  $ export ORACLE_DOCKER_IMAGE_NAME=oracle/database:18.4.0-xe
- *  $ export ENABLE_DOCKER_INTEGRATION_TESTS=1
- *  $ cd $SPARK_HOME
- *  $ ./build/sbt -Pdocker-integration-tests
- *    "testOnly org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite"
  *
- * It has been validated with 18.4.0 Express Edition.
+ * This procedure has been validated with Oracle 18.4.0 Express Edition.
  */
 @DockerTest
 class OracleIntegrationSuite extends DockerJDBCIntegrationSuite with V2JDBCTest {
   override val catalogName: String = "oracle"
   override val db = new DatabaseOnDocker {
-    lazy override val imageName = sys.env("ORACLE_DOCKER_IMAGE_NAME")
+    lazy override val imageName =
+      sys.env.getOrElse("ORACLE_DOCKER_IMAGE_NAME", "gvenzl/oracle-xe:18.4.0")
+    val oracle_password = "Th1s1sThe0racle#Pass"
     override val env = Map(
-      "ORACLE_PWD" -> "oracle"
+      "ORACLE_PWD" -> oracle_password,      // oracle images uses this
+      "ORACLE_PASSWORD" -> oracle_password  // gvenzl/oracle-xe uses this
     )
     override val usesIpc = false
     override val jdbcPort: Int = 1521
     override def getJdbcUrl(ip: String, port: Int): String =
-      s"jdbc:oracle:thin:system/oracle@//$ip:$port/xe"
+      s"jdbc:oracle:thin:system/$oracle_password@//$ip:$port/xe"
   }
 
   override def sparkConf: SparkConf = super.sparkConf
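
The fallback introduced by the `sys.env.getOrElse("ORACLE_DOCKER_IMAGE_NAME", "gvenzl/oracle-xe:18.4.0")` change can be sketched in shell terms. This is only an illustration, not part of the commit: `resolve_oracle_image` is a hypothetical helper name, and the sketch assumes a POSIX shell. It prints the image name the suites would be expected to pick up for the current environment.

```shell
#!/bin/sh
# Hypothetical helper, not part of Spark: prints the image name the suites
# would resolve, following the getOrElse fallback added in this commit.
resolve_oracle_image() {
  # ${VAR-default} (no colon) falls back only when VAR is unset, matching
  # a missing key in Scala's sys.env; a set-but-empty value is kept as is.
  printf '%s\n' "${ORACLE_DOCKER_IMAGE_NAME-gvenzl/oracle-xe:18.4.0}"
}

echo "Oracle image for the integration tests: $(resolve_oracle_image)"
```

To test against a locally built image instead, export ORACLE_DOCKER_IMAGE_NAME=oracle/database:18.4.0-xe before running the sbt commands listed in the commit message.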
