You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by do...@apache.org on 2021/01/23 16:04:59 UTC

[spark] branch branch-3.0 updated: [SPARK-34202][SQL][TEST] Add ability to fetch spark release package from internal environment in HiveExternalCatalogVersionsSuite

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 224b7fa  [SPARK-34202][SQL][TEST] Add ability to fetch spark release package from internal environment in HiveExternalCatalogVersionsSuite
224b7fa is described below

commit 224b7fa1dcea8bf00012f4eb3454a626052e28ea
Author: yangjie01 <ya...@baidu.com>
AuthorDate: Sat Jan 23 08:02:52 2021 -0800

    [SPARK-34202][SQL][TEST] Add ability to fetch spark release package from internal environment in HiveExternalCatalogVersionsSuite
    
    ### What changes were proposed in this pull request?
    `HiveExternalCatalogVersionsSuite` can't run in orgs internal environment where access to outside internet is not allowed because `HiveExternalCatalogVersionsSuite` will download spark release package from internet.
    
    Similar to SPARK-32998, this pr add 1 environment variables `SPARK_RELEASE_MIRROR` to let user can specify an accessible download address of spark release package and run `HiveExternalCatalogVersionsSuite`  in orgs internal environment.
    
    ### Why are the changes needed?
    Let `HiveExternalCatalogVersionsSuite` can run in orgs internal environment without relying on external spark release download address.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    
    - Pass the Jenkins or GitHub Action
    
    - Manual test  with and without env variables set in internal environment can't access internet.
    
    execute
    ```
    mvn clean install -Dhadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -PhPhive -pl  sql/hive -am -DskipTests
    
    mvn clean install -Dhadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -PhPhive -pl  sql/hive -DwildcardSuites=org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite -Dtest=none
    ```
    
    **Without env**
    
    ```
    HiveExternalCatalogVersionsSuite:
    19:50:35.123 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: Failed to download Spark 3.0.1 from https://archive.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz: Network is unreachable (connect failed)
    19:50:35.126 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: Failed to download Spark 3.0.1 from https://dist.apache.org/repos/dist/release/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz: Network is unreachable (connect failed)
    org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED ***
      Exception encountered when invoking run on a nested suite - Unable to download Spark 3.0.1 (HiveExternalCatalogVersionsSuite.scala:125)
    Run completed in 2 seconds, 669 milliseconds.
    Total number of tests run: 0
    Suites: completed 1, aborted 1
    Tests: succeeded 0, failed 0, canceled 0, ignored 0, pending 0
    ```
    
    **With env**
    
    ```
    export SPARK_RELEASE_MIRROR=${spark-release.internal.com}/dist/release/
    ```
    
    ```
    HiveExternalCatalogVersionsSuite
    - backward compatibility
    Run completed in 1 minute, 32 seconds.
    Total number of tests run: 1
    Suites: completed 2, aborted 0
    Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
    All tests passed.
    ```
    
    Closes #31294 from LuciferYang/SPARK-34202.
    
    Authored-by: yangjie01 <ya...@baidu.com>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
    (cherry picked from commit e48a8ad1a20446fcaaee6750084faa273028df3d)
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
index b81b7e8..bbbc705 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala
@@ -230,14 +230,15 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
 }
 
 object PROCESS_TABLES extends QueryTest with SQLTestUtils {
-  val releaseMirror = "https://dist.apache.org/repos/dist/release"
+  val releaseMirror = sys.env.getOrElse("SPARK_RELEASE_MIRROR",
+    "https://dist.apache.org/repos/dist/release")
   // Tests the latest version of every release line.
   val testingVersions: Seq[String] = {
     import scala.io.Source
     val versions: Seq[String] = try {
       Source.fromURL(s"${releaseMirror}/spark").mkString
         .split("\n")
-        .filter(_.contains("""<li><a href="spark-"""))
+        .filter(_.contains("""<a href="spark-"""))
         .filterNot(_.contains("preview"))
         .map("""<a href="spark-(\d.\d.\d)/">""".r.findFirstMatchIn(_).get.group(1))
         .filter(_ < org.apache.spark.SPARK_VERSION)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org