Posted to commits@spark.apache.org by ya...@apache.org on 2023/10/19 01:43:27 UTC

[spark] branch master updated: [SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests

This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a14f90941ca [SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests
a14f90941ca is described below

commit a14f90941caf06e2d77789a3952dd588e6900b90
Author: Kent Yao <ya...@apache.org>
AuthorDate: Thu Oct 19 09:43:13 2023 +0800

    [SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests
    
    ### What changes were proposed in this pull request?
    
    This PR fixes:
    
    - The mismatch between the `new Timestamp(new Date().getTime)` prefix and the log4j2 date format pattern in output captured from the child spark-submit process
    ```
    2023-10-17 03:58:48.275 - stderr> 23/10/17 18:58:48 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20231017185848-0000
    2023-10-17 03:58:48.278 - stderr> 23/10/17 18:58:48 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57637.
    ```
    - The duplicated timestamp that results when a line already prefixed with `new Timestamp(new Date().getTime)` is written with logInfo instead of println
    ```
    23/10/17 19:02:34.392 Thread-5 INFO SparkShellSuite: 2023-10-17 04:02:34.392 - stderr> 23/10/17 19:02:34 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
    23/10/17 19:02:34.393 Thread-5 INFO SparkShellSuite: 2023-10-17 04:02:34.393 - stderr> 23/10/17 19:02:34 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    
    ```
    - The redirection of logs from the child spark-submit process, which now go to unit-tests.log as intended
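    
    The capture pattern behind these fixes can be sketched in isolation. This is a minimal sketch, not the Spark code itself: `OutputCapture` and `OutputCaptureDemo` are hypothetical names, and the `logInfo` constructor parameter is an assumed stand-in for the logging method the test suites use. It mirrors the patched `captureOutput` in SparkSubmitTestUtils, where the logging framework supplies the only timestamp and the history keeps the raw line.
    
    ```scala
    import scala.collection.mutable.ArrayBuffer
    
    // Hypothetical helper, not part of Spark: mirrors the shape of the patched
    // captureOutput. `logInfo` is an assumed stand-in for the suite's logger.
    final class OutputCapture(logInfo: String => Unit) {
      // Raw child-process lines, stored without any prefix.
      val history: ArrayBuffer[String] = ArrayBuffer.empty[String]
    
      def captureOutput(source: String)(line: String): Unit = {
        // No manual timestamp here: the logging framework adds its own,
        // so only the stream name is prepended.
        logInfo(s"$source> $line")
        history += line
      }
    }
    
    object OutputCaptureDemo {
      def main(args: Array[String]): Unit = {
        // println stands in for a real logger in this demo.
        val capture = new OutputCapture(msg => println(msg))
        val onStderr: String => Unit = capture.captureOutput("stderr")
        onStderr("23/10/18 11:58:20 INFO SparkEnv: Registering MapOutputTracker")
        // The stored line is exactly what the child process wrote.
        assert(capture.history.head.startsWith("23/10/18"))
      }
    }
    ```
    
    Storing the unprefixed line lets assertions match the child's output directly. Note that in this commit SparkShellSuite still buffers the `$source> ` prefixed line, so the sketch reflects only the SparkSubmitTestUtils and CliSuite variants.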
    
    ### Why are the changes needed?
    
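    Test-only fixes: output captured from child spark-submit processes gets a single, consistent timestamp and lands in unit-tests.log.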
    
    ### Does this PR introduce _any_ user-facing change?
    
    no
    
    ### How was this patch tested?
    
    - WholeStageCodegenSparkSubmitSuite - before
    
    ```
    18:58:53.882 shutdown-hook-0 INFO ShutdownHookManager: Shutdown hook called
    18:58:53.882 shutdown-hook-0 INFO ShutdownHookManager: Deleting directory /Users/hzyaoqin/spark/target/tmp/spark-ecd53d47-d109-4ddc-80dd-2d829f34371e
    11:58:18.892 pool-1-thread-1 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
    11:58:18.893 pool-1-thread-1 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    11:58:18.932 pool-1-thread-1-ScalaTest-running-WholeStageCodegenSparkSubmitSuite INFO WholeStageCodegenSparkSubmitSuite:
    ```
    
    - WholeStageCodegenSparkSubmitSuite - after
    ```
    ===== TEST OUTPUT FOR o.a.s.sql.execution.WholeStageCodegenSparkSubmitSuite: 'Generated code on driver should not embed platform-specific constant' =====
    
    11:58:19.882 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:19 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
    11:58:19.883 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:19 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Running Spark version 4.0.0-SNAPSHOT
    11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: OS info Mac OS X, 13.4, aarch64
    11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Java version 17.0.8
    11:58:20.227 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: ==============================================================
    11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: No custom resources configured for spark.driver.
    11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: ==============================================================
    11:58:20.254 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Submitted application: org.apache.spark.sql.execution.WholeStageCodegenSparkSubmitSuite
    11:58:20.266 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
    11:58:20.268 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfile: Limiting resource is cpu
    11:58:20.268 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfileManager: Added ResourceProfile id: 0
    11:58:20.302 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing view acls to: hzyaoqin
    11:58:20.302 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing modify acls to: hzyaoqin
    11:58:20.303 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing view acls groups to:
    11:58:20.303 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing modify acls groups to:
    11:58:20.305 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: hzyaoqin; groups with view permissions: EMPTY; users with modify permissions: hzyaoqin; groups with modify permissions: EMPTY; RPC SSL disabled
    11:58:20.448 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO Utils: Successfully started service 'sparkDriver' on port 52173.
    11:58:20.465 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkEnv: Registering MapOutputTracker
    
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    no
    
    Closes #43421 from yaooqinn/SPARK-45585.
    
    Authored-by: Kent Yao <ya...@apache.org>
    Signed-off-by: Kent Yao <ya...@apache.org>
---
 .../org/apache/spark/deploy/SparkSubmitTestUtils.scala    | 15 ++-------------
 .../scala/org/apache/spark/repl/SparkShellSuite.scala     | 11 ++++-------
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala |  9 ++-------
 3 files changed, 8 insertions(+), 27 deletions(-)

diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
index 2ab2e17df03..932e972374c 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
@@ -18,8 +18,6 @@
 package org.apache.spark.deploy
 
 import java.io.File
-import java.sql.Timestamp
-import java.util.Date
 
 import scala.collection.mutable.ArrayBuffer
 
@@ -69,17 +67,8 @@ trait SparkSubmitTestUtils extends SparkFunSuite with TimeLimits {
     env.put("SPARK_HOME", sparkHome)
 
     def captureOutput(source: String)(line: String): Unit = {
-      // This test suite has some weird behaviors when executed on Jenkins:
-      //
-      // 1. Sometimes it gets extremely slow out of unknown reason on Jenkins.  Here we add a
-      //    timestamp to provide more diagnosis information.
-      // 2. Log lines are not correctly redirected to unit-tests.log as expected, so here we print
-      //    them out for debugging purposes.
-      val logLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
-      // scalastyle:off println
-      println(logLine)
-      // scalastyle:on println
-      history += logLine
+      logInfo(s"$source> $line")
+      history += line
     }
 
     val process = builder.start()
diff --git a/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala b/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
index 39544beec41..067f08cb675 100644
--- a/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
+++ b/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
@@ -19,8 +19,6 @@ package org.apache.spark.repl
 
 import java.io._
 import java.nio.charset.StandardCharsets
-import java.sql.Timestamp
-import java.util.Date
 
 import scala.collection.mutable.ArrayBuffer
 import scala.concurrent.Promise
@@ -70,10 +68,9 @@ class SparkShellSuite extends SparkFunSuite {
     val lock = new Object
 
     def captureOutput(source: String)(line: String): Unit = lock.synchronized {
-      // This test suite sometimes gets extremely slow out of unknown reason on Jenkins.  Here we
-      // add a timestamp to provide more diagnosis information.
-      val newLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
-      log.info(newLine)
+      val newLine = s"$source> $line"
+
+      logInfo(newLine)
       buffer += newLine
 
       if (line.startsWith("Spark context available") && line.contains("app id")) {
@@ -82,7 +79,7 @@ class SparkShellSuite extends SparkFunSuite {
 
       // If we haven't found all expected answers and another expected answer comes up...
       if (next < expectedAnswers.size && line.contains(expectedAnswers(next))) {
-        log.info(s"$source> found expected output line $next: '${expectedAnswers(next)}'")
+        logInfo(s"$source> found expected output line $next: '${expectedAnswers(next)}'")
         next += 1
         // If all expected answers have been found...
         if (next == expectedAnswers.size) {
diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 4588cf39d1f..d5045cb511c 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -19,8 +19,6 @@ package org.apache.spark.sql.hive.thriftserver
 
 import java.io._
 import java.nio.charset.StandardCharsets
-import java.sql.Timestamp
-import java.util.Date
 import java.util.concurrent.CountDownLatch
 
 import scala.collection.mutable.ArrayBuffer
@@ -145,11 +143,8 @@ class CliSuite extends SparkFunSuite {
     val lock = new Object
 
     def captureOutput(source: String)(line: String): Unit = lock.synchronized {
-      // This test suite sometimes gets extremely slow out of unknown reason on Jenkins.  Here we
-      // add a timestamp to provide more diagnosis information.
-      val newLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
-      log.info(newLine)
-      buffer += newLine
+      logInfo(s"$source> $line")
+      buffer += line
 
       if (line.startsWith("Spark master: ") && line.contains("Application Id: ")) {
         foundMasterAndApplicationIdMessage.trySuccess(())

