Posted to commits@spark.apache.org by ya...@apache.org on 2023/10/19 01:43:27 UTC
[spark] branch master updated: [SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests
This is an automated email from the ASF dual-hosted git repository.
yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new a14f90941ca [SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests
a14f90941ca is described below
commit a14f90941caf06e2d77789a3952dd588e6900b90
Author: Kent Yao <ya...@apache.org>
AuthorDate: Thu Oct 19 09:43:13 2023 +0800
[SPARK-45585][TEST] Fix time format and redirection issues in SparkSubmit tests
### What changes were proposed in this pull request?
This PR fixes:
- The mismatch between the timestamp from `new Timestamp(new Date().getTime)` and the log4j2 date format pattern in the output of the sub spark-submit process
```
2023-10-17 03:58:48.275 - stderr> 23/10/17 18:58:48 INFO StandaloneSchedulerBackend: Connected to Spark cluster with app ID app-20231017185848-0000
2023-10-17 03:58:48.278 - stderr> 23/10/17 18:58:48 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57637.
```
- The duplicated timestamp from `new Timestamp(new Date().getTime)` when using logInfo instead of println
```
23/10/17 19:02:34.392 Thread-5 INFO SparkShellSuite: 2023-10-17 04:02:34.392 - stderr> 23/10/17 19:02:34 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
23/10/17 19:02:34.393 Thread-5 INFO SparkShellSuite: 2023-10-17 04:02:34.393 - stderr> 23/10/17 19:02:34 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
```
- The redirection of sub spark-submit process logs to unit-tests.log, which now works correctly
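The first item can be illustrated with a minimal sketch. It assumes (this is not stated in the PR) that the test JVM and the sub spark-submit process run with different default time zones. The zone IDs, the chosen instant, and the names `TimestampFormatSketch`, `timestampStyle`, `log4jStyle` and `render` are hypothetical, and the two patterns only approximate `java.sql.Timestamp#toString` and the log4j2 prefix seen in the sample output above.

```scala
import java.time.{Instant, ZoneId}
import java.time.format.DateTimeFormatter

object TimestampFormatSketch {
  // Approximates java.sql.Timestamp#toString at millisecond precision.
  val timestampStyle: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS")

  // Approximates the log4j2 date prefix seen in the child process output.
  val log4jStyle: DateTimeFormatter =
    DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss")

  // Renders one instant in the given zone with the given pattern.
  def render(instant: Instant, zone: String, fmt: DateTimeFormatter): String =
    fmt.withZone(ZoneId.of(zone)).format(instant)

  def main(args: Array[String]): Unit = {
    // Hypothetical instant and zones, chosen only for illustration.
    val instant = Instant.parse("2023-10-17T10:58:48.275Z")
    println(render(instant, "America/Los_Angeles", timestampStyle)) // 2023-10-17 03:58:48.275
    println(render(instant, "Asia/Shanghai", log4jStyle))           // 23/10/17 18:58:48
  }
}
```

Under these assumptions the same instant appears with two different hours and two different layouts on a single log line, which is why a single timestamp source is easier to read.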
### Why are the changes needed?
These are test-only fixes.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
- WholeStageCodegenSparkSubmitSuite - before
```
18:58:53.882 shutdown-hook-0 INFO ShutdownHookManager: Shutdown hook called
18:58:53.882 shutdown-hook-0 INFO ShutdownHookManager: Deleting directory /Users/hzyaoqin/spark/target/tmp/spark-ecd53d47-d109-4ddc-80dd-2d829f34371e
11:58:18.892 pool-1-thread-1 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
11:58:18.893 pool-1-thread-1 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
11:58:18.932 pool-1-thread-1-ScalaTest-running-WholeStageCodegenSparkSubmitSuite INFO WholeStageCodegenSparkSubmitSuite:
```
- WholeStageCodegenSparkSubmitSuite - after
```
===== TEST OUTPUT FOR o.a.s.sql.execution.WholeStageCodegenSparkSubmitSuite: 'Generated code on driver should not embed platform-specific constant' =====
11:58:19.882 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:19 WARN Utils: Your hostname, hulk.local resolves to a loopback address: 127.0.0.1; using 10.221.103.23 instead (on interface en0)
11:58:19.883 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:19 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Running Spark version 4.0.0-SNAPSHOT
11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: OS info Mac OS X, 13.4, aarch64
11:58:20.195 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Java version 17.0.8
11:58:20.227 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: ==============================================================
11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: No custom resources configured for spark.driver.
11:58:20.253 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceUtils: ==============================================================
11:58:20.254 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkContext: Submitted application: org.apache.spark.sql.execution.WholeStageCodegenSparkSubmitSuite
11:58:20.266 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
11:58:20.268 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfile: Limiting resource is cpu
11:58:20.268 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO ResourceProfileManager: Added ResourceProfile id: 0
11:58:20.302 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing view acls to: hzyaoqin
11:58:20.302 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing modify acls to: hzyaoqin
11:58:20.303 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing view acls groups to:
11:58:20.303 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: Changing modify acls groups to:
11:58:20.305 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: hzyaoqin; groups with view permissions: EMPTY; users with modify permissions: hzyaoqin; groups with modify permissions: EMPTY; RPC SSL disabled
11:58:20.448 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO Utils: Successfully started service 'sparkDriver' on port 52173.
11:58:20.465 Thread-6 INFO WholeStageCodegenSparkSubmitSuite: stderr> 23/10/18 11:58:20 INFO SparkEnv: Registering MapOutputTracker
```
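The shape of the "after" output can be sketched as follows. This is a simplified illustration, not the code in the patch: `CaptureOutputSketch`, `logged` and the in-memory `logInfo` are hypothetical stand-ins for a real logger whose layout adds the timestamp, thread and class name.

```scala
import scala.collection.mutable.ArrayBuffer

object CaptureOutputSketch {
  // Hypothetical stand-in for a logger call: a real logging layout, not the
  // caller, would prepend the timestamp, thread and class name.
  val logged: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def logInfo(msg: String): Unit = logged += msg

  // Raw child-process lines, kept unmodified for later assertions.
  val history: ArrayBuffer[String] = ArrayBuffer.empty[String]

  // Tags each line with its source stream; adds no timestamp of its own.
  def captureOutput(source: String)(line: String): Unit = {
    logInfo(s"$source> $line")
    history += line
  }

  def main(args: Array[String]): Unit = {
    val onStderr: String => Unit = captureOutput("stderr")
    onStderr("23/10/18 11:58:20 INFO SparkContext: Running Spark version 4.0.0-SNAPSHOT")
    logged.foreach(println)
  }
}
```

Because the capture function no longer formats a timestamp itself, each logged line carries only the logger's timestamp and the child process's own.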
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #43421 from yaooqinn/SPARK-45585.
Authored-by: Kent Yao <ya...@apache.org>
Signed-off-by: Kent Yao <ya...@apache.org>
---
.../org/apache/spark/deploy/SparkSubmitTestUtils.scala | 15 ++-------------
.../scala/org/apache/spark/repl/SparkShellSuite.scala | 11 ++++-------
.../org/apache/spark/sql/hive/thriftserver/CliSuite.scala | 9 ++-------
3 files changed, 8 insertions(+), 27 deletions(-)
diff --git a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
index 2ab2e17df03..932e972374c 100644
--- a/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/SparkSubmitTestUtils.scala
@@ -18,8 +18,6 @@
package org.apache.spark.deploy
import java.io.File
-import java.sql.Timestamp
-import java.util.Date
import scala.collection.mutable.ArrayBuffer
@@ -69,17 +67,8 @@ trait SparkSubmitTestUtils extends SparkFunSuite with TimeLimits {
env.put("SPARK_HOME", sparkHome)
def captureOutput(source: String)(line: String): Unit = {
- // This test suite has some weird behaviors when executed on Jenkins:
- //
- // 1. Sometimes it gets extremely slow out of unknown reason on Jenkins. Here we add a
- // timestamp to provide more diagnosis information.
- // 2. Log lines are not correctly redirected to unit-tests.log as expected, so here we print
- // them out for debugging purposes.
- val logLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
- // scalastyle:off println
- println(logLine)
- // scalastyle:on println
- history += logLine
+ logInfo(s"$source> $line")
+ history += line
}
val process = builder.start()
diff --git a/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala b/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
index 39544beec41..067f08cb675 100644
--- a/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
+++ b/repl/src/test/scala/org/apache/spark/repl/SparkShellSuite.scala
@@ -19,8 +19,6 @@ package org.apache.spark.repl
import java.io._
import java.nio.charset.StandardCharsets
-import java.sql.Timestamp
-import java.util.Date
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.Promise
@@ -70,10 +68,9 @@ class SparkShellSuite extends SparkFunSuite {
val lock = new Object
def captureOutput(source: String)(line: String): Unit = lock.synchronized {
- // This test suite sometimes gets extremely slow out of unknown reason on Jenkins. Here we
- // add a timestamp to provide more diagnosis information.
- val newLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
- log.info(newLine)
+ val newLine = s"$source> $line"
+
+ logInfo(newLine)
buffer += newLine
if (line.startsWith("Spark context available") && line.contains("app id")) {
@@ -82,7 +79,7 @@ class SparkShellSuite extends SparkFunSuite {
// If we haven't found all expected answers and another expected answer comes up...
if (next < expectedAnswers.size && line.contains(expectedAnswers(next))) {
- log.info(s"$source> found expected output line $next: '${expectedAnswers(next)}'")
+ logInfo(s"$source> found expected output line $next: '${expectedAnswers(next)}'")
next += 1
// If all expected answers have been found...
if (next == expectedAnswers.size) {
diff --git a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 4588cf39d1f..d5045cb511c 100644
--- a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -19,8 +19,6 @@ package org.apache.spark.sql.hive.thriftserver
import java.io._
import java.nio.charset.StandardCharsets
-import java.sql.Timestamp
-import java.util.Date
import java.util.concurrent.CountDownLatch
import scala.collection.mutable.ArrayBuffer
@@ -145,11 +143,8 @@ class CliSuite extends SparkFunSuite {
val lock = new Object
def captureOutput(source: String)(line: String): Unit = lock.synchronized {
- // This test suite sometimes gets extremely slow out of unknown reason on Jenkins. Here we
- // add a timestamp to provide more diagnosis information.
- val newLine = s"${new Timestamp(new Date().getTime)} - $source> $line"
- log.info(newLine)
- buffer += newLine
+ logInfo(s"$source> $line")
+ buffer += line
if (line.startsWith("Spark master: ") && line.contains("Application Id: ")) {
foundMasterAndApplicationIdMessage.trySuccess(())