Posted to reviews@spark.apache.org by "EnricoMi (via GitHub)" <gi...@apache.org> on 2024/01/22 18:48:26 UTC

[PR] [MINOR][Test][Connect] Discard stdout / stderr of test Spark connect server if not isDebug [spark]

EnricoMi opened a new pull request, #44836:
URL: https://github.com/apache/spark/pull/44836

   ### What changes were proposed in this pull request?
   The stdout and stderr output of the test Spark Connect server process used throughout the E2E tests should be discarded when not in debug mode.
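   
   For context, a minimal sketch of the idea, assuming the server is launched via `ProcessBuilder` and gated on the `SPARK_DEBUG_SC_JVM_CLIENT` flag the tests already use. The launch command and flag handling here are placeholders, not the actual patch:
   ```
   // Hedged sketch only: the command and env-var handling are placeholders,
   // not the exact code touched by this PR.
   import java.lang.ProcessBuilder.Redirect

   val isDebug =
     sys.env.get("SPARK_DEBUG_SC_JVM_CLIENT").exists(_.equalsIgnoreCase("true"))

   val builder = new ProcessBuilder("bin/spark-connect-shell")
   if (isDebug) {
     // Debug mode: surface the server's stdout/stderr in the test output.
     builder.redirectOutput(Redirect.INHERIT)
     builder.redirectError(Redirect.INHERIT)
   } else {
     // Otherwise drop the output entirely (Java 9+), leaving no pipe undrained.
     builder.redirectOutput(Redirect.DISCARD)
     builder.redirectError(Redirect.DISCARD)
   }
   val server = builder.start()
   ```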
   
   ### Why are the changes needed?
   Running the E2E tests only works for me in debug mode:
   
   ```
   SPARK_DEBUG_SC_JVM_CLIENT=true JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 SPARK_LOCAL_IP=localhost SKIP_UNIDOC=true SKIP_MIMA=true SERIAL_SBT_TESTS=1 build/sbt -Phadoop-3 -Pspark-ganglia-lgpl -Phadoop-cloud -Pkinesis-asl -Pkubernetes -Pconnect -Pvolcano -Pyarn package connect-client-jvm/test
   ```
   works just fine:
   ```
   [info] Run completed in 5 minutes, 47 seconds.
   [info] Total number of tests run: 1259
   [info] Suites: completed 25, aborted 0
   [info] Tests: succeeded 1259, failed 0, canceled 6, ignored 2, pending 0
   [info] All tests passed.
   [info] Passed: Total 1261, Failed 0, Errors 0, Passed 1261, Ignored 2, Canceled 6
   [success] Total time: 426 s (07:06), completed 22.01.2024, 19:46:21
   ```
   
   When not in debug mode, the server does not seem to start:
   ```
   JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 SPARK_LOCAL_IP=localhost SKIP_UNIDOC=true SKIP_MIMA=true SERIAL_SBT_TESTS=1 build/sbt -Phadoop-3 -Pspark-ganglia-lgpl -Phadoop-cloud -Pkinesis-asl -Pkubernetes -Pconnect -Pvolcano -Pyarn package connect-client-jvm/test
   ```
   ```
   [info] ClientStreamingQuerySuite:
   Will start Spark Connect server with `spark.sql.catalogImplementation=in-memory`, some tests that rely on Hive will be ignored. If you don't want to skip them:
   1. Test with maven: run `build/mvn install -DskipTests -Phive` before testing
   2. Test with sbt: run test with `-Phive` profile
   [info] org.apache.spark.sql.streaming.ClientStreamingQuerySuite *** ABORTED *** (35 seconds, 823 milliseconds)
   [info]   org.apache.spark.sql.connect.client.RetriesExceeded:
   [info]   at org.apache.spark.sql.connect.client.GrpcRetryHandler$Retrying.waitAfterAttempt(GrpcRetryHandler.scala:213)
   [info]   at org.apache.spark.sql.connect.client.GrpcRetryHandler$Retrying.retry(GrpcRetryHandler.scala:222)
   [info]   at org.apache.spark.sql.connect.client.GrpcRetryHandler.retry(GrpcRetryHandler.scala:36)
   [info]   at org.apache.spark.sql.connect.client.CustomSparkConnectBlockingStub.$anonfun$analyzePlan$1(CustomSparkConnectBlockingStub.scala:76)
   [info]   at org.apache.spark.sql.connect.client.GrpcExceptionConverter.convert(GrpcExceptionConverter.scala:58)
   [info]   at org.apache.spark.sql.connect.client.CustomSparkConnectBlockingStub.analyzePlan(CustomSparkConnectBlockingStub.scala:75)
   [info]   at org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:83)
   [info]   at org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:211)
   [info]   at org.apache.spark.sql.connect.client.SparkConnectClient.analyze(SparkConnectClient.scala:182)
   [info]   at org.apache.spark.sql.SparkSession.version$lzycompute(SparkSession.scala:80)
   [info]   at org.apache.spark.sql.SparkSession.version(SparkSession.scala:79)
   [info]   at org.apache.spark.sql.test.SparkConnectServerUtils$.createSparkSession(RemoteSparkSession.scala:198)
   [info]   at org.apache.spark.sql.test.RemoteSparkSession.beforeAll(RemoteSparkSession.scala:214)
   [info]   at org.apache.spark.sql.test.RemoteSparkSession.beforeAll$(RemoteSparkSession.scala:212)
   [info]   at org.apache.spark.sql.test.QueryTest.beforeAll(QueryTest.scala:28)
   [info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
   [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
   [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
   [info]   at org.apache.spark.sql.test.QueryTest.run(QueryTest.scala:28)
   [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
   [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
   Warning: Unable to serialize throwable of type org.apache.spark.sql.connect.client.RetriesExceeded for SuiteAborted(Ordinal(0, 2),org.apache.spark.sql.connect.client.RetriesExceeded encountered when attempting to run suite org.apache.spark.sql.streaming.ClientStreamingQuerySuite,ClientStreamingQuerySuite,org.apache.spark.sql.streaming.ClientStreamingQuerySuite,Some(ClientStreamingQuerySuite),Some(org.apache.spark.sql.connect.client.RetriesExceeded),Some(35823),Some(IndentedText(org.apache.spark.sql.streaming.ClientStreamingQuerySuite,org.apache.spark.sql.connect.client.RetriesExceeded encountered when attempting to run suite org.apache.spark.sql.streaming.ClientStreamingQuerySuite,0)),Some(SeeStackDepthException),None,None,pool-1-thread-1,1705938098933), setting it as NotSerializableWrapperException.
   Warning: Unable to read from client, please check on client for further details of the problem.
   [info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
   [info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   [info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
   [info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
   [info]   at java.base/java.lang.Thread.run(Thread.java:840)
   [info] FlatMapGroupsWithStateStreamingSuite:
   <waits forever>
   ```
   
   Discarding stdout and stderr when not in debug mode makes these tests work for me as expected. A plausible explanation: by default the child process's stdout and stderr are pipes that nothing drains, so the server blocks as soon as the OS pipe buffer fills.
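   
   A hypothetical stand-alone repro of that failure mode (not Spark code; the Unix `yes` command stands in for the chatty server process):
   ```
   // Hypothetical repro: a chatty child stalls once its piped stdout is never
   // read, but keeps running when its output is discarded.
   import java.lang.ProcessBuilder.Redirect

   val stalled = new ProcessBuilder("yes").start() // default: Redirect.PIPE
   Thread.sleep(1000)
   // Still "alive", but blocked on write once the ~64 KiB pipe buffer is full.
   println(s"piped child alive: ${stalled.isAlive}")
   stalled.destroy()

   val free = new ProcessBuilder("yes").redirectOutput(Redirect.DISCARD).start()
   Thread.sleep(1000)
   println(s"discarding child alive: ${free.isAlive}") // still producing freely
   free.destroy()
   ```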
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   No new tests. Verified manually by running the E2E test commands above, with and without `SPARK_DEBUG_SC_JVM_CLIENT=true`.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

