You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/06/12 04:52:00 UTC

[jira] [Resolved] (SPARK-31928) Flaky test: StreamingDeduplicationSuite.test no-data flag

     [ https://issues.apache.org/jira/browse/SPARK-31928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-31928.
----------------------------------
    Resolution: Duplicate

> Flaky test: StreamingDeduplicationSuite.test no-data flag
> ---------------------------------------------------------
>
>                 Key: SPARK-31928
>                 URL: https://issues.apache.org/jira/browse/SPARK-31928
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming, Tests
>    Affects Versions: 3.1.0
>            Reporter: Gabor Somogyi
>            Priority: Major
>
> Test failed: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123621/
> {code:java}
> [info]   with spark.sql.streaming.noDataMicroBatches.enabled = false 
> [info]   Assert on query failed: : 
> [info]   Assert on query failed: 
> [info]   
> [info]   == Progress ==
> [info]      StartStream(ProcessingTimeTrigger(0),org.apache.spark.util.SystemClock@372edb19,Map(spark.sql.streaming.noDataMicroBatches.enabled -> false),null)
> [info]      AddData to MemoryStream[value#437541]: 10,11,12,13,14,15
> [info]      CheckAnswer: [10],[11],[12],[13],[14],[15]
> [info]      AssertOnQuery(<condition>, Check total state rows = List(6), updated state rows = List(6))
> [info]      AddData to MemoryStream[value#437541]: 25
> [info]      CheckNewAnswer: [25]
> [info]      AssertOnQuery(<condition>, Check total state rows = List(7), updated state rows = List(1))
> [info]   => AssertOnQuery(<condition>, )
> [info]   
> [info]   == Stream ==
> [info]   Output Mode: Append
> [info]   Stream state: {MemoryStream[value#437541]: 1}
> [info]   Thread state: alive
> [info]   Thread stack trace: java.lang.Thread.sleep(Native Method)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:241)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$1375/882607691.apply$mcZ$sp(Unknown Source)
> [info]   org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:57)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:185)
> [info]   org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:334)
> [info]   org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
> [info]   
> [info]   
> [info]   == Sink ==
> [info]   0: [11] [14] [13] [10] [15] [12]
> [info]   1: [25]
> [info]   
> [info]   
> [info]   == Plan ==
> [info]   == Parsed Logical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Analyzed Logical Plan ==
> [info]   
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Optimized Logical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Physical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- *(2) Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- StreamingDeduplicate [value#437541, eventTime#437544-T10000ms], state info [ checkpoint = file:/home/jenkins/workspace/SparkPullRequestBuilder@2/target/tmp/streaming.metadata-e446f9e9-0afe-4b5e-b6a6-772d0debc7ee/state, runId = 9ed21ac3-8994-4e13-8da1-db14a1f22da6, opId = 0, ver = 1, numPartitions = 5], 5000
> [info]         +- Exchange hashpartitioning(value#437541, eventTime#437544-T10000ms, 5), true, [id=#875392]
> [info]            +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]               +- *(1) Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]                  +- *(1) Project [value#437541]
> [info]                     +- MicroBatchScan[value#437541] MemoryStreamDataSource
> [info]   
> [info]            
> [info]            
> [info]   org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)
> [info]   	org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)
> [info]   	org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
> [info]   	org.scalatest.Assertions.fail(Assertions.scala:1091)
> [info]   	org.scalatest.Assertions.fail$(Assertions.scala:1087)
> [info]   	org.scalatest.FunSuite.fail(FunSuite.scala:1560)
> [info]   	org.apache.spark.sql.streaming.StreamTest.failTest$1(StreamTest.scala:452)
> [info]   	org.apache.spark.sql.streaming.StreamTest.verify$1(StreamTest.scala:423)
> [info]   	org.apache.spark.sql.streaming.StreamTest.executeAction$1(StreamTest.scala:651)
> [info]   	org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$56(StreamTest.scala:778)
> [info]   
> [info]   
> [info]   == Progress ==
> [info]      StartStream(ProcessingTimeTrigger(0),org.apache.spark.util.SystemClock@372edb19,Map(spark.sql.streaming.noDataMicroBatches.enabled -> false),null)
> [info]      AddData to MemoryStream[value#437541]: 10,11,12,13,14,15
> [info]      CheckAnswer: [10],[11],[12],[13],[14],[15]
> [info]      AssertOnQuery(<condition>, Check total state rows = List(6), updated state rows = List(6))
> [info]      AddData to MemoryStream[value#437541]: 25
> [info]      CheckNewAnswer: [25]
> [info]      AssertOnQuery(<condition>, Check total state rows = List(7), updated state rows = List(1))
> [info]   => AssertOnQuery(<condition>, )
> [info]   
> [info]   == Stream ==
> [info]   Output Mode: Append
> [info]   Stream state: {MemoryStream[value#437541]: 1}
> [info]   Thread state: alive
> [info]   Thread stack trace: java.lang.Thread.sleep(Native Method)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:241)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution$$Lambda$1375/882607691.apply$mcZ$sp(Unknown Source)
> [info]   org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:57)
> [info]   org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:185)
> [info]   org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:334)
> [info]   org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
> [info]   
> [info]   
> [info]   == Sink ==
> [info]   0: [11] [14] [13] [10] [15] [12]
> [info]   1: [25]
> [info]   
> [info]   
> [info]   == Plan ==
> [info]   == Parsed Logical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Analyzed Logical Plan ==
> [info]   
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Optimized Logical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- Deduplicate [value#437541, eventTime#437544-T10000ms]
> [info]         +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]            +- Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]               +- StreamingDataSourceV2Relation [value#437541], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@1802eea6, MemoryStream[value#437541], 0, 1
> [info]   
> [info]   == Physical Plan ==
> [info]   WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@158ccd13
> [info]   +- *(2) Project [cast(eventTime#437544-T10000ms as bigint) AS eventTime#437548L]
> [info]      +- StreamingDeduplicate [value#437541, eventTime#437544-T10000ms], state info [ checkpoint = file:/home/jenkins/workspace/SparkPullRequestBuilder@2/target/tmp/streaming.metadata-e446f9e9-0afe-4b5e-b6a6-772d0debc7ee/state, runId = 9ed21ac3-8994-4e13-8da1-db14a1f22da6, opId = 0, ver = 1, numPartitions = 5], 5000
> [info]         +- Exchange hashpartitioning(value#437541, eventTime#437544-T10000ms, 5), true, [id=#875392]
> [info]            +- EventTimeWatermark eventTime#437544: timestamp, 10 seconds
> [info]               +- *(1) Project [value#437541, cast(value#437541 as timestamp) AS eventTime#437544]
> [info]                  +- *(1) Project [value#437541]
> [info]                     +- MicroBatchScan[value#437541] MemoryStreamDataSource (StreamTest.scala:452)
> [info]   org.scalatest.exceptions.TestFailedException:
> [info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)
> [info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)
> [info]   at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
> [info]   at org.scalatest.Assertions.fail(Assertions.scala:1091)
> [info]   at org.scalatest.Assertions.fail$(Assertions.scala:1087)
> [info]   at org.scalatest.FunSuite.fail(FunSuite.scala:1560)
> [info]   at org.apache.spark.sql.streaming.StreamTest.failTest$1(StreamTest.scala:452)
> [info]   at org.apache.spark.sql.streaming.StreamTest.executeAction$1(StreamTest.scala:654)
> [info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$56(StreamTest.scala:778)
> [info]   at org.apache.spark.sql.streaming.StreamTest.$anonfun$testStream$56$adapted(StreamTest.scala:765)
> [info]   at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
> [info]   at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
> [info]   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
> [info]   at org.apache.spark.sql.streaming.StreamTest.liftedTree1$1(StreamTest.scala:765)
> [info]   at org.apache.spark.sql.streaming.StreamTest.testStream(StreamTest.scala:764)
> [info]   at org.apache.spark.sql.streaming.StreamTest.testStream$(StreamTest.scala:334)
> [info]   at org.apache.spark.sql.streaming.StreamingDeduplicationSuite.testStream(StreamingDeduplicationSuite.scala:29)
> [info]   at org.apache.spark.sql.streaming.StreamingDeduplicationSuite.$anonfun$new$19(StreamingDeduplicationSuite.scala:284)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.scalatest.Assertions.withClue(Assertions.scala:1223)
> [info]   at org.scalatest.Assertions.withClue$(Assertions.scala:1210)
> [info]   at org.scalatest.FunSuite.withClue(FunSuite.scala:1560)
> [info]   at org.apache.spark.sql.streaming.StreamingDeduplicationSuite.testWithFlag$1(StreamingDeduplicationSuite.scala:264)
> [info]   at org.apache.spark.sql.streaming.StreamingDeduplicationSuite.$anonfun$new$18(StreamingDeduplicationSuite.scala:289)
> [info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> [info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
> [info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
> [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
> [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
> [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
> [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:157)
> [info]   at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
> [info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:286)
> [info]   at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
> [info]   at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:59)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
> [info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
> [info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:59)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
> [info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:393)
> [info]   at scala.collection.immutable.List.foreach(List.scala:392)
> [info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:381)
> [info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:376)
> [info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:458)
> [info]   at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
> [info]   at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
> [info]   at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
> [info]   at org.scalatest.Suite.run(Suite.scala:1124)
> [info]   at org.scalatest.Suite.run$(Suite.scala:1106)
> [info]   at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
> [info]   at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
> [info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:518)
> [info]   at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
> [info]   at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
> [info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:59)
> [info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
> [info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
> [info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
> [info]   at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:59)
> [info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:317)
> [info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:510)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:296)
> [info]   at sbt.ForkMain$Run$2.call(ForkMain.java:286)
> [info]   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [info]   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [info]   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [info]   at java.lang.Thread.run(Thread.java:748)
> 04:06:09.061 WARN org.apache.spark.sql.streaming.StreamingDeduplicationSuite: 
> ===== POSSIBLE THREAD LEAK IN SUITE o.a.s.sql.streaming.StreamingDeduplicationSuite, thread names: state-store-maintenance-task, shuffle-boss-1682-1, rpc-boss-1679-1 =====
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org