Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/02/08 13:54:08 UTC
[jira] [Updated] (SPARK-34154) Flaky Test: LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750)
[ https://issues.apache.org/jira/browse/SPARK-34154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-34154:
---------------------------------
Fix Version/s: 3.1.1
> Flaky Test: LocalityPlacementStrategySuite.handle large number of containers and tasks (SPARK-18750)
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-34154
> URL: https://issues.apache.org/jira/browse/SPARK-34154
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 3.0.2, 3.2.0, 3.1.1
> Reporter: Dongjoon Hyun
> Assignee: Attila Zsolt Piros
> Priority: Major
> Fix For: 3.0.2, 3.2.0, 3.1.1, 3.1.2
>
>
> `LocalityPlacementStrategySuite` sometimes hangs, as in the runs and log below. We can retrigger the job, but a hung run wastes significant resources because it keeps running until the 6-hour timeout occurs.
> [https://github.com/apache/spark/runs/1719480243]
> [https://github.com/apache/spark/runs/1724459002]
> [https://github.com/apache/spark/runs/1717958874]
> [https://github.com/apache/spark/runs/1731673955] (branch-3.0)
> {code:java}
> [info] LocalityPlacementStrategySuite:
> 17299[info] *** Test still running after 3 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17300[info] *** Test still running after 8 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17301[info] *** Test still running after 13 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17302[info] *** Test still running after 18 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17303[info] *** Test still running after 23 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17304[info] *** Test still running after 28 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17305[info] *** Test still running after 33 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17306[info] *** Test still running after 38 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17307[info] *** Test still running after 43 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17308[info] *** Test still running after 48 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17309[info] *** Test still running after 53 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17310[info] *** Test still running after 58 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17311[info] *** Test still running after 1 hour, 3 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17312[info] *** Test still running after 1 hour, 8 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17313[info] *** Test still running after 1 hour, 13 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17314[info] *** Test still running after 1 hour, 18 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17315[info] *** Test still running after 1 hour, 23 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17316[info] *** Test still running after 1 hour, 28 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17317[info] *** Test still running after 1 hour, 33 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17318[info] *** Test still running after 1 hour, 38 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17319[info] *** Test still running after 1 hour, 43 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17320[info] *** Test still running after 1 hour, 48 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17321[info] *** Test still running after 1 hour, 53 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17322[info] *** Test still running after 1 hour, 58 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17323[info] *** Test still running after 2 hours, 3 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17324[info] *** Test still running after 2 hours, 8 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17325[info] *** Test still running after 2 hours, 13 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17326[info] *** Test still running after 2 hours, 18 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17327[info] *** Test still running after 2 hours, 23 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17328[info] *** Test still running after 2 hours, 28 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17329[info] *** Test still running after 2 hours, 33 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17330[info] *** Test still running after 2 hours, 38 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17331[info] *** Test still running after 2 hours, 43 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17332[info] *** Test still running after 2 hours, 48 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17333[info] *** Test still running after 2 hours, 53 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17334[info] *** Test still running after 2 hours, 58 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17335[info] *** Test still running after 3 hours, 3 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17336[info] *** Test still running after 3 hours, 8 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17337[info] *** Test still running after 3 hours, 13 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17338[info] *** Test still running after 3 hours, 18 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17339[info] *** Test still running after 3 hours, 23 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17340[info] *** Test still running after 3 hours, 28 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17341[info] *** Test still running after 3 hours, 33 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17342[info] *** Test still running after 3 hours, 38 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17343[info] *** Test still running after 3 hours, 43 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17344[info] *** Test still running after 3 hours, 48 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17345[info] *** Test still running after 3 hours, 53 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17346[info] *** Test still running after 3 hours, 58 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17347[info] *** Test still running after 4 hours, 3 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17348[info] *** Test still running after 4 hours, 8 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17349[info] *** Test still running after 4 hours, 13 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750).
> 17350[info] *** Test still running after 4 hours, 18 minutes, 6 seconds: suite name: LocalityPlacementStrategySuite, test name: handle large number of containers and tasks (SPARK-18750). {code}
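
The log above shows why an unbounded hang is costly: the test only stops when the 6-hour job timeout fires. As a general illustration (not the fix applied for this issue, and not Spark or ScalaTest code), a potentially blocking computation can be given its own much shorter time limit so that a hang fails fast. The sketch below uses only `java.util.concurrent`; the class name `BoundedRun` and the method `runWithTimeout` are hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
public class BoundedRun {

    /**
     * Runs the task on a separate daemon thread and waits at most timeoutMillis.
     * Returns the task's result, or throws TimeoutException if the task has not
     * finished within the limit.
     */
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-run");
            t.setDaemon(true); // a stuck task must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only stops tasks that respond to interruption.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast task completes normally and its value is returned.
        System.out.println(runWithTimeout(() -> 42, 1000));

        // A task that would block far longer than the limit fails fast instead.
        try {
            runWithTimeout(() -> { Thread.sleep(60_000); return 0; }, 100);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

A limit like this does not explain why the test hangs; it only caps how long a single hang can occupy a CI worker. A task stuck in a non-interruptible loop would keep running on its daemon thread after the caller has given up.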