Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/17 09:11:58 UTC

[GitHub] [spark] yaooqinn commented on issue #26914: [SPARK-30274][Core] Avoid BytesToBytesMap lookup hang forever when holding keys reaching max capacity

yaooqinn commented on issue #26914: [SPARK-30274][Core] Avoid BytesToBytesMap lookup hang forever when holding keys reaching max capacity
URL: https://github.com/apache/spark/pull/26914#issuecomment-566452079
 
 
   We have a job that seems related to this issue. It sometimes hangs in a BroadcastJoin stage for about 2 hours. In the executor log below, task 131726 starts spilling sort data at 04:11:48 and does not finish until 06:24:45.
   ```
   2019-12-17 04:11:31,072 [619016] - INFO  [Executor task launch worker for task 131726:Logging$class@54] - Code generated in 4163.18468 ms
   2019-12-17 04:11:34,701 [622645] - INFO  [dispatcher-event-loop-3:Logging$class@54] - Executor is trying to kill task 2208.0 in stage 16.0 (TID 117693), reason: another attempt succeeded
   2019-12-17 04:11:34,712 [622656] - INFO  [Executor task launch worker for task 117693:Logging$class@54] - Executor killed task 2208.0 in stage 16.0 (TID 117693), reason: another attempt succeeded
   2019-12-17 04:11:34,717 [622661] - INFO  [dispatcher-event-loop-0:Logging$class@54] - Got assigned task 132494
   2019-12-17 04:11:34,717 [622661] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Running task 13488.1 in stage 16.0 (TID 132494)
   2019-12-17 04:11:34,804 [622748] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Getting 13825 non-empty blocks out of 15000 blocks
   2019-12-17 04:11:34,904 [622848] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Started 1006 remote fetches in 107 ms
   2019-12-17 04:11:34,907 [622851] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Getting 100 non-empty blocks out of 104 blocks
   2019-12-17 04:11:34,909 [622853] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Started 56 remote fetches in 2 ms
   2019-12-17 04:11:36,870 [624814] - INFO  [Executor task launch worker for task 132012:Logging$class@54] - Code generated in 4465.326931 ms
   2019-12-17 04:11:36,895 [624839] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Code generated in 227.660186 ms
   2019-12-17 04:11:37,311 [625255] - INFO  [dispatcher-event-loop-2:Logging$class@54] - Executor is trying to kill task 10766.1 in stage 16.0 (TID 130906), reason: another attempt succeeded
   2019-12-17 04:11:37,323 [625267] - INFO  [Executor task launch worker for task 130906:Logging$class@54] - Executor killed task 10766.1 in stage 16.0 (TID 130906), reason: another attempt succeeded
   2019-12-17 04:11:37,327 [625271] - INFO  [dispatcher-event-loop-1:Logging$class@54] - Got assigned task 132680
   2019-12-17 04:11:37,327 [625271] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Running task 3992.1 in stage 16.0 (TID 132680)
   2019-12-17 04:11:37,399 [625343] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Getting 14650 non-empty blocks out of 15000 blocks
   2019-12-17 04:11:37,472 [625416] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Started 1006 remote fetches in 84 ms
   2019-12-17 04:11:37,489 [625433] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Getting 100 non-empty blocks out of 104 blocks
   2019-12-17 04:11:37,491 [625435] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Started 56 remote fetches in 2 ms
   2019-12-17 04:11:40,958 [628902] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Code generated in 1301.014453 ms
   2019-12-17 04:11:48,583 [636527] - INFO  [Executor task launch worker for task 131726:UnsafeExternalSorter@209] - Thread 119 spilling sort data of 2.1 GB to disk (0  time so far)
   2019-12-17 04:11:52,807 [640751] - INFO  [Executor task launch worker for task 132494:Logging$class@54] - Finished task 13488.1 in stage 16.0 (TID 132494). 10395 bytes result sent to driver
   2019-12-17 04:12:07,486 [655430] - INFO  [dispatcher-event-loop-3:Logging$class@54] - Executor is trying to kill task 3992.1 in stage 16.0 (TID 132680), reason: another attempt succeeded
   2019-12-17 04:12:07,489 [655433] - INFO  [Executor task launch worker for task 132680:Logging$class@54] - Executor killed task 3992.1 in stage 16.0 (TID 132680), reason: another attempt succeeded
   2019-12-17 04:12:47,910 [695854] - INFO  [dispatcher-event-loop-0:Logging$class@54] - Executor is trying to kill task 3949.1 in stage 16.0 (TID 132012), reason: another attempt succeeded
   2019-12-17 04:12:47,914 [695858] - INFO  [Executor task launch worker for task 132012:Logging$class@54] - Executor killed task 3949.1 in stage 16.0 (TID 132012), reason: another attempt succeeded
   2019-12-17 06:24:45,498 [8613442] - INFO  [Executor task launch worker for task 131726:Logging$class@54] - Finished task 3638.1 in stage 16.0 (TID 131726). 10524 bytes result sent to driver
   ```
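
To make the stall easier to see, here is a minimal sketch that measures how long each task appears in an executor log like the one above. It assumes the line layout shown in the excerpt (timestamp, bracketed offset, level, then a thread name containing `task <TID>`). The regular expression and the helper name `task_time_spans` are hypothetical, written only for this illustration, and are not part of Spark.

```python
import re
from datetime import datetime

# Assumed layout, taken from the log excerpt:
# "<date> <time>,<ms> [<offset>] - <LEVEL>  [Executor task launch worker for task <TID>:<logger>] - <message>"
LINE_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[\d+\] - \w+\s+"
    r"\[Executor task launch worker for task (\d+):"
)


def task_time_spans(lines):
    """Return {task_id: (first_seen, last_seen)} for task worker threads."""
    spans = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # dispatcher lines and anything else that does not fit
        seen_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        task_id = int(match.group(2))
        first_seen, _ = spans.get(task_id, (seen_at, seen_at))
        spans[task_id] = (first_seen, seen_at)
    return spans


# The three lines for task 131726, copied from the excerpt.
sample = [
    "2019-12-17 04:11:31,072 [619016] - INFO  [Executor task launch worker for task 131726:Logging$class@54] - Code generated in 4163.18468 ms",
    "2019-12-17 04:11:48,583 [636527] - INFO  [Executor task launch worker for task 131726:UnsafeExternalSorter@209] - Thread 119 spilling sort data of 2.1 GB to disk (0  time so far)",
    "2019-12-17 06:24:45,498 [8613442] - INFO  [Executor task launch worker for task 131726:Logging$class@54] - Finished task 3638.1 in stage 16.0 (TID 131726). 10524 bytes result sent to driver",
]

spans = task_time_spans(sample)
first_seen, last_seen = spans[131726]
elapsed = last_seen - first_seen
print(elapsed)  # 2:13:14.426000
```

The result agrees with the bracketed millisecond offsets in the log (8613442 - 619016 = 7994426 ms), so either column can be used to measure the gap.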

