You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by Artem P <yo...@gmail.com> on 2019/02/22 10:40:39 UTC
Occasional broadcast timeout when dynamic allocation is on
Hi!
We have dynamic allocation enabled for our regular jobs and sometimes they fail with java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]. Seems like spark driver starts broadcast just before the job has received any executors from the YARN and if it takes more than 5 minutes to acquire them, the broadcast fails with TimeoutException. Is there any way to force Spark start broadcast only after all executors are in place (at least minExecutors count)?
Relevant logs (it can be seen, that exception happened 5 minutes after broadcast start message):
```
2019-02-22 06:11:48,047 [broadcast-exchange-0] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 361.70265 ms
2019-02-22 06:11:48,237 [broadcast-exchange-0] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0 stored as values in memory (estimated size 485.3 KB, free 365.8 MB)
2019-02-22 06:11:48,297 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 611.624073 ms
2019-02-22 06:11:48,522 [broadcast-exchange-0] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 63.5 KB, free 365.8 MB)
2019-02-22 06:11:48,524 [dispatcher-event-loop-4] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on ip-10-4-3-123.eu-central-1.compute.internal:42935 (size: 63.5 KB, free: 366.2 MB)
2019-02-22 06:11:48,531 [broadcast-exchange-0] INFO org.apache.spark.SparkContext - Created broadcast 0 from run at ThreadPoolExecutor.java:1149
2019-02-22 06:11:48,545 [broadcast-exchange-0] INFO org.apache.spark.sql.execution.FileSourceScanExec - Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
2019-02-22 06:11:48,859 [broadcast-exchange-0] INFO org.apache.spark.SparkContext - Starting job: run at ThreadPoolExecutor.java:1149
2019-02-22 06:11:48,885 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Got job 0 (run at ThreadPoolExecutor.java:1149) with 1 output partitions
2019-02-22 06:11:48,886 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Final stage: ResultStage 0 (run at ThreadPoolExecutor.java:1149)
2019-02-22 06:11:48,893 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Parents of final stage: List()
2019-02-22 06:11:48,895 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Missing parents: List()
2019-02-22 06:11:48,940 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting ResultStage 0 (MapPartitionsRDD[2] at run at ThreadPoolExecutor.java:1149), which has no missing parents
2019-02-22 06:11:49,004 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_1 stored as values in memory (estimated size 11.7 KB, free 365.8 MB)
2019-02-22 06:11:49,024 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 362.596124 ms
2019-02-22 06:11:49,054 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.3 KB, free 365.7 MB)
2019-02-22 06:11:49,055 [dispatcher-event-loop-1] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_1_piece0 in memory on ip-10-4-3-123.eu-central-1.compute.internal:42935 (size: 5.3 KB, free: 366.2 MB)
2019-02-22 06:11:49,066 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext - Created broadcast 1 from broadcast at DAGScheduler.scala:1079
2019-02-22 06:11:49,101 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at run at ThreadPoolExecutor.java:1149) (first 15 tasks are for partitions Vector(0))
2019-02-22 06:11:49,103 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.cluster.YarnScheduler - Adding task set 0.0 with 1 tasks
2019-02-22 06:11:49,116 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 56.99095 ms
2019-02-22 06:11:49,188 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.FairSchedulableBuilder - Added task set TaskSet_0.0 tasks to pool default
2019-02-22 06:12:04,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:12:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:12:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:12:49,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:13:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:13:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:13:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:13:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:14:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:14:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:14:34,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:14:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:15:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:15:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:15:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:15:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:16:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:16:19,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:16:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
2019-02-22 06:16:49,169 [main] WARN org.apache.spark.util.Utils - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
```
And the exception right after this:
```
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:136)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:367)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:144)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:140)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:140)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:135)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:232)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:102)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
at org.apache.spark.sql.execution.ProjectExec.consume(basicPhysicalOperators.scala:35)
at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:65)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
at org.apache.spark.sql.execution.FilterExec.consume(basicPhysicalOperators.scala:85)
at org.apache.spark.sql.execution.FilterExec.doConsume(basicPhysicalOperators.scala:206)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
at org.apache.spark.sql.execution.FileSourceScanExec.consume(DataSourceScanExec.scala:158)
at org.apache.spark.sql.execution.ColumnarBatchScan$class.produceBatches(ColumnarBatchScan.scala:138)
at org.apache.spark.sql.execution.ColumnarBatchScan$class.doProduce(ColumnarBatchScan.scala:78)
at org.apache.spark.sql.execution.FileSourceScanExec.doProduce(DataSourceScanExec.scala:158)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.FileSourceScanExec.produce(DataSourceScanExec.scala:158)
at org.apache.spark.sql.execution.FilterExec.doProduce(basicPhysicalOperators.scala:125)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.FilterExec.produce(basicPhysicalOperators.scala:85)
at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:45)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:97)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:39)
at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:45)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:524)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:576)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.prepareShuffleDependency(ShuffleExchangeExec.scala:92)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:128)
at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:119)
at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
```
Re: Occasional broadcast timeout when dynamic allocation is on
Posted by Abdeali Kothari <ab...@gmail.com>.
I've been facing this issue for the past few months too.
I always thought it was an infrastructure issue, but we were never able to
figure out what the infra issue was.
If others are facing this issue too - then maybe it's a valid bug.
Does anyone have any ideas on how we can debug this?
On Fri, Feb 22, 2019, 16:10 Artem P <yonedadev@gmail.com wrote:
> Hi!
>
> We have dynamic allocation enabled for our regular jobs and sometimes they
> fail with java.util.concurrent.TimeoutException: Futures timed out after
> [300 seconds]. Seems like spark driver starts broadcast just before the job
> has received any executors from the YARN and if it takes more than 5
> minutes to acquire them, the broadcast fails with TimeoutException. Is
> there any way to force Spark start broadcast only after all executors are
> in place (at least minExecutors count)?
>
> Relevant logs (it can be seen, that exception happened 5 minutes after
> broadcast start message):
> ```
>
> 2019-02-22 06:11:48,047 [broadcast-exchange-0] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 361.70265 ms
> 2019-02-22 06:11:48,237 [broadcast-exchange-0] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0 stored as values in memory (estimated size 485.3 KB, free 365.8 MB)
> 2019-02-22 06:11:48,297 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 611.624073 ms
> 2019-02-22 06:11:48,522 [broadcast-exchange-0] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_0_piece0 stored as bytes in memory (estimated size 63.5 KB, free 365.8 MB)
> 2019-02-22 06:11:48,524 [dispatcher-event-loop-4] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on ip-10-4-3-123.eu-central-1.compute.internal:42935 (size: 63.5 KB, free: 366.2 MB)
> 2019-02-22 06:11:48,531 [broadcast-exchange-0] INFO org.apache.spark.SparkContext - Created broadcast 0 from run at ThreadPoolExecutor.java:1149
> 2019-02-22 06:11:48,545 [broadcast-exchange-0] INFO org.apache.spark.sql.execution.FileSourceScanExec - Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
> 2019-02-22 06:11:48,859 [broadcast-exchange-0] INFO org.apache.spark.SparkContext - Starting job: run at ThreadPoolExecutor.java:1149
> 2019-02-22 06:11:48,885 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Got job 0 (run at ThreadPoolExecutor.java:1149) with 1 output partitions
> 2019-02-22 06:11:48,886 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Final stage: ResultStage 0 (run at ThreadPoolExecutor.java:1149)
> 2019-02-22 06:11:48,893 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Parents of final stage: List()
> 2019-02-22 06:11:48,895 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Missing parents: List()
> 2019-02-22 06:11:48,940 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting ResultStage 0 (MapPartitionsRDD[2] at run at ThreadPoolExecutor.java:1149), which has no missing parents
> 2019-02-22 06:11:49,004 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_1 stored as values in memory (estimated size 11.7 KB, free 365.8 MB)
> 2019-02-22 06:11:49,024 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 362.596124 ms
> 2019-02-22 06:11:49,054 [dag-scheduler-event-loop] INFO org.apache.spark.storage.memory.MemoryStore - Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.3 KB, free 365.7 MB)
> 2019-02-22 06:11:49,055 [dispatcher-event-loop-1] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_1_piece0 in memory on ip-10-4-3-123.eu-central-1.compute.internal:42935 (size: 5.3 KB, free: 366.2 MB)
> 2019-02-22 06:11:49,066 [dag-scheduler-event-loop] INFO org.apache.spark.SparkContext - Created broadcast 1 from broadcast at DAGScheduler.scala:1079
> 2019-02-22 06:11:49,101 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at run at ThreadPoolExecutor.java:1149) (first 15 tasks are for partitions Vector(0))
> 2019-02-22 06:11:49,103 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.cluster.YarnScheduler - Adding task set 0.0 with 1 tasks
> 2019-02-22 06:11:49,116 [main] INFO org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator - Code generated in 56.99095 ms
> 2019-02-22 06:11:49,188 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.FairSchedulableBuilder - Added task set TaskSet_0.0 tasks to pool default
> 2019-02-22 06:12:04,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:12:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:12:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:12:49,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:13:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:13:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:13:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:13:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:14:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:14:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:14:34,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:14:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:15:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:15:19,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:15:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:15:49,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:16:04,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:16:19,190 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:16:34,189 [Timer-0] WARN org.apache.spark.scheduler.cluster.YarnScheduler - Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> 2019-02-22 06:16:49,169 [main] WARN org.apache.spark.util.Utils - Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf.
>
> ```
> And the exception right after this:
> ```
>
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201)
> at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:136)
> at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:367)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:144)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:140)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:140)
> at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:135)
> at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:232)
> at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:102)
> at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
> at org.apache.spark.sql.execution.ProjectExec.consume(basicPhysicalOperators.scala:35)
> at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:65)
> at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
> at org.apache.spark.sql.execution.FilterExec.consume(basicPhysicalOperators.scala:85)
> at org.apache.spark.sql.execution.FilterExec.doConsume(basicPhysicalOperators.scala:206)
> at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:181)
> at org.apache.spark.sql.execution.FileSourceScanExec.consume(DataSourceScanExec.scala:158)
> at org.apache.spark.sql.execution.ColumnarBatchScan$class.produceBatches(ColumnarBatchScan.scala:138)
> at org.apache.spark.sql.execution.ColumnarBatchScan$class.doProduce(ColumnarBatchScan.scala:78)
> at org.apache.spark.sql.execution.FileSourceScanExec.doProduce(DataSourceScanExec.scala:158)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.FileSourceScanExec.produce(DataSourceScanExec.scala:158)
> at org.apache.spark.sql.execution.FilterExec.doProduce(basicPhysicalOperators.scala:125)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.FilterExec.produce(basicPhysicalOperators.scala:85)
> at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:45)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
> at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:97)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:39)
> at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:45)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:88)
> at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:83)
> at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:35)
> at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:524)
> at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:576)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.prepareShuffleDependency(ShuffleExchangeExec.scala:92)
> at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:128)
> at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec$$anonfun$doExecute$1.apply(ShuffleExchangeExec.scala:119)
> at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
>
> ```
>