Posted to issues@spark.apache.org by "Ranga Reddy (Jira)" <ji...@apache.org> on 2021/10/05 08:48:00 UTC

[jira] [Updated] (SPARK-36901) ERROR exchange.BroadcastExchangeExec: Could not execute broadcast in 300 secs

     [ https://issues.apache.org/jira/browse/SPARK-36901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ranga Reddy updated SPARK-36901:
--------------------------------
    Description: 
While running a Spark application, if there are no further resources available to launch executors, the application fails after 5 minutes with the exception below.
{code:java}
21/09/24 06:12:45 WARN cluster.YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
...
21/09/24 06:17:29 ERROR exchange.BroadcastExchangeExec: Could not execute broadcast in 300 secs.
java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
...
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:146)
	... 71 more
21/09/24 06:17:30 INFO spark.SparkContext: Invoking stop() from shutdown hook
{code}
*Expectation:* Spark should either throw a proper exception saying *"there are no further resources to run the application"* or it should *wait until it gets resources*.
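For context, the 300-second limit in the error comes from {{spark.sql.broadcastTimeout}}, which defaults to 300. As a workaround only (it does not address the underlying scheduling behavior), the timeout can be raised, or the automatic broadcast join can be disabled so the planner falls back to a shuffle join. A minimal sketch, with placeholder values:
{code:python}
from pyspark.sql import SparkSession

# Workaround sketch, not a fix: raise the broadcast timeout and/or
# disable automatic broadcast joins so the join is planned as a
# shuffle join instead of timing out on the broadcast exchange.
spark = (
    SparkSession.builder
    .appName("Test Broadcast Timeout")
    .config("spark.sql.broadcastTimeout", "1200")          # default: 300 seconds
    .config("spark.sql.autoBroadcastJoinThreshold", "-1")  # -1 disables auto broadcast joins
    .getOrCreate()
)
{code}
The same settings can also be passed at submit time, e.g. {{--conf spark.sql.broadcastTimeout=1200}}.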

To reproduce the issue, we used the following sample code.

*PySpark Code (test_broadcast_timeout.py):*
{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Test Broadcast Timeout").getOrCreate()

# Two small DataFrames; with default settings the join below is
# planned as a broadcast join.
t1 = spark.range(5)
t2 = spark.range(5)

q = t1.join(t2, t1.id == t2.id)
q.explain()  # note the parentheses: bare `q.explain` is a no-op in Python
q.show(){code}
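With default settings, the plan printed by {{explain()}} should show a {{BroadcastHashJoin}} for this query; it is that broadcast exchange which times out once no executors register within 300 seconds.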
*Spark Submit Command:*
{code:bash}
spark-submit --executor-memory 512M test_broadcast_timeout.py{code}
Note: We have tested the same code on Spark 3.1 and were able to reproduce the issue on Spark 3 as well.

> ERROR exchange.BroadcastExchangeExec: Could not execute broadcast in 300 secs
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-36901
>                 URL: https://issues.apache.org/jira/browse/SPARK-36901
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 2.4.0
>            Reporter: Ranga Reddy
>            Priority: Major
>


