Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/14 08:50:59 UTC

[GitHub] [spark] bozhang2820 opened a new pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

bozhang2820 opened a new pull request #35510:
URL: https://github.com/apache/spark/pull/35510


   ### What changes were proposed in this pull request?
   This change is to ignore failures in jar download when updating jar dependencies in Executor.run().
   
   ### Why are the changes needed?
   Currently, when a user adds a jar with an invalid URL to SparkContext, all subsequent query executions fail when trying to download that jar, even when the query has nothing to do with the jar dependency. This change is to fix that by ignoring the failure when downloading the jar.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. After this change, users will be able to run queries even if there are invalid URLs in SparkContext.addedJars.
   
   ### How was this patch tested?
   Added a unit test.
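
To make the proposed behavior concrete, here is a minimal sketch in Python rather than Spark's own Scala. It is not the code from this pull request: `update_dependencies`, `fake_fetch`, and the `ignore_failures` flag are hypothetical names chosen for illustration, and the real dependency update inside the executor is more involved. The sketch only contrasts failing the task on the first bad jar with logging the failure and carrying on.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("executor_sketch")


def update_dependencies(
    added_jars: Dict[str, int],
    fetch: Callable[[str], None],
    ignore_failures: bool,
) -> List[str]:
    """Fetch every registered jar and return the URLs that could not be fetched.

    `added_jars` maps a jar URL to the timestamp at which it was added.
    With ignore_failures=False the first failing download aborts the task.
    With ignore_failures=True the failure is logged and the jar is skipped.
    """
    failed: List[str] = []
    for url in sorted(added_jars):
        try:
            fetch(url)
        except OSError as exc:
            if not ignore_failures:
                raise
            # The failure is only visible as a warning in the logs.
            logger.warning("Skipping jar %s: %s", url, exc)
            failed.append(url)
    return failed


def fake_fetch(url: str) -> None:
    # Hypothetical downloader: any URL containing "invalid" fails.
    if "invalid" in url:
        raise OSError(f"cannot download {url}")
```

Returning the list of failed URLs is a choice made for this sketch, so that a caller could surface the problem somewhere more visible than a log line.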


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] dongjoon-hyun commented on pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on pull request #35510:
URL: https://github.com/apache/spark/pull/35510#issuecomment-1039983341


   If your Spark job has a bug, you need to fix and resubmit it. It's not a Spark cluster issue.
   > could we have a way to fix a cluster after an incorrect "addJar"?



[GitHub] [spark] bozhang2820 commented on pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

Posted by GitBox <gi...@apache.org>.

bozhang2820 commented on pull request #35510:
URL: https://github.com/apache/spark/pull/35510#issuecomment-1039885184


   Yes, there is a tradeoff between "making jobs that depend on the jar fail fast" and "letting jobs that do not depend on the jar be runnable".
   
   @dongjoon-hyun, I trust your judgement, but could we have a way to fix a cluster after an incorrect "addJar"? Is adding a new API "removeJar" a good idea?
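
The "removeJar" idea can be pictured with a toy registry. This is a sketch under stated assumptions, not an existing Spark API: `JarRegistry`, `add_jar`, `remove_jar`, and `snapshot` are hypothetical names, and the sketch ignores how executors that already fetched a jar would be handled.

```python
from typing import Dict


class JarRegistry:
    """Toy stand-in for the driver-side record of added jars (hypothetical)."""

    def __init__(self) -> None:
        self._jars: Dict[str, int] = {}
        self._clock = 0  # Monotonic counter used in place of a timestamp.

    def add_jar(self, url: str) -> None:
        # Record the jar once; re-adding the same URL keeps the first entry.
        self._clock += 1
        self._jars.setdefault(url, self._clock)

    def remove_jar(self, url: str) -> bool:
        # Return True if the URL was registered and has now been dropped.
        return self._jars.pop(url, None) is not None

    def snapshot(self) -> Dict[str, int]:
        # Copy of the jars that later tasks would be asked to download.
        return dict(self._jars)
```

With such a registry, a mistaken `add_jar("http://host/invalid.jar")` could be undone with `remove_jar` on the same URL, so later tasks would no longer try to download it.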



[GitHub] [spark] martin-g commented on pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

Posted by GitBox <gi...@apache.org>.

martin-g commented on pull request #35510:
URL: https://github.com/apache/spark/pull/35510#issuecomment-1038858652


   > This change is to fix that by ignoring the failure when downloading the jar.
   
   Is this really a good idea?
   The warning will be buried in the logs and it will be hard to notice that something is not as expected.



[GitHub] [spark] bozhang2820 closed pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

Posted by GitBox <gi...@apache.org>.

bozhang2820 closed pull request #35510:
URL: https://github.com/apache/spark/pull/35510


   



[GitHub] [spark] bozhang2820 commented on pull request #35510: [SPARK-38202][CORE] Ignore jar download failures in Executor.run()

Posted by GitBox <gi...@apache.org>.

bozhang2820 commented on pull request #35510:
URL: https://github.com/apache/spark/pull/35510#issuecomment-1043826382


   This is for the case where an incorrect `addJar` in a single notebook can block all other notebooks that connect to the same cluster.
   
   Anyway let's close this PR for now.

