Posted to issues@spark.apache.org by "Todd Morrison (JIRA)" <ji...@apache.org> on 2017/06/21 16:30:00 UTC
[jira] [Commented] (SPARK-10878) Race condition when resolving Maven coordinates via Ivy
[ https://issues.apache.org/jira/browse/SPARK-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057797#comment-16057797 ]
Todd Morrison commented on SPARK-10878:
---------------------------------------
Any chance we can move the priority of this issue up?
This is causing problems on large Spark clusters running YARN and PySpark.
Currently, a workaround is to run a single job first so that it populates the Ivy cache, and only then start the concurrent jobs. This isn't ideal: the parallel jobs sit waiting until the initial job completes.
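For reference, a minimal sketch of that workaround, assuming the same one-path-per-line {{paths}} file and {{foo.jar}} used in the issue description. The names {{SUBMIT}} and {{submit_warm_then_parallel}} are hypothetical and only serve the illustration. It does not fix the race; it just keeps the first resolution from overlapping with any other.

```shell
#!/usr/bin/env bash
# Sketch of the workaround: one warm-up submission runs alone, then the
# remaining submissions run concurrently.
# SUBMIT and submit_warm_then_parallel are hypothetical names for illustration.
SUBMIT="${SUBMIT:-${SPARK_HOME:-}/bin/spark-submit}"

submit_warm_then_parallel() {
  local paths_file="$1" jar="$2"
  local path pid pids="" status=0 warmed=0

  while IFS= read -r path; do
    [ -n "$path" ] || continue
    if [ "$warmed" -eq 0 ]; then
      # First path: run in the foreground so nothing else touches the Ivy cache.
      # stdin is redirected so the command cannot consume the paths file.
      "$SUBMIT" "$jar" "$path" < /dev/null || return 1
      warmed=1
    else
      # Remaining paths: launch in the background after the warm-up has finished.
      "$SUBMIT" "$jar" "$path" < /dev/null &
      pids="$pids $!"
    fi
  done < "$paths_file"

  # Wait for every background submission and remember whether any of them failed.
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}
```

With the function loaded, {{submit_warm_then_parallel paths foo.jar}} would stand in for the {{parallel}} one-liner. The cost is the one noted above: nothing else starts until the first submission returns.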
Thanks!
> Race condition when resolving Maven coordinates via Ivy
> -------------------------------------------------------
>
> Key: SPARK-10878
> URL: https://issues.apache.org/jira/browse/SPARK-10878
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.5.0
> Reporter: Ryan Williams
> Priority: Minor
>
> I've recently been shell-scripting the creation of many concurrent Spark-on-YARN apps and observing that a fraction of them fail with what I'm guessing is a race condition in their Maven-coordinate resolution.
> For example, I might spawn an app for each path in file {{paths}} with the following shell script:
> {code}
> cat paths | parallel "$SPARK_HOME/bin/spark-submit foo.jar {}"
> {code}
> When doing this, I observe some fraction of the spawned jobs to fail with errors like:
> {code}
> :: retrieving :: org.apache.spark#spark-submit-parent
> confs: [default]
> Exception in thread "main" java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent: java.text.ParseException: failed to parse report: /hpc/users/willir31/.ivy2/cache/org.apache.spark-spark-submit-parent-default.xml: Premature end of file.
> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:249)
> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:83)
> at org.apache.ivy.Ivy.retrieve(Ivy.java:551)
> at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1006)
> at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:286)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.text.ParseException: failed to parse report: /hpc/users/willir31/.ivy2/cache/org.apache.spark-spark-submit-parent-default.xml: Premature end of file.
> at org.apache.ivy.plugins.report.XmlReportParser.parse(XmlReportParser.java:293)
> at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:329)
> at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:118)
> ... 7 more
> Caused by: org.xml.sax.SAXParseException; Premature end of file.
> at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
> at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
> at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
> {code}
> The more apps I try to launch simultaneously, the greater the fraction of them that fails with this or similar errors; a batch of ~10 will usually work fine, a batch of 15 will see a few failures, and a batch of ~60 will have dozens of failures.
> [This gist shows 11 recent failures I observed|https://gist.github.com/ryan-williams/648bff70e518de0c7c84].