Posted to issues@spark.apache.org by "Matt Cheah (JIRA)" <ji...@apache.org> on 2015/09/02 00:09:45 UTC
[jira] [Created] (SPARK-10407) Possible Stack-overflow using InheritableThreadLocal nested-properties for SparkContext.localProperties
Matt Cheah created SPARK-10407:
----------------------------------
Summary: Possible Stack-overflow using InheritableThreadLocal nested-properties for SparkContext.localProperties
Key: SPARK-10407
URL: https://issues.apache.org/jira/browse/SPARK-10407
Project: Spark
Issue Type: Bug
Reporter: Matt Cheah
Priority: Minor
In my long-running web server that uses a SparkContext, I eventually came across stack overflow errors that could only be cleared by restarting the server.
{code}
java.lang.StackOverflowError: null
at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2307) ~[na:1.7.0_45]
at java.io.ObjectInputStream$BlockDataInputStream.read(ObjectInputStream.java:2718) ~[na:1.7.0_45]
at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2742) ~[na:1.7.0_45]
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1979) ~[na:1.7.0_45]
at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) ~[na:1.7.0_45]
...
...
at org.apache.commons.lang3.SerializationUtils.clone(SerializationUtils.java:96) ~[commons-lang3-3.3.jar:3.3]
at org.apache.spark.scheduler.DAGScheduler.submitJob(DAGScheduler.scala:516) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:529) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1770) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1788) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1803) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1276) ~[spark-core_2.10-1.4.1-palantir1.jar:1.4.1-palantir1]
...
{code}
The bottom of the trace shows that a Properties object was being serialized when the overflow happened. I traced the object's origin: it comes from SparkContext.localProperties, an InheritableThreadLocal field.
When I debugged further, I found that localProperties.childValue() wraps the parent's Properties object in another Properties object and returns the wrapper. The problem is that every time childValue was called, the parent value being passed in had a deeper and deeper nesting of wrapped Properties. This doesn't make sense, since my application doesn't create threads recursively or anything like that, so I'm marking this issue as minor; it shouldn't affect the average application.
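A minimal standalone sketch of the wrapping behavior described above (not Spark's actual code; NestedPropsDemo, LayeredProperties, and spawnNested are hypothetical names). A childValue that does the equivalent of new Properties(parent) adds one defaults-chain layer per thread generation, so the chain grows without bound if threads keep being created from child contexts:

{code}
import java.util.Properties;

public class NestedPropsDemo {

    // Properties subclass that can measure how many defaults layers wrap it.
    // (The 'defaults' field is protected in java.util.Properties, so a
    // subclass can walk the chain.)
    public static class LayeredProperties extends Properties {
        public LayeredProperties() { super(); }
        public LayeredProperties(LayeredProperties parent) { super(parent); }

        public int depth() {
            int n = 0;
            LayeredProperties p = this;
            while (p.defaults instanceof LayeredProperties) {
                p = (LayeredProperties) p.defaults;
                n++;
            }
            return n;
        }
    }

    // Mimics the reported behavior: childValue wraps the parent's live
    // Properties object instead of copying it, adding a layer each time.
    public static final InheritableThreadLocal<LayeredProperties> localProps =
        new InheritableThreadLocal<LayeredProperties>() {
            @Override
            protected LayeredProperties childValue(LayeredProperties parent) {
                return new LayeredProperties(parent); // one more wrapper layer
            }
            @Override
            protected LayeredProperties initialValue() {
                return new LayeredProperties();
            }
        };

    // Spawn 'levels' generations of threads, each created by its parent,
    // and report the nesting depth seen by the innermost thread.
    public static int spawnNested(int levels) throws InterruptedException {
        int here = localProps.get().depth(); // ensure the value exists here
        if (levels == 0) {
            return here;
        }
        final int[] result = new int[1];
        Thread t = new Thread(() -> {
            try {
                result[0] = spawnNested(levels - 1);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        });
        t.start();
        t.join();
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        // Each spawned generation adds exactly one wrapper layer.
        System.out.println(spawnNested(5)); // prints 5
    }
}
{code}

Serializing such a value (as DAGScheduler.submitJob does via SerializationUtils.clone) recurses once per layer, which is consistent with the StackOverflowError above once the chain gets deep enough.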
On the other hand, there is no real reason for childValue to build the new Properties by nesting. Instead, the Properties returned by childValue should be flat and, more importantly, a deep copy of the parent's, so that the chain never grows and child threads can't mutate the parent's values.
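A sketch of the proposed fix, under the same caveat (FlatChildValue is a hypothetical name, not Spark's actual patch): have childValue copy every visible key into a fresh, flat Properties. Since Properties.stringPropertyNames() resolves the whole defaults chain, any nesting already present in the parent is flattened away, and the copy is independent of the parent:

{code}
import java.util.Properties;

public class FlatChildValue {

    public static final InheritableThreadLocal<Properties> localProps =
        new InheritableThreadLocal<Properties>() {
            @Override
            protected Properties childValue(Properties parent) {
                Properties copy = new Properties();
                // stringPropertyNames() walks the parent's defaults chain,
                // so the returned copy is flat regardless of parent nesting
                for (String name : parent.stringPropertyNames()) {
                    copy.setProperty(name, parent.getProperty(name));
                }
                return copy;
            }
            @Override
            protected Properties initialValue() {
                return new Properties();
            }
        };

    public static void main(String[] args) throws InterruptedException {
        localProps.get().setProperty("spark.job.group", "g1");
        final String[] childSaw = new String[1];
        Thread t = new Thread(() -> {
            // The child sees the parent's value through its flat copy...
            childSaw[0] = localProps.get().getProperty("spark.job.group");
            // ...and writes in the child don't leak back to the parent.
            localProps.get().setProperty("spark.job.group", "g2");
        });
        t.start();
        t.join();
        System.out.println(childSaw[0] + " / "
            + localProps.get().getProperty("spark.job.group")); // prints "g1 / g1"
    }
}
{code}

Note this copies String-valued entries only, which matches how Spark uses localProperties; it also gives each child a snapshot rather than a live view, which is arguably the safer semantics for job-scoped properties anyway.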
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org