Posted to issues@spark.apache.org by "Shixiong Zhu (JIRA)" <ji...@apache.org> on 2019/04/15 17:23:00 UTC
[jira] [Created] (SPARK-27468) "Storage Level" in "RDD Storage Page" is not correct
Shixiong Zhu created SPARK-27468:
------------------------------------
Summary: "Storage Level" in "RDD Storage Page" is not correct
Key: SPARK-27468
URL: https://issues.apache.org/jira/browse/SPARK-27468
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.4.1
Reporter: Shixiong Zhu
I ran the following unit test and checked the UI.
{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val conf = new SparkConf()
  .setAppName("test")
  .setMaster("local-cluster[2,1,1024]")
  .set("spark.ui.enabled", "true")
sc = new SparkContext(conf)
val rdd = sc.makeRDD(1 to 10, 1).persist(StorageLevel.MEMORY_ONLY_2)
rdd.count()
Thread.sleep(3600000) // keep the app alive so the UI can be inspected
{code}
The RDD storage page shows the storage level as "Memory Deserialized 1x Replicated", even though the RDD was persisted with MEMORY_ONLY_2.
I tried to debug and found this is because Spark emitted the following two events:
{code}
event: SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(1, 10.8.132.160, 65473, None),rdd_0_0,StorageLevel(memory, deserialized, 2 replicas),56,0))
event: SparkListenerBlockUpdated(BlockUpdatedInfo(BlockManagerId(0, 10.8.132.160, 65474, None),rdd_0_0,StorageLevel(memory, deserialized, 1 replicas),56,0))
{code}
The storage level in the second event overwrites the first one. The "1 replicas" comes from this line: https://github.com/apache/spark/blob/3ab96d7acf870e53c9016b0b63d0b328eec23bed/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L1457
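A minimal illustration of the overwrite (a simplified model, not Spark's actual AppStatusListener code): if the listener keeps only the StorageLevel reported by the most recent event, the "2 replicas" level from the first executor is silently replaced by the "1 replicas" level that the replica-holding executor reports.

{code}
// Hypothetical last-write-wins state, keyed by block id.
object LastWriteWins {
  private val reportedReplicas =
    scala.collection.mutable.Map.empty[String, Int]

  // Each block-updated event overwrites the previous replica count.
  def onBlockUpdated(block: String, replicasInEvent: Int): Unit =
    reportedReplicas(block) = replicasInEvent

  def shownReplicas(block: String): Int = reportedReplicas(block)
}
{code}

Replaying the two events above in order leaves `shownReplicas("rdd_0_0")` at 1, which matches the incorrect "1x Replicated" shown in the UI.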
Maybe AppStatusListener should compute the replica count from the events instead?
Another thing to consider: when the replication factor is 2, are the two events guaranteed to arrive in the same order? Currently, two RPCs from different executors can arrive in any order.
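One possible shape of the fix (a sketch only, not Spark's implementation; `BlockId` and `BlockManagerId` below are simplified stand-ins for the real classes): instead of trusting the replica count inside the last event's StorageLevel, track the set of block managers that currently hold each block and derive the count from the set size. Because set membership is order-independent, this also sidesteps the event-ordering concern above.

{code}
// Hypothetical tracker that derives replica counts from block events.
object ReplicaTracker {
  type BlockId = String
  type BlockManagerId = String

  // Which block managers currently hold each block.
  private val holders =
    scala.collection.mutable.Map.empty[BlockId, Set[BlockManagerId]]

  def onBlockUpdated(block: BlockId, bm: BlockManagerId, stored: Boolean): Unit = {
    val current = holders.getOrElse(block, Set.empty[BlockManagerId])
    holders(block) = if (stored) current + bm else current - bm
  }

  // Replica count = number of distinct holders, regardless of the
  // order in which the two executors' events arrived.
  def replicas(block: BlockId): Int =
    holders.getOrElse(block, Set.empty[BlockManagerId]).size
}
{code}

Replaying the two events from the log above (in either order) yields a replica count of 2 for rdd_0_0.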
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org