Posted to issues@mesos.apache.org by "Chengwei Yang (JIRA)" <ji...@apache.org> on 2014/08/29 09:11:53 UTC
[jira] [Created] (MESOS-1746) clear TaskStatus data to avoid OOM
Chengwei Yang created MESOS-1746:
------------------------------------
Summary: clear TaskStatus data to avoid OOM
Key: MESOS-1746
URL: https://issues.apache.org/jira/browse/MESOS-1746
Project: Mesos
Issue Type: Bug
Environment: mesos-0.19.0
Reporter: Chengwei Yang
Assignee: Chengwei Yang
Spark on Mesos may use TaskStatus to transfer computed results between the worker and the scheduler. The relevant source code (Spark 1.0.2) is shown below:
{code}
val serializedResult = {
  if (serializedDirectResult.limit >= execBackend.akkaFrameSize() -
      AkkaUtils.reservedSizeBytes) {
    logInfo("Storing result for " + taskId + " in local BlockManager")
    val blockId = TaskResultBlockId(taskId)
    env.blockManager.putBytes(
      blockId, serializedDirectResult, StorageLevel.MEMORY_AND_DISK_SER)
    ser.serialize(new IndirectTaskResult[Any](blockId))
  } else {
    logInfo("Sending result for " + taskId + " directly to driver")
    serializedDirectResult
  }
}
{code}
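To make the size check easier to reason about, here is a minimal, self-contained sketch of the same decision. The names ResultPath, Direct, Indirect and ResultRouting.choose are hypothetical and are not part of Spark; the frame size and reserved size are passed in as plain parameters instead of being read from execBackend and AkkaUtils.

```scala
// Hypothetical, simplified model of the size check in the Spark snippet.
sealed trait ResultPath
// The serialized result itself is sent back in the status update.
case object Direct extends ResultPath
// Only a block id is sent back; the bytes stay in the local BlockManager.
case object Indirect extends ResultPath

object ResultRouting {
  // Mirrors `limit >= akkaFrameSize - reservedSizeBytes` from the snippet.
  def choose(resultBytes: Long, frameSizeBytes: Long, reservedBytes: Long): ResultPath =
    if (resultBytes >= frameSizeBytes - reservedBytes) Indirect else Direct
}
```

Under this model, raising the frame size raises the threshold, so larger results take the direct path. For example, a 100MB result would go Indirect with a 10MB frame size but Direct with a 128MB frame size, for any reserved size much smaller than the frame.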
In our test environment, we enlarged akkaFrameSize from its default value (10MB) to 128MB, and this causes our mesos-master process to run out of memory (OOM) within tens of minutes when running Spark tasks in fine-grained mode.
As you can see, even with akkaFrameSize changed back to the default value (10MB), mesos-master is still very likely to OOM, only more slowly.
So I think it would be good to delete the data from TaskStatus, since it is intended only for the framework running on top of Mesos and Mesos itself is not interested in it.
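A rough sketch of the idea, not actual Mesos code: the real TaskStatus is a protobuf message and the master is written in C++, so TaskStatusSketch and MasterSketch.stripData below are hypothetical stand-ins written in Scala only to match the snippet above. The assumption is that the full status is still forwarded to the framework, while the copy that is kept in memory has its data cleared.

```scala
// Hypothetical stand-in for a TaskStatus; only the fields needed for the sketch.
final case class TaskStatusSketch(
  taskId: String,
  state: String,
  data: Option[Array[Byte]] // opaque payload, meaningful only to the framework
)

object MasterSketch {
  // Return a copy that keeps the task id and state but drops the payload,
  // so a retained status no longer holds on to the result bytes.
  def stripData(status: TaskStatusSketch): TaskStatusSketch =
    status.copy(data = None)
}
```

The original status is left untouched, so it can still be delivered to the scheduler with its data intact; only the retained copy is made smaller.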
--
This message was sent by Atlassian JIRA
(v6.2#6252)