Posted to user@spark.apache.org by Serega Sheypak <se...@gmail.com> on 2016/05/18 11:17:00 UTC

Managed memory leak detected. SPARK-11293?

Hi, please have a look at this log snippet:
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Doing the fetch;
tracker endpoint =
NettyRpcEndpointRef(spark://MapOutputTracker@xxx.xxx.xxx.xxx:38128)
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Got the output
locations
16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Getting 30
non-empty blocks out of 30 blocks
16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Started 30
remote fetches in 3 ms
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Don't have map outputs
for shuffle 1, fetching them
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Doing the fetch;
tracker endpoint =
NettyRpcEndpointRef(spark://MapOutputTracker@xxx.xxx.xxx.xxx:38128)
16/05/18 03:27:16 INFO spark.MapOutputTrackerWorker: Got the output
locations
16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Getting 1
non-empty blocks out of 1500 blocks
16/05/18 03:27:16 INFO storage.ShuffleBlockFetcherIterator: Started 1
remote fetches in 1 ms
16/05/18 03:27:17 ERROR executor.Executor: Managed memory leak detected;
size = 6685476 bytes, TID = 3405
16/05/18 03:27:17 ERROR executor.Executor: Exception in task 285.0 in stage
6.0 (TID 3405)

Is it related to https://issues.apache.org/jira/browse/SPARK-11293?

Is there any recommended workaround?

Re: Managed memory leak detected. SPARK-11293?

Posted by Serega Sheypak <se...@gmail.com>.
OK, it happens only in YARN cluster mode; it works with snappy in
YARN client mode.
I started hitting this problem when I switched to cluster mode.
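Roughly, the two runs differ only in the deploy mode passed to spark-submit
(the class and jar names below are just placeholders):

  # YARN client mode: works with snappy
  spark-submit --master yarn --deploy-mode client \
    --class com.example.MyJob my-job.jar

  # YARN cluster mode: hits "Managed memory leak detected" with snappy
  spark-submit --master yarn --deploy-mode cluster \
    --class com.example.MyJob my-job.jar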

Re: Managed memory leak detected. SPARK-11293?

Posted by Ted Yu <yu...@gmail.com>.
According to
http://blog.erdemagaoglu.com/post/4605524309/lzo-vs-snappy-vs-lzf-vs-zlib-a-comparison-of
the performance of snappy and lzf was on par with each other.

Maybe lzf has a lower memory requirement.

Re: Managed memory leak detected. SPARK-11293?

Posted by Serega Sheypak <se...@gmail.com>.
Switching from snappy to lzf helped me:

*spark.io.compression.codec=lzf*

Do you know why? :) I can't find an exact explanation...
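For reference, a minimal sketch of how the property is set programmatically
(the app name below is just a placeholder); it can equally be passed to
spark-submit with --conf spark.io.compression.codec=lzf or put into
spark-defaults.conf:

  import org.apache.spark.{SparkConf, SparkContext}

  // Use lzf instead of snappy for Spark's internal compression
  // (shuffle outputs and spills, broadcast variables, serialized RDD blocks).
  val conf = new SparkConf()
    .setAppName("my-job")                        // placeholder app name
    .set("spark.io.compression.codec", "lzf")
  val sc = new SparkContext(conf)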



Re: Managed memory leak detected. SPARK-11293?

Posted by Ted Yu <yu...@gmail.com>.
Please increase the number of partitions.
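For example (the partition count and input path below are only illustrative),
either pass a larger partition count to the shuffle operation itself or
repartition before it, so each reduce task holds less in memory and spills
less:

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("example"))
  val pairs = sc.textFile("hdfs:///path/to/input")   // illustrative input path
    .map(line => (line, 1L))

  // Explicit partition count on the shuffle:
  val counts = pairs.reduceByKey(_ + _, 2000)

  // Or repartition before the wide operation:
  val counts2 = pairs.repartition(2000).reduceByKey(_ + _)

If the shuffle operations don't specify a count, raising
spark.default.parallelism has a similar effect.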

Cheers
