Posted to dev@flink.apache.org by Gábor Gévay <gg...@gmail.com> on 2016/02/02 16:27:32 UTC

Re: Memory manager behavior in iterative jobs

I've created a JIRA:
https://issues.apache.org/jira/browse/FLINK-3322

Best,
Gábor



2016-01-30 14:36 GMT+01:00 Fabian Hueske <fh...@gmail.com>:
> Hi Gabor and Marton,
>
> the taskmanager.memory.preallocate switch basically replaces Flink's
> streaming mode.
> The current stream runtime code does not operate on managed memory. Hence,
> any memory preallocated by the memory manager cannot be used by streaming
> jobs and is "lost". If the switch is set to false, memory is requested only
> as operators require it, as you observed.
> Setting the parameter to true is the original behavior and has no
> downsides for batch programs.
>
> The effect of the switch on the performance of iterative jobs is
> interesting and it sounds like it should be improved.
>
> Best, Fabian
>
> 2016-01-30 14:04 GMT+01:00 Gábor Gévay <gg...@gmail.com>:
>
>> Hello!
>>
>> We have a strangely behaving iterative Flink job: when we give it more
>> memory, it gets much slower (more than 10 times). The problem seems to
>> be mostly caused by GCs. Enabling object reuse didn’t help.
>>
>> With some profiling and debugging, we traced the problem to the
>> operators requesting new memory segments from the memory manager at
>> every superstep of the iteration. The memory manager satisfies
>> these requests by allocating new memory segments from the Java heap
>> [1], and the old ones eventually have to be reclaimed by garbage
>> collection. We found the option “taskmanager.memory.preallocate”,
>> which mostly solved the GC problem, but we would like to understand
>> the situation better.
>>
>> What is the reason for the default value of this setting being false?
>> Is there a downside to enabling this option? If the only downside is
>> the slower startup of the task managers, then we could have the best
>> of both worlds, by modifying the logic of the memory manager to use
>> pooling only after releases. That is, the memory manager would give the
>> segments back to the pool when the operators release them even when
>> “preallocate” is false, and `allocatePages` would then use a new
>> method of the memory pool that first checks whether there are
>> segments in the pool and calls `allocateNewSegment` or
>> `requestSegmentFromPool` accordingly. (The current behaviour is to
>> basically disable pooling when the “preallocate” setting is
>> false.)
>>
>> [1]
>> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/memory/MemoryManager.java#L293-L307
>>
>> Best,
>> Gábor and Márton
>>
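
The pooling-after-release idea proposed in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink's actual MemoryManager code: the class `LazySegmentPool`, its constructor parameters, and the `heapAllocations` counter are hypothetical, and plain `byte[]` arrays stand in for memory segments. The method names `allocatePages`, `allocateNewSegment` and `requestSegmentFromPool` are borrowed from the message, but their signatures here are invented for the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a memory pool that pools segments only after they are released. */
class LazySegmentPool {
    private final int segmentSize;   // bytes per segment
    private final int maxSegments;   // total number of segments the pool may ever create
    private final ArrayDeque<byte[]> free = new ArrayDeque<>(); // released segments awaiting reuse
    private int heapAllocations = 0; // segments created on the heap so far

    LazySegmentPool(int segmentSize, int maxSegments) {
        this.segmentSize = segmentSize;
        this.maxSegments = maxSegments;
    }

    /** Hands out numPages segments, reusing released ones before touching the heap. */
    List<byte[]> allocatePages(int numPages) {
        int stillCreatable = maxSegments - heapAllocations;
        if (numPages > free.size() + stillCreatable) {
            throw new IllegalStateException("Not enough memory segments available");
        }
        List<byte[]> pages = new ArrayList<>(numPages);
        for (int i = 0; i < numPages; i++) {
            pages.add(free.isEmpty() ? allocateNewSegment() : requestSegmentFromPool());
        }
        return pages;
    }

    /** Returns segments to the pool instead of leaving them for the garbage collector. */
    void release(List<byte[]> pages) {
        for (byte[] page : pages) {
            free.push(page);
        }
    }

    /** Number of segments that were actually allocated on the heap. */
    int heapAllocations() {
        return heapAllocations;
    }

    private byte[] allocateNewSegment() {
        heapAllocations++;
        return new byte[segmentSize];
    }

    private byte[] requestSegmentFromPool() {
        return free.pop();
    }
}
```

With a pool like this, an operator that requests and releases the same number of segments in every superstep would cause heap allocations only in the first superstep, while startup would stay as cheap as with “preallocate” set to false.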