Posted to dev@flink.apache.org by "Shammon (Jira)" <ji...@apache.org> on 2021/12/15 11:14:00 UTC

[jira] [Created] (FLINK-25328) Improvement of share memory manager between jobs if they use the same slot in TaskManager for flink olap queries

Shammon created FLINK-25328:
-------------------------------

             Summary: Improvement of share memory manager between jobs if they use the same slot in TaskManager for flink olap queries
                 Key: FLINK-25328
                 URL: https://issues.apache.org/jira/browse/FLINK-25328
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.13.3, 1.12.5, 1.14.0
            Reporter: Shammon


    We submit batch jobs to a Flink session cluster as OLAP queries. Because these jobs finish quickly, their subtasks in the TaskManager are created and destroyed frequently. Each slot in the TaskManager manages one `MemoryManager` shared by the tasks of a single job, and the `MemoryManager` is closed when all of those subtasks have finished. Operators in the subtasks, such as Join, Aggregate, and Sort, allocate `MemorySegment` instances via the `MemoryManager`, and these segments are freed when the operators finish.
    
    This causes excessive allocation and freeing of `MemorySegment` instances in the TaskManager. For example, suppose a TaskManager has 50 slots, a job runs 3 join/agg operators in each slot, and each operator allocates and initializes 2000 segments. If the subtasks of a job take 100 ms to execute, each slot can run the subtasks of 10 jobs per second, so the TaskManager allocates and frees 2000 * 3 * 50 * 10 = 3,000,000 segments per second. Allocating and freeing this many segments causes two issues:

1) It increases the CPU usage of the TaskManager.
2) It increases the execution cost of subtasks in the TaskManager, which raises job latency and lowers QPS.
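
As a quick check of the arithmetic in the example above, here is a minimal Java sketch. The class and method names (`SegmentChurnEstimate`, `segmentsPerSecond`) are made up for illustration and are not part of Flink; the sketch also assumes each slot runs jobs back to back with no idle time.

```java
public class SegmentChurnEstimate {

    // Hypothetical helper: estimates how many segments a TaskManager
    // allocates (and later frees) per second under the stated assumptions.
    static long segmentsPerSecond(int slots,
                                  int operatorsPerSlot,
                                  int segmentsPerOperator,
                                  int subtaskMillis) {
        // Assumption: a slot runs one job's subtasks at a time, back to back.
        long jobsPerSecondPerSlot = 1000L / subtaskMillis;
        return (long) segmentsPerOperator * operatorsPerSlot * slots * jobsPerSecondPerSlot;
    }

    public static void main(String[] args) {
        // Numbers from the example: 50 slots, 3 operators, 2000 segments, 100 ms.
        System.out.println(segmentsPerSecond(50, 3, 2000, 100)); // prints 3000000
    }
}
```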

    To improve the reuse of memory segments between jobs in the same slot, we propose not dropping the `MemoryManager` when all the subtasks in the slot have finished. The slot will keep the `MemoryManager` and will not immediately free the `MemorySegment` instances allocated in it. When subtasks of another job are assigned to the slot, they do not need to allocate new segments from memory and can reuse the `MemoryManager` and its `MemorySegment` instances. WDYT? [~xtsong] Thanks.
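
The reuse idea proposed above can be sketched roughly as follows. This is not Flink's `MemoryManager` or its API: `SlotSegmentPool` and all of its methods are hypothetical, plain `byte[]` arrays stand in for `MemorySegment`, and the 4 KB segment size is an arbitrary value chosen to keep the example small.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-slot pool that outlives a single job (not a Flink class).
public class SlotSegmentPool {
    private final int segmentSize;
    private final int maxSegments;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private int outstanding;       // segments currently handed out
    private int freshAllocations;  // segments created from scratch

    public SlotSegmentPool(int segmentSize, int maxSegments) {
        this.segmentSize = segmentSize;
        this.maxSegments = maxSegments;
    }

    // Hands out 'count' segments, reusing cached ones before allocating new ones.
    public synchronized List<byte[]> allocate(int count) {
        if (outstanding + count > maxSegments) {
            throw new IllegalStateException("slot memory budget exceeded");
        }
        List<byte[]> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] segment = free.poll();
            if (segment == null) {
                segment = new byte[segmentSize];
                freshAllocations++;
            }
            result.add(segment);
        }
        outstanding += count;
        return result;
    }

    // Called when a job's subtasks finish: segments are cached, not freed.
    // Note: cached segments still hold the previous job's bytes.
    public synchronized void release(List<byte[]> segments) {
        free.addAll(segments);
        outstanding -= segments.size();
    }

    // Called only when the slot itself is torn down.
    public synchronized void shutdown() {
        free.clear();
    }

    public synchronized int freshAllocations() {
        return freshAllocations;
    }

    public static void main(String[] args) {
        SlotSegmentPool pool = new SlotSegmentPool(4 * 1024, 6000);
        List<byte[]> firstJob = pool.allocate(2000);
        pool.release(firstJob);                       // first job finishes
        List<byte[]> secondJob = pool.allocate(2000); // second job reuses the segments
        System.out.println(pool.freshAllocations());  // prints 2000, not 4000
        pool.release(secondJob);
        pool.shutdown();
    }
}
```

In this sketch the second job triggers no new allocations because the first job's segments stay cached in the slot. A real implementation would also need to decide when an idle slot should give its cached memory back, which the sketch leaves to an explicit `shutdown()` call.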



--
This message was sent by Atlassian Jira
(v8.20.1#820001)