You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by ramkrish86 <gi...@git.apache.org> on 2016/09/13 06:40:20 UTC

[GitHub] flink pull request #2495: FLINK-3322 - Make sorters to reuse the memory page...

GitHub user ramkrish86 opened a pull request:

    https://github.com/apache/flink/pull/2495

    FLINK-3322 - Make sorters to reuse the memory pages allocated for iterative tasks

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
    If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful description of your changes.
    
    - [ ] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [ ] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed
    
    
    This is part1 for FLINK-3322 where only the Sorters are made to reuse the memory pages. As @ggevay  pointed out we have to handle the iterators also where the memory pages are allocated. I have a seperate PR for that because that involves touching lot of places. But am open to feedback here. It is fine with me to combine both also but it was making the changes much bigger. 
    I would like to get the feed back here on this apporach. 
    Here a SorterMemoryAllocator is now passed to the UnilateralSortMergers. That will allocate the required memory pages and it will allocate the required read, write and large buffers. As per the existing logic the buffers will be released. But if the task is an iterative task we wait for the tasks to be released until a close or termination call happens for the iterative task. 
    In case of pages that were grabbed in between for keysort or record sort those will be put back to the respective pages so that we have the required number of pages through out the life cycle of the iterative task.
    
    As said this is only part 1. We need to address the iterators also. But that according to me touches more places. I have done the changes for that but it is not in a shape to be pushed as a PR but am open to feed back here. Thanks all. 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ramkrish86/flink FLINK-3322_part1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2495.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2495
    
----
commit 705ee5294bc5263971c2924a55c9230d72806527
Author: Ramkrishna <ra...@intel.com>
Date:   2016-09-13T06:33:59Z

    FLINK-3322 - Make sorters to reuse the memory pages allocated for
    iterative tasks

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    The build shows failure but not sure what is the failure. It does not show any specific test case.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    @StephanEwen 
    Thank you. I agree this is a super important piece of FLINK where the memory is involved and that is one reason why I wanted to know its managment and usage and wanted to work on this. I have already split this into 2 tasks. One that deals with sorters and another is with the Iterators.
    In an iterative task I found that memory is being allocated with the iterators and then with the sorters.
    This PR is only for sorters.
    the other one is for the iterators 
    https://github.com/apache/flink/pull/2510/commits/e17462881cc2f0137cc8412eb578fb4c42baeb15
    @ggevay  is sheperding it and helping me out with the idea of making ResettableDrivers which I was not fully sure if it can be done.
    I had added some docs to the JIRA itself about splitting up the tasks and what is being addressed here and how it can be addressed.
    `SorterMemoryAllocators` are nothing but just holders of the memory segments that needs to be passed to the sorters. They hold the memory so that the read buffers, write buffers and large memory buffers can pull in memory and put back the memory to them at the end of each iteration. I initially understood that changing this should have minimal impact on other areas which are not impacted without which it is going to be difficult to review. 
    As said above it was just to show the intention of the changes and I am very much aware that these change needs time to get a thorough review and any design needs to be focussed on future enhancements as I had mentioned in the doc. A PR will always help to understand better because the changes are in code.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    Is there a section that describes the design that this follows in more detail? I would like to take a look and comment.
    
    I am a bit skeptical whether this is going in the right direction. For example, I see no reason why there should be a `SorterMemoryAllocator`, or any specialization for sorters. Ideally, we get also fewer specializations for iterative tasks, not more.
    
    This looks like many specializations and case distinctions. Flink runs in critical production settings and this memory allocation stuff is super critical, so we can really only merge it when we feel super comfortable that this is (1) rock solid and (2) a good design for the future. Otherwise this will be a lot of code.
    
    To achieve that, I think it would help to break this down into finer issues and address them one at a time. I can understand that it is hard to slow down sometimes, but for critical runtime changes like this one, I think you need to adjust to the speed of whoever can really review and merge these fine grained changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ggevay <gi...@git.apache.org>.
Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    The tests that are failing are in `flink-gelly-examples`, for example `SingleSourceShortestPathsITCase`. They have error msgs that indicate memory management errors, e.g.
    `Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: segment has been freed`.
    
    But I would say that before fixing this pull request, let's concentrate on https://github.com/apache/flink/pull/2496.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    @ggevay and @StephanEwen  - any feedback/reviews here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    @StephanEwen 
    Just trying to explain what my intention was. This is particularly to address the sorters.
    Solution
    We could create memory allocators inside the batch task corresponding to every input. So every iteration task will have a memory allocator/creator corresponding to each input. It would create an array of memory segments that are required for the sorters corresponding to each input and when the task keeps looping in an iterative fashion we use the same set of memory segments created by this task.
    When we receive a termination event only then we do release the segments created by this task.
    This would ensure that we do create allocate memory per task and that is used through the life cycle of the iterative task.
    We change the Merge sorters (CombiningUnilateralSortMerger and UnilateralSortMerger) such that we pass the MemoryAllocator to it. When the sorters start doing the sorting, writing and use large records (if any) we pull in the memory segments allocated in the memory allocator. For iterative tasks, where we release the segments we just need to put back the segments to the memory allocator instead of releasing it back to the memory manager.
    When the task receives termination call only then we forcefully close the allocators so that all the created segments are released back to the memory manager.
    So even if preallocation of memory is set to true I think this would work and we won\u2019t be requesting new segments from the MemoryManager\u2019s pool and instead use the segments that were created initially for the first iteration.
    For a normal non-iterative tasks we know that the allocators are created for non-iterative tasks. We have a Boolean to indicate if it is an iterative task or not. Based on this flag, in the place where we try to release the segments we can decide if to release it back to the memory manager or put back to the memory allocator only (in case of iterative tasks).
    
    Pls note that I was able to fix the failed test cases that @ggevay  pointed out. But I have not updated the PR. I can wait for your feedback and thoughts and then proceed with both the PRs - this and #2510 .
    Points to note:
    Not sure whether this aligns with Steven's future vision of memory management.
    Impact on Streaming Iterative tasks
    Will the amount of memory segments needed for this task be dynamically changed? If so the above mechanism cannot work.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    Any chance of a review here @ggevay  and @StephanEwen - A gentle reminder - in case you have some time.
    Happy to take your input/feedbacks and see how this can be done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ggevay <gi...@git.apache.org>.
Github user ggevay commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    Btw. before submitting a pull request, it is good practice to run `mvn clean verify` locally (or do a Travis build), and make sure that there are no test failures (or, at least no test failures related to the current change).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    > SingleSourceShortestPathsITCase
    
    Sure. Actually I ran mvn clean verify to ensure there are no code style issues in runtime package and ran tests in that package. I was not sure what other tests are impacted because of this change and also tried running all the example related test cases like PageRank, KMeans etc. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    Will need to make an update in this PR. Will do that and push that change shortly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2495: FLINK-3322 - Make sorters to reuse the memory pages alloc...

Posted by ramkrish86 <gi...@git.apache.org>.
Github user ramkrish86 commented on the issue:

    https://github.com/apache/flink/pull/2495
  
    @StephanEwen - Any comments/feedback here. A gentle reminder !!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---