You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Chris Westin (JIRA)" <ji...@apache.org> on 2015/01/22 01:30:34 UTC

[jira] [Commented] (DRILL-1942) Improve off-heap memory usage tracking

    [ https://issues.apache.org/jira/browse/DRILL-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286632#comment-14286632 ] 

Chris Westin commented on DRILL-1942:
-------------------------------------

Simplify memory allocators
(*) Remove unused fields from ChildAllocator and TopLevelAllocator.
Eclipse reports several unused fields.

(*) Extract ChildAllocator class from TopLevelAllocator
Currently, ChildAllocator is a nested class inside TopLevelAllocator. It should instead be a separate class.

As part of this process, merge the Accountor class with the allocator classes. There should be a base allocator class that knows how to do accounting, and the (new) RootAllocator and ChildAllocator classes should derive from this common base class. This shouldn't be difficult, because it currently looks like there aren't any references to any of TopLevelAllocator's fields in ChildAllocator.

Since ChildAllocator can also spawn more child allocators, and these should be children of the child, not of the same parent, extracting ChildAllocator in this way will help maintain a strict hierarchy if that should happen. Allocators can be thought of as Scopes, and should observe nesting similar to nested block scoping in programming language environments.

(*) Improve memory usage tracking
We're using too much memory, but we don't know where or why. Allocators provide an excellent way to track this, if we can rely on them to track memory usage more finely than we currently do.

Make sure that we are using the allocator instance hierarchy that results from the above restructuring of the allocators properly for accounting. If an allocator runs out of memory, and needs more, it should request it from it's parent. If an allocator is done with memory, it should be returned to its parent. Do not short-circuit the accounting mechanisms (as is done now) by directly walking up the allocator instance hierarchy and adjusting accounting directly. Allow each allocator to make its own adjustments as memory is returned to it. Each allocator instance should be encapsulated properly, with the only currency passing between them being memory, and that being done through their published interface.

Allocator ownership is currently associated with a FragmentContext. But we'd like to know which operators are using memory, and how much of it. This can be done by making sure that each operator has its own allocator. Use of child allocators makes this easy. However, we need more than the FragmentContext; each allocator needs to know what operator it is associated with, so getChildAllocator() may need more arguments.
We want to be able to track it per operator per query.

In order to access this information, we've decided that for now, we want to make it available via JMX, as well as providing an option to dump it to the log at the end of the query.

(*) Take advantage of the allocator instance hierarchy for cleanup While we should continue to do strict accounting similar to that done now
(activated by enabling Java assertions), we should take advantage of the resulting allocator instance hierarchy for cleanup purposes. In the event that we do have bugs that lead to buffers from intermediate child allocators not being freed, or if intermediate allocators themselves are not freed, we can do that implicitly when an ancestral allocator is freed.

Building on the above, if we close() an allocator, all of the memory belonging to it and all of its descendants is freed. This should be done from the bottom up: all memory belonging to the leafmost allocators is reclaimed first, and their allocators are closed, and so on up the tree up to and including the allocator that was directly close()d.

We expect to catch bugs in code that doesn't follow this protocol by running tests with assertions, but if we do have bugs we still don't want to have leaks; following this protocol will assure that.

(*) Cope with memory sharing between operators

It looks like there are times when DrillBufs are passed between operators wholesale in order to pass data without copying. This requires the ability to exchange ownership between allocators. This ability currently exists, but needs to be modified a bit to fit all of the above.

In addition to that, DrillBufs are sometimes shared in the above scenario: for a hash or partitioning exchange, some data in one buffer may be sent to one operator, and other data in the same buffer may need to be sent to a different operator. When this happens, the buffer is shared by both operators. We don't appear to have good accounting for this at this time. We need to introduce the reference counting mechanism to support this, and only account for the used memory once.

We've speculated that there may also be situations where we need to share an allocator between multiple operators (or other agents), and that we may need to support reference counting for allocators themselves, but this is not certain. If so, we may need to provide a means to wrap them with an allocator whose implementation delegates most calls, but whose close() decrements a reference count and only closes the real allocator on the last close().


> Improve off-heap memory usage tracking
> --------------------------------------
>
>                 Key: DRILL-1942
>                 URL: https://issues.apache.org/jira/browse/DRILL-1942
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Operators
>            Reporter: Chris Westin
>            Assignee: Chris Westin
>
> We're using a lot more memory than we think we should. We may be leaking it, or not releasing it as soon as we could. 
> This is a call to come up with some improved tracking so that we can get statistics out about exactly where we're using it, and whether or not we can release it earlier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)