Posted to issues@orc.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2018/10/09 19:03:00 UTC

[jira] [Comment Edited] (ORC-408) hard limit on memory use by ORC writers

    [ https://issues.apache.org/jira/browse/ORC-408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643953#comment-16643953 ] 

Sergey Shelukhin edited comment on ORC-408 at 10/9/18 7:02 PM:
---------------------------------------------------------------

According to [~prasanth_j], there is no mechanism that actually reduces the memory in use. What is the minimum version/config required for that to happen?
Also, it seems that flushing a stripe may not always be possible, and the flush itself may require additional memory. What we want is a hard limit for process safety (to avoid GC issues/off-heap exhaustion) - as soon as it is hit, no non-trivial memory buffers whatsoever can be allocated by any writer in a given set (until, perhaps, some memory is freed).
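A minimal sketch of that hard-limit behavior (illustration only; HardLimitGate is a made-up name, not an existing ORC or Hive class):
{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: one shared cap for a set of writers. Once aggregate
// usage hits the cap, any further non-trivial buffer allocation fails
// until some memory is released (e.g. after a successful stripe flush).
class HardLimitGate {
  private final long capBytes;
  private final AtomicLong usedBytes = new AtomicLong();

  HardLimitGate(long capBytes) {
    this.capBytes = capBytes;
  }

  /** Reserve space for a buffer; throws instead of exceeding the cap. */
  void reserve(long bytes) {
    long after = usedBytes.addAndGet(bytes);
    if (after > capBytes) {
      usedBytes.addAndGet(-bytes); // roll back the failed reservation
      throw new IllegalStateException(
          "Hard memory limit exceeded: " + after + " > " + capBytes);
    }
  }

  /** Release space when buffers are freed or a writer closes. */
  void release(long bytes) {
    usedBytes.addAndGet(-bytes);
  }
}
{code}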


was (Author: sershe):
According to [~prasanth_j], there is no mechanism that actually reduces the memory in use. What is the minimum version/config required for that to happen?
Also, it seems that flushing a stripe may not always be possible, and the flush itself may require additional memory. What we want is a hard limit - as soon as it is hit, no non-trivial memory buffers whatsoever can be allocated by any writer in a given set (until, perhaps, some memory is freed).

> hard limit on memory use by ORC writers
> ---------------------------------------
>
>                 Key: ORC-408
>                 URL: https://issues.apache.org/jira/browse/ORC-408
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> Scenario: we want to hard-limit (within the constraints imposed by using Java) the memory used by a particular Hive task dedicated to ORC writing, to protect other tasks from misbehaving queries. This is similar to how we limit the memory used for a hash join - when the hash table goes over the limit, the task fails.
> However, we currently cannot hard-limit this even for a single writer while it is writing, much less for several writers combined.
> I wonder if it's possible to add two features to MemoryManager:
> 1) Grouping writers. A tag can be supplied externally (e.g. when creating the writer).
> 2) Hard-limiting the memory by tag - if a group exceeds its memory allowance, all of the corresponding writers should be made to fail on the next operation, via the callback (a rough sketch of both features follows below).
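> A rough sketch of what this could look like (hypothetical; TaggedMemoryManager, the tag parameter, and checkLimit do not exist in ORC - only the Callback shape mirrors org.apache.orc.MemoryManager):
> {code:java}
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.hadoop.fs.Path;
> import org.apache.orc.MemoryManager;
>
> // Illustration only: groups writers by an externally supplied tag and
> // trips a per-tag hard limit. Once a group is over its allowance, every
> // writer in the group is notified via its callback.
> class TaggedMemoryManager {
>   private static class Group {
>     final long allowanceBytes;
>     long usedBytes;
>     final Map<Path, MemoryManager.Callback> writers = new HashMap<>();
>     Group(long allowanceBytes) { this.allowanceBytes = allowanceBytes; }
>   }
>
>   private final Map<String, Group> groups = new HashMap<>();
>
>   // 1) Grouping: the tag is supplied when the writer is created.
>   synchronized void addWriter(String tag, long allowanceBytes, Path path,
>                               long estimatedBytes,
>                               MemoryManager.Callback callback) {
>     Group g = groups.computeIfAbsent(tag, t -> new Group(allowanceBytes));
>     g.writers.put(path, callback);
>     g.usedBytes += estimatedBytes;
>   }
>
>   // 2) Hard limit by tag: if the group exceeds its allowance, notify
>   // every writer in the group through the callback so its next
>   // operation fails (or flushes, if that is still possible).
>   synchronized void checkLimit(String tag) throws IOException {
>     Group g = groups.get(tag);
>     if (g != null && g.usedBytes > g.allowanceBytes) {
>       for (MemoryManager.Callback cb : g.writers.values()) {
>         cb.checkMemory(0.0); // hypothetical convention: scale 0 = stop/fail
>       }
>     }
>   }
> }
> {code}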



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)