Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (Updated) (JIRA)" <ji...@apache.org> on 2012/01/14 02:04:39 UTC

[jira] [Updated] (HBASE-5190) Limit the IPC queue size based on calls' payload size

     [ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-5190:
--------------------------------------

    Attachment: HBASE-5190.patch

This patch is what I've been testing; it's not polished yet. I implemented the first solution (accounting outside of the queue, with a retryable exception when a call doesn't fit).

DEFAULT_MAX_QUEUE_SIZE_PER_HANDLER is renamed to DEFAULT_MAX_CALLQUEUE_LENGTH_PER_HANDLER to make room for DEFAULT_MAX_CALLQUEUE_SIZE, which is the maximum total size of all calls that are in flight (now that I write this, I think I need to change that name too, hehe).

An AtomicLong keeps track of the total size. It could be a bottleneck; I haven't checked that yet, but if it becomes one, the contention could be lessened by not accounting for small calls (like <32KB).

We take the call size into account from the moment the call is inserted into the queue until the call is done. I considered ending the accounting when the call is taken off the queue, but then it's still easy to OOME yourself if many big calls come in at the same time and handlers are free.
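
To make the accounting above concrete, here is a minimal sketch, assuming a bounded queue plus a shared AtomicLong counter. It is not the code in HBASE-5190.patch: the class, method, and exception names (CallQueueSizeSketch, enqueue, done, CallQueueTooBigException) are hypothetical and only illustrate reserving a call's size at enqueue time and releasing it when the call is done rather than when it is dequeued.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: none of these names are taken from HBASE-5190.patch.
class CallQueueSizeSketch {

  /** A queued request; size is its payload size in bytes. */
  static final class Call {
    final long size;
    Call(long size) { this.size = size; }
  }

  /** Hypothetical rejection that a client is expected to retry. */
  static final class CallQueueTooBigException extends RuntimeException {
    CallQueueTooBigException(String msg) { super(msg); }
  }

  private final long maxQueueSize;                            // byte limit for all in-flight calls
  private final AtomicLong callQueueSize = new AtomicLong();  // bytes currently accounted for
  private final BlockingQueue<Call> callQueue;                // still bounded by call count

  CallQueueSizeSketch(long maxQueueSize, int maxQueueLength) {
    this.maxQueueSize = maxQueueSize;
    this.callQueue = new LinkedBlockingQueue<>(maxQueueLength);
  }

  /** Reserve the call's size, then enqueue; undo the reservation on rejection. */
  void enqueue(Call call) {
    long newTotal = callQueueSize.addAndGet(call.size);
    if (newTotal > maxQueueSize || !callQueue.offer(call)) {
      callQueueSize.addAndGet(-call.size);
      throw new CallQueueTooBigException("call of " + call.size + " bytes does not fit");
    }
  }

  /** A handler takes the next call; its size stays accounted for while it runs. */
  Call poll() {
    return callQueue.poll();
  }

  /** Release the call's size only once the handler has finished with it. */
  void done(Call call) {
    callQueueSize.addAndGet(-call.size);
  }

  long totalSize() {
    return callQueueSize.get();
  }
}
```

In this sketch a single call larger than the byte limit can never be admitted, and every call touches the counter; skipping small calls, as suggested above, would be a change to enqueue and done.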

When a call is rejected, it is retried on the client side.
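
As a rough illustration of the client-side behavior, the sketch below retries a call a bounded number of times when it fails with a retryable exception. It is an assumption about the general shape of such a loop, not the HBase client's actual retry code: RetrySketch, RetryableCallException, callWithRetries, and the linear pause are all hypothetical.

```java
import java.util.function.Supplier;

// Illustrative only: these names are hypothetical, not HBase client APIs.
class RetrySketch {

  /** Stand-in for the retryable exception thrown when the call queue is full. */
  static final class RetryableCallException extends RuntimeException {
    RetryableCallException(String msg) { super(msg); }
  }

  /** Run the call, retrying on retryable rejections with a growing pause. */
  static <T> T callWithRetries(Supplier<T> call, int maxAttempts, long pauseMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    RetryableCallException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.get();
      } catch (RetryableCallException e) {
        last = e;
        if (attempt < maxAttempts) {
          try {
            // Back off a little longer after each rejection.
            Thread.sleep(pauseMillis * attempt);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
          }
        }
      }
    }
    throw last; // every attempt was rejected
  }
}
```

Pausing between attempts gives the server time to finish in-flight calls and free up space before the next try.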
                
> Limit the IPC queue size based on calls' payload size
> -----------------------------------------------------
>
>                 Key: HBASE-5190
>                 URL: https://issues.apache.org/jira/browse/HBASE-5190
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.5
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.94.0
>
>         Attachments: HBASE-5190.patch
>
>
> Currently we limit the calls in the IPC queue only by their count. The limit used to be really high and was recently dropped to num_handlers * 10 (so 100 by default) because it was easy to OOME yourself when huge calls were being queued. It's still possible to hit this problem if you use really big values and/or a lot of handlers, so the idea is that we should take the payload size into account. I can see 3 solutions:
>  - Do the accounting outside of the queue itself for all calls coming in and out and when a call doesn't fit, throw a retryable exception.
>  - Same accounting but instead block the call when it comes in until space is made available.
>  - Add a new parameter for the maximum size (in bytes) of a Call and then set the size of the IPC queue (in terms of the number of items) so that it can only contain as many items as fit in some predefined maximum size (in bytes) for the whole queue.
