Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/04/14 01:12:05 UTC

[jira] [Updated] (HBASE-3767) Cache the number of RS in HTable

     [ https://issues.apache.org/jira/browse/HBASE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-3767:
--------------------------------------

    Attachment: HBASE-3767.patch

The way we currently configure the ThreadPoolExecutor (TPE) is what its javadoc calls "unbounded queues":

{quote}
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
{quote}

The important part is that no more than corePoolSize threads will ever be created: maximumPoolSize has no effect, and any further tasks are simply queued. This is why, in that setup, it's important to know the number of region servers, since you want maximum parallelism.
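To make that behavior concrete, here is a minimal standalone sketch (plain java.util.concurrent, not HBase code and not taken from the attached patch). The class name, method name and pool sizes are hypothetical, chosen only for illustration. It submits more blocking tasks than corePoolSize to a pool backed by an unbounded LinkedBlockingQueue and reports the largest number of threads the pool ever had:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, for illustration only.
class UnboundedQueueDemo {

    // Submits `tasks` tasks that all block until released, then returns
    // the largest pool size the executor reached while they were pending.
    static int largestPoolSize(int corePoolSize, int maximumPoolSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>()); // no capacity: unbounded
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker thread busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown(); // let every task finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        // corePoolSize = 2, maximumPoolSize = 10, 8 tasks
        System.out.println(largestPoolSize(2, 10, 8)); // prints 2
    }
}
```

Even though maximumPoolSize is 10, only the two core threads are created; the other six tasks sit in the queue until a core thread frees up.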

With the "direct handoff" strategy, by contrast, a new thread is created whenever a task is submitted and no idle thread is available to take it, so the number of threads naturally grows to the number of region servers, even if that number changes. From the javadoc:

{quote}
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
{quote}

We will never suffer from the unbounded thread growth described in that last sentence, since HCM will only create as many Runnables as there are region servers hosting the regions we need to talk to.
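As a rough illustration of the direct handoff behavior, the following standalone sketch (again plain java.util.concurrent, with hypothetical class and method names, and not the contents of the attached patch) backs a pool with a SynchronousQueue and an effectively unbounded maximumPoolSize. Each blocking task stands in for one Runnable per region server:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, for illustration only.
class DirectHandoffDemo {

    // Submits `tasks` tasks that all block until released, then returns
    // the largest pool size the executor reached while they were pending.
    static int largestPoolSize(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>()); // holds no tasks itself
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            // No idle thread is waiting on the queue, so the handoff
            // fails and the executor starts a new thread for the task.
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker thread busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown(); // let every task finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return largest;
    }

    public static void main(String[] args) {
        System.out.println(largestPoolSize(8)); // prints 8
    }
}
```

The pool grows to exactly as many threads as there are concurrent tasks, without being told that number up front. Thread growth stays bounded as long as the caller bounds the number of tasks it submits at once, which is the argument made above about HCM.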

> Cache the number of RS in HTable
> --------------------------------
>
>                 Key: HBASE-3767
>                 URL: https://issues.apache.org/jira/browse/HBASE-3767
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.2
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.90.3
>
>         Attachments: HBASE-3767.patch
>
>
> When creating a new HTable we have to query ZK to learn the number of region servers in the cluster. That is done for every single HTable; I think we should instead do it once per JVM and then reuse that number for all the others.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira