Posted to issues@hbase.apache.org by "Karthick Sankarachary (JIRA)" <ji...@apache.org> on 2011/03/18 23:21:29 UTC

[jira] Created: (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Reduce HTable Pool Contention Using Concurrent Collections
----------------------------------------------------------

                 Key: HBASE-3673
                 URL: https://issues.apache.org/jira/browse/HBASE-3673
             Project: HBase
          Issue Type: Improvement
          Components: client
    Affects Versions: 0.90.1, 0.90.2
            Reporter: Karthick Sankarachary
            Assignee: Karthick Sankarachary
            Priority: Minor
             Fix For: 0.90.0


In the case of medium-to-large HTable pools, the amount of time the client spends blocking on the underlying map and queue data structures turns out to be quite significant. Using an efficient wait-free implementation of maps and queues might serve to reduce the contention on the pool. In particular, I was wondering if we should replace the synchronized map with a concurrent hash map, and the linked list with a concurrent linked queue.
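
To make the proposal concrete, here is a minimal sketch of a pool built on a concurrent hash map of concurrent linked queues. It is not the attached patch: the class name SimplePool, its methods, and the generic resource type are hypothetical, and the real HTablePool API is not shown in this thread.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical pool sketch; NOT the HBase HTablePool implementation.
final class SimplePool<T> {
    // One queue of idle resources per table name; no global lock is taken.
    private final ConcurrentMap<String, Queue<T>> idle = new ConcurrentHashMap<>();
    private final Function<String, T> factory;
    private final int maxSize;

    SimplePool(int maxSize, Function<String, T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    // Hands out an idle resource for the name, or creates one if none is idle.
    T get(String name) {
        Queue<T> queue = idle.get(name);
        T resource = (queue == null) ? null : queue.poll();
        return (resource != null) ? resource : factory.apply(name);
    }

    // Returns a resource to the pool. A false result means the queue was
    // already at capacity and the caller should dispose of the resource.
    boolean put(String name, T resource) {
        Queue<T> queue = idle.computeIfAbsent(name, k -> new ConcurrentLinkedQueue<>());
        // size() on ConcurrentLinkedQueue is O(n) and only a snapshot under
        // concurrent updates, so maxSize is a soft cap in this sketch.
        if (queue.size() >= maxSize) {
            return false;
        }
        return queue.offer(resource);
    }

    // Number of idle resources currently queued for the name.
    int idleCount(String name) {
        Queue<T> queue = idle.get(name);
        return (queue == null) ? 0 : queue.size();
    }
}
```

The trade-off in this sketch is that the size check and the offer are not atomic, so the queue can briefly exceed maxSize under contention; a synchronized pool avoids that at the cost of the blocking described above.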

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010533#comment-13010533 ] 

Karthick Sankarachary commented on HBASE-3673:
----------------------------------------------

The patch did seem to make a difference in our particular use case, in terms of the average time it took to get an htable from the pool. For the sake of a more formal evaluation, I put together a benchmark (see attached test case), which did show a noticeable difference. Specifically, with an htable pool of size 150 and a worker pool of 100 threads, where each thread does a get followed by a put a million times, the total time spent was 7775614 ms with the patch, as opposed to 12654820 ms without. That turns out to be roughly a 39% improvement.
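
For reference, the two totals quoted above can be checked with a few lines of arithmetic. The class and method names here are made up for illustration; only the two millisecond figures come from the benchmark.

```java
import java.util.Locale;

// Worked arithmetic for the benchmark totals quoted above (names are hypothetical).
final class BenchmarkDelta {
    // Percentage reduction in total time going from beforeMs to afterMs.
    static double percentReduction(long beforeMs, long afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    // Ratio of the unpatched total to the patched total.
    static double speedup(long beforeMs, long afterMs) {
        return (double) beforeMs / afterMs;
    }

    public static void main(String[] args) {
        long withoutPatchMs = 12_654_820L; // total time without the patch
        long withPatchMs = 7_775_614L;     // total time with the patch
        System.out.printf(Locale.ROOT, "reduction: %.1f%%, speedup: %.2fx%n",
                percentReduction(withoutPatchMs, withPatchMs),
                speedup(withoutPatchMs, withPatchMs));
    }
}
```

The reduction comes to about 38.6% of the unpatched total, which is the same as saying the patched run is about 1.63 times faster.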

That said, using different htable pools for different use cases (e.g., different htables) might be the way to go, as that will tend to reduce the level of concurrency on any given htable pool.

> Reduce HTable Pool Contention Using Concurrent Collections
> ----------------------------------------------------------
>
>                 Key: HBASE-3673
>                 URL: https://issues.apache.org/jira/browse/HBASE-3673
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.90.1, 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3673-TESTCASE.patch, HBASE-3673.patch
>
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> In the case of medium-to-large HTable pools, the amount of time the client spends blocking on the underlying map and queue data structures turns out to be quite significant. Using an efficient wait-free implementation of maps and queues might serve to reduce the contention on the pool. In particular, I was wondering if we should replace the synchronized map with a concurrent hash map, and the linked list with a concurrent linked queue.

[jira] [Commented] (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011486#comment-13011486 ] 

Hudson commented on HBASE-3673:
-------------------------------

Integrated in HBase-TRUNK #1814 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1814/])
    

[jira] Updated: (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3673:
-----------------------------------------

    Attachment: HBASE-3673.patch

[jira] [Updated] (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3673:
-----------------------------------------

    Attachment: HBASE-3673-TESTCASE.patch

[jira] Commented: (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008857#comment-13008857 ] 

stack commented on HBASE-3673:
------------------------------

Patch looks good Karthick.  You tried it?  Any noticeable difference?  Good on you K.

[jira] Updated: (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3673:
-----------------------------------------

    Status: Patch Available  (was: Open)

[jira] [Updated] (HBASE-3673) Reduce HTable Pool Contention Using Concurrent Collections

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3673:
-------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.90.0)
                   0.92.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks for the patch Karthick.
