Posted to issues@hbase.apache.org by "nkeywal (JIRA)" <ji...@apache.org> on 2012/12/04 16:29:00 UTC

[jira] [Comment Edited] (HBASE-7247) Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening

    [ https://issues.apache.org/jira/browse/HBASE-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509789#comment-13509789 ] 

nkeywal edited comment on HBASE-7247 at 12/4/12 3:27 PM:
---------------------------------------------------------

bq. Do we not have this currently? If someone else changes the znode, we don't notice? We only notice when we go to update the znode?


I haven't found where we do it. The result of tickleOpening is actually often ignored as well, or at least ignored for a long time.
For example, in OpenRegionHandler#updateMeta(final HRegion r), we have:

{code}
boolean tickleOpening = true;
while (/* condition, but not on tickleOpening */){
  // do something
  tickleOpening = tickleOpening();
}

return something && tickleOpening;
{code}

In theory, we should break the loop when tickleOpening becomes false.


As it's written, though, it seems that we can have a failure once, then a success, because each call overwrites the previous result.
Basically, it seems that tickleOpening is not always used as a check.
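
To make the difference concrete, here is a minimal, self-contained sketch. It is not the actual OpenRegionHandler code: the class name, the method names and the `BooleanSupplier` standing in for tickleOpening() are all hypothetical, and the loop condition is reduced to a simple step count. It only contrasts "keep the last result" with "stop at the first failure".

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not HBase code: 'tickle' stands in for tickleOpening().
class TickleLoopSketch {

    // Pattern described above: the loop never looks at the result,
    // so only the outcome of the last call is returned.
    static boolean runIgnoringFailures(int steps, BooleanSupplier tickle) {
        boolean tickleOpening = true;
        for (int i = 0; i < steps; i++) {
            // do something
            tickleOpening = tickle.getAsBoolean();
        }
        return tickleOpening;
    }

    // Alternative: treat the result as a check and stop at the first failure.
    static boolean runStoppingOnFailure(int steps, BooleanSupplier tickle) {
        for (int i = 0; i < steps; i++) {
            // do something
            if (!tickle.getAsBoolean()) {
                return false; // a later success can no longer hide this failure
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulated tickle that fails on its second call only.
        int[] calls = {0};
        boolean ignoring = runIgnoringFailures(3, () -> ++calls[0] != 2);

        int[] calls2 = {0};
        boolean stopping = runStoppingOnFailure(3, () -> ++calls2[0] != 2);

        System.out.println("ignoring failures: " + ignoring);
        System.out.println("stopping on failure: " + stopping
            + " after " + calls2[0] + " calls");
    }
}
```

With a tickle that fails once and then succeeds, the first variant reports success while the second stops at the failed call. Whether the real loop should return, break, or abort the open in another way is a separate design question.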

                
      was (Author: nkeywal):
    bq. Do we not have this currently? If someone else changes the znode, we don't notice? We only notice when we go to update the znode?


I haven't found where we do it. The result of tickleOpening is actually often ignored as well, or at least ignored for a long time.
For example, in OpenRegionHandler#updateMeta(final HRegion r), we have:

{code}
boolean tickleOpening = true;
while (/* condition, but not on tickleOpening */){
  // do something
  tickleOpening = tickleOpening();
}

return something && tickleOpening;
{code}

In theory, we should break the loop when tickleOpening becomes false.


As it's written, though, it seems that we can have a failure once, then a success.
Basically, it seems that tickleOpening is not always used as a check.

There is about the same code in OpenRegionHandler#process (i.e. an error in tickleOpening does not interrupt the process). 
                  
> Assignment performances decreased by 50% because of regionserver.OpenRegionHandler#tickleOpening
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-7247
>                 URL: https://issues.apache.org/jira/browse/HBASE-7247
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, Region Assignment, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Critical
>
> The regionserver.OpenRegionHandler#tickleOpening updates the region znode, with the comment "Do this so master doesn't timeout this region-in-transition.".
> However, on the usual test, this makes the assignment time of 1500 regions go from 70s to 100s, that is, we're 50% slower because of this.
> More generally, ZooKeeper commits every data update to disk, and this takes time. Using it to provide a keep-alive seems overkill. At the very least, it could be made asynchronous.
> I'm not sure how necessary these updates are (I need to go deeper into the internals, feedback welcome), but it seems very important to optimize this... The trivial fix would be to make this optional.
