Posted to issues@hbase.apache.org by "Jeffrey Zhong (JIRA)" <ji...@apache.org> on 2013/03/06 04:00:13 UTC
[jira] [Commented] (HBASE-6772) Make the Distributed Split HDFS Location aware

    [ https://issues.apache.org/jira/browse/HBASE-6772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594275#comment-13594275 ] 

Jeffrey Zhong commented on HBASE-6772:
--------------------------------------

I have slightly updated the design for this JIRA:

Since the master kicks off the log splitting work, it knows 1) the locations of the hlog files and 2) the currently online region servers.

Therefore, the master knows whether a hlog can possibly be handled by a region server that is local to the hlog file.

The updated design:
1) The master identifies the set of hlog files that can potentially be handled by region servers local to them.
2) Add a new attribute, "reserveForLocalRSExpirationTime", to the ZK log entry.
3) Set its value to a future time (current time + 0.2s delay) for hlog files that can potentially be handled by local region servers, and to 0 for hlogs that can't be handled by local region servers.
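
To make steps 1) to 3) concrete, here is a minimal sketch of how the master could compute the attribute value. It is only an illustration under assumptions: the class and method names are hypothetical and not existing HBase code, locations are represented as plain strings, and the current time is passed in to keep the example deterministic.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical sketch, not HBase API: computes the value the master would
// store in the "reserveForLocalRSExpirationTime" attribute of a ZK log entry.
final class LocalReservation {
    // The 0.2s delay from the design, in milliseconds.
    static final long RESERVE_DELAY_MS = 200L;

    /**
     * @param hlogLocations     locations holding the hlog file (assumed representation)
     * @param onlineRsLocations locations of the currently online region servers
     * @param nowMs             current time in milliseconds
     * @return nowMs + 0.2s if some online region server is local to the hlog, else 0
     */
    static long expirationTimeFor(Set<String> hlogLocations,
                                  Set<String> onlineRsLocations,
                                  long nowMs) {
        // A local region server exists when the two location sets overlap.
        boolean hasLocalRs = !Collections.disjoint(hlogLocations, onlineRsLocations);
        return hasLocalRs ? nowMs + RESERVE_DELAY_MS : 0L;
    }
}
```

With this shape, a hlog that has no local region server gets 0, a time that has already passed, so nobody needs to wait for it.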

When a RS picks a hlog from ZK and sees that the hlog isn't local to itself but "reserveForLocalRSExpirationTime" has expired, it goes ahead and handles the hlog immediately without waiting.
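
The check on the region server side could look roughly like the sketch below. Again, the names are hypothetical and this is my reading of the rule rather than actual code: a RS takes a hlog right away if the hlog is local to it, or if the reservation has already expired (a value of 0 counts as expired).

```java
// Hypothetical sketch, not HBase API: decides whether a region server may
// take a hlog split task immediately.
final class SplitTaskPicker {
    /**
     * @param hlogIsLocal                     whether the hlog is local to this region server
     * @param reserveForLocalRSExpirationTime value read from the ZK log entry (0 = no reservation)
     * @param nowMs                           current time in milliseconds
     */
    static boolean canTakeNow(boolean hlogIsLocal,
                              long reserveForLocalRSExpirationTime,
                              long nowMs) {
        // Local region servers never wait; others wait until the reservation expires.
        return hlogIsLocal || nowMs >= reserveForLocalRSExpirationTime;
    }
}
```

For example, with a reservation ending at t=1200 ms, a non-local RS calling canTakeNow(false, 1200L, 1000L) is told to hold off, while the same call at t=1200 ms or later lets it proceed.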

So we avoid the 1st drawback mentioned in the JIRA description (the 0.2s delay added when no region server is available on one of the hlog's locations). That situation is likely to happen for an HBase cluster sitting on a large Hadoop cluster.

In addition, we define a region server as "local" to a hlog file if and only if it is within the same rack as the hlog file. This simplifies the implementation and still gets most of the benefits of this JIRA.

Thanks,
-Jeffrey



                
> Make the Distributed Split HDFS Location aware
> ----------------------------------------------
>
>                 Key: HBASE-6772
>                 URL: https://issues.apache.org/jira/browse/HBASE-6772
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: Jeffrey Zhong
>
> During a hlog split, each log file (a single hdfs block) is allocated to a different region server. This region server reads the file and creates the recovery edit files.
> The allocation to the region server is random. We could take into account the locations of the log file to split:
> - the reads would be local, hence faster. This allows short circuit as well.
> - less network i/o used during a failure (and this is important)
> - we would be sure to read from a working datanode, hence we're sure we won't have read errors. Read errors slow the split process a lot, as we often enter the "timeouted world". 
> We need to limit the calls to the namenode however.
> Typical algo could be:
> - the master gets the locations of the hlog files
> - it writes them into ZK, if possible in one transaction (this way all the tasks are visible all together, allowing some arbitration by the region server).
> - when the regionserver receives the event, it checks for all logs and all locations.
> - if there is a match, it takes it
> - if not, it waits something like 0.2s (to give other regionservers time to take it if the location matches), and takes any remaining task.
> Drawbacks are:
> - a 0.2s delay added if there is no regionserver available on one of the locations. It's likely possible to remove it with some extra synchronization.
> - Small increase in complexity and dependency on HDFS
> Considering the advantages, it's worth it imho.
