Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2021/11/23 16:34:00 UTC

[jira] (HBASE-26250) Automatic and near real-time healing of locality

    [ https://issues.apache.org/jira/browse/HBASE-26250 ]


    Bryan Beaudreault deleted comment on HBASE-26250:
    -------------------------------------------

was (Author: bbeaudreault):
While I'm looking for overall feedback on the approach so that I can contribute this tool to the community, I'd also like to specifically request feedback/guidance on the design described in the "Refreshing HFiles" section. As some of the devs know, I'm working on a major version upgrade at HubSpot and need to update our internal tool to work with HBase2. The internals in this area have changed a bit and I don't want to assume that a straight port of our cdh5-based implementation will be the best approach for HBase2. Thanks!

> Automatic and near real-time healing of locality
> ------------------------------------------------
>
>                 Key: HBASE-26250
>                 URL: https://issues.apache.org/jira/browse/HBASE-26250
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>
> I’m proposing a fairly major new tool for quickly and efficiently alleviating latency problems caused by poor data locality. This is especially useful in cloud environments, and has been highly impactful at HubSpot, where we run thousands of RegionServers across 40+ multi-zone clusters. Please see the attached design doc for details on the problem, why compactions are not enough to solve it, and an overview (with diagram) of the components that make up this new tool.
> As spec'd, this new feature would require submission of a new tool in the HDFS project. Once we reach consensus on the approach I can create the relevant upstream HDFS JIRA.
> See the design doc here: [https://docs.google.com/document/d/1GLGzrF1QLyhyOCr2fFw0LCymnyFPT0ktShTaaXn-75A/edit#heading=h.aswo7shg76b6]
> Note: This issue is an attempt to upstream a tool that has been fully deployed on all production clusters at HubSpot for about 6 months. It's been very effective for us as currently implemented, but will need to be reorganized and redesigned a bit to fit into the HBase/HDFS projects. As such, I'd like feedback on the design before putting too much effort into porting multiple components into PRs.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)