Posted to issues@hbase.apache.org by "Tim Robertson (JIRA)" <ji...@apache.org> on 2016/10/18 13:14:58 UTC

[jira] [Comment Edited] (HBASE-12596) bulkload needs to follow locality

    [ https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585264#comment-15585264 ] 

Tim Robertson edited comment on HBASE-12596 at 10/18/16 1:14 PM:
-----------------------------------------------------------------

Do any of the original committers recall whether there was a technical reason why this could not be applied to 1.0.0?
I'm about to try to produce a patched implementation for a running 1.0 (CDH 5.4.10) installation, but if someone knows this is doomed it would be nice to know.

Edited: Yes, it works, and a patched HFileOutputFormat2 based on the CDH 5.4.10 version (HBase 1.0.0-cdh5.4.10) is here:
https://github.com/gbif/maps/commit/ee4e0001486f3e8b37b034c5b05fc8c8d4e76ab9

It is used as: PatchedHFileOutputFormat2.configureIncrementalLoad(job, table);



was (Author: timrobertson100):
Do any of the original committers recall if there was a technical reason why this could not be applied to 1.0.0?
I'm about to try and produce a patched implementation for a running 1.0 (CDH 5.4.10) installation but if someone knows this is doomed it would be nice to know. 

> bulkload needs to follow locality
> ---------------------------------
>
>                 Key: HBASE-12596
>                 URL: https://issues.apache.org/jira/browse/HBASE-12596
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile, regionserver
>    Affects Versions: 0.98.8
>         Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7
>            Reporter: Victor Xu
>            Assignee: Victor Xu
>             Fix For: 2.0.0, 0.98.14, 1.3.0
>
>         Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-0.98-v2.patch, HBASE-12596-0.98-v3.patch, HBASE-12596-0.98-v4.patch, HBASE-12596-0.98-v5.patch, HBASE-12596-0.98-v6.patch, HBASE-12596-branch-1-v1.patch, HBASE-12596-branch-1-v2.patch, HBASE-12596-master-v1.patch, HBASE-12596-master-v2.patch, HBASE-12596-master-v3.patch, HBASE-12596-master-v4.patch, HBASE-12596-master-v5.patch, HBASE-12596-master-v6.patch, HBASE-12596.patch
>
>
> Normally, a bulkload takes two steps: 1. use a job to write the HFiles to be loaded; 2. move these HFiles to the right HDFS directory. However, locality can be lost during the first step. Why not just write the HFiles directly into the right place? We can do this easily because StoreFile.WriterBuilder has the "withFavoredNodes" method, and we just need to call it in HFileOutputFormat's getNewWriter().
> This feature is enabled by default, and we could use 'hbase.bulkload.locality.sensitive.enabled=false' to disable it.
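
A minimal, self-contained sketch of the idea in the description above: find the server that hosts the region containing a row key and offer it as the favored node for the HFile being written. It deliberately uses no HBase classes. FavoredNodeSketch, the TreeMap lookup, the host names, and the port are hypothetical stand-ins; the boolean only mimics the effect of hbase.bulkload.locality.sensitive.enabled; and keys are compared as Java strings rather than as byte arrays. In HBase itself, addresses like these are what would be handed to withFavoredNodes.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model only: not the HBase implementation or its API.
class FavoredNodeSketch {
    // Stand-in for a region-location lookup: region start key -> hosting server.
    private final TreeMap<String, InetSocketAddress> regionStartKeyToServer = new TreeMap<>();
    // Mimics hbase.bulkload.locality.sensitive.enabled (assumed semantics).
    private final boolean localitySensitive;

    public FavoredNodeSketch(boolean localitySensitive) {
        this.localitySensitive = localitySensitive;
    }

    public void addRegion(String startKey, String host, int port) {
        // createUnresolved avoids a DNS lookup for the made-up host names.
        regionStartKeyToServer.put(startKey, InetSocketAddress.createUnresolved(host, port));
    }

    // Returns the favored nodes for the file that will hold rowKey,
    // or an empty array when the feature is off or no region matches.
    public InetSocketAddress[] favoredNodesFor(String rowKey) {
        if (!localitySensitive) {
            return new InetSocketAddress[0];
        }
        // The owning region is the one with the greatest start key <= rowKey.
        Map.Entry<String, InetSocketAddress> region = regionStartKeyToServer.floorEntry(rowKey);
        if (region == null) {
            return new InetSocketAddress[0];
        }
        return new InetSocketAddress[] { region.getValue() };
    }

    public static void main(String[] args) {
        FavoredNodeSketch sketch = new FavoredNodeSketch(true);
        sketch.addRegion("", "rs1.example.org", 16020);  // first region, empty start key
        sketch.addRegion("m", "rs2.example.org", 16020); // second region starts at "m"
        for (String row : new String[] { "apple", "zebra" }) {
            InetSocketAddress[] nodes = sketch.favoredNodesFor(row);
            System.out.println(row + " -> " + nodes[0].getHostString());
        }
    }
}
```

With the feature flag modelled as false, favoredNodesFor returns an empty array, which corresponds to letting HDFS place the blocks wherever it chooses.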



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)