Posted to issues@hbase.apache.org by "xi chaomin (Jira)" <ji...@apache.org> on 2023/01/17 06:18:00 UTC

[jira] [Resolved] (HBASE-26225) let hbase.mapreduce.bulkload.assign.sequenceNumbers take effect in SecureBulkLoadManager.secureBulkLoadHFiles

     [ https://issues.apache.org/jira/browse/HBASE-26225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xi chaomin resolved HBASE-26225.
--------------------------------
    Resolution: Won't Fix

> let hbase.mapreduce.bulkload.assign.sequenceNumbers take effect in SecureBulkLoadManager.secureBulkLoadHFiles
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26225
>                 URL: https://issues.apache.org/jira/browse/HBASE-26225
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: xi chaomin
>            Priority: Minor
>         Attachments: SecureBulkLoadManager.patch
>
>
> HBASE-10958 added a flush before BulkLoad so that the load obtains the latest sequenceId, which prevents data loss during replay. '_hbase.mapreduce.bulkload.assign.sequenceNumbers_' controls whether to flush before BulkLoad, but *SecureBulkLoadManager* always passes true for that flag. If we bulk load frequently, this flushes a lot of small files. Can we make 'hbase.mapreduce.bulkload.assign.sequenceNumbers' take effect in SecureBulkLoadManager? That would pass -1 as the sequenceId, and we would not lose data.
> SecureBulkLoadManager.java, secureBulkLoadHFiles:
> {code:java}
> return region.bulkLoadHFiles(familyPaths, true, new SecureBulkLoadListener(fs, bulkToken, conf), request.getCopyFile(), clusterIds, request.getReplicate());
> {code}
> HRegion.java
> {code:java}
> public Map<byte[], List<Path>> bulkLoadHFiles(Collection<Pair<byte[], String>> familyPaths,
>     boolean assignSeqId, BulkLoadListener bulkLoadListener, boolean copyFile,
>     List<String> clusterIds, boolean replicate)
> {code}
>  
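
The request above can be illustrated with a small standalone sketch. It is not the attached patch and does not use HBase classes: the class name, the Map-based configuration, and both helper methods are hypothetical stand-ins, and the default of true for the property is an assumption. Only the property name and the "-1 as the sequenceId" behaviour come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: derive the flush/assignSeqId flag from configuration
// instead of hard-coding true, as the issue proposes.
public class BulkLoadFlushSketch {
    // Property name quoted in the issue description.
    public static final String ASSIGN_SEQ_IDS_KEY =
        "hbase.mapreduce.bulkload.assign.sequenceNumbers";

    // Stand-in for a configuration lookup. Assumption: the flag defaults
    // to true when the property is not set.
    public static boolean shouldAssignSeqId(Map<String, String> conf) {
        String value = conf.get(ASSIGN_SEQ_IDS_KEY);
        return value == null || Boolean.parseBoolean(value);
    }

    // Stand-in for the sequence id choice described in the issue: when the
    // flag is off, no flush happens and -1 is used as the sequence id.
    public static long chooseSequenceId(boolean assignSeqId, long flushedSeqId) {
        return assignSeqId ? flushedSeqId : -1L;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ASSIGN_SEQ_IDS_KEY, "false");
        boolean assignSeqId = shouldAssignSeqId(conf);
        // In the real code path this flag would take the place of the
        // literal true in the bulkLoadHFiles call quoted above.
        System.out.println("assignSeqId=" + assignSeqId
            + ", sequenceId=" + chooseSequenceId(assignSeqId, 42L));
    }
}
```

With the property set to false, the sketch skips the flush path and reports -1 as the sequence id; with the property unset, it behaves like the current hard-coded true.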



--
This message was sent by Atlassian Jira
(v8.20.10#820010)