Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2018/04/05 23:37:00 UTC
[jira] [Commented] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value
[ https://issues.apache.org/jira/browse/HBASE-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427756#comment-16427756 ]
stack commented on HBASE-14340:
-------------------------------
Reverted from branch-2/2.0.0 by HBASE-18817
> Add second bulk load option to Spark Bulk Load to send puts as the value
> ------------------------------------------------------------------------
>
> Key: HBASE-14340
> URL: https://issues.apache.org/jira/browse/HBASE-14340
> Project: HBase
> Issue Type: New Feature
> Components: spark
> Reporter: Theodore michael Malaska
> Assignee: Theodore michael Malaska
> Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-14340.1.patch, HBASE-14340.2.patch
>
>
> The initial bulk load option for Spark bulk load sends values one by one through the shuffle. This is similar to how the original MR bulk load worked.
> However, the MR bulk loader has more than one bulk load option. A second option allows all the Column Families, Qualifiers, and Values of a row to be combined on the map side.
> This only works if the row is not super wide.
> But if the row is not super wide, this method of sending values through the shuffle reduces the amount of data and work the shuffle has to deal with.
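
The difference between the two options described above can be sketched roughly as follows. This is a plain-Python illustration, not the HBase or Spark API: the sample cells and the function names shuffle_records_per_cell and shuffle_records_per_row are hypothetical, and the lists stand in for the records that would cross the shuffle.

```python
from collections import defaultdict

# Hypothetical input: (row_key, column_family, qualifier, value) cells.
cells = [
    ("row1", "cf1", "a", "1"),
    ("row1", "cf1", "b", "2"),
    ("row1", "cf2", "c", "3"),
    ("row2", "cf1", "a", "4"),
]


def shuffle_records_per_cell(cells):
    """First option: one shuffle record per cell, repeating the row key each time."""
    return [((row, family, qualifier), value)
            for row, family, qualifier, value in cells]


def shuffle_records_per_row(cells):
    """Second option: combine all cells of a row before the shuffle, one record per row."""
    rows = defaultdict(list)
    for row, family, qualifier, value in cells:
        rows[row].append((family, qualifier, value))
    # Sort rows, and cells within each row, so a writer would see them in order.
    return [(row, sorted(row_cells)) for row, row_cells in sorted(rows.items())]


per_cell = shuffle_records_per_cell(cells)
per_row = shuffle_records_per_row(cells)
print(len(per_cell), len(per_row))
```

In this sketch the per-row variant holds every cell of a row in memory at once, which is why the approach is only suitable when rows are not very wide.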
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)