Posted to dev@pig.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2012/09/14 10:23:08 UTC

[jira] [Commented] (PIG-2921) Provide a bulkloadable option in HBaseStorage

    [ https://issues.apache.org/jira/browse/PIG-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455665#comment-13455665 ] 

Harsh J commented on PIG-2921:
------------------------------

One problem, though: as far as I can tell, the bulk load has to be performed as the user that HBase runs as.
                
> Provide a bulkloadable option in HBaseStorage
> ---------------------------------------------
>
>                 Key: PIG-2921
>                 URL: https://issues.apache.org/jira/browse/PIG-2921
>             Project: Pig
>          Issue Type: New Feature
>          Components: data
>    Affects Versions: 0.9.2
>            Reporter: Harsh J
>
> Right now, Pig's HBaseStorage writes Puts directly into HBase. This is slow for bulk operations, which are exactly the kind of work Pig does. Puts and Deletes are meant more for realtime operations, so it would be nice if Pig had an automatic mechanism to prepare bulkloadable HFiles for the target table and bulk load them right at the end of the job.
> For compatibility reasons, this can be optional and turned off by default until it is agreed that it should be the default, at which point an option to turn it off can still be provided.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira