Posted to dev@crunch.apache.org by "Josh Wills (JIRA)" <ji...@apache.org> on 2012/12/12 09:27:21 UTC

[jira] [Updated] (CRUNCH-127) Allow multiple HBaseTargets in a single pipeline

     [ https://issues.apache.org/jira/browse/CRUNCH-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated CRUNCH-127:
------------------------------

    Attachment: CRUNCH-127.patch

First cut at this. I banged my head against making HBaseTarget work with MultipleOutputs, to no avail. In the process, I rewrote most of the MultipleOutputs code to make it work more like CrunchInputs, which has some advantages (and some disadvantages) that might be worth exploring later.

In the meantime, here's a simple patch that adds support for HBase's MultiTableOutputFormat. With this change, the key is the name of the table to write to, and the value is either a Put or a Delete, so the target needs to be given a PTable<ImmutableBytesWritable, Put|Delete> in order to work. I still need to write an integration test, but let me know if you get a chance to bang on it.
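
To make the key/value contract above concrete, here is a rough plain-Java sketch of routing by table name. It does not use Crunch or HBase classes and is not taken from the patch: MultiTableRoutingSketch, Kind, Mutation, Record and route are hypothetical stand-ins, with a String playing the role of the ImmutableBytesWritable table-name key and Mutation playing the role of a Put or Delete.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "key = table name, value = Put or Delete".
// All types here are hypothetical stand-ins, not HBase or Crunch classes.
class MultiTableRoutingSketch {

    // Stand-in for the two mutation types the output format accepts.
    enum Kind { PUT, DELETE }

    // Stand-in for an HBase Put or Delete against a single row.
    static final class Mutation {
        final Kind kind;
        final String row;

        Mutation(Kind kind, String row) {
            this.kind = kind;
            this.row = row;
        }

        @Override
        public String toString() {
            return kind + ":" + row;
        }
    }

    // One (table name, mutation) entry, mirroring one record of the PTable.
    static final class Record {
        final String table;
        final Mutation mutation;

        Record(String table, Mutation mutation) {
            this.table = table;
            this.mutation = mutation;
        }
    }

    // Groups mutations by the table named in each record's key, so that
    // every mutation reaches its own table rather than the first one seen.
    static Map<String, List<Mutation>> route(List<Record> records) {
        Map<String, List<Mutation>> byTable = new LinkedHashMap<>();
        for (Record r : records) {
            byTable.computeIfAbsent(r.table, t -> new ArrayList<>()).add(r.mutation);
        }
        return byTable;
    }

    public static void main(String[] args) {
        List<Record> records = new ArrayList<>();
        records.add(new Record("users", new Mutation(Kind.PUT, "row1")));
        records.add(new Record("events", new Mutation(Kind.PUT, "row2")));
        records.add(new Record("users", new Mutation(Kind.DELETE, "row3")));

        // prints {users=[PUT:row1, DELETE:row3], events=[PUT:row2]}
        System.out.println(route(records));
    }
}
```

The point of the sketch is only that the destination travels with each record, which is why a single output can serve several tables; the real output format writes to HBase instead of building an in-memory map.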
                
> Allow multiple HBaseTargets in a single pipeline
> ------------------------------------------------
>
>                 Key: CRUNCH-127
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-127
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>         Attachments: CRUNCH-127.patch
>
>
> Currently, when a pipeline contains writes to multiple HBaseTargets, all puts are sent to the first configured HBaseTarget and the second one is ignored, which causes issues if the columns are not the same.
