Posted to issues@hbase.apache.org by "Bogdan-Alexandru Matican (JIRA)" <ji...@apache.org> on 2011/06/15 01:30:47 UTC

[jira] [Commented] (HBASE-3967) Support deletes in HFileOutputFormat based bulk import mechanism

    [ https://issues.apache.org/jira/browse/HBASE-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049511#comment-13049511 ] 

Bogdan-Alexandru Matican commented on HBASE-3967:
-------------------------------------------------

OK, I think I've managed to make this work. However, I couldn't simply abstract up and use Row directly as the mapper output type, because of the following lines in org.apache.hadoop.mapred.MapTask:

    if (key.getClass() != keyClass) {
      throw new IOException("Type mismatch in key from map: expected "
                            + keyClass.getName() + ", recieved "
                            + key.getClass().getName());
    }

and the corresponding check for the value.

The check compares exact classes rather than testing for a subtype, so even if I declared Row as the map output class and wrote a Put or a Delete to the map context, it would fail at this check. As such, I created an abstraction that acts as a union of _either_ a Put or a Delete and can be built from either.
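
To make the idea concrete, here is a minimal, self-contained sketch of such a union type. It is not the code from the patch: Row, Put and Delete below are simplified stand-ins for the HBase classes, PutOrDelete and checkValueType are hypothetical names chosen for illustration, and checkValueType only mirrors the exact-class comparison quoted above. A real version would also need to be serializable by Hadoop (for example by implementing Writable), which this sketch omits.

```java
import java.io.IOException;
import java.util.Objects;

public class PutOrDeleteDemo {
    // Simplified stand-ins for HBase's Row, Put and Delete (NOT the real classes).
    interface Row { byte[] getRow(); }

    static final class Put implements Row {
        private final byte[] row;
        Put(byte[] row) { this.row = row; }
        public byte[] getRow() { return row; }
    }

    static final class Delete implements Row {
        private final byte[] row;
        Delete(byte[] row) { this.row = row; }
        public byte[] getRow() { return row; }
    }

    // Hypothetical union type: holds exactly one of a Put or a Delete.
    // Every instance has the same runtime class, whichever mutation it wraps.
    static final class PutOrDelete {
        private final Put put;
        private final Delete delete;

        PutOrDelete(Put put) {
            this.put = Objects.requireNonNull(put);
            this.delete = null;
        }

        PutOrDelete(Delete delete) {
            this.put = null;
            this.delete = Objects.requireNonNull(delete);
        }

        boolean isPut() { return put != null; }
        boolean isDelete() { return delete != null; }
        Put getPut() { return put; }          // null when this wraps a Delete
        Delete getDelete() { return delete; } // null when this wraps a Put
        Row asRow() { return isPut() ? put : delete; }
    }

    // Mirrors the exact-class comparison from MapTask (illustration only).
    static void checkValueType(Object value, Class<?> valueClass) throws IOException {
        if (value.getClass() != valueClass) {
            throw new IOException("Type mismatch in value from map: expected "
                + valueClass.getName() + ", received " + value.getClass().getName());
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] rowKey = {1, 2, 3};

        // Declaring Row as the output class fails: Put.class != Row.class.
        try {
            checkValueType(new Put(rowKey), Row.class);
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        // Wrapping either mutation in the union passes the same check.
        checkValueType(new PutOrDelete(new Put(rowKey)), PutOrDelete.class);
        checkValueType(new PutOrDelete(new Delete(rowKey)), PutOrDelete.class);
        System.out.println("Union values accepted");
    }
}
```

The downstream consumer can then branch on isPut() or isDelete() to recover the original mutation.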

> Support deletes in HFileOutputFormat based bulk import mechanism
> ----------------------------------------------------------------
>
>                 Key: HBASE-3967
>                 URL: https://issues.apache.org/jira/browse/HBASE-3967
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>
> During bulk imports, it'll be useful to be able to do delete mutations (either to delete data that already exists in HBase or was inserted earlier during this run of the import). 
> For example, we have a use case, where we are processing a log of data which may have both inserts and deletes in the mix and we want to upload that into HBase using the bulk import mechanism.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira