
Posted to user@hbase.apache.org by Troy Bryant <Tr...@telus.com> on 2014/02/27 17:37:05 UTC

Row Deletion by Column Value

Hi all,

We're using HBase 0.92.2.

We're looking into custom garbage collection methods. Due to some business logic, we'd like to be able to delete rows based on the value of one of the columns. These deletes can be eventual rather than immediate. We have written a MapReduce job that works, but we aren't sure it will be fast enough in the long run.

I have two questions:
Would it be possible to implement a coprocessor that would essentially do the column value check during a major compaction, and only write rows that pass the check?  I'm not sure this is feasible because based on what I understand, the reads occur at the key-value level and not the row level.
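
To make the key-value versus row concern concrete, here is a minimal plain-Java sketch. It does not use the HBase coprocessor API; the `Cell` class, the `rewrite` method and the `cf:status` column are hypothetical stand-ins. It assumes cells arrive sorted by row, and it shows that a row-level decision needs the cells of the current row buffered until the row ends, because the column being checked may arrive after other cells of the same row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowFilterSketch {
    /** Hypothetical stand-in for an HBase KeyValue: one cell of one row. */
    static final class Cell {
        final String row, column, value;
        Cell(String row, String column, String value) {
            this.row = row; this.column = column; this.value = value;
        }
    }

    /**
     * Copies cells to the output, dropping every row in which checkColumn
     * holds expiredValue. The input must be sorted by row.
     */
    static List<Cell> rewrite(List<Cell> sorted, String checkColumn, String expiredValue) {
        List<Cell> out = new ArrayList<Cell>();
        List<Cell> buffer = new ArrayList<Cell>(); // cells of the row in progress
        boolean drop = false;
        for (Cell c : sorted) {
            if (!buffer.isEmpty() && !buffer.get(0).row.equals(c.row)) {
                if (!drop) out.addAll(buffer);     // previous row ended: keep or discard it
                buffer.clear();
                drop = false;
            }
            buffer.add(c);
            if (c.column.equals(checkColumn) && c.value.equals(expiredValue)) {
                drop = true;                       // the whole row fails the check
            }
        }
        if (!drop) out.addAll(buffer);             // handle the last row
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("row1", "cf:data", "a"),
            new Cell("row1", "cf:status", "expired"),
            new Cell("row2", "cf:data", "b"),
            new Cell("row2", "cf:status", "active"));
        // Only the two cells of row2 are printed; row1 is dropped as a whole.
        for (Cell c : rewrite(cells, "cf:status", "expired")) {
            System.out.println(c.row + " " + c.column + " " + c.value);
        }
    }
}
```

The cost of this approach is the per-row buffer, so very wide rows would need more care than this sketch takes.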

Since our deletes can be eventual, would it be possible/faster to just tombstone the rows rather than delete them during our MapReduce job, and let the major compaction handle the actual deletion?  If I'm not mistaken, addDeleteMarker would be the method for this.
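
For context on the tombstone idea: in HBase's general design a delete is recorded as a marker that hides older cells from reads, and the hidden data is physically removed at a later major compaction. The plain-Java model below illustrates only that behaviour. It is not the HBase client API, and `TombstoneSketch` and all of its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TombstoneSketch {
    /** Hypothetical stored entry: either a data cell or a delete marker. */
    static final class Cell {
        final String row; final long ts; final boolean isDeleteMarker;
        Cell(String row, long ts, boolean isDeleteMarker) {
            this.row = row; this.ts = ts; this.isDeleteMarker = isDeleteMarker;
        }
    }

    private List<Cell> store = new ArrayList<Cell>();

    void put(String row, long ts)    { store.add(new Cell(row, ts, false)); }
    // A delete only appends a marker; nothing is rewritten at this point.
    void delete(String row, long ts) { store.add(new Cell(row, ts, true)); }

    /** A data cell is masked by a marker on the same row with ts >= the cell's ts. */
    private boolean masked(Cell c) {
        for (Cell m : store) {
            if (m.isDeleteMarker && m.row.equals(c.row) && m.ts >= c.ts) return true;
        }
        return false;
    }

    /** Rows a reader would see: data cells that no marker masks. */
    List<String> visibleRows() {
        List<String> rows = new ArrayList<String>();
        for (Cell c : store) {
            if (!c.isDeleteMarker && !masked(c)) rows.add(c.row);
        }
        return rows;
    }

    int storedCells() { return store.size(); }

    /** Models a major compaction: markers and the cells they mask are dropped. */
    void majorCompact() {
        List<Cell> survivors = new ArrayList<Cell>();
        for (Cell c : store) {
            if (!c.isDeleteMarker && !masked(c)) survivors.add(c);
        }
        store = survivors;
    }

    public static void main(String[] args) {
        TombstoneSketch t = new TombstoneSketch();
        t.put("row1", 1L);
        t.put("row2", 2L);
        t.delete("row1", 5L);
        System.out.println(t.visibleRows() + " stored=" + t.storedCells()); // [row2] stored=3
        t.majorCompact();
        System.out.println(t.visibleRows() + " stored=" + t.storedCells()); // [row2] stored=1
    }
}
```

In this model the row disappears from reads as soon as the marker is written, while the storage is reclaimed only when the compaction runs, which is the "eventual" behaviour described above.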

Thanks for your time.

Troy Bryant

Re: Row Deletion by Column Value

Posted by Ted Yu <yu...@gmail.com>.

bq. addDeleteMarker would be the method for this.

You can use the above method.

BTW, 0.92.2 is quite old - there have been 3 major releases since: 0.94, 0.96
and 0.98 :-)

