Posted to user@hbase.apache.org by Billy Pearson <sa...@pearsonwholesale.com> on 2011/01/12 01:07:30 UTC

incrementColumnValue

Is there a way to write a MapReduce job that uses incrementColumnValue in
place of Put?

I am trying to move a job over from Thrift and need to be able to use
incrementColumnValue as the output, but I cannot seem to work it out
without calling HTable in every map.

A small example would be nice if anyone is using it now.
Billy

Re: incrementColumnValue

Posted by "M. C. Srivas" <mc...@gmail.com>.

"increment" and "decrement" are not idempotent. Map/Reduce requires you do
things in an idempotent fashion (the same task may get executed multiple
times, even simultaneously).
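
To make the idempotency point concrete, here is a minimal in-memory sketch
in plain Java. It uses no HBase classes: a HashMap stands in for a table
cell, and IdempotencyDemo, runPut and runIncrement are hypothetical names
chosen for illustration. It models the same task being executed more than
once against one cell. A Put-style overwrite ends in the same state however
often it is replayed, while an increment is applied again on every attempt.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a HashMap stands in for a table cell store.
class IdempotencyDemo {

    // Put-style write: overwrites the cell, so replaying it changes nothing.
    static long runPut(int attempts) {
        Map<String, Long> table = new HashMap<>();
        for (int i = 0; i < attempts; i++) {
            table.put("row1", 5L);
        }
        return table.get("row1");
    }

    // Increment-style write: every replayed attempt adds to the cell again.
    static long runIncrement(int attempts) {
        Map<String, Long> table = new HashMap<>();
        for (int i = 0; i < attempts; i++) {
            table.merge("row1", 5L, Long::sum);
        }
        return table.get("row1");
    }

    public static void main(String[] args) {
        System.out.println("put, 1 attempt:        " + runPut(1));
        System.out.println("put, 2 attempts:       " + runPut(2));
        System.out.println("increment, 1 attempt:  " + runIncrement(1));
        System.out.println("increment, 2 attempts: " + runIncrement(2));
    }
}
```

With two attempts the put result is unchanged, but the increment result is
doubled, which is the kind of overcount a re-executed task would produce.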


Re: incrementColumnValue

Posted by Billy Pearson <sa...@pearsonwholesale.com>.

Thanks for that info. I did not think about it that way, but that is a good
reason.

Billy



Re: incrementColumnValue

Posted by Ryan Rawson <ry...@gmail.com>.

Hey,

It is not possible through the HBase output format, nor, alas, would it be
a good idea.  Speculative execution can cause a task to run twice, with the
"results" of one attempt discarded.  The HBase output format doesn't really
have a good way to 'discard' results, since we are outputting to a table,
not to a file that can be tossed.

Furthermore, failures will cause tasks to be rerun, and the ICV
(incrementColumnValue) is not exactly what you'd call idempotent.  You can
instantiate HTable and call ICV directly yourself in either the map or
reduce phase, but again, that is not recommended.

You can also summarize your data and use a secondary process to
execute a roll-up of ICVs.  If the number isn't too massive, this might
be acceptable.
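
The summarize-then-roll-up idea can be sketched roughly as follows. This is
plain Java under stated assumptions, not a definitive implementation:
CounterSink is a hypothetical interface standing in for whatever performs
the real increment (in HBase that would be a call such as
HTable.incrementColumnValue), and RollUp.summarize and RollUp.apply are
illustrative names, not part of any library. The assumption is that the job
itself only emits per-key deltas in a form that can be safely rewritten on
retry, and that a separate process, run once, applies one increment per key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the component that performs the real increment.
interface CounterSink {
    void increment(String rowKey, long amount);
}

class RollUp {

    // Step 1 (inside the job): collapse many +1 events into one delta per row key.
    static Map<String, Long> summarize(List<String> rowKeys) {
        Map<String, Long> deltas = new TreeMap<>();
        for (String key : rowKeys) {
            deltas.merge(key, 1L, Long::sum);
        }
        return deltas;
    }

    // Step 2 (secondary process, run once): issue one increment per key.
    // Returns the number of increment calls made.
    static int apply(Map<String, Long> deltas, CounterSink sink) {
        int calls = 0;
        for (Map.Entry<String, Long> entry : deltas.entrySet()) {
            sink.increment(entry.getKey(), entry.getValue());
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("a", "b", "a", "a");
        Map<String, Long> deltas = summarize(events);
        // The sink here only prints; a real one would call the table's increment.
        int calls = apply(deltas, (row, amount) ->
                System.out.println(row + " += " + amount));
        System.out.println(calls + " increment calls for " + events.size() + " events");
    }
}
```

Because the increments are issued by a single process outside the job, a
retried or speculatively executed task does not double count; the roll-up
step itself still needs to be run exactly once per batch of deltas.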
