Posted to user@hbase.apache.org by Vishal Kapoor <vi...@gmail.com> on 2011/03/25 19:20:33 UTC

Observer/Observable MapReduce

Can someone give me a direction on how to start a map reduce based on
the outcome of another map reduce? (Nothing is common between them,
apart from the first deciding the scope of the second.)

I might also want to set the scope of my second map reduce
(from/after) my first map reduce (scope as in scan(start, stop)).

Typically data comes in to a few tables for us; we start crunching it
and then add some more data to the main tables (info etc.) to get rid
of table joins.

A lightweight framework would do better than a typical workflow management tool.

thanks,
Vishal Kapoor

Re: Observer/Observable MapReduce

Posted by Harsh J <qw...@gmail.com>.
Instead of using a table, how about using the ZooKeeper service that
is already available? Its znodes can hold small bits of information
like this pretty well.
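
A minimal sketch of that hand-off, assuming a hypothetical znode path
(whose parents already exist) and an already-connected ZooKeeper client:
the first job's driver publishes the row-key range it settled on, and the
second job's driver blocks on a watch until that znode appears.

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ScopeHandoff {
    // Hypothetical znode; its parents ("/jobs/first-mr") are assumed to exist.
    private static final String ZNODE = "/jobs/first-mr/scope";

    // Called by the first job's driver once it finishes:
    // publish e.g. "startRow,stopRow" as the znode's data.
    static void publishScope(ZooKeeper zk, byte[] scope) throws Exception {
        zk.create(ZNODE, scope, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    }

    // Called by the second job's driver: block until the scope is published,
    // then return it so the driver can configure its Scan and submit.
    static byte[] awaitScope(ZooKeeper zk) throws Exception {
        final CountDownLatch created = new CountDownLatch(1);
        Watcher watcher = new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getType() == Watcher.Event.EventType.NodeCreated) {
                    created.countDown();
                }
            }
        };
        if (zk.exists(ZNODE, watcher) == null) {
            created.await();          // woken up when the znode is created
        }
        return zk.getData(ZNODE, false, null);
    }
}

The second job's driver would call awaitScope(zk) before building its
scan, so the two jobs never need to know about each other directly.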

On Sat, Mar 26, 2011 at 12:29 AM, Vishal Kapoor
<vi...@gmail.com> wrote:
> David,
> How about waking up my second map reduce job as soon as I see some
> rows updated in that table?
> Any thoughts on observing a column update?
>
> thanks,
> Vishal
>
> On Fri, Mar 25, 2011 at 2:56 PM, Buttler, David <bu...@llnl.gov> wrote:
>> What about just storing some metadata in a special table?
>> Then on your second job's startup you can read that metadata and set your scan/input splits appropriately?
>> Dave
>>
>> -----Original Message-----
>> From: Vishal Kapoor [mailto:vishal.kapoor.in@gmail.com]
>> Sent: Friday, March 25, 2011 11:21 AM
>> To: user@hbase.apache.org
>> Subject: Observer/Observable MapReduce
>>
>> Can someone give me a direction on how to start a map reduce based on
>> the outcome of another map reduce? (Nothing is common between them,
>> apart from the first deciding the scope of the second.)
>>
>> I might also want to set the scope of my second map reduce
>> (from/after) my first map reduce (scope as in scan(start, stop)).
>>
>> Typically data comes in to a few tables for us; we start crunching it
>> and then add some more data to the main tables (info etc.) to get rid
>> of table joins.
>>
>> A lightweight framework would do better than a typical workflow management tool.
>>
>> thanks,
>> Vishal Kapoor
>>
>



-- 
Harsh J
http://harshj.com

RE: Observer/Observable MapReduce

Posted by Doug Meil <do...@explorysmedical.com>.
The simplest way to do this is a single driver thread that executes the jobs you want to run synchronously, one after the other:

	Job job1 = ...
	job1.waitForCompletion(true);

	Job job2 = ...
	job2.waitForCompletion(true);

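Fleshed out a little, a driver along these lines (the class names, paths
and elided job configuration below are placeholders, not anything from
this thread) only submits the second job once the first has succeeded, at
which point it can read whatever the first job produced to decide the
second job's scope:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        Job job1 = new Job(conf, "first-pass");
        job1.setJarByClass(ChainedDriver.class);
        // ... set job1's mapper/reducer, input format and input paths here ...
        FileOutputFormat.setOutputPath(job1, new Path("/tmp/first-pass-out"));
        if (!job1.waitForCompletion(true)) {
            System.exit(1);           // stop here if the first pass failed
        }

        // Read whatever job1 recorded (counters, a summary file, a metadata
        // row) and use it to scope job2 before submitting it.
        Job job2 = new Job(conf, "second-pass");
        job2.setJarByClass(ChainedDriver.class);
        // ... set job2's mapper/reducer and output here ...
        FileInputFormat.addInputPath(job2, new Path("/tmp/first-pass-out"));
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}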

-----Original Message-----
From: Vishal Kapoor [mailto:vishal.kapoor.in@gmail.com] 
Sent: Friday, March 25, 2011 3:00 PM
To: user@hbase.apache.org
Cc: Buttler, David
Subject: Re: Observer/Observable MapReduce

David,
How about waking up my second map reduce job as soon as I see some rows updated in that table?
Any thoughts on observing a column update?

thanks,
Vishal

On Fri, Mar 25, 2011 at 2:56 PM, Buttler, David <bu...@llnl.gov> wrote:
> What about just storing some metadata in a special table?
> Then on your second job's startup you can read that metadata and set your scan/input splits appropriately?
> Dave
>
> -----Original Message-----
> From: Vishal Kapoor [mailto:vishal.kapoor.in@gmail.com]
> Sent: Friday, March 25, 2011 11:21 AM
> To: user@hbase.apache.org
> Subject: Observer/Observable MapReduce
>
> Can someone give me a direction on how to start a map reduce based on
> the outcome of another map reduce? (Nothing is common between them,
> apart from the first deciding the scope of the second.)
>
> I might also want to set the scope of my second map reduce
> (from/after) my first map reduce (scope as in scan(start, stop)).
>
> Typically data comes in to a few tables for us; we start crunching it
> and then add some more data to the main tables (info etc.) to get rid
> of table joins.
>
> A lightweight framework would do better than a typical workflow management tool.
>
> thanks,
> Vishal Kapoor
>

Re: Observer/Observable MapReduce

Posted by Vishal Kapoor <vi...@gmail.com>.
David,
How about waking up my second map reduce job as soon as I see some
rows updated in that table?
Any thoughts on observing a column update?

thanks,
Vishal

On Fri, Mar 25, 2011 at 2:56 PM, Buttler, David <bu...@llnl.gov> wrote:
> What about just storing some metadata in a special table?
> Then on your second job's startup you can read that metadata and set your scan/input splits appropriately?
> Dave
>
> -----Original Message-----
> From: Vishal Kapoor [mailto:vishal.kapoor.in@gmail.com]
> Sent: Friday, March 25, 2011 11:21 AM
> To: user@hbase.apache.org
> Subject: Observer/Observable MapReduce
>
> Can someone give me a direction on how to start a map reduce based on
> the outcome of another map reduce? (Nothing is common between them,
> apart from the first deciding the scope of the second.)
>
> I might also want to set the scope of my second map reduce
> (from/after) my first map reduce (scope as in scan(start, stop)).
>
> Typically data comes in to a few tables for us; we start crunching it
> and then add some more data to the main tables (info etc.) to get rid
> of table joins.
>
> A lightweight framework would do better than a typical workflow management tool.
>
> thanks,
> Vishal Kapoor
>

RE: Observer/Observable MapReduce

Posted by "Buttler, David" <bu...@llnl.gov>.
What about just storing some metadata in a special table?
Then on your second job's startup you can read that metadata and set your scan/input splits appropriately?
Dave
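
A minimal sketch of that idea, with made-up table, family and qualifier
names (a "job_meta" table whose "last_run" row holds the start/stop row
keys the first job decided on), assuming the client and mapreduce APIs of
that era:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ScopedScanDriver {

    // Placeholder mapper; the real per-row processing would go here.
    static class SecondPassMapper extends TableMapper<ImmutableBytesWritable, Result> {
        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context)
                throws java.io.IOException, InterruptedException {
            context.write(key, value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Read the metadata row the first job wrote (all names are made up).
        HTable meta = new HTable(conf, "job_meta");
        Result r = meta.get(new Get(Bytes.toBytes("last_run")));
        byte[] startRow = r.getValue(Bytes.toBytes("scope"), Bytes.toBytes("start"));
        byte[] stopRow  = r.getValue(Bytes.toBytes("scope"), Bytes.toBytes("stop"));
        meta.close();

        // Bound the second job's scan to the range the first job decided on.
        Scan scan = new Scan();
        scan.setStartRow(startRow);
        scan.setStopRow(stopRow);
        scan.setCaching(500);
        scan.setCacheBlocks(false);   // don't pollute the block cache from MR

        Job job = new Job(conf, "second-pass");
        job.setJarByClass(ScopedScanDriver.class);
        TableMapReduceUtil.initTableMapperJob("main_table", scan,
            SecondPassMapper.class, ImmutableBytesWritable.class, Result.class, job);
        job.setNumReduceTasks(0);                 // map-only in this sketch
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The first job would finish by writing the scope:start and scope:stop
cells of the "last_run" row, so the two jobs only share that one row.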

-----Original Message-----
From: Vishal Kapoor [mailto:vishal.kapoor.in@gmail.com] 
Sent: Friday, March 25, 2011 11:21 AM
To: user@hbase.apache.org
Subject: Observer/Observable MapReduce

Can someone give me a direction on how to start a map reduce based on
the outcome of another map reduce? (Nothing is common between them,
apart from the first deciding the scope of the second.)

I might also want to set the scope of my second map reduce
(from/after) my first map reduce (scope as in scan(start, stop)).

Typically data comes in to a few tables for us; we start crunching it
and then add some more data to the main tables (info etc.) to get rid
of table joins.

A lightweight framework would do better than a typical workflow management tool.

thanks,
Vishal Kapoor

Re: Observer/Observable MapReduce

Posted by Andrey Stepachev <oc...@gmail.com>.
Look at http://yahoo.github.com/oozie/. Maybe it will help you.

2011/3/25 Vishal Kapoor <vi...@gmail.com>

> Can someone give me a direction on how to start a map reduce based on
> the outcome of another map reduce? (Nothing is common between them,
> apart from the first deciding the scope of the second.)
>
> I might also want to set the scope of my second map reduce
> (from/after) my first map reduce (scope as in scan(start, stop)).
>
> Typically data comes in to a few tables for us; we start crunching it
> and then add some more data to the main tables (info etc.) to get rid
> of table joins.
>
> A lightweight framework would do better than a typical workflow
> management tool.
>
> thanks,
> Vishal Kapoor
>