Posted to common-user@hadoop.apache.org by abhay ratnaparkhi <ab...@gmail.com> on 2011/04/23 08:02:03 UTC

distribute objects prior to launching MR

Hello,

I'm writing an MR job where I need to pass a common set of data to all Map
tasks.
The data required by all the Map tasks is present in a relational database.

Is it possible to get the data from the database before launching the job and
then pass the object to all Maps?
I know we can use DistributedCache to distribute files. But is there any
facility to distribute objects?

Abhay.

Re: distribute objects prior to launching MR

Posted by Harsh J <ha...@cloudera.com>.
Hello Abhay,

On Sat, Apr 23, 2011 at 11:32 AM, abhay ratnaparkhi
<ab...@gmail.com> wrote:
> Is it possible to get the data from the database before launching the job and
> then pass the object to all Maps?
> I know we can use DistributedCache to distribute files. But is there any
> facility to distribute objects?

There's no direct way of doing that in Hadoop without manual
serialization and deserialization. Serialize into a file -> add the file
to the DistributedCache -> deserialize in the tasks; that should do it.
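
For illustration only, here is a minimal sketch of the serialize and
deserialize ends of that pipeline, using plain Java serialization and a
local temp file so it runs without a cluster. The class name
LookupShipping, the HashMap payload and its sample entries are
hypothetical stand-ins for whatever is loaded from the database. The
DistributedCache step is only described in comments and is not compiled
here; the exact calls depend on the Hadoop version in use.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

// Hypothetical helper: ships a lookup table from the job client to the tasks.
class LookupShipping {

    // Step 1 (job client, before submitting the job): write the object to a file.
    static void serialize(HashMap<String, String> lookup, Path file) throws IOException {
        try (OutputStream raw = Files.newOutputStream(file);
             ObjectOutputStream out = new ObjectOutputStream(raw)) {
            out.writeObject(lookup);
        }
    }

    // Step 3 (inside each task, e.g. in the mapper's setup/configure method):
    // read the object back from the task-local copy of the file.
    @SuppressWarnings("unchecked")
    static HashMap<String, String> deserialize(Path file)
            throws IOException, ClassNotFoundException {
        try (InputStream raw = Files.newInputStream(file);
             ObjectInputStream in = new ObjectInputStream(raw)) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the rows fetched from the relational database.
        HashMap<String, String> lookup = new HashMap<>();
        lookup.put("IN", "India");
        lookup.put("US", "United States");

        Path file = Files.createTempFile("lookup", ".ser");
        serialize(lookup, file);

        // Step 2 is not shown: on a real cluster the file would be copied to
        // HDFS and registered with the DistributedCache (for example via
        // DistributedCache.addCacheFile(uri, conf)) so that every task
        // receives a local copy before it starts.

        HashMap<String, String> restored = deserialize(file);
        System.out.println(restored.get("IN")); // prints India
        Files.delete(file);
    }
}
```

Any Serializable object works the same way. Hadoop's own Writable
types are an alternative if Java serialization is too bulky.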

-- 
Harsh J