Posted to common-user@hadoop.apache.org by "Rahul.V." <gr...@gmail.com> on 2010/08/03 06:14:17 UTC

hadoop on unstable nodes

Hi,
Is there any current research on applying MapReduce to nodes in ordinary
Internet settings? In environments where network bandwidth is at a
premium, what tweaks are applied to Hadoop?
I would be very thankful if you could post links in this direction.

-- 
Regards,
R.V.

Re: hadoop on unstable nodes

Posted by He Chen <ai...@gmail.com>.
Condor has a Hadoop subproject at UW-Madison, and some scientists from VT
have worked on security for Hadoop MapReduce on the Internet.

In my opinion, Alex is correct: Hadoop MR is communication intensive,
especially in the map and shuffle stages. In the map stage, every mapper
needs input data from the file system. If your data is distributed across
the Internet, you may encounter heavy delays. In the shuffle stage, each
reducer collects the mappers' intermediate results over the Internet. This
is another bottleneck we cannot overlook.
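
To make the shuffle bottleneck concrete, here is a rough back-of-envelope
sketch in Python. It is not taken from Hadoop itself. The function name
shuffle_seconds, the bandwidth and latency figures, and the simplifying
assumptions (intermediate data split evenly across reducers, one fetch per
mapper done one after another, no compression or overlap) are all
hypothetical and only meant to show the order of magnitude.

```python
def shuffle_seconds(intermediate_bytes, num_mappers, num_reducers,
                    bandwidth_bits_per_s, rtt_s):
    """Estimate the time one reducer spends pulling its share of map output.

    Assumptions (illustrative only, not Hadoop's actual behaviour):
    - intermediate data is spread evenly over the reducers
    - the reducer fetches from each mapper sequentially, paying one
      round trip per mapper
    - the reducer's link is the only limit on throughput
    """
    per_reducer_bytes = intermediate_bytes / num_reducers
    transfer = per_reducer_bytes * 8 / bandwidth_bits_per_s
    latency = num_mappers * rtt_s
    return transfer + latency


if __name__ == "__main__":
    data = 10e9  # 10 GB of intermediate map output (assumed)

    # Assumed datacenter link: 1 Gbit/s, 0.5 ms round trip.
    datacenter = shuffle_seconds(data, 100, 10, 1e9, 0.0005)

    # Assumed Internet link: 10 Mbit/s, 100 ms round trip.
    internet = shuffle_seconds(data, 100, 10, 10e6, 0.1)

    print(f"datacenter: {datacenter:.2f} s per reducer")
    print(f"internet:   {internet:.2f} s per reducer")
```

Under these assumed numbers the per-reducer shuffle time grows by roughly
two orders of magnitude on the Internet link, and the same kind of estimate
applies to mappers reading remote input in the map stage.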

Hope this will help.

Chen

On Tue, Aug 3, 2010 at 11:37 AM, Alex Loddengaard <al...@cloudera.com> wrote:

> I don't know of any research, but such a scenario is likely not going to
> turn out so well.  Hadoop is very network hungry and is designed to be run
> in a datacenter.  Sorry I don't have more information for you.
>
> Alex

Re: hadoop on unstable nodes

Posted by Alex Loddengaard <al...@cloudera.com>.
I don't know of any research, but such a scenario is likely not going to
turn out so well.  Hadoop is very network hungry and is designed to be run
in a datacenter.  Sorry I don't have more information for you.

Alex
