Posted to common-dev@hadoop.apache.org by Doğacan Güney <do...@gmail.com> on 2007/03/02 13:50:11 UTC

Question about DFS/MR deployment

Hello everyone,

When we started to use Hadoop (which was around 0.4.0 I think), we
used different machines for DFS and MR. IIRC, we had some problems
with running both a datanode and a tasktracker on the same machine, or
perhaps we were just superstitious. Anyway, the decision stuck and we
still use different machines.

So, the question is:
How do you run MR/DFS? Do you run JT/NN on the same machine or on
different machines? Do you run a tasktracker and a datanode on the
same machine? Also, in general is it recommended to run them on the
same machine?
(Our machines are dual core AMD64s with 2-4 GBs of RAM, btw)

Thanks,
Doğacan Güney

Re: Question about DFS/MR deployment

Posted by Andrzej Bialecki <ab...@getopt.org>.
Doğacan Güney wrote:
> Hello everyone,
>
> When we started to use Hadoop (which was around 0.4.0 I think), we
> used different machines for DFS and MR. IIRC, we had some problems
> with running both a datanode and a tasktracker on the same machine, or
> perhaps we were just superstitious. Anyway, the decision stuck and we
> still use different machines.
>
> So, the question is:
> How do you run MR/DFS? Do you run JT/NN on the same machine or on
> different machines? Do you run a tasktracker and a datanode on the
> same machine? Also, in general is it recommended to run them on the
> same machine?
> (Our machines are dual core AMD64s with 2-4 GBs of RAM, btw)

Your setup is rather unusual. Typically you should run DN/TT on the same
machines, because then tasktrackers can benefit from data locality (i.e.
DFS blocks may be found on the local disk and don't have to be
transmitted over the network). I think it would be much better to
resolve whatever issue prevented you from doing this in the first place.
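
To make the data locality point concrete, here is a toy sketch in Python. It is not Hadoop's scheduler, only a simplified model: a block counts as "local" if at least one host that holds a replica also runs a tasktracker. The host names, block ids and the count_local_reads helper are all made up for illustration.

```python
def count_local_reads(block_replicas, tasktracker_hosts):
    """Return (local, remote) block counts for a toy cluster layout.

    block_replicas: dict mapping a block id to the hosts holding a replica.
    tasktracker_hosts: hosts that run a tasktracker.
    A block is 'local' if some replica host also runs a tasktracker,
    so a task could read it from local disk instead of the network.
    """
    trackers = set(tasktracker_hosts)
    local = sum(1 for hosts in block_replicas.values() if trackers & set(hosts))
    remote = len(block_replicas) - local
    return local, remote


# Hypothetical layout: three blocks, two replicas each, on three datanodes.
block_replicas = {
    "blk_1": ["node1", "node2"],
    "blk_2": ["node2", "node3"],
    "blk_3": ["node1", "node3"],
}

# DN and TT on the same machines: every block has a replica next to a tasktracker.
colocated = count_local_reads(block_replicas, ["node1", "node2", "node3"])

# DN and TT on different machines: every block must cross the network.
separate = count_local_reads(block_replicas, ["node4", "node5", "node6"])

print("colocated (local, remote):", colocated)
print("separate  (local, remote):", separate)
```

In this model the separate layout can never read a block locally, whatever the scheduler does, which is the cost of keeping DFS and MR on different machines.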

JT/NN don't have to run on the same machine, although in my setups I
usually end up with this configuration - JT and NN create only moderate
load, so a single machine is usually sufficient, and I usually can't
afford to put them on dedicated separate machines.

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Question about DFS/MR deployment

Posted by Doğacan Güney <do...@gmail.com>.
(I really meant to post it to hadoop-user, sorry.)