
Posted to common-user@hadoop.apache.org by Jason Venner <ja...@attributor.com> on 2007/12/26 20:30:40 UTC

Do people put their master node in the slave list - 0.15.1

I have been experimenting with that, and when I do, the master saturates
well before the slave nodes and the jobs start experiencing timeouts.

The map task in question is the IdentityMapper. This job is a simple
merge sort, combining data by key where there are duplicate keys in the
input stream.
There is no swapping going on in my cluster. The machines in
question are all 8-processor boxes, and tasks.maximum was set to 6.

task_200712261033_0002_m_000078_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.Client.call(Client.java:484)
task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.$Proxy0.getTask(Unknown Source)
task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1747)
07/12/26 10:48:03 INFO mapred.JobClient: Task Id : task_200712261033_0002_m_000081_1, Status : FAILED
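
[Editor's sketch] One rough way to picture why a master that is also in the slave list might run out of headroom first is to count the processes competing for its processors. The figures of 8 processors and 6 task slots come from the post above; the daemon lists and the idea that every daemon and every running task keeps one processor busy are simplifying assumptions made only for illustration, not measurements from this cluster.

```python
# Rough per-node estimate. ASSUMPTION: each daemon and each running task
# keeps roughly one processor busy. This is a simplification for illustration.
CORES_PER_NODE = 8   # "8 processor boxes" (from the post)
TASK_SLOTS = 6       # tasks.maximum as reported in the post

# ASSUMPTION: which daemons run where when the master is also a slave.
MASTER_DAEMONS = ["namenode", "jobtracker", "datanode", "tasktracker"]
SLAVE_DAEMONS = ["datanode", "tasktracker"]


def busy_processes(daemons, task_slots=TASK_SLOTS):
    """Number of processes competing for processors on one node."""
    return len(daemons) + task_slots


def headroom(daemons, cores=CORES_PER_NODE, task_slots=TASK_SLOTS):
    """Processors left over; a negative value means oversubscription."""
    return cores - busy_processes(daemons, task_slots)


master_headroom = headroom(MASTER_DAEMONS)
slave_headroom = headroom(SLAVE_DAEMONS)
print("master headroom:", master_headroom)
print("slave headroom:", slave_headroom)
```

Under these assumptions the combined master/slave node is the only one that is oversubscribed. Real daemons do not load a processor evenly, so this is only a starting point for checking actual utilisation on the master.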

Re: Do people put their master node in the slave list - 0.15.1

Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.

We run these on separate nodes.  We saturate a 16G RAM node just  
running the name node, but we have a lot of files and clients.
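
[Editor's sketch] The two layouts discussed in this thread differ only in whether the master's hostname appears in the slave list, which in a stock install is a plain text file with one hostname per line. The helper below is a hypothetical illustration of building that list either way; the function name and the hostnames are made up and are not part of Hadoop.

```python
def slaves_file(hosts, master, include_master=False):
    """Return slave-list text: one hostname per line.

    hosts          -- all machines in the cluster (hypothetical names)
    master         -- the machine running the namenode/jobtracker
    include_master -- True to also run slave daemons on the master
    """
    selected = [h for h in hosts if include_master or h != master]
    return "\n".join(selected) + "\n"


# Hypothetical hostnames, for illustration only.
hosts = ["master01", "worker01", "worker02"]

separate = slaves_file(hosts, master="master01")
combined = slaves_file(hosts, master="master01", include_master=True)
print(separate, end="")
print(combined, end="")
```

Keeping the choice behind a single flag makes it easy to try both layouts on the same cluster and compare how the master behaves under load.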

On Dec 26, 2007, at 12:22 PM, Ted Dunning wrote:

>
> My namenode and jobtracker are both on a machine that is a datanode and has
> a tasktracker as well.  It is also less well outfitted than yours.
>
> I have no problems, but my data is encrypted which might make the CPU/disk
> trade-offs very different.
>
>
> On 12/26/07 12:11 PM, "Jason Venner" <ja...@attributor.com> wrote:
>
>> This seems to be more a function of input file size than anything. I had
>> a single (uncompressed) 35gig input file Text,Value.
>>
>> Jason Venner wrote:
>>> I have been experimenting with that, and when I do, the master
>>> saturates well before the slave nodes, and the jobs start experiencing
>>> timeouts
>>>
>>> The map task in question is the IdentityMapper, this job is a simple
>>> merge sort, combining data by key where there are duplicate keys in
>>> the input stream.
>>> There is no swapping going on in my cluster, and the machines in
>>> question are all 8 processor boxes, and the tasks.maximum was set to 6.
>>>
>>> task_200712261033_0002_m_000078_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
>>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.Client.call(Client.java:484)
>>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.$Proxy0.getTask(Unknown Source)
>>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1747)
>>> 07/12/26 10:48:03 INFO mapred.JobClient: Task Id : task_200712261033_0002_m_000081_1, Status : FAILED
>>>
>

Re: Do people put their master node in the slave list - 0.15.1

Posted by Ted Dunning <td...@veoh.com>.

My namenode and jobtracker are both on a machine that is a datanode and has
a tasktracker as well.  It is also less well outfitted than yours.

I have no problems, but my data is encrypted which might make the CPU/disk
trade-offs very different.


On 12/26/07 12:11 PM, "Jason Venner" <ja...@attributor.com> wrote:

> This seems to be more a function of input file size than anything. I had
> a single (uncompressed) 35gig input file Text,Value.
> 
> Jason Venner wrote:
>> I have been experimenting with that, and when I do, the master
>> saturates well before the slave nodes, and the jobs start experiencing
>> timeouts
>> 
>> The map task in question is the IdentityMapper, this job is a simple
>> merge sort, combining data by key where there are duplicate keys in
>> the input stream.
>> There is no swapping going on in my cluster, and the machines in
>> question are all 8 processor boxes, and the tasks.maximum was set to 6.
>> 
>> task_200712261033_0002_m_000078_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.Client.call(Client.java:484)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.$Proxy0.getTask(Unknown Source)
>> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1747)
>> 07/12/26 10:48:03 INFO mapred.JobClient: Task Id : task_200712261033_0002_m_000081_1, Status : FAILED
>>

Re: Do people put their master node in the slave list - 0.15.1

Posted by Jason Venner <ja...@attributor.com>.

This seems to be more a function of input file size than anything. I had 
a single (uncompressed) 35gig input file Text,Value.

Jason Venner wrote:
> I have been experimenting with that, and when I do, the master 
> saturates well before the slave nodes, and the jobs start experiencing 
> timeouts
>
> The map task in question is the IdentityMapper, this job is a simple 
> merge sort, combining data by key where there are duplicate keys in 
> the input stream.
> There is no swapping going on in my cluster, and the machines in 
> question are all 8 processor boxes, and the tasks.maximum was set to 6.
>
> task_200712261033_0002_m_000078_0: Exception in thread "main" java.net.SocketTimeoutException: timed out waiting for rpc response
> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.Client.call(Client.java:484)
> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.$Proxy0.getTask(Unknown Source)
> task_200712261033_0002_m_000078_0:      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1747)
> 07/12/26 10:48:03 INFO mapred.JobClient: Task Id : task_200712261033_0002_m_000081_1, Status : FAILED
>