Posted to common-dev@hadoop.apache.org by "B. X." <bx...@gmail.com> on 2009/08/19 21:14:50 UTC

different modes of inter process communication

Greetings,

   Is there a summary somewhere regarding the different APIs used
between JVM processes in Hadoop to communicate with each other?  I
found at least the following:
-  using SocketChannel with the apache/hadoop/ipc/[Server|Client] APIs;
-  using customized IO streams in
apache/hadoop/hdfs/server/datanode/Block[Sender|Receiver].

Thanks,
-Bin

Re: different modes of inter process communication

Posted by Philip Zeyliger <ph...@cloudera.com>.
It's neither complete nor authoritative, but
http://www.cloudera.com/blog/2009/08/14/hadoop-default-ports-quick-reference/
has a table of which ports each daemon has open and which protocol
it uses to communicate.  Most communication is Hadoop IPC, but
the datanodes also use raw sockets for some traffic, and there's some
HTTP too.

-- Philip


Re: different modes of inter process communication

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
Steve Loughran wrote:
> Raghu Angadi wrote:
>>
>> A heartBeat is also an RPC. When you pause Namenode for 30 sec the 
>> datanode's heartbeat thread just waits for 30 sec for its heartbeat 
>> RPC to return. Note that when you pause Namenode, the RPCs to it don't 
>> fail immediately. During this wait, DNs can perform other transactions 
>> like serving data to clients.
> 
> If the heartbeat were just telling the NN that the DN is alive, 

That's not the case in Hadoop. Central servers don't actively contact
their slaves. It's been a long-standing problem. For example, anything
the NN wants to tell a DN has to be sent as the response to a
heartbeat or another RPC.
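
To illustrate the pattern (this is only a sketch, not the actual NN/DN
code): the server queues whatever it wants to say and hands it over the
next time the worker calls in. The class name, method names, and command
strings below are all made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class PiggybackSketch {

    /** Hypothetical central server: it never opens a connection to a worker. */
    static class CentralServer {
        // Commands waiting for each worker's next heartbeat.
        private final Map<String, Queue<String>> pending = new HashMap<>();

        /** Queue a command for a worker; nothing is sent at this point. */
        synchronized void enqueue(String workerId, String command) {
            pending.computeIfAbsent(workerId, id -> new ArrayDeque<>()).add(command);
        }

        /** Worker-initiated call; queued commands ride back in the reply. */
        synchronized List<String> heartbeat(String workerId) {
            List<String> reply = new ArrayList<>();
            Queue<String> queued = pending.remove(workerId);
            if (queued != null) {
                reply.addAll(queued); // preserves the order in which commands were queued
            }
            return reply;
        }
    }

    public static void main(String[] args) {
        CentralServer nn = new CentralServer();
        nn.enqueue("dn-1", "replicate blk_1"); // made-up command strings
        nn.enqueue("dn-1", "delete blk_2");
        System.out.println(nn.heartbeat("dn-1")); // both commands come back in the reply
        System.out.println(nn.heartbeat("dn-1")); // nothing left to deliver
    }
}
```

The consequence is that a command can be delayed by up to one heartbeat
interval, since the server has to wait for the worker to call.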

Raghu.

> you 
> could do it with a UDP that didn't block the DN. If, however, the DN 
> needs to know/care that the NN is up, then you do need to care about the 
> state of the namenode. But you don't have to do it blocking; looking for 
> a UDP back some time later is all you need to do.
> 
> 


Re: different modes of inter process communication

Posted by Steve Loughran <st...@apache.org>.
Raghu Angadi wrote:
> 
> A heartBeat is also an RPC. When you pause Namenode for 30 sec the 
> datanode's heartbeat thread just waits for 30 sec for its heartbeat RPC 
> to return. Note that when you pause Namenode, the RPCs to it don't fail 
> immediately. During this wait, DNs can perform other transactions like 
> serving data to clients.

If the heartbeat were just telling the NN that the DN is alive, you
could do it with a UDP datagram that didn't block the DN. If, however,
the DN needs to know/care that the NN is up, then you do need to care
about the state of the namenode. But you don't have to do it blocking;
looking for a UDP reply some time later is all you need to do.
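
A rough sketch of that idea in plain Java, assuming a made-up wire
format ("HB <id>" answered by "ACK <id>") and hypothetical method
names; nothing here is taken from the Hadoop code base. The sender
fires a datagram and returns immediately, then checks for a reply
later with a bounded wait.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class UdpHeartbeatSketch {

    /** Fire and forget: send one heartbeat datagram and return at once. */
    static void sendHeartbeat(DatagramSocket socket, InetAddress host, int port, String nodeId)
            throws IOException {
        byte[] payload = ("HB " + nodeId).getBytes(StandardCharsets.UTF_8);
        socket.send(new DatagramPacket(payload, payload.length, host, port));
    }

    /** Look for a reply; returns null if none arrives within waitMillis (must be > 0). */
    static String pollAck(DatagramSocket socket, int waitMillis) throws IOException {
        socket.setSoTimeout(waitMillis); // 0 would mean "wait forever", so keep it positive
        byte[] buf = new byte[256];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        try {
            socket.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        } catch (SocketTimeoutException e) {
            return null; // no reply yet; the caller decides what that means
        }
    }

    /** Loopback demo: the "server" either answers the heartbeat or stays silent. */
    static String demo(boolean serverResponds) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket server = new DatagramSocket(0, loopback);
             DatagramSocket client = new DatagramSocket(0, loopback)) {
            sendHeartbeat(client, loopback, server.getLocalPort(), "dn-1");
            if (serverResponds) {
                byte[] buf = new byte[256];
                DatagramPacket in = new DatagramPacket(buf, buf.length);
                server.setSoTimeout(2000);
                server.receive(in);
                String text = new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
                byte[] ack = ("ACK " + text.substring(3)).getBytes(StandardCharsets.UTF_8);
                server.send(new DatagramPacket(ack, ack.length, in.getSocketAddress()));
            }
            return pollAck(client, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // reply from the responsive server
        System.out.println(demo(false)); // null: the silent server never answered
    }
}
```

UDP gives no delivery guarantee, so a missing reply is only a hint; a
real design would need to tolerate a few lost datagrams before
declaring the other side down.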



Re: different modes of inter process communication

Posted by Raghu Angadi <ra...@yahoo-inc.com>.
A heartbeat is also an RPC. When you pause the Namenode for 30 sec, the
datanode's heartbeat thread just waits 30 sec for its heartbeat RPC
to return. Note that when you pause the Namenode, the RPCs to it don't
fail immediately. During this wait, DNs can perform other transactions
like serving data to clients.
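
A toy model of that behaviour, in case it helps. It is only a sketch
under stated assumptions: PausedServer and simulate are hypothetical
names, and a latch stands in for the SIGSTOPped process. One thread
blocks in the heartbeat call while another keeps doing work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockingHeartbeatSketch {

    /** Hypothetical stand-in for a paused server: calls neither fail nor return. */
    static class PausedServer {
        private final CountDownLatch resumed = new CountDownLatch(1);

        void resume() {
            resumed.countDown();
        }

        /** Blocks until resume() is called, like an RPC to a stopped process. */
        String heartbeat(String nodeId) throws InterruptedException {
            resumed.await();
            return "ack:" + nodeId;
        }
    }

    /**
     * Returns {heartbeats completed during the pause,
     *          units of other work done during the pause,
     *          heartbeats completed after the server resumed}.
     */
    static int[] simulate(long pauseMillis) {
        PausedServer server = new PausedServer();
        AtomicInteger heartbeatsDone = new AtomicInteger();
        int workDone = 0;

        // Dedicated heartbeat thread: it simply waits for its call to return.
        Thread heartbeatThread = new Thread(() -> {
            try {
                server.heartbeat("dn-1");
                heartbeatsDone.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        heartbeatThread.start();

        try {
            // This thread plays the part of the data-serving threads.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(pauseMillis);
            while (System.nanoTime() < deadline) {
                workDone++;
                Thread.sleep(10);
            }
            int duringPause = heartbeatsDone.get();
            server.resume();
            heartbeatThread.join();
            return new int[] { duringPause, workDone, heartbeatsDone.get() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = simulate(200);
        System.out.println("heartbeats during pause: " + r[0]);
        System.out.println("other work during pause: " + r[1]);
        System.out.println("heartbeats after resume: " + r[2]);
    }
}
```

No heartbeat completes while the server is paused, the other thread
keeps working, and the pending call returns as soon as the server
resumes, so no error or retry shows up anywhere.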



Re: different modes of inter process communication

Posted by "B. X." <bx...@gmail.com>.
On Wed, Aug 19, 2009 at 7:10 PM, Owen O'Malley <om...@apache.org> wrote:

Thank you both for clearing it up.

I have another related question: my understanding is that basic
heartbeat mechanisms are used to keep the different roles (namenode,
datanode, tasktracker, etc.) aware of each other, but I am not able to
observe this in the logs. For example, if I use SIGSTOP/SIGCONT to
stop the namenode JVM process for 30 seconds and then let it continue,
I don't observe any extra communication caused by the supposedly
missed heartbeats. (I checked that dfs.heartbeat.interval is set to 3
seconds.) Rather, what I saw is that all roles seem to stop in unison
for 30 seconds (judging by the absence of log events in that time
window).

I would appreciate some pointers on how heartbeats are used and configured.

-Bin

Re: different modes of inter process communication

Posted by Owen O'Malley <om...@apache.org>.
Most of the communication is over RPC. The other ones are the data
node protocol and HTTP (served by Jetty) for the shuffle. So you caught
everything but Jetty.

-- Owen