Posted to common-issues@hadoop.apache.org by "Andrew Johnson (JIRA)" <ji...@apache.org> on 2015/01/20 17:21:34 UTC

[jira] [Updated] (HADOOP-10181) GangliaContext does not work with multicast ganglia setup

     [ https://issues.apache.org/jira/browse/HADOOP-10181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Johnson updated HADOOP-10181:
------------------------------------
    Attachment: HADOOP-10181.001.patch

We've been running the attached patch in our production cluster for a few days now.  It has fixed the issue.

> GangliaContext does not work with multicast ganglia setup
> ---------------------------------------------------------
>
>                 Key: HADOOP-10181
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10181
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Andrew Otto
>            Priority: Minor
>              Labels: ganglia, hadoop, metrics, multicast
>         Attachments: HADOOP-10181.001.patch
>
>
> The GangliaContext class, which is used to send Hadoop metrics to Ganglia, uses a DatagramSocket to send these metrics.  This works fine for Ganglia multicast setups that are all on the same VLAN.  However, when working with multiple VLANs, a packet sent via DatagramSocket to a multicast address will end up with a TTL of 1.  Multicast TTL indicates the number of network hops for which a particular multicast packet is valid.  As a result, the packets sent by GangliaContext do not make it to Ganglia aggregators that are on the same multicast group but in different VLANs.
> To fix, we'd need a configuration property that specifies that multicast is to be used, and another that allows setting of the multicast packet TTL.  With these set, we could then use MulticastSocket setTimeToLive() instead of just plain ol' DatagramSocket.
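
For illustration, here is a minimal sketch of the approach described above. It is not the attached patch. The class name, method names, and the two parameters (multicastEnabled, multicastTtl) are hypothetical stand-ins for the configuration properties the description asks for, and the address and port in main are example values only. The only real API it relies on is java.net.MulticastSocket and its setTimeToLive() method.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the GangliaContext code and not the attached patch.
class GangliaSocketSketch {

    // Picks the socket type from two assumed config values.
    // multicastEnabled and multicastTtl are illustrative names only.
    static DatagramSocket createSocket(boolean multicastEnabled, int multicastTtl)
            throws IOException {
        if (multicastEnabled) {
            MulticastSocket socket = new MulticastSocket();
            // A MulticastSocket defaults to a TTL of 1; raise it so packets
            // can be forwarded beyond the local network segment.
            socket.setTimeToLive(multicastTtl);
            return socket;
        }
        // Plain unicast behaviour, as in the existing setup.
        return new DatagramSocket();
    }

    // Sends one payload to the given host and port over the supplied socket.
    static void send(DatagramSocket socket, String host, int port, byte[] payload)
            throws IOException {
        InetAddress address = InetAddress.getByName(host);
        socket.send(new DatagramPacket(payload, payload.length, address, port));
    }

    public static void main(String[] args) throws IOException {
        // Example values only: a multicast group address, a port, and a TTL of 4.
        try (DatagramSocket socket = createSocket(true, 4)) {
            byte[] payload = "example-metric".getBytes(StandardCharsets.UTF_8);
            send(socket, "239.2.11.71", 8649, payload);
        }
    }
}
```

MulticastSocket extends DatagramSocket, so the calling code that builds and sends packets would not need to change. Only the socket construction differs. The TTL has to be large enough to cover the number of router hops between the sender and the aggregators, and the routers in between still have to be configured to forward multicast traffic.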



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)