Posted to issues@mesos.apache.org by "Avinash Sridharan (JIRA)" <ji...@apache.org> on 2016/05/04 21:51:12 UTC

[jira] [Commented] (MESOS-5325) Mesos can't determine if task IP is reachable

    [ https://issues.apache.org/jira/browse/MESOS-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271498#comment-15271498 ] 

Avinash Sridharan commented on MESOS-5325:
------------------------------------------

[~djosborne] Mesos cannot determine whether an IP address allocated to a container is routable from other containers. I agree that this is an issue with ip-per-container in general, but it needs to be solved at the service discovery layer (potentially Mesos-DNS). The service discovery module needs to resolve a name to a routable IP address based on where the DNS query originated. Effectively, the service discovery layer needs to build a split-horizon view of the network.
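
As a rough illustration of the split-horizon idea, here is a minimal Python sketch. It is not Mesos-DNS code: the TaskRecord type, its fields, and the resolve function are hypothetical names, and the sketch assumes the discovery layer already knows which agent a query came from and whether a task's container IP is routable cluster-wide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical view of a task as seen by the discovery layer."""
    name: str
    agent_ip: str                 # IP of the agent (slave) running the task
    container_ip: Optional[str]   # IP assigned to the container, if any
    cluster_routable: bool        # assumption: known to the discovery layer


def resolve(task: TaskRecord, querier_agent_ip: str) -> str:
    """Pick the address to return based on where the query originated."""
    if task.container_ip is None:
        return task.agent_ip
    # The container IP is usable if it is routable everywhere, or if the
    # querier sits on the same agent (e.g. behind the same docker bridge).
    if task.cluster_routable or querier_agent_ip == task.agent_ip:
        return task.container_ip
    # Otherwise fall back to the agent's address.
    return task.agent_ip
```

With this sketch, a docker bridge task would resolve to its bridge IP only for queries from its own agent, and to the agent IP for queries from anywhere else.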

> Mesos can't determine if task IP is reachable
> ---------------------------------------------
>
>                 Key: MESOS-5325
>                 URL: https://issues.apache.org/jira/browse/MESOS-5325
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Dan Osborne
>
> I have uncovered a design flaw that affects ip-per-container tasks when run in a cluster alongside non ip-per-container tasks. This affects docker-libnetwork, netmodules, and I suspect it will also affect CNI.
> After Mesos launches a docker bridge task, it fills the task's NetworkInfo field with the docker bridge IP assigned to that task. Because of this behavior, when a launched task's NetworkInfo is later used by Mesos components, it is unknown whether it contains an IP address that is accessible throughout the cluster.
> A common case where this is a problem arises when using Mesos-DNS. Mesos-DNS has a configuration setting that tells it which information to answer a query with: NetworkInfo or HostIP. If it has been configured to prefer NetworkInfo, it correctly resolves ip-per-container containers to their unique IP. But because the docker bridge IP is also stored in NetworkInfo, it incorrectly resolves docker-bridge containers to an IP address that is not accessible from anywhere besides the slave they are on. This breaks DNS resolution in Mesos.
> I believe Mesos needs a way to distinguish between tasks which are accessible via their IP and tasks that are not.
> One fix would be to prevent Mesos from filling in NetworkInfo for a task if it is known that the task is not reachable throughout the cluster via that address. Essentially, NetworkInfo could be interpreted as a boolean: its presence means the task is addressable, and its absence means it is not. In practice, this would mean it gets filled in for CNI tasks, netmodules tasks, and docker tasks bound to the host networking namespace. It would not get filled in for docker bridge tasks.
> I believe this change would be fairly minimal in scope. To implement it, Mesos would need to be changed to not store Docker bridge IPs in NetworkInfo.
> I'm also open to discussion and other suggestions on how to resolve this.
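
The rule proposed in the description can be sketched roughly as follows. This is only an illustration of the "NetworkInfo present means addressable" convention, not Mesos or Mesos-DNS code: the network mode strings and both function names are hypothetical.

```python
from typing import Optional

# Hypothetical labels for the networking setups named in the description.
# Only these would have NetworkInfo filled in under the proposal.
ADDRESSABLE_MODES = {"cni", "netmodules", "docker_host"}


def publishes_network_info(network_mode: str) -> bool:
    """True if the proposal would fill in NetworkInfo for this mode."""
    return network_mode in ADDRESSABLE_MODES


def advertised_ip(network_mode: str,
                  container_ip: Optional[str],
                  host_ip: str) -> str:
    """Address a NetworkInfo-preferring consumer would end up using."""
    if publishes_network_info(network_mode) and container_ip is not None:
        return container_ip
    # NetworkInfo absent (e.g. docker bridge): fall back to the host IP.
    return host_ip
```

Under this convention, a consumer configured to prefer NetworkInfo would fall back to the host IP for docker bridge tasks simply because no NetworkInfo is present.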



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)