Posted to issues@mesos.apache.org by "Jie Yu (JIRA)" <ji...@apache.org> on 2018/08/02 05:18:00 UTC

[jira] [Assigned] (MESOS-9127) Port mapper CNI plugin might deadlock iptables on the agent.

     [ https://issues.apache.org/jira/browse/MESOS-9127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu reassigned MESOS-9127:
-----------------------------

    Assignee: Jie Yu

> Port mapper CNI plugin might deadlock iptables on the agent.
> ------------------------------------------------------------
>
>                 Key: MESOS-9127
>                 URL: https://issues.apache.org/jira/browse/MESOS-9127
>             Project: Mesos
>          Issue Type: Bug
>          Components: network
>    Affects Versions: 1.4.1, 1.5.1, 1.6.1
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>            Priority: Major
>
> Recently, we noticed that if one launches a lot of containers that use the port mapper CNI plugin, iptables on the agent will deadlock. The symptoms are as follows.
> If you run any iptables command on the box, it gets stuck acquiring the xtables lock:
> {noformat}
> core@ip-10-0-2-99 ~ $ time iptables -w -t nat -S UCR-DEFAULT-BRIDGE 
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> Another app is currently holding the xtables lock; waiting for it to exit...
> ^C
> real    0m41.349s
> user    0m0.001s
> sys    0m0.000s
> {noformat}
> You'll also notice a lot of iptables and mesos port mapper CNI plugin processes on the box:
> {noformat}
> $ ps -fp $(pidof mesos-cni-port-mapper) | wc -l
> 191
> $ ps -fp $(pidof iptables) | wc -l
> 192
> {noformat}
> Then, we looked into the process that is holding the xtables lock:
> {noformat}
> $ sudo netstat -p -n | grep xtables
> unix  2      [ ]         STREAM                   225083   25048/iptables       @xtables
> $ ps aux | grep 25048
> root     25048  0.0  0.0  26184  2512 ?        S    22:41   0:00 iptables -w -t nat -S UCR-DEFAULT-BRIDGE
> core     31857  0.0  0.0   6760   976 pts/0    S+   23:18   0:00 grep --colour=auto 25048
> $ sudo strace -s 10000 -p 25048
> Process 25048 attached
> write(1, "-dport 13839 -m comment --comment \"container_id: a074efef-c119-4764-a584-87c57e6b7f68\" -j DNAT --to-destination 172.31.254.183:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 17282 -m comment --comment \"container_id: 47880db5-90ad-4034-9c2b-2fd246d42342\" -j DNAT --to-destination 172.31.254.126:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 28712 -m comment --comment \"container_id: 3293d5d0-772c-48d2-bafd-1c4b4d56247e\" -j DNAT --to-destination 172.31.254.130:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 23893 -m comment --comment \"container_id: f57de8eb-a3b9-44cb-8dac-7ab261bc8aac\" -j DNAT --to-destination 172.31.254.149:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 28449 -m comment --comment \"container_id: 9238dbf0-7b28-4fda-880a-bf0c8f40562a\" -j DNAT --to-destination 172.31.254.190:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 26438 -m comment --comment \"container_id: d307cf58-8972-4de4-ad45-26c29786add0\" -j DNAT --to-destination 172.31.254.187:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 7682 -m comment --comment \"container_id: 60f5a61b-f4c0-4846-b2cd-63cd7eb5a4e8\" -j DNAT --to-destination 172.31.254.177:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 23904 -m comment --comment \"container_id: f203ff9e-7b81-4e54-ab44-d45e2a937f38\" -j DNAT --to-destination 172.31.254.157:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 8359 -m comment --comment \"container_id: 578cc89c-83bf-46ba-9ae7-c7b89e40e739\" -j DNAT --to-destination 172.31.254.158:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 28482 -m comment --comment \"container_id: 70721adb-cd6c-4158-8b11-4ef694999203\" -j DNAT --to-destination 172.31.254.163:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 2564 -m comment --comment \"container_id: 70bc9735-8e0a-4b27-9a68-1b1151c9f6ec\" -j DNAT --to-destination 172.31.254.105:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 6952 -m comment --comment \"container_id: a494e0fe-ce6e-4e95-b094-504c7603b6fb\" -j DNAT --to-destination 172.31.254.159:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 29371 -m comment --comment \"container_id: 1dac0183-8939-4165-95e8-cf6503cc53ad\" -j DNAT --to-destination 172.31.254.174:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 6551 -m comment --comment \"container_id: 148a197b-7247-4a3c-b528-1417cc8ccf3f\" -j DNAT --to-destination 172.31.254.144:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 16568 -m comment --comment \"container_id: e68bb942-ab69-43a5-a308-40b33368d111\" -j DNAT --to-destination 172.31.254.184:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 5915 -m comment --comment \"container_id: 08d53ef6-7d3c-4b44-8bb7-923b6a4a1780\" -j DNAT --to-destination 172.31.254.186:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 11252 -m comment --comment \"container_id: 894e1455-6c77-4fc2-8414-53f3db7b492a\" -j DNAT --to-destination 172.31.254.223:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 7583 -m comment --comment \"container_id: 917b1fc3-12ca-4198-a5f4-13144fbf9096\" -j DNAT --to-destination 172.31.254.164:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 12695 -m comment --comment \"container_id: cfadeb29-3481-4e9e-8eba-e57ce6a4f693\" -j DNAT --to-destination 172.31.254.153:80\n-A UCR-DEFAULT-BRIDGE ! 
-i ucr-br0 -p tcp -m tcp --dport 1256 -m comment --comment \"container_id: 212649ee-a894-4201-bc38-798545ce9e1f\" -j DNAT --to-destination 172.31.254.132:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 28581 -m comment --comment \"container_id: 38c46524-9023-450d-a134-24ad14eecbd5\" -j DNAT --to-destination 172.31.254.202:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 5709 -m comment --comment \"container_id: 8a510a8e-df16-40b2-a4b7-36605a5f9984\" -j DNAT --to-destination 172.31.254.194:80\n-A UCR-DEFAULT-BRIDGE ! -i ucr-br0 -p tcp -m tcp --dport 6050 -m comment --comment \"container_id: 94757d96-b6ad-4a57-9e38-c956a2e24fd6\" -j DNAT --to-destination 172.31.254.152:80\n-A UCR-DEFAULT-B", 4096
> {noformat}
> So it gets stuck writing to stdout. That indicates the pipe is full and the reader end hasn't drained it. We then identified the reader:
> {noformat}
> $ ps aux | grep a074efef-c119-4764-a584-87c57e6b7f68
> core      2355  0.0  0.0   6756  1020 pts/0    S+   23:39   0:00 grep --colour=auto a074efef-c119-4764-a584-87c57e6b7f68
> root     26486  0.0  0.0  11948  2852 ?        S    22:41   0:00 sh -c        #!/bin/sh       exec 1>&2       set -x        # The iptables command searches for the DNAT rules with tag       # "container_id: <CNI_CONTAINERID>", and if it exists goes ahead       # and deletes it.       iptables -w -t nat -S UCR-DEFAULT-BRIDGE | sed "/container_id: a074efef-c119-4764-a584-87c57e6b7f68/ s/-A/iptables -w -t nat -D/e"
> root     26490  0.0  0.0   4340   896 ?        S    22:41   0:00 sed /container_id: a074efef-c119-4764-a584-87c57e6b7f68/ s/-A/iptables -w -t nat -D/e
> $ sudo strace -s 10000 -p 26490
> Process 26490 attached
> read(0, ^CProcess 26490 detached
> {noformat}
> The reader, which is `sed`, is blocked reading from stdin as well.
> After checking the agent log, we found that the corresponding container is stuck during CNI detach.
> After checking the code, we pinpointed the problematic script in our CNI port mapper plugin:
> https://github.com/apache/mesos/blob/1.6.0/src/slave/containerizer/mesos/isolators/network/cni/plugins/port_mapper/port_mapper.cpp#L372
> {code}
> iptables -w -t nat -S %s | sed "/%s/ s/-A/iptables -w -t nat -D/e"
> {code}
> This command will deadlock if there are a lot of entries in iptables. The GNU `sed` `e` flag executes each generated `iptables -w -t nat -D ...` command before reading the next input line, so `sed` stops draining the pipe while that command runs. But `iptables -w -t nat -D ...` cannot proceed until the first command, `iptables -w -t nat -S %s`, finishes and releases the xtables lock. And the first command cannot finish if it produces a lot of output, because it blocks writing to the pipe that `sed` is no longer reading. The processes end up waiting on each other.
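> To make the cycle concrete, here is a small illustration (not from the ticket) of the same pattern, with flock(1) standing in for the xtables lock and `yes` standing in for a long `-S` listing; the lock file path and the "match" text are made-up placeholders. Running it is expected to hang until interrupted:
> {code}
> #!/bin/sh
> # Illustration only: the writer holds a lock while producing more output than one
> # pipe buffer can hold; GNU sed's `e` flag runs a command that also wants the lock
> # and stops reading stdin until it exits; that command waits forever for the lock.
> lock=/tmp/xtables-demo.lock
>
> (
>   flock 9                                        # take the lock on fd 9
>   yes 'match placeholder-rule' | head -n 100000  # far more than a pipe buffer
> ) 9>"$lock" | sed "/match/ s@^@flock $lock echo would-delete @e"
> {code}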
> The correct solution is to write the delete commands to a file first, and then execute that file, so the `-S` listing completes and releases the xtables lock before any `-D` command runs.
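> A hedged sketch of that approach (an illustration, not the actual patch; `CHAIN` and `CONTAINER_ID` are placeholders for the values the plugin substitutes into the script):
> {code}
> #!/bin/sh
> # Generate the delete commands into a temporary file first, so the `-S` process
> # exits and releases the xtables lock before any `-D` command is executed.
> set -e
> rules=$(mktemp)
> trap 'rm -f "$rules"' EXIT
>
> iptables -w -t nat -S "$CHAIN" \
>   | sed -n "/container_id: $CONTAINER_ID/ s/^-A/iptables -w -t nat -D/p" > "$rules"
>
> sh "$rules"
> {code}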



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)