Posted to user@mesos.apache.org by Christopher Ketchum <ck...@ucsc.edu> on 2016/05/27 07:44:48 UTC

Benign 'Shutdown failed on fd' error messages

Hi all,

I'm running Mesos 0.25.0 and have been seeing these strange 'Shutdown
failed on fd' errors. I saw a couple of other postings about similar error
messages, but this case seems different because Mesos appears to be working
fine apart from the printed messages. Does anyone have suggestions for
determining what these messages mean or, alternatively, how to silence
them if they aren't significant?

Thanks!
Chris

I0121 21:29:31.058202 11329 sched.cpp:164] Version: 0.25.0
I0121 21:29:31.106197 11355 sched.cpp:262] New master detected at master@xxx:xx:xx:xxx:5050
I0121 21:29:31.106302 11355 sched.cpp:272] No credentials provided. Attempting to register without authentication
E0121 21:29:31.106353 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.106487 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.113162 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.263561 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.286962 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.887789 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:31.978222 11368 socket.hpp:174] Shutdown failed on fd=11: Transport endpoint is not connected [107]
E0121 21:29:34.231999 11368 socket.hpp:174] Shutdown failed on fd=13: Transport endpoint is not connected [107]

Re: Benign 'Shutdown failed on fd' error messages

Posted by haosdent <ha...@gmail.com>.
>does Mesos open sockets for zookeeper by default?
No. Mesos only tries to connect to ZooKeeper when you run multiple Mesos
masters with ZooKeeper.

If your tasks and frameworks work fine, I think you can ignore these
messages, as @Joseph said. If you want to find out what these fds are
used for, I think you could run lsof -p ${YOUR_MASTER_PROCESS_ID} to list
the connections.
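
On Linux, roughly the same information as lsof can be read from /proc.
The following is a minimal Python sketch, not part of Mesos. The helper
names list_fds and socket_fds are made up for illustration, and it assumes
a Linux /proc filesystem and permission to inspect the target process
(same user or root).

```python
import os


def list_fds(pid):
    """Return {fd: target} for a process by reading /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The fd was closed between listdir() and readlink(); skip it.
            continue
    return result


def socket_fds(pid):
    """Keep only the descriptors whose link target looks like 'socket:[inode]'."""
    return {fd: target for fd, target in list_fds(pid).items()
            if target.startswith("socket:")}
```

Calling socket_fds(pid) with the PID of the process that prints the errors
shows which of its descriptors are sockets, so a number such as the fd=11
from the log can be checked while the process is still running.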


-- 
Best Regards,
Haosdent Huang

Re: Benign 'Shutdown failed on fd' error messages

Posted by Christopher Ketchum <ck...@ucsc.edu>.
I'm not using ZooKeeper. Does Mesos open sockets for ZooKeeper by default?
That might explain how I end up with the unused sockets. With this new
information, I'll look at my setup again and see if I can find what is
causing this error.

Thanks again!
Chris

Re: Benign 'Shutdown failed on fd' error messages

Posted by Joseph Wu <jo...@mesosphere.io>.
This log line is part of some socket cleanup Mesos performs for all
sockets.  Mesos calls the "shutdown" syscall on the socket:
http://man7.org/linux/man-pages/man2/shutdown.2.html

This part of the log line:
> Transport endpoint is not connected
comes from the *ENOTCONN* error code.  We generally hit this error code in
one of two cases:
1) We created the socket, but never used it.  We call shutdown(s) to be
safe.
2) The socket was closed on the other side.  Again, we call shutdown(s) to
be safe.
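
Case 1 is easy to reproduce outside Mesos. The sketch below is Python
rather than the C++ that Mesos uses, and shutdown_error is a hypothetical
helper name; it only illustrates the syscall behaviour described above,
not the Mesos cleanup code itself.

```python
import errno
import socket


def shutdown_error(sock):
    """Call shutdown() on sock; return the errno name on failure, else None."""
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    return None


# Case 1: the socket was created but never connected.
unused = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(shutdown_error(unused))  # ENOTCONN on Linux
unused.close()

# For contrast: a connected pair shuts down without an error.
a, b = socket.socketpair()
print(shutdown_error(a))  # None
a.close()
b.close()
```

In both situations the process keeps working normally, which matches the
observation that Mesos runs fine apart from the printed messages.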

I'd argue that this should not be logged at the ERROR level.  But as of the
current code, these log lines can't be silenced without losing all of your
logging verbosity :(
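
One possible workaround is to filter the lines outside Mesos, for example
by piping the scheduler's stderr through a small script. This is a sketch
under assumptions: the pattern is derived only from the 0.25.0 lines quoted
in this thread (a later log format would need its own pattern), and NOISE
and drop_noise are illustrative names, not Mesos options.

```python
import re

# Matches glog-style lines like the ones in this thread, e.g.
# "E0121 21:29:31.106353 11368 socket.hpp:174] Shutdown failed on fd=11:
#  Transport endpoint is not connected [107]" (a single line in the real log).
NOISE = re.compile(
    r"^E\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ \S+\] "
    r"Shutdown failed on fd=\d+: Transport endpoint is not connected"
)


def drop_noise(lines):
    """Yield every line except the ENOTCONN shutdown messages."""
    for line in lines:
        if not NOISE.match(line):
            yield line
```

Used as a pipe filter, the loop would be something like
"for line in drop_noise(sys.stdin): sys.stdout.write(line)". All other
ERROR lines pass through unchanged.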

BTW, the log line has been moved since 0.26, but it will still show up in a
different form:
https://github.com/apache/mesos/commit/b06e932a036044c54cd72ddde1d26c5f9271ea51#diff-b13970db30a54291dc4a85c16491abfe

Re: Benign 'Shutdown failed on fd' error messages

Posted by haosdent <ha...@gmail.com>.
Do you use ZooKeeper? This looks similar to this thread:
http://search-hadoop.com/m/0Vlr6fthtf1D5ssd1


-- 
Best Regards,
Haosdent Huang