Posted to issues@mesos.apache.org by "Nan Xiao (JIRA)" <ji...@apache.org> on 2015/12/07 07:22:10 UTC

[jira] [Created] (MESOS-4072) The lt-mesos-master will coredump in some situation.

Nan Xiao created MESOS-4072:
-------------------------------

             Summary: The lt-mesos-master will coredump in some situation.
                 Key: MESOS-4072
                 URL: https://issues.apache.org/jira/browse/MESOS-4072
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 0.25.0
            Reporter: Nan Xiao


I find that lt-mesos-master dumps core when both of the following conditions are met:

(1) The user does not have write permission on the /var/lib/mesos directory:

nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/
total 176
dr-xr-xr-x 2 root    root    4096 Dec  7 03:08 mesos
......

(2) The /var/lib/mesos directory is empty:
nan@ubuntu:~/mesos-0.25.0/build$ ls -lt /var/lib/mesos/
total 0

Running the following command then causes a core dump:

nan@ubuntu:~/mesos-0.25.0/build$ ./bin/mesos-master.sh --ip=16.187.250.141 --work_dir=/var/lib/mesos
I1207 03:18:36.431015 22951 main.cpp:229] Build: 2015-12-07 00:11:18 by nan
I1207 03:18:36.431154 22951 main.cpp:231] Version: 0.25.0
I1207 03:18:36.431388 22951 main.cpp:252] Using 'HierarchicalDRF' allocator
F1207 03:18:36.431807 22951 replica.cpp:724] CHECK_SOME(state): IO error: /var/lib/mesos/replicated_log/LOCK: No such file or directory Failed to recover the log
*** Check failure stack trace: ***
    @     0x7f076bc208ca  google::LogMessage::Fail()
    @     0x7f076bc20816  google::LogMessage::SendToLog()
    @     0x7f076bc20218  google::LogMessage::Flush()
    @     0x7f076bc2312c  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f076adf8f30  _CheckFatal::~_CheckFatal()
    @     0x7f076baa4939  mesos::internal::log::ReplicaProcess::restore()
    @     0x7f076baa0f8c  mesos::internal::log::ReplicaProcess::ReplicaProcess()
    @     0x7f076baa4c95  mesos::internal::log::Replica::Replica()
    @     0x7f076b9cf819  mesos::internal::log::LogProcess::LogProcess()
    @     0x7f076b9d576c  mesos::internal::log::Log::Log()
    @           0x46d21f  main
    @     0x7f0766f69ec5  (unknown)
    @           0x46b979  (unknown)
Aborted (core dumped)
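
As a possible workaround until the master reports this case without aborting, the work directory can be checked before the master is started. The following is only a minimal sketch: the function name check_work_dir and its three status words are hypothetical and not part of Mesos, and it assumes that a missing or non-writable work directory is what leads to the failed CHECK_SOME shown above.

```shell
#!/bin/sh
# Hypothetical pre-flight helper; not part of Mesos.
# Prints one status word and returns non-zero when the directory
# is unlikely to be usable as --work_dir.
check_work_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        # The path does not exist or is not a directory.
        echo "missing"
        return 1
    elif [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        # The current user cannot create entries inside the directory
        # (assumed cause of the replicated_log/LOCK error).
        echo "not-writable"
        return 1
    fi
    echo "ok"
    return 0
}
```

For example, running check_work_dir /var/lib/mesos && ./bin/mesos-master.sh --work_dir=/var/lib/mesos would skip the launch when the check fails. Note that the -w test always succeeds for root, so the sketch is only meaningful for an unprivileged user such as the one in this report.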

Analyzing the core file with gdb:

nan@ubuntu:~/mesos-0.25.0/build$ gdb /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master core
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master...done.
[New LWP 22065]
[New LWP 22087]
[New LWP 22085]
[New LWP 22089]
[New LWP 22084]
[New LWP 22086]
[New LWP 22091]
[New LWP 22088]
[New LWP 22092]
[New LWP 22090]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/nan/mesos-0.25.0/build/src/.libs/lt-mesos-master --ip=127.0.0.1 --work_di'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fe917810cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Traceback (most recent call last):
  File "/usr/share/gdb/auto-load/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.19-gdb.py", line 63, in <module>
    from libstdcxx.v6.printers import register_libstdcxx_printers
ImportError: No module named 'libstdcxx'
(gdb) bt
#0  0x00007fe917810cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fe9178140d8 in __GI_abort () at abort.c:89
#2  0x00007fe91c4b8c1b in DumpStackTraceAndExit () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#3  0x00007fe91c4b28ca in google::LogMessage::Fail () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#4  0x00007fe91c4b2816 in google::LogMessage::SendToLog () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#5  0x00007fe91c4b2218 in google::LogMessage::Flush () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#6  0x00007fe91c4b512c in google::LogMessageFatal::~LogMessageFatal () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#7  0x00007fe91b68af30 in _CheckFatal::~_CheckFatal (this=0x7ffe704ec3f0, __in_chrg=<optimized out>)
    at ../../3rdparty/libprocess/3rdparty/stout/include/stout/check.hpp:165
#8  0x00007fe91c336939 in mesos::internal::log::ReplicaProcess::restore (this=0x16f25d0, path=...) at ../../src/log/replica.cpp:724
#9  0x00007fe91c332f8c in mesos::internal::log::ReplicaProcess::ReplicaProcess (this=0x16f25d0, path=..., __in_chrg=<optimized out>,
    __vtt_parm=<optimized out>) at ../../src/log/replica.cpp:160
#10 0x00007fe91c336c95 in mesos::internal::log::Replica::Replica (this=0x16e82a0, path=...) at ../../src/log/replica.cpp:753
#11 0x00007fe91c261819 in mesos::internal::log::LogProcess::LogProcess () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#12 0x00007fe91c26776c in mesos::internal::log::Log::Log () from /home/nan/mesos-0.25.0/build/src/.libs/libmesos-0.25.0.so
#13 0x000000000046d21f in main (argc=3, argv=0x7ffe704ef028) at ../../src/master/main.cpp:307
(gdb)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)