Posted to dev@mesos.apache.org by Du Li <du...@ericsson.com> on 2013/10/29 23:55:18 UTC
problem launching mesos master from HEAD
Hi,
I just built the latest code from mesos HEAD. When I run "bin/mesos-master", the master process dumps core with the error messages below. Framework authentication seems to be a newly added feature. Do I need to do any extra configuration for mesos and the framework (e.g. Spark)?
Thanks,
Du
I1029 15:32:44.601929 18088 main.cpp:123] Build: 2013-10-29 15:02:13 by bigdatauser
I1029 15:32:44.603224 18088 main.cpp:124] Starting Mesos master
I1029 15:32:44.604331 18088 master.cpp:293] Master started on 10.126.71.151:5050
I1029 15:32:44.604408 18088 master.cpp:308] Master ID: 201310291532-2538044938-5050-18088
I1029 15:32:44.604465 18088 master.cpp:318] Master allowing unauthenticated frameworks to register!!
*** Aborted at 1383085964 (unix time) try "date -d @1383085964" if you are using GNU date ***
I1029 15:32:44.604970 18088 master.cpp:706] Elected as master!
PC: @ 0x7f3b715e7514 mesos::internal::state::LevelDBStorageProcess::initialize()
*** SIGSEGV (@0x0) received by PID 18088 (TID 0x7f3b66c1c700) from PID 0; stack trace: ***
@ 0x7f3b703bbcb0 (unknown)
@ 0x7f3b715e7514 mesos::internal::state::LevelDBStorageProcess::initialize()
@ 0x7f3b7160d389 process::ProcessManager::resume()
@ 0x7f3b7160df2c process::schedule()
@ 0x7f3b703b3e9a start_thread
@ 0x7f3b700e03fd (unknown)
Segmentation fault (core dumped)
Re: problem launching mesos master from HEAD
Posted by Du Li <du...@ericsson.com>.
Hi Ben,
I recompiled the code with your fix and was able to start a test master
with the command "bin/mesos-master".
Thanks for your quick attention!
Du
On 10/30/13 12:00 PM, "Benjamin Mahler" <be...@gmail.com> wrote:
>Just committed the fix for this one.
Re: problem launching mesos master from HEAD
Posted by Benjamin Mahler <be...@gmail.com>.
Just committed the fix for this one.
On Wed, Oct 30, 2013 at 11:55 AM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:
> Ah, I see the issue in LevelDBStorageProcess::initialize:
>
> void LevelDBStorageProcess::initialize()
> {
>   leveldb::Options options;
>   options.create_if_missing = true;
>
>   leveldb::Status status = leveldb::DB::Open(options, path, &db);
>
>   if (!status.ok()) {
>     // TODO(benh): Consider trying to repair the DB.
>     error = Option<string>::some(status.ToString());
>   }
>
>   // TODO(benh): Conditionally compact to avoid long recovery times?
>   db->CompactRange(NULL, NULL);  // <-- runs even if Open() failed
> }
>
> If the Open operation fails, we must not dereference db. I'll commit a fix
> shortly; that should fix things up. (This is likely happening because the
> default work directory for the master is /tmp/mesos, which is likely not
> writable by your master.)
Re: problem launching mesos master from HEAD
Posted by Benjamin Mahler <be...@gmail.com>.
Ah, I see the issue in LevelDBStorageProcess::initialize:

void LevelDBStorageProcess::initialize()
{
  leveldb::Options options;
  options.create_if_missing = true;

  leveldb::Status status = leveldb::DB::Open(options, path, &db);

  if (!status.ok()) {
    // TODO(benh): Consider trying to repair the DB.
    error = Option<string>::some(status.ToString());
  }

  // TODO(benh): Conditionally compact to avoid long recovery times?
  db->CompactRange(NULL, NULL);  // <-- runs even if Open() failed
}

If the Open operation fails, we must not dereference db. I'll commit a fix
shortly; that should fix things up. (This is likely happening because the
default work directory for the master is /tmp/mesos, which is likely not
writable by your master.)
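
[Editor's sketch] The committed fix itself is not shown in this thread, so the following is only an illustration of the general pattern described above: return early when the open fails, so the null handle is never dereferenced. Everything here is a stand-in invented for the example (Status, FakeDB, open, Storage); none of it is leveldb's or Mesos's actual code. The stand-in open() only mimics the behaviour at issue, where a failed open leaves the out-pointer null.

```cpp
#include <cassert>
#include <string>

// Stand-in types for illustration only; NOT leveldb's classes.
struct Status {
  bool ok_;
  std::string message;
  bool ok() const { return ok_; }
};

struct FakeDB {
  int compactions = 0;
  void CompactRange() { ++compactions; }
};

// Hypothetical open(): on failure the out-pointer is set to null and a
// non-OK status is returned, which is the situation behind the crash.
Status open(bool writable, FakeDB** db) {
  if (!writable) {
    *db = nullptr;
    return Status{false, "IO error: directory not writable"};
  }
  *db = new FakeDB();
  return Status{true, ""};
}

struct Storage {
  FakeDB* db = nullptr;
  std::string error;  // Empty means "no error recorded".

  Storage() = default;
  Storage(const Storage&) = delete;
  Storage& operator=(const Storage&) = delete;
  ~Storage() { delete db; }

  void initialize(bool writable) {
    Status status = open(writable, &db);
    if (!status.ok()) {
      error = status.message;
      return;  // Bail out: db is null here and must not be used.
    }
    assert(db != nullptr);
    db->CompactRange();  // Only reached after a successful open.
  }
};
```

With this shape, a Storage whose initialize(false) call fails ends up with a recorded error and a null db instead of a segfault, while initialize(true) compacts exactly once. Whether the real fix returns early or wraps the compaction in an else branch is not stated in the thread.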
On Wed, Oct 30, 2013 at 11:45 AM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:
> Interesting, I cannot reproduce this. What command are you using to run
> the master? Which operating system?
>
> If you're running the master from a build, can you run the master with gdb?
>
> ./bin/gdb-mesos-master.sh
>
> Once it segfaults, it would be good to get a backtrace showing where this
> occurs (it seems the backtrace you linked does not have line numbers,
> unfortunately).
Re: problem launching mesos master from HEAD
Posted by Benjamin Mahler <be...@gmail.com>.
Interesting, I cannot reproduce this. What command are you using to run the
master? Which operating system?
If you're running the master from a build, can you run the master with gdb?
./bin/gdb-mesos-master.sh
Once it segfaults, it would be good to get a backtrace showing where this
occurs (it seems the backtrace you linked does not have line numbers,
unfortunately).