Posted to dev@mesos.apache.org by Du Li <du...@ericsson.com> on 2013/10/29 23:55:18 UTC

problem launching mesos master from HEAD

Hi,

I just built the latest code from mesos HEAD. When running "bin/mesos-master", the master process dumped core with the error messages below. Framework authentication seems to be a newly added feature. Do I need any extra configuration for mesos and the framework (e.g. Spark)?

Thanks,
Du

I1029 15:32:44.601929 18088 main.cpp:123] Build: 2013-10-29 15:02:13 by bigdatauser
I1029 15:32:44.603224 18088 main.cpp:124] Starting Mesos master
I1029 15:32:44.604331 18088 master.cpp:293] Master started on 10.126.71.151:5050
I1029 15:32:44.604408 18088 master.cpp:308] Master ID: 201310291532-2538044938-5050-18088
I1029 15:32:44.604465 18088 master.cpp:318] Master allowing unauthenticated frameworks to register!!
*** Aborted at 1383085964 (unix time) try "date -d @1383085964" if you are using GNU date ***
I1029 15:32:44.604970 18088 master.cpp:706] Elected as master!
PC: @     0x7f3b715e7514 mesos::internal::state::LevelDBStorageProcess::initialize()
*** SIGSEGV (@0x0) received by PID 18088 (TID 0x7f3b66c1c700) from PID 0; stack trace: ***
    @     0x7f3b703bbcb0 (unknown)
    @     0x7f3b715e7514 mesos::internal::state::LevelDBStorageProcess::initialize()
    @     0x7f3b7160d389 process::ProcessManager::resume()
    @     0x7f3b7160df2c process::schedule()
    @     0x7f3b703b3e9a start_thread
    @     0x7f3b700e03fd (unknown)
Segmentation fault (core dumped)



Re: problem launching mesos master from HEAD

Posted by Du Li <du...@ericsson.com>.
Hi Ben,

I recompiled the code with your fix and was able to start a test master
with the command "bin/mesos-master".

Thanks for your quick attention!

Du

On 10/30/13 12:00 PM, "Benjamin Mahler" <be...@gmail.com> wrote:

>Just committed the fix for this one.


Re: problem launching mesos master from HEAD

Posted by Benjamin Mahler <be...@gmail.com>.
Just committed the fix for this one.


On Wed, Oct 30, 2013 at 11:55 AM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:

> Ah I see the issue in LevelDBStorageProcess::initialize
>
> void LevelDBStorageProcess::initialize()
> {
>   leveldb::Options options;
>   options.create_if_missing = true;
>
>   leveldb::Status status = leveldb::DB::Open(options, path, &db);
>
>   if (!status.ok()) {
>     // TODO(benh): Consider trying to repair the DB.
>     error = Option<string>::some(status.ToString());
>   }
>
>   // TODO(benh): Conditionally compact to avoid long recovery times?
> *  db->CompactRange(NULL, NULL);*
> }
>
> If the Open operation fails, we must not dereference db. I'll commit a fix
> for you shortly, which should fix things up. (This is likely happening
> because the default work directory for the master is /tmp/mesos, which is
> likely not writable by your master.)

Re: problem launching mesos master from HEAD

Posted by Benjamin Mahler <be...@gmail.com>.
Ah I see the issue in LevelDBStorageProcess::initialize

void LevelDBStorageProcess::initialize()
{
  leveldb::Options options;
  options.create_if_missing = true;

  leveldb::Status status = leveldb::DB::Open(options, path, &db);

  if (!status.ok()) {
    // TODO(benh): Consider trying to repair the DB.
    error = Option<string>::some(status.ToString());
  }

  // TODO(benh): Conditionally compact to avoid long recovery times?
*  db->CompactRange(NULL, NULL);*
}

If the Open operation fails, we must not dereference db. I'll commit a fix
for you shortly, which should fix things up. (This is likely happening
because the default work directory for the master is /tmp/mesos, which is
likely not writable by your master.)
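
For illustration, here is a minimal sketch of the kind of guard described
above: return early when Open fails so that db is never dereferenced. It is
not the committed fix. FakeStatus, FakeDB and Storage are hypothetical
stand-ins (assumptions, not Mesos or LevelDB types) so that the example
compiles without LevelDB installed.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for leveldb::Status (assumption for this sketch).
struct FakeStatus {
  bool success;
  std::string message;
  bool ok() const { return success; }
  std::string ToString() const { return message; }
};

// Hypothetical stand-in for leveldb::DB (assumption for this sketch).
struct FakeDB {
  int compactions = 0;
  void CompactRange() { ++compactions; }

  // Mimics an open call that leaves *db untouched when it fails.
  // An empty path is used here to simulate an unwritable directory.
  static FakeStatus Open(const std::string& path, FakeDB** db) {
    if (path.empty()) {
      return FakeStatus{false, "IO error: cannot open database"};
    }
    *db = new FakeDB();
    return FakeStatus{true, "OK"};
  }
};

// Hypothetical stand-in for LevelDBStorageProcess.
struct Storage {
  std::string path;
  FakeDB* db = nullptr;
  std::string error;  // Empty means no error was recorded.

  explicit Storage(std::string p) : path(std::move(p)) {}
  ~Storage() { delete db; }

  void initialize() {
    FakeStatus status = FakeDB::Open(path, &db);

    if (!status.ok()) {
      // Record the failure and stop: db is still null at this point,
      // so compacting would dereference a null pointer.
      error = status.ToString();
      return;
    }

    assert(db != nullptr);
    db->CompactRange();
  }
};
```

With this shape, a failed open leaves db null and error set, and the
compaction only runs once the open has succeeded.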


On Wed, Oct 30, 2013 at 11:45 AM, Benjamin Mahler <benjamin.mahler@gmail.com> wrote:

> Interesting, I cannot reproduce this. What command are you using to run
> the master? Which operating system?
>
> If you're running the master from a build, can you run the master with gdb?
>
> ./bin/gdb-mesos-master.sh
>
> Once it segfaults it would be good to get a backtrace on where this occurs
> (seems the backtrace you linked does not have line numbers unfortunately).

Re: problem launching mesos master from HEAD

Posted by Benjamin Mahler <be...@gmail.com>.
Interesting, I cannot reproduce this. What command are you using to run the
master? Which operating system?

If you're running the master from a build, can you run the master with gdb?

./bin/gdb-mesos-master.sh

Once it segfaults it would be good to get a backtrace on where this occurs
(seems the backtrace you linked does not have line numbers unfortunately).

