Posted to dev@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2014/04/17 22:47:15 UTC

[jira] [Commented] (MESOS-1220) Make check failure on OSX - IO error: Too many open files

    [ https://issues.apache.org/jira/browse/MESOS-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973380#comment-13973380 ] 

Till Toenshoff commented on MESOS-1220:
---------------------------------------

Reverting the following commit fixes the problem:

commit f87111b4481aaf24c22c94984b28baf91611ddf5
Author: Benjamin Mahler <bm...@twitter.com>
Date:   Wed Apr 16 15:57:28 2014 -0700

    Used LogStorage for all tests.
    
    Review: https://reviews.apache.org/r/20431


> Make check failure on OSX - IO error: Too many open files
> ---------------------------------------------------------
>
>                 Key: MESOS-1220
>                 URL: https://issues.apache.org/jira/browse/MESOS-1220
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.19.0
>         Environment: OSX 10.9.2, Apple LLVM version 5.1 (clang-503.0.40) (based on LLVM 3.4svn)
>            Reporter: Till Toenshoff
>              Labels: build, build-failure, mesos, unit-test
>
> Running make check ends in an abort:
> {noformat}
> $ make check
> [...]
> [       OK ] CoordinatorTest.LearnedOnOneReplica_NotLearnedOnAnother_AnotherFailsAndRecovers (0 ms)
> [----------] 21 tests from CoordinatorTest (816 ms total)
> [----------] 2 tests from RecoverTest
> [ RUN      ] RecoverTest.RacingCatchup
> F0417 21:45:21.254204 1980908304 replica.cpp:709] CHECK_SOME(state): IO error: /private/tmp/RecoverTest_RacingCatchup_if5Cz6/.log4/LOCK: Too many open filesFailed to recover the log
> *** Check failure stack trace: ***
>     @        0x10a2f9434  google::LogMessage::SendToLog()
>     @        0x10a2f9963  google::LogMessage::Flush()
>     @        0x10a2fcaff  google::LogMessageFatal::~LogMessageFatal()
>     @        0x10a2fa059  google::LogMessageFatal::~LogMessageFatal()
>     @        0x109dd8479  _CheckFatal::~_CheckFatal()
>     @        0x109dd8349  _CheckFatal::~_CheckFatal()
>     @        0x10a1b379a  mesos::internal::log::ReplicaProcess::restore()
>     @        0x10a1b3241  mesos::internal::log::ReplicaProcess::ReplicaProcess()
>     @        0x10a1b696b  mesos::internal::log::Replica::Replica()
>     @        0x1091ebd9a  RecoverTest_RacingCatchup_Test::TestBody()
>     @        0x10945234c  testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @        0x1094431ea  testing::Test::Run()
>     @        0x109443e72  testing::TestInfo::Run()
>     @        0x1094444b0  testing::TestCase::Run()
>     @        0x109449d05  testing::internal::UnitTestImpl::RunAllTests()
>     @        0x109452b14  testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @        0x109449a39  testing::UnitTest::Run()
>     @        0x10922a270  main
>     @     0x7fff8a98d5fd  start
>     @                0x1  (unknown)
> make[3]: *** [check-local] Abort trap: 6
> {noformat}
> That test does not fail when run individually, which hints at a general file-handle leak across the test suite rather than a problem in this particular test.
> The exact test that triggers the abort is machine dependent. I tried it on two MBPs, and one fails a few tests earlier than the other.
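
One way to check the leak hypothesis from the description is to compare the number of open descriptors before and after repeating a suspect operation, and to look at the per-process limit that the "Too many open files" error is hitting. The following is only a minimal sketch, assuming Python is available on the machine; the helper names (fd_limits, open_fd_count, leaked_fds) are hypothetical and not part of Mesos, and counting via /dev/fd is an assumption that holds on OS X and Linux but not necessarily elsewhere.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(fd_dir="/dev/fd"):
    """Count the descriptors currently open in this process.

    Listing the directory briefly opens a descriptor of its own, so the
    absolute number may be one too high; differences between two calls
    are not affected.
    """
    return len(os.listdir(fd_dir))


def leaked_fds(action, repeat=10):
    """Call `action` `repeat` times and return how many descriptors remain open."""
    before = open_fd_count()
    for _ in range(repeat):
        action()
    return open_fd_count() - before


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("descriptors open right now:", open_fd_count())
```

A result from leaked_fds that grows with repeat would be consistent with a leak in the operation under test. Raising the soft limit in the shell before running make check (ulimit -n) may mask the symptom, but it would not remove a leak if there is one.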


