Posted to issues@mesos.apache.org by "Chengwei Yang (JIRA)" <ji...@apache.org> on 2014/09/17 03:17:34 UTC

[jira] [Created] (MESOS-1804) the "store" component cause on-top framework (chronos) crash

Chengwei Yang created MESOS-1804:
------------------------------------

             Summary: the "store" component cause on-top framework (chronos) crash
                 Key: MESOS-1804
                 URL: https://issues.apache.org/jira/browse/MESOS-1804
             Project: Mesos
          Issue Type: Bug
         Environment: mesos-0.19.0
            Reporter: Chengwei Yang
            Assignee: Chengwei Yang


Chronos running on mesos-0.19.0 may crash as shown below.

{code}
[2014-09-05 15:21:36,095] INFO State J_chronos_job_34 does not exist yet. Adding to state (com.airbnb.scheduler.state.MesosStatePersistenceStore:146)
F0905 15:21:36.175230 27727 org_apache_mesos_state_AbstractState.cpp:319] Check failed: future->isReady()
*** Check failure stack trace: ***
@ 0x7f4f1ecb199d google::LogMessage::Fail()
@ 0x7f4f1ecb59b7 google::LogMessage::SendToLog()
@ 0x7f4f1ecb3839 google::LogMessage::Flush()
@ 0x7f4f1ecb3b3d google::LogMessageFatal::~LogMessageFatal()
@ 0x7f4f1ec2ef90 Java_org_apache_mesos_state_AbstractState__1_1store_1get
@ 0x7f4f18293d45 (unknown)
Aborted (core dumped)
{code}

The related code snippet is below:
{code}
$ sed -ne '311,334p' src/java/jni/org_apache_mesos_state_AbstractState.cpp
JNIEXPORT jobject JNICALL Java_org_apache_mesos_state_AbstractState__1_1store_1get
  (JNIEnv* env, jobject thiz, jlong jfuture)
{
  Future<Option<Variable> >* future = (Future<Option<Variable> >*) jfuture;

  future->await();

  if (future->isFailed()) {
    jclass clazz = env->FindClass("java/util/concurrent/ExecutionException");
    env->ThrowNew(clazz, future->failure().c_str());
    return NULL;
  } else if (future->isDiscarded()) {
    // TODO(benh): Consider throwing an ExecutionException since we
    // never return true for 'isCancelled'.
    jclass clazz = env->FindClass("java/util/concurrent/CancellationException");
    env->ThrowNew(clazz, "Future was discarded");
    return NULL;
  }

  CHECK_READY(*future);

  if (future->get().isSome()) {
    Variable* variable = new Variable(future->get().get());
{code}

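The root cause seems to be that CHECK_READY(*future) failed: the future was neither failed nor discarded, but it was not ready either, so the fatal check aborted the process and crashed chronos.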
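
To make the failure mode concrete, here is a minimal self-contained sketch. It is not Mesos code: State, MiniFuture, Result and store_get are hypothetical stand-ins for the real Future and JNI function. It assumes the future can be observed in a state that is neither failed, discarded, nor ready (modeled here as Pending); whether that is what actually happened in this crash is not established. The sketch reports that case to the caller as an error instead of aborting the process.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the states a future can be observed in.
enum class State { Pending, Ready, Failed, Discarded };

// Hypothetical, simplified future: just a state plus payloads.
struct MiniFuture {
  State state;
  std::string value;    // meaningful only when Ready
  std::string failure;  // meaningful only when Failed
};

// Outcome reported to the caller instead of aborting the process.
struct Result {
  bool ok;
  std::string text;  // the value on success, an error message otherwise
};

// Mirrors the shape of the JNI function in the snippet above, but handles
// the "not ready" case explicitly rather than with a fatal check.
Result store_get(const MiniFuture& f) {
  if (f.state == State::Failed) {
    return {false, f.failure};
  }
  if (f.state == State::Discarded) {
    return {false, "Future was discarded"};
  }
  if (f.state != State::Ready) {
    // This is the case that a fatal readiness check turns into an abort.
    return {false, "Future is not ready"};
  }
  assert(f.state == State::Ready);  // cannot fail at this point
  return {true, f.value};
}
```

In the real JNI function, the analogous change would presumably be to throw a Java exception for the not-ready case, as the failed and discarded branches already do, so that the framework can handle the error rather than core dump.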

See chronos issue: https://github.com/airbnb/chronos/issues/253


