Posted to issues@mesos.apache.org by "Kevin Klues (JIRA)" <ji...@apache.org> on 2016/10/06 07:33:20 UTC
[jira] [Commented] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15551191#comment-15551191 ]
Kevin Klues commented on MESOS-6118:
------------------------------------
I've added two new patches to try to address this:
https://reviews.apache.org/r/52597/
https://reviews.apache.org/r/52596/
[~jamiebriant] [~bobrik] Could you please try things out with these patches and see if they fix your issues?
> Agent would crash with docker container tasks due to host mount table read.
> ---------------------------------------------------------------------------
>
> Key: MESOS-6118
> URL: https://issues.apache.org/jira/browse/MESOS-6118
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Affects Versions: 1.0.1
> Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Jamie Briant
> Assignee: Kevin Klues
> Priority: Critical
> Labels: linux, slave
> Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (each lasting a few seconds to a few minutes) over a period of several minutes. In 1.0.1, the slave process crashes every few minutes (with systemd restarting it).
> The crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
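
For readers unfamiliar with this kind of assertion, here is a minimal sketch of a parent-before-child ordering of mount entries with a "visited parents" guard, similar in spirit to what the name in the failed check (`!visitedParents.contains(parentId)`) suggests. It is not the Mesos code: `sort_mounts`, the `(mount_id, parent_id)` tuples, and the explicit `root_id` argument are hypothetical names chosen for illustration, and the report does not say which input actually triggered the crash.

```python
from collections import defaultdict


def sort_mounts(entries, root_id):
    """Order mount IDs so that every parent precedes its children.

    `entries` is a list of (mount_id, parent_id) pairs (hypothetical
    format). Raises ValueError if the same mount ID is reached twice
    during the walk, e.g. because an ID appears more than once in
    the input.
    """
    # Group every non-root entry under its parent ID.
    children = defaultdict(list)
    for mount_id, parent_id in entries:
        if mount_id != root_id:
            children[parent_id].append(mount_id)

    ordered = []
    visited_parents = set()
    stack = [root_id]
    while stack:
        parent_id = stack.pop()
        # Guard analogous to the failed check: never expand an ID twice.
        if parent_id in visited_parents:
            raise ValueError(f"mount {parent_id} visited twice")
        visited_parents.add(parent_id)
        ordered.append(parent_id)
        # Push children in reverse so they pop in their original order.
        stack.extend(reversed(children[parent_id]))
    return ordered
```

In this sketch a well-formed table sorts cleanly, while a table listing the same mount ID twice trips the guard. A process that treats such a guard as a fatal check rather than a recoverable error would abort at that point.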