Posted to issues@mesos.apache.org by "James Peach (JIRA)" <ji...@apache.org> on 2018/01/17 23:53:00 UTC

[jira] [Commented] (MESOS-6575) Change `disk/xfs` isolator to terminate executor when it exceeds quota

    [ https://issues.apache.org/jira/browse/MESOS-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329711#comment-16329711 ] 

James Peach commented on MESOS-6575:
------------------------------------

Yeah, I think that using the soft limit is a pretty good idea. We can set the soft limit to the resources and the hard limit to the resources plus a fudge factor. We can kill applications based on either directly observing soft limit breaches or the quota warnings (we need to check whether XFS resets them if the task goes back under the soft limit).
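
A rough sketch of the soft/hard limit idea, to make the proposal concrete. Everything here is hypothetical: the names (QuotaLimits, makeLimits, checkUsage) are illustrative and not taken from the Mesos source, the size of the fudge factor is an assumption, and only the "directly observe a soft limit breach" option is shown, not the quota-warning one.

```cpp
#include <cstdint>

// Hypothetical types for illustration only; not the Mesos isolator API.
struct QuotaLimits {
  uint64_t softBytes;  // Equal to the disk resource of the task.
  uint64_t hardBytes;  // Soft limit plus a fudge factor (headroom).
};

enum class QuotaAction { None, Kill };

// Derive both limits from the disk resource. The fudge factor is an
// assumed, caller-supplied number of bytes.
inline QuotaLimits makeLimits(uint64_t resourceBytes, uint64_t fudgeBytes) {
  return QuotaLimits{resourceBytes, resourceBytes + fudgeBytes};
}

// Decide what to do for an observed usage figure. Usage above the soft
// limit is treated as a breach; the hard limit is left for the filesystem
// to enforce, so it is not consulted here.
inline QuotaAction checkUsage(const QuotaLimits& limits, uint64_t usedBytes) {
  return usedBytes > limits.softBytes ? QuotaAction::Kill : QuotaAction::None;
}
```

For example, makeLimits(100, 10) gives a soft limit of 100 and a hard limit of 110; checkUsage reports no action at 100 bytes used and a kill at 101. The headroom between the two limits is what would let the isolator notice the breach before writes start failing.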

> Change `disk/xfs` isolator to terminate executor when it exceeds quota
> ----------------------------------------------------------------------
>
>                 Key: MESOS-6575
>                 URL: https://issues.apache.org/jira/browse/MESOS-6575
>             Project: Mesos
>          Issue Type: Task
>          Components: agent, containerization
>            Reporter: Santhosh Kumar Shanmugham
>            Assignee: James Peach
>            Priority: Major
>
> Unlike the {{disk/du}} isolator, which sends a {{ContainerLimitation}} protobuf when the executor exceeds its quota, the {{disk/xfs}} isolator relies on XFS's internal quota enforcement: the {{write}} operation that would exceed the quota limit fails, and the quota breach information is never surfaced.
> This task is to change the `disk/xfs` isolator so that a {{ContainerLimitation}} message is triggered when the quota is exceeded.
> This feature will rely on the underlying filesystem being mounted with {{pqnoenforce}} (accounting-only mode), so that XFS does not return an {{EDQUOT}} error on writes that would cause the quota to be exceeded. The isolator can then track disk usage via {{xfs_quota}}, much as {{disk/du}} does using {{du}}, every {{container_disk_watch_interval}}, and surface the quota-exceeded event via a {{ContainerLimitation}} protobuf, causing the executor to be terminated. The feature can be turned on or off via the existing {{enforce_container_disk_quota}} option.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)