Posted to issues@mesos.apache.org by "Anand Mazumdar (JIRA)" <ji...@apache.org> on 2015/10/16 09:19:06 UTC

[jira] [Comment Edited] (MESOS-3747) HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string

    [ https://issues.apache.org/jira/browse/MESOS-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960287#comment-14960287 ] 

Anand Mazumdar edited comment on MESOS-3747 at 10/16/15 7:18 AM:
-----------------------------------------------------------------

[~liqiang] Thanks for taking this up. When using the C++ Scheduler Library, it's perfectly fine to drop these silently, logging only a warning, as we do for other failed validations and as you mentioned earlier.

This JIRA is about what the general behavior of the Mesos master should be for clients other than the Scheduler Library.


> HTTP Scheduler API no longer allows FrameworkInfo.user to be empty string
> -------------------------------------------------------------------------
>
>                 Key: MESOS-3747
>                 URL: https://issues.apache.org/jira/browse/MESOS-3747
>             Project: Mesos
>          Issue Type: Bug
>          Components: HTTP API
>    Affects Versions: 0.24.0, 0.24.1, 0.25.0
>            Reporter: Ben Whitehead
>            Assignee: Liqiang Lin
>            Priority: Blocker
>
> When using libmesos, a framework can set its user to {{""}} (empty string) to inherit the user the agent process is running as. This behavior now results in a {{TASK_FAILED}}.
> Full messages and relevant agent logs are below.
> The error returned to the framework says nothing about the user not existing on the agent host; instead, it tells me the container died due to OOM.
> {code:title=FrameworkInfo}
> call {
>     type: SUBSCRIBE
>     subscribe: {
>         frameworkInfo: {
>             user: "",
>             name: "testing"
>         }
>     }
> }
> {code}
> {code:title=TaskInfo}
> call {
>     framework_id { value: "20151015-125949-16777343-5050-20146-0000" },
>     type: ACCEPT,
>     accept { 
>         offer_ids: [{ value: "20151015-125949-16777343-5050-20146-O0" }],
>         operations { 
>             type: LAUNCH, 
>             launch { 
>                 task_infos [
>                     {
>                         name: "task-1",
>                         task_id: { value: "task-1" },
>                         agent_id: { value: "20151015-125949-16777343-5050-20146-S0" },
>                         resources [
>                             { name: "cpus", type: SCALAR, scalar: { value: 0.1 },  role: "*" },
>                             { name: "mem",  type: SCALAR, scalar: { value: 64.0 }, role: "*" },
>                             { name: "disk", type: SCALAR, scalar: { value: 0.0 },  role: "*" },
>                         ],
>                         command: { 
>                             environment { 
>                                 variables [ 
>                                     { name: "SLEEP_SECONDS" value: "15" } 
>                                 ] 
>                             },
>                             value: "env | sort && sleep $SLEEP_SECONDS"
>                         }
>                     }
>                 ]
>              }
>          }
>      }
> }
> {code}
> {code:title=Update Status}
> event: {
>     type: UPDATE,
>     update: { 
>         status: { 
>             task_id: { value: "task-1" }, 
>             state: TASK_FAILED,
>             message: "Container destroyed while preparing isolators",
>             agent_id: { value: "20151015-125949-16777343-5050-20146-S0" }, 
>             timestamp: 1.444939217401241E9,
>             executor_id: { value: "task-1" },
>             source: SOURCE_AGENT, 
>             reason: REASON_MEMORY_LIMIT,
>             uuid: "\237g()L\026EQ\222\301\261\265\\\221\224|" 
>         } 
>     }
> }
> {code}
> {code:title=agent logs}
> I1015 13:15:34.260592 19639 slave.cpp:1270] Got assigned task task-1 for framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:34.260921 19639 slave.cpp:1386] Launching task task-1 for framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> W1015 13:15:34.262243 19639 paths.cpp:423] Failed to chown executor directory '/home/ben.whitehead/opt/mesos/work/slave/work_dir/slaves/e4de5b96-41cc-4713-af44-7cffbdd63ba6-S0/frameworks/e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000/executors/task-1/runs/3958ff84-8dd9-4c3c-995d-5aba5250541b': Failed to get user information for '': Success
> I1015 13:15:34.262444 19639 slave.cpp:4852] Launching executor task-1 of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/home/ben.whitehead/opt/mesos/work/slave/work_dir/slaves/e4de5b96-41cc-4713-af44-7cffbdd63ba6-S0/frameworks/e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000/executors/task-1/runs/3958ff84-8dd9-4c3c-995d-5aba5250541b'
> I1015 13:15:34.262581 19639 slave.cpp:1604] Queuing task 'task-1' for executor task-1 of framework 'e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:34.262684 19638 docker.cpp:734] No container info found, skipping launch
> I1015 13:15:34.263478 19638 containerizer.cpp:640] Starting container '3958ff84-8dd9-4c3c-995d-5aba5250541b' for executor 'task-1' of framework 'e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000'
> E1015 13:15:34.264516 19641 slave.cpp:3342] Container '3958ff84-8dd9-4c3c-995d-5aba5250541b' for executor 'task-1' of framework 'e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000' failed to start: Failed to prepare isolator: Failed to get user information for '': Success
> I1015 13:15:34.264681 19636 containerizer.cpp:1097] Destroying container '3958ff84-8dd9-4c3c-995d-5aba5250541b'
> I1015 13:15:34.265997 19636 slave.cpp:3433] Executor 'task-1' of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000 has terminated with unknown status
> I1015 13:15:34.266568 19636 slave.cpp:2717] Handling status update TASK_FAILED (UUID: 6e45302e-72a4-442f-8056-6154eab5e265) for task task-1 of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000 from @0.0.0.0:0
> W1015 13:15:34.266695 19636 containerizer.cpp:988] Ignoring update for unknown container: 3958ff84-8dd9-4c3c-995d-5aba5250541b
> I1015 13:15:34.266772 19638 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 6e45302e-72a4-442f-8056-6154eab5e265) for task task-1 of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:34.266885 19636 slave.cpp:3016] Forwarding the update TASK_FAILED (UUID: 6e45302e-72a4-442f-8056-6154eab5e265) for task task-1 of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000 to master@127.0.0.1:5050
> I1015 13:15:35.255997 19638 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 6e45302e-72a4-442f-8056-6154eab5e265) for task task-1 of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:35.256165 19640 slave.cpp:3544] Cleaning up executor 'task-1' of framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:35.256273 19641 gc.cpp:56] Scheduling '/home/ben.whitehead/opt/mesos/work/slave/work_dir/slaves/e4de5b96-41cc-4713-af44-7cffbdd63ba6-S0/frameworks/e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000/executors/task-1/runs/3958ff84-8dd9-4c3c-995d-5aba5250541b' for gc 6.99999703411852days in the future
> I1015 13:15:35.256283 19640 slave.cpp:3633] Cleaning up framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:35.256340 19641 gc.cpp:56] Scheduling '/home/ben.whitehead/opt/mesos/work/slave/work_dir/slaves/e4de5b96-41cc-4713-af44-7cffbdd63ba6-S0/frameworks/e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000/executors/task-1' for gc 6.99999703386667days in the future
> I1015 13:15:35.256350 19634 status_update_manager.cpp:284] Closing status update streams for framework e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000
> I1015 13:15:35.256377 19641 gc.cpp:56] Scheduling '/home/ben.whitehead/opt/mesos/work/slave/work_dir/slaves/e4de5b96-41cc-4713-af44-7cffbdd63ba6-S0/frameworks/e4de5b96-41cc-4713-af44-7cffbdd63ba6-0000' for gc 6.99999703291556days in the future
> {code}
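
The behavior the report expects (an empty FrameworkInfo.user inheriting the agent's user, and a missing user producing a descriptive error rather than "Failed to get user information for '': Success") could be sketched roughly as follows. This is only an illustration in Python, not the Mesos implementation: the function names resolve_task_user and lookup_uid, the agent_user parameter, and the injectable getpwnam argument are all hypothetical.

```python
import pwd


def resolve_task_user(framework_user: str, agent_user: str) -> str:
    """Return the user a task should run as.

    Assumption for this sketch: an empty framework user means
    "inherit the user the agent process is running as".
    """
    return framework_user if framework_user else agent_user


def lookup_uid(user: str, getpwnam=pwd.getpwnam) -> int:
    """Look up a user's UID, raising a descriptive error if the user is unknown.

    `getpwnam` is injectable so the lookup can be exercised without
    depending on the accounts present on the local host.
    """
    try:
        return getpwnam(user).pw_uid
    except KeyError:
        # pwd.getpwnam raises KeyError when no such entry exists.
        raise LookupError(f"user {user!r} does not exist on this host") from None
```

Under these assumptions, resolve_task_user("", "ben.whitehead") returns "ben.whitehead", while calling lookup_uid on an unresolved empty string raises an error that names the missing user instead of reporting an unrelated reason.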


