You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2013/03/21 22:01:15 UTC

[jira] [Commented] (MESOS-410) AllocatorTest/0.SchedulerFailover is flaky.

    [ https://issues.apache.org/jira/browse/MESOS-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609465#comment-13609465 ] 

Vinod Kone commented on MESOS-410:
----------------------------------

[~twm378] Hey Thomas. As an aside, can you run all the allocator tests in a repeated loop, and fix any errors that you find?
                
> AllocatorTest/0.SchedulerFailover is flaky.
> -------------------------------------------
>
>                 Key: MESOS-410
>                 URL: https://issues.apache.org/jira/browse/MESOS-410
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Mahler
>            Assignee: Thomas Marshall
>
> Looks like there are some bad expectations in the SchedulerFailover test.
> [ RUN      ] AllocatorTest/0.SchedulerFailover
> I0321 20:49:55.008021  3226 master.cpp:309] Master started on 67.195.138.60:33834
> I0321 20:49:55.008059  3226 master.cpp:324] Master ID: 201303212049-1015726915-33834-3147
> I0321 20:49:55.008726  3226 master.cpp:603] Elected as master!
> I0321 20:49:55.008412  3228 sched.cpp:182] New master at master@67.195.138.60:33834
> I0321 20:49:55.009618  3228 master.cpp:646] Registering framework 201303212049-1015726915-33834-3147-0000 at scheduler(63)@67.195.138.60:33834
> I0321 20:49:55.010125  3228 sched.cpp:217] Framework registered with 201303212049-1015726915-33834-3147-0000
> W0321 20:49:55.008816  3229 master.cpp:81] No whitelist given. Advertising offers for all slaves
> I0321 20:49:55.008859  3230 hierarchical_allocator_process.hpp:236] Initializing hierarchical allocator process with master : master@67.195.138.60:33834
> I0321 20:49:55.060097  3230 hierarchical_allocator_process.hpp:268] Added framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.065731  3230 hierarchical_allocator_process.hpp:668] No resources available to allocate!
> I0321 20:49:55.066246  3230 hierarchical_allocator_process.hpp:599] Performed allocation for 0 slaves in 517.48us
> I0321 20:49:55.008122  3224 slave.cpp:203] Slave started on 73)@67.195.138.60:33834
> I0321 20:49:55.067206  3224 slave.cpp:204] Slave resources: cpus=3; mem=1024; ports=[31000-32000]; disk=14036
> I0321 20:49:55.068222  3224 slave.cpp:453] New master detected at master@67.195.138.60:33834
> I0321 20:49:55.068403  3224 slave.cpp:377] Finished recovery
> I0321 20:49:55.069236  3224 master.cpp:968] Attempting to register slave on janus.apache.org at slave(73)@67.195.138.60:33834
> I0321 20:49:55.070163  3224 master.cpp:1224] Master now considering a slave at janus.apache.org:33834 as active
> I0321 20:49:55.070575  3224 master.cpp:1862] Adding slave 201303212049-1015726915-33834-3147-0 at janus.apache.org with cpus=3; mem=1024; ports=[31000-32000]; disk=14036
> I0321 20:49:55.072634  3224 slave.cpp:487] Registered with master; given slave ID 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.068428  3226 status_update_manager.cpp:132] New master detected at master@67.195.138.60:33834
> I0321 20:49:55.072762  3228 hierarchical_allocator_process.hpp:395] Added slave 201303212049-1015726915-33834-3147-0 (janus.apache.org) with cpus=3; mem=1024; ports=[31000-32000]; disk=14036 (and cpus=3; mem=1024; ports=[31000-32000]; disk=14036 available)
> I0321 20:49:55.074096  3228 hierarchical_allocator_process.hpp:660] Found available resources: cpus=3; mem=1024; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.074628  3228 hierarchical_allocator_process.hpp:686] Offering cpus=3; mem=1024; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0 to framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.075227  3228 hierarchical_allocator_process.hpp:619] Performed allocation for slave 201303212049-1015726915-33834-3147-0 in 1.14ms
> I0321 20:49:55.075314  3224 master.hpp:309] Adding offer with resources cpus=3; mem=1024; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.076213  3224 master.cpp:1327] Sending 1 offers to framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.076750  3224 sched.cpp:282] Received 1 offers
> I0321 20:49:55.077636  3224 master.cpp:1534] Processing reply for offer 201303212049-1015726915-33834-3147-0 on slave 201303212049-1015726915-33834-3147-0 (janus.apache.org) for framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.077834  3224 master.hpp:289] Adding task with resources cpus=1; mem=256 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.078308  3224 master.cpp:1651] Launching task 0 of framework 201303212049-1015726915-33834-3147-0000 with resources cpus=1; mem=256 on slave 201303212049-1015726915-33834-3147-0 (janus.apache.org)
> I0321 20:49:55.078922  3224 master.hpp:318] Removing offer with resources cpus=3; mem=1024; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.078999  3229 slave.cpp:599] Got assigned task 0 for framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.079066  3230 hierarchical_allocator_process.hpp:471] Framework 201303212049-1015726915-33834-3147-0000 left cpus=2; mem=768; ports=[31000-32000]; disk=14036 unused on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.081300  3229 paths.hpp:302] Created executor directory '/tmp/AllocatorTest_0_SchedulerFailover_oWPFSh/slaves/201303212049-1015726915-33834-3147-0/frameworks/201303212049-1015726915-33834-3147-0000/executors/default/runs/da91e171-2172-4e06-b425-1aa1e8522aa7'
> I0321 20:49:55.122453  3229 slave.cpp:436] Successfully attached file '/tmp/AllocatorTest_0_SchedulerFailover_oWPFSh/slaves/201303212049-1015726915-33834-3147-0/frameworks/201303212049-1015726915-33834-3147-0000/executors/default/runs/da91e171-2172-4e06-b425-1aa1e8522aa7'
> I0321 20:49:55.122535  3225 exec.cpp:170] Executor started at: executor(20)@67.195.138.60:33834 with pid 3147
> I0321 20:49:55.123455  3225 slave.cpp:1058] Got registration for executor 'default' of framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.124008  3225 slave.cpp:1133] Flushing queued tasks for framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.124099  3229 exec.cpp:194] Executor registered on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.125076  3229 exec.cpp:258] Executor asked to run task '0'
> I0321 20:49:55.125663  3227 sched.cpp:422] Stopping framework '201303212049-1015726915-33834-3147-0000'
> I0321 20:49:55.126713  3227 master.cpp:488] Framework 201303212049-1015726915-33834-3147-0000 disconnected
> I0321 20:49:55.127213  3227 master.cpp:500] Giving framework 201303212049-1015726915-33834-3147-0000 500.00ms to failover
> I0321 20:49:55.127316  3223 hierarchical_allocator_process.hpp:359] Deactivated framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.127428  3226 sched.cpp:182] New master at master@67.195.138.60:33834
> I0321 20:49:55.128818  3226 master.cpp:681] Re-registering framework 201303212049-1015726915-33834-3147-0000 at scheduler(64)@67.195.138.60:33834
> I0321 20:49:55.162433  3226 master.cpp:700] Framework 201303212049-1015726915-33834-3147-0000 failed over
> I0321 20:49:55.163038  3226 sched.cpp:217] Framework registered with 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.163110  3228 slave.cpp:968] Updating framework 201303212049-1015726915-33834-3147-0000 pid to scheduler(64)@67.195.138.60:33834
> I0321 20:49:55.163147  3224 hierarchical_allocator_process.hpp:327] Activated framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.164475  3224 hierarchical_allocator_process.hpp:660] Found available resources: cpus=2; mem=768; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.164984  3224 hierarchical_allocator_process.hpp:686] Offering cpus=2; mem=768; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0 to framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.165602  3224 hierarchical_allocator_process.hpp:599] Performed allocation for 1 slaves in 1.15ms
> I0321 20:49:55.165699  3230 master.hpp:309] Adding offer with resources cpus=2; mem=768; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.166571  3230 master.cpp:1327] Sending 1 offers to framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.167109  3230 sched.cpp:282] Received 1 offers
> I0321 20:49:55.167677  3225 sched.cpp:422] Stopping framework '201303212049-1015726915-33834-3147-0000'
> I0321 20:49:55.168128  3225 master.cpp:488] Framework 201303212049-1015726915-33834-3147-0000 disconnected
> I0321 20:49:55.168639  3225 master.cpp:500] Giving framework 201303212049-1015726915-33834-3147-0000 500.00ms to failover
> I0321 20:49:55.178027  3225 master.hpp:318] Removing offer with resources cpus=2; mem=768; ports=[31000-32000]; disk=14036 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.168725  3229 hierarchical_allocator_process.hpp:359] Deactivated framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.199956  3229 hierarchical_allocator_process.hpp:544] Recovered cpus=2; mem=768; ports=[31000-32000]; disk=14036 (total allocatable: cpus=2; mem=768; ports=[31000-32000]; disk=14036) on slave 201303212049-1015726915-33834-3147-0 from framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.678827  3227 master.cpp:1259] Framework failover timeout, removing framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.699990  3227 master.hpp:300] Removing task with resources cpus=1; mem=256 on slave 201303212049-1015726915-33834-3147-0
> I0321 20:49:55.705976  3227 hierarchical_allocator_process.hpp:544] Recovered cpus=1; mem=256 (total allocatable: cpus=3; mem=1024; ports=[31000-32000]; disk=14036) on slave 201303212049-1015726915-33834-3147-0 from framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.700172  3223 slave.cpp:901] Asked to shut down framework 201303212049-1015726915-33834-3147-0000 by master@67.195.138.60:33834
> I0321 20:49:55.706676  3223 slave.cpp:906] Shutting down framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.707144  3223 slave.cpp:1693] Shutting down executor 'default' of framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.707645  3223 slave.cpp:386] Slave terminating
> I0321 20:49:55.708091  3223 slave.cpp:901] Asked to shut down framework 201303212049-1015726915-33834-3147-0000 by @0.0.0.0:0
> I0321 20:49:55.708566  3223 slave.cpp:906] Shutting down framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.709998  3223 slave.cpp:1693] Shutting down executor 'default' of framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.710511  3223 master.cpp:537] Slave 201303212049-1015726915-33834-3147-0(janus.apache.org) disconnected
> I0321 20:49:55.710978  3223 master.cpp:542] Removing disconnected slave 201303212049-1015726915-33834-3147-0(janus.apache.org) because it is not checkpointing!
> I0321 20:49:55.707731  3228 status_update_manager.cpp:233] Closing status update streams for framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.712024  3228 status_update_manager.cpp:233] Closing status update streams for framework 201303212049-1015726915-33834-3147-0000
> I0321 20:49:55.711609  3147 master.cpp:477] Master terminating
> ../../src/tests/allocator_tests.cpp:642: Failure
> Actual function call count doesn't match EXPECT_CALL(exec, shutdown(_))...
>          Expected: to be called once
>            Actual: never called - unsatisfied and active

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira