Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2015/02/27 19:39:05 UTC

[jira] [Commented] (MESOS-2401) MasterTest.ShutdownFrameworkWhileTaskRunning is flaky

    [ https://issues.apache.org/jira/browse/MESOS-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340536#comment-14340536 ] 

Vinod Kone commented on MESOS-2401:
-----------------------------------

Looks like this test actually sets the executor shutdown grace period to 0 (not sure why), which means every run is exposed to this race. I'm surprised we are only seeing this issue now, considering that code has been there since 2013.
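
To make the race concrete, here is a minimal, self-contained sketch (plain C++, not Mesos code; all names are illustrative) of why a zero grace period is racy: the forced-kill path becomes eligible to run immediately, so any scheduling delay at all means the executor is killed before its cooperative shutdown handler ever runs.

{code}
// Illustrative model only -- not the Mesos slave implementation.
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

int main()
{
  using namespace std::chrono_literals;

  const auto gracePeriod = 0ms;          // analogous to a 0s executor shutdown grace period
  std::atomic<bool> killed{false};
  std::atomic<bool> shutdownRan{false};

  // "Timeout" path: after the grace period, forcibly kill the executor.
  std::thread timeout([&] {
    std::this_thread::sleep_for(gracePeriod);
    killed = true;                       // stands in for the shutdown-timeout firing
  });

  // "Cooperative" path: the executor reacts to the shutdown request.
  std::thread executor([&] {
    std::this_thread::sleep_for(1ms);    // any scheduling delay loses the race
    if (!killed) {
      shutdownRan = true;                // stands in for Executor::shutdown() running
    }
  });

  timeout.join();
  executor.join();

  std::cout << (shutdownRan ? "shutdown observed\n"
                            : "killed before shutdown ran\n");
}
{code}

With a non-zero grace period the cooperative path almost always wins; with 0 the outcome depends purely on thread scheduling, which is exactly the kind of nondeterminism that makes the test flaky.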

> MasterTest.ShutdownFrameworkWhileTaskRunning is flaky
> -----------------------------------------------------
>
>                 Key: MESOS-2401
>                 URL: https://issues.apache.org/jira/browse/MESOS-2401
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>            Reporter: Vinod Kone
>            Assignee: Vinod Kone
>
> Looks like executorShutdownTimeout() fired immediately after executorShutdown() was called!
> {code}
> [ RUN      ] MasterTest.ShutdownFrameworkWhileTaskRunning
> Using temporary directory '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_sBd6vK'
> I0224 18:51:17.385068 30213 leveldb.cpp:176] Opened db in 1.262442ms
> I0224 18:51:17.386360 30213 leveldb.cpp:183] Compacted db in 985102ns
> I0224 18:51:17.387025 30213 leveldb.cpp:198] Created db iterator in 78043ns
> I0224 18:51:17.387420 30213 leveldb.cpp:204] Seeked to beginning of db in 25814ns
> I0224 18:51:17.387804 30213 leveldb.cpp:273] Iterated through 0 keys in the db in 25025ns
> I0224 18:51:17.388270 30213 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> I0224 18:51:17.389760 30227 recover.cpp:449] Starting replica recovery
> I0224 18:51:17.395699 30227 recover.cpp:475] Replica is in 4 status
> I0224 18:51:17.398294 30227 replica.cpp:641] Replica in 4 status received a broadcasted recover request
> I0224 18:51:17.398816 30227 recover.cpp:195] Received a recover response from a replica in 4 status
> I0224 18:51:17.402415 30230 recover.cpp:566] Updating replica status to 3
> I0224 18:51:17.403473 30229 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 273857ns
> I0224 18:51:17.404093 30229 replica.cpp:323] Persisted replica status to 3
> I0224 18:51:17.404930 30229 recover.cpp:475] Replica is in 3 status
> I0224 18:51:17.407995 30233 replica.cpp:641] Replica in 3 status received a broadcasted recover request
> I0224 18:51:17.410697 30231 recover.cpp:195] Received a recover response from a replica in 3 status
> I0224 18:51:17.415710 30230 recover.cpp:566] Updating replica status to 1
> I0224 18:51:17.416987 30227 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 221966ns
> I0224 18:51:17.417579 30227 replica.cpp:323] Persisted replica status to 1
> I0224 18:51:17.418803 30234 recover.cpp:580] Successfully joined the Paxos group
> I0224 18:51:17.419699 30227 recover.cpp:464] Recover process terminated
> I0224 18:51:17.430594 30234 master.cpp:349] Master 20150224-185117-2272962752-44950-30213 (fedora-19) started on 192.168.122.135:44950
> I0224 18:51:17.431082 30234 master.cpp:395] Master only allowing authenticated frameworks to register
> I0224 18:51:17.431453 30234 master.cpp:400] Master only allowing authenticated slaves to register
> I0224 18:51:17.431828 30234 credentials.hpp:37] Loading credentials for authentication from '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_sBd6vK/credentials'
> I0224 18:51:17.432740 30234 master.cpp:442] Authorization enabled
> I0224 18:51:17.434224 30229 hierarchical.hpp:287] Initialized hierarchical allocator process
> I0224 18:51:17.434994 30233 whitelist_watcher.cpp:79] No whitelist given
> I0224 18:51:17.440687 30234 master.cpp:1356] The newly elected leader is master@192.168.122.135:44950 with id 20150224-185117-2272962752-44950-30213
> I0224 18:51:17.441764 30234 master.cpp:1369] Elected as the leading master!
> I0224 18:51:17.442430 30234 master.cpp:1187] Recovering from registrar
> I0224 18:51:17.443053 30229 registrar.cpp:313] Recovering registrar
> I0224 18:51:17.445468 30228 log.cpp:660] Attempting to start the writer
> I0224 18:51:17.449970 30233 replica.cpp:477] Replica received implicit promise request with proposal 1
> I0224 18:51:17.451359 30233 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 339488ns
> I0224 18:51:17.451949 30233 replica.cpp:345] Persisted promised to 1
> I0224 18:51:17.456845 30235 process.cpp:2117] Dropped / Lost event for PID: hierarchical-allocator(154)@192.168.122.135:44950
> I0224 18:51:17.461741 30231 coordinator.cpp:230] Coordinator attemping to fill missing position
> I0224 18:51:17.464686 30228 replica.cpp:378] Replica received explicit promise request for position 0 with proposal 2
> I0224 18:51:17.465515 30228 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 170261ns
> I0224 18:51:17.465991 30228 replica.cpp:679] Persisted action at 0
> I0224 18:51:17.470512 30229 replica.cpp:511] Replica received write request for position 0
> I0224 18:51:17.471437 30229 leveldb.cpp:438] Reading position from leveldb took 139178ns
> I0224 18:51:17.472129 30229 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 141560ns
> I0224 18:51:17.472705 30229 replica.cpp:679] Persisted action at 0
> I0224 18:51:17.476305 30228 replica.cpp:658] Replica received learned notice for position 0
> I0224 18:51:17.477991 30228 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 208112ns
> I0224 18:51:17.478574 30228 replica.cpp:679] Persisted action at 0
> I0224 18:51:17.479044 30228 replica.cpp:664] Replica learned 1 action at position 0
> I0224 18:51:17.484371 30233 log.cpp:676] Writer started with ending position 0
> I0224 18:51:17.487396 30233 leveldb.cpp:438] Reading position from leveldb took 96498ns
> I0224 18:51:17.498906 30233 registrar.cpp:346] Successfully fetched the registry (0B) in 55.234048ms
> I0224 18:51:17.499781 30233 registrar.cpp:445] Applied 1 operations in 97308ns; attempting to update the 'registry'
> I0224 18:51:17.503955 30231 log.cpp:684] Attempting to append 131 bytes to the log
> I0224 18:51:17.505009 30231 coordinator.cpp:340] Coordinator attempting to write 2 action at position 1
> I0224 18:51:17.507428 30228 replica.cpp:511] Replica received write request for position 1
> I0224 18:51:17.508517 30228 leveldb.cpp:343] Persisting action (150 bytes) to leveldb took 316570ns
> I0224 18:51:17.508985 30228 replica.cpp:679] Persisted action at 1
> I0224 18:51:17.512902 30229 replica.cpp:658] Replica received learned notice for position 1
> I0224 18:51:17.517261 30229 leveldb.cpp:343] Persisting action (152 bytes) to leveldb took 427860ns
> I0224 18:51:17.517470 30229 replica.cpp:679] Persisted action at 1
> I0224 18:51:17.517796 30229 replica.cpp:664] Replica learned 2 action at position 1
> I0224 18:51:17.532624 30232 registrar.cpp:490] Successfully updated the 'registry' in 32.31104ms
> I0224 18:51:17.533957 30228 log.cpp:703] Attempting to truncate the log to 1
> I0224 18:51:17.534366 30228 coordinator.cpp:340] Coordinator attempting to write 3 action at position 2
> I0224 18:51:17.536684 30227 replica.cpp:511] Replica received write request for position 2
> I0224 18:51:17.537406 30227 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 196455ns
> I0224 18:51:17.537946 30227 replica.cpp:679] Persisted action at 2
> I0224 18:51:17.537695 30232 registrar.cpp:376] Successfully recovered registrar
> I0224 18:51:17.544136 30231 master.cpp:1214] Recovered 0 slaves from the Registry (95B) ; allowing 10mins for slaves to re-register
> I0224 18:51:17.546041 30227 replica.cpp:658] Replica received learned notice for position 2
> I0224 18:51:17.546728 30227 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 192442ns
> I0224 18:51:17.547058 30227 leveldb.cpp:401] Deleting ~1 keys from leveldb took 61064ns
> I0224 18:51:17.547363 30227 replica.cpp:679] Persisted action at 2
> I0224 18:51:17.547669 30227 replica.cpp:664] Replica learned 3 action at position 2
> I0224 18:51:17.565460 30234 slave.cpp:174] Slave started on 138)@192.168.122.135:44950
> I0224 18:51:17.566038 30234 credentials.hpp:85] Loading credential for authentication from '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/credential'
> I0224 18:51:17.566584 30234 slave.cpp:281] Slave using credential for: test-principal
> I0224 18:51:17.567198 30234 slave.cpp:299] Slave resources: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0224 18:51:17.567930 30234 slave.cpp:328] Slave hostname: fedora-19
> I0224 18:51:17.568172 30234 slave.cpp:329] Slave checkpoint: false
> W0224 18:51:17.568435 30234 slave.cpp:331] Disabling checkpointing is deprecated and the --checkpoint flag will be removed in a future release. Please avoid using this flag
> I0224 18:51:17.570539 30227 state.cpp:35] Recovering state from '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/meta'
> I0224 18:51:17.573499 30232 status_update_manager.cpp:197] Recovering status update manager
> I0224 18:51:17.574209 30234 slave.cpp:3775] Finished recovery
> I0224 18:51:17.576277 30229 status_update_manager.cpp:171] Pausing sending status updates
> I0224 18:51:17.576680 30234 slave.cpp:624] New master detected at master@192.168.122.135:44950
> I0224 18:51:17.577131 30234 slave.cpp:687] Authenticating with master master@192.168.122.135:44950
> I0224 18:51:17.577385 30234 slave.cpp:692] Using default CRAM-MD5 authenticatee
> I0224 18:51:17.577945 30228 authenticatee.hpp:139] Creating new client SASL connection
> I0224 18:51:17.578837 30234 slave.cpp:660] Detecting new master
> I0224 18:51:17.579270 30228 master.cpp:3813] Authenticating slave(138)@192.168.122.135:44950
> I0224 18:51:17.579900 30228 master.cpp:3824] Using default CRAM-MD5 authenticator
> I0224 18:51:17.580572 30228 authenticator.hpp:170] Creating new server SASL connection
> I0224 18:51:17.581501 30231 authenticatee.hpp:230] Received SASL authentication mechanisms: CRAM-MD5
> I0224 18:51:17.581805 30231 authenticatee.hpp:256] Attempting to authenticate with mechanism 'CRAM-MD5'
> I0224 18:51:17.582222 30228 authenticator.hpp:276] Received SASL authentication start
> I0224 18:51:17.582531 30228 authenticator.hpp:398] Authentication requires more steps
> I0224 18:51:17.582945 30230 authenticatee.hpp:276] Received SASL authentication step
> I0224 18:51:17.583351 30228 authenticator.hpp:304] Received SASL authentication step
> I0224 18:51:17.583643 30228 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'fedora-19' server FQDN: 'fedora-19' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
> I0224 18:51:17.583911 30228 auxprop.cpp:171] Looking up auxiliary property '*userPassword'
> I0224 18:51:17.584241 30228 auxprop.cpp:171] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0224 18:51:17.584517 30228 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'fedora-19' server FQDN: 'fedora-19' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
> I0224 18:51:17.584787 30228 auxprop.cpp:121] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0224 18:51:17.585075 30228 auxprop.cpp:121] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0224 18:51:17.585358 30228 authenticator.hpp:390] Authentication success
> I0224 18:51:17.585750 30233 authenticatee.hpp:316] Authentication success
> I0224 18:51:17.586354 30232 master.cpp:3871] Successfully authenticated principal 'test-principal' at slave(138)@192.168.122.135:44950
> I0224 18:51:17.590953 30234 slave.cpp:758] Successfully authenticated with master master@192.168.122.135:44950
> I0224 18:51:17.591686 30233 master.cpp:2938] Registering slave at slave(138)@192.168.122.135:44950 (fedora-19) with id 20150224-185117-2272962752-44950-30213-S0
> I0224 18:51:17.592718 30233 registrar.cpp:445] Applied 1 operations in 100358ns; attempting to update the 'registry'
> I0224 18:51:17.595989 30227 log.cpp:684] Attempting to append 302 bytes to the log
> I0224 18:51:17.596757 30227 coordinator.cpp:340] Coordinator attempting to write 2 action at position 3
> I0224 18:51:17.599280 30227 replica.cpp:511] Replica received write request for position 3
> I0224 18:51:17.599481 30234 slave.cpp:1090] Will retry registration in 12.331173ms if necessary
> I0224 18:51:17.601940 30227 leveldb.cpp:343] Persisting action (321 bytes) to leveldb took 999045ns
> I0224 18:51:17.602339 30227 replica.cpp:679] Persisted action at 3
> I0224 18:51:17.612349 30229 replica.cpp:658] Replica received learned notice for position 3
> I0224 18:51:17.612934 30229 leveldb.cpp:343] Persisting action (323 bytes) to leveldb took 152139ns
> I0224 18:51:17.613471 30229 replica.cpp:679] Persisted action at 3
> I0224 18:51:17.613796 30229 replica.cpp:664] Replica learned 2 action at position 3
> I0224 18:51:17.615980 30229 master.cpp:2926] Ignoring register slave message from slave(138)@192.168.122.135:44950 (fedora-19) as admission is already in progress
> I0224 18:51:17.614302 30233 slave.cpp:1090] Will retry registration in 11.014835ms if necessary
> I0224 18:51:17.617490 30234 registrar.cpp:490] Successfully updated the 'registry' in 24.179968ms
> I0224 18:51:17.618989 30234 master.cpp:2995] Registered slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0224 18:51:17.619567 30233 hierarchical.hpp:455] Added slave 20150224-185117-2272962752-44950-30213-S0 (fedora-19) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0224 18:51:17.621080 30233 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:17.621441 30233 hierarchical.hpp:759] Performed allocation for slave 20150224-185117-2272962752-44950-30213-S0 in 544608ns
> I0224 18:51:17.619704 30229 slave.cpp:792] Registered with master master@192.168.122.135:44950; given slave ID 20150224-185117-2272962752-44950-30213-S0
> I0224 18:51:17.622195 30229 slave.cpp:2830] Received ping from slave-observer(125)@192.168.122.135:44950
> I0224 18:51:17.622385 30227 status_update_manager.cpp:178] Resuming sending status updates
> I0224 18:51:17.620266 30232 log.cpp:703] Attempting to truncate the log to 3
> I0224 18:51:17.623522 30232 coordinator.cpp:340] Coordinator attempting to write 3 action at position 4
> I0224 18:51:17.624835 30229 replica.cpp:511] Replica received write request for position 4
> I0224 18:51:17.625727 30229 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 259831ns
> I0224 18:51:17.626122 30229 replica.cpp:679] Persisted action at 4
> I0224 18:51:17.627686 30227 replica.cpp:658] Replica received learned notice for position 4
> I0224 18:51:17.628228 30227 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 93777ns
> I0224 18:51:17.628785 30227 leveldb.cpp:401] Deleting ~2 keys from leveldb took 57660ns
> I0224 18:51:17.629176 30227 replica.cpp:679] Persisted action at 4
> I0224 18:51:17.629443 30227 replica.cpp:664] Replica learned 3 action at position 4
> I0224 18:51:17.636715 30213 sched.cpp:157] Version: 0.23.0
> I0224 18:51:17.638003 30229 sched.cpp:254] New master detected at master@192.168.122.135:44950
> I0224 18:51:17.638602 30229 sched.cpp:310] Authenticating with master master@192.168.122.135:44950
> I0224 18:51:17.639024 30229 sched.cpp:317] Using default CRAM-MD5 authenticatee
> I0224 18:51:17.639580 30228 authenticatee.hpp:139] Creating new client SASL connection
> I0224 18:51:17.640455 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-11bb6bcb-cd51-4927-a28b-dbca9d63772f@192.168.122.135:44950
> I0224 18:51:17.641150 30228 master.cpp:3813] Authenticating scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.641597 30228 master.cpp:3824] Using default CRAM-MD5 authenticator
> I0224 18:51:17.642643 30228 authenticator.hpp:170] Creating new server SASL connection
> I0224 18:51:17.643698 30234 authenticatee.hpp:230] Received SASL authentication mechanisms: CRAM-MD5
> I0224 18:51:17.644296 30234 authenticatee.hpp:256] Attempting to authenticate with mechanism 'CRAM-MD5'
> I0224 18:51:17.644739 30228 authenticator.hpp:276] Received SASL authentication start
> I0224 18:51:17.645143 30228 authenticator.hpp:398] Authentication requires more steps
> I0224 18:51:17.645654 30230 authenticatee.hpp:276] Received SASL authentication step
> I0224 18:51:17.646122 30228 authenticator.hpp:304] Received SASL authentication step
> I0224 18:51:17.646421 30228 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'fedora-19' server FQDN: 'fedora-19' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
> I0224 18:51:17.646746 30228 auxprop.cpp:171] Looking up auxiliary property '*userPassword'
> I0224 18:51:17.647203 30228 auxprop.cpp:171] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
> I0224 18:51:17.647644 30228 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'fedora-19' server FQDN: 'fedora-19' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
> I0224 18:51:17.648454 30228 auxprop.cpp:121] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0224 18:51:17.648788 30228 auxprop.cpp:121] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0224 18:51:17.649210 30228 authenticator.hpp:390] Authentication success
> I0224 18:51:17.649705 30231 authenticatee.hpp:316] Authentication success
> I0224 18:51:17.653314 30231 sched.cpp:398] Successfully authenticated with master master@192.168.122.135:44950
> I0224 18:51:17.653766 30232 master.cpp:3871] Successfully authenticated principal 'test-principal' at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.654683 30231 sched.cpp:521] Sending registration request to master@192.168.122.135:44950
> I0224 18:51:17.655138 30231 sched.cpp:554] Will retry registration in 1.028970132secs if necessary
> I0224 18:51:17.657112 30232 master.cpp:1574] Received registration request for framework 'default' at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.658509 30232 master.cpp:1435] Authorizing framework principal 'test-principal' to receive offers for role '*'
> I0224 18:51:17.659765 30232 master.cpp:1638] Registering framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.660727 30233 hierarchical.hpp:321] Added framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.661730 30233 hierarchical.hpp:741] Performed allocation for 1 slaves in 529369ns
> I0224 18:51:17.662911 30229 sched.cpp:448] Framework registered with 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.663374 30229 sched.cpp:462] Scheduler::registered took 35637ns
> I0224 18:51:17.664552 30232 master.cpp:3755] Sending 1 offers to framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.668009 30234 sched.cpp:611] Scheduler::resourceOffers took 2.574292ms
> I0224 18:51:17.671038 30232 master.cpp:2268] Processing ACCEPT call for offers: [ 20150224-185117-2272962752-44950-30213-O0 ] on slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19) for framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.672071 30232 master.cpp:2112] Authorizing framework principal 'test-principal' to launch task 1 as user 'jenkins'
> W0224 18:51:17.674675 30232 validation.cpp:326] Executor default for task 1 uses less CPUs (None) than the minimum required (0.01). Please update your executor, as this will be mandatory in future releases.
> W0224 18:51:17.675395 30232 validation.cpp:338] Executor default for task 1 uses less memory (None) than the minimum required (32MB). Please update your executor, as this will be mandatory in future releases.
> I0224 18:51:17.676460 30232 master.hpp:822] Adding task 1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20150224-185117-2272962752-44950-30213-S0 (fedora-19)
> I0224 18:51:17.677078 30232 master.cpp:2545] Launching task 1 of framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.678084 30230 slave.cpp:1121] Got assigned task 1 for framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.680057 30230 slave.cpp:1231] Launching task 1 for framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.684798 30230 slave.cpp:4177] Launching executor default of framework 20150224-185117-2272962752-44950-30213-0000 in work directory '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/slaves/20150224-185117-2272962752-44950-30213-S0/frameworks/20150224-185117-2272962752-44950-30213-0000/executors/default/runs/675638b4-5214-449d-96d8-c50551496152'
> I0224 18:51:17.688701 30230 exec.cpp:132] Version: 0.23.0
> I0224 18:51:17.689615 30234 exec.cpp:182] Executor started at: executor(41)@192.168.122.135:44950 with pid 30213
> I0224 18:51:17.690659 30230 slave.cpp:1379] Queuing task '1' for executor default of framework '20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.691382 30230 slave.cpp:577] Successfully attached file '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/slaves/20150224-185117-2272962752-44950-30213-S0/frameworks/20150224-185117-2272962752-44950-30213-0000/executors/default/runs/675638b4-5214-449d-96d8-c50551496152'
> I0224 18:51:17.691813 30230 slave.cpp:2140] Got registration for executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000 from executor(41)@192.168.122.135:44950
> I0224 18:51:17.692772 30231 exec.cpp:206] Executor registered on slave 20150224-185117-2272962752-44950-30213-S0
> I0224 18:51:17.695121 30231 exec.cpp:218] Executor::registered took 80811ns
> I0224 18:51:17.697582 30230 slave.cpp:3132] Monitoring executor 'default' of framework '20150224-185117-2272962752-44950-30213-0000' in container '675638b4-5214-449d-96d8-c50551496152'
> I0224 18:51:17.699354 30230 slave.cpp:1533] Sending queued task '1' to executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.700932 30227 exec.cpp:293] Executor asked to run task '1'
> I0224 18:51:17.701679 30227 exec.cpp:302] Executor::launchTask took 140355ns
> I0224 18:51:17.705504 30227 exec.cpp:525] Executor sending status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.707149 30228 slave.cpp:2507] Handling status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 from executor(41)@192.168.122.135:44950
> I0224 18:51:17.708539 30228 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.709377 30228 status_update_manager.cpp:494] Creating StatusUpdate stream for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.710360 30228 status_update_manager.cpp:371] Forwarding update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 to the slave
> I0224 18:51:17.711405 30233 slave.cpp:2750] Forwarding the update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 to master@192.168.122.135:44950
> I0224 18:51:17.712425 30233 master.cpp:3295] Status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 from slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.713047 30233 master.cpp:3336] Forwarding status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.713930 30233 master.cpp:4615] Updating the latest state of task 1 of framework 20150224-185117-2272962752-44950-30213-0000 to TASK_RUNNING
> I0224 18:51:17.714588 30232 sched.cpp:717] Scheduler::statusUpdate took 118286ns
> I0224 18:51:17.715512 30213 sched.cpp:1589] Asked to stop the driver
> I0224 18:51:17.718159 30232 master.cpp:2782] Forwarding status update acknowledgement 57877913-d602-445c-a0d9-87fc71e8eca5 for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950 to slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.718691 30234 sched.cpp:831] Stopping framework '20150224-185117-2272962752-44950-30213-0000'
> I0224 18:51:17.721380 30232 master.cpp:1898] Asked to unregister framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.722920 30232 master.cpp:4183] Removing framework 20150224-185117-2272962752-44950-30213-0000 (default) at scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:17.725231 30231 hierarchical.hpp:400] Deactivated framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.725734 30229 slave.cpp:1746] Asked to shut down framework 20150224-185117-2272962752-44950-30213-0000 by master@192.168.122.135:44950
> I0224 18:51:17.726658 30229 slave.cpp:1771] Shutting down framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.727322 30229 slave.cpp:3440] Shutting down executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.728742 30228 status_update_manager.cpp:389] Received status update acknowledgement (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.729042 30231 slave.cpp:2677] Status update manager successfully handled status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.730578 30231 slave.cpp:2683] Sending acknowledgement for status update TASK_RUNNING (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000 to executor(41)@192.168.122.135:44950
> I0224 18:51:17.731300 30231 slave.cpp:3510] Killing executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.733461 30232 master.cpp:4615] Updating the latest state of task 1 of framework 20150224-185117-2272962752-44950-30213-0000 to TASK_KILLED
> I0224 18:51:17.734503 30231 slave.cpp:3190] Executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000 exited with status 0
> I0224 18:51:17.736177 30231 slave.cpp:3299] Cleaning up executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.737277 30233 gc.cpp:56] Scheduling '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/slaves/20150224-185117-2272962752-44950-30213-S0/frameworks/20150224-185117-2272962752-44950-30213-0000/executors/default/runs/675638b4-5214-449d-96d8-c50551496152' for gc 6.99999146853037days in the future
> I0224 18:51:17.738636 30231 slave.cpp:3378] Cleaning up framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.739148 30228 gc.cpp:56] Scheduling '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/slaves/20150224-185117-2272962752-44950-30213-S0/frameworks/20150224-185117-2272962752-44950-30213-0000/executors/default' for gc 6.99999145243259days in the future
> I0224 18:51:17.740373 30228 status_update_manager.cpp:279] Closing status update streams for framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.741170 30228 status_update_manager.cpp:525] Cleaning up status update stream for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.742255 30229 gc.cpp:56] Scheduling '/tmp/MasterTest_ShutdownFrameworkWhileTaskRunning_lRugms/slaves/20150224-185117-2272962752-44950-30213-S0/frameworks/20150224-185117-2272962752-44950-30213-0000' for gc 6.99999141055704days in the future
> I0224 18:51:17.743207 30231 slave.cpp:2080] Status update manager successfully handled status update acknowledgement (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of framework 20150224-185117-2272962752-44950-30213-0000
> E0224 18:51:17.743799 30231 slave.cpp:2091] Status update acknowledgement (UUID: 57877913-d602-445c-a0d9-87fc71e8eca5) for task 1 of unknown framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.744699 30233 hierarchical.hpp:648] Recovered cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on slave 20150224-185117-2272962752-44950-30213-S0 from framework 20150224-185117-2272962752-44950-30213-0000
> I0224 18:51:17.746369 30232 master.cpp:4682] Removing task 1 with resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] of framework 20150224-185117-2272962752-44950-30213-0000 on slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.747835 30232 master.cpp:4711] Removing executor 'default' with resources  of framework 20150224-185117-2272962752-44950-30213-0000 on slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.749958 30230 hierarchical.hpp:354] Removed framework 20150224-185117-2272962752-44950-30213-0000
> W0224 18:51:17.754004 30232 master.cpp:3382] Ignoring unknown exited executor 'default' of framework 20150224-185117-2272962752-44950-30213-0000 on slave 20150224-185117-2272962752-44950-30213-S0 at slave(138)@192.168.122.135:44950 (fedora-19)
> I0224 18:51:17.806908 30235 process.cpp:2117] Dropped / Lost event for PID: hierarchical-allocator(155)@192.168.122.135:44950
> I0224 18:51:18.169684 30235 process.cpp:2117] Dropped / Lost event for PID: hierarchical-allocator(156)@192.168.122.135:44950
> I0224 18:51:18.277674 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-279c11f0-4f87-4922-b09c-e75af333d93e@192.168.122.135:44950
> I0224 18:51:18.435956 30229 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:18.438434 30229 hierarchical.hpp:741] Performed allocation for 1 slaves in 2.913091ms
> I0224 18:51:18.687819 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:18.840509 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-11bb6bcb-cd51-4927-a28b-dbca9d63772f@192.168.122.135:44950
> I0224 18:51:19.440609 30233 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:19.444022 30233 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.667907ms
> I0224 18:51:20.445341 30229 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:20.448892 30229 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.787334ms
> I0224 18:51:20.840729 30235 process.cpp:2117] Dropped / Lost event for PID: slave(133)@192.168.122.135:44950
> I0224 18:51:20.895016 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-deb787a0-9e87-42c9-813e-fc35f354582f@192.168.122.135:44950
> I0224 18:51:21.016639 30235 process.cpp:2117] Dropped / Lost event for PID: slave(133)@192.168.122.135:44950
> I0224 18:51:21.258066 30235 process.cpp:2117] Dropped / Lost event for PID: slave(134)@192.168.122.135:44950
> I0224 18:51:21.312721 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-7a62b2f4-6959-49de-9fd8-72ffd048f4e3@192.168.122.135:44950
> I0224 18:51:21.450574 30230 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:21.451004 30230 hierarchical.hpp:741] Performed allocation for 1 slaves in 761280ns
> I0224 18:51:21.557883 30235 process.cpp:2117] Dropped / Lost event for PID: slave(135)@192.168.122.135:44950
> I0224 18:51:21.611552 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-279c11f0-4f87-4922-b09c-e75af333d93e@192.168.122.135:44950
> I0224 18:51:21.709940 30235 process.cpp:2117] Dropped / Lost event for PID: slave(135)@192.168.122.135:44950
> I0224 18:51:21.915220 30235 process.cpp:2117] Dropped / Lost event for PID: slave(136)@192.168.122.135:44950
> I0224 18:51:21.997714 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-11bb6bcb-cd51-4927-a28b-dbca9d63772f@192.168.122.135:44950
> I0224 18:51:22.107311 30235 process.cpp:2117] Dropped / Lost event for PID: slave(136)@192.168.122.135:44950
> I0224 18:51:22.219341 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-11bb6bcb-cd51-4927-a28b-dbca9d63772f@192.168.122.135:44950
> I0224 18:51:22.269714 30235 process.cpp:2117] Dropped / Lost event for PID: slave(137)@192.168.122.135:44950
> I0224 18:51:22.453269 30229 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:22.457568 30229 hierarchical.hpp:741] Performed allocation for 1 slaves in 4.67818ms
> I0224 18:51:22.644316 30235 process.cpp:2117] Dropped / Lost event for PID: scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> I0224 18:51:23.459383 30231 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:23.462417 30231 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.417923ms
> I0224 18:51:24.464651 30228 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:24.468094 30228 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.698248ms
> I0224 18:51:25.469254 30232 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:25.472430 30232 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.477698ms
> I0224 18:51:25.971513 30235 process.cpp:2117] Dropped / Lost event for PID: (2965)@192.168.122.135:44950
> I0224 18:51:26.474663 30234 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:26.475232 30234 hierarchical.hpp:741] Performed allocation for 1 slaves in 942399ns
> I0224 18:51:26.672420 30235 process.cpp:2117] Dropped / Lost event for PID: (2996)@192.168.122.135:44950
> I0224 18:51:27.069792 30235 process.cpp:2117] Dropped / Lost event for PID: (3010)@192.168.122.135:44950
> I0224 18:51:27.476572 30228 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:27.479708 30228 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.419391ms
> I0224 18:51:28.481403 30228 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:28.484798 30228 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.709639ms
> I0224 18:51:29.487264 30228 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:29.491909 30228 hierarchical.hpp:741] Performed allocation for 1 slaves in 5.065187ms
> I0224 18:51:29.623121 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(739)@192.168.122.135:44950
> I0224 18:51:29.641655 30235 process.cpp:2117] Dropped / Lost event for PID: slave-observer(120)@192.168.122.135:44950
> I0224 18:51:29.647222 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(740)@192.168.122.135:44950
> I0224 18:51:30.493526 30232 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:30.496922 30232 hierarchical.hpp:741] Performed allocation for 1 slaves in 3.714894ms
> I0224 18:51:30.670241 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(741)@192.168.122.135:44950
> I0224 18:51:30.670737 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(742)@192.168.122.135:44950
> I0224 18:51:30.882052 30235 process.cpp:2117] Dropped / Lost event for PID: slave-observer(121)@192.168.122.135:44950
> I0224 18:51:30.922494 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(744)@192.168.122.135:44950
> I0224 18:51:30.985663 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(745)@192.168.122.135:44950
> I0224 18:51:31.293728 30235 process.cpp:2117] Dropped / Lost event for PID: slave-observer(122)@192.168.122.135:44950
> I0224 18:51:31.328766 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(747)@192.168.122.135:44950
> I0224 18:51:31.497968 30234 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:31.501803 30234 hierarchical.hpp:741] Performed allocation for 1 slaves in 4.13526ms
> I0224 18:51:31.604324 30235 process.cpp:2117] Dropped / Lost event for PID: slave-observer(123)@192.168.122.135:44950
> I0224 18:51:31.645259 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(749)@192.168.122.135:44950
> I0224 18:51:31.676254 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(750)@192.168.122.135:44950
> I0224 18:51:31.913120 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(752)@192.168.122.135:44950
> I0224 18:51:31.957001 30235 process.cpp:2117] Dropped / Lost event for PID: slave-observer(124)@192.168.122.135:44950
> I0224 18:51:31.996151 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(753)@192.168.122.135:44950
> I0224 18:51:32.033216 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(754)@192.168.122.135:44950
> I0224 18:51:32.231158 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(756)@192.168.122.135:44950
> I0224 18:51:32.267324 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(757)@192.168.122.135:44950
> I0224 18:51:32.503401 30227 hierarchical.hpp:834] No resources available to allocate!
> I0224 18:51:32.503978 30227 hierarchical.hpp:741] Performed allocation for 1 slaves in 1.062132ms
> I0224 18:51:32.621012 30232 slave.cpp:2830] Received ping from slave-observer(125)@192.168.122.135:44950
> I0224 18:51:32.658608 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(759)@192.168.122.135:44950
> I0224 18:51:32.671058 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(760)@192.168.122.135:44950
> I0224 18:51:32.719501 30235 process.cpp:2117] Dropped / Lost event for PID: __waiter__(761)@192.168.122.135:44950
> tests/master_tests.cpp:259: Failure
> Failed to wait 15secs for shutdown
> I0224 18:51:32.737337 30213 process.cpp:2117] Dropped / Lost event for PID: scheduler-fc72e828-0783-41b6-9892-ffc961e8567e@192.168.122.135:44950
> tests/master_tests.cpp:249: Failure
> Actual function call count doesn't match EXPECT_CALL(exec, shutdown(_))...
>          Expected: to be called once
>            Actual: never called - unsatisfied and active
> I0224 18:51:32.741822 30229 master.cpp:787] Master terminating
> I0224 18:51:32.750730 30234 slave.cpp:2915] master@192.168.122.135:44950 exited
> W0224 18:51:32.751667 30234 slave.cpp:2918] Master disconnected! Waiting for a new master to be elected
> I0224 18:51:32.769243 30213 process.cpp:2117] Dropped / Lost event for PID: master@192.168.122.135:44950
> I0224 18:51:32.770519 30213 process.cpp:2117] Dropped / Lost event for PID: master@192.168.122.135:44950
> *** Aborted at 1424832692 (unix time) try "date -d @1424832692" if you are using GNU date ***
> PC: @          0x4612540 (unknown)
> *** SIGSEGV (@0x4612540) received by PID 30213 (TID 0x7fc7f2fa6880) from PID 73475392; stack trace: ***
>     @       0x3aa2a0efa0 (unknown)
>     @          0x4612540 (unknown)
> make[3]: *** [check-local] Segmentation fault (core dumped)
> {code}
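
For orientation, the failing expectation reported above at tests/master_tests.cpp:249 and the 15-second wait at line 259 correspond to a gmock/libprocess pattern along these lines (an illustrative fragment, not the actual test source; it assumes the usual Mesos test helpers MockExecutor, FutureSatisfy, and AWAIT_READY):

{code}
// Illustrative fragment only -- not the actual MasterTest source.
MockExecutor exec(DEFAULT_EXECUTOR_ID);

Future<Nothing> shutdown;
EXPECT_CALL(exec, shutdown(_))      // the "never called" expectation (line 249)
  .WillOnce(FutureSatisfy(&shutdown));

driver.stop();                      // unregisters the framework; the master asks the
                                    // slave to shut down the executor

AWAIT_READY(shutdown);              // the 15-second wait that times out (line 259):
                                    // with a 0s grace period the executor can be
                                    // killed before Executor::shutdown() ever runs
{code}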



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)