You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Vinod Kone (JIRA)" <ji...@apache.org> on 2019/02/15 22:46:00 UTC

[jira] [Commented] (MESOS-8892) MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky

    [ https://issues.apache.org/jira/browse/MESOS-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769822#comment-16769822 ] 

Vinod Kone commented on MESOS-8892:
-----------------------------------

Observed this on 1.6.x branch

{code}
[ RUN      ] MasterSlaveReconciliationTest.ReconcileDroppedOperation
I0215 21:36:18.921594  4052 cluster.cpp:172] Creating default 'local' authorizer
I0215 21:36:18.922894  4057 master.cpp:465] Master 21d3c979-83c3-4141-9a3a-635fd550d45a (ip-172-16-10-236.ec2.internal) started on 172.16.10.236:36326
I0215 21:36:18.922915  4057 master.cpp:468] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator
="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwri
te="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/exYTvt/credentials" --filter_gpu_resources="true" --framework_s
orter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize=
"true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per
_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memo
ry" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --reg
istry_strict="false" --require_agent_domain="false" --role_sorter="drf" --root_submissions="true" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/exY
Tvt/master" --zk_session_timeout="10secs"
I0215 21:36:18.923121  4057 master.cpp:517] Master only allowing authenticated frameworks to register
I0215 21:36:18.923393  4057 master.cpp:523] Master only allowing authenticated agents to register
I0215 21:36:18.923408  4057 master.cpp:529] Master only allowing authenticated HTTP frameworks to register
I0215 21:36:18.923414  4057 credentials.hpp:37] Loading credentials for authentication from '/tmp/exYTvt/credentials'
I0215 21:36:18.923651  4057 master.cpp:573] Using default 'crammd5' authenticator
I0215 21:36:18.923777  4057 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0215 21:36:18.923904  4057 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0215 21:36:18.924266  4057 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0215 21:36:18.924465  4057 master.cpp:654] Authorization enabled
I0215 21:36:18.924823  4056 hierarchical.cpp:179] Initialized hierarchical allocator process
I0215 21:36:18.927826  4058 whitelist_watcher.cpp:77] No whitelist given
I0215 21:36:18.928741  4054 master.cpp:2176] Elected as the leading master!
I0215 21:36:18.928759  4054 master.cpp:1711] Recovering from registrar
I0215 21:36:18.928800  4054 registrar.cpp:339] Recovering registrar
I0215 21:36:18.929002  4054 registrar.cpp:383] Successfully fetched the registry (0B) in 132096ns
I0215 21:36:18.929033  4054 registrar.cpp:487] Applied 1 operations in 7184ns; attempting to update the registry
I0215 21:36:18.929154  4058 registrar.cpp:544] Successfully updated the registry in 108032ns
I0215 21:36:18.929232  4058 registrar.cpp:416] Successfully recovered registrar
I0215 21:36:18.929361  4055 master.cpp:1825] Recovered 0 agents from the registry (176B); allowing 10mins for agents to reregister
I0215 21:36:18.929415  4055 hierarchical.cpp:217] Skipping recovery of hierarchical allocator: nothing to recover
W0215 21:36:18.931118  4052 process.cpp:2829] Attempted to spawn already running process files@172.16.10.236:36326
I0215 21:36:18.931596  4052 containerizer.cpp:300] Using isolation { environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
I0215 21:36:18.934453  4052 linux_launcher.cpp:147] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I0215 21:36:18.934859  4052 provisioner.cpp:299] Using default backend 'aufs'
I0215 21:36:18.935410  4052 cluster.cpp:460] Creating default 'local' authorizer
I0215 21:36:18.936164  4060 slave.cpp:259] Mesos agent started on (230)@172.16.10.236:36326
W0215 21:36:18.936399  4052 process.cpp:2829] Attempted to spawn already running process version@172.16.10.236:36326
I0215 21:36:18.936187  4060 slave.cpp:260] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/exYTvt/GHfic5/store/appc" --authenticate
_http_executors="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authentication_
timeout_max="1mins" --authentication_timeout_min="5secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/exYTvt/GHfic5/credential" --default_role="*" --disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/exYTvt/GHfic5/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/exYTvt/GHfic5/fetch" --fetcher_cache_size="2GB" --fetcher_stall_timeout="1mins" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname_lookup="true" --http_command_executor="false" --http_credentials="/tmp/exYTvt/GHfic5/http_credentials" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --jwt_secret_key="/tmp/exYTvt/GHfic5/jwt_secret_key" --launcher="linux" --launcher_dir="/home/ubuntu/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mesos-ec2-ubuntu-14.04/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --memory_profiling="false" --network_cni_root_dir_persist="false" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:2;gpus:0;mem:1024;disk:1024;ports:[31000-32000]" --revocable_cpu_low_priority="true" --runtime_dir="/tmp/MasterSlaveReconciliationTest_ReconcileDroppedOperati:
I0215 21:36:18.936803  4060 credentials.hpp:86] Loading credential for authentication from '/tmp/exYTvt/GHfic5/credential'
I0215 21:36:18.937058  4060 slave.cpp:292] Agent using credential for: test-principal
I0215 21:36:18.937243  4060 credentials.hpp:37] Loading credentials for authentication from '/tmp/exYTvt/GHfic5/http_credentials'
I0215 21:36:18.937469  4060 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-executor'
I0215 21:36:18.937640  4060 http.cpp:980] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-executor'
I0215 21:36:18.937775  4052 sched.cpp:232] Version: 1.6.2
I0215 21:36:18.938051  4054 sched.cpp:336] New master detected at master@172.16.10.236:36326
I0215 21:36:18.938086  4054 sched.cpp:401] Authenticating with master master@172.16.10.236:36326
I0215 21:36:18.938094  4054 sched.cpp:408] Using default CRAM-MD5 authenticatee
I0215 21:36:18.938236  4058 authenticatee.cpp:121] Creating new client SASL connection
I0215 21:36:18.938653  4060 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readonly'
I0215 21:36:18.938658  4058 master.cpp:9598] Authenticating scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.938818  4060 http.cpp:980] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readonly'
I0215 21:36:18.938910  4057 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(473)@172.16.10.236:36326
I0215 21:36:18.939082  4057 authenticator.cpp:98] Creating new server SASL connection
I0215 21:36:18.939328  4060 http.cpp:959] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readwrite'
I0215 21:36:18.939360  4060 http.cpp:980] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readwrite'
I0215 21:36:18.939591  4060 disk_profile_adaptor.cpp:80] Creating default disk profile adaptor module
I0215 21:36:18.939776  4058 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0215 21:36:18.939793  4058 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0215 21:36:18.939973  4057 authenticator.cpp:204] Received SASL authentication start
I0215 21:36:18.940114  4057 authenticator.cpp:326] Authentication requires more steps
I0215 21:36:18.940212  4058 authenticatee.cpp:259] Received SASL authentication step
I0215 21:36:18.940357  4057 authenticator.cpp:232] Received SASL authentication step
I0215 21:36:18.940452  4057 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0215 21:36:18.940551  4057 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0215 21:36:18.940645  4057 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0215 21:36:18.940759  4057 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0215 21:36:18.940874  4057 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.940984  4057 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.941083  4057 authenticator.cpp:318] Authentication success
I0215 21:36:18.941223  4058 authenticatee.cpp:299] Authentication success
I0215 21:36:18.941262  4058 master.cpp:9630] Successfully authenticated principal 'test-principal' at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.941284  4058 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(473)@172.16.10.236:36326
I0215 21:36:18.941448  4053 sched.cpp:513] Successfully authenticated with master master@172.16.10.236:36326
I0215 21:36:18.941666  4053 sched.cpp:817] Sending SUBSCRIBE call to master@172.16.10.236:36326
I0215 21:36:18.941814  4053 sched.cpp:850] Will retry registration in 631.34748ms if necessary
I0215 21:36:18.941965  4054 master.cpp:2941] Received SUBSCRIBE call for framework 'default' at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.942229  4054 master.cpp:2248] Authorizing framework principal 'test-principal' to receive offers for roles '{ default-role }'
I0215 21:36:18.942422  4054 master.cpp:3022] Subscribing framework default with checkpointing disabled and capabilities [ MULTI_ROLE, RESERVATION_REFINEMENT ]
I0215 21:36:18.942665  4054 master.cpp:9828] Adding framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326 with roles {  } suppressed
I0215 21:36:18.942977  4058 hierarchical.cpp:301] Added framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:18.943023  4058 hierarchical.cpp:1522] Performed allocation for 0 agents in 10842ns
I0215 21:36:18.943249  4058 sched.cpp:744] Framework registered with 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:18.943275  4058 sched.cpp:758] Scheduler::registered took 9502ns
I0215 21:36:18.943986  4060 slave.cpp:607] Agent resources: [{"name":"cpus","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":1024.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0215 21:36:18.944051  4060 slave.cpp:615] Agent attributes: [  ]
I0215 21:36:18.944057  4060 slave.cpp:624] Agent hostname: ip-172-16-10-236.ec2.internal
I0215 21:36:18.944169  4058 task_status_update_manager.cpp:181] Pausing sending task status updates
I0215 21:36:18.944509  4060 state.cpp:66] Recovering state from '/tmp/MasterSlaveReconciliationTest_ReconcileDroppedOperation_Ibpmpl/meta'
I0215 21:36:18.944571  4060 slave.cpp:6922] Finished recovering checkpointed state from '/tmp/MasterSlaveReconciliationTest_ReconcileDroppedOperation_Ibpmpl/meta', beginning agent recovery
I0215 21:36:18.944607  4060 task_status_update_manager.cpp:207] Recovering task status update manager
I0215 21:36:18.944643  4058 containerizer.cpp:749] Recovering Mesos containers
I0215 21:36:18.944687  4058 linux_launcher.cpp:300] Recovering Linux launcher
I0215 21:36:18.945267  4058 containerizer.cpp:1056] Recovering isolators
I0215 21:36:18.945441  4058 containerizer.cpp:1095] Recovering provisioner
I0215 21:36:18.945570  4058 provisioner.cpp:495] Provisioner recovery complete
I0215 21:36:18.945850  4055 slave.cpp:7151] Recovering executors
I0215 21:36:18.945884  4055 slave.cpp:7304] Finished recovery
I0215 21:36:18.946213  4055 slave.cpp:1267] New master detected at master@172.16.10.236:36326
I0215 21:36:18.946228  4058 task_status_update_manager.cpp:181] Pausing sending task status updates
I0215 21:36:18.946313  4055 slave.cpp:1332] Detecting new master
I0215 21:36:18.949466  4056 slave.cpp:1359] Authenticating with master master@172.16.10.236:36326
I0215 21:36:18.949492  4056 slave.cpp:1368] Using default CRAM-MD5 authenticatee
I0215 21:36:18.949568  4056 authenticatee.cpp:121] Creating new client SASL connection
I0215 21:36:18.950022  4056 master.cpp:9598] Authenticating slave(230)@172.16.10.236:36326
I0215 21:36:18.950577  4056 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(474)@172.16.10.236:36326
I0215 21:36:18.951122  4056 authenticator.cpp:98] Creating new server SASL connection
I0215 21:36:18.951697  4056 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0215 21:36:18.951717  4056 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0215 21:36:18.951750  4053 authenticator.cpp:204] Received SASL authentication start
I0215 21:36:18.951781  4053 authenticator.cpp:326] Authentication requires more steps
I0215 21:36:18.951808  4056 authenticatee.cpp:259] Received SASL authentication step
I0215 21:36:18.951838  4056 authenticator.cpp:232] Received SASL authentication step
I0215 21:36:18.951850  4056 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0215 21:36:18.951858  4056 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0215 21:36:18.951865  4056 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0215 21:36:18.951874  4056 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0215 21:36:18.951880  4056 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.951885  4056 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.951896  4056 authenticator.cpp:318] Authentication success
I0215 21:36:18.951920  4054 authenticatee.cpp:299] Authentication success
I0215 21:36:18.955675  4053 master.cpp:9630] Successfully authenticated principal 'test-principal' at slave(230)@172.16.10.236:36326
I0215 21:36:18.955706  4053 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(474)@172.16.10.236:36326
I0215 21:36:18.955812  4054 slave.cpp:1459] Successfully authenticated with master master@172.16.10.236:36326
I0215 21:36:18.955981  4054 slave.cpp:1890] Will retry registration in 7.639052ms if necessary
I0215 21:36:18.956200  4057 master.cpp:6587] Received register agent message from slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.956362  4057 master.cpp:3870] Authorizing agent providing resources 'cpus:2; mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
I0215 21:36:18.956614  4053 master.cpp:6654] Authorized registration of agent at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.956676  4053 master.cpp:6769] Registering agent at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) with id 21d3c979-83c3-4141-9a3a-635fd550d45a-S0
I0215 21:36:18.956806  4059 registrar.cpp:487] Applied 1 operations in 44437ns; attempting to update the registry
I0215 21:36:18.957008  4060 registrar.cpp:544] Successfully updated the registry in 173824ns
I0215 21:36:18.957057  4060 master.cpp:6817] Admitted agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.957332  4060 master.cpp:6862] Registered agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) with cpus:2; mem:1024; disk:1024; ports:[31000-32000]
I0215 21:36:18.957406  4058 hierarchical.cpp:578] Added agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 (ip-172-16-10-236.ec2.internal) with cpus:2; mem:1024; disk:1024; ports:[31000-32000] (allocated: {})
I0215 21:36:18.957417  4060 slave.cpp:1492] Registered with master master@172.16.10.236:36326; given agent ID 21d3c979-83c3-4141-9a3a-635fd550d45a-S0
I0215 21:36:18.957481  4053 task_status_update_manager.cpp:188] Resuming sending task status updates
I0215 21:36:18.957701  4058 hierarchical.cpp:1522] Performed allocation for 1 agents in 196716ns
I0215 21:36:18.957811  4060 slave.cpp:1512] Checkpointing SlaveInfo to '/tmp/MasterSlaveReconciliationTest_ReconcileDroppedOperation_Ibpmpl/meta/slaves/21d3c979-83c3-4141-9a3a-635fd550d45a-S0/slave.info'
I0215 21:36:18.957830  4058 master.cpp:9430] Sending 1 offers to framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.957980  4058 sched.cpp:914] Scheduler::resourceOffers took 38114ns
I0215 21:36:18.958674  4058 master.cpp:11391] Removing offer 21d3c979-83c3-4141-9a3a-635fd550d45a-O0
I0215 21:36:18.958787  4058 master.cpp:4369] Processing ACCEPT call for offers: [ 21d3c979-83c3-4141-9a3a-635fd550d45a-O0 ] on agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) for framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.958845  4058 master.cpp:3653] Authorizing principal 'test-principal' to reserve resources 'cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]'
I0215 21:36:18.959198  4060 slave.cpp:1561] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"\/KII43v1SiWj6P8mf2\/jzA=="},"slave_id":{"value":"21d3c979-83c3-4141-9a3a-635fd550d45a-S0"},"update_oversubscribed_resources":true}
I0215 21:36:18.959280  4058 master.cpp:4733] Applying RESERVE operation for resources [{"allocation_info":{"role":"default-role"},"name":"cpus","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":2.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"mem","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"disk","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"type":"RANGES"}] from framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326 to agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.959903  4058 master.cpp:11289] Sending operation '' (uuid: 479b8ad6-c107-419c-8297-5e97e6b352e7) to agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.960083  4058 master.cpp:7812] Received update of agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) with total oversubscribed resources {}
W0215 21:36:18.960137  4058 master.cpp:7986] Performing explicit reconciliation with agent for known operation 479b8ad6-c107-419c-8297-5e97e6b352e7 since it was not present in original reconciliation message from agent
I0215 21:36:18.960264  4055 master.cpp:11039] Updating the state of operation '' (uuid: 479b8ad6-c107-419c-8297-5e97e6b352e7) for framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (latest state: OPERATION_PENDING, status update state: OPERATION_DROPPED)
I0215 21:36:18.960857  4059 slave.cpp:1267] New master detected at master@172.16.10.236:36326
I0215 21:36:18.961047  4059 slave.cpp:1332] Detecting new master
I0215 21:36:18.960906  4057 task_status_update_manager.cpp:181] Pausing sending task status updates
I0215 21:36:18.960995  4058 hierarchical.cpp:930] Updated allocation of framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 on agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 from cpus(allocated: default-role):2; mem(allocated: default-role):1024; disk(allocated: default-role):1024; ports(allocated: default-role):[31000-32000] to disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]
I0215 21:36:18.961328  4058 hierarchical.cpp:1197] Recovered disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000] (total: disk(reservations: [(DYNAMIC,default-role,test-principal)]):1024; cpus(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000], allocated: {}) on agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 from framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:18.961364  4058 hierarchical.cpp:1243] Framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 filtered agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 for 5secs
I0215 21:36:18.961469  4058 hierarchical.cpp:674] Agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 (ip-172-16-10-236.ec2.internal) updated with total resources disk(reservations: [(DYNAMIC,default-role,test-principal)]):1024; cpus(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]
I0215 21:36:18.970415  4058 slave.cpp:1359] Authenticating with master master@172.16.10.236:36326
I0215 21:36:18.970441  4058 slave.cpp:1368] Using default CRAM-MD5 authenticatee
I0215 21:36:18.970512  4058 authenticatee.cpp:121] Creating new client SASL connection
I0215 21:36:18.970950  4058 master.cpp:9598] Authenticating slave(230)@172.16.10.236:36326; clearing previous authentication
I0215 21:36:18.971081  4058 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(475)@172.16.10.236:36326
I0215 21:36:18.971220  4059 authenticator.cpp:98] Creating new server SASL connection
I0215 21:36:18.971645  4059 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
I0215 21:36:18.971665  4059 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
I0215 21:36:18.971695  4059 authenticator.cpp:204] Received SASL authentication start
I0215 21:36:18.971730  4059 authenticator.cpp:326] Authentication requires more steps
I0215 21:36:18.971756  4059 authenticatee.cpp:259] Received SASL authentication step
I0215 21:36:18.971787  4059 authenticator.cpp:232] Received SASL authentication step
I0215 21:36:18.971801  4059 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
I0215 21:36:18.971807  4059 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
I0215 21:36:18.971817  4059 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
I0215 21:36:18.971825  4059 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-236.ec2.internal' server FQDN: 'ip-172-16-10-236.ec2.internal' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
I0215 21:36:18.971830  4059 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.971837  4059 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0215 21:36:18.971848  4059 authenticator.cpp:318] Authentication success
I0215 21:36:18.971884  4059 authenticatee.cpp:299] Authentication success
I0215 21:36:18.971920  4057 master.cpp:9630] Successfully authenticated principal 'test-principal' at slave(230)@172.16.10.236:36326
I0215 21:36:18.971968  4057 slave.cpp:1459] Successfully authenticated with master master@172.16.10.236:36326
I0215 21:36:18.972054  4059 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(475)@172.16.10.236:36326
I0215 21:36:18.972090  4057 slave.cpp:1890] Will retry registration in 15.324248ms if necessary
I0215 21:36:18.972134  4053 master.cpp:6941] Received reregister agent message from agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.972215  4053 master.cpp:3870] Authorizing agent providing resources 'cpus:2; mem:1024; disk:1024; ports:[31000-32000]' with principal 'test-principal'
I0215 21:36:18.972426  4057 master.cpp:7033] Authorized re-registration of agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.972462  4057 master.cpp:7117] Agent is already marked as registered: 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:18.972503  4057 master.cpp:7485] Registry updated for slave 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326(ip-172-16-10-236.ec2.internal)
I0215 21:36:18.972612  4057 slave.cpp:1598] Re-registered with master master@172.16.10.236:36326
I0215 21:36:18.972668  4057 slave.cpp:1643] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"\/KII43v1SiWj6P8mf2\/jzA=="},"slave_id":{"value":"21d3c979-83c3-4141-9a3a-635fd550d45a-S0"},"update_oversubscribed_resources":true}
I0215 21:36:18.972713  4057 task_status_update_manager.cpp:188] Resuming sending task status updates
I0215 21:36:18.972753  4053 hierarchical.cpp:674] Agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 (ip-172-16-10-236.ec2.internal) updated with total resources cpus:2; mem:1024; disk:1024; ports:[31000-32000]
I0215 21:36:18.972795  4057 master.cpp:7812] Received update of agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) with total oversubscribed resources {}
I0215 21:36:18.972828  4057 master.cpp:7921] Ignoring update on agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) as it reports no changes
I0215 21:36:18.972972  4053 hierarchical.cpp:1522] Performed allocation for 1 agents in 169807ns
I0215 21:36:18.973088  4053 master.cpp:9430] Sending 1 offers to framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:18.973199  4053 sched.cpp:914] Scheduler::resourceOffers took 9895ns
I0215 21:36:19.926244  4055 hierarchical.cpp:1522] Performed allocation for 1 agents in 65186ns
I0215 21:36:20.927438  4058 hierarchical.cpp:1522] Performed allocation for 1 agents in 58478ns
I0215 21:36:21.927950  4060 hierarchical.cpp:1522] Performed allocation for 1 agents in 59592ns
I0215 21:36:22.928951  4059 hierarchical.cpp:1522] Performed allocation for 1 agents in 60565ns
I0215 21:36:23.929944  4054 hierarchical.cpp:1522] Performed allocation for 1 agents in 60438ns
I0215 21:36:24.930943  4053 hierarchical.cpp:1522] Performed allocation for 1 agents in 57440ns
I0215 21:36:25.931985  4055 hierarchical.cpp:1522] Performed allocation for 1 agents in 60381ns
I0215 21:36:26.932894  4058 hierarchical.cpp:1522] Performed allocation for 1 agents in 61231ns
I0215 21:36:27.933981  4054 hierarchical.cpp:1522] Performed allocation for 1 agents in 59727ns
I0215 21:36:28.934787  4058 hierarchical.cpp:1522] Performed allocation for 1 agents in 59511ns
hierarchical.cpp:1522] Performed allocation for 1 agents in 61505ns
I0215 21:36:31.937150  4060 hierarchical.cpp:1522] Performed allocation for 1 agents in 61542ns
I0215 21:36:32.938012  4057 hierarchical.cpp:1522] Performed allocation for 1 agents in 59636ns
I0215 21:36:33.939301  4057 hierarchical.cpp:1522] Performed allocation for 1 agents in 62049ns
../../src/tests/master_slave_reconciliation_tests.cpp:414: Failure
Failed to wait 15secs for reconcileOperationsMessage
I0215 21:36:33.962352  4052 slave.cpp:922] Agent terminating
I0215 21:36:33.962371  4053 master.cpp:1439] Framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326 disconnected
I0215 21:36:33.963155  4053 master.cpp:3317] Deactivating framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:33.963495  4053 master.cpp:11391] Removing offer 21d3c979-83c3-4141-9a3a-635fd550d45a-O1
I0215 21:36:33.963515  4053 master.cpp:3294] Disconnecting framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:33.963527  4053 master.cpp:1454] Giving framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326 0ns to failover
I0215 21:36:33.963565  4053 master.cpp:1324] Agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal) disconnected
I0215 21:36:33.963577  4053 master.cpp:3354] Disconnecting agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:33.963587  4053 master.cpp:3373] Deactivating agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 at slave(230)@172.16.10.236:36326 (ip-172-16-10-236.ec2.internal)
I0215 21:36:33.963776  4054 hierarchical.cpp:409] Deactivated framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:33.963985  4054 hierarchical.cpp:1197] Recovered cpus(allocated: default-role):2; mem(allocated: default-role):1024; disk(allocated: default-role):1024; ports(allocated: default-role):[31000-32000] (total: cpus:2; mem:1024; disk:1024; ports:[31000-32000], allocated: {}) on agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 from framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:33.964031  4054 hierarchical.cpp:771] Agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0 deactivated
I0215 21:36:33.964300  4059 master.cpp:9243] Framework failover timeout, removing framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:33.964319  4059 master.cpp:10142] Removing framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000 (default) at scheduler-26d16957-4fe3-4c71-a551-8a11d9d84917@172.16.10.236:36326
I0215 21:36:33.964576  4059 hierarchical.cpp:348] Removed framework 21d3c979-83c3-4141-9a3a-635fd550d45a-0000
I0215 21:36:33.965606  4052 master.cpp:1166] Master terminating
I0215 21:36:33.965875  4055 hierarchical.cpp:614] Removed agent 21d3c979-83c3-4141-9a3a-635fd550d45a-S0
../../3rdparty/libprocess/include/process/gmock.hpp:466: Failure
Actual function call count doesn't match EXPECT_CALL(filter->mock, filter(testing::A<const MessageEvent&>()))...
    Expected args: message matcher (8-byte object <48-A0 07-B4 1B-7F 00-00>, 1-byte object <A0>, 1-byte object <48>)
         Expected: to be called once
           Actual: never called - unsatisfied and active
[  FAILED  ] MasterSlaveReconciliationTest.ReconcileDroppedOperation (15051 ms)
{code}

> MasterSlaveReconciliationTest.ReconcileDroppedOperation is flaky
> ----------------------------------------------------------------
>
>                 Key: MESOS-8892
>                 URL: https://issues.apache.org/jira/browse/MESOS-8892
>             Project: Mesos
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.6.0
>            Reporter: Greg Mann
>            Assignee: Benjamin Bannier
>            Priority: Major
>              Labels: mesosphere
>             Fix For: 1.7.0
>
>         Attachments: MasterSlaveReconciliationTest.ReconcileDroppedOperation.txt
>
>
> This was observed on a Debian 9 SSL/GRPC-enabled build. It appears that a poorly-timed {{UpdateSlaveMessage}} leads to the operation reconciliation occurring before the expectation for the {{ReconcileOperationsMessage}} is registered:
> {code}
> I0508 00:11:09.700815 22498 master.cpp:4362] Processing ACCEPT call for offers: [ f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-O0 ] on agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost) for framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 (default) at scheduler-b0f55e01-2f6f-42c8-8614-901036acfc31@127.0.0.1:36309
> I0508 00:11:09.700870 22498 master.cpp:3602] Authorizing principal 'test-principal' to reserve resources 'cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]'
> I0508 00:11:09.701228 22493 master.cpp:4725] Applying RESERVE operation for resources [{"allocation_info":{"role":"default-role"},"name":"cpus","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":2.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"mem","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"disk","reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"scalar":{"value":1024.0},"type":"SCALAR"},{"allocation_info":{"role":"default-role"},"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"reservations":[{"principal":"test-principal","role":"default-role","type":"DYNAMIC"}],"type":"RANGES"}] from framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 (default) at scheduler-b0f55e01-2f6f-42c8-8614-901036acfc31@127.0.0.1:36309 to agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost)
> I0508 00:11:09.701498 22493 master.cpp:11265] Sending operation '' (uuid: 81dffb62-6e75-4c6c-a97b-41c92c58d6a7) to agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost)
> I0508 00:11:09.701627 22494 slave.cpp:1564] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"0HeA06ftS6m76SNoNZNPag=="},"slave_id":{"value":"f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0"},"update_oversubscribed_resources":true}
> I0508 00:11:09.701848 22494 master.cpp:7800] Received update of agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 at slave(212)@127.0.0.1:36309 (localhost) with total oversubscribed resources {}
> W0508 00:11:09.701905 22494 master.cpp:7974] Performing explicit reconciliation with agent for known operation 81dffb62-6e75-4c6c-a97b-41c92c58d6a7 since it was not present in original reconciliation message from agent
> I0508 00:11:09.702085 22494 master.cpp:11015] Updating the state of operation '' (uuid: 81dffb62-6e75-4c6c-a97b-41c92c58d6a7) for framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 (latest state: OPERATION_PENDING, status update state: OPERATION_DROPPED)
> I0508 00:11:09.702239 22491 hierarchical.cpp:925] Updated allocation of framework f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-0000 on agent f850080d-9c7a-4ff7-8d4b-9e54aa0418cb-S0 from cpus(allocated: default-role):2; mem(allocated: default-role):1024; disk(allocated: default-role):1024; ports(allocated: default-role):[31000-32000] to disk(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; cpus(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):2; mem(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):1024; ports(allocated: default-role)(reservations: [(DYNAMIC,default-role,test-principal)]):[31000-32000]
> I0508 00:11:09.702267 22493 slave.cpp:1274] New master detected at master@127.0.0.1:36309
> I0508 00:11:09.702306 22495 task_status_update_manager.cpp:181] Pausing sending task status updates
> I0508 00:11:09.702337 22493 slave.cpp:1329] Detecting new master
> {code}
> Full log is attached as {{MasterSlaveReconciliationTest.ReconcileDroppedOperation.txt}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)