You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2018/11/29 20:51:00 UTC

[jira] [Commented] (MESOS-3235) FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky

    [ https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16703805#comment-16703805 ] 

Till Toenshoff commented on MESOS-3235:
---------------------------------------

Symptoms for failures nowadays look different;

{noformat}
19:40:52  [ RUN      ] FetcherCacheHttpTest.HttpCachedConcurrent
19:40:52  I1129 19:40:52.498800 27028 cluster.cpp:173] Creating default 'local' authorizer
19:40:52  I1129 19:40:52.499980 27050 master.cpp:414] Master ce756623-c356-49a6-b58d-a59616f2e201 (ip-172-16-10-61.ec2.internal) started on 172.16.10.61:36069
19:40:52  I1129 19:40:52.500003 27050 master.cpp:417] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/p108BH/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --publish_per_framework_metrics="true" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --root_submissions="true" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/p108BH/master" --zk_session_timeout="10secs"
19:40:52  I1129 19:40:52.500200 27050 master.cpp:466] Master only allowing authenticated frameworks to register
19:40:52  I1129 19:40:52.500211 27050 master.cpp:472] Master only allowing authenticated agents to register
19:40:52  I1129 19:40:52.500298 27050 master.cpp:478] Master only allowing authenticated HTTP frameworks to register
19:40:52  I1129 19:40:52.500308 27050 credentials.hpp:37] Loading credentials for authentication from '/tmp/p108BH/credentials'
19:40:52  I1129 19:40:52.500483 27050 master.cpp:522] Using default 'crammd5' authenticator
19:40:52  I1129 19:40:52.500535 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
19:40:52  I1129 19:40:52.500576 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
19:40:52  I1129 19:40:52.500602 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
19:40:52  I1129 19:40:52.500634 27050 master.cpp:603] Authorization enabled
19:40:52  I1129 19:40:52.500835 27049 hierarchical.cpp:175] Initialized hierarchical allocator process
19:40:52  I1129 19:40:52.500865 27049 whitelist_watcher.cpp:77] No whitelist given
19:40:52  I1129 19:40:52.501417 27050 master.cpp:2089] Elected as the leading master!
19:40:52  I1129 19:40:52.501437 27050 master.cpp:1644] Recovering from registrar
19:40:52  I1129 19:40:52.501475 27050 registrar.cpp:339] Recovering registrar
19:40:52  I1129 19:40:52.501665 27049 registrar.cpp:383] Successfully fetched the registry (0B) in 174080ns
19:40:52  I1129 19:40:52.501708 27049 registrar.cpp:487] Applied 1 operations in 9065ns; attempting to update the registry
19:40:52  I1129 19:40:52.501972 27055 registrar.cpp:544] Successfully updated the registry in 216832ns
19:40:52  I1129 19:40:52.502024 27055 registrar.cpp:416] Successfully recovered registrar
19:40:52  I1129 19:40:52.502141 27055 master.cpp:1758] Recovered 0 agents from the registry (172B); allowing 10mins for agents to reregister
19:40:52  I1129 19:40:52.502243 27055 hierarchical.cpp:215] Skipping recovery of hierarchical allocator: nothing to recover
19:40:52  W1129 19:40:52.502579 27028 process.cpp:2829] Attempted to spawn already running process version@172.16.10.61:36069
19:40:52  I1129 19:40:52.503276 27028 containerizer.cpp:305] Using isolation { environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
19:40:52  I1129 19:40:52.505079 27028 linux_launcher.cpp:144] Using /cgroup/freezer as the freezer hierarchy for the Linux launcher
19:40:52  I1129 19:40:52.505468 27028 provisioner.cpp:298] Using default backend 'copy'
19:40:52  W1129 19:40:52.506837 27028 process.cpp:2829] Attempted to spawn already running process files@172.16.10.61:36069
19:40:52  I1129 19:40:52.507014 27028 cluster.cpp:485] Creating default 'local' authorizer
19:40:52  I1129 19:40:52.507437 27050 slave.cpp:268] Mesos agent started on (85)@172.16.10.61:36069
19:40:52  I1129 19:40:52.507453 27050 slave.cpp:269] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/store/appc" --authenticate_http_executors="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authentication_timeout_max="1mins" --authentication_timeout_min="5secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_destroy_timeout="1mins" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/credential" --default_role="*" --disallow_sharing_agent_pid_namespace="false" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns" --docker_store_dir="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/store/docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/fetch" --fetcher_cache_size="2GB" --fetcher_stall_timeout="1mins" --frameworks_home="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/frameworks" --gc_delay="1weeks" --gc_disk_headroom="0.1" --gc_non_executor_container_sandboxes="false" --help="false" --hostname_lookup="true" --http_command_executor="false" --http_credentials="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/http_credentials" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem" --jwt_secret_key="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/jwt_secret_key" --launcher="linux" --launcher_dir="/home/centos/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mesos-ec2-centos-6/mesos/build/src" --logbufsecs="0" --logging_level="INFO" --max_completed_executors_per_framework="150" --memory_profiling="false" --network_cni_metrics="true" --oversubscribed_resources_interval="15secs" --perf_duration="10secs" --perf_interval="1mins" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="10ms" --resources="cpus:1000;mem:1000" --revocable_cpu_low_priority="true" --runtime_dir="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn" --sandbox_directory="/mnt/mesos/sandbox" --strict="true" --switch_user="true" --systemd_enable_support="true" --systemd_runtime_directory="/run/systemd/system" --version="false" --work_dir="/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_UedRux" --zk_session_timeout="10secs"
19:40:52  I1129 19:40:52.507632 27050 credentials.hpp:86] Loading credential for authentication from '/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/credential'
19:40:52  I1129 19:40:52.507694 27050 slave.cpp:301] Agent using credential for: test-principal
19:40:52  I1129 19:40:52.507704 27050 credentials.hpp:37] Loading credentials for authentication from '/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_zgSznn/http_credentials'
19:40:52  I1129 19:40:52.507792 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-executor'
19:40:52  I1129 19:40:52.507823 27050 http.cpp:1038] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-executor'
19:40:52  I1129 19:40:52.507867 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readonly'
19:40:52  I1129 19:40:52.507889 27050 http.cpp:1038] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readonly'
19:40:52  I1129 19:40:52.507925 27050 http.cpp:1017] Creating default 'basic' HTTP authenticator for realm 'mesos-agent-readwrite'
19:40:52  I1129 19:40:52.507946 27050 http.cpp:1038] Creating default 'jwt' HTTP authenticator for realm 'mesos-agent-readwrite'
19:40:52  I1129 19:40:52.508002 27050 disk_profile_adaptor.cpp:80] Creating default disk profile adaptor module
19:40:52  I1129 19:40:52.508512 27050 slave.cpp:616] Agent resources: [{"name":"cpus","scalar":{"value":1000.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":1000.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":35068.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
19:40:52  I1129 19:40:52.508566 27050 slave.cpp:624] Agent attributes: [  ]
19:40:52  I1129 19:40:52.508574 27050 slave.cpp:633] Agent hostname: ip-172-16-10-61.ec2.internal
19:40:52  I1129 19:40:52.508677 27049 task_status_update_manager.cpp:181] Pausing sending task status updates
19:40:52  I1129 19:40:52.508895 27050 state.cpp:66] Recovering state from '/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_UedRux/meta'
19:40:52  I1129 19:40:52.509006 27051 slave.cpp:6914] Finished recovering checkpointed state from '/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_UedRux/meta', beginning agent recovery
19:40:52  I1129 19:40:52.509119 27051 task_status_update_manager.cpp:207] Recovering task status update manager
19:40:52  I1129 19:40:52.509207 27049 containerizer.cpp:727] Recovering Mesos containers
19:40:52  I1129 19:40:52.509256 27049 linux_launcher.cpp:286] Recovering Linux launcher
19:40:52  I1129 19:40:52.509380 27049 containerizer.cpp:1053] Recovering isolators
19:40:52  I1129 19:40:52.509629 27054 containerizer.cpp:1092] Recovering provisioner
19:40:52  I1129 19:40:52.509738 27054 provisioner.cpp:494] Provisioner recovery complete
19:40:52  I1129 19:40:52.509932 27056 composing.cpp:339] Finished recovering all containerizers
19:40:52  I1129 19:40:52.509981 27056 slave.cpp:7143] Recovering executors
19:40:52  I1129 19:40:52.510002 27056 slave.cpp:7296] Finished recovery
19:40:52  I1129 19:40:52.510367 27056 task_status_update_manager.cpp:181] Pausing sending task status updates
19:40:52  I1129 19:40:52.510372 27052 slave.cpp:1259] New master detected at master@172.16.10.61:36069
19:40:52  I1129 19:40:52.510411 27052 slave.cpp:1324] Detecting new master
19:40:52  I1129 19:40:52.516607 27055 slave.cpp:1351] Authenticating with master master@172.16.10.61:36069
19:40:52  I1129 19:40:52.516636 27055 slave.cpp:1360] Using default CRAM-MD5 authenticatee
19:40:52  I1129 19:40:52.516714 27055 authenticatee.cpp:121] Creating new client SASL connection
19:40:52  I1129 19:40:52.516861 27055 master.cpp:9649] Authenticating slave(85)@172.16.10.61:36069
19:40:52  I1129 19:40:52.516918 27055 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(195)@172.16.10.61:36069
19:40:52  I1129 19:40:52.516978 27055 authenticator.cpp:98] Creating new server SASL connection
19:40:52  I1129 19:40:52.517082 27055 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
19:40:52  I1129 19:40:52.517115 27055 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
19:40:52  I1129 19:40:52.517168 27055 authenticator.cpp:204] Received SASL authentication start
19:40:52  I1129 19:40:52.517230 27055 authenticator.cpp:326] Authentication requires more steps
19:40:52  I1129 19:40:52.517285 27055 authenticatee.cpp:259] Received SASL authentication step
19:40:52  I1129 19:40:52.517385 27055 authenticator.cpp:232] Received SASL authentication step
19:40:52  I1129 19:40:52.517422 27055 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-61' server FQDN: 'ip-172-16-10-61' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
19:40:52  I1129 19:40:52.517432 27055 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
19:40:52  I1129 19:40:52.517443 27055 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
19:40:52  I1129 19:40:52.517473 27055 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-61' server FQDN: 'ip-172-16-10-61' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
19:40:52  I1129 19:40:52.517482 27055 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
19:40:52  I1129 19:40:52.517488 27055 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
19:40:52  I1129 19:40:52.517519 27055 authenticator.cpp:318] Authentication success
19:40:52  I1129 19:40:52.517580 27055 authenticatee.cpp:299] Authentication success
19:40:52  I1129 19:40:52.517597 27054 master.cpp:9681] Successfully authenticated principal 'test-principal' at slave(85)@172.16.10.61:36069
19:40:52  I1129 19:40:52.517627 27054 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(195)@172.16.10.61:36069
19:40:52  I1129 19:40:52.517670 27055 slave.cpp:1451] Successfully authenticated with master master@172.16.10.61:36069
19:40:52  I1129 19:40:52.517773 27055 slave.cpp:1882] Will retry registration in 7.353676ms if necessary
19:40:52  I1129 19:40:52.517809 27056 master.cpp:6600] Received register agent message from slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal)
19:40:52  I1129 19:40:52.517876 27056 master.cpp:3930] Authorizing agent providing resources 'cpus:1000; mem:1000; disk:35068; ports:[31000-32000]' with principal 'test-principal'
19:40:52  I1129 19:40:52.518025 27053 master.cpp:6667] Authorized registration of agent at slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal)
19:40:52  I1129 19:40:52.518064 27053 master.cpp:6782] Registering agent at slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal) with id ce756623-c356-49a6-b58d-a59616f2e201-S0
19:40:52  I1129 19:40:52.518195 27053 registrar.cpp:487] Applied 1 operations in 45339ns; attempting to update the registry
19:40:52  I1129 19:40:52.518384 27053 registrar.cpp:544] Successfully updated the registry in 114944ns
19:40:52  I1129 19:40:52.518479 27053 master.cpp:6830] Admitted agent ce756623-c356-49a6-b58d-a59616f2e201-S0 at slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal)
19:40:52  I1129 19:40:52.518676 27053 master.cpp:6875] Registered agent ce756623-c356-49a6-b58d-a59616f2e201-S0 at slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal) with cpus:1000; mem:1000; disk:35068; ports:[31000-32000]
19:40:52  I1129 19:40:52.518725 27056 hierarchical.cpp:603] Added agent ce756623-c356-49a6-b58d-a59616f2e201-S0 (ip-172-16-10-61.ec2.internal) with cpus:1000; mem:1000; disk:35068; ports:[31000-32000] (allocated: {})
19:40:52  I1129 19:40:52.518786 27053 slave.cpp:1484] Registered with master master@172.16.10.61:36069; given agent ID ce756623-c356-49a6-b58d-a59616f2e201-S0
19:40:52  I1129 19:40:52.518965 27050 task_status_update_manager.cpp:188] Resuming sending task status updates
19:40:52  I1129 19:40:52.519047 27053 slave.cpp:1504] Checkpointing SlaveInfo to '/tmp/FetcherCacheHttpTest_HttpCachedConcurrent_UedRux/meta/slaves/ce756623-c356-49a6-b58d-a59616f2e201-S0/slave.info'
19:40:52  I1129 19:40:52.519412 27053 slave.cpp:1553] Forwarding agent update {"operations":{},"resource_version_uuid":{"value":"9gOS1Ul4SAWsRZWZIr+WNw=="},"slave_id":{"value":"ce756623-c356-49a6-b58d-a59616f2e201-S0"},"update_oversubscribed_resources":false}
19:40:52  I1129 19:40:52.519600 27053 master.cpp:7934] Ignoring update on agent ce756623-c356-49a6-b58d-a59616f2e201-S0 at slave(85)@172.16.10.61:36069 (ip-172-16-10-61.ec2.internal) as it reports no changes
19:40:52  I1129 19:40:52.519680 27056 hierarchical.cpp:1566] Performed allocation for 1 agents in 13174ns
19:40:52  I1129 19:40:52.520311 27028 sched.cpp:232] Version: 1.8.0
19:40:52  I1129 19:40:52.520510 27052 sched.cpp:336] New master detected at master@172.16.10.61:36069
19:40:52  I1129 19:40:52.520550 27052 sched.cpp:401] Authenticating with master master@172.16.10.61:36069
19:40:52  I1129 19:40:52.520560 27052 sched.cpp:408] Using default CRAM-MD5 authenticatee
19:40:52  I1129 19:40:52.520637 27052 authenticatee.cpp:121] Creating new client SASL connection
19:40:52  I1129 19:40:52.520709 27052 master.cpp:9649] Authenticating scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:40:52  I1129 19:40:52.520748 27054 authenticator.cpp:414] Starting authentication session for crammd5-authenticatee(196)@172.16.10.61:36069
19:40:52  I1129 19:40:52.520808 27054 authenticator.cpp:98] Creating new server SASL connection
19:40:52  I1129 19:40:52.520860 27054 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5
19:40:52  I1129 19:40:52.520869 27054 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5'
19:40:52  I1129 19:40:52.520896 27054 authenticator.cpp:204] Received SASL authentication start
19:40:52  I1129 19:40:52.520923 27054 authenticator.cpp:326] Authentication requires more steps
19:40:52  I1129 19:40:52.520952 27054 authenticatee.cpp:259] Received SASL authentication step
19:40:52  I1129 19:40:52.520987 27054 authenticator.cpp:232] Received SASL authentication step
19:40:52  I1129 19:40:52.520999 27054 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-61' server FQDN: 'ip-172-16-10-61' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
19:40:52  I1129 19:40:52.521004 27054 auxprop.cpp:181] Looking up auxiliary property '*userPassword'
19:40:52  I1129 19:40:52.521011 27054 auxprop.cpp:181] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
19:40:52  I1129 19:40:52.521016 27054 auxprop.cpp:109] Request to lookup properties for user: 'test-principal' realm: 'ip-172-16-10-61' server FQDN: 'ip-172-16-10-61' SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
19:40:52  I1129 19:40:52.521020 27054 auxprop.cpp:131] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
19:40:52  I1129 19:40:52.521024 27054 auxprop.cpp:131] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
19:40:52  I1129 19:40:52.521033 27054 authenticator.cpp:318] Authentication success
19:40:52  I1129 19:40:52.521070 27052 authenticatee.cpp:299] Authentication success
19:40:52  I1129 19:40:52.521076 27051 authenticator.cpp:432] Authentication session cleanup for crammd5-authenticatee(196)@172.16.10.61:36069
19:40:52  I1129 19:40:52.521150 27052 sched.cpp:513] Successfully authenticated with master master@172.16.10.61:36069
19:40:52  I1129 19:40:52.521162 27052 sched.cpp:817] Sending SUBSCRIBE call to master@172.16.10.61:36069
19:40:52  I1129 19:40:52.521185 27054 master.cpp:9681] Successfully authenticated principal 'test-principal' at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:40:52  I1129 19:40:52.521191 27052 sched.cpp:850] Will retry registration in 281.068578ms if necessary
19:40:52  I1129 19:40:52.521230 27054 master.cpp:2860] Received SUBSCRIBE call for framework 'default' at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:40:52  W1129 19:40:52.521241 27054 master.cpp:2868] Setting 'principal' in FrameworkInfo to 'test-principal' because the framework authenticated with that principal but did not set it in FrameworkInfo
19:40:52  I1129 19:40:52.521252 27054 master.cpp:2161] Authorizing framework principal 'test-principal' to receive offers for roles '{ * }'
19:40:52  I1129 19:40:52.521348 27054 master.cpp:2941] Subscribing framework default with checkpointing enabled and capabilities [  ]
19:40:52  I1129 19:40:52.521795 27054 master.cpp:9879] Adding framework ce756623-c356-49a6-b58d-a59616f2e201-0000 (default) at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069 with roles {  } suppressed
19:40:52  I1129 19:40:52.522022 27052 sched.cpp:744] Framework registered with ce756623-c356-49a6-b58d-a59616f2e201-0000
19:40:52  I1129 19:40:52.522047 27052 sched.cpp:758] Scheduler::registered took 10021ns
19:40:52  I1129 19:40:52.522125 27054 hierarchical.cpp:304] Added framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:40:52  I1129 19:40:52.522256 27054 hierarchical.cpp:1566] Performed allocation for 1 agents in 98787ns
19:40:52  I1129 19:40:52.522393 27054 master.cpp:9464] Sending offers [ ce756623-c356-49a6-b58d-a59616f2e201-O0 ] to framework ce756623-c356-49a6-b58d-a59616f2e201-0000 (default) at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:40:52  I1129 19:40:52.522516 27054 sched.cpp:914] Scheduler::resourceOffers took 9650ns
19:40:53  I1129 19:40:53.501468 27051 hierarchical.cpp:1566] Performed allocation for 1 agents in 45049ns
19:40:54  I1129 19:40:54.501998 27052 hierarchical.cpp:1566] Performed allocation for 1 agents in 44168ns
19:40:55  I1129 19:40:55.503232 27049 hierarchical.cpp:1566] Performed allocation for 1 agents in 41653ns
19:40:56  I1129 19:40:56.503955 27052 hierarchical.cpp:1566] Performed allocation for 1 agents in 45250ns
19:40:57  I1129 19:40:57.504122 27049 hierarchical.cpp:1566] Performed allocation for 1 agents in 44545ns
19:40:58  I1129 19:40:58.505278 27051 hierarchical.cpp:1566] Performed allocation for 1 agents in 63681ns
19:40:59  I1129 19:40:59.505705 27052 hierarchical.cpp:1566] Performed allocation for 1 agents in 43930ns
19:41:00  I1129 19:41:00.506145 27049 hierarchical.cpp:1566] Performed allocation for 1 agents in 44924ns
19:41:01  I1129 19:41:01.507376 27055 hierarchical.cpp:1566] Performed allocation for 1 agents in 46191ns
19:41:02  I1129 19:41:02.508450 27056 hierarchical.cpp:1566] Performed allocation for 1 agents in 45564ns
19:41:03  I1129 19:41:03.509368 27053 hierarchical.cpp:1566] Performed allocation for 1 agents in 43645ns
19:41:04  I1129 19:41:04.510268 27050 hierarchical.cpp:1566] Performed allocation for 1 agents in 44269ns
19:41:05  I1129 19:41:05.511152 27053 hierarchical.cpp:1566] Performed allocation for 1 agents in 43534ns
19:41:06  I1129 19:41:06.512348 27050 hierarchical.cpp:1566] Performed allocation for 1 agents in 45408ns
19:41:07  I1129 19:41:07.512817 27054 hierarchical.cpp:1566] Performed allocation for 1 agents in 42789ns
19:41:07  ../../src/tests/fetcher_cache_tests.cpp:1149: Failure
19:41:07  tasks: Failed to wait for resource offers: discarded
19:41:07  Begin listing sandboxes
19:41:07  End sandboxes
19:41:07  I1129 19:41:07.526495 27028 sched.cpp:2008] Asked to stop the driver
19:41:07  I1129 19:41:07.526536 27055 sched.cpp:1184] Stopping framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:41:07  I1129 19:41:07.526682 27050 master.cpp:10181] Processing TEARDOWN call for framework ce756623-c356-49a6-b58d-a59616f2e201-0000 (default) at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:41:07  I1129 19:41:07.526720 27050 master.cpp:10193] Removing framework ce756623-c356-49a6-b58d-a59616f2e201-0000 (default) at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:41:07  I1129 19:41:07.526736 27050 master.cpp:3236] Deactivating framework ce756623-c356-49a6-b58d-a59616f2e201-0000 (default) at scheduler-38dde2c7-fac8-425f-ba81-a54f276313f2@172.16.10.61:36069
19:41:07  I1129 19:41:07.526842 27050 master.cpp:11463] Removing offer ce756623-c356-49a6-b58d-a59616f2e201-O0
19:41:07  I1129 19:41:07.526962 27050 hierarchical.cpp:418] Deactivated framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:41:07  I1129 19:41:07.527051 27050 hierarchical.cpp:1238] Recovered cpus(allocated: *):1000; mem(allocated: *):1000; disk(allocated: *):35068; ports(allocated: *):[31000-32000] (total: cpus:1000; mem:1000; disk:35068; ports:[31000-32000], allocated: {}) on agent ce756623-c356-49a6-b58d-a59616f2e201-S0 from framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:41:07  I1129 19:41:07.527081 27049 slave.cpp:3901] Asked to shut down framework ce756623-c356-49a6-b58d-a59616f2e201-0000 by master@172.16.10.61:36069
19:41:07  I1129 19:41:07.527096 27049 slave.cpp:3916] Cannot shut down unknown framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:41:07  I1129 19:41:07.527132 27050 hierarchical.cpp:357] Removed framework ce756623-c356-49a6-b58d-a59616f2e201-0000
19:41:07  I1129 19:41:07.527271 27053 master.cpp:1117] Master terminating
19:41:07  I1129 19:41:07.527498 27051 hierarchical.cpp:643] Removed agent ce756623-c356-49a6-b58d-a59616f2e201-S0
19:41:07  I1129 19:41:07.527531 27051 slave.cpp:5898] Got exited event for master@172.16.10.61:36069
19:41:07  W1129 19:41:07.527542 27051 slave.cpp:5903] Master disconnected! Waiting for a new master to be elected
19:41:07  I1129 19:41:07.529418 27028 slave.cpp:914] Agent terminating
19:41:07  [  FAILED  ] FetcherCacheHttpTest.HttpCachedConcurrent (15040 ms)
{noformat}

> FetcherCacheHttpTest.HttpCachedSerialized and FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-3235
>                 URL: https://issues.apache.org/jira/browse/MESOS-3235
>             Project: Mesos
>          Issue Type: Bug
>          Components: fetcher, test
>    Affects Versions: 0.23.0
>            Reporter: Joseph Wu
>            Priority: Major
>              Labels: flaky, flaky-test, mesosphere
>         Attachments: fetchercache_log_centos_6.txt
>
>
> On OSX, {{make clean && make -j8 V=0 check}}:
> {code}
> [----------] 3 tests from FetcherCacheHttpTest
> [ RUN      ] FetcherCacheHttpTest.HttpCachedSerialized
> HTTP/1.1 200 OK
> Date: Fri, 07 Aug 2015 17:23:05 GMT
> Content-Length: 30
> I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 0
> Forked command at 54363
> sh -c './mesos-fetcher-test-cmd 0'
> E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> Command exited with status 0 (pid: 54363)
> E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 1
> Forked command at 54411
> sh -c './mesos-fetcher-test-cmd 1'
> E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> Command exited with status 0 (pid: 54411)
> E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: Socket is not connected [57]
> ../../src/tests/fetcher_cache_tests.cpp:860: Failure
> Failed to wait 15secs for awaitFinished(task.get())
> *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are using GNU date ***
> [  FAILED  ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms)
> [ RUN      ] FetcherCacheHttpTest.HttpCachedConcurrent
> PC: @        0x113723618 process::Owned<>::get()
> *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: ***
>     @     0x7fff8fcacf1a _sigtramp
>     @     0x7f9bc3109710 (unknown)
>     @        0x1136f07e2 mesos::internal::slave::Fetcher::fetch()
>     @        0x113862f9d mesos::internal::slave::MesosContainerizerProcess::fetch()
>     @        0x1138f1b5d _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcEEEERK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_
>     @        0x1138f18cf _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEERK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_
>     @        0x1143768cf std::__1::function<>::operator()()
>     @        0x11435ca7f process::ProcessBase::visit()
>     @        0x1143ed6fe process::DispatchEvent::visit()
>     @        0x1127aaaa1 process::ProcessBase::serve()
>     @        0x114343b4e process::ProcessManager::resume()
>     @        0x1143431ca process::internal::schedule()
>     @        0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEEEEEPvS5_
>     @     0x7fff95090268 _pthread_body
>     @     0x7fff950901e5 _pthread_start
>     @     0x7fff9508e41d thread_start
> Failed to synchronize with slave (it's probably exited)
> make[3]: *** [check-local] Segmentation fault: 11
> make[2]: *** [check-am] Error 2
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}
> This was encountered just once out of 3+ {{make check}}s.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)