Posted to issues@mesos.apache.org by "Jie Yu (JIRA)" <ji...@apache.org> on 2015/05/08 19:56:02 UTC

[jira] [Resolved] (MESOS-2672) ContainerizerTest.ROOT_CGROUPS_BalloonFramework flaky

     [ https://issues.apache.org/jira/browse/MESOS-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu resolved MESOS-2672.
---------------------------
    Resolution: Fixed

commit 7d2d5de9b9f6adbb94cf692576236424aeaf2f67
Author: Chi Zhang <ch...@gmail.com>
Date:   Fri May 8 10:54:14 2015 -0700

    Changed BalloonExecutor to do memset before mlock.
    
    mlock returns an error when the requested memory exceeds the limit,
    because it cannot find enough lockable memory, which defeats the
    purpose of triggering an OOM.
    
    Review: https://reviews.apache.org/r/33990

> ContainerizerTest.ROOT_CGROUPS_BalloonFramework flaky
> -----------------------------------------------------
>
>                 Key: MESOS-2672
>                 URL: https://issues.apache.org/jira/browse/MESOS-2672
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Chi Zhang
>            Assignee: Chi Zhang
>
> {noformat}
> I0429 00:58:35.267629  2086 slave.cpp:3210] Executor 'default' of framework 20150429-005830-16777343-5432-2023-0000 terminated with signal Aborted
> I0429 00:58:35.270761  2086 slave.cpp:2512] Handling status update TASK_LOST (UUID: f969e350-6f91-4fa9-980e-1852554bd704) for task 1 of framework 20150429-005830-16777343-5432-2023-0000 from @0.0.0.0:0
> I0429 00:58:35.270983  2086 slave.cpp:4604] Terminating task 1
> W0429 00:58:35.271574  2080 containerizer.cpp:903] Ignoring update for unknown container: 1298549a-a3d2-46ff-aad0-9dbc777affcc
> I0429 00:58:35.272541  2074 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: f969e350-6f91-4fa9-980e-1852554bd704) for task 1 of framework 20150429-005830-16777343-5432-2023-0000
> I0429 00:58:35.272624  2074 status_update_manager.cpp:494] Creating StatusUpdate stream for task 1 of framework 20150429-005830-16777343-5432-2023-0000
> I0429 00:58:35.273217  2053 master.cpp:3493] Executor default of framework 20150429-005830-16777343-5432-2023-0000 on slave 20150429-005830-16777343-5432-2023-S0 at slave(1)@10.35.12.124:5051 (smfd-aki-27-sr1.devel.twitter.com): terminated with signal Aborted
> {noformat}
> The abort comes from this code:
> {code}
>     // We use mlock and memset here to make sure that the memory
>     // actually gets paged in and thus accounted for.
>     if (mlock(buffer, chunk) != 0) {
>       perror("Failed to lock memory, mlock");
>       abort();
>     }
>
>     if (memset(buffer, 1, chunk) != buffer) {
>       perror("Failed to fill memory, memset");
>       abort();
>     }
> {code}
> This is the same issue as MESOS-2660. I've confirmed that swapping the mlock and memset calls fixes it.
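
For illustration, the reordering described in the commit message can be sketched as below. This is only a minimal sketch, not the actual BalloonExecutor change: the helper name touch_then_lock is hypothetical, and unlike the quoted snippet it returns an error code instead of calling abort() so that it can be exercised without special privileges. The idea, as the commit message describes it, is that writing to the memory first pages it in so it is accounted for, whereas calling mlock first can fail outright and abort the executor before any OOM can occur.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper (not the real BalloonExecutor code).
 *
 * Touches every byte of the buffer first, then tries to pin it.
 * Returns 0 if the memory was touched and locked, or -1 if it was
 * touched but mlock failed (errno is left as set by mlock). */
static int touch_then_lock(void *buffer, size_t chunk)
{
    assert(buffer != NULL);

    /* Step 1: write to the memory so the pages are actually faulted
     * in. Presumably this is the step that lets an over-limit
     * container run into its memory limit. */
    memset(buffer, 1, chunk);

    /* Step 2: only now ask the kernel to keep the pages resident.
     * A failure here (for example, not enough lockable memory) is
     * reported to the caller rather than treated as fatal. */
    if (mlock(buffer, chunk) != 0) {
        perror("Failed to lock memory, mlock");
        return -1;
    }

    return 0;
}
```

A caller would allocate a chunk, pass it to touch_then_lock, and decide for itself how to treat a -1 return. Whether the real executor still aborts when mlock fails is not stated in this message, so that choice is an assumption of the sketch.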



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)