Posted to dev@mesos.apache.org by Jorge Machado <jo...@me.com.INVALID> on 2019/08/01 11:05:26 UTC

Restarting mesos-agent kills executors

Hi Guys, 

I was reading about agent restarts at http://mesos.apache.org/documentation/latest/agent-recovery/
From what I understood, if I have a task running and we restart the mesos-agent, I should not lose any running tasks.
That is not what happens when restarting via systemctl (or the service command) on Ubuntu 18.04. Our framework has checkpointing enabled.

My config: 

[Unit]
Description=Mesos Agent
After=network.target
Wants=network.target

[Service]
Environment=LIBPROCESS_SSL_ENABLED=true
Environment=LIBPROCESS_SSL_SUPPORT_DOWNGRADE=false
Environment=LIBPROCESS_SSL_CIPHERS=AES128-SHA:AES256-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA:DHE-RSA-AES256-SHA:DHE-DSS-AES256-SHA
Environment=LIBPROCESS_SSL_KEY_FILE=/etc/ssl/private/server_2048.key
Environment=LIBPROCESS_SSL_CERT_FILE=/etc/ssl/server.crt
Environment=LIBPROCESS_SSL_CA_FILE=/etc/pki/trust/anchors/it4ad.pem

ExecStart=/usr/local/sbin/mesos-agent \
    --master=<zookeeper> \
    --work_dir=/data/mesos/work \
    --log_dir=/var/log/mesos \
    --executor_registration_timeout=20mins \
    --executor_environment_variables=file:///etc/mesos/executor_envs.json \
    --resources=file:///etc/mesos/resources.txt \
    --image_gc_config=file:///etc/mesos/image-gc-config.json \
    --isolation=cgroups/cpu,cgroups/mem,cgroups/devices,filesystem/linux,gpu/nvidia,docker/runtime,namespaces/pid,namespaces/ipc \
    --image_providers=docker \
    --docker_store_dir=/data/mesos/store/docker \
    --gc_delay=3weeks \
    --attributes=<attr>

KillMode=control-cgroup
Restart=always
RestartSec=20
LimitNOFILE=infinity
CPUAccounting=true
MemoryAccounting=true
TasksMax=infinity

[Install]
WantedBy=multi-user.target



Any tips? Thanks.



Jorge Machado
www.jmachado.me
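
A note on the unit file above: systemd.kill(5) lists control-group, mixed, process and none as the values for KillMode=, and the spelling "control-cgroup" used in the unit is not one of them. As far as I know systemd ignores a value it cannot parse and falls back to its default, but that is worth confirming on the host. The sketch below only checks the spelling; the function name check_kill_mode is made up for illustration and is not a Mesos or systemd tool.

```shell
#!/bin/sh
# Report whether the KillMode= value in a unit file is one of the values
# documented in systemd.kill(5).
# Usage: check_kill_mode /path/to/unit-file
check_kill_mode() {
    # Use the last KillMode= line, since a later assignment overrides earlier ones.
    value=$(sed -n 's/^KillMode=//p' "$1" | tail -n 1)
    case "$value" in
        control-group|mixed|process|none)
            echo "recognized: $value" ;;
        "")
            echo "unset" ;;
        *)
            echo "unrecognized: $value" ;;
    esac
}
```

Run against a file containing the [Service] section above, this prints "unrecognized: control-cgroup". It says nothing about why the executors die; it only flags a setting that may not be doing what it appears to.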






Re: Restarting mesos-agent kills executors

Posted by Vinod Kone <vi...@gmail.com>.
Need agent and executor logs to diagnose. Can you share them?

Thanks,
Vinod
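
For anyone collecting the requested logs: with the flags from the original message, the agent logs should be under /var/log/mesos (--log_dir), and each executor's stdout and stderr live in its sandbox under --work_dir. The sketch below assumes the standard sandbox layout from the Mesos documentation (slaves/<agent ID>/frameworks/<framework ID>/executors/<executor ID>/runs/latest); the function name list_executor_logs is hypothetical.

```shell
#!/bin/sh
# Print the stdout/stderr files from the latest run of every executor
# sandbox under a Mesos agent work directory.
# Usage: list_executor_logs /data/mesos/work
list_executor_logs() {
    work_dir="$1"
    # Note: slaves/latest is a symlink to the current agent ID directory,
    # so on a real agent the same file can be listed under both names.
    for f in "$work_dir"/slaves/*/frameworks/*/executors/*/runs/latest/stdout \
             "$work_dir"/slaves/*/frameworks/*/executors/*/runs/latest/stderr; do
        # An unmatched glob stays as a literal pattern, so keep only real paths.
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

Calling list_executor_logs /data/mesos/work before and after restarting the agent gives the file names to attach alongside the agent log.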
