Posted to user@mesos.apache.org by Rodrick Brown <ro...@orchard-app.com> on 2015/10/28 16:42:40 UTC

unable to start mesos-slave as non-root user after 0.25 upgrade

After I upgraded, the first thing I noticed was the restrictive permissions on the wrapper script:

# ls -al /usr/bin/mesos-init-wrapper
-rwxr-x---. 1 root root 5202 Oct 12 21:08 /usr/bin/mesos-init-wrapper

So systemd, running the service as the mesos user, was unable to exec this script.

I changed the permissions on the wrapper:
# chmod a+x  /usr/bin/mesos-init-wrapper
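
A note on this step: `chmod a+x` adds only the execute bit, so a file that started as 0750 ends up 0751, and a user who is neither the owner nor in the group still has no read permission. A shell has to open a script in order to interpret it, so that is likely not enough for a wrapper script run as the mesos user; `a+rx` gives 0755, the mode used in the fix later in this thread. The sketch below reproduces the arithmetic on a throwaway temporary file rather than the real wrapper, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Demonstration on a temporary file; the real wrapper is not touched.
demo=$(mktemp)

chmod 0750 "$demo"            # mode the package shipped: -rwxr-x---
chmod a+x "$demo"             # adds the execute bit only
mode_after_ax=$(stat -c '%a' "$demo")

chmod a+rx "$demo"            # adds read and execute
mode_after_arx=$(stat -c '%a' "$demo")

echo "after a+x:  $mode_after_ax"     # 751
echo "after a+rx: $mode_after_arx"    # 755
rm -f "$demo"
```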


However, I’m still unable to bring up the process via systemd:

Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Started Mesos Slave.
Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Starting Mesos Slave...
Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service: main process exited, code=exited, status=126/n/a
Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Unit mesos-slave.service entered failed state.
Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service failed.
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service holdoff time over, scheduling restart.
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Started Mesos Slave.
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Starting Mesos Slave...
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service: main process exited, code=exited, status=126/n/a
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: Unit mesos-slave.service entered failed state.
Oct 28 15:39:47 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service failed.
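
For reference, 126 is the exit status that shells conventionally use when a command was found but could not be executed, usually because of a permission error. That suggests something in the start-up path was still not runnable or readable as the mesos user. A minimal sketch of the convention, using a hypothetical throwaway script rather than anything from the Mesos package:

```shell
#!/bin/sh
# Create a script without any execute bit and try to run it.
script=$(mktemp)
printf '#!/bin/sh\necho started\n' > "$script"
chmod 0644 "$script"

# The shell finds the file but cannot execute it.
"$script" 2>/dev/null
status=$?

echo "exit status: $status"
rm -f "$script"
```

This only illustrates where the number comes from; it does not pinpoint which file was at fault on the affected host.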

# cat /usr/lib/systemd/system/mesos-slave.service
[Unit]
Description=Mesos Slave
After=network.target
Wants=network.target

[Service]
User=mesos
ExecStart=/usr/bin/mesos-init-wrapper slave
KillMode=process
Restart=always
RestartSec=20
LimitNOFILE=16384
CPUAccounting=true
MemoryAccounting=true

[Install]
WantedBy=multi-user.target

The only change I made to the unit file was to add User=mesos; this worked in previous versions of Mesos.

If I remove User=mesos and have systemd bring the process up as root, the slave joins the cluster and everything works as designed.
Did something change in 0.24.1 or 0.25?
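
As an aside, the User= setting does not have to be edited into the packaged unit under /usr/lib/systemd/system, which a package upgrade may overwrite; it can live in a drop-in instead. This is a sketch under assumptions: the drop-in file name `10-user.conf` is made up, and by default the script writes into a scratch directory so it can be tried safely. To apply it for real, set `UNIT_DIR` to /etc/systemd/system and run `systemctl daemon-reload` afterwards. It does not by itself address the start-up failure.

```shell
#!/bin/sh
# UNIT_DIR defaults to a scratch directory for a dry run.
UNIT_DIR=${UNIT_DIR:-$(mktemp -d)}
dropin_dir="$UNIT_DIR/mesos-slave.service.d"
dropin="$dropin_dir/10-user.conf"   # hypothetical file name

mkdir -p "$dropin_dir"
cat > "$dropin" <<'EOF'
[Service]
User=mesos
EOF

echo "wrote $dropin"
```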

Thanks. 


-- 
 <http://www.orchardplatform.com/>
Rodrick Brown / DevOPs Engineer 
+1 917 445 6839 / rodrick@orchardplatform.com <ma...@orchardplatform.com>
Orchard Platform 
101 5th Avenue, 4th Floor, New York, NY 10003 
http://www.orchardplatform.com <http://www.orchardplatform.com/>
Orchard Blog <http://www.orchardplatform.com/blog/> | Marketplace Lending Meetup <http://www.meetup.com/Peer-to-Peer-Lending-P2P/>

-- 
*NOTICE TO RECIPIENTS*: This communication is confidential and intended for 
the use of the addressee only. If you are not an intended recipient of this 
communication, please delete it immediately and notify the sender by return 
email. Unauthorized reading, dissemination, distribution or copying of this 
communication is prohibited. This communication does not constitute an 
offer to sell or a solicitation of an indication of interest to purchase 
any loan, security or any other financial product or instrument, nor is it 
an offer to sell or a solicitation of an indication of interest to purchase 
any products or services to any persons who are prohibited from receiving 
such information under applicable law. The contents of this communication 
may not be accurate or complete and are subject to change without notice. 
As such, Orchard App, Inc. (including its subsidiaries and affiliates, 
"Orchard") makes no representation regarding the accuracy or completeness 
of the information contained herein. The intended recipient is advised to 
consult its own professional advisors, including those specializing in 
legal, tax and accounting matters. Orchard does not provide legal, tax or 
accounting advice.

Re: unable to start mesos-slave as non-root user after 0.25 upgrade

Posted by Rodrick Brown <ro...@orchard-app.com>.
The issue was indeed a permissions problem.

I’m using Fedora 22 and mesos-0.25.0-0.2.70.centos701406.x86_64

The following file is installed with odd permissions:
-rwxr-x---. 1 root root 5202 Oct 12 21:08 /bin/mesos-init-wrapper

I was able to use Ansible to fix the permissions across my cluster with the following tasks:

- name: Setting ownership for mesos on /bin/mesos-init-wrapper
  file: path=/bin/mesos-init-wrapper owner=root group=mesos mode=0755
  notify: restart mesos service
  tags: set_perms

- name: Setting permissions for mesos group on root owned mesos files
  file: path={{ item }} owner=root group=mesos mode=0764
  with_items:
    - /etc/default/mesos
    - /etc/default/mesos-master
    - /etc/default/mesos-slave
  notify: restart mesos service
  tags: set_perms

- name: Setting permissions for mesos user/group on dirs
  file: path={{ item }} owner=mesos group=mesos mode=0765 recurse=yes
  with_items:
    - /var/lib/mesos
    - /var/log/mesos
    - /etc/mesos
    - /tmp/mesos
  notify: restart mesos service
  tags: set_perms
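
One caveat on the recursive task: a numeric mode such as 0765 is applied to files and directories alike, so regular files gain an owner execute bit while the group gets no search (x) bit on directories. If that is not intended, a symbolic mode with a capital X is a common alternative, because X grants execute/search only on directories and on files that are already executable; Ansible's file module accepts symbolic modes as well. A minimal shell sketch on a scratch tree (the paths are placeholders, not the real Mesos directories, and `stat -c` assumes GNU coreutils):

```shell
#!/bin/sh
# Build a scratch tree standing in for a directory such as /var/lib/mesos.
root=$(mktemp -d)
mkdir "$root/meta"
touch "$root/meta/state.log"
chmod 0700 "$root" "$root/meta"
chmod 0600 "$root/meta/state.log"

# Capital X: search bit for directories, no execute bit for plain files.
chmod -R u=rwX,go=rX "$root"

dir_mode=$(stat -c '%a' "$root/meta")
file_mode=$(stat -c '%a' "$root/meta/state.log")
echo "directory: $dir_mode  file: $file_mode"
rm -rf "$root"
```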


> On Oct 28, 2015, at 5:29 PM, Joris Van Remoortere <jo...@mesosphere.io> wrote:
> 
> This may be related to the systemd support we added in 0.25.
> If the agent detects it is running on systemd it will try to launch a systemd slice under which to run the executors. If your non-root user does not have sufficient permissions to perform these operations that will be a problem.
> Can you share the agent logs to verify this? You should be able to access them using journalctl.
> 
> Joris



Re: unable to start mesos-slave as non-root user after 0.25 upgrade

Posted by Joris Van Remoortere <jo...@mesosphere.io>.
This may be related to the systemd support we added in 0.25.
If the agent detects it is running on systemd it will try to launch a
systemd slice under which to run the executors. If your non-root user does
not have sufficient permissions to perform these operations that will be a
problem.
Can you share the agent logs to verify this? You should be able to access
them using journalctl.
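
To collect what is being asked for here, something like `journalctl -u mesos-slave.service --no-pager` should print the unit's messages (the unit name is taken from the original post). The sketch below only illustrates pulling the exit status out of such output; it runs against a sample line copied from the log in the original post, not against a live journal.

```shell
#!/bin/sh
# Sample line copied from the systemd output in the original post.
sample='Oct 28 15:39:27 prod-mesos-s-1.aws.orchardplatform.com systemd[1]: mesos-slave.service: main process exited, code=exited, status=126/n/a'

# On a live host the input would come from:
#   journalctl -u mesos-slave.service --no-pager
exit_status=$(printf '%s\n' "$sample" | sed -n 's/.*status=\([0-9][0-9]*\).*/\1/p')

echo "exit status reported by systemd: $exit_status"
```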

Joris

—
*Joris Van Remoortere*
Mesosphere

On Wed, Oct 28, 2015 at 12:33 PM, haosdent <ha...@gmail.com> wrote:

> Does the Mesos slave produce any logs?

Re: unable to start mesos-slave as non-root user after 0.25 upgrade

Posted by haosdent <ha...@gmail.com>.
Does the Mesos slave produce any logs?


-- 
Best Regards,
Haosdent Huang