Posted to user@mesos.apache.org by Marc Roos <M....@f1-outsourcing.eu> on 2019/08/19 14:46:14 UTC

Large container image failing to start 'first' time

I have a container image of around 800MB. I am not sure if that is a
lot, but I have noticed it is probably too big for a default setup to
launch. I think the only reason it launches eventually is because
data is cached and no timeout expires. The container will launch
eventually when you constrain it to a host.

How can I trace where this timeout occurs? Are there options to specify 
timeouts?








Re: Large container image failing to start 'first' time

Posted by Qian Zhang <zh...@gmail.com>.
These are the logs from when the container was being destroyed. We need the
logs from when the container was launched to figure out why it was stuck in
the provisioning state.


Regards,
Qian Zhang


On Sat, Aug 31, 2019 at 6:24 AM Marc Roos <M....@f1-outsourcing.eu> wrote:

>
> I only have these two messages
>
>
> mesos-slave.ERROR:E0828 12:51:46.146246 2663200 slave.cpp:6486]
> Container '680d3849-2b2a-4549-8842-8ef358599478' for executor
> 'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework
> d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is
> being destroyed during provisioning
> mesos-slave.INFO:E0828 12:51:46.146246 2663200 slave.cpp:6486] Container
> '680d3849-2b2a-4549-8842-8ef358599478' for executor
> 'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework
> d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is
> being destroyed during provisioning
> mesos-slave.INFO:W0828 12:51:46.650323 2663184 containerizer.cpp:2375]
> Ignoring update for unknown container
> 680d3849-2b2a-4549-8842-8ef358599478
> mesos-slave.WARNING:E0828 12:51:46.146246 2663200 slave.cpp:6486]
> Container '680d3849-2b2a-4549-8842-8ef358599478' for executor
> 'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework
> d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is
> being destroyed during provisioning
> mesos-slave.WARNING:W0828 12:51:46.650323 2663184
> containerizer.cpp:2375] Ignoring update for unknown container
> 680d3849-2b2a-4549-8842-8ef358599478

RE: Large container image failing to start 'first' time

Posted by Marc Roos <M....@f1-outsourcing.eu>.
I only have these two messages 


mesos-slave.ERROR:E0828 12:51:46.146246 2663200 slave.cpp:6486] 
Container '680d3849-2b2a-4549-8842-8ef358599478' for executor 
'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework 
d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is 
being destroyed during provisioning
mesos-slave.INFO:E0828 12:51:46.146246 2663200 slave.cpp:6486] Container 
'680d3849-2b2a-4549-8842-8ef358599478' for executor 
'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework 
d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is 
being destroyed during provisioning
mesos-slave.INFO:W0828 12:51:46.650323 2663184 containerizer.cpp:2375] 
Ignoring update for unknown container 
680d3849-2b2a-4549-8842-8ef358599478
mesos-slave.WARNING:E0828 12:51:46.146246 2663200 slave.cpp:6486] 
Container '680d3849-2b2a-4549-8842-8ef358599478' for executor 
'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework 
d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is 
being destroyed during provisioning
mesos-slave.WARNING:W0828 12:51:46.650323 2663184 
containerizer.cpp:2375] Ignoring update for unknown container 
680d3849-2b2a-4549-8842-8ef358599478 
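
The five lines above appear to be only two distinct log entries: glog writes
each message to the file for its severity and to every lower-severity file, so
grepping mesos-slave.ERROR, mesos-slave.WARNING and mesos-slave.INFO together
repeats them. A minimal Python sketch, assuming grep-style "file:" prefixes and
mail-wrapped lines as in this message (the name `unique_entries` is
hypothetical), that rejoins the wrapped lines and drops the repeats:

```python
import re

# A glog entry starts with a severity letter, MMDD, and a time with
# microseconds. grep over several files puts "filename:" in front of it.
GLOG_START = re.compile(
    r"^(?:[\w.\-]+:)?"                                  # optional grep file prefix
    r"(?P<entry>[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\b.*)$"
)


def unique_entries(text):
    """Rejoin wrapped lines, then drop entries repeated across severity files."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = GLOG_START.match(line)
        if match:
            # A new entry begins; the file prefix is discarded.
            entries.append(match.group("entry"))
        elif entries:
            # Continuation of a line that the mail client wrapped.
            entries[-1] += " " + line
    # dict.fromkeys keeps the first occurrence and preserves order.
    return list(dict.fromkeys(entries))


if __name__ == "__main__":
    sample = """\
mesos-slave.INFO:W0828 12:51:46.650323 2663184 containerizer.cpp:2375]
Ignoring update for unknown container
680d3849-2b2a-4549-8842-8ef358599478
mesos-slave.WARNING:W0828 12:51:46.650323 2663184
containerizer.cpp:2375] Ignoring update for unknown container
680d3849-2b2a-4549-8842-8ef358599478
"""
    for entry in unique_entries(sample):
        print(entry)
```

Entries are compared as whole strings after rejoining, so two entries only
collapse when timestamp, thread id, source location and message all match.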




-----Original Message-----
From: Qian Zhang [mailto:zhq527725@gmail.com] 
Sent: Wednesday, 28 August 2019 15:07
To: Marc Roos
Cc: user
Subject: Re: Large container image failing to start 'first' time

Can you please send the full logs about this container (just grep 
680d3849-2b2a-4549-8842-8ef358599478 in agent log)? And is there 
anything left in the staging directory (`--docker_store_dir/staging/`) 
when this issue happens?


Regards,
Qian Zhang





Re: Large container image failing to start 'first' time

Posted by Qian Zhang <zh...@gmail.com>.
Can you please send the full logs about this container (just grep
680d3849-2b2a-4549-8842-8ef358599478 in agent log)? And is there anything
left in the staging directory (`--docker_store_dir/staging/`) when this
issue happens?


Regards,
Qian Zhang
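
Both requests above can be scripted. A minimal Python sketch, assuming the
agent log is at /var/log/mesos/mesos-slave.INFO and that --docker_store_dir is
set to /tmp/mesos/store/docker (both paths are assumptions; substitute the
values from your own agent configuration). The helper names are hypothetical:

```python
from pathlib import Path


def lines_mentioning(log_path, container_id):
    """Return every line of the log that contains the container ID."""
    with open(log_path, errors="replace") as log:
        return [line.rstrip("\n") for line in log if container_id in line]


def staging_leftovers(docker_store_dir):
    """List whatever is left under <docker_store_dir>/staging/."""
    staging = Path(docker_store_dir) / "staging"
    if not staging.is_dir():
        return []
    return sorted(entry.name for entry in staging.iterdir())


if __name__ == "__main__":
    container_id = "680d3849-2b2a-4549-8842-8ef358599478"
    agent_log = Path("/var/log/mesos/mesos-slave.INFO")   # assumed location
    docker_store_dir = "/tmp/mesos/store/docker"          # assumed flag value

    if agent_log.is_file():
        for line in lines_mentioning(agent_log, container_id):
            print(line)
    else:
        print(f"agent log not found at {agent_log}")

    print("staging contents:", staging_leftovers(docker_store_dir))
```

Running it right after a failed launch, before the agent cleans up, gives the
best chance of seeing what was left in the staging directory.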


On Wed, Aug 28, 2019 at 7:07 PM Marc Roos <M....@f1-outsourcing.eu> wrote:

>  I had this again.
>
> E0828 12:51:46.146246 2663200 slave.cpp:6486] Container
> '680d3849-2b2a-4549-8842-8ef358599478' for executor
> 'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework
> d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is
> being destroyed during provisioning

RE: Large container image failing to start 'first' time

Posted by Marc Roos <M....@f1-outsourcing.eu>.
 
I have found several, which are also related to having multiple networks.
I will start a new thread.


-----Original Message-----
To: user
Subject: Re: Large container image failing to start 'first' time

> Large container image failing to start 'first' time
Did you see any errors/warnings in agent logs when the container failed to
start?


Regards,
Qian Zhang





RE: Large container image failing to start 'first' time

Posted by Marc Roos <M....@f1-outsourcing.eu>.
 I had this again.

E0828 12:51:46.146246 2663200 slave.cpp:6486] Container 
'680d3849-2b2a-4549-8842-8ef358599478' for executor 
'ldap.instance-afee8840-c981-11e9-8333-0050563001a1._app.1' of framework 
d5168fcd-51be-48c3-ba64-ade27ab23c4e-0000 failed to start: Container is 
being destroyed during provisioning



-----Original Message-----
From: Qian Zhang [mailto:zhq527725@gmail.com] 
Sent: Tuesday, 20 August 2019 1:12
To: user
Subject: Re: Large container image failing to start 'first' time

> Large container image failing to start 'first' time
Did you see any errors/warnings in agent logs when the container failed to
start?


Regards,
Qian Zhang





Re: Large container image failing to start 'first' time

Posted by Qian Zhang <zh...@gmail.com>.
> Large container image failing to start 'first' time
Did you see any errors/warnings in agent logs when the container failed to
start?


Regards,
Qian Zhang


On Mon, Aug 19, 2019 at 10:46 PM Marc Roos <M....@f1-outsourcing.eu> wrote:

>
> I have a container image of around 800MB. I am not sure if that is a
> lot. But I have noticed it is probably too big for a default setup to get
> it to launch. I think the only reason it launches eventually is because
> data is cached and no timeout expires. The container will launch
> eventually when you constrain it to a host.
>
> How can I trace where this timeout occurs? Are there options to specify
> timeouts?