Posted to user@aurora.apache.org by Ziliang Chen <zl...@gmail.com> on 2016/05/02 16:00:28 UTC

Aurora cron jobs are not scheduled

Hi,

I set up a Mesos cluster that runs the Apache Aurora framework, and I
registered 100 cron jobs that run every minute on a pool of 5 slave machines.
After they had been scheduled around 100 times, the cron jobs got stuck in
the "PENDING" state. Which logs can I inspect, and what could the problem
be? I restarted the Aurora scheduler several times. After each restart, the
cron jobs start being scheduled again, but once they reach around 100 runs,
all of them stop. BTW, the job executable is a very simple program that just
writes a number to a file, so I don't think it is a resource problem: I have
40 GB memory/40 CPUs in the slave machine pool.

Thanks a lot!

-- 
Regards, Zi-Liang

Mail:zlchen.ken@gmail.com

Re: Aurora cron jobs are not scheduled

Posted by Ziliang Chen <zl...@gmail.com>.
Thanks Stephan.
1. Good to know that Aurora keeps a history of 100 finished tasks. I didn't
find any documentation about this. Is it configurable?
2. An example is attached. I build the Go code into an executable, and an ID
is passed in when Thermos executes it.
3. I used the official RPM package to install Aurora. So the log location
should be /var/log/aurora (I found nothing there)? BTW, is there a way to
turn on debug logging?

On Mon, May 2, 2016 at 10:12 PM, Erb, Stephan <St...@blue-yonder.com>
wrote:

> Going beyond what I wrote at
> http://stackoverflow.com/questions/36941450/apache-aurora-cron-jobs-are-not-scheduled
> here are a few remarks:
>
>
> * By default, Aurora only keeps a history of 100 finished tasks. The number
> of completed cron runs is therefore not expected to keep increasing, even
> if everything works as expected.
>
> * Is your example cron job in a form that you can share with us? That would
> allow us to reproduce your problem more easily.
>
> * The location of your log files depends on how you installed Aurora. Did
> you use the official Debian packages or RPMs?
>
>
> Best Regards,
>
> Stephan


-- 
Regards, Zi-Liang

Mail:zlchen.ken@gmail.com

Re: Aurora cron jobs are not scheduled

Posted by "Erb, Stephan" <St...@blue-yonder.com>.
Going beyond what I wrote at
http://stackoverflow.com/questions/36941450/apache-aurora-cron-jobs-are-not-scheduled
here are a few remarks:


* By default, Aurora only keeps a history of 100 finished tasks. The number of completed cron runs is therefore not expected to keep increasing, even if everything works as expected.

* Is your example cron job in a form that you can share with us? That would allow us to reproduce your problem more easily.

* The location of your log files depends on how you installed Aurora. Did you use the official Debian packages or RPMs?
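
The first remark can be illustrated with a toy model. This is only a sketch of a capped history, not Aurora's actual pruning code: the names HISTORY_LIMIT and finished_history_sizes are made up for illustration, and it assumes that the oldest finished task is simply dropped whenever the cap of 100 would be exceeded.

```python
from collections import deque

# Default cited in the thread; how Aurora actually prunes is an assumption here.
HISTORY_LIMIT = 100


def finished_history_sizes(total_runs, limit=HISTORY_LIMIT):
    """Return the number of retained finished tasks after each completed run."""
    # A deque with maxlen silently discards its oldest entry once it is full.
    history = deque(maxlen=limit)
    sizes = []
    for run_id in range(1, total_runs + 1):
        history.append(run_id)  # one more cron run has finished
        sizes.append(len(history))
    return sizes


sizes = finished_history_sizes(250)
# Retained count after runs 99, 100, 101 and 250.
print(sizes[98], sizes[99], sizes[100], sizes[-1])
```

Under these assumptions the count of finished tasks rises to 100 and then stays flat, so a plateau at 100 does not by itself show that the jobs have stopped running.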


Best Regards,

Stephan
