Posted to dev@airflow.apache.org by "Daniel Lamblin [Data Science & Platform Center]" <la...@coupang.com> on 2017/10/13 07:11:44 UTC

Indefinitely Queued Tasks

Let me know if you really do need emails entirely in plaintext.

It seems there was a fix, AIRFLOW-1074, which prevents tasks that a worker
rejected from being orphaned and queued indefinitely. Here is the commit I
could find:
<https://github.com/apache/incubator-airflow/commit/70024935f24e0ff3d2861c0ccfa69cdd38084b9d>
That fix reads to me as though it was meant to address the FIXME in models
<https://github.com/apache/incubator-airflow/blob/cfc2f73c445074e1e09d6ef6a056cd2b33a945da/airflow/models.py#L1355>
and possibly also the FIXME in jobs
<https://github.com/apache/incubator-airflow/blob/cfc2f73c445074e1e09d6ef6a056cd2b33a945da/airflow/jobs.py#L1967>.

Those FIXME messages weren't removed, and I'm investigating one case of a job
that logged them in my cluster, even though we're running 1.8.2.
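
For anyone who wants to check their own task logs for the same message, here
is a minimal sketch of the kind of scan I mean. Only the FIXME text itself
comes from the log message quoted in this thread; the function name
find_fixme_lines, the timestamp layout, and the sample lines are hypothetical
and only there for illustration.

```python
# Message text as quoted in this thread. The real log line may carry extra
# text after it, so this is matched as a substring (an assumption).
FIXME_MARKER = "FIXME: Rescheduling due to concurrency limits reached at task runtime."


def find_fixme_lines(lines):
    """Return (line_number, text) pairs for lines containing the FIXME message."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if FIXME_MARKER in line
    ]


# Hypothetical sample lines, not copied from a real Airflow log.
sample_log = [
    "[2017-10-12 09:00:01] INFO - Starting task\n",
    "[2017-10-12 09:00:02] WARNING - FIXME: Rescheduling due to concurrency limits reached at task runtime.\n",
    "[2017-10-12 09:00:03] INFO - Task exited\n",
]

for number, text in find_fixme_lines(sample_log):
    print(number, text)
```

To scan a real file, the same function can be given an open file object, since
it only iterates over lines.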

Do I misunderstand the fix, or was it considered partial or incomplete? Is
there an archive of the discussion around it? (I only found one email, in
which someone was told his issue would be fixed in 1.8.2.) What would have to
be done further to fix this and remove the FIXME?

Thanks!
-- 
-Daniel Lamblin

Re: Indefinitely Queued Tasks

Posted by "Daniel Lamblin [Data Science & Platform Center]" <la...@coupang.com>.
Re: "FIXME: Rescheduling due to concurrency limits reached at task runtime."
So is this not the right place to ask for *one* of these:

   - The person who wrote the "FIXME", to explain it.
   - A Jira issue tracking the fix proposal or its progress.
   - A design document or email thread about what is expected to be fixed
   at this point in models and jobs.

Maybe someone can recommend the right way for me to find out for myself.

On Thu, Oct 19, 2017 at 7:00 PM, Daniel Lamblin [Data Science & Platform
Center] <la...@coupang.com> wrote:

> Re: "FIXME: Rescheduling due to concurrency limits reached at task
> runtime."
> The short version of the above question is:
> Can someone point me to one of:
>
>    - the person who wrote the "FIXME" to explain it.
>    - A Jira Issue tracking the FIX proposal or progress.
>    - A design document or email thread about what is expected to be fixed
>    at this point in models and jobs.
>
> Thanks!
> -Daniel



-- 
-Daniel Lamblin

Re: Indefinitely Queued Tasks

Posted by "Daniel Lamblin [Data Science & Platform Center]" <la...@coupang.com>.
Re: "FIXME: Rescheduling due to concurrency limits reached at task runtime."
The short version of the above question is:
Can someone point me to one of:

   - The person who wrote the "FIXME", to explain it.
   - A Jira issue tracking the fix proposal or its progress.
   - A design document or email thread about what is expected to be fixed
   at this point in models and jobs.

Thanks!
-Daniel




-- 
-Daniel Lamblin