Posted to dev@airflow.apache.org by Björn Pollex <bj...@soundcloud.com.INVALID> on 2018/10/04 15:05:46 UTC

Manual validation operator

Hi all,

In some of our workflows we require a manual validation step, where some generated data has to be reviewed by an authorised person before the workflow can continue. We currently model this by using a custom dummy operator that always fails. After the review, we manually mark it as success and clear the downstream tasks. This works, but it would be nice to have a better representation of this in the UI. The customisation points for plugins don’t seem to offer any way of customising the UI for specific operators.

Does anyone else have similar use cases? How are you handling this?

Cheers,

	Björn Pollex

Re: Manual validation operator

Posted by Maxime Beauchemin <ma...@gmail.com>.

It's a bit of a hack, but to save slots you could just have an
instantly-failing PythonOperator (just raise an exception in the callable)
that goes straight into a failed state. Marking it as "success" when the
conditions are met would act as a trigger.
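
A minimal sketch of that idea, assuming Airflow 2.x, where PythonOperator
is imported from airflow.operators.python (1.10 releases used
airflow.operators.python_operator). The names ManualValidationPending,
await_manual_validation and build_validation_task, and the task id, are
made up for illustration:

```python
class ManualValidationPending(Exception):
    """Raised to park the task in the failed state until a reviewer acts."""


def await_manual_validation(**context):
    # Fail immediately so the task does not occupy a worker slot while
    # the review is outstanding.
    raise ManualValidationPending(
        "Waiting for manual review; mark this task as success to continue."
    )


def build_validation_task(dag):
    # Imported lazily so the callable above can be exercised without Airflow.
    # Assumption: Airflow 2.x import path.
    from airflow.operators.python import PythonOperator

    return PythonOperator(
        task_id="manual_validation",  # hypothetical task id
        python_callable=await_manual_validation,
        retries=0,  # retrying cannot help; only a human can unblock this task
        dag=dag,
    )
```

Once the review is done, marking manual_validation as success and clearing
the downstream tasks lets the run continue.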

On Fri, Oct 5, 2018 at 9:07 AM Brian Greene <br...@heisenbergwoodworking.com>
wrote:

> My first thought was this, but my understanding is that if you had a
> large number of DAGs “waiting”, the sensors would consume all the
> concurrency.
>
> And what if the user doesn’t approve?
>
> How about having the DAG write the status to an API/DB as its last
> step?
>
> Then two other DAGs (or one with a branch) can each have a sensor that’s
> watching for approved/unapproved values.  When it finds one (or a batch,
> depending on how you write it), it triggers the “next” DAG.
>
> This leaves only 1-2 sensors running and would enable your process without
> anyone using the Airflow UI (assuming they have some other way to mark
> “approval”).  This avoids the “process by error and recover” logic that it
> seems you’d like to get away from.  (Which makes sense to me.)
>
> B

Re: Manual validation operator

Posted by Brian Greene <br...@heisenbergwoodworking.com>.

My first thought was this, but my understanding is that if you had a large number of DAGs “waiting”, the sensors would consume all the concurrency.

And what if the user doesn’t approve?

How about having the DAG write the status to an API/DB as its last step?

Then two other DAGs (or one with a branch) can each have a sensor that’s watching for approved/unapproved values.  When it finds one (or a batch, depending on how you write it), it triggers the “next” DAG.

This leaves only 1-2 sensors running and would enable your process without anyone using the Airflow UI (assuming they have some other way to mark “approval”).  This avoids the “process by error and recover” logic that it seems you’d like to get away from.  (Which makes sense to me.)
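
The flow described above could be sketched roughly as follows, using SQLite from the Python standard library as a stand-in for the "api/db". Everything here is an assumption made for illustration: the table layout, the function names, and the follow-up DAG ids. How the follow-up DAG actually gets triggered is left to a caller-supplied function; inside Airflow that might be a TriggerDagRunOperator or an API call.

```python
import sqlite3

# Hypothetical ids of the DAGs that handle each review outcome.
FOLLOW_UP_DAG = {"approved": "publish_data", "rejected": "handle_rejection"}


def init_store(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS review ("
        "run_id TEXT PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "dispatched INTEGER NOT NULL DEFAULT 0)"
    )


def request_review(conn, run_id):
    """Last step of the producing DAG: record that a run awaits review."""
    conn.execute(
        "INSERT INTO review (run_id, status) VALUES (?, 'pending')", (run_id,)
    )
    conn.commit()


def record_decision(conn, run_id, approved):
    """Called by whatever tool the reviewer uses, outside the Airflow UI."""
    status = "approved" if approved else "rejected"
    conn.execute("UPDATE review SET status = ? WHERE run_id = ?", (status, run_id))
    conn.commit()


def dispatch_decided(conn, trigger):
    """One poll of the watcher: trigger a follow-up DAG per decided run."""
    rows = conn.execute(
        "SELECT run_id, status FROM review "
        "WHERE status != 'pending' AND dispatched = 0 ORDER BY run_id"
    ).fetchall()
    for run_id, status in rows:
        trigger(FOLLOW_UP_DAG[status], run_id)
        # Mark the run so the next poll does not trigger it again.
        conn.execute("UPDATE review SET dispatched = 1 WHERE run_id = ?", (run_id,))
    conn.commit()
    return len(rows)
```

Because a single watcher handles a whole batch per poll, the number of waiting runs does not affect how many sensors are active, and a rejected run gets an explicit path of its own.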

B

Re: Manual validation operator

Posted by Alek Storm <al...@gmail.com>.

Hi Björn,

We also sometimes require manual validation, and though we haven't yet
implemented this, I imagine you could store the approved/unapproved status
of the job in a database, expose it via an API, and write an Airflow sensor
that continuously polls that API until the status becomes "approved", at
which point the DAG execution will continue.
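
A rough sketch of such a sensor, assuming Airflow 2.x (where PythonSensor
is imported from airflow.sensors.python) and a hypothetical HTTP endpoint
that returns JSON such as {"status": "approved"}. The URL, the response
shape, and all function names are assumptions, not an existing service:

```python
import json
from urllib.request import urlopen

# Hypothetical endpoint exposing the review status of a job.
BASE_URL = "https://validation.example.com/jobs"


def fetch_status(url):
    # Assumes the API answers with a JSON object containing a "status" key.
    with urlopen(url, timeout=10) as response:
        return json.load(response)["status"]


def poke_for_approval(job_id, fetch=fetch_status, base_url=BASE_URL):
    """Return True once the job is approved; the sensor keeps polling otherwise."""
    return fetch(f"{base_url}/{job_id}") == "approved"


def build_approval_sensor(dag, job_id):
    # Imported lazily so the polling logic can be tested without Airflow.
    # Assumption: Airflow 2.x import path.
    from airflow.sensors.python import PythonSensor

    return PythonSensor(
        task_id="wait_for_approval",  # hypothetical task id
        python_callable=poke_for_approval,
        op_kwargs={"job_id": job_id},
        poke_interval=300,  # seconds between polls
        dag=dag,
    )
```

The fetch parameter exists only so the polling logic can be checked
without a live API, for example poke_for_approval("42", fetch=lambda url:
"pending").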

Best,
Alek Storm
