You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@airflow.apache.org by Daniel Standish <da...@astronomer.io.INVALID> on 2022/07/21 21:39:03 UTC

[LAZY CONSENSUS] deprecate "dependency detector" pluggability

Hi y'all

I am calling for a vote by lazy consensus.  Proposal below.  Relevant PR:
https://github.com/apache/airflow/pull/25175

If there are no objections by sunday july 22 at 2:40pm America/Los_Angeles,
then the proposal will be adopted.

Thanks for your consideration.

*Background:*

Did you know that airflow has a "dag dependencies" view?  Wellp, we do.
The dag deps view was added recently, and it is sorta for dags what the dag
graph is for tasks.

As part of the implementation of AIP-48, I was working on adding datasets
to this view.  And I found that it depends on this one-method class,
DependencyDetector
<https://github.com/apache/airflow/blob/main/airflow/serialization/serialized_objects.py#L556-L577>.
But that "detector" logic assumed that the task-to-external-dag
relationship is 1-1.  With datasets, however, this could be 1-many.  So I
need to change the return type on this method.  But the problem is, this
thing is *pluggable!*  User can supply its own detector class with config
scheduler > dependency_detector.  This makes it more complicated to change
the view (i.e. by adding datasets) since now I have to worry about
backcompat with this detector class.

*Proposal:*

I propose we deprecate this config option.  Until the next major release,
if a custom dependency detector is configured, it will be called, and the
returned dependency will be added to the dependencies we get from our core
logic. But after next major, we'll no longer consider custom dependency
detectors.  And immediately, we'll remove it as a config option. Note: when
consideration of custom detectors is finally removed, it won't break
anything; the custom dependency detector will simply be ignored.

*Discussion:*

I suspect that the only reason it was made pluggable was to let users
"register" custom operators as having interdag deps.  When it was added, we
hardcoded just two operators (with known inter-dag dependencies), and by
making the entire class overridable, we enabled users to do the same for
custom operators.

But instead of making the entire class overridable, we could instead do
something a little less invasive like... instead of checking the instance
<https://github.com/apache/airflow/blob/main/airflow/serialization/serialized_objects.py#L562>,
just check for the attribute
<https://github.com/apache/airflow/blob/main/airflow/serialization/serialized_objects.py#L565>.
Or we could do something like we do with template_fields (on operator) or
conn_name_attr (on hooks), i.e. allow user to specify, in the operator
itself, the [optional] field where we refer to either the upstream or
downstream dag_id / task_id.  Then, we don't need anything to be pluggable
and user can effectively "register" the operator as having upstream or
downstream relationships.

In any event, this class is used in only one view.  To me, making such a
thing pluggable gives it a prominence that is really not merited, and not
worth the tradeoff in overhead / development burden.  So I think it's best
to just remove it as a config option.  And if there is desire in the future
to enable users to register other operators as having interdag deps, we can
do so in a way that introduces less overhead.

Re: [LAZY CONSENSUS] deprecate "dependency detector" pluggability

Posted by Daniel Standish <da...@astronomer.io.INVALID>.
This vote has officially passed.

On Thu, Jul 21, 2022 at 3:59 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Was it really *puggable* ? I would have never guessed. Not so lazy
> consensus here ....
>
> On Thu, Jul 21, 2022 at 11:44 PM Daniel Standish
> <da...@astronomer.io.invalid> wrote:
> >
> > Sorry --- sunday july 24
>

Re: [LAZY CONSENSUS] deprecate "dependency detector" pluggability

Posted by Jarek Potiuk <ja...@potiuk.com>.
Was it really *puggable* ? I would have never guessed. Not so lazy
consensus here ....

On Thu, Jul 21, 2022 at 11:44 PM Daniel Standish
<da...@astronomer.io.invalid> wrote:
>
> Sorry --- sunday july 24

Re: [LAZY CONSENSUS] deprecate "dependency detector" pluggability

Posted by Daniel Standish <da...@astronomer.io.INVALID>.
Sorry --- sunday july *24*