You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/12 13:56:44 UTC

[GitHub] [airflow] npodewitz opened a new pull request, #25002: Fix backwards incompatible change for group id naming check

npodewitz opened a new pull request, #25002:
URL: https://github.com/apache/airflow/pull/25002

   This PR addresses a backwards incompatible change introduced with https://github.com/apache/airflow/pull/17578.
   
   Prior to the change a task group id could contain all characters that were valid for a task id. With the additional validation of the group id the dot character was removed from the list of valid characters.
   Although I can understand the rationale behind this, it seems inconsistent/arbitrary to me as illustrated by the picture below.
   ![task_groups](https://user-images.githubusercontent.com/100271696/178501811-1d1e5cb4-a21c-4132-bd49-c0bf94c7411b.png)
   
   The illustrated tasks all have an effective task id of the form "A.B.C". Right now 1.) through 3.) are valid but 4.) is not, which this PR would remedy. Afterwards all 4 cases would be valid.
   
   I would argue that one cannot reverse engineer the groups from an effective task id alone anyway (e.g. by splitting at every dot), so there should ne no reason why a group id cannot contain a dot.
   
   Here is a real world example where this would help:
   We use airflow to load our data warehouse. Each table load (of which are more than a thousand) consists of the same steps grouped together in a task group (e.g. truncate, load, refresh_stats). This task group is named after the fully qualified name  of the table, which is "schema.table_name".
   
   
   At last I would like to thank you all for your work on this great piece of software.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] npodewitz commented on pull request #25002: Fix backwards incompatible change for group id naming check

Posted by GitBox <gi...@apache.org>.
npodewitz commented on PR #25002:
URL: https://github.com/apache/airflow/pull/25002#issuecomment-1182874348

   Hi,
   
   thanks for the quick response. I did not consider subdags before.
   
   However, I respectfully would like to point out:
   1. Maybe I got this wrong but I thought that "." is used to separate dags from subdags in the dag id. How is this relevant to task and group ids?
   2. Disregarding subdags, a task id like "A.B.C" can be created in 4 different ways (see picture above), 3 of which are valid and one is not (since Airflow 2.1.3, prior to 2.1.3 all 4 cases were valid). You cannot derive the task groups from the task id alone so I do not understand why "." is not allowed in the group id.
   3. Case 4 was valid and working up until Airflow 2.1.3 and did break a lot of things for us when we did the upgrade (that is why we created and use this patch)
   
   
   I am sorry to bother you again with this issue and I would be gratefull if you would reconsider this pr. In any case I would like to thank you for your time and work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed pull request #25002: Fix backwards incompatible change for group id naming check

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #25002: Fix backwards incompatible change for group id naming check
URL: https://github.com/apache/airflow/pull/25002


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25002: Fix backwards incompatible change for group id naming check

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25002:
URL: https://github.com/apache/airflow/pull/25002#issuecomment-1182457247

   Th reason is that we already use '.' to separate Dags from subdags. Until this feature is deprecated, there is no way to distinguish them. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25002: Fix backwards incompatible change for group id naming check

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25002:
URL: https://github.com/apache/airflow/pull/25002#issuecomment-1182458553

   More reasons are explained in the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25002: Fix backwards incompatible change for group id naming check

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25002:
URL: https://github.com/apache/airflow/pull/25002#issuecomment-1183059063

   I think - if anything - it just adds confusion. Even if things are "technically" posisble having a separator which can act differently depending on the context of use is confusing for the users. 
   
   I honestly did not even understand all the  complexities involved in your explanation. There are 4 different cases you say and only one of them is valid. Maybe yes, maybe no. But if a 4-year maintainer has a trouble with wrapping his head around understanding why soimetimes "," is valid and sometimes not, then imagine what could be the confusion with a regular user.
   
   Just the fact that you have to explain it and deep dive into why one is ok and the other not and the fact that it is not obvious, is enough to replace it with way simpler "don't use dot". It's simpler to explain, less confusing and very easy to reason about. Most importantly it will not lead to a number of users opening issues, and (rightfully) demanding explanation and guidance.
   
   I prefer to tell the users"don't use dot". That's far better from product point of view rather than looking at it from single user point of view who already used ".".  
   
   I do not recommend you to keep the patch. It is unlikely to be merged back. Rather than that, you can convert your dags to use '-', "_" for example. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org