You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/13 20:47:41 UTC

[GitHub] [airflow] Taragolis opened a new pull request, #24425: Preserve original order of providers' connection extra fields

Taragolis opened a new pull request, #24425:
URL: https://github.com/apache/airflow/pull/24425

   At that moment additional extra fields in Provider Connection always orders in alphabetical order of their keys.
   So there is no possible to change this order in UI.
   
   This PR remove sorts original dictionary and keep it stored in original order that it defined in Hook.
   
   Probably I miss idea of this additional transformation dict -> sorted tuples -> OrderedDict, or probably it kept since time when Airflow supported Python 3.5
   
   <details>
     <summary>Snowflake Connection Example</summary>
   
   https://github.com/apache/airflow/blob/6d69dd062f079a8fbf72563fd218017208bfe6c1/airflow/providers/snowflake/hooks/snowflake.py#L86-L110
   
   ### Before Changes
   ![snowflake-before](https://user-images.githubusercontent.com/3998685/173441355-94d7c651-d4b5-4294-990b-b1abf55d951c.jpeg)
   
   ### After Changes
   ![snowflake-after](https://user-images.githubusercontent.com/3998685/173441372-6b209689-b3a5-4df0-8c0d-754658ec62e7.jpeg)
   
   </details>
   
   
   <details>
     <summary>K8S Connection Example</summary>
   
   https://github.com/apache/airflow/blob/6d69dd062f079a8fbf72563fd218017208bfe6c1/airflow/providers/cncf/kubernetes/hooks/kubernetes.py#L84-L107
   
   ### Before Changes
   ![k8s-before](https://user-images.githubusercontent.com/3998685/173441627-ee68719d-31a4-4a40-824b-5bb0216dab10.jpeg)
   
   ### After Changes
   ![k8s-after](https://user-images.githubusercontent.com/3998685/173442126-91d18a1a-af88-4212-8dc1-6f43808a9ad8.png)
   
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154522724

   Win in test lottery 🤣 
   
   ```  
     >           assert this_pid notin lines[0]  # ensures callback is NOT run by LocalTaskJob
     E           AssertionError: assert '132' not in 'test_on_fai...th pid: 5132'
     E             '132' is contained here:
     E               with pid: 5132
     E             ?            +++
     
     tests/jobs/test_local_task_job.py:571: AssertionError
   ```
   
   https://github.com/apache/airflow/blob/69c46252dd222fbcbfdd035ce6de1868b719023f/tests/jobs/test_local_task_job.py#L571


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1155605084

   I make some changes, return sorted to cli command, add some comments
    
   cc: @uranusjr @potiuk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1155014920

   > If we talk about cli `airflow providers` is it mean that in anyway if we call command it would order exactly once in any case? So move sorts in cli part doesn't affect anything
   
   It's also for all other cases (like API) - it's just "preventive sorting" - if you use it later, it will be sorted. I just saw too many cases were people did not care when they implemented it in the "human-facing" context. Many people don't think empathetically about the users. 
   
   Actually this was implemented here precisely because I saw a bunch of CLI commands that did not do sorting at all and the output changed randomly as you added more connections. and I often struggled with the output of cli and APIs. 
   
   So sorting everything upfront is protection against programmers who use the output wtthout thinking about the users :). It happens more often than you think. It did happen in Airlfow.
   
   > Oh it some kind of "Holy War. Just personal thoughts since preserve dict keys orders as part of the language standard in the most cases there is no reason explicit use OrderedDict.
   
   Not realy "war" (and I am not string about it :D ) - but I believe that when you write code, you should optimize for reading, not writing. While you can produce less code by removing Ordered Dict (and technically you can do it and effect of it is the same), sometimes writing more explicit code expressing the intention is better. 
   
   For example we would never have this discussion here - if it was not written this way, and @uranusjr  figured correctly the intent from the code without knowing it upfront. If you don't express the intent, then you could see the sorting order here are "accidental", but it was actually intended and for a good reason. 
   
   It could be a comment, why it is done this way of course, but too many comments are also not good.
   
   Also personal comment :). Don't intend to start tabs vs. spaces war here, just wanted to explain the line of thoughts I actually had when implementing it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154879159

   It seem like `field_behaviours` is iterated through in the CLI command `airflow providers behaviours`. This is also the case for `connection_form_widgets` (`airflow providers widgets`). We probably should sort the output there, or at least add a flag to toggle the sorting behaviour. This can be significant if there are a lot of entries in the list.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1159789567

   Intermittent failure. Merging. Preserving original order makes sense here indeed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154908701

   The main idea to keep them sorted is for `airlfow providers command` - in order to get stable behaviour and alphabetic sorrting when they are printed in a command line. I believe wherever possible where human looks at the output list, the list should be sorted, otherwise it makes it extremely difficult to  find what you are looking for if the output is not alphabetically sorted. 
   
   And using OrderedDict in this is case is "intent signalling" - no more, no less. When you are using dict, it's not "obvious" that your intent was to keep the entries is in order of insert. Yeah it's implementation detail in 3.6 and part of specification as of python 3.7 but by using OrderedDict you explicitly tell the reader "yeah - I want to keep it sorted and this was the intention".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154826607

   OrderedDict is fine, they don’t add overhead anyway and it’s of minimal benefit to actively remove them. It’s this `OrderedDict(sorted())` combination that’s problematic, which is why I wonder about `_field_behaviours` specifically.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1155079492

   > It's also for all other cases (like API) - it's just "preventive sorting" - if you use it later
   
   Seems like API doesn't use it yet, so it also could be sorted later in API when it implemented
   
   > Many people don't think empathetically about the users.
   
   The reason of this change is actually thinking about user. I've accidentally found it when I've tried implement extra fields for AWS Connection and found that doesn't matter how I define it in `get_connection_form_widgets()` in ui it appear in the order of keys like.
   
   Expectation
   ```json
   {
     "extra__aws__aws_session_token": "placeholder PasswordField()",
     "extra__aws__region_name": "placeholder StringField()",
     "extra__aws__profile": "placeholder StringField()",
     "extra__aws__role_arn": "placeholder StringField()",
     "extra__aws__assume_role_method": "placeholder StringField()",
     "extra__aws__assume_role_kwargs": "placeholder StringField()"
   }
   ```
   
   Actually
   ```json
   {
     "extra__aws__assume_role_kwargs": "placeholder StringField()",
     "extra__aws__assume_role_method": "placeholder StringField()",
     "extra__aws__aws_session_token": "placeholder PasswordField()",
     "extra__aws__profile": "placeholder StringField()",
     "extra__aws__region_name": "placeholder StringField()",
     "extra__aws__role_arn": "placeholder StringField()"
   }
   ```
   
   In this case I've expected that in UI:
   * _extra__aws__aws_session_token_ in a top of the extra fields
   * _extra__aws__role_arn_, _extra__aws__assume_role_method_, _extra__aws__assume_role_kwargs_  placed together
   
   And also nice to have ability to place very rarely used arguments and deprecated in the bottom of the connection UI.
   
   At that moment it could be only implemented by some hacks (which produces more issues rather than benefits) such as
   ```
   {
     "extra__aws__01_aws_session_token": "placeholder PasswordField()",
     "extra__aws__02_region_name": "placeholder StringField()",
     "extra__aws__03_profile": "placeholder StringField()",
     "extra__aws__04_role_arn": "placeholder StringField()",
     "extra__aws__05_assume_role_method": "placeholder StringField()",
     "extra__aws__06_assume_role_kwargs": "placeholder StringField()"
   }
   ``` 
   
   Or add additional field which returns by `get_ui_field_behaviour()` like **ui_sorting_order** which required makes changes in provider manager, UI, JSON schema
   
   **I guessed it wrong initially** that my simple change doesn't use effect anything rather than UI. My fault
   But what I've suggest now, it keep natural sort in UI and additionally sorts in cli
   
   > I believe that when you write code, you should optimize for reading, not writing. While you can produce less code by removing Ordered Dict (and technically you can do it and effect of it is the same), sometimes writing more explicit code expressing the intention is better.
   
   I would rather move discussion about OrderdDict in direct messages, such a slack, if you would like. Personally for me see OrderedDict more confusing rather than dict, because I thought is it for compatibility or for some additional methods of OrderedDict.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #24425:
URL: https://github.com/apache/airflow/pull/24425


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154685564

   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154870547

   > OrderedDict is fine, they don’t add overhead anyway and it’s of minimal benefit to actively remove them. 
   
   Actually the do add overhead but agree that the beneficial is minimal - Airflow doesn't actively use OrderedDict 
   
   > It’s this OrderedDict(sorted()) combination that’s problematic, which is why I wonder about _field_behaviours specifically.
   
   I could also remove this line.
   https://github.com/apache/airflow/blob/34e4073316fced72183255390835b6baebd44cbb/airflow/providers_manager.py#L654
   
   Initially I removed both line, but then I realise that only `_connection_form_widgets` affect to order fields elements in UI


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1154820140

   > I wonder if we should do the same with _field_behaviours
   
   I thought all code which use `OrderedDict` could be replaced by `dict`, as you mention since Python 3.7+ ordering in dict is stable, furthermore since CPython 3.6 ordering should be stable.
   
   Particular in that place, as I could find when I investigate,  the ordering do not affect anything.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #24425: Preserve original order of providers' connection extra fields in UI

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #24425:
URL: https://github.com/apache/airflow/pull/24425#issuecomment-1155003343

   > We probably should sort the output there, or at least add a flag to toggle the sorting behaviour. This can be significant if there are a lot of entries in the list.
   
   If we talk about cli `airflow providers` is it mean that in anyway if we call command it would order exactly once in any case? So move sorts in cli part doesn't affect anything
   https://github.com/apache/airflow/blob/fe2ef0fcc75fa7be9e1ec1113af260a3a3c9e7ae/airflow/cli/commands/provider_command.py#L82-L93
   
   > And using OrderedDict in this is case is "intent signalling" - no more, no less.
   
   Oh it some kind of "Holy War. Just personal thoughts since preserve `dict` keys orders as part of the language standard in the most cases there is no reason explicit use `OrderedDict`. 
   
   I think `OrderedDict` useful nowadays only when you need compare two dict and be sure that the order exactly the same. Because there is no differences between dict(sorted(d.items())) and OrderedDict(sorted(d.items)), rather than interacting with dict has slightly better performance.
   
   But again that just personal thoughts


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org