You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/15 17:45:11 UTC

[GitHub] [airflow] potiuk opened a new pull request #17625: Adds secret backend and logging information to provider yaml

potiuk opened a new pull request #17625:
URL: https://github.com/apache/airflow/pull/17625


   This is preparatory work to automatically generate summary of
   secret backends and logging handlers for all providers.
   
   It adds logging/secret-backends section in the provider.yaml
   and refactors the code a little to extract common initialization
   routines to a decorator.
   
   The provider manager is also simplified by removing the
   unnecessary `_add` methods.
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689124798



##########
File path: airflow/providers_manager.py
##########
@@ -457,3 +498,15 @@ def field_behaviours(self) -> Dict[str, Dict]:
         """Returns dictionary with field behaviours for connection types."""
         self.initialize_providers_hooks()
         return self._field_behaviours
+
+    @property
+    def logging_class_names(self) -> List[str]:
+        """Returns set of log task handlers class names."""
+        self.initialize_providers_logging()
+        return sorted(self._logging_class_name_set)
+
+    @property
+    def secret_backend_class_names(self) -> List[str]:

Review comment:
       ```suggestion
       def secrets_backend_class_names(self) -> List[str]:
   ```

##########
File path: airflow/providers_manager.py
##########
@@ -457,3 +498,15 @@ def field_behaviours(self) -> Dict[str, Dict]:
         """Returns dictionary with field behaviours for connection types."""
         self.initialize_providers_hooks()
         return self._field_behaviours
+
+    @property
+    def logging_class_names(self) -> List[str]:
+        """Returns set of log task handlers class names."""
+        self.initialize_providers_logging()
+        return sorted(self._logging_class_name_set)
+
+    @property
+    def secret_backend_class_names(self) -> List[str]:
+        """Returns set of secret backend class names."""
+        self.initialize_providers_secret_backends()

Review comment:
       ```suggestion
           self.initialize_providers_secrets_backends()
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899087483


   @dimberman  - that one might also help to simplify your Provider Manager's  change from #15330 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899611288


   > Glad that you reminded me, I should have updated this as well. I will fix it in a moment.
   > 
   
   Ah no. I did not forget  - I've already added it this weekend :)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899755719


   > It makes sense to me. We can add these commands, but I am not sure if it will be informative enough. The class name is not enough to configure the integrative ones. The user will still need to search for more information in our documentation.
   
   This is fine. The whole point is to make the user aware that there is "something" that they should look for. That's the 'discoverability" I mentioned before.
   
    That's why I think this is the very same problem as lack of information in "logging" docs" that there are some "community loggers" .


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899287726


   This is not documentation change (yet). It's preparatory work and the main purpose is to prepare a common interface and communicatoin to both "community" providers developers as well as (and this is more important goal) to "custom providers writers"  who get the tool to know about what and how they can customise in their own providers.  
   
   Eventually "get_provider_info"  provides summarized, queryeable, discoverable information about all things provider "provides" - not only those that require some "deep" integration,
   
   The main purpose is to get one place with information on capabilities of providers. This is purely informational one telling "this provider provides those features". It helps with discovery (again) if someone uses "providers" command (or in the future web ui). This is a good way to show what custom providers can deliver by exposing features of providers not only in the documentation, but also in the actual airlfow CLI, "customise  providers" documentation, and webui. One of the best way to teach the users how they can add their own providers and expose such functionalities in similar way. 
   
   Additional (more important) feature is that during our tests we are importing the classes exposed in `provider.yaml` and checking if they are properly configured (which also happens in runtime). This makes it easy to reason  about any kind of typos you might have and you will see them early (as I did and fixed it while adding it).
   
   We are just about to add @taskgroups in the same way.
   
   This is about it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899113732


   I am not sure if I understand what we want to achieve in this change and if we really have adopted the correct data structure for what we want to achieve.
   
   Previously, adding new information to [`provider_info`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider_info.schema.json)/ProviderManager was always related to adding new features or as required by making existing features work. I considered the ability to view information using the CLI as features that only facilitated troubleshooting, but was never the main goal.
   
   Currently, we have the following keys in `providers_info.schema.json` on the main branch.
   `description` -> Makes it easier to identify the package.
   `extra-links` -> Required by extra links to prevent loading unknown classes. Only registered classes can create extra links.
   `hook-class-names` -> Required by Connection UI.
   `name` ->  Makes it easier to identify the package.
   `package-name` ->  Makes it easier to identify the package.
   
   Could you please give **the main purpose** why we want to add this information to provider_info? Should we also add information about operators, sensors, hooks to `provider_info`? How is Airflow going to behave when the class is not added to provider_info but is used by Airflow? If your main goal is to prepare a list of auth_backend/secrets-backend/task-handlers in our documentation. I think it is enough for us to just add these keys to `provider.yaml` like `operators`, `hooks`, `sensors`, `integrations`,  without adding to `provider_info`.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899624391


   What do you think - others?  I think this is really, really useful to be able to see what kind of secret backends we have extra, what logging handlers, what authentication APIS we have. While I agree listing all operators/hooks/sensor for provider makes no sense, listing the "core feature extensions" of Airflow IMHO is worth to be queryable at runtime (and verifiable). For me this is again - disciverability problem (same as with docs) rather that "let the users figure it out and search on their own". 
   
   There is big difference IMHO when you want to just integrate operators/sensors/hooks of a given service, comparing to when you want to extend a "built-in" airflow capability with a given integration. Those are even useful for different kind of users - logging/auth/seceret backends are for Devops/Operations people, where Operators/Hooks/Sensors are for Dag writers. Those are different audiences - CLI is there for the Devops, they will use to to query for different kinds of things and I find it would be super-useful to see it next to "extra links", "connections"  - which are all managed not by DAG writers but by DevOps.
   
   I believe we should not look at it from the "airflow developer" point of view where we treat Operators/Hooks/Sensors vs. Auth/Secrets/Logging as similar "beings" inside  "provider" package (python constructs) and literally "the same kind of thing". 
   
   But I think we should look at it from the side of the user - IMHO even if "Operators/Hooks/Sensors vs Auth/Secrets/Logging" are packaged together in a provider, they have conceptually completely different scope, they are used differently and they have different audience (and we should treat them differently),


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689670688



##########
File path: airflow/providers_manager.py
##########
@@ -61,6 +62,36 @@ def _create_customized_form_field_behaviours_schema_validator():
     return validator
 
 
+def _sanity_check(provider_package: str, class_name: str) -> bool:
+    """
+    Performs sanity check on provider classes.
+    For apache-airflow providers - it checks if it starts with appropriate package. For all providers
+    it tries to import the provider - checking that there are no exceptions during importing.
+    """
+    if provider_package.startswith("apache-airflow"):
+        provider_path = provider_package[len("apache-") :].replace("-", ".")
+        if not class_name.startswith(provider_path):
+            log.warning(
+                "Sanity check failed when importing '%s' from '%s' package. It should start with '%s'",
+                class_name,
+                provider_package,
+                provider_path,
+            )
+            return False
+    try:
+        module, class_name = class_name.rsplit('.', maxsplit=1)
+        getattr(importlib.import_module(module), class_name)

Review comment:
       Can we use [`import_string`](https://github.com/apache/airflow/blob/main/airflow/utils/module_loading.py#L22) here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899883500


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899624391


   What do you think - others?  I think this is really, really useful to be able to see what kind of secret backends we have extra, what logging handlers, what authentication APIS we have. While I agree listing all operators/hooks/sensor for provider makes no sense, listing the "core feature extensions" of Airflow IMHO is worth to be queryable at runtime (and verifiable). For me this is again - disciverability problem (same as with docs) rather that "let the users figure it out and search on their own". 
   
   There is big difference IMHO when you want to just integrate operators/sensors/hooks of a given service, comparing to when you want to extend a "built-in" airflow capability with a given integration. Those are even useful for different kind of users - logging/auth/seceret backends are for Devops/Operations people, where Operators/Hooks/Sensors are for Dag writers. Those are different audiences - CLI is there for the Devops, they will use to to query for different kinds of things and I find it would be super-useful.
   
   I believe we should not look at it from the "airflow developer" point of view where we treat Operators/Hooks/Sensors vs. Auth/Secrets/Logging as similar "beings" inside  "provider" package (python constructs) and literally "the same kind of thing". 
   
   But I think we should look at it from the side of the user - IMHO even if "Operators/Hooks/Sensors vs Auth/Secrets/Logging" are packaged together in a provider, they have conceptually completely different scope, they are used differently and they have different audience (and we should treat them differently),


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899637316


   It makes sense to me. We can add these commands, but I am not sure if it will be informative enough. The class name is not enough to configure the integrative ones. The user will still need to search for more information in our documentation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689795995



##########
File path: airflow/providers_manager.py
##########
@@ -61,6 +62,36 @@ def _create_customized_form_field_behaviours_schema_validator():
     return validator
 
 
+def _sanity_check(provider_package: str, class_name: str) -> bool:
+    """
+    Performs sanity check on provider classes.
+    For apache-airflow providers - it checks if it starts with appropriate package. For all providers
+    it tries to import the provider - checking that there are no exceptions during importing.
+    """
+    if provider_package.startswith("apache-airflow"):
+        provider_path = provider_package[len("apache-") :].replace("-", ".")
+        if not class_name.startswith(provider_path):
+            log.warning(
+                "Sanity check failed when importing '%s' from '%s' package. It should start with '%s'",
+                class_name,
+                provider_package,
+                provider_path,
+            )
+            return False
+    try:
+        module, class_name = class_name.rsplit('.', maxsplit=1)
+        getattr(importlib.import_module(module), class_name)

Review comment:
       Ah . I did not even know we had it . Sure :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689124494



##########
File path: airflow/provider.yaml.schema.json
##########
@@ -210,6 +210,20 @@
     "additional-extras": {
       "type": "object",
       "description": "Additional extras that the provider should have"
+    },
+    "secret-backends": {

Review comment:
       ```suggestion
       "secrets-backends": {
   ```

##########
File path: airflow/provider.yaml.schema.json
##########
@@ -210,6 +210,20 @@
     "additional-extras": {
       "type": "object",
       "description": "Additional extras that the provider should have"
+    },
+    "secret-backends": {
+      "type": "array",
+      "description": "Secret Backend class names",

Review comment:
       ```suggestion
         "description": "Secrets Backend class names",
   ```

##########
File path: airflow/provider_info.schema.json
##########
@@ -27,6 +27,20 @@
       "items": {
         "type": "string"
       }
+    },
+    "secret-backends": {
+      "type": "array",
+      "description": "Secret Backend class names",

Review comment:
       ```suggestion
         "description": "Secrets Backend class names",
   ```

##########
File path: airflow/cli/commands/provider_command.py
##########
@@ -113,3 +113,27 @@ def extra_links_list(args):
             "extra_link_class_name": x,
         },
     )
+
+
+@suppress_logs_and_warning
+def logging_list(args):
+    """Lists all log task handlers at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().logging_class_names),
+        output=args.output,
+        mapper=lambda x: {
+            "logging_class_name": x,
+        },
+    )
+
+
+@suppress_logs_and_warning
+def secret_backends_list(args):
+    """Lists all Secret Backends at the command line"""

Review comment:
       ```suggestion
       """Lists all Secrets Backends at the command line"""
   ```

##########
File path: airflow/cli/commands/provider_command.py
##########
@@ -113,3 +113,27 @@ def extra_links_list(args):
             "extra_link_class_name": x,
         },
     )
+
+
+@suppress_logs_and_warning
+def logging_list(args):
+    """Lists all log task handlers at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().logging_class_names),
+        output=args.output,
+        mapper=lambda x: {
+            "logging_class_name": x,
+        },
+    )
+
+
+@suppress_logs_and_warning
+def secret_backends_list(args):

Review comment:
       ```suggestion
   def secrets_backends_list(args):
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899113732


   I am not sure if I understand what we want to achieve in this change and if we really have adopted the correct data structure for what we want to achieve.
   
   Previously, adding new information to [`provider_info`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider_info.schema.json)/ProviderManager was always related to adding new features or as required by making existing features work. I considered the ability to view information using the CLI as features that only facilitated troubleshooting, but was never the main goal.
   
   Currently, we have the following keys in providers_info schema.json on the main branch.
   `description` -> Makes it easier to identify the package.
   `extra-links` -> Required by extra links to prevent loading unknown classes. Only registered classes can create extra links.
   `hook-class-names` -> Required by Connection UI.
   `name` ->  Makes it easier to identify the package.
   `package-name` ->  Makes it easier to identify the package.
   
   Could you please give **the main purpose** why we want to add this information to provider_info? Should we also add information about operators, sensors, hooks to `provider_info`? How is Airflow going to behave when the class is not added to provider_info but is used by Airflow?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899611288


   > Glad that you reminded me, I should have updated this as well. I will fix it in a moment.
   > 
   
   Ah no. I did not forget  - I've already added it this weekend :)
   
   https://github.com/apache/airflow/pull/17625/files#diff-2d9d90ece6c5a0bc080e0e20bf6a00534c7b642c2ef9a5b83f66894052913c4bR30


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899289180


   Documentation for that one (including the summary pages in documentation) and update to "How to write your Custom Provider" to include logging, secret backends and API authentication backends and much deeper interlinking between Airflow "features" and "Communtiy Providers" summary providing those featuers, will follow as the next PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899113732


   I am not sure if I understand what we want to achieve in this change and if we really have adopted the correct data structure for what we want to achieve.
   
   Previously, adding new information to [`provider_info`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider_info.schema.json)/ProviderManager was always related to adding new features or as required by making existing features work. I considered the ability to view information using the CLI as features that only facilitated troubleshooting, but was never the main goal.
   
   Currently, we have the following keys in `providers_info.schema.json` on the main branch.
   `description` -> Makes it easier to identify the package.
   `extra-links` -> Required by extra links to prevent loading unknown classes. Only registered classes can create extra links.
   `hook-class-names` -> Required by Connection UI.
   `name` ->  Makes it easier to identify the package.
   `package-name` ->  Makes it easier to identify the package.
   
   Could you please give **the main purpose** why we want to add this information to provider_info? Should we also add information about operators, sensors, hooks to `provider_info`? How is Airflow going to behave when the class is not added to provider_info but is used by Airflow? If your main goal is to prepare a list of auth_backend/secrets-backend/task-handlers in our documentation. I think it is enough for us to just add these keys to [`provider.yaml`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider.yaml.schema.json) like `operators`, `hooks`, `sensors`, `integrations`,  without adding to `provider_info`.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899609892


   > You previously added provider info schema to separate information needed only for building documentation from that required for Airflow operation. #13488
   
   Glad that you reminded me, I should have updated this as well. I will fix it in a moment.
   
   > Has anything changed in this regard? Do we want to eliminate this division? What if someone does not define a class in provider_info and wants to use it? 
   
   Nothing changed, I've forgotten to update it, simply. There is no blocking of any sorts. All the schema options in those schemas are optional.
   
   > We have pre-commit checks for testing provider.yaml files, so we can achieve similar featture without any changes to Airflow. Additionally, these tests check that all integration have been added to the provider.yaml files.
   
   And I will also add it while adding the documentation, no doubt about it. The pre-commit will not work for custom providers, the check is really not for ".yaml" but for "get_provider_info" output (those very same checks are run in our CI when we test providers).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689794770



##########
File path: airflow/provider_info.schema.json
##########
@@ -27,6 +27,27 @@
       "items": {
         "type": "string"
       }
+    },
+    "secrets-backends": {

Review comment:
       I deliberately did not add it now. 
   
   I plan to do it in the follow-up documentation-specific PR where I am going to generate automated "summaries" in providers and provide all the inter-linking from relevant parts of the documentation. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689124542



##########
File path: airflow/cli/cli_parser.py
##########
@@ -1267,6 +1267,18 @@ class GroupCommand(NamedTuple):
         func=lazy_load_command('airflow.cli.commands.provider_command.connection_field_behaviours'),
         args=(ARG_OUTPUT, ARG_VERBOSE),
     ),
+    ActionCommand(
+        name='logging',
+        help='Get information about task logging handlers provided',
+        func=lazy_load_command('airflow.cli.commands.provider_command.logging_list'),
+        args=(ARG_OUTPUT, ARG_VERBOSE),
+    ),
+    ActionCommand(
+        name='secrets',
+        help='Get information about secret backends provided',
+        func=lazy_load_command('airflow.cli.commands.provider_command.secret_backends_list'),

Review comment:
       ```suggestion
           func=lazy_load_command('airflow.cli.commands.provider_command.secrets_backends_list'),
   ```

##########
File path: airflow/cli/cli_parser.py
##########
@@ -1267,6 +1267,18 @@ class GroupCommand(NamedTuple):
         func=lazy_load_command('airflow.cli.commands.provider_command.connection_field_behaviours'),
         args=(ARG_OUTPUT, ARG_VERBOSE),
     ),
+    ActionCommand(
+        name='logging',
+        help='Get information about task logging handlers provided',
+        func=lazy_load_command('airflow.cli.commands.provider_command.logging_list'),
+        args=(ARG_OUTPUT, ARG_VERBOSE),
+    ),
+    ActionCommand(
+        name='secrets',
+        help='Get information about secret backends provided',

Review comment:
       ```suggestion
           help='Get information about secrets backends provided',
   ```

##########
File path: airflow/cli/commands/provider_command.py
##########
@@ -113,3 +113,27 @@ def extra_links_list(args):
             "extra_link_class_name": x,
         },
     )
+
+
+@suppress_logs_and_warning
+def logging_list(args):
+    """Lists all log task handlers at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().logging_class_names),
+        output=args.output,
+        mapper=lambda x: {
+            "logging_class_name": x,
+        },
+    )
+
+
+@suppress_logs_and_warning
+def secret_backends_list(args):
+    """Lists all Secret Backends at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().secret_backend_class_names),

Review comment:
       ```suggestion
           data=list(ProvidersManager().secrets_backend_class_names),
   ```

##########
File path: airflow/cli/commands/provider_command.py
##########
@@ -113,3 +113,27 @@ def extra_links_list(args):
             "extra_link_class_name": x,
         },
     )
+
+
+@suppress_logs_and_warning
+def logging_list(args):
+    """Lists all log task handlers at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().logging_class_names),
+        output=args.output,
+        mapper=lambda x: {
+            "logging_class_name": x,
+        },
+    )
+
+
+@suppress_logs_and_warning
+def secret_backends_list(args):
+    """Lists all Secret Backends at the command line"""
+    AirflowConsole().print_as(
+        data=list(ProvidersManager().secret_backend_class_names),
+        output=args.output,
+        mapper=lambda x: {
+            "secret_backend_class_name": x,

Review comment:
       ```suggestion
               "secrets_backend_class_name": x,
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689124615



##########
File path: airflow/providers/microsoft/azure/provider.yaml
##########
@@ -165,3 +165,9 @@ hook-class-names:
   - airflow.providers.microsoft.azure.hooks.wasb.WasbHook
   - airflow.providers.microsoft.azure.hooks.azure_data_factory.AzureDataFactoryHook
   - airflow.providers.microsoft.azure.hooks.azure_container_registry.AzureContainerRegistryHook
+
+secret-backends:

Review comment:
       ```suggestion
   secrets-backends:
   ```

##########
File path: airflow/providers/hashicorp/provider.yaml
##########
@@ -43,3 +43,6 @@ hooks:
 
 hook-class-names:
   - airflow.providers.hashicorp.hooks.vault.VaultHook
+
+secret-backends:

Review comment:
       ```suggestion
   secrets-backends:
   ```

##########
File path: airflow/providers/google/provider.yaml
##########
@@ -757,3 +757,10 @@ extra-links:
 additional-extras:
   apache.beam: apache-beam[gcp]
   leveldb: plyvel
+
+secret-backends:

Review comment:
       ```suggestion
   secrets-backends:
   ```

##########
File path: airflow/providers/amazon/provider.yaml
##########
@@ -399,3 +399,11 @@ hook-class-names:
   - airflow.providers.amazon.aws.hooks.s3.S3Hook
   - airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
   - airflow.providers.amazon.aws.hooks.emr.EmrHook
+
+secret-backends:

Review comment:
       ```suggestion
   secrets-backends:
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899408078


   You previously added provider info schema to separate information needed only for building documentation from that required for Airflow operation. https://github.com/apache/airflow/pull/13488
   Has anything changed in this regard? Do we want to eliminate this division? What if someone does not define a class in provider_info and wants to use it? Should we then block it?
   
   > Additional (more important) feature is that during our tests we are importing the classes exposed in provider.yaml and checking if they are properly configured (which also happens in runtime). This makes it easy to reason about any kind of typos you might have and you will see them early (as I did and fixed it while adding it).
   
   We have pre-commit checks for testing provider.yaml files, so we can achieve similar featture without any changes to Airflow. Additionally, these tests check that all integration have been added to the provider.yaml files.
   https://github.com/apache/airflow/blob/main/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #17625:
URL: https://github.com/apache/airflow/pull/17625


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899624391


   What do you think - others?  I think this is really, really useful to be able to see what kind of secret backends we have extra, what logging handlers, what authentication APIS we have. While I agree listing all operators/hooks/sensor for provider makes no sense, listing the "core feature extensions" of Airflow IMHO is worth to be queryable at runtime (and verifiable). For me this is again - disciverability problem (same as with docs) rather that "let the users figure it out and search on their own". 
   
   There is big difference IMHO when you want to just integrate operators/sensors/hooks of a give service, and whether you want to extend a "built-in" airflow capability with a given integration. Those are even useful for different kind of users - logging/auth/seceret backends are for Devops/Operations people, where Operators/Hooks/Sensors are for Dag writers. Those are different audiences - CLI is there for the Devops, they will use to to query for different kinds of things and I find it would be super-useful.
   
   I believe we should not look at it from the "airflow developer" point of view where we treat Operators/Hooks/Sensors vs. Auth/Secrets/Logging as similar "beings" inside  "provider" package (python constructs) and literally "the same kind of thing". 
   
   But I think we should look at it from the side of the user - IMHO even if "Operators/Hooks/Sensors vs Auth/Secrets/Logging" are packaged together in a provider, they have conceptually completely different scope, they are used differently and they have different audience (and we should treat them differently),


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899098210


   > Can (Should ?) we add api-auth backends too ?
   > 
   > https://airflow.apache.org/docs/apache-airflow-providers-google/stable/api-auth-backend/google-openid.html
   
   Added, the secrets -> secrets name changed as well. (BTW - there are a number of other places where `Secret Backend` is also used rather than `Secrets Backend` so it should likely be fixed there)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899113732


   I am not sure if I understand what we want to achieve in this change and if we really have adopted the correct data structure for what we want to achieve.  How are the changes to Airflow CLI and core to improve our documentation?
   
   Previously, adding new information to [`provider_info`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider_info.schema.json)/ProviderManager was always related to adding new features or as required by making existing features work. I considered the ability to view information using the CLI as features that only facilitated troubleshooting, but was never the main goal.
   
   Currently, we have the following keys in `providers_info.schema.json` on the main branch.
   `description` -> Makes it easier to identify the package.
   `extra-links` -> Required by extra links to prevent loading unknown classes. Only registered classes can create extra links.
   `hook-class-names` -> Required by Connection UI.
   `name` ->  Makes it easier to identify the package.
   `package-name` ->  Makes it easier to identify the package.
   
   Could you please give **the main purpose** why we want to add this information to provider_info? Should we also add information about operators, sensors, hooks to `provider_info`? How is Airflow going to behave when the class is not added to provider_info but is used by Airflow? If your main goal is to prepare a list of auth_backend/secrets-backend/task-handlers in our documentation. I think it is enough for us to just add these keys to [`provider.yaml`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider.yaml.schema.json) like `operators`, `hooks`, `sensors`, `integrations`,  without adding to `provider_info`.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689675469



##########
File path: airflow/provider_info.schema.json
##########
@@ -27,6 +27,27 @@
       "items": {
         "type": "string"
       }
+    },
+    "secrets-backends": {

Review comment:
       Can you update the user docs also?
   http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow-providers/howto/create-update-providers.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#discussion_r689124474



##########
File path: airflow/provider_info.schema.json
##########
@@ -27,6 +27,20 @@
       "items": {
         "type": "string"
       }
+    },
+    "secret-backends": {

Review comment:
       ```suggestion
       "secrets-backends": {
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #17625: Adds secret backend and logging information to provider yaml

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #17625:
URL: https://github.com/apache/airflow/pull/17625#issuecomment-899113732


   I am not sure if I understand what we want to achieve in this change and if we really have adopted the correct data structure for what we want to achieve.  How are the changes to Airflow CLI to improve our documentation?
   
   Previously, adding new information to [`provider_info`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider_info.schema.json)/ProviderManager was always related to adding new features or as required by making existing features work. I considered the ability to view information using the CLI as features that only facilitated troubleshooting, but was never the main goal.
   
   Currently, we have the following keys in `providers_info.schema.json` on the main branch.
   `description` -> Makes it easier to identify the package.
   `extra-links` -> Required by extra links to prevent loading unknown classes. Only registered classes can create extra links.
   `hook-class-names` -> Required by Connection UI.
   `name` ->  Makes it easier to identify the package.
   `package-name` ->  Makes it easier to identify the package.
   
   Could you please give **the main purpose** why we want to add this information to provider_info? Should we also add information about operators, sensors, hooks to `provider_info`? How is Airflow going to behave when the class is not added to provider_info but is used by Airflow? If your main goal is to prepare a list of auth_backend/secrets-backend/task-handlers in our documentation. I think it is enough for us to just add these keys to [`provider.yaml`](https://raw.githubusercontent.com/apache/airflow/e09ce9a25972344eb8760ccabea2ff042ed8ee35/airflow/provider.yaml.schema.json) like `operators`, `hooks`, `sensors`, `integrations`,  without adding to `provider_info`.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org