Posted to commits@airflow.apache.org by po...@apache.org on 2020/12/30 17:26:25 UTC

[airflow] branch master updated: Improves documentation regarding providers and custom connections (#13375)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
     new b52d39f  Improves documentation regarding providers and custom connections (#13375)
b52d39f is described below

commit b52d39f0475beeaaa5d4e53c79232d64ab3cf12d
Author: Bijan Soltani <46...@users.noreply.github.com>
AuthorDate: Wed Dec 30 18:26:08 2020 +0100

    Improves documentation regarding providers and custom connections (#13375)
    
    Co-authored-by: Bijan <me...@bijansoltani.com>
---
 docs/apache-airflow-providers/index.rst  | 61 ++++++++++++++++++++++++--------
 docs/apache-airflow/howto/connection.rst | 21 +++++------
 2 files changed, 58 insertions(+), 24 deletions(-)
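
Note on the entry point mechanism documented in this change: the sketch
below is an illustration only, not Airflow's actual discovery code. It
assumes the entry point group name ``apache_airflow_provider`` and the
example metadata dictionary from the diff; the helper name
``discover_provider_info`` is hypothetical. It uses only the standard
library ``importlib.metadata`` API to list installed entry points in that
group and call each one to obtain its metadata.

```python
from importlib.metadata import entry_points

# Entry point group named in the documentation changed by this commit.
GROUP = "apache_airflow_provider"


def get_provider_info():
    """Hypothetical provider metadata, mirroring the example in the diff."""
    return {
        "package-name": "my-package-name",
        "name": "name",
        "description": "a description",
        "hook-class-names": ["myproviderpackage.hooks.source.SourceHook"],
        "versions": ["1.0.0"],
    }


def discover_provider_info(group=GROUP):
    """Load every installed entry point in `group` and call it.

    Returns a list because several packages may register the same
    entry point name (for example ``provider_info``).
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points object
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: plain dict of group name -> entry points
        selected = eps.get(group, [])
    return [ep.load()() for ep in selected]
```

With a package installed whose ``setup.cfg`` registers
``provider_info=myproviderpackage.somemodule:get_provider_info``, calling
``discover_provider_info()`` would be expected to include that package's
dictionary in the returned list.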

diff --git a/docs/apache-airflow-providers/index.rst b/docs/apache-airflow-providers/index.rst
index 39034f1..2f067ce 100644
--- a/docs/apache-airflow-providers/index.rst
+++ b/docs/apache-airflow-providers/index.rst
@@ -24,12 +24,12 @@ Provider packages
 Provider packages context
 '''''''''''''''''''''''''
 
-Unlike Apache Airflow 1.10, the Airflow 2.0 is delivered in multiple, separate, but connected packages.
+Unlike Apache Airflow 1.10, Airflow 2.0 is delivered in multiple, separate but connected packages.
 The core of Airflow scheduling system is delivered as ``apache-airflow`` package and there are around
 60 providers packages which can be installed separately as so called "Airflow Provider packages".
 Those provider packages are separated per-provider (for example ``amazon``, ``google``, ``salesforce``
-etc.)  Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
-(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package.
+etc.). Those packages are available as ``apache-airflow-providers`` packages - separately per each provider
+(for example there is an ``apache-airflow-providers-amazon`` or ``apache-airflow-providers-google`` package).
 
 You can install those provider packages separately in order to interface with a given provider. For those
 providers that have corresponding extras, the provider packages (latest version from PyPI) are installed
@@ -72,7 +72,7 @@ Separate provider packages provide the possibilities that were not available in
 Extending Airflow Connections and Extra links via Providers
 '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
 
-Providers can not only deliver operators, hooks, sensor, transfer operators to communicate with
+Providers can not only deliver operators, hooks, sensors, and transfer operators to communicate with a
 multitude of external systems, but they can also extend Airflow. Airflow has several extension capabilities
 that can be used by providers. Airflow automatically discovers which providers add those additional
 capabilities and, once you install provider package and re-start Airflow, those become automatically
@@ -80,18 +80,17 @@ available to Airflow Users.
 
 The capabilities are:
 
-* Adding Extra Links to operators delivered by the provider.
-  See :doc:`apache-airflow:howto/define_extra_link`
-  for description of what extra links are and examples of provider registering an operator with extra links
+* Adding Extra Links to operators delivered by the provider. See :doc:`apache-airflow:howto/define_extra_link`
+  for a description of what extra links are and an example of a provider registering an operator with extra links.
 
 * Adding custom connection types, extending connection form and handling custom form field behaviour for the
-  connections defined by the provider. See :doc:`apache-airflow:howto/connection` for description of
+  connections defined by the provider. See :doc:`apache-airflow:howto/connection` for a description of
   connection and what capabilities of custom connection you can define.
 
 How to create your own provider
-"""""""""""""""""""""""""""""""
+'''''''''''''''''''''''''''''''
 
-Adding provider to Airflow is just a matter of building a Python package and adding the right meta-data to
+Adding a provider to Airflow is just a matter of building a Python package and adding the right meta-data to
 the package. We are using standard mechanism of python to define
 `entry points <https://docs.python.org/3/library/importlib.metadata.html#entry-points>`_ . Your package
 needs to define appropriate entry-point ``apache_airflow_provider`` which has to point to a callable
@@ -111,7 +110,7 @@ your own purpose) but the two important fields from the extensibility point of v
   :doc:`apache-airflow:howto/connection` for more details.
 
 
-When your providers are installed you can query the installed providers and their capabilities with
+When your providers are installed you can query the installed providers and their capabilities with the
 ``airflow providers`` command. This way you can verify if your providers are properly recognized and whether
 they define the extensions properly. See :doc:`cli-and-env-variables-ref` for details of available CLI
 sub-commands.
@@ -178,17 +177,51 @@ Creating your own providers
 **When I write my own provider, do I need to do anything special to make it available to others?**
 
 You do not need to do anything special besides creating the ``apache_airflow_provider`` entry point
-returning properly formatted meta-data (dictionary with ``extra-links`` and ``hook-class-names`` fields.
+returning properly formatted meta-data (dictionary with ``extra-links`` and ``hook-class-names`` fields).
+
+**What do I need to do to turn a package into a provider?**
+
+You need to do the following to turn an existing Python package into a provider (see below for examples):
+
+* Add the ``apache_airflow_provider`` entry point in the ``setup.cfg`` - this tells Airflow where to get
+  the required provider metadata
+* Create the function that you refer to in the first step as part of your package: this function returns a
+  dictionary that contains all meta-data about your provider package; see also ``provider.yaml``
+  files in the community managed provider packages as examples
+
+Example ``setup.cfg``:
+
+.. code-block:: cfg
+
+  [options.entry_points]
+  # the function get_provider_info is defined in myproviderpackage.somemodule
+  apache_airflow_provider=
+    provider_info=myproviderpackage.somemodule:get_provider_info
+
+Example ``myproviderpackage/somemodule.py``:
+
+.. code-block:: python
+
+  def get_provider_info():
+      return {
+          "package-name": "my-package-name",
+          "name": "name",
+          "description": "a description",
+          "hook-class-names": [
+              "myproviderpackage.hooks.source.SourceHook",
+          ],
+          "versions": ["1.0.0"],
+      }
 
 
-**Should I named my provider specifically or should it be created in ``airflow.providers`` package?**
+**Should I name my provider specifically or should it be created in ``airflow.providers`` package?**
 
 We have quite a number (>70) of providers managed by the community and we are going to maintain them
 together with Apache Airflow. All those providers have well-defined structured and follow the
 naming conventions we defined and they are all in ``airflow.providers`` package. If your intention is
 to contribute your provider, then you should follow those conventions and make a PR to Apache Airflow
 to contribute to it. But you are free to use any package name as long as there are no conflicts with other
-names,so preferably choose package that is in your "domain".
+names, so preferably choose a package that is in your "domain".
 
 **Is there a convention for a connection id and type?**
 
diff --git a/docs/apache-airflow/howto/connection.rst b/docs/apache-airflow/howto/connection.rst
index 25f1646..2d15324 100644
--- a/docs/apache-airflow/howto/connection.rst
+++ b/docs/apache-airflow/howto/connection.rst
@@ -327,23 +327,24 @@ an secrets backend to retrieve connections. For more details see :doc:`/security
 Custom connection types
 -----------------------
 
-Airflow allows to define custom connection types - including modification of the add/edit form for the
-connections. Custom connection types are defined in community maintained providers, but also you can add
-custom providers, that can add their own connection types. See :doc:`apache-airflow-providers:index`
-for description on how to add your own connection type via custom providers.
+Airflow allows the definition of custom connection types - including modifications of the add/edit form
+for the connections. Custom connection types are defined in community maintained providers, but you
+can also add a custom provider that adds custom connection types. See :doc:`apache-airflow-providers:index`
+for a description of how to add custom providers.
 
 The custom connection types are defined via Hooks delivered by the providers. The Hooks can implement
-methods defined in the protocol :class:`~airflow.hooks.base_hook.DiscoverableHook`. Note that your custom
-Hook should not derive from the class, the class is merely there to document expectations about class
-fields and methods that your Hook might define.
+methods defined in the protocol class :class:`~airflow.hooks.base_hook.DiscoverableHook`. Note that your
+custom Hook should not derive from this class; it is a dummy example that documents expectations
+regarding the class fields and methods that your Hook might define. Another good example is
+:py:class:`~airflow.providers.jdbc.hooks.jdbc.JdbcHook`.
 
-By implementing those method in the hooks of yours and exposing them via ``hook-class-names`` array in
+By implementing those methods in your hooks and exposing them via the ``hook-class-names`` array in
 the provider meta-data you can customize Airflow by:
 
-* Adding custom connection type
+* Adding custom connection types
 * Adding automated Hook creation from the connection type
 * Adding custom form widget to display and edit custom "extra" parameters in your connection URL
 * Hiding fields that are not used for your connection
 * Adding placeholders showing examples of how fields should be formatted
 
-You can read more about details how to add custom connection type in the :doc:`apache-airflow-providers:index`
+You can read more about how to add custom provider packages in :doc:`apache-airflow-providers:index`.