Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/04/20 19:26:06 UTC

[GitHub] [airflow] itayB opened a new pull request, #23128: image building documentation: adding new provider example

itayB opened a new pull request, #23128:
URL: https://github.com/apache/airflow/pull/23128

   * The current "upgrading provider" example is misleading: in `apache/airflow:2.2.5`, which ships with `apache-airflow-providers-docker==2.5.2`, installing `apache-airflow-providers-docker==2.1.0` is a downgrade rather than an upgrade. That's why I switched it to a "custom" provider version (it is hard to keep the version in the example up to date).
   * Following the [last comment](https://stackoverflow.com/a/71142395/1011253) on the StackOverflow answer, I'm adding a common example of Spark usage which combines a Python provider package with an `apt` package.
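
   For illustration, pinning a "custom" provider version when extending the image might look roughly like the sketch below. It reuses the image tag and provider version quoted above purely as an example (on this image, that pin is a downgrade); it is not necessarily the Dockerfile added by this PR.

   ```Dockerfile
   # Sketch only: extend the official image and pin one provider to a chosen version.
   # The tag and version are the ones quoted in the description above, used as an example.
   FROM apache/airflow:2.2.5

   # The image's default user is "airflow", so pip installs into that user's environment.
   # Note: relative to the 2.5.2 shipped in this image, 2.1.0 is a downgrade.
   RUN pip install --no-cache-dir "apache-airflow-providers-docker==2.1.0"
   ```

   The point of calling the version "custom" is that the pin can go either way relative to what the base image ships.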


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] itayB commented on a diff in pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
itayB commented on code in PR #23128:
URL: https://github.com/apache/airflow/pull/23128#discussion_r855956171


##########
docs/docker-stack/build.rst:
##########
@@ -274,18 +274,27 @@ You should be aware, about a few things:
 Examples of image extending
 ---------------------------
 
-Example of upgrading Airflow Provider packages
+Example of customizing Airflow Provider packages
 ..............................................
 
 The :ref:`Airflow Providers <providers:community-maintained-providers>` are released independently of core
 Airflow and sometimes you might want to upgrade specific providers only to fix some problems or
 use features available in that provider version. Here is an example of how you can do it
 
-.. exampleinclude:: docker-examples/extending/add-providers/Dockerfile
+.. exampleinclude:: docker-examples/extending/custom-providers/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+Example of adding Airflow Provider packages
+...................................................
+
+The following example adds ``apache-spark`` airflow-providers which requires both ``java``.

Review Comment:
   oops, fixed



##########
docs/docker-stack/build.rst:
##########
@@ -274,18 +274,27 @@ You should be aware, about a few things:
 Examples of image extending
 ---------------------------
 
-Example of upgrading Airflow Provider packages
+Example of customizing Airflow Provider packages
 ..............................................
 
 The :ref:`Airflow Providers <providers:community-maintained-providers>` are released independently of core
 Airflow and sometimes you might want to upgrade specific providers only to fix some problems or
 use features available in that provider version. Here is an example of how you can do it
 
-.. exampleinclude:: docker-examples/extending/add-providers/Dockerfile
+.. exampleinclude:: docker-examples/extending/custom-providers/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+Example of adding Airflow Provider packages

Review Comment:
   Thanks @mik-laj !
   
   I don't mind moving it to the "Recipes" document or closing this PR if you think that it's redundant.
   As a Spark user (using `SparkSubmitOperator` in AWS) I think that the example I added is short and simple. 
   
   The linked recipe seems to deal with Java, Hadoop, Hive, Spark, and GCP.
   * I didn't understand how it installs Spark: it seems to only create a spark directory and download some Google Cloud jars. Where are the [Spark files](https://www.apache.org/dyn/closer.lua/spark/spark-3.2.1/spark-3.2.1-bin-hadoop3.2.tgz) themselves? By using `apache-airflow-providers-apache-spark` I get them automatically (from `pyspark`). What am I missing?
   * In my use case, which I assume is relatively common, Hadoop/Hive/GCP are not required. Only a Spark installation is required to trigger a submit; the Hadoop-related jars are on the Spark driver, which runs on a different pod/instance.
   * Java installation: isn't it better to give an example that installs an official Debian package rather than downloading from a third-party `jfrog`?
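
   As a rough sketch of the short approach argued for above (Java from the distribution's package repository, Spark pulled in through the provider's `pyspark` dependency), something like the following could work. The `openjdk-11-jre-headless` package name, the amd64 `JAVA_HOME` path, and the unpinned provider version are assumptions for illustration, not taken from the PR; in practice you would likely pin the provider version.

   ```Dockerfile
   # Sketch only: add a Java runtime via apt, then the Spark provider via pip.
   FROM apache/airflow:2.2.5

   # apt packages have to be installed as root.
   USER root
   RUN apt-get update \
     && apt-get install -y --no-install-recommends \
            openjdk-11-jre-headless \
     && apt-get autoremove -yqq --purge \
     && apt-get clean \
     && rm -rf /var/lib/apt/lists/*

   # Switch back to the airflow user before installing Python packages.
   USER airflow

   # Assumed path for the Debian OpenJDK 11 package on amd64; adjust for other architectures.
   ENV JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64

   # The provider depends on pyspark, which brings the Spark client files with it.
   RUN pip install --no-cache-dir apache-airflow-providers-apache-spark
   ```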
   





[GitHub] [airflow] mik-laj commented on a diff in pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
mik-laj commented on code in PR #23128:
URL: https://github.com/apache/airflow/pull/23128#discussion_r854623728


##########
docs/docker-stack/build.rst:
##########
@@ -274,18 +274,27 @@ You should be aware, about a few things:
 Examples of image extending
 ---------------------------
 
-Example of upgrading Airflow Provider packages
+Example of customizing Airflow Provider packages
 ..............................................
 
 The :ref:`Airflow Providers <providers:community-maintained-providers>` are released independently of core
 Airflow and sometimes you might want to upgrade specific providers only to fix some problems or
 use features available in that provider version. Here is an example of how you can do it
 
-.. exampleinclude:: docker-examples/extending/add-providers/Dockerfile
+.. exampleinclude:: docker-examples/extending/custom-providers/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+Example of adding Airflow Provider packages

Review Comment:
   We already have one guide about Java installation: https://airflow.apache.org/docs/docker-stack/recipes.html#apache-hadoop-stack-installation
   I am not sure whether we should add a new section with the same content or whether it is enough to add a reference to the existing section.





[GitHub] [airflow] itayB commented on pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
itayB commented on PR #23128:
URL: https://github.com/apache/airflow/pull/23128#issuecomment-1106218643

   > Looks cool. Some docs are failing though.
   
   They pass now, sorry about that.




[GitHub] [airflow] potiuk merged pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #23128:
URL: https://github.com/apache/airflow/pull/23128




[GitHub] [airflow] potiuk commented on a diff in pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #23128:
URL: https://github.com/apache/airflow/pull/23128#discussion_r854597537


##########
docs/docker-stack/build.rst:
##########
@@ -274,18 +274,27 @@ You should be aware, about a few things:
 Examples of image extending
 ---------------------------
 
-Example of upgrading Airflow Provider packages
+Example of customizing Airflow Provider packages
 ..............................................
 
 The :ref:`Airflow Providers <providers:community-maintained-providers>` are released independently of core
 Airflow and sometimes you might want to upgrade specific providers only to fix some problems or
 use features available in that provider version. Here is an example of how you can do it
 
-.. exampleinclude:: docker-examples/extending/add-providers/Dockerfile
+.. exampleinclude:: docker-examples/extending/custom-providers/Dockerfile
     :language: Dockerfile
     :start-after: [START Dockerfile]
     :end-before: [END Dockerfile]
 
+Example of adding Airflow Provider packages
+...................................................
+
+The following example adds ``apache-spark`` airflow-providers which requires both ``java``.

Review Comment:
   unfinished sentence?





[GitHub] [airflow] potiuk commented on pull request #23128: image building documentation: adding new provider example

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #23128:
URL: https://github.com/apache/airflow/pull/23128#issuecomment-1104507521

   Looks cool. Some docs are failing though.

