Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/05 20:17:40 UTC

[GitHub] [airflow] potiuk opened a new pull request #22017: Removes limitations from Dask dependencies

potiuk opened a new pull request #22017:
URL: https://github.com/apache/airflow/pull/22017


   Dask dependencies were holding us back when it comes to upgrading
   some of the packages (for example apache-beam and looker in the google
   provider). This PR removes the limitations, but with a twist:
   
   * The Dask tests stop working. Since a very old version of the
     `distributed` library was used, the Dask team has been asked for
     help in fixing them.
   
   * The typing-extensions library was limited by `distributed`, but it
     seems that version 4.0.0+ breaks the Kubernetes tests.
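   As an illustration of the kind of pins being discussed (these are hypothetical entries, not the actual `setup.py` of this PR), removing the upper bounds on the Dask extras while capping `typing-extensions` might look like:
   
   ```python
   # Hypothetical dependency lists illustrating the change described above;
   # the real pins live in Airflow's setup.py and differ in detail.
   dask_extras = [
       'cloudpickle>=1.4.1',
       'dask>=2.9.0',          # upper limit removed
       'distributed>=2.11.1',  # upper limit removed
   ]
   
   # typing-extensions gains a cap because 4.0.0+ breaks the Kubernetes tests.
   limited_deps = ['typing-extensions>=3.7.4,<4.0.0']
   ```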
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#issuecomment-1060034762


   > @potiuk isn't it possible (and easier) to skip these tests "from the outside" – ?
   > 
   > I would think `skipif` is more suitable for a situation where you want to skip a test depending on for example the operating system.
   
   How would you imagine "external" exclusion :)? 
   
   I would like to avoid having anyone add extra pytest arguments or command-line switches to skip those. The idea is that whoever runs "all tests" for Airflow can see the "blessed tests" succeed without having to specify any exclusions.
   
   In our case, if you want to make sure that all tests are succeeding you do this:
   
   * `./breeze build-image --python 3.7`
   * `./breeze --backend mysql`
   * while in "breeze" you run `pytest tests` 
   
   Those three commands should lead to successful test execution (in 3.7 + mysql combination).
   
   Having to add some additional exclusion rules is bad, especially since we want Dask team members to help with fixing those. What we ask of them is to eventually remove the skipIf (initially, set `skip_dask_tests` to False and run the failed tests). They do not have to learn anything about external scripts and "exclusions" we do outside of our Python code. Their fix will be limited to the "dask" tests only - they don't have to understand what our CI does and how we exclude stuff externally.
   
   I think in this case a skipIf "close" to the tests being disabled is exactly what we need.



[GitHub] [airflow] potiuk merged pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #22017:
URL: https://github.com/apache/airflow/pull/22017


   




[GitHub] [airflow] potiuk commented on pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#issuecomment-1059832519


   cc: @alekseiloginov  - related to #20882 




[GitHub] [airflow] potiuk commented on a change in pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#discussion_r820302132



##########
File path: tests/executors/test_dask_executor.py
##########
@@ -20,31 +20,39 @@
 from unittest import mock
 
 import pytest
+from distributed import LocalCluster
 
 from airflow.exceptions import AirflowException
+from airflow.executors.dask_executor import DaskExecutor
 from airflow.jobs.backfill_job import BackfillJob
 from airflow.models import DagBag
 from airflow.utils import timezone
 from tests.test_utils.config import conf_vars
 
 try:
-    from distributed import LocalCluster
-
     # utility functions imported from the dask testing suite to instantiate a test
     # cluster for tls tests
+    from distributed import tests  # noqa
     from distributed.utils_test import cluster as dask_testing_cluster, get_cert, tls_security
 
-    from airflow.executors.dask_executor import DaskExecutor
-
     skip_tls_tests = False
 except ImportError:
     skip_tls_tests = True
+    # In case the tests are skipped because of lacking test harness, get_cert should be
+    # overridden to avoid get_cert failing during test discovery as get_cert is used
+    # in conf_vars decorator
+    get_cert = lambda x: x
 
 DEFAULT_DATE = timezone.datetime(2017, 1, 1)
 SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
 FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']
 
+# For now we are temporarily removing Dask support until we get Dask Team help us in making the
+# tests pass again
+skip_dask_tests = True
+
 
+@pytest.mark.skipif(skip_dask_tests, reason="The tests are skipped because it needs testing from Dask team")

Review comment:
       I wanted to give an easy way for Dask committers to be able to "play" with it. It's easier to flip one variable than to manually remove @pytest.mark.skip for each test case.
   
   This is really a "temporary" state that I wanted to make easy for them to work on (same as last time).







[GitHub] [airflow] mik-laj commented on a change in pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#discussion_r820297635



##########
File path: tests/executors/test_dask_executor.py
##########
@@ -20,31 +20,39 @@
 from unittest import mock
 
 import pytest
+from distributed import LocalCluster
 
 from airflow.exceptions import AirflowException
+from airflow.executors.dask_executor import DaskExecutor
 from airflow.jobs.backfill_job import BackfillJob
 from airflow.models import DagBag
 from airflow.utils import timezone
 from tests.test_utils.config import conf_vars
 
 try:
-    from distributed import LocalCluster
-
     # utility functions imported from the dask testing suite to instantiate a test
     # cluster for tls tests
+    from distributed import tests  # noqa
     from distributed.utils_test import cluster as dask_testing_cluster, get_cert, tls_security
 
-    from airflow.executors.dask_executor import DaskExecutor
-
     skip_tls_tests = False
 except ImportError:
     skip_tls_tests = True
+    # In case the tests are skipped because of lacking test harness, get_cert should be
+    # overridden to avoid get_cert failing during test discovery as get_cert is used
+    # in conf_vars decorator
+    get_cert = lambda x: x
 
 DEFAULT_DATE = timezone.datetime(2017, 1, 1)
 SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
 FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']
 
+# For now we are temporarily removing Dask support until we get Dask Team help us in making the
+# tests pass again
+skip_dask_tests = True
+
 
+@pytest.mark.skipif(skip_dask_tests, reason="The tests are skipped because it needs testing from Dask team")

Review comment:
       ```suggestion
   @pytest.mark.skip(reason="The tests are skipped because it needs testing from Dask team")
   ```
    `skipif` is used to skip tests depending on a dynamic value, e.g. the active database backend.
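    A minimal sketch of the distinction (test names are illustrative, not from the PR):
    
    ```python
    import sys
    
    import pytest
    
    
    # Unconditional skip: the test is always skipped, regardless of environment.
    @pytest.mark.skip(reason="needs help from the Dask team")
    def test_always_skipped():
        raise AssertionError("never executed")
    
    
    # Conditional skip: the condition is evaluated at collection time,
    # e.g. against the operating system or the active database backend.
    @pytest.mark.skipif(sys.platform == "win32", reason="POSIX-only behaviour")
    def test_posix_only():
        assert True
    ```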
   







[GitHub] [airflow] potiuk commented on pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#issuecomment-1060034762


   > @potiuk isn't it possible (and easier) to skip these tests "from the outside" – ?
   > 
   > I would think `skipif` is more suitable for a situation where you want to skip a test depending on for example the operating system.
   
   How would you imagine "external" exclusion :)? 
   
   I would like to avoid having anyone add extra pytest arguments or command-line switches to skip those. The idea is that whoever runs "all tests" for Airflow can see them succeed without having to specify any exclusions.
   
   In our case, if you want to make sure that all tests are succeeding you do this:
   
   * `./breeze build-image --python 3.7`
   * `./breeze --backend mysql`
   * while in "breeze" you run `pytest tests` 
   
   Those three commands should lead to successful test execution.
   
   Having to add some additional exclusion rules is bad, especially since we want Dask team members to help with fixing those. What we ask of them is to eventually remove the skipIf (initially, set `skip_dask_tests` to False and run the failed tests). They do not have to learn anything about external scripts and "exclusions" we do outside of our Python code. Their fix will be limited to the "dask" tests only - they don't have to understand what our CI does and how we exclude stuff externally.
   
   I think in this case a skipIf "close" to the tests being disabled is exactly what we need.





[GitHub] [airflow] potiuk commented on a change in pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#discussion_r820154297



##########
File path: tests/executors/test_dask_executor.py
##########
@@ -20,31 +20,39 @@
 from unittest import mock
 
 import pytest
+from distributed import LocalCluster
 
 from airflow.exceptions import AirflowException
+from airflow.executors.dask_executor import DaskExecutor
 from airflow.jobs.backfill_job import BackfillJob
 from airflow.models import DagBag
 from airflow.utils import timezone
 from tests.test_utils.config import conf_vars
 
 try:
-    from distributed import LocalCluster
-
     # utility functions imported from the dask testing suite to instantiate a test
     # cluster for tls tests
+    from distributed import tests  # noqa
     from distributed.utils_test import cluster as dask_testing_cluster, get_cert, tls_security
 
-    from airflow.executors.dask_executor import DaskExecutor
-
     skip_tls_tests = False
 except ImportError:
     skip_tls_tests = True
+    # In case the tests are skipped because of lacking test harness, get_cert should be
+    # overridden to avoid get_cert failing during test discovery as get_cert is used
+    # in conf_vars decorator
+    get_cert = lambda x: x
 
 DEFAULT_DATE = timezone.datetime(2017, 1, 1)
 SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
 FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']
 
+# For now we are temporarily removing Dask support until we get Dask Team help us in making the
+# tests pass again
+skip_dask_tests = True

Review comment:
       It is not. I started a discussion on that here: https://lists.apache.org/thread/6stgcpjt5jb3xfw92oo1j486j33c8v7m
   
   This is the second time I have started a similar discussion - the previous time was in Jan 2020 https://lists.apache.org/thread/875fpgb7vfpmtxrmt19jmo8d3p6mgqnh and the Dask team chimed in and helped fix the tests. 
   
   But more than a year later we have a similar problem.
   
   I also asked on Dask's Discourse whether they can help again: https://dask.discourse.group/t/potential-removal-of-dask-executor-support-in-airflow/433







[GitHub] [airflow] subkanthi commented on a change in pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
subkanthi commented on a change in pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#discussion_r820153996



##########
File path: tests/executors/test_dask_executor.py
##########
@@ -20,31 +20,39 @@
 from unittest import mock
 
 import pytest
+from distributed import LocalCluster
 
 from airflow.exceptions import AirflowException
+from airflow.executors.dask_executor import DaskExecutor
 from airflow.jobs.backfill_job import BackfillJob
 from airflow.models import DagBag
 from airflow.utils import timezone
 from tests.test_utils.config import conf_vars
 
 try:
-    from distributed import LocalCluster
-
     # utility functions imported from the dask testing suite to instantiate a test
     # cluster for tls tests
+    from distributed import tests  # noqa
     from distributed.utils_test import cluster as dask_testing_cluster, get_cert, tls_security
 
-    from airflow.executors.dask_executor import DaskExecutor
-
     skip_tls_tests = False
 except ImportError:
     skip_tls_tests = True
+    # In case the tests are skipped because of lacking test harness, get_cert should be
+    # overridden to avoid get_cert failing during test discovery as get_cert is used
+    # in conf_vars decorator
+    get_cert = lambda x: x
 
 DEFAULT_DATE = timezone.datetime(2017, 1, 1)
 SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
 FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']
 
+# For now we are temporarily removing Dask support until we get Dask Team help us in making the
+# tests pass again
+skip_dask_tests = True

Review comment:
       @potiuk, out of curiosity: is this strictly maintained by the Dask team, or is it something I can take a look at? I was also looking at options for connecting to a local Dask cluster along with `distributed`.







[GitHub] [airflow] malthe commented on pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
malthe commented on pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#issuecomment-1060029158


   @potiuk isn't it possible (and easier) to skip these tests "from the outside" – ?
   
   I would think `skipif` is more suitable for a situation where you want to skip a test depending on for example the operating system.






[GitHub] [airflow] potiuk commented on pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#issuecomment-1059927352


   All looks good now with those changes. I propose to merge it so that we are unblocked while we discuss the next steps. 







[GitHub] [airflow] potiuk commented on a change in pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22017:
URL: https://github.com/apache/airflow/pull/22017#discussion_r820302497



##########
File path: tests/executors/test_dask_executor.py
##########
@@ -20,31 +20,39 @@
 from unittest import mock
 
 import pytest
+from distributed import LocalCluster
 
 from airflow.exceptions import AirflowException
+from airflow.executors.dask_executor import DaskExecutor
 from airflow.jobs.backfill_job import BackfillJob
 from airflow.models import DagBag
 from airflow.utils import timezone
 from tests.test_utils.config import conf_vars
 
 try:
-    from distributed import LocalCluster
-
     # utility functions imported from the dask testing suite to instantiate a test
     # cluster for tls tests
+    from distributed import tests  # noqa
     from distributed.utils_test import cluster as dask_testing_cluster, get_cert, tls_security
 
-    from airflow.executors.dask_executor import DaskExecutor
-
     skip_tls_tests = False
 except ImportError:
     skip_tls_tests = True
+    # In case the tests are skipped because of lacking test harness, get_cert should be
+    # overridden to avoid get_cert failing during test discovery as get_cert is used
+    # in conf_vars decorator
+    get_cert = lambda x: x
 
 DEFAULT_DATE = timezone.datetime(2017, 1, 1)
 SUCCESS_COMMAND = ['airflow', 'tasks', 'run', '--help']
 FAIL_COMMAND = ['airflow', 'tasks', 'run', 'false']
 
+# For now we are temporarily removing Dask support until we get Dask Team help us in making the
+# tests pass again
+skip_dask_tests = True
+
 
+@pytest.mark.skipif(skip_dask_tests, reason="The tests are skipped because it needs testing from Dask team")

Review comment:
       I imagine the Dask developers will do it this way:
   
   1) set up Breeze
   2) get a Dask service to run tests on
   3) set `skip_dask_tests` to False
   4) fix the tests
   5) remove the skips
   
   By having a single flag that enables all the tests, it is easier not to forget to remove some of the skips. 
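   The single-flag pattern described above can be sketched as follows (class and test names are illustrative, not the actual test file):
   
   ```python
   import pytest
   
   # One switch for the whole module: flip to False to re-enable every
   # Dask test at once instead of editing each skip marker individually.
   skip_dask_tests = True
   
   
   @pytest.mark.skipif(skip_dask_tests, reason="needs fixing with help from the Dask team")
   class TestDaskExecutorSketch:
       def test_submit_command(self):
           assert True
   ```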
   
   







[GitHub] [airflow] potiuk closed pull request #22017: Removes limitations from Dask dependencies

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #22017:
URL: https://github.com/apache/airflow/pull/22017


   

