Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/29 22:53:39 UTC

[GitHub] [airflow] pateash opened a new pull request #22548: Adding ArangoDB Provider

pateash opened a new pull request #22548:
URL: https://github.com/apache/airflow/pull/22548


   closes: #17778 
   
   ---
   ## Description
   
   Adds an ArangoDB provider based on the Python SDK https://github.com/ArangoDB-Community/python-arango
   
   Users can create their own custom operators by using the **ArangoDBHook** directly,
   or by building on **AQLOperator** and providing a **result_processor** callable:
   
   ```
   operator = AQLOperator(
       task_id='aql_operator',
       sql="FOR doc IN students " \
           "RETURN doc",
       dag=dag,
       result_processor=lambda cursor: print([document["name"] for document in cursor])
   )
   ```
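
The `result_processor` in the example above is a lambda that prints document names. A named function is usually easier to unit-test. The following is a minimal sketch, assuming the cursor can be iterated and yields dict-like documents with a `name` key; the function name `collect_names` is hypothetical and not part of the provider.

```python
def collect_names(cursor):
    """Return the ``name`` field of every document the cursor yields."""
    return [document["name"] for document in cursor]


# Any iterable of dicts can stand in for a cursor when testing locally.
fake_cursor = iter([{"name": "judy"}, {"name": "john"}])
print(collect_names(fake_cursor))
```

Under those assumptions it could be passed as `result_processor=collect_names` in place of the lambda.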
   
   A sensor can be defined with an AQL query:
   
   ```
   sensor = AQLSensor(
       task_id="aql_sensor",
       sql="FOR doc IN students " \
           "FILTER doc.name == 'judy' " \
           "RETURN doc",
       timeout=60,
       poke_interval=10,
       dag=dag,
   )
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] pateash commented on pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1081881802


   ![image](https://user-images.githubusercontent.com/16856802/160623668-63ef1a09-f9a2-4cb6-8099-2425288294eb.png)
   





[GitHub] [airflow] eladkal merged pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal merged pull request #22548:
URL: https://github.com/apache/airflow/pull/22548


   





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1086868648


   :tada: :tada: :tada: :tada: :tada: :tada: :tada: 





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083119045


   Let me rebase and see it happening again :)





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085168697


   > I am not sure, Is it possible to do something with MANIFEST.in in CI?
   
   I do not even want to try to do it or work around it. There might be many more problems because of that, and I do not want Airflow to work around such deeply broken behaviour. @uranusjr might confirm that this is entirely wrong. ArangoDB has terribly broken dependencies.
   





[GitHub] [airflow] pateash commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1084766371


   > Rebased it @pateash -> I have high hopes for #22649 to either fix it or make it easier to understand where it came from
   
   Sure, let me rebase 🤞





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1084768378


   I did rebase before and it did not work. So the quest continues :)
   





[GitHub] [airflow] pateash commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085150020


   Thanks @potiuk @mik-laj, this is CRAZY stuff.
   
   I have raised an issue https://github.com/ArangoDB-Community/python-arango/issues/194 and will follow up.
   
   





[GitHub] [airflow] eladkal commented on a change in pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r837805960



##########
File path: airflow/providers/arangodb/hooks/arangodb.py
##########
@@ -0,0 +1,107 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module allows connecting to a ArangoDB."""
+from typing import Dict, Optional, Any
+
+from arango import ArangoClient as ArangoDBClient, AQLQueryExecuteError
+from arango.result import Result
+
+from airflow import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+class ArangoDBHook(BaseHook):

Review comment:
       Can you add test coverage for them?







[GitHub] [airflow] pateash commented on a change in pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
pateash commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r837484408



##########
File path: airflow/providers/arangodb/sensors/arangodb.py
##########
@@ -0,0 +1,55 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import TYPE_CHECKING, Sequence
+
+from airflow.providers.arangodb.hooks.arangodb import ArangoDBHook
+from airflow.sensors.base import BaseSensorOperator
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class AQLSensor(BaseSensorOperator):
+    """
+    Checks for the existence of a document which
+    matches the given query in ArangoDB. Example:
+
+    :param collection: Target DB collection.
+    :param query: The query to find the target document.
+    :param arangodb_conn_id: The :ref:`ArangoDB connection id <howto/connection:arangodb>` to use
+        when connecting to ArangoDB.
+    :param arangodb_db: Target ArangoDB name.
+    """
+
+    template_fields: Sequence[str] = ('sql',)

Review comment:
       
   added







[GitHub] [airflow] potiuk edited a comment on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085016498


   TL;DR: I know the problem. The `python-arango` package's dependencies are badly wrong and break the Airflow package installation. We cannot accept `python-arango` until they fix it.
   
   # More details
   
   I was not far off when I guessed that `setuptools` is the problem.
   
   The problem is that the `python-arango` package depends on this:
   
   ```
   setuptools-scm[toml] (>=3.4)
   ```
   
   This is a package that automatically adds all files under version control to the package that is built. This basically means that we have no way to control which of the source code files are used in which package. It's EXTREMELY disruptive to package preparation. An issue about that should be opened with the ArangoDB developers, and they should remove this dependency and fix the way they build their package.
   
   The effect of this change was that when we build the airflow package, all providers are added to the package even if we did not include them. It's VERY WRONG.
   
   There is no way we can approve it until the ArangoDB developers fix it. Is it possible, @eladkal @pateash, that you open an issue and get it solved?
   
   This is rather easy to reproduce:
   
   1. Check out `main` of airflow
   2. Run `rm -rf dist/*`
   3. Run `./breeze build-image`
   4. Run `./breeze prepare-airflow-packages`
   5. In the `dist` folder you will find a `.whl` package (it's really a zip file) which does not contain any of the providers (you can check it with `unzip -t ./dist/apache*`)
   
   6. Run `rm -rf dist/*`
   7. Modify setup.cfg by adding `setuptools-scm[toml] (>=3.4)` to install_requires
   8. Run `./breeze build-image`
   9. Run `./breeze prepare-airflow-packages`
   10. In the `dist` folder you will find a `.whl` package (it's really a zip file) which contains all providers
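
The manual check in steps 5 and 10 can also be scripted, since a `.whl` file is a zip archive. The following is a minimal sketch using only the standard library. It assumes provider code sits under `airflow/providers/` inside the archive; the function name `provider_files_in_wheel` is hypothetical and not part of breeze.

```python
import zipfile


def provider_files_in_wheel(wheel_path):
    """Return the archive members located under airflow/providers/."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            name
            for name in wheel.namelist()
            if name.startswith("airflow/providers/")
        ]
```

Calling it on the wheel produced in each half of the reproduction and comparing the two lists should show whether provider files were pulled into the package.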
   
   `setuptools-scm[toml] (>=3.4)` is a dependency of `python-arango`:
   
   ```
   ⏚ [jarek:~/code/airflow] main ± http https://pypi.org/pypi/python-arango/json | jq ".info.requires_dist"
   [
     "urllib3 (>=1.26.0)",
     "requests",
     "requests-toolbelt",
     "PyJWT",
     "setuptools (>=42)",
     "setuptools-scm[toml] (>=3.4)",
     "dataclasses (>=0.6) ; python_version < \"3.7\"",
     "black ; extra == 'dev'",
     "flake8 (>=3.8.4) ; extra == 'dev'",
     "isort (>=5.0.0) ; extra == 'dev'",
     "mypy (>=0.790) ; extra == 'dev'",
     "mock ; extra == 'dev'",
     "pre-commit (>=2.9.3) ; extra == 'dev'",
     "pytest (>=6.0.0) ; extra == 'dev'",
     "pytest-cov (>=2.0.0) ; extra == 'dev'",
     "sphinx ; extra == 'dev'",
     "sphinx-rtd-theme ; extra == 'dev'",
     "types-pkg-resources ; extra == 'dev'",
     "types-requests ; extra == 'dev'"
   ]
   ```
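
A list like the `requires_dist` output above mixes runtime requirements with ones that only apply to an extra such as `dev`. The following is a rough sketch for separating them with plain string handling. It assumes each entry has the form shown above (name, optional extras and version, optional `;` marker); the function name `runtime_requirement_names` is hypothetical.

```python
import re


def runtime_requirement_names(requires_dist):
    """Return names of requirements that are not gated behind an extra."""
    names = []
    for requirement in requires_dist:
        spec, _, marker = requirement.partition(";")
        if "extra ==" in marker:
            continue  # only installed when an extra such as 'dev' is requested
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec.strip())
        if match:
            names.append(match.group(0).lower())
    return names


# A few entries copied from the output above.
requires_dist = [
    "requests",
    "setuptools (>=42)",
    "setuptools-scm[toml] (>=3.4)",
    'dataclasses (>=0.6) ; python_version < "3.7"',
    "black ; extra == 'dev'",
]
print("setuptools-scm" in runtime_requirement_names(requires_dist))
```

Entries with an `extra ==` marker are skipped because they are installed only when that extra is requested, so they do not affect a plain install.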
   
   
   





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083240173


   Still puzzled :) but I am getting closer to solving it





[GitHub] [airflow] pateash commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083251267


   > 
   
   Thanks @potiuk.





[GitHub] [airflow] pateash commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085156002


   @potiuk 
   
   > Additionally setuptools_scm provides setuptools with a list of files that are managed by the SCM (i.e. it automatically adds all of the SCM-managed files to the sdist). Unwanted files must be excluded by discarding them via MANIFEST.in.
   
   
   I am not sure. Is it possible to do something with MANIFEST.in in CI?





[GitHub] [airflow] pateash closed pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash closed pull request #22548:
URL: https://github.com/apache/airflow/pull/22548


   





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083175445


   OK. I know what causes it, but I do not know yet why it happens. When the PROD image is built, we prepare the "airflow" package so that it can be installed there from the latest sources. But for SOME reason it contains "all" providers as well, not only airflow. I do not know where that comes from yet. But it proves the tests from @mik-laj are useful for catching it.





[GitHub] [airflow] pateash commented on a change in pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r838003717



##########
File path: airflow/providers/arangodb/hooks/arangodb.py
##########
@@ -0,0 +1,107 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module allows connecting to a ArangoDB."""
+from typing import Dict, Optional, Any
+
+from arango import ArangoClient as ArangoDBClient, AQLQueryExecuteError
+from arango.result import Result
+
+from airflow import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+class ArangoDBHook(BaseHook):

Review comment:
       Actually, we don't have any logic there; it's just a wrapper.
   Adding basic coverage anyway.







[GitHub] [airflow] eladkal commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085788407


   I'll take a look later today





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1084769481


   Funny thing - this PR is the only one with the problem. And I already know WHAT happens (when preparing the airflow package, it includes all the providers where it should not), but I have no idea WHY it happens.





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085028396


   @mik-laj KUDOS, KUDOS, KUDOS for the test you did... This is a FANTASTIC test that saved us from a HUGE PROBLEM. If not for your test, we would not have discovered the problem until it was too late!





[GitHub] [airflow] joowani commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
joowani commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085367182


   Hi, maintainer of python-arango here. I've removed the dependency. Please try again with release version [7.3.2](https://github.com/ArangoDB-Community/python-arango/releases/tag/7.3.2). Thanks.





[GitHub] [airflow] eladkal commented on a change in pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r835898414



##########
File path: airflow/providers/arangodb/sensors/arangodb.py
##########
@@ -0,0 +1,55 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import TYPE_CHECKING, Sequence
+
+from airflow.providers.arangodb.hooks.arangodb import ArangoDBHook
+from airflow.sensors.base import BaseSensorOperator
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class AQLSensor(BaseSensorOperator):
+    """
+    Checks for the existence of a document which
+    matches the given query in ArangoDB. Example:
+
+    :param collection: Target DB collection.
+    :param query: The query to find the target document.
+    :param arangodb_conn_id: The :ref:`ArangoDB connection id <howto/connection:arangodb>` to use
+        when connecting to ArangoDB.
+    :param arangodb_db: Target ArangoDB name.
+    """
+
+    template_fields: Sequence[str] = ('sql',)

Review comment:
       Should we also add `template_ext`?

##########
File path: airflow/providers/arangodb/example_dags/example_arangodb.py
##########
@@ -0,0 +1,59 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import logging
+from datetime import datetime
+from typing import Any, Optional
+
+from github import GithubException
+
+from airflow import AirflowException

Review comment:
       Please check the imports. You have 3 unused imports

##########
File path: airflow/providers/arangodb/operators/arangodb.py
##########
@@ -0,0 +1,61 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import TYPE_CHECKING, Optional, Sequence, Callable
+
+from airflow.models import BaseOperator
+from airflow.providers.arangodb.hooks.arangodb import ArangoDBHook
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class AQLOperator(BaseOperator):
+    """
+    Executes AQL query in a ArangoDB database
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:AQLOperator`
+
+    :param sql: the AQL query to be executed. Can receive a str representing a
+        sql statement
+    :param result_processor: function to further process the Result from ArangoDB
+    :param arangodb_conn_id: Reference to :ref:`ArangoDB connection id <howto/connection:arangodb>`.
+    """
+
+    template_fields: Sequence[str] = ('sql',)

Review comment:
       Should we also add `template_ext`?

##########
File path: airflow/providers/arangodb/hooks/arangodb.py
##########
@@ -0,0 +1,107 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module allows connecting to a ArangoDB."""
+from typing import Dict, Optional, Any
+
+from arango import ArangoClient as ArangoDBClient, AQLQueryExecuteError
+from arango.result import Result
+
+from airflow import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+class ArangoDBHook(BaseHook):

Review comment:
       Can we also expose `create_database`, `create_collection`, `create_graph` from the python lib?
   These are all very simple 1-2 line functions, I believe.
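
One possible shape for such a thin wrapper, sketched as a standalone helper rather than as the hook code from this PR. It assumes `db` is a python-arango database handle, which provides `has_collection` and `create_collection`; the helper name `ensure_collection` is hypothetical.

```python
def ensure_collection(db, name):
    """Create collection ``name`` if it does not exist yet.

    Returns True when the collection was created, False when it already existed.
    """
    if db.has_collection(name):
        return False
    db.create_collection(name)
    return True
```

Taking the database handle as an argument keeps the helper testable with a small stand-in object, without a running ArangoDB server.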







[GitHub] [airflow] pateash commented on a change in pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
pateash commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r837506123



##########
File path: airflow/providers/arangodb/hooks/arangodb.py
##########
@@ -0,0 +1,107 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module allows connecting to a ArangoDB."""
+from typing import Dict, Optional, Any
+
+from arango import ArangoClient as ArangoDBClient, AQLQueryExecuteError
+from arango.result import Result
+
+from airflow import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+class ArangoDBHook(BaseHook):

Review comment:
       sure, created.







[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083110731


   Yeah: seems that for some reason it contains all providers:
   
   ```
   docker run -it ghcr.io/apache/airflow/main/prod/python3.7:23b7d64b40261dcdcf73187464c6f09b67afcc57  bash
   Unable to find image 'ghcr.io/apache/airflow/main/prod/python3.7:23b7d64b40261dcdcf73187464c6f09b67afcc57' locally
   23b7d64b40261dcdcf73187464c6f09b67afcc57: Pulling from apache/airflow/main/prod/python3.7
   c229119241af: Pull complete 
   5a3ae98ea812: Pull complete 
   d6bab1fc351b: Pull complete 
   f9cea33fb9b5: Pull complete 
   23c22d6e5b5d: Pull complete 
   b21b38d9bc75: Pull complete 
   e52ad88eda59: Pull complete 
   5938673019d8: Pull complete 
   10aec20ab867: Pull complete 
   bfa0b2f2703d: Pull complete 
   abea59e2f689: Pull complete 
   ffd9264d5a4a: Pull complete 
   ea7c97498e3e: Pull complete 
   4aed0971f3f7: Pull complete 
   8f85ceb1d546: Pull complete 
   b6132f0f6227: Pull complete 
   83d18601cc4f: Pull complete 
   88748a7a2d95: Pull complete 
   4f4fb700ef54: Pull complete 
   Digest: sha256:bf5da3a686feab47684c036de99a492c3d024fabd4e7a3b69ea9d63ce941b8c8
   Status: Downloaded newer image for ghcr.io/apache/airflow/main/prod/python3.7:23b7d64b40261dcdcf73187464c6f09b67afcc57
   
   airflow@54c94bf4e3b9:/opt/airflow$ airflow providers list
   package_name                              | description                                                                                     | version
   ==========================================+=================================================================================================+========
   apache-airflow-providers-airbyte          | Airbyte https://airbyte.io/                                                                     | 2.1.4  
   apache-airflow-providers-alibaba          | Alibaba Cloud integration (including Alibaba Cloud https://www.alibabacloud.com//)              | 1.1.1  
   apache-airflow-providers-amazon           | Amazon integration (including Amazon Web Services (AWS) https://aws.amazon.com/)                | 3.2.0  
   apache-airflow-providers-apache-beam      | Apache Beam https://beam.apache.org/                                                            | 3.3.0  
   apache-airflow-providers-apache-cassandra | Apache Cassandra http://cassandra.apache.org/                                                   | 2.1.3  
   apache-airflow-providers-apache-drill     | Apache Drill https://drill.apache.org/                                                          | 1.0.4  
   apache-airflow-providers-apache-druid     | Apache Druid https://druid.apache.org/                                                          | 2.3.3  
   apache-airflow-providers-apache-hdfs      | Hadoop Distributed File System (HDFS) https://hadoop.apache.org/docs/r1.2.1/hdfsdesign.html     | 2.2.3  
                                             | and WebHDFS https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html |        
   apache-airflow-providers-apache-hive      | Apache Hive https://hive.apache.org/                                                            | 2.3.2  
   apache-airflow-providers-apache-kylin     | Apache Kylin https://kylin.apache.org/                                                          | 2.0.4  
   apache-airflow-providers-apache-livy      | Apache Livy https://livy.apache.org/                                                            | 2.2.2  
   apache-airflow-providers-apache-pig       | Apache Pig https://pig.apache.org/                                                              | 2.0.4  
   apache-airflow-providers-apache-pinot     | Apache Pinot https://pinot.apache.org/                                                          | 2.0.4  
   apache-airflow-providers-apache-spark     | Apache Spark https://spark.apache.org/                                                          | 2.1.3  
   apache-airflow-providers-apache-sqoop     | Apache Sqoop https://sqoop.apache.org/                                                          | 2.1.3  
   apache-airflow-providers-arangodb         | ArangoDB https://www.arangodb.com/                                                              | 1.0.0  
   apache-airflow-providers-asana            | Asana https://app.asana.com/                                                                    | 1.1.3  
   apache-airflow-providers-celery           | Celery http://www.celeryproject.org/                                                            | 2.1.3  
   apache-airflow-providers-cloudant         | IBM Cloudant https://www.ibm.com/cloud/cloudant                                                 | 2.0.4  
   apache-airflow-providers-cncf-kubernetes  | Kubernetes https://kubernetes.io/                                                               | 3.1.2  
   apache-airflow-providers-databricks       | Databricks https://databricks.com/                                                              | 2.5.0  
   apache-airflow-providers-datadog          | Datadog https://www.datadoghq.com/                                                              | 2.0.4  
   apache-airflow-providers-dbt-cloud        | dbt Cloud https://www.getdbt.com/product/what-is-dbt/)                                          | 1.0.2  
   apache-airflow-providers-dingding         | Dingding https://oapi.dingtalk.com/                                                             | 2.0.4  
   apache-airflow-providers-discord          | Discord https://discordapp.com/                                                                 | 2.0.4  
   apache-airflow-providers-docker           | Docker https://docs.docker.com/install/                                                         | 2.5.2  
   apache-airflow-providers-elasticsearch    | Elasticsearch https://www.elastic.co/elasticsearch                                              | 3.0.2  
   apache-airflow-providers-exasol           | Exasol https://docs.exasol.com/home.htm                                                         | 2.1.3  
   apache-airflow-providers-facebook         | Facebook Ads http://business.facebook.com/                                                      | 2.2.3  
   apache-airflow-providers-ftp              | File Transfer Protocol (FTP) https://tools.ietf.org/html/rfc114                                 | 2.1.2  
   apache-airflow-providers-github           | Github https://www.github.com/                                                                  | 1.0.3  
   apache-airflow-providers-google           | Google services including:                                                                      | 6.7.0  
                                             |                                                                                                 |        
                                             |   - Google Ads https://ads.google.com/                                                          |        
                                             |   - Google Cloud (GCP) https://cloud.google.com/                                                |        
                                             |   - Google Firebase https://firebase.google.com/                                                |        
                                             |   - Google LevelDB https://github.com/google/leveldb/                                           |        
                                             |   - Google Marketing Platform https://marketingplatform.google.com/                             |        
                                             |   - Google Workspace https://workspace.google.pl/ (formerly Google Suite)                       |        
   apache-airflow-providers-grpc             | gRPC https://grpc.io/                                                                           | 2.0.4  
   apache-airflow-providers-hashicorp        | Hashicorp including Hashicorp Vault https://www.vaultproject.io/                                | 2.1.4  
   apache-airflow-providers-http             | Hypertext Transfer Protocol (HTTP) https://www.w3.org/Protocols/                                | 2.1.2  
   apache-airflow-providers-imap             | Internet Message Access Protocol (IMAP) https://tools.ietf.org/html/rfc3501                     | 2.2.3  
   apache-airflow-providers-influxdb         | InfluxDB https://www.influxdata.com/                                                            | 1.1.3  
   apache-airflow-providers-jdbc             | Java Database Connectivity (JDBC) https://docs.oracle.com/javase/8/docs/technotes/guides/jdbc/  | 2.1.3  
   apache-airflow-providers-jenkins          | Jenkins https://jenkins.io/                                                                     | 2.0.7  
   apache-airflow-providers-jira             | Atlassian Jira https://www.atlassian.com/                                                       | 2.0.4  
   apache-airflow-providers-microsoft-azure  | Microsoft Azure https://azure.microsoft.com/                                                    | 3.7.2  
   apache-airflow-providers-microsoft-mssql  | Microsoft SQL Server (MSSQL) https://www.microsoft.com/en-us/sql-server/sql-server-downloads    | 2.1.3  
   apache-airflow-providers-microsoft-psrp   | This package provides remote execution capabilities via the                                     | 1.1.3  
                                             | PowerShell Remoting Protocol (PSRP)                                                             |        
                                             | https://docs.microsoft.com/en-us/openspecs/windowsprotocols/ms-psrp/                            |        
   apache-airflow-providers-microsoft-winrm  | Windows Remote Management (WinRM) https://docs.microsoft.com/en-us/windows/win32/winrm/portal   | 2.0.5  
   apache-airflow-providers-mongo            | MongoDB https://www.mongodb.com/what-is-mongodb                                                 | 2.3.3  
   apache-airflow-providers-mysql            | MySQL https://www.mysql.com/products/                                                           | 2.2.3  
   apache-airflow-providers-neo4j            | Neo4j https://neo4j.com/                                                                        | 2.1.3  
   apache-airflow-providers-odbc             | ODBC https://github.com/mkleehammer/pyodbc/wiki                                                 | 2.0.4  
   apache-airflow-providers-openfaas         | OpenFaaS https://www.openfaas.com/                                                              | 2.0.3  
   apache-airflow-providers-opsgenie         | Opsgenie https://www.opsgenie.com/                                                              | 3.0.3  
   apache-airflow-providers-oracle           | Oracle https://www.oracle.com/en/database/                                                      | 2.2.3  
   apache-airflow-providers-pagerduty        | Pagerduty https://www.pagerduty.com/                                                            | 2.1.3  
   apache-airflow-providers-papermill        | Papermill https://github.com/nteract/papermill                                                  | 2.2.3  
   apache-airflow-providers-plexus           | Plexus https://plexus.corescientific.com/                                                       | 2.0.4  
   apache-airflow-providers-postgres         | PostgreSQL https://www.postgresql.org/                                                          | 4.1.0  
   apache-airflow-providers-presto           | Presto https://prestodb.github.io/                                                              | 2.1.2  
   apache-airflow-providers-qubole           | Qubole https://www.qubole.com/                                                                  | 2.1.3  
   apache-airflow-providers-redis            | Redis https://redis.io/                                                                         | 2.0.4  
   apache-airflow-providers-salesforce       | Salesforce https://www.salesforce.com/                                                          | 3.4.3  
   apache-airflow-providers-samba            | Samba https://www.samba.org/                                                                    | 3.0.4  
   apache-airflow-providers-segment          | Segment https://segment.com/                                                                    | 2.0.4  
   apache-airflow-providers-sendgrid         | Sendgrid https://sendgrid.com/                                                                  | 2.0.4  
   apache-airflow-providers-sftp             | SSH File Transfer Protocol (SFTP) https://tools.ietf.org/wg/secsh/draft-ietf-secsh-filexfer/    | 2.5.2  
   apache-airflow-providers-singularity      | Singularity https://sylabs.io/guides/latest/user-guide/                                         | 2.0.4  
   apache-airflow-providers-slack            | Slack https://slack.com/                                                                        | 4.2.3  
   apache-airflow-providers-snowflake        | Snowflake https://www.snowflake.com/                                                            | 2.6.0  
   apache-airflow-providers-sqlite           | SQLite https://www.sqlite.org/                                                                  | 2.1.3  
   apache-airflow-providers-ssh              | Secure Shell (SSH) https://tools.ietf.org/html/rfc4251                                          | 2.4.3  
   apache-airflow-providers-tableau          | Tableau https://www.tableau.com/                                                                | 2.1.7  
   apache-airflow-providers-telegram         | Telegram https://telegram.org/                                                                  | 2.0.4  
   apache-airflow-providers-trino            | Trino https://trino.io/                                                                         | 2.1.2  
   apache-airflow-providers-vertica          | Vertica https://www.vertica.com/                                                                | 2.1.3  
   apache-airflow-providers-yandex           | Yandex including Yandex.Cloud https://cloud.yandex.com/                                         | 2.2.3  
   apache-airflow-providers-zendesk          | Zendesk https://www.zendesk.com/                                                                | 3.0.3  
   ```





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083176664


   I actually think it could come from the new setuptools release https://pypi.org/project/setuptools/61.2.0/





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1084621580


   Rebased it @pateash -> I have high hopes for  #22649 to either fix it or make it easier to understand where it came from





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085016498


   TL;DR; I know the problem. The problem is that `python-arango` package dependencies are extremely wrong and break airflow package installation. We cannot accept python-arango until they fix it.
   
   I was not far off when I guessed that `setuptools` was the problem.
   
   The problem is that the `python-arango` package depends on this:
   
   ```
   setuptools-scm[toml] (>=3.4)
   ```
   
   This package automatically adds every file tracked in version control to the package being built. This basically means that we have no way to control which source files end up in which package. It's EXTREMELY disruptive to package preparation. An issue about that should be opened with the ArangoDB developers; they should remove this dependency and fix the way they build their package.
   
   The effect of this change was that when we build the airflow package, all providers are added to it even if we did not include them. It's VERY WRONG.
   
   There is no way we can approve it until the ArangoDB developers fix it. @eladkal @pateash, could you open an issue and get it solved?
   
   This is rather easy to reproduce:
   
   1. Check out `main` of airflow
   2. Run `rm -rf dist/*`
   3. Run `./breeze build-image`
   4. Run `./breeze prepare-airflow-packages`
   5. In the `dist` folder you will find a `.whl` package (it's really a zip file) which does not contain any of the providers (you can check it with `unzip -t ./dist/apache*`)
   
   6. Run `rm -rf dist/*`
   7. Modify `setup.cfg` by adding `setuptools-scm[toml] (>=3.4)` to `install_requires`
   8. Run `./breeze build-image`
   9. Run `./breeze prepare-airflow-packages`
   10. In the `dist` folder you will find a `.whl` package (it's really a zip file) which contains all providers
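
The wheel check in steps 5 and 10 can also be scripted. This is only a minimal sketch: the function name is hypothetical, and it assumes provider code sits under `airflow/providers/` inside the wheel, as it does in the source tree.

```python
import zipfile
from typing import List


def bundled_provider_files(wheel_path: str) -> List[str]:
    """List wheel members that sit under airflow/providers/.

    Assumption: providers keep their source-tree location inside the wheel.
    An empty result means the wheel does not bundle provider code.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            name
            for name in wheel.namelist()
            if name.startswith("airflow/providers/")
        ]
```

Calling `bundled_provider_files(...)` on the `.whl` produced in step 5 and again on the one from step 10 should show the difference described above.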
   
   The `setuptools-scm[toml] (>=3.4)` is a dependency of `python-arango`
   
   ```
   ⏚ [jarek:~/code/airflow] main ± http https://pypi.org/pypi/python-arango/json | jq ".info.requires_dist"
   [
     "urllib3 (>=1.26.0)",
     "requests",
     "requests-toolbelt",
     "PyJWT",
     "setuptools (>=42)",
     "setuptools-scm[toml] (>=3.4)",
     "dataclasses (>=0.6) ; python_version < \"3.7\"",
     "black ; extra == 'dev'",
     "flake8 (>=3.8.4) ; extra == 'dev'",
     "isort (>=5.0.0) ; extra == 'dev'",
     "mypy (>=0.790) ; extra == 'dev'",
     "mock ; extra == 'dev'",
     "pre-commit (>=2.9.3) ; extra == 'dev'",
     "pytest (>=6.0.0) ; extra == 'dev'",
     "pytest-cov (>=2.0.0) ; extra == 'dev'",
     "sphinx ; extra == 'dev'",
     "sphinx-rtd-theme ; extra == 'dev'",
     "types-pkg-resources ; extra == 'dev'",
     "types-requests ; extra == 'dev'"
   ]
   ```
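
A rough way to spot this kind of dependency from the `requires_dist` list is to filter out everything gated behind an extra. The sketch below is an illustration only: the helper name is made up, and the sample list is a shortened copy of the output above.

```python
import re
from typing import Iterable, List


def runtime_requirements(requires_dist: Iterable[str]) -> List[str]:
    """Return names of requirements that are not gated behind an extra."""
    names = []
    for spec in requires_dist:
        requirement, _, marker = spec.partition(";")
        if "extra ==" in marker:
            continue  # only installed when the extra is requested
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
        if match:
            names.append(match.group(0).lower().replace("_", "-"))
    return names


# Shortened copy of the requires_dist output shown above.
sample = [
    "requests",
    "setuptools (>=42)",
    "setuptools-scm[toml] (>=3.4)",
    "black ; extra == 'dev'",
]
print("setuptools-scm" in runtime_requirements(sample))
```

Anything this returns is pulled in by a plain `pip install python-arango`, which is why `setuptools-scm` ends up in the build environment.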
   
   
   





[GitHub] [airflow] eladkal commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083150950


   I don't recall us having such an issue when the GitHub provider was added (and that was after https://github.com/apache/airflow/commit/621d17bb77e3160c1a927803e5d190c0e2aade3c )





[GitHub] [airflow] pateash closed pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash closed pull request #22548:
URL: https://github.com/apache/airflow/pull/22548


   





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083118479


   This is VERY strange as it seems that when the image was built, it actually used only a small subset (as expected):
   
   ```
   #64 1.486 Force re-installing airflow and providers from local files with eager upgrade
   #64 1.486 
   #64 2.925 Looking in links: file:///docker-context-files
   #64 2.937 Processing /docker-context-files/apache_airflow_providers_amazon-3.2.0.dev0-py3-none-any.whl
   #64 2.952 Processing /docker-context-files/apache_airflow_providers_celery-2.1.3.dev0-py3-none-any.whl
   #64 2.959 Processing /docker-context-files/apache_airflow_providers_cncf_kubernetes-3.1.2.dev0-py3-none-any.whl
   #64 2.966 Processing /docker-context-files/apache_airflow_providers_docker-2.5.2.dev0-py3-none-any.whl
   #64 2.973 Processing /docker-context-files/apache_airflow_providers_elasticsearch-3.0.2.dev0-py3-none-any.whl
   #64 2.980 Processing /docker-context-files/apache_airflow_providers_ftp-2.1.2.dev0-py3-none-any.whl
   #64 2.988 Processing /docker-context-files/apache_airflow_providers_google-6.7.0.dev0-py3-none-any.whl
   #64 2.997 Processing /docker-context-files/apache_airflow_providers_grpc-2.0.4.dev0-py3-none-any.whl
   #64 3.004 Processing /docker-context-files/apache_airflow_providers_hashicorp-2.1.4.dev0-py3-none-any.whl
   #64 3.011 Processing /docker-context-files/apache_airflow_providers_http-2.1.2.dev0-py3-none-any.whl
   #64 3.018 Processing /docker-context-files/apache_airflow_providers_imap-2.2.3.dev0-py3-none-any.whl
   #64 3.026 Processing /docker-context-files/apache_airflow_providers_microsoft_azure-3.7.2.dev0-py3-none-any.whl
   #64 3.033 Processing /docker-context-files/apache_airflow_providers_mysql-2.2.3.dev0-py3-none-any.whl
   #64 3.040 Processing /docker-context-files/apache_airflow_providers_odbc-2.0.4.dev0-py3-none-any.whl
   #64 3.047 Processing /docker-context-files/apache_airflow_providers_postgres-4.1.0.dev0-py3-none-any.whl
   #64 3.054 Processing /docker-context-files/apache_airflow_providers_redis-2.0.4.dev0-py3-none-any.whl
   #64 3.062 Processing /docker-context-files/apache_airflow_providers_sendgrid-2.0.4.dev0-py3-none-any.whl
   #64 3.069 Processing /docker-context-files/apache_airflow_providers_sftp-2.5.2.dev0-py3-none-any.whl
   #64 3.076 Processing /docker-context-files/apache_airflow_providers_slack-4.2.3.dev0-py3-none-any.whl
   #64 3.083 Processing /docker-context-files/apache_airflow_providers_sqlite-2.1.3.dev0-py3-none-any.whl
   #64 3.090 Processing /docker-context-files/apache_airflow_providers_ssh-2.4.3.dev0-py3-none-any.whl
   #64 3.223 Processing /docker-context-files/apache_airflow-2.3.0.dev0-py3-none-any.whl
   ```
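
To compare this log against what the image reports later, the provider names can be pulled out of the `Processing ...whl` lines. This is a sketch under assumptions: the regular expression is fitted to the log format shown above, and the helper name is hypothetical.

```python
import re
from typing import List

# Matches pip's "Processing <path>/<dist>-<version>-...whl" lines for provider wheels.
WHEEL_LINE = re.compile(
    r"Processing \S*/(apache_airflow_providers_[a-z0-9_]+)-[^/\s]+\.whl"
)


def providers_from_pip_log(log: str) -> List[str]:
    """Return the sorted provider package names that pip processed."""
    return sorted(
        {match.group(1).replace("_", "-") for match in WHEEL_LINE.finditer(log)}
    )
```

The core `apache_airflow-...whl` line is skipped on purpose, since only provider wheels are of interest here.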





[GitHub] [airflow] potiuk edited a comment on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085016498


   TL;DR; I know the problem. The problem is that `python-arango` package dependencies are extremely wrong and break airflow package installation. We cannot accept python-arango until they fix it.
   
   I was not far off when I guessed that `setuptools` was the problem.
   
   The problem is that the `python-arango` package depends on this:
   
   ```
   setuptools-scm[toml] (>=3.4)
   ```
   
   This package automatically adds every file tracked in version control to the package being built. This basically means that we have no way to control which source files end up in which package. It's EXTREMELY disruptive to package preparation. An issue about that should be opened with the ArangoDB developers; they should remove this dependency and fix the way they build their package.
   
   The effect of this change was that when we build the airflow package, all providers are added to it even if we did not include them. It's VERY WRONG.
   
   There is no way we can approve it until the ArangoDB developers fix it. @eladkal @pateash, could you open an issue and get it solved?
   
   This is rather easy to reproduce:
   
   1. Check out `main` of airflow
   2. Run `rm -rf dist/*`
   3. Run `./breeze build-image`
   4. Run `./breeze prepare-airflow-packages`
   5. In the `dist` folder you will find a `.whl` package (it's really a zip file) which does not contain any of the providers (you can check it with `unzip -t ./dist/apache*`)
   
   6. Run `rm -rf dist/*`
   7. Modify `setup.cfg` by adding `setuptools-scm[toml] (>=3.4)` to `install_requires`
   8. Run `./breeze build-image`
   9. Run `./breeze prepare-airflow-packages`
   10. In the `dist` folder you will find a `.whl` package (it's really a zip file) which contains all providers
   
   The `setuptools-scm[toml] (>=3.4)` is a dependency of `python-arango`
   
   ```
   ⏚ [jarek:~/code/airflow] main ± http https://pypi.org/pypi/python-arango/json | jq ".info.requires_dist"
   [
     "urllib3 (>=1.26.0)",
     "requests",
     "requests-toolbelt",
     "PyJWT",
     "setuptools (>=42)",
     "setuptools-scm[toml] (>=3.4)",
     "dataclasses (>=0.6) ; python_version < \"3.7\"",
     "black ; extra == 'dev'",
     "flake8 (>=3.8.4) ; extra == 'dev'",
     "isort (>=5.0.0) ; extra == 'dev'",
     "mypy (>=0.790) ; extra == 'dev'",
     "mock ; extra == 'dev'",
     "pre-commit (>=2.9.3) ; extra == 'dev'",
     "pytest (>=6.0.0) ; extra == 'dev'",
     "pytest-cov (>=2.0.0) ; extra == 'dev'",
     "sphinx ; extra == 'dev'",
     "sphinx-rtd-theme ; extra == 'dev'",
     "types-pkg-resources ; extra == 'dev'",
     "types-requests ; extra == 'dev'"
   ]
   ```
   
   
   





[GitHub] [airflow] github-actions[bot] commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085778712


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085613449


   > Hi maintainer of python-arango here. I've removed the dependency. Please try again with release version [7.3.2](https://github.com/ArangoDB-Community/python-arango/releases/tag/7.3.2). Thanks.
   
   Cool. Thanks! @pateash  -> can you add >=7.3.2  to our requirements please ? 
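
For illustration, the requested lower bound could look like the sketch below. The requirement string is an assumed spelling of the pin, not the actual line in Airflow's setup files, and the comparison helper only handles plain dotted numeric versions such as `7.3.2`.

```python
from typing import Tuple

# Hypothetical spelling of the pin requested above.
ARANGO_REQUIREMENT = "python-arango>=7.3.2"


def release_tuple(version: str) -> Tuple[int, ...]:
    """Turn '7.3.2' into (7, 3, 2); pre-release or dev suffixes are not supported."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(version: str, minimum: str = "7.3.2") -> bool:
    """True when `version` is at least `minimum`, compared part by part."""
    return release_tuple(version) >= release_tuple(minimum)
```

Comparing integer tuples rather than strings avoids the trap where `"7.10.0"` would sort before `"7.3.2"`.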





[GitHub] [airflow] potiuk commented on pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1080014103


   Can you fix static checks please.





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083102272


   > @potiuk can you take a look at the test failure? `AssertionError: List of expected installed packages and image content mismatch. Check /home/runner/work/airflow/airflow/scripts/ci/installed_providers.txt file.`
   > 
   > I don't recall that when adding a new provider we need to edit the CI script
   
   Not everything in providers has to be me :) - this test was actually added by @mik-laj: https://github.com/apache/airflow/commit/621d17bb77e3160c1a927803e5d190c0e2aade3c
   
   It looks like, for some reason, the production image produced in this build contains many more providers than it should.
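
The kind of comparison behind that assertion can be sketched as a set difference. This is not the actual CI test: the helper name is hypothetical, and it assumes both inputs hold one provider package name per line.

```python
from typing import Iterable, Set, Tuple


def provider_mismatch(
    expected: Iterable[str], installed: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (unexpected, missing) provider names.

    Assumption: each input yields one package name per line; blank lines are ignored.
    """
    expected_set = {line.strip() for line in expected if line.strip()}
    installed_set = {line.strip() for line in installed if line.strip()}
    return installed_set - expected_set, expected_set - installed_set
```

A non-empty first set corresponds to the situation here, where the image contains providers that the expected list does not mention.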





[GitHub] [airflow] potiuk edited a comment on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083102272


   > @potiuk can you take a look at the test failure? `AssertionError: List of expected installed packages and image content mismatch. Check /home/runner/work/airflow/airflow/scripts/ci/installed_providers.txt file.`
   > 
   > I don't recall that when adding a new provider we need to edit the CI script
   
   Not everything in providers has to be me :) - this test was actually added by @mik-laj: https://github.com/apache/airflow/commit/621d17bb77e3160c1a927803e5d190c0e2aade3c
   
   It looks like, for some reason, the production image produced in this build contains many more providers than it should.





[GitHub] [airflow] potiuk edited a comment on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085016498


   TL;DR; I know the problem. The problem is that `python-arango` package dependencies are extremely wrong and break airflow package installation. We cannot accept python-arango until they fix it.
   
   I was not far off when I guessed that `setuptools` was the problem.
   
   The problem is that the `python-arango` package depends on this:
   
   ```
   setuptools-scm[toml] (>=3.4)
   ```
   
   This is a package that automatically adds every file under version control to the package being built. This basically means that we have no way to control which source files end up in which package. It's EXTREMELY disruptive to package preparation. An issue about that should be opened with the ArangoDB developers, and they should remove this dependency and fix the way they build their package.
   
   The effect of this change was that when we build the airflow package, all providers are added to it even if we did not include them. It's VERY WRONG.
   
   There is no way we can approve it until the ArangoDB developers fix it. @eladkal @pateash, could you open an issue and get it solved?
   
   This is rather easy to reproduce:
   
   1. Check out the `main` branch of airflow
   2. Run `rm -rf dist/*`
   3. Run `./breeze build-image`
   4. Run `./breeze prepare-airflow-packages`
   5. In the `dist` folder you will find a `.whl` package (it's really a zip file) which does not contain any of the providers (you can check it with `unzip -t ./dist/apache*`)
   
   6. Run `rm -rf dist/*`
   7. Modify setup.cfg by adding `setuptools-scm[toml] (>=3.4)` to install_requires
   8. Run `./breeze build-image`
   9. Run `./breeze prepare-airflow-packages`
   10. In the `dist` folder you will find a `.whl` package (it's really a zip file) which contains all providers
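
The check in steps 5 and 10 can also be scripted. The following is a minimal sketch, assuming provider code sits under `airflow/providers/` inside the wheel and that the wheel file name starts with `apache`, as in the `unzip` command above. The function name `provider_files_in_wheel` is hypothetical and not part of Airflow or breeze.

```python
import zipfile
from pathlib import Path


def provider_files_in_wheel(wheel_path):
    """Return the archive members that live under airflow/providers/."""
    # A .whl file is a regular zip archive, so zipfile can read it directly.
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            if name.startswith("airflow/providers/")
        )


if __name__ == "__main__":
    # Assumption: the wheels were built into ./dist as in the steps above.
    for wheel_path in sorted(Path("dist").glob("apache*.whl")):
        found = provider_files_in_wheel(wheel_path)
        print(f"{wheel_path.name}: {len(found)} provider files")
```

If the description above holds, the wheel from step 5 should report no provider files and the wheel from step 10 should report many.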
   
   `setuptools-scm[toml] (>=3.4)` is a dependency of `python-arango`:
   
   ```
   ⏚ [jarek:~/code/airflow] main ± http https://pypi.org/pypi/python-arango/json | jq ".info.requires_dist"
   [
     "urllib3 (>=1.26.0)",
     "requests",
     "requests-toolbelt",
     "PyJWT",
     "setuptools (>=42)",
     "setuptools-scm[toml] (>=3.4)",
     "dataclasses (>=0.6) ; python_version < \"3.7\"",
     "black ; extra == 'dev'",
     "flake8 (>=3.8.4) ; extra == 'dev'",
     "isort (>=5.0.0) ; extra == 'dev'",
     "mypy (>=0.790) ; extra == 'dev'",
     "mock ; extra == 'dev'",
     "pre-commit (>=2.9.3) ; extra == 'dev'",
     "pytest (>=6.0.0) ; extra == 'dev'",
     "pytest-cov (>=2.0.0) ; extra == 'dev'",
     "sphinx ; extra == 'dev'",
     "sphinx-rtd-theme ; extra == 'dev'",
     "types-pkg-resources ; extra == 'dev'",
     "types-requests ; extra == 'dev'"
   ]
   ```
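
The same inspection can be sketched in Python instead of `http` and `jq`. This is an illustrative sketch only: it reads the same PyPI JSON endpoint used in the command above, and the helper names (`fetch_requires_dist`, `runtime_requirements`, `flag_build_tools`) are hypothetical. The extra filter is a plain substring check on `extra ==`, which is an assumption that fits the output shown here rather than a full marker parser.

```python
import json
import re
from urllib.request import urlopen


def fetch_requires_dist(package):
    """Fetch the declared dependencies of a package from the PyPI JSON API."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as response:
        return json.load(response)["info"]["requires_dist"] or []


def requirement_name(requirement):
    """Extract the normalized distribution name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
    return match.group(0).lower().replace("_", "-") if match else ""


def runtime_requirements(requires_dist):
    """Keep only requirements that are not gated behind an extra."""
    return [req for req in requires_dist if "extra ==" not in req]


def flag_build_tools(requires_dist, build_tools=("setuptools-scm",)):
    """Return runtime requirements whose name is a known build-time tool."""
    return [
        req
        for req in runtime_requirements(requires_dist)
        if requirement_name(req) in build_tools
    ]


if __name__ == "__main__":
    # Needs network access; prints any build tools declared as runtime deps.
    print(flag_build_tools(fetch_requires_dist("python-arango")))
```

Run against the `requires_dist` list shown above, `flag_build_tools` would return the `setuptools-scm[toml] (>=3.4)` entry, since it carries no `extra` marker.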
   
   
   





[GitHub] [airflow] pateash commented on pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1081882408


   ![image](https://user-images.githubusercontent.com/16856802/160623778-bfaffe50-f79e-4ac5-a5a6-6648e1d0d5e3.png)
   





[GitHub] [airflow] potiuk commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083162902


   > I don't recall we had such issue when GitHub provider was added (and it was after [621d17b](https://github.com/apache/airflow/commit/621d17bb77e3160c1a927803e5d190c0e2aade3c) )
   
   Me neither. It basically SHOULD NOT happen :D. Yet it seems it did again.





[GitHub] [airflow] pateash commented on a change in pull request #22548: WIP: Add Arango hook

Posted by GitBox <gi...@apache.org>.
pateash commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r837484314



##########
File path: airflow/providers/arangodb/operators/arangodb.py
##########
@@ -0,0 +1,61 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import TYPE_CHECKING, Optional, Sequence, Callable
+
+from airflow.models import BaseOperator
+from airflow.providers.arangodb.hooks.arangodb import ArangoDBHook
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class AQLOperator(BaseOperator):
+    """
+    Executes AQL query in a ArangoDB database
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:AQLOperator`
+
+    :param sql: the AQL query to be executed. Can receive a str representing a
+        sql statement
+    :param result_processor: function to further process the Result from ArangoDB
+    :param arangodb_conn_id: Reference to :ref:`ArangoDB connection id <howto/connection:arangodb>`.
+    """
+
+    template_fields: Sequence[str] = ('sql',)

Review comment:
       Yeah, makes sense.
   Added.

##########
File path: airflow/providers/arangodb/sensors/arangodb.py
##########
@@ -0,0 +1,55 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from typing import TYPE_CHECKING, Sequence
+
+from airflow.providers.arangodb.hooks.arangodb import ArangoDBHook
+from airflow.sensors.base import BaseSensorOperator
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class AQLSensor(BaseSensorOperator):
+    """
+    Checks for the existence of a document which
+    matches the given query in ArangoDB. Example:
+
+    :param collection: Target DB collection.
+    :param query: The query to find the target document.
+    :param arangodb_conn_id: The :ref:`ArangoDB connection id <howto/connection:arangodb>` to use
+        when connecting to ArangoDB.
+    :param arangodb_db: Target ArangoDB name.
+    """
+
+    template_fields: Sequence[str] = ('sql',)

Review comment:
       I am not sure, we really want to 
   added







[GitHub] [airflow] pateash edited a comment on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash edited a comment on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085701006


   voila 🥳,
   It worked.
   Thanks @joowani 





[GitHub] [airflow] eladkal commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1083093840


   @potiuk can you take a look at the test failure?
   `AssertionError: List of expected installed packages and image content mismatch. Check /home/runner/work/airflow/airflow/scripts/ci/installed_providers.txt file.`
   
   I don't recall that when adding a new provider we need to edit the CI script





[GitHub] [airflow] pateash commented on pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
pateash commented on pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#issuecomment-1085701006


   voila 🥳,
   It worked.





[GitHub] [airflow] eladkal commented on a change in pull request #22548: Adding ArangoDB Provider

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #22548:
URL: https://github.com/apache/airflow/pull/22548#discussion_r841175175



##########
File path: docs/apache-airflow/extra-packages-ref.rst
##########
@@ -256,6 +256,8 @@ Those are extras that add dependencies needed for integration with other softwar
 +---------------------+-----------------------------------------------------+-------------------------------------------+
 | trino               | ``pip install 'apache-airflow[trino]'``             | All Trino related operators & hooks       |
 +---------------------+-----------------------------------------------------+-------------------------------------------+
+| arangodb            | ``pip install 'apache-airflow[arangodb]'``          | ArangoDB operators, sensors and hook      |

Review comment:
       I think this list is sorted alphabetically? 



