You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/10/10 13:50:06 UTC

[GitHub] [airflow] blcksrx opened a new pull request, #26970: Impala

blcksrx opened a new pull request, #26970:
URL: https://github.com/apache/airflow/pull/26970

   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of an existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011470569


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImapalHook(ImpalaHook):

Review Comment:
   is it spelled right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx closed pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx closed pull request #26970: Impala Hook
URL: https://github.com/apache/airflow/pull/26970


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1009726027


##########
generated/provider_dependencies.json:
##########
@@ -107,6 +107,12 @@
       "vertica"
     ]
   },
+  "apache.impala": {
+    "deps": [
+      "impyla>=0.18.0,<1.0"

Review Comment:
   I believe it could broke some dependency as exactly what happened on old PR and we needed to wait for new version of `impyla`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011466436


##########
airflow/providers/apache/impala/provider.yaml:
##########
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-impala
+name: Apache Impala
+description: |
+    `Apache Impala <https://impala.apache.org/>`__
+versions:
+  - 2.4.1

Review Comment:
   ```suggestion
     - 1.0.0
   ```



##########
airflow/providers/apache/impala/provider.yaml:
##########
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-impala
+name: Apache Impala
+description: |
+    `Apache Impala <https://impala.apache.org/>`__
+versions:
+  - 2.4.1
+
+dependencies:
+  - impyla>=0.18.0,<1.0

Review Comment:
   Should also have apache-airflow>=2.3.0



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1284212589

   @blcksrx You might want to consider allowing additional parameters to be passed to `impala.dbapi.connect` for more advanced configurations such as *LDAP (`auth_mechanism = "LDAP"`) etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1012799383


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   It seems like these lines were accidentally removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1356313116

   @blcksrx for some reason the thread on the airflow dependency was resolved but it wasn't added.
   we need minimum apache-airflow>=2.3.0 as dependency in the provider yaml.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r993692527


##########
airflow/providers/apache/impala/README.md:
##########
@@ -0,0 +1,59 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+# Package apache-airflow-backport-providers-apache-impala

Review Comment:
   This is not right. There is no backport provider.
   
   I think this README file is redundant.



##########
airflow/providers/apache/impala/provider.yaml:
##########
@@ -0,0 +1,35 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-impala
+name: Apache Impala
+description: |
+    `Apache Impala <https://impala.apache.org/>`__
+
+versions:
+  - 1.0.0b2

Review Comment:
   Why beta?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1296476331

   Conflicts need resolving tool


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1049069615


##########
INSTALL:
##########
@@ -95,6 +95,7 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
+<<<<<<< HEAD

Review Comment:
   Same



##########
CONTRIBUTING.rst:
##########
@@ -611,6 +611,7 @@ This is the full list of those extras:
 
   .. START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
+<<<<<<< HEAD

Review Comment:
   You have a bad merge conflict



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1022379026


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   IIRC we merged `compile_assets` into startup so this is no longer needed? Don’t quite remember, I think @potiuk might?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal merged pull request #26970: Add `ImpalaHook`

Posted by GitBox <gi...@apache.org>.
eladkal merged PR #26970:
URL: https://github.com/apache/airflow/pull/26970


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1294888741

   @eladkal could you please review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1008877401


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'

Review Comment:
   Replace quotes:
   
   ```suggestion
   class TestImpalaHookConn(unittest.TestCase):
       def setUp(self):
           super().setUp()
           self.connection = Connection(
               login="login",
               password="password",
               host="host",
               port=21050,
               schema="test",
           )
   
           class UnitTestImapalHook(ImpalaHook):
               conn_name_attr = "impala_conn_id"
   ```



##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'
+
+        self.db_hook = UnitTestImapalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+
+        self.cur = mock.MagicMock()
+        self.conn = mock.MagicMock()
+        self.conn.cursor.return_value = self.cur
+        conn = self.conn
+
+        class UnitTestImpalaHook(ImpalaHook):
+            conn_name_attr = 'test_conn_id'
+
+            def get_conn(self):
+                return conn
+
+        self.db_hook = UnitTestImpalaHook()
+
+    @patch('airflow.hooks.dbapi_hook.DbApiHook.insert_rows')
+    def test_insert_rows(self, mock_insert_rows):

Review Comment:
   Replace quotes:
   
   ```suggestion
           class UnitTestImpalaHook(ImpalaHook):
               conn_name_attr = "test_conn_id"
   
               def get_conn(self):
                   return conn
   
           self.db_hook = UnitTestImpalaHook()
   
       @patch("airflow.hooks.dbapi_hook.DbApiHook.insert_rows")
       def test_insert_rows(self, mock_insert_rows):
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1012799383


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   @blcksrx It seems like these lines were accidentally removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1336124082

   Hmm. Once again (and I know it's again much too ask).
   
   Somehow your change keeps on getting conflicts but .. it should not. I think big problem is that you have multiple merges and you are merging main to your change rather than rebasing your commits on top of the latest main.
   
    I think this causes all kinds of problems that we have to ask you for rebase rather than doing the rebase ourselves when needed.
   
   This is what I see when I try to rebase now:
   
   <img width="503" alt="Screenshot 2022-12-03 at 10 40 20" src="https://user-images.githubusercontent.com/595491/205434475-aa9d9b41-20dd-45bb-b2f5-32d0b3a02ca1.png">
   
   Rebasing is way better than merging especially in the scenario that you have long running change that has some failures to be fixed. 
   
   Those "merge" commits that you have in the history of your PR are those that cause the conflicts:
   
   <img width="788" alt="Screenshot 2022-12-03 at 10 41 57" src="https://user-images.githubusercontent.com/595491/205434524-2773c881-49d0-40c0-9b51-7673de5d3b7f.png">
   
   Is it possible that you rebase your change ? Just squash your change into a single commit and rebase on top of main (see our CONTRIBUTING.rst for some guides on that).
   
   Currently you have this:
   
   <img width="505" alt="Screenshot 2022-12-03 at 10 36 37" src="https://user-images.githubusercontent.com/595491/205434352-1a58ce5f-f37b-46ba-8cc5-5b31211b146e.png">
   
   But what would be much better if you have "you are 1 commit ahead, 0 commits behind" and then we can rebase that one commit of yours on top


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1009622184


##########
generated/provider_dependencies.json:
##########
@@ -107,6 +107,12 @@
       "vertica"
     ]
   },
+  "apache.impala": {
+    "deps": [
+      "impyla>=0.18.0,<1.0"

Review Comment:
   Why should we limit?
   This invites questions when 1.0 will be released so if there is no good reason to limit in advanced we should relax the dependency.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1301103878

   > There are some test failures :( 
   
   Its second time pre-commit didn't throw errors on my side. This is irritating. Im going to fix it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1318119271

   @blcksrx Do you need assistance with the tests?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Add `ImpalaHook`

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1373396754

   πŸ₯³


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1354394485

   @potiuk i took a look at CI. The wait for CI images steps takes 2 hours befor timeout. Retry doesnt solve it.
   I know something is missing/not right in the PR but I wasnt able to find exactly what we are missing here. Will you have time to give us some clues?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1317712622

   Tests need fixing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1318227449

   Yeah, please. Im really tired of this PR πŸ˜€ 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011511887


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImpalHook(ImpalaHook):
+            conn_name_attr = "impala_conn_id"
+
+        self.db_hook = UnitTestImpalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):

Review Comment:
   Would be cool to write it as pytest
   We have on going effort to do it so we can have less work in the future https://github.com/apache/airflow/pull/26831



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1297483607

   > > @eladkal could you please review this PR?
   > 
   > Did you address my comment? Maybe you forgot to commit the change as I don't see the connection doc in the PR.
   
   Sorry, it was included on an old commits. I just rescued it. my bad


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011561779


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImpalHook(ImpalaHook):
+            conn_name_attr = "impala_conn_id"
+
+        self.db_hook = UnitTestImpalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):

Review Comment:
   yeah sure



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1012807974


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   its not accidentally. last CI ran to a problem to fix it I needed to ran this command:
   ```shell
   breeze setup regenerate-command-images
   ```
   and it changed some files.
   https://github.com/apache/airflow/actions/runs/3380995761/jobs/5614846732



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1284445767

   > > @blcksrx You might want to consider allowing additional parameters to be passed to `impala.dbapi.connect` for more advanced configurations such as _LDAP (`auth_mechanism = "LDAP"`)_ etc.
   > > The most flexible way would be to allow passing `**kwargs`.
   > 
   > It's already implemented by passing `extra_dejson` to the `connect` method. does it need to be added also in constructor?
   
   Seems like that will do it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1023702617


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   @potiuk done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1008877705


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'
+
+        self.db_hook = UnitTestImapalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+
+        self.cur = mock.MagicMock()
+        self.conn = mock.MagicMock()
+        self.conn.cursor.return_value = self.cur
+        conn = self.conn
+
+        class UnitTestImpalaHook(ImpalaHook):
+            conn_name_attr = 'test_conn_id'
+
+            def get_conn(self):
+                return conn
+
+        self.db_hook = UnitTestImpalaHook()
+
+    @patch('airflow.hooks.dbapi_hook.DbApiHook.insert_rows')
+    def test_insert_rows(self, mock_insert_rows):
+        table = "table"
+        rows = [("hello",), ("world",)]
+        target_fields = None
+        commit_every = 10
+        self.db_hook.insert_rows(table, rows, target_fields, commit_every)
+        mock_insert_rows.assert_called_once_with(table, rows, None, 10)
+
+    def test_get_first_record(self):
+        statement = 'SQL'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.fetchone.return_value = result_sets[0]
+
+        self.assertEqual(result_sets[0], self.db_hook.get_first(statement))
+        self.conn.close.assert_called_once_with()
+        self.cur.close.assert_called_once_with()
+        self.cur.execute.assert_called_once_with(statement)
+
+    def test_get_records(self):
+        statement = 'SQL'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.fetchall.return_value = result_sets
+
+        self.assertEqual(result_sets, self.db_hook.get_records(statement))
+        self.conn.close.assert_called_once_with()
+        self.cur.close.assert_called_once_with()
+        self.cur.execute.assert_called_once_with(statement)
+
+    def test_get_pandas_df(self):
+        statement = 'SQL'
+        column = 'col'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.description = [(column,)]
+        self.cur.fetchall.return_value = result_sets
+        df = self.db_hook.get_pandas_df(statement)

Review Comment:
   Replace quotes:
   
   ```suggestion
       def test_get_first_record(self):
           statement = "SQL"
           result_sets = [("row1",), ("row2",)]
           self.cur.fetchone.return_value = result_sets[0]
   
           self.assertEqual(result_sets[0], self.db_hook.get_first(statement))
           self.conn.close.assert_called_once_with()
           self.cur.close.assert_called_once_with()
           self.cur.execute.assert_called_once_with(statement)
   
       def test_get_records(self):
           statement = "SQL"
           result_sets = [("row1",), ("row2",)]
           self.cur.fetchall.return_value = result_sets
   
           self.assertEqual(result_sets, self.db_hook.get_records(statement))
           self.conn.close.assert_called_once_with()
           self.cur.close.assert_called_once_with()
           self.cur.execute.assert_called_once_with(statement)
   
       def test_get_pandas_df(self):
           statement = "SQL"
           column = "col"
           result_sets = [("row1",), ("row2",)]
           self.cur.description = [(column,)]
           self.cur.fetchall.return_value = result_sets
           df = self.db_hook.get_pandas_df(statement)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1296277896

   @blcksrx Have a look at this [failing test](https://github.com/apache/airflow/actions/runs/3306614640/jobs/5457864748#step:11:12065)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1012066261


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImpalHook(ImpalaHook):
+            conn_name_attr = "impala_conn_id"
+
+        self.db_hook = UnitTestImpalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1301074721

   There are some test failures :( 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r993689390


##########
setup.py:
##########
@@ -293,6 +293,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'hmsclient>=0.1.0',
     'pyhive[hive]>=0.6.0',
 ]
+impala = ["impyla>=0.16.3"]

Review Comment:
   @uranusjr I assume it has to do with peer-dependencies mentioned [on a previous PR](https://github.com/apache/airflow/pull/8918)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1371996241

   Latest error was already fixed in https://github.com/apache/airflow/pull/28725
   I rebased the PR. lets see


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
Taragolis commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1043596965


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImpalHook(ImpalaHook):
+            conn_name_attr = "impala_conn_id"
+
+        self.db_hook = UnitTestImpalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):

Review Comment:
   Class still based on `unittest.TestCase` and use their methods.
   You need to:
   1. Remove bases of `unittest.TestCase`
   2. Rename `def setUp(self)` to `def setup_method`. And you don't need call `super()`
   3. Instead of `self.assertEqual(a, b)` use just use `assert a == b`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1299968552

   @eladkal sorry it did took long and messy to fix it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1008877401


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'

Review Comment:
   Replace quotes:
   
   ```python
   class TestImpalaHookConn(unittest.TestCase):
       def setUp(self):
           super().setUp()
           self.connection = Connection(
               login="login",
               password="password",
               host="host",
               port=21050,
               schema="test",
           )
   
           class UnitTestImapalHook(ImpalaHook):
               conn_name_attr = "impala_conn_id"
   ```



##########
airflow/providers/apache/impala/hooks/impala.py:
##########
@@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from impala.dbapi import connect
+from impala.interface import Connection
+
+from airflow.hooks.dbapi import DbApiHook
+
+
+class ImpalaHook(DbApiHook):
+    """Interact with Apache Impala through impyla."""
+
+    conn_name_attr = 'impala_conn_id'
+    default_conn_name = 'impala_default'
+    conn_type = 'impala'
+    hook_name = "Impala"

Review Comment:
   You should replace all quotes with double-quotes in order to pass static checks.
   
   ```python
   class ImpalaHook(DbApiHook):
       """Interact with Apache Impala through impyla."""
   
       conn_name_attr = "impala_conn_id"
       default_conn_name = "impala_default"
       conn_type = "impala"
       hook_name = "Impala"
   ```



##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'
+
+        self.db_hook = UnitTestImapalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+
+        self.cur = mock.MagicMock()
+        self.conn = mock.MagicMock()
+        self.conn.cursor.return_value = self.cur
+        conn = self.conn
+
+        class UnitTestImpalaHook(ImpalaHook):
+            conn_name_attr = 'test_conn_id'
+
+            def get_conn(self):
+                return conn
+
+        self.db_hook = UnitTestImpalaHook()
+
+    @patch('airflow.hooks.dbapi_hook.DbApiHook.insert_rows')
+    def test_insert_rows(self, mock_insert_rows):
+        table = "table"
+        rows = [("hello",), ("world",)]
+        target_fields = None
+        commit_every = 10
+        self.db_hook.insert_rows(table, rows, target_fields, commit_every)
+        mock_insert_rows.assert_called_once_with(table, rows, None, 10)
+
+    def test_get_first_record(self):
+        statement = 'SQL'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.fetchone.return_value = result_sets[0]
+
+        self.assertEqual(result_sets[0], self.db_hook.get_first(statement))
+        self.conn.close.assert_called_once_with()
+        self.cur.close.assert_called_once_with()
+        self.cur.execute.assert_called_once_with(statement)
+
+    def test_get_records(self):
+        statement = 'SQL'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.fetchall.return_value = result_sets
+
+        self.assertEqual(result_sets, self.db_hook.get_records(statement))
+        self.conn.close.assert_called_once_with()
+        self.cur.close.assert_called_once_with()
+        self.cur.execute.assert_called_once_with(statement)
+
+    def test_get_pandas_df(self):
+        statement = 'SQL'
+        column = 'col'
+        result_sets = [('row1',), ('row2',)]
+        self.cur.description = [(column,)]
+        self.cur.fetchall.return_value = result_sets
+        df = self.db_hook.get_pandas_df(statement)

Review Comment:
   Replace quotes:
   
   ```python
       def test_get_first_record(self):
           statement = "SQL"
           result_sets = [("row1",), ("row2",)]
           self.cur.fetchone.return_value = result_sets[0]
   
           self.assertEqual(result_sets[0], self.db_hook.get_first(statement))
           self.conn.close.assert_called_once_with()
           self.cur.close.assert_called_once_with()
           self.cur.execute.assert_called_once_with(statement)
   
       def test_get_records(self):
           statement = "SQL"
           result_sets = [("row1",), ("row2",)]
           self.cur.fetchall.return_value = result_sets
   
           self.assertEqual(result_sets, self.db_hook.get_records(statement))
           self.conn.close.assert_called_once_with()
           self.cur.close.assert_called_once_with()
           self.cur.execute.assert_called_once_with(statement)
   
       def test_get_pandas_df(self):
           statement = "SQL"
           column = "col"
           result_sets = [("row1",), ("row2",)]
           self.cur.description = [(column,)]
           self.cur.fetchall.return_value = result_sets
           df = self.db_hook.get_pandas_df(statement)
   ```



##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,110 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login='login',
+            password='password',
+            host='host',
+            port=21050,
+            schema='test',
+        )
+
+        class UnitTestImapalHook(ImpalaHook):
+            conn_name_attr = 'impala_conn_id'
+
+        self.db_hook = UnitTestImapalHook()
+        self.db_hook.get_connection = mock.Mock()
+        self.db_hook.get_connection.return_value = self.connection
+
+    @patch("airflow.providers.apache.impala.hooks.impala.connect")
+    def test_get_conn(self, mock_connect):
+        self.db_hook.get_conn()
+        mock_connect.assert_called_once_with(
+            host="host", port=21050, user="login", password="password", database="test"
+        )
+
+
+class TestImpalaHook(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+
+        self.cur = mock.MagicMock()
+        self.conn = mock.MagicMock()
+        self.conn.cursor.return_value = self.cur
+        conn = self.conn
+
+        class UnitTestImpalaHook(ImpalaHook):
+            conn_name_attr = 'test_conn_id'
+
+            def get_conn(self):
+                return conn
+
+        self.db_hook = UnitTestImpalaHook()
+
+    @patch('airflow.hooks.dbapi_hook.DbApiHook.insert_rows')
+    def test_insert_rows(self, mock_insert_rows):

Review Comment:
   Replace quotes:
   
   ```python
           class UnitTestImpalaHook(ImpalaHook):
               conn_name_attr = "test_conn_id"
   
               def get_conn(self):
                   return conn
   
           self.db_hook = UnitTestImpalaHook()
   
       @patch("airflow.hooks.dbapi_hook.DbApiHook.insert_rows")
       def test_insert_rows(self, mock_insert_rows):
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
uranusjr commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r991803561


##########
setup.py:
##########
@@ -293,6 +293,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'hmsclient>=0.1.0',
     'pyhive[hive]>=0.6.0',
 ]
+impala = ["impyla>=0.16.3"]

Review Comment:
   Any particular reasoning for this dependency range?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1276584444

   Hwy guys thanks for the comments.
   This is my old pr that I opened it.
   Im going to work on it on this Friday


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1284246945

   > @blcksrx You might want to consider allowing additional parameters to be passed to `impala.dbapi.connect` for more advanced configurations such as _LDAP (`auth_mechanism = "LDAP"`)_ etc.
   > 
   > The most flexible way would be to allow passing `**kwargs`.
   
   It's already implemented by passing `extra_dejson` to the `connect` method. does it need to be added also in constructor? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1357592048

   error to address:
   ```
   Checking dist/apache-airflow-providers-apache-impala-2.4.1.dev0.tar.gz: FAILED
   ERROR    `long_description` has syntax errors in markup and would not be        
            rendered on PyPI.   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Add `ImpalaHook`

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1373319681

   :tada::tada::tada::tada:


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Add `ImpalaHook`

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1373339289

   πŸŽ‰ 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1022645624


##########
INSTALL:
##########
@@ -95,22 +95,15 @@ The list of available extras:
 
 # START EXTRAS HERE
 airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra, apache.drill,
-apache.druid, apache.hdfs, apache.hive, apache.kylin, apache.livy, apache.pig, apache.pinot,
-apache.spark, apache.sqoop, apache.webhdfs, arangodb, asana, async, atlas, atlassian.jira, aws,
-azure, cassandra, celery, cgroups, cloudant, cncf.kubernetes, common.sql, crypto, dask, databricks,
-datadog, dbt.cloud, deprecated_api, devel, devel_all, devel_ci, devel_hadoop, dingding, discord,
-doc, docker, druid, elasticsearch, exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise,
-google, google_auth, grpc, hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos,
-kubernetes, ldap, leveldb, microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo,
-mssql, mysql, neo4j, odbc, openfaas, opsgenie, oracle, pagerduty, pandas, papermill, password,
-pinot, plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment,
-sendgrid, sentry, sftp, singularity, slack, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
-telegram, trino, vertica, virtualenv, webhdfs, winrm, yandex, zendesk
-# END EXTRAS HERE
-
-# For installing Airflow in development environments - see CONTRIBUTING.rst
-
-# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
-# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
-
-python setup.py compile_assets

Review Comment:
   Good eye @bdsoha - I think this was removed by accident early when generating extras (not really connected with regenerating images). Could you please restore it @blcksrx ?
   
   @uranusjr: not really.
   
   A bit of context.
   
   The `compile_assets` is only automated when you start breeze, not in any other circumstances. The INSTALL here describes the process of installing airflow manualy from sources - this is the absolute requirement of the Apache Software Foundation, that "sufficient platform and tools" you should be able to take sources published in https://downloads.apache.org/airflow/2.4.3/ as *-sources.tar.gz and you should be able to get workinf software.
   
   This is what INSTALL doc is for (it is also the first thing anyone who takes the .source.tgz file should look at). It is also described in this part of our installation documentation https://airflow.apache.org/docs/apache-airflow/stable/installation/index.html#using-released-sources. Those published sources should only contain source files that can and should be possible to manually edit to make any change to the software of ours. It should not contain any compiled sources 
   
   This is described here: https://www.apache.org/legal/release-policy.html#source-packages
   
   > Every ASF release MUST contain one or more source packages, which MUST be sufficient for a user to build and test the release provided they have access to the appropriate platform and tools. A source release SHOULD not contain compiled code.
   
   This is not optional, it's a mandatory requirement of ASF release policy and we MUST follow it - otherwise PMC members of Apache Airlfow are not idemnified by the Apache Sofware Foundation in case of any legal battles - this is one of the reasons why we have release voting process and why PMC members only have binding votes.
   
   In this context, running `python setup.py compile_assets` is absolutely necessary step in those instructions - we cannot (and we do not) have compiled assets released in the source.tgz. file and INSTALL document is about this file.
   
   BTW. By answering this one I noticed that there are few things about installing providers and using constraints that are not needed here, actually, so I will shortly correct it.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1356266628

   @eladkal Idk what was the problem, I just rebased it and it's done!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1051419948


##########
airflow/providers/apache/impala/provider.yaml:
##########
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-impala
+name: Apache Impala
+description: |
+    `Apache Impala <https://impala.apache.org/>`__
+versions:
+  - 2.4.1
+
+dependencies:
+  - impyla>=0.18.0,<1.0

Review Comment:
   @blcksrx this still need to be handled..
   We need to add minimum Airflow version `apache-airflow>=2.3.0`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r993689390


##########
setup.py:
##########
@@ -293,6 +293,7 @@ def write_version(filename: str = os.path.join(*[my_dir, "airflow", "git_version
     'hmsclient>=0.1.0',
     'pyhive[hive]>=0.6.0',
 ]
+impala = ["impyla>=0.16.3"]

Review Comment:
   @uranusjr I assume it has to do with peer-dependencies mentioned [here](https://github.com/apache/airflow/pull/8918)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] bdsoha commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
bdsoha commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1008877198


##########
airflow/providers/apache/impala/hooks/impala.py:
##########
@@ -0,0 +1,42 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from impala.dbapi import connect
+from impala.interface import Connection
+
+from airflow.hooks.dbapi import DbApiHook
+
+
+class ImpalaHook(DbApiHook):
+    """Interact with Apache Impala through impyla."""
+
+    conn_name_attr = 'impala_conn_id'
+    default_conn_name = 'impala_default'
+    conn_type = 'impala'
+    hook_name = "Impala"

Review Comment:
   You should replace all quotes with double-quotes in order to pass static checks.
   
   ```suggestion
   class ImpalaHook(DbApiHook):
       """Interact with Apache Impala through impyla."""
   
       conn_name_attr = "impala_conn_id"
       default_conn_name = "impala_default"
       conn_type = "impala"
       hook_name = "Impala"
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1297352622

   > @eladkal could you please review this PR?
   
   Did you address my comment? Maybe you forgot to commit the change as I don't see the connection doc in the PR.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] blcksrx commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
blcksrx commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011487431


##########
tests/providers/apache/impala/hooks/test_impala.py:
##########
@@ -0,0 +1,113 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+import unittest
+from unittest import mock
+from unittest.mock import patch
+
+from airflow.models import Connection
+from airflow.providers.apache.impala.hooks.impala import ImpalaHook
+
+
+class TestImpalaHookConn(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.connection = Connection(
+            login="login",
+            password="password",
+            host="host",
+            port=21050,
+            schema="test",
+        )
+
+        class UnitTestImapalHook(ImpalaHook):

Review Comment:
   Oh man no :(



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #26970:
URL: https://github.com/apache/airflow/pull/26970#discussion_r1011514697


##########
airflow/providers/apache/impala/provider.yaml:
##########
@@ -0,0 +1,41 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-apache-impala
+name: Apache Impala
+description: |
+    `Apache Impala <https://impala.apache.org/>`__
+versions:
+  - 2.4.1
+
+dependencies:
+  - impyla>=0.18.0,<1.0

Review Comment:
   @potiuk should we consider to write a precommit that verify all providers have apache-airflow in provider yaml dependencies? I'm not sure because it seems like some providers don't have it 
   https://github.com/apache/airflow/blob/8f5e100f30764e7b1818a336feaa8bb390cbb327/airflow/providers/http/provider.yaml#L38



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1311065794

   Needs rebase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1327880189

   One more rebase?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26970: Impala Hook

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26970:
URL: https://github.com/apache/airflow/pull/26970#issuecomment-1319386686

   Unfortunatetly conflict. Yeah. It takes long but addig new provider is not fast/easy. You will get there :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org