Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/19 23:15:34 UTC

[GitHub] [airflow] ankurbajaj9 opened a new pull request, #24550: Bugfix/optional port

ankurbajaj9 opened a new pull request, #24550:
URL: https://github.com/apache/airflow/pull/24550

   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #24022
   related: #24022
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards-incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ankurbajaj9 commented on pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1167966742

   Thanks for the run. I have fixed all the issues that I could find. Please enable the workflows awaiting approval once again to check that everything is working.




[GitHub] [airflow] pankajastro commented on pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
pankajastro commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1164501375

   Could you please add a description, such as a case or a dummy extra, where it is not working?




[GitHub] [airflow] ankurbajaj9 commented on a diff in pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on code in PR #24550:
URL: https://github.com/apache/airflow/pull/24550#discussion_r905400115


##########
airflow/providers/apache/hdfs/hooks/webhdfs.py:
##########
@@ -63,39 +63,51 @@ def get_conn(self) -> Any:
         """
         connection = self._find_valid_server()
         if connection is None:
-            raise AirflowWebHDFSHookException("Failed to locate the valid server.")
+            raise AirflowWebHDFSHookException(
+                "Failed to locate the valid server.")
         return connection
 
     def _find_valid_server(self) -> Any:
         connection = self.get_connection(self.webhdfs_conn_id)
         namenodes = connection.host.split(',')
         for namenode in namenodes:
             host_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-            self.log.info("Trying to connect to %s:%s", namenode, connection.port)
+            self.log.info("Trying to connect to %s:%s",
+                          namenode, connection.port)
             try:
-                conn_check = host_socket.connect_ex((namenode, connection.port))
+                conn_check = host_socket.connect_ex(
+                    (namenode, connection.port))
                 if conn_check == 0:
                     self.log.info('Trying namenode %s', namenode)
                     client = self._get_client(
-                        namenode, connection.port, connection.login, connection.extra_dejson
+                        namenode, connection.port, connection.login, connection.schema,
+                        connection.extra_dejson
                     )
                     client.status('/')
                     self.log.info('Using namenode %s for hook', namenode)
                     host_socket.close()
                     return client
                 else:
-                    self.log.warning("Could not connect to %s:%s", namenode, connection.port)
+                    self.log.warning("Could not connect to %s:%s",
+                                     namenode, connection.port)
             except HdfsError as hdfs_error:
-                self.log.info('Read operation on namenode %s failed with error: %s', namenode, hdfs_error)
+                self.log.info(
+                    'Read operation on namenode %s failed with error: %s', namenode, hdfs_error)
         return None
 
-    def _get_client(self, namenode: str, port: int, login: str, extra_dejson: dict) -> Any:
-        connection_str = f'http://{namenode}:{port}'
+    def _get_client(self, namenode: str, port: int, login: str, schema: str, extra_dejson: dict) -> Any:
+        connection_str = f'http://{namenode}'
         session = requests.Session()
 
-        if extra_dejson.get('use_ssl', False):
-            connection_str = f'https://{namenode}:{port}'
-            session.verify = extra_dejson.get('verify', True)
+        if extra_dejson.get('use_ssl', 'False') == 'True':

Review Comment:
   When using the application, it takes JSON input from the UI, which only accepts strings. Should I change it to accept both?
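
One way to accept both forms is sketched below. It is a minimal sketch, not the code merged in this PR: `as_bool` and `_TRUE_STRINGS` are hypothetical names, and the set of strings treated as true is an assumption.

```python
from typing import Any

# Assumption: these spellings count as "true"; anything else is false.
_TRUE_STRINGS = {"true", "1", "yes", "on"}


def as_bool(value: Any, default: bool = False) -> bool:
    """Hypothetical helper: read a connection-extra value as a boolean.

    Accepts real booleans as well as the string forms a UI text field
    might deliver.
    """
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() in _TRUE_STRINGS
    return bool(value)


extra_from_ui = {"use_ssl": "False"}   # string typed into a form field
extra_from_json = {"use_ssl": True}    # JSON boolean

print(as_bool(extra_from_ui.get("use_ssl")))    # False
print(as_bool(extra_from_json.get("use_ssl")))  # True
print(as_bool({}.get("use_ssl")))               # False (key missing)
```

With a helper like this, the condition in `_get_client` could read `if as_bool(extra_dejson.get('use_ssl')):` and would not depend on which type the extra carries.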





[GitHub] [airflow] pankajastro commented on a diff in pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
pankajastro commented on code in PR #24550:
URL: https://github.com/apache/airflow/pull/24550#discussion_r905112068


##########
airflow/providers/apache/hdfs/hooks/webhdfs.py:
##########
@@ -63,39 +63,51 @@ def get_conn(self) -> Any:
         """
         connection = self._find_valid_server()
         if connection is None:
-            raise AirflowWebHDFSHookException("Failed to locate the valid server.")
+            raise AirflowWebHDFSHookException(
+                "Failed to locate the valid server.")
         return connection
 
     def _find_valid_server(self) -> Any:
         connection = self.get_connection(self.webhdfs_conn_id)
         namenodes = connection.host.split(',')
         for namenode in namenodes:
             host_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-            self.log.info("Trying to connect to %s:%s", namenode, connection.port)
+            self.log.info("Trying to connect to %s:%s",
+                          namenode, connection.port)
             try:
-                conn_check = host_socket.connect_ex((namenode, connection.port))
+                conn_check = host_socket.connect_ex(
+                    (namenode, connection.port))
                 if conn_check == 0:
                     self.log.info('Trying namenode %s', namenode)
                     client = self._get_client(
-                        namenode, connection.port, connection.login, connection.extra_dejson
+                        namenode, connection.port, connection.login, connection.schema,
+                        connection.extra_dejson
                     )
                     client.status('/')
                     self.log.info('Using namenode %s for hook', namenode)
                     host_socket.close()
                     return client
                 else:
-                    self.log.warning("Could not connect to %s:%s", namenode, connection.port)
+                    self.log.warning("Could not connect to %s:%s",
+                                     namenode, connection.port)
             except HdfsError as hdfs_error:
-                self.log.info('Read operation on namenode %s failed with error: %s', namenode, hdfs_error)
+                self.log.info(
+                    'Read operation on namenode %s failed with error: %s', namenode, hdfs_error)
         return None
 
-    def _get_client(self, namenode: str, port: int, login: str, extra_dejson: dict) -> Any:
-        connection_str = f'http://{namenode}:{port}'
+    def _get_client(self, namenode: str, port: int, login: str, schema: str, extra_dejson: dict) -> Any:
+        connection_str = f'http://{namenode}'
         session = requests.Session()
 
-        if extra_dejson.get('use_ssl', False):
-            connection_str = f'https://{namenode}:{port}'
-            session.verify = extra_dejson.get('verify', True)
+        if extra_dejson.get('use_ssl', 'False') == 'True':

Review Comment:
   I think the connection extra can be either a string or a dict/JSON, so I'm not sure whether this change is actually required: https://github.com/apache/airflow/blob/main/airflow/models/connection.py#L119
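
The concern can be illustrated with plain `json.loads`, assuming the extra field is decoded with standard JSON semantics. The function `matches_diff_check` is a hypothetical stand-in that copies only the comparison from the new line in the diff; it is not part of the hook.

```python
import json

# The same setting written two ways in the connection extra.
extra_with_bool = json.loads('{"use_ssl": true}')
extra_with_str = json.loads('{"use_ssl": "True"}')

print(type(extra_with_bool["use_ssl"]).__name__)  # bool
print(type(extra_with_str["use_ssl"]).__name__)   # str


def matches_diff_check(extra: dict) -> bool:
    # Same comparison as the changed line: only the exact string 'True' passes.
    return extra.get("use_ssl", "False") == "True"


print(matches_diff_check(extra_with_str))   # True
print(matches_diff_check(extra_with_bool))  # False: True != 'True'
```

Under that assumption, an extra that stores a JSON boolean `true` would no longer enable SSL once the condition compares against the string `'True'`.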





[GitHub] [airflow] ankurbajaj9 commented on a diff in pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on code in PR #24550:
URL: https://github.com/apache/airflow/pull/24550#discussion_r905396971


##########
airflow/providers/apache/hdfs/hooks/webhdfs.py:
##########
@@ -63,39 +63,51 @@ def get_conn(self) -> Any:
         """
         connection = self._find_valid_server()
         if connection is None:
-            raise AirflowWebHDFSHookException("Failed to locate the valid server.")
+            raise AirflowWebHDFSHookException(

Review Comment:
   It was changed automatically by the Python formatter.





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #24550: Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1159828568

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits](https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst). Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   




[GitHub] [airflow] ankurbajaj9 commented on a diff in pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on code in PR #24550:
URL: https://github.com/apache/airflow/pull/24550#discussion_r905396971


##########
airflow/providers/apache/hdfs/hooks/webhdfs.py:
##########
@@ -63,39 +63,51 @@ def get_conn(self) -> Any:
         """
         connection = self._find_valid_server()
         if connection is None:
-            raise AirflowWebHDFSHookException("Failed to locate the valid server.")
+            raise AirflowWebHDFSHookException(

Review Comment:
   It was changed automatically by the Python formatter. Should I revert it?





[GitHub] [airflow] ankurbajaj9 commented on pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1162351230

   Hey, can you please enable the workflow that is awaiting approval? It will help me understand whether anything else is needed in the code, and then I can work on the documentation of the change.




[GitHub] [airflow] ankurbajaj9 commented on pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
ankurbajaj9 commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1166966157

   > Can you please add some description like case or some dummy extra where it not working
   
   Do you want me to add more tests? Basically, when you send the string "False" from the UI, it doesn't work; you have to send an empty string to make it work.
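
The behaviour described here follows from Python's truthiness rules: any non-empty string is truthy, including "False". The sketch below reproduces it with a plain dict; `original_check` is a hypothetical stand-in for the pre-change condition `if extra_dejson.get('use_ssl', False):` shown in the diff.

```python
def original_check(extra: dict) -> bool:
    # Mirrors the pre-change condition: a bare truthiness test on the value.
    return bool(extra.get("use_ssl", False))


print(original_check({"use_ssl": "False"}))  # True: non-empty string is truthy
print(original_check({"use_ssl": ""}))       # False: empty string is falsy
print(original_check({}))                    # False: default applies
```

So with the old condition, the string "False" takes the SSL branch, and only an empty string (or a missing key) skips it.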




[GitHub] [airflow] pankajastro commented on a diff in pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
pankajastro commented on code in PR #24550:
URL: https://github.com/apache/airflow/pull/24550#discussion_r905103811


##########
airflow/providers/apache/hdfs/hooks/webhdfs.py:
##########
@@ -63,39 +63,51 @@ def get_conn(self) -> Any:
         """
         connection = self._find_valid_server()
         if connection is None:
-            raise AirflowWebHDFSHookException("Failed to locate the valid server.")
+            raise AirflowWebHDFSHookException(

Review Comment:
   Looks like the code formatting has changed.





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #24550:
URL: https://github.com/apache/airflow/pull/24550#issuecomment-1168674411

   Awesome work, congrats on your first merged pull request!
   




[GitHub] [airflow] potiuk merged pull request #24550: `WebHDFSHook` Bugfix/optional port

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #24550:
URL: https://github.com/apache/airflow/pull/24550

