Posted to commits@airflow.apache.org by "Taragolis (via GitHub)" <gi...@apache.org> on 2023/02/15 00:42:51 UTC

[GitHub] [airflow] Taragolis commented on a diff in pull request #29502: Validate Hive Beeline parameters

Taragolis commented on code in PR #29502:
URL: https://github.com/apache/airflow/pull/29502#discussion_r1106533831


##########
airflow/providers/apache/hive/hooks/hive.py:
##########
@@ -141,6 +141,7 @@ def _prepare_cli_cmd(self) -> list[Any]:
 
         if self.use_beeline:
             hive_bin = "beeline"
+            self._validate_beeline_parameters(conn)

Review Comment:
   I agree that we need to avoid anything that might trigger a call to the Airflow DB or any other resource.
   There are cases where a hook is initialised eagerly in an operator rather than lazily, and is even expected as one of the operator's arguments:
   [SSHHook](https://github.com/apache/airflow/blob/50b30e5b92808e91ad9b6b05189f560d58dd8152/airflow/providers/ssh/hooks/ssh.py#L143-L151) and [SSHOperator](https://github.com/apache/airflow/blob/50b30e5b92808e91ad9b6b05189f560d58dd8152/airflow/providers/ssh/operators/ssh.py#L81-L82)
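
A minimal sketch of why this matters, using entirely hypothetical classes (`FakeMetadataDB`, `LazyHook`, `EagerOperator` are illustrations, not Airflow APIs, and the `;` check is a placeholder rather than the validation from this PR). If the hook is built when the operator is constructed, its constructor should only store arguments, and validation that needs the connection should wait until execution:

```python
class FakeMetadataDB:
    """Hypothetical stand-in for the metadata DB; it only counts lookups."""

    def __init__(self):
        self.lookups = 0

    def get_connection(self, conn_id):
        self.lookups += 1
        return {"conn_id": conn_id, "host": "localhost"}


class LazyHook:
    """Hypothetical hook whose constructor touches no external resource."""

    def __init__(self, conn_id, db):
        # Only store arguments here: this may run while a DAG file is parsed.
        self.conn_id = conn_id
        self._db = db

    def run(self):
        # The connection is fetched and validated only at execution time.
        conn = self._db.get_connection(self.conn_id)
        self._validate(conn)
        return conn["host"]

    @staticmethod
    def _validate(conn):
        # Placeholder rule for illustration only.
        if ";" in conn["host"]:
            raise ValueError("host must not contain ';'")


class EagerOperator:
    """Hypothetical operator that receives an already constructed hook."""

    def __init__(self, hook):
        self.hook = hook

    def execute(self):
        return self.hook.run()


db = FakeMetadataDB()
operator = EagerOperator(hook=LazyHook("my_conn", db))
lookups_after_construction = db.lookups  # no lookup has happened yet

result = operator.execute()
lookups_after_execute = db.lookups  # exactly one lookup, made during execute()
```

With this split, constructing the operator costs nothing, and the single lookup happens inside `execute()`.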
   
   I also agree that an operator should only allow declared fields. However, even though I'm a fan of dataclasses, there are a couple of issues:
   1. [`__post_init__`](https://docs.python.org/3/library/dataclasses.html#post-init-processing): there will be a temptation to use it.
   2. Somewhat clumsy inheritance: you have to decorate the child class with `@dataclass` as well, otherwise it turns into a regular class with only part of the dataclass behaviour 🙄
   3. Keyword-only dataclasses (`kw_only`) were only introduced in Python 3.10+
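
The three points can be reproduced with the standard library alone. This is only a sketch: the class and field names (`BaseParams`, `host`, `port`, `schema`) are made up for illustration and do not come from the PR.

```python
import sys
from dataclasses import dataclass, fields


@dataclass
class BaseParams:
    host: str
    port: int = 10000

    def __post_init__(self):
        # Point 1: a tempting place for validation; it has to stay free of I/O.
        if not self.host:
            raise ValueError("host must not be empty")


class UndecoratedChild(BaseParams):
    # Point 2: without @dataclass this annotation does not become a field,
    # and the inherited __init__ does not accept it.
    schema: str = "default"


@dataclass
class DecoratedChild(BaseParams):
    schema: str = "default"


undecorated_fields = [f.name for f in fields(UndecoratedChild)]
decorated_fields = [f.name for f in fields(DecoratedChild)]

try:
    UndecoratedChild(host="localhost", schema="sales")
    undecorated_accepts_schema = True
except TypeError:
    undecorated_accepts_schema = False

# Point 3: kw_only is only accepted by @dataclass on Python 3.10 and later.
if sys.version_info >= (3, 10):

    @dataclass(kw_only=True)
    class KwOnlyParams:
        host: str
        port: int = 10000

    kw_only_params = KwOnlyParams(host="localhost")
```

Here `undecorated_fields` contains only the parent's fields, while `decorated_fields` also includes `schema`.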



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org