You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/21 20:36:11 UTC

[GitHub] [airflow] pauldalewilliams opened a new pull request, #26576: Add oracledb thick mode support for oracle provider

pauldalewilliams opened a new pull request, #26576:
URL: https://github.com/apache/airflow/pull/26576

   closes: #24618
   
   Adds support for oracledb thick mode in the oracle provider, along with the option to set some defaults supported by oracledb (fetch_decimals and fetch_lobs).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254263076

   Some errors :(


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1255345372

   > So you have done quite a bit of refactoring, and added, it seems, quite a bit of new functionality, but you have only added a test for the thick mode part. In writing hooks, particularly when you need to reconcile information between hook params and airflow conn attrs (which might _both_ be supplied), it's generally desired to test that behavior to verify that you resolve it properly and what the order of precedence is.
   > 
   > What I would like to suggest is, just keep this PR confined to adding thick mode support. And do it as I suggested before, by adding a parameter or two to `__init__` but keep the airflow conn parsing logic confined to `get_conn`. It seems this could be a much simpler change and easier to review. Then in a separate PR you can add the other functionality you desire and it can be considered independently. You might consider separating the followup PR into two, one simply adding the extra parameters you desire and the other being the refactor that possibly moves things from get_conn to init.
   
   @dstandish I don't understand how I'd add the parameters to check for the thick_mode stuff to init without duplicating the get_connection stuff.  Why not just add it in `get_conn` for now?  Then I don't need to do my goofy init call in the tests either. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254384246

   @dstandish Do you mean just move the connection retrieval to the `__init__`, store the connection and extra_options as instance variables, and just reference them in `get_conn` instead of calling `get_connection` again?  Maybe move a lot of the parameter parsing in `get_conn` to `__init__` similar to how it's handled in the [ssh provider](https://github.com/apache/airflow/blob/main/airflow/providers/ssh/hooks/ssh.py#L104)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r978270823


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')

Review Comment:
   if defining conn in airflow URI you may end up with string value e.g.`'True'`. best to handle this case.



##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')
+        if self.thick_mode:

Review Comment:
   might as well check `is True` at this point?



##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')
+        if self.thick_mode:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '
+                        f'got {type(self.thick_mode_lib_dir).__name__}'
+                    )
+            if self.thick_mode_config_dir is None:
+                self.thick_mode_config_dir = conn.extra_dejson.get('thick_mode_config_dir')
+                if not isinstance(self.thick_mode_config_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_config_dir expected str or None, '
+                        f'got {type(self.thick_mode_config_dir).__name__}'
+                    )
+            oracledb.init_oracle_client(
+                lib_dir=self.thick_mode_lib_dir, config_dir=self.thick_mode_config_dir
+            )
+
+        # Set oracledb Defaults Attributes
+        # Default to the initial values
+        # if not provided in params or connection config extra
+        # (https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html)
+        if self.fetch_decimals is None:
+            self.fetch_decimals = conn.extra_dejson.get('fetch_decimals', False)
+            if not isinstance(self.fetch_decimals, bool):
+                raise TypeError(f'fetch_decimals expected bool, got {type(self.fetch_decimals).__name__}')
+        oracledb.defaults.fetch_decimals = self.fetch_decimals

Review Comment:
   notice here that you are setting this default always, even when the user does not specify this param.  it's sorta ok because you are setting it to _false_, and that is what the default is in the library.  but, what if the library changed the default?  or, what if you made a mistake and the library's default isn't what you thought?  you can avoid this risk by only setting attributes that are explicitly provided by the user.  it's probably just a little more code, but, it means not doing stuff that you don't really need to do.
   
   but actually, it can be done pretty simply... you know ... you need to both (1) convert string to bool and (2) coalesce params a bunch of times here.... probably best to do something like this:
   
   ```python
           def _get_bool(val):
               if isinstance(val, bool):
                   return val
               if isinstance(val, str):
                   val = val.lower().strip()
                   if val == 'true':
                       return True
                   if val == 'false':
                       return False
               return None
           def _get_first_bool(*vals):
               for val in vals:
                   converted = _get_bool(val)
                   if isinstance(converted, bool):
                       return converted
               return None
           fetch_decimals = _get_first_bool(self.fetch_decimals, extras.get('fetch_decimals'))
           if isinstance(fetch_decimals, bool):
               oracledb.defaults.fetch_decimals = fetch_decimals
   ```



##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')
+        if self.thick_mode:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '
+                        f'got {type(self.thick_mode_lib_dir).__name__}'
+                    )
+            if self.thick_mode_config_dir is None:
+                self.thick_mode_config_dir = conn.extra_dejson.get('thick_mode_config_dir')
+                if not isinstance(self.thick_mode_config_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_config_dir expected str or None, '
+                        f'got {type(self.thick_mode_config_dir).__name__}'
+                    )
+            oracledb.init_oracle_client(
+                lib_dir=self.thick_mode_lib_dir, config_dir=self.thick_mode_config_dir
+            )
+
+        # Set oracledb Defaults Attributes
+        # Default to the initial values
+        # if not provided in params or connection config extra
+        # (https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html)
+        if self.fetch_decimals is None:
+            self.fetch_decimals = conn.extra_dejson.get('fetch_decimals', False)

Review Comment:
   again, if using airflow URI, could be string



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980646728


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +152,40 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True
+        # Parameters take precedence over connection config extra
+        # Defaults to use thin mode if not provided in params or connection config extra
+        thick_mode = _get_first_bool(self.thick_mode, conn.extra_dejson.get('thick_mode'))
+        if thick_mode is True:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '
+                        f'got {type(self.thick_mode_lib_dir).__name__}'
+                    )
+            if self.thick_mode_config_dir is None:
+                self.thick_mode_config_dir = conn.extra_dejson.get('thick_mode_config_dir')
+                if not isinstance(self.thick_mode_config_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_config_dir expected str or None, '

Review Comment:
   Same as above comment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r977875742


##########
tests/providers/oracle/hooks/test_oracle.py:
##########
@@ -82,6 +102,7 @@ def test_get_conn_sid(self, mock_connect):
     def test_get_conn_service_name(self, mock_connect):
         dsn_service_name = {'dsn': 'ignored', 'service_name': 'service_name'}
         self.connection.extra = json.dumps(dsn_service_name)
+        self.db_hook.__init__()

Review Comment:
   I did because of the way the tests were originally structured.  They create the hook in `setUp` (line 64 in my commit) and then modify the extra in the tests themselves.  Since all the parameter parsing is now happening in the `__init__` it never picks that up.  This seemed to work for picking up those changes without drastically changing the tests.  I considered trying to rework it to function more like the way tests are structured in https://github.com/apache/airflow/blob/main/tests/providers/ssh/hooks/test_ssh.py with a separate connection for each variation.  But I'm not very confident in writing tests in the first place and didn't fully understand how to rewrite all this in that way.
   
   I did feel a little goofy doing it this way but it seemed to work for the purpose just fine and I couldn't think of any negatives aside from it just being goofy.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #26576:
URL: https://github.com/apache/airflow/pull/26576


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254546726

   @dstandish Let me know if my most recent commit is more in line with what you were thinking.  Tests passed locally but we'll see how the full suite goes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r977852146


##########
tests/providers/oracle/hooks/test_oracle.py:
##########
@@ -82,6 +102,7 @@ def test_get_conn_sid(self, mock_connect):
     def test_get_conn_service_name(self, mock_connect):
         dsn_service_name = {'dsn': 'ignored', 'service_name': 'service_name'}
         self.connection.extra = json.dumps(dsn_service_name)
+        self.db_hook.__init__()

Review Comment:
   seems you should not have to do this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1255326265

   So you have done quite a bit of refactoring, and added, it seems, quite a bit of new functionality, but you have only added a test for the thick mode part.  In writing hooks, particularly when you need to reconcile information between hook params and airflow conn attrs (which might _both_ be supplied), it's generally desired to test that behavior to verify that you resolve it properly and what the order of precedence is.
   
   What I would like to suggest is, just keep this PR confined to adding thick mode support.  And do it as I suggested before, by adding a parameter or two to `__init__` but keep the airflow conn parsing logic confined to `get_conn`.  It seems this could be a much simpler change and easier to review.  Then in a separate PR you can add the other functionality you desire and it can be considered independently.  You might consider separating the followup PR into two, one simply adding the extra parameters you desire and the other being the refactor that possibly moves things from get_conn to init. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254353297

   > I took @potiuk 's comments on #24618 to mean it should go in the `__init__` for the hook
   
   looks like you're referring to [this comment](https://github.com/apache/airflow/issues/24618#issuecomment-1173070594). he's suggesting adding another param but not necessarily suggesting add more airflow conn retrieval. you can store the mode param as an instance attr and look at it in get_conn.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Taragolis commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
Taragolis commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254322495

   >@potiuk Not related to my changes but I'll keep updating from main until resolved.
   
   Seems like it is somehow related to #25980 I will have a look to it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254270357

   @potiuk Not related to my changes but I'll keep updating from main until resolved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r978326743


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')
+        if self.thick_mode:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '
+                        f'got {type(self.thick_mode_lib_dir).__name__}'
+                    )
+            if self.thick_mode_config_dir is None:
+                self.thick_mode_config_dir = conn.extra_dejson.get('thick_mode_config_dir')
+                if not isinstance(self.thick_mode_config_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_config_dir expected str or None, '
+                        f'got {type(self.thick_mode_config_dir).__name__}'
+                    )
+            oracledb.init_oracle_client(
+                lib_dir=self.thick_mode_lib_dir, config_dir=self.thick_mode_config_dir
+            )
+
+        # Set oracledb Defaults Attributes
+        # Default to the initial values
+        # if not provided in params or connection config extra
+        # (https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html)
+        if self.fetch_decimals is None:
+            self.fetch_decimals = conn.extra_dejson.get('fetch_decimals', False)
+            if not isinstance(self.fetch_decimals, bool):
+                raise TypeError(f'fetch_decimals expected bool, got {type(self.fetch_decimals).__name__}')
+        oracledb.defaults.fetch_decimals = self.fetch_decimals

Review Comment:
   OK, see pauldalewilliams/airflow@d3f09c0 - I think I correctly applied the approach you're talking about here.  I moved the helper functions outside the class - let me know if that was a bad decision.  I modified the tests appropriately and added some new ones to check for converting str to bool in the connection config extra.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980657056


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -57,6 +101,24 @@ class OracleHook(DbApiHook):
 
     supports_autocommit = True
 
+    def __init__(
+        self,
+        *args,
+        thick_mode: bool | None = None,
+        thick_mode_lib_dir: str | None = None,
+        thick_mode_config_dir: str | None = None,
+        fetch_decimals: bool | None = None,
+        fetch_lobs: bool | None = None,

Review Comment:
   Thanks for catching that!  All cleaned up now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980612083


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +152,40 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True
+        # Parameters take precedence over connection config extra
+        # Defaults to use thin mode if not provided in params or connection config extra
+        thick_mode = _get_first_bool(self.thick_mode, conn.extra_dejson.get('thick_mode'))
+        if thick_mode is True:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '

Review Comment:
   ```suggestion
                           'thick_mode_lib_dir expected str or None, '
   ```
   Nit: f-strings not needed here.



##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -57,6 +101,24 @@ class OracleHook(DbApiHook):
 
     supports_autocommit = True
 
+    def __init__(
+        self,
+        *args,
+        thick_mode: bool | None = None,
+        thick_mode_lib_dir: str | None = None,
+        thick_mode_config_dir: str | None = None,
+        fetch_decimals: bool | None = None,
+        fetch_lobs: bool | None = None,

Review Comment:
   ```suggestion
           thick_mode: bool = True,
           thick_mode_lib_dir: str | None = None,
           thick_mode_config_dir: str | None = None,
           fetch_decimals: bool = False,
           fetch_lobs: bool = True,
   ```
   In the docstring, `thick_mode` defaults to `True` while in the doc `fetch_decimals` and `fetch_lobs` are said to default to `False` and `True` respectively. Should these all be aligned in the docstring, doc, and default values? Or is the reference to the defaults in the doc _really_ referring to something deeper in the Oracle libs?



##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +152,40 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True
+        # Parameters take precedence over connection config extra
+        # Defaults to use thin mode if not provided in params or connection config extra
+        thick_mode = _get_first_bool(self.thick_mode, conn.extra_dejson.get('thick_mode'))
+        if thick_mode is True:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '
+                        f'got {type(self.thick_mode_lib_dir).__name__}'
+                    )
+            if self.thick_mode_config_dir is None:
+                self.thick_mode_config_dir = conn.extra_dejson.get('thick_mode_config_dir')
+                if not isinstance(self.thick_mode_config_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_config_dir expected str or None, '

Review Comment:
   ```suggestion
                           'thick_mode_config_dir expected str or None, '
   ```
   Same f-string nit here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r978327003


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +134,49 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True, defaults to False
+        # Parameters take precedence over connection config extra
+        # Defaults to False (use thin mode) if not provided in params or connection config extra
+        if self.thick_mode is None:
+            self.thick_mode = conn.extra_dejson.get('thick_mode', False)
+            if not isinstance(self.thick_mode, bool):
+                raise TypeError(f'thick_mode expected bool, got {type(self.thick_mode).__name__}')
+        if self.thick_mode:

Review Comment:
   Done in latest commit



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254352083

   > I'm not sure [init_oracle_client](https://github.com/oracle/python-oracledb/blob/main/src/oracledb/impl/thick/utils.pyx#L436) returns a client - seems to be just setting up the parameters for calling the thick client. I think?
   
   doesn't really matter.  why retrieve the conn in two places, in two different ways? if you do it in `__init__.py`, why do we do it again in `get_conn`?  
   
   what's the difference between the way we handle the airflow conn in init vs in get_conn?  probably there shouldn't be any difference, because probably it should just be done once and in one place right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r977926306


##########
tests/providers/oracle/hooks/test_oracle.py:
##########
@@ -82,6 +102,7 @@ def test_get_conn_sid(self, mock_connect):
     def test_get_conn_service_name(self, mock_connect):
         dsn_service_name = {'dsn': 'ignored', 'service_name': 'service_name'}
         self.connection.extra = json.dumps(dsn_service_name)
+        self.db_hook.__init__()

Review Comment:
   right so the code before, it retrieved the airflow conn object and recreated the client object every time get_conn was called.
   and then they changed the airflow conn object resulting in different behavior with client creation time.  and your move of the logic to `__init__` changes this behavior, so that get_conn doesn't re-retrieve the object.
   while it's strictly speaking a "breaking" change probably it's unlikely to trip anyone up.  but i think it would be best to adjust the tests so that the code is invoked in a more normal way.  
   what you could is apply mock.patch decorator to patch the `get_connection` method (instead of doing it by assignment in setUp) and then set it to return the modified conn object, then instantiate your hook in each test method e.g. `db_hook = OracleHook(...)`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980641047


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +152,40 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True
+        # Parameters take precedence over connection config extra
+        # Defaults to use thin mode if not provided in params or connection config extra
+        thick_mode = _get_first_bool(self.thick_mode, conn.extra_dejson.get('thick_mode'))
+        if thick_mode is True:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '

Review Comment:
   I thought it was needed on every line when you do a multiline f-string.  Turns out I was wrong about that! :)
   It looks like it doesn't make a difference for plain strings though:  https://stackoverflow.com/questions/48471086/style-for-continued-multi-line-f-strings
   Do we have a style guide for this?  To me prepending each line with `f` looks cleaner.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980646330


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -57,6 +101,24 @@ class OracleHook(DbApiHook):
 
     supports_autocommit = True
 
+    def __init__(
+        self,
+        *args,
+        thick_mode: bool | None = None,
+        thick_mode_lib_dir: str | None = None,
+        thick_mode_config_dir: str | None = None,
+        fetch_decimals: bool | None = None,
+        fetch_lobs: bool | None = None,

Review Comment:
   I think you may be looking at an earlier commit in this PR for the docstrings.  The current code does not state the defaults for `fetch_decimals` or `fetch_lobs` but refers you to the python-oracledb documentation.
   
   For `thick_mode` the docstring states that it defaults to False.  This is true of python-oracledb's behavior.  In this hook, that default doesn't get set in the parameters of the `__init__` but is controlled by the behavior on this line which allows the parameter passed to override what is in the connection's extra config:
   https://github.com/apache/airflow/blob/f2899820c2385773258b91ac5d661115d9f813c3/airflow/providers/oracle/hooks/oracle.py#L158
   
   If I set a default of `False` in `__init__` it will always override what's in the connection's extra config with the current logic being used.  This structure was proposed by @dstandish earlier in the conversation.
   
   I think it's best to keep the current state of the code as is.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on PR #26576:
URL: https://github.com/apache/airflow/pull/26576#issuecomment-1254343685

   > you are duplicating some code from `get_conn`
   > 
   > `get_conn` is by convention the method you call to get the client. can you update that code instead of adding a separate client creation path?
   
   @dstandish Apologies - I took @potiuk 's comments on #24618 to mean it should go in the __init__ for the hook.  I'm fine moving it within the `get_conn` method.  The duplicated code was just to get access to the connection's extra options to grab the parameters.  I'm not sure (init_oracle_client)[https://github.com/oracle/python-oracledb/blob/main/src/oracledb/impl/thick/utils.pyx#L436] returns a client - seems to be just setting up the parameters for calling the thick client.  I think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pauldalewilliams commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
pauldalewilliams commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980655082


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -57,6 +101,24 @@ class OracleHook(DbApiHook):
 
     supports_autocommit = True
 
+    def __init__(
+        self,
+        *args,
+        thick_mode: bool | None = None,
+        thick_mode_lib_dir: str | None = None,
+        thick_mode_config_dir: str | None = None,
+        fetch_decimals: bool | None = None,
+        fetch_lobs: bool | None = None,

Review Comment:
   Ah, sorry @josh-fell - I'm with you now.  I see I forgot to remove those notes in the connections docs.  I thought you were referring to the docstrings on the class's `__init__` (which I did remember to update - ha!).  I'll make the change to remove the references in the connections docs.
   
   Just to make sure I've got it right - no objection to keeping the "Defaults to False." for `thick_mode`?  I feel like that's important to keep in because setting it to True requires you to have the Oracle client libraries installed, which are not required by the default behavior of python-oracledb and its thin mode - which was part of the point of upgrading from cx_Oracle in the earlier version of this provider.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980641367


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -90,6 +152,40 @@ def get_conn(self) -> oracledb.Connection:
         mod = conn.extra_dejson.get('module')
         schema = conn.schema
 
+        # Enable oracledb thick mode if thick_mode is set to True
+        # Parameters take precedence over connection config extra
+        # Defaults to use thin mode if not provided in params or connection config extra
+        thick_mode = _get_first_bool(self.thick_mode, conn.extra_dejson.get('thick_mode'))
+        if thick_mode is True:
+            if self.thick_mode_lib_dir is None:
+                self.thick_mode_lib_dir = conn.extra_dejson.get('thick_mode_lib_dir')
+                if not isinstance(self.thick_mode_lib_dir, (str, type(None))):
+                    raise TypeError(
+                        f'thick_mode_lib_dir expected str or None, '

Review Comment:
   Nope, just a nit. I have no strong feelings really. 🙂 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on a diff in pull request #26576: Add oracledb thick mode support for oracle provider

Posted by GitBox <gi...@apache.org>.
josh-fell commented on code in PR #26576:
URL: https://github.com/apache/airflow/pull/26576#discussion_r980652602


##########
airflow/providers/oracle/hooks/oracle.py:
##########
@@ -57,6 +101,24 @@ class OracleHook(DbApiHook):
 
     supports_autocommit = True
 
+    def __init__(
+        self,
+        *args,
+        thick_mode: bool | None = None,
+        thick_mode_lib_dir: str | None = None,
+        thick_mode_config_dir: str | None = None,
+        fetch_decimals: bool | None = None,
+        fetch_lobs: bool | None = None,

Review Comment:
   > The current code does not state the defaults for fetch_decimals or fetch_lobs but refers you to the python-oracledb documentation.
   
   Right, the docstring only stated a default for `thick_mode` but the doc updates _do_ mention defaults for the others:
   > * ``fetch_decimals`` (bool) - Specify whether numbers should be fetched as ``decimal.Decimal`` values.  **Defaults to False.**
         See `defaults.fetch_decimals<https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>` for more info.
       * ``fetch_lobs`` (bool) - Specify whether to fetch strings/bytes for CLOBs or BLOBs instead of locators.  **Defaults to True.**
         See `defaults.fetch_lobs<https://python-oracledb.readthedocs.io/en/latest/api_manual/defaults.html#defaults.fetch_decimals>` for more info.
   
   The comment was really to align what's expected between the docs and the code. Typically the docstrings in Airflow refer to what's happening in Airflow not in the underlying libs relative to default values, when mentioned. Users could be confused when reading the Python API documentation for this hook. (Granted it does not functionally make a difference in this case, but it's better to be consistent). Perhaps it's better to remove the "Defaults to False" bit?
   
   >is controlled by the behavior on this line which allows the parameter passed to override what is in the connection's extra config
   
   Ah yes, fair point. I dig that.
   
   Side note, it would be really great if these were custom form fields in the UI instead of fumbling around with Extras as they have been historically cumbersome. Certainly doesn't need to be part of this PR, but food for thought for a future PR maybe.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org