You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/04/03 23:36:03 UTC
[GitHub] [airflow] dbarrundiag commented on pull request #22692: Add support for Delta Sharing protocol
dbarrundiag commented on PR #22692:
URL: https://github.com/apache/airflow/pull/22692#issuecomment-1086974829
Just out of curiosity, shouldn't we try to leverage the official [Python Connector](https://github.com/delta-io/delta-sharing/tree/main/python) that `delta-sharing` has already implemented? I feel like I see here a bunch of duplicate or unnecessary code and might be a better design if we follow the "official" protocol so we don't have to keep up with API changes here?
For example instead of re-writing all this code code and implementing a `_extract_delta_sharing_version` method as we do here: https://github.com/apache/airflow/blob/9547f8c718719f24d5b25b856c4b6839bf5d2524/airflow/providers/delta/sharing/hooks/delta_sharing.py#L223
couldn't we just call `query_table_version` which has already been implemented in the "official" REST client: https://github.com/delta-io/delta-sharing/blob/main/python/delta_sharing/rest_client.py#L229
or why introduce a `DeltaSharingQueryResult` in the Airflow provider if we already have the "official" `QueryTableMetadataResponse` or `QueryTableVersionResponse` here
https://github.com/delta-io/delta-sharing/blob/main/python/delta_sharing/rest_client.py#L66
What do you all think? The only "con" I see is that it introduces a requirement to install the python module, but i feel like that's totally fair no?
CC: @alexott @mik-laj @potiuk
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org