Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/08 10:55:42 UTC

[GitHub] [airflow] alexott opened a new pull request #22076: Add new options to DatabricksCopyIntoOperator

alexott opened a new pull request #22076:
URL: https://github.com/apache/airflow/pull/22076


   Extended `DatabricksCopyIntoOperator` with new options. This includes:
   * `encryption` - to specify encryption options for a given location
   * `credential` - to specify authentication options for a given location
   * `validate` - to control validation of schema & data
   
   Also, made the `files` option templated.
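
   For illustration, a minimal usage sketch with the new options. The parameter names come from this PR; the connection id, endpoint name, table, S3 path, and every credential/encryption value below are made-up placeholders, and the dict shapes assume the CREDENTIAL/ENCRYPTION clauses of Databricks' COPY INTO:

       from datetime import datetime

       from airflow import DAG
       from airflow.providers.databricks.operators.databricks_sql import DatabricksCopyIntoOperator

       with DAG(
           dag_id="example_databricks_copy_into",  # hypothetical DAG name
           start_date=datetime(2022, 3, 1),
           schedule_interval=None,
       ) as dag:
           DatabricksCopyIntoOperator(
               task_id="copy_csv_into_table",
               databricks_conn_id="databricks_default",   # assumed connection id
               sql_endpoint_name="my-sql-endpoint",       # placeholder endpoint name
               table_name="default.my_table",             # placeholder table
               file_location="s3://my-bucket/incoming/",  # placeholder location
               file_format="CSV",
               # `files` is templated after this PR, so Jinja expressions are allowed:
               files=["data_{{ ds }}.csv"],
               # New options from this PR (all values are placeholders):
               credential={"AWS_ACCESS_KEY": "...", "AWS_SECRET_KEY": "...", "AWS_SESSION_TOKEN": "..."},
               encryption={"TYPE": "AWS_SSE_C", "MASTER_KEY": "..."},
               validate=True,  # assuming True means "validate all rows"
           )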
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] alexott commented on a change in pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
alexott commented on a change in pull request #22076:
URL: https://github.com/apache/airflow/pull/22076#discussion_r821604584



##########
File path: airflow/providers/databricks/operators/databricks_sql.py
##########
@@ -163,18 +163,23 @@ class DatabricksCopyIntoOperator(BaseOperator):
         or ``sql_endpoint_name`` must be specified.
     :param sql_endpoint_name: Optional name of Databricks SQL Endpoint.
         If not specified, ``http_path`` must be provided as described above.
-    :param files: optional list of files to import. Can't be specified together with ``pattern``.
+    :param files: optional list of files to import. Can't be specified together with ``pattern``. (templated)
     :param pattern: optional regex string to match file names to import.
         Can't be specified together with ``files``.
     :param expression_list: optional string that will be used in the ``SELECT`` expression.
+    :param credential: optional credential configuration for authentication against a specified location.

Review comment:
       done







[GitHub] [airflow] github-actions[bot] commented on pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #22076:
URL: https://github.com/apache/airflow/pull/22076#issuecomment-1061730370


   The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk merged pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #22076:
URL: https://github.com/apache/airflow/pull/22076


   





[GitHub] [airflow] potiuk commented on a change in pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22076:
URL: https://github.com/apache/airflow/pull/22076#discussion_r821589797



##########
File path: airflow/providers/databricks/operators/databricks_sql.py
##########
@@ -163,18 +163,23 @@ class DatabricksCopyIntoOperator(BaseOperator):
         or ``sql_endpoint_name`` must be specified.
     :param sql_endpoint_name: Optional name of Databricks SQL Endpoint.
         If not specified, ``http_path`` must be provided as described above.
-    :param files: optional list of files to import. Can't be specified together with ``pattern``.
+    :param files: optional list of files to import. Can't be specified together with ``pattern``. (templated)
     :param pattern: optional regex string to match file names to import.
         Can't be specified together with ``files``.
     :param expression_list: optional string that will be used in the ``SELECT`` expression.
+    :param credential: optional credential configuration for authentication against a specified location.

Review comment:
       (The reason is that it creates a custom query, and it's not entirely obvious that this is happening or how the query is constructed.)
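
       To make the concern concrete, here is a rough, hypothetical sketch of the kind of COPY INTO statement that could be assembled from these parameters. This is not the provider's actual implementation, and the WITH (CREDENTIAL ...) syntax is assumed from Databricks' COPY INTO documentation; all values are placeholders:

           def build_copy_into_sql(table_name, file_location, file_format, files=None, credential=None):
               # Quote the location and, if given, attach the credential clause to it.
               location = f"'{file_location}'"
               if credential:
                   opts = ", ".join(f"{k} = '{v}'" for k, v in credential.items())
                   location += f" WITH (CREDENTIAL ({opts}))"
               sql = f"COPY INTO {table_name}\nFROM {location}\nFILEFORMAT = {file_format}"
               if files:
                   quoted = ", ".join(f"'{f}'" for f in files)
                   sql += f"\nFILES = ({quoted})"
               return sql

           print(build_copy_into_sql(
               "default.my_table",          # placeholder table
               "s3://my-bucket/incoming/",  # placeholder location
               "CSV",
               files=["data.csv"],
               credential={"AWS_ACCESS_KEY": "...", "AWS_SECRET_KEY": "..."},
           ))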







[GitHub] [airflow] alexott commented on a change in pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
alexott commented on a change in pull request #22076:
URL: https://github.com/apache/airflow/pull/22076#discussion_r821592835



##########
File path: airflow/providers/databricks/operators/databricks_sql.py
##########
@@ -163,18 +163,23 @@ class DatabricksCopyIntoOperator(BaseOperator):
         or ``sql_endpoint_name`` must be specified.
     :param sql_endpoint_name: Optional name of Databricks SQL Endpoint.
         If not specified, ``http_path`` must be provided as described above.
-    :param files: optional list of files to import. Can't be specified together with ``pattern``.
+    :param files: optional list of files to import. Can't be specified together with ``pattern``. (templated)
     :param pattern: optional regex string to match file names to import.
         Can't be specified together with ``files``.
     :param expression_list: optional string that will be used in the ``SELECT`` expression.
+    :param credential: optional credential configuration for authentication against a specified location.

Review comment:
       Yes, it's a good idea - let me add that information.







[GitHub] [airflow] potiuk commented on a change in pull request #22076: Add new options to DatabricksCopyIntoOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #22076:
URL: https://github.com/apache/airflow/pull/22076#discussion_r821589092



##########
File path: airflow/providers/databricks/operators/databricks_sql.py
##########
@@ -163,18 +163,23 @@ class DatabricksCopyIntoOperator(BaseOperator):
         or ``sql_endpoint_name`` must be specified.
     :param sql_endpoint_name: Optional name of Databricks SQL Endpoint.
         If not specified, ``http_path`` must be provided as described above.
-    :param files: optional list of files to import. Can't be specified together with ``pattern``.
+    :param files: optional list of files to import. Can't be specified together with ``pattern``. (templated)
     :param pattern: optional regex string to match file names to import.
         Can't be specified together with ``files``.
     :param expression_list: optional string that will be used in the ``SELECT`` expression.
+    :param credential: optional credential configuration for authentication against a specified location.

Review comment:
       It would be great to provide a link to the docs explaining what the possible credential/encryption options are. And I think example_dags should show a few examples (not only unit tests).
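
       For the example_dags, the option dictionaries could look roughly like this. This is illustrative only - the key names mirror the CREDENTIAL and ENCRYPTION clauses I believe Databricks' COPY INTO accepts, every value is a placeholder, and the requested docs link should be treated as the authority:

           # Hypothetical shapes for the new parameters; all values are placeholders.
           aws_credential = {
               "AWS_ACCESS_KEY": "...",
               "AWS_SECRET_KEY": "...",
               "AWS_SESSION_TOKEN": "...",
           }
           aws_encryption = {"TYPE": "AWS_SSE_C", "MASTER_KEY": "..."}

           # An Azure location would use a different credential shape, e.g. a SAS token.
           azure_credential = {"AZURE_SAS_TOKEN": "..."}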



