Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/06 18:56:40 UTC

[GitHub] [airflow] vshshjn7 opened a new pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

vshshjn7 opened a new pull request #11318:
URL: https://github.com/apache/airflow/pull/11318


   Adding a new operator for Vertica To GCS cloud transfer. 
   
   [Done] Local testing
   [Done] Adding test case
   
   @msumit
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501622843



##########
File path: airflow/providers/google/cloud/transfers/vertica_to_gcs.py
##########
@@ -0,0 +1,106 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import decimal
+
+import datetime
+import sys
+
+import time
+
+from airflow.providers.vertica.hooks.vertica import VerticaHook
+from airflow.providers.google.cloud.transfers.sql_to_gcs import BaseSQLToGCSOperator
+from airflow.utils.decorators import apply_defaults
+
+PY3 = sys.version_info[0] == 3
+
+
+class VerticaToGoogleCloudStorageOperator(BaseSQLToGCSOperator):

Review comment:
       +1







[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r504945638



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       I think PEP documents should also be treated as recommendations and good practice. In this case, the PEP recommends an interface with an `__iter__` method rather than alternatives such as get_iter-style methods.







[GitHub] [airflow] msumit commented on a change in pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

Posted by GitBox <gi...@apache.org>.
msumit commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r500765511



##########
File path: CHANGELOG.txt
##########
@@ -9,6 +9,7 @@ New Features
 - Get Airflow configs with sensitive data from Secret Backends (#9645)
 - [AIRFLOW-4734] Upsert functionality for PostgresHook.insert_rows() (#8625)
 - Allow defining custom XCom class (#8560)
+- [AIRFLOW] Adding Vertica to GCS cloud transfer (#11318)

Review comment:
       nit: `- Adding Vertica to GCS cloud transfer operator`







[GitHub] [airflow] github-actions[bot] commented on pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-704589041


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/292180024) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] vshshjn7 commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-707131465


   Hi @mik-laj
    
   Can you provide your comments?
   
   Thanks





[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501346905



##########
File path: airflow/providers/google/cloud/transfers/vertica_to_gcs.py
##########
@@ -0,0 +1,106 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import decimal
+
+import datetime
+import sys
+
+import time
+
+from airflow.providers.vertica.hooks.vertica import VerticaHook
+from airflow.providers.google.cloud.transfers.sql_to_gcs import BaseSQLToGCSOperator
+from airflow.utils.decorators import apply_defaults
+
+PY3 = sys.version_info[0] == 3
+
+
+class VerticaToGoogleCloudStorageOperator(BaseSQLToGCSOperator):

Review comment:
       ```suggestion
   class VerticaToGCSOperator(BaseSQLToGCSOperator):
   ```
   We try to use the GCS abbreviation because it is widely recognized.







[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501622664



##########
File path: docs/howto/operator/google/transfer/vertica_to_gcs.rst
##########
@@ -0,0 +1,58 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Vertica To Google Cloud Storage Operator
+========================================
+The `Google Cloud Storage <https://cloud.google.com/storage/>`__ (GCS) service is
+used to store large data from various applications. This page shows how to copy
+data from Vertica to GCS.
+
+.. contents::
+  :depth: 1
+  :local:
+
+
+Prerequisite Tasks
+^^^^^^^^^^^^^^^^^^
+
+.. include::/howto/operator/google/_partials/prerequisite_tasks.rst

Review comment:
       Why do we need to configure Vertica? Credentials and DB name are part of the Airflow connection and the rest of the details are taken by the operator.
   
   Are you talking about ``pip install 'apache-airflow[vertica]'`` ? Am I missing something?  







[GitHub] [airflow] msumit commented on pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

Posted by GitBox <gi...@apache.org>.
msumit commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-704728643


   LGTM.. but I don't have much experience with Vertica or even GCS operators in general. Maybe @mik-laj or @kaxil or @potiuk  can take a look as well. 





[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501350091



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       Why is this necessary? [PEP 249](https://www.python.org/dev/peps/pep-0249/) recommends that cursors have an `__iter__` method, so this should work without it. If the Vertica library does not implement it, I think it is worth fixing that problem rather than changing the interface of other classes. The adapter design pattern can be helpful here. See:
   https://github.com/apache/airflow/blob/2bac4810a48d43692be8f585167925e2c48abb46/airflow/providers/google/cloud/transfers/presto_to_gcs.py#L27
   







[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501609745



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       As per PEP 249, `__iter__` is an optional DB-API extension, so other datastores may or may not implement it.
   
   The change above does not alter existing functionality; it lets the user define or modify the iterator when the datastore cursor does not provide one.
   
   I am fine with making changes, but I feel this gives the user more flexibility to modify or define the iterator if they want.







[GitHub] [airflow] kaxil commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-721446242


   Can you please rebase your PR on the latest Master, since we have applied [Black](https://github.com/apache/airflow/commit/4e8f9cc8d02b29c325b8a5a76b4837671bdf5f68) and [PyUpgrade](https://github.com/apache/airflow/commit/8c42cf1b00c90f0d7f11b8a3a455381de8e003c5) on Master?
   
   It will help if you squash your commits into a single commit first so that there are fewer conflicts.
   





[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501034122



##########
File path: CHANGELOG.txt
##########
@@ -9,6 +9,7 @@ New Features
 - Get Airflow configs with sensitive data from Secret Backends (#9645)
 - [AIRFLOW-4734] Upsert functionality for PostgresHook.insert_rows() (#8625)
 - Allow defining custom XCom class (#8560)
+- Adding Vertica to GCS cloud transfer (#11318)

Review comment:
       sure







[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-792130011


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-704589325


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/292186194) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] vshshjn7 commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-708569299


   > Vertica can export to CSV, as well as other formats like parquet. When would a user want to iterate over rows from a vertica cursor in python instead of using a faster native export?
   
   True, @jmcarp.
   But this operator lets the user run a query on Vertica on a schedule, instead of taking a full dump every time.
   One use case is migration: the full table dump is a one-time job, but afterwards this operator can run as a scheduled job that pushes delta changes until the older job is replaced by a newer one.





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-706156616


   [The Workflow run](https://github.com/apache/airflow/actions/runs/297466150) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: [AIRFLOW] Adding Vertica to GCS cloud transfer

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-704727974


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/292873095) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-705032098


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/293694851) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r504946437



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       If it is possible, we should stick to the PEP recommendations. When this is not possible, we should use our own solutions, but I think that here it is possible to achieve compatibility with PEP.







[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501347217



##########
File path: docs/howto/operator/google/transfer/vertica_to_gcs.rst
##########
@@ -0,0 +1,58 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Vertica To Google Cloud Storage Operator
+========================================
+The `Google Cloud Storage <https://cloud.google.com/storage/>`__ (GCS) service is
+used to store large data from various applications. This page shows how to copy
+data from Vertica to GCS.
+
+.. contents::
+  :depth: 1
+  :local:
+
+
+Prerequisite Tasks
+^^^^^^^^^^^^^^^^^^
+
+.. include::/howto/operator/google/_partials/prerequisite_tasks.rst

Review comment:
       These are not the complete steps required. You probably still need to configure Vertica.







[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r508354923



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       I was looking at the `_PrestoToGCSPrestoCursorAdapter` code, and it looks like the whole cursor implementation is copied from the [presto code](https://github.com/prestodb/presto-python-client/blob/master/prestodb/dbapi.py#L170).
   
   I am afraid that if we do the same and the upstream logic changes in the future, we will have to change the logic on our end as well. Isn't that an overhead?







[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501609745



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       As per PEP 249, `__iter__` is an optional DB-API extension, so other datastores may or may not implement it.
   
   The change above does not alter existing functionality; it lets the user define or modify the iterator when the datastore cursor does not provide one.
   
   I am fine with making changes. Kindly let me know.




[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501034122



##########
File path: CHANGELOG.txt
##########
@@ -9,6 +9,7 @@ New Features
 - Get Airflow configs with sensitive data from Secret Backends (#9645)
 - [AIRFLOW-4734] Upsert functionality for PostgresHook.insert_rows() (#8625)
 - Allow defining custom XCom class (#8560)
+- Adding Vertica to GCS cloud transfer (#11318)

Review comment:
       ack







[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-705031611


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/293669713) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-706156616


   [The Workflow run](https://github.com/apache/airflow/actions/runs/297466150) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.





[GitHub] [airflow] mik-laj commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r500958431



##########
File path: CHANGELOG.txt
##########
@@ -9,6 +9,7 @@ New Features
 - Get Airflow configs with sensitive data from Secret Backends (#9645)
 - [AIRFLOW-4734] Upsert functionality for PostgresHook.insert_rows() (#8625)
 - Allow defining custom XCom class (#8560)
+- Adding Vertica to GCS cloud transfer (#11318)

Review comment:
       The changelog is prepared by the release maintainer. Can you remove it?







[GitHub] [airflow] vshshjn7 commented on a change in pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
vshshjn7 commented on a change in pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#discussion_r501609745



##########
File path: airflow/providers/google/cloud/transfers/sql_to_gcs.py
##########
@@ -232,6 +232,10 @@ def _write_local_data_files(self, cursor):
 
         return files_to_upload
 
+    def get_cursor_iterator(self, cursor):

Review comment:
       As per PEP 249, `__iter__` is an optional DB-API extension, so other datastores may or may not implement it.
   
   This change does not alter the existing functionality; it gives the user the flexibility to define or modify the iterator in case the datastore cursor does not provide one.
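   
   A minimal sketch of the idea, not the actual code in this PR: `FakeCursor`, `BaseExporter`, `FetchOneExporter`, and `collect_rows` are hypothetical stand-ins. It assumes only that the cursor provides the standard `fetchone()` method and that the base class iterates the cursor directly by default.

```python
class FakeCursor:
    """Hypothetical DB-API-style cursor: has fetchone() but no __iter__."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetchone(self):
        # PEP 249: return the next row, or None when no more data is available.
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row


class BaseExporter:
    """Stand-in for the base operator (not the real BaseSQLToGCSOperator)."""

    def get_cursor_iterator(self, cursor):
        # Default behaviour: rely on the optional iteration extension.
        return iter(cursor)

    def collect_rows(self, cursor):
        # Every read goes through the hook, so subclasses can swap the iterator.
        return list(self.get_cursor_iterator(cursor))


class FetchOneExporter(BaseExporter):
    """Override for cursors that do not implement __iter__."""

    def get_cursor_iterator(self, cursor):
        # iter(callable, sentinel) calls fetchone() until it returns None.
        return iter(cursor.fetchone, None)


rows = [(1, "a"), (2, "b")]
exported = FetchOneExporter().collect_rows(FakeCursor(rows))
```

   With this shape, datastores whose cursors already support iteration keep the default, and only the ones that lack it need the override.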
   
   I am fine with making changes. Kindly let me know.







[GitHub] [airflow] github-actions[bot] closed pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #11318:
URL: https://github.com/apache/airflow/pull/11318


   





[GitHub] [airflow] github-actions[bot] commented on pull request #11318: Add Vertica to GCS cloud transfer operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11318:
URL: https://github.com/apache/airflow/pull/11318#issuecomment-705142905


   [The Build Workflow run](https://github.com/apache/airflow/actions/runs/294039474) is cancelling this PR. The job has been cancelled by another workflow.

